In today’s edition of Data Points, you’ll learn more about:
- Qwen’s improved open-weights image editing model
- Yann LeCun’s last research paper for Meta
- Falcon H1R 7B, a hybrid reasoning model
- SWE-Lego, a training recipe for squashing software bugs
But first:
Google partners with leading robotics manufacturer
Google DeepMind and Boston Dynamics announced a partnership to deploy the Gemini AI model on Boston Dynamics' robots, including the Atlas humanoid and the Spot robot dog. The companies plan to test Gemini-powered Atlas robots in the coming months in factories operated by Hyundai, Boston Dynamics' parent company. Atlas currently excels at complex movements like dancing and acrobatics but lacks the environmental awareness and decision-making skills needed for industrial work. Google DeepMind designed Gemini to be multimodal and suited to understanding the physical world, while data collected by Boston Dynamics' robots will help improve the models' real-world performance. The partnership includes safety mechanisms to prevent dangerous behavior, a critical consideration as industrial robots are deployed more widely. (Wired)
Nvidia debuts new reasoning models for self-driving cars
Nvidia released the Alpamayo family of open AI models and datasets designed to bring reasoning capabilities to autonomous vehicles, with a Mercedes-Benz collaboration set to deploy on roads in the first quarter of 2026. The Alpamayo 1 model uses a 10 billion parameter architecture with vision-language-action reasoning, taking video input and showing its step-by-step logic as it handles novel driving scenarios. Nvidia's full-stack autonomous driving system includes Alpamayo models, a safety evaluator, classical AV tools, and Halos Safety OS. The team spent at least five years developing the system with Mercedes-Benz, whose CEO Ola Kallenius demonstrated its capabilities in an uninterrupted drive of over an hour through heavy traffic in San Francisco and Silicon Valley. (Constellation Research)
New Qwen image editing model with open weights available
Alibaba released Qwen-Image-Edit-2511, an upgraded version of its image editing model that improves consistency when modifying portraits and group photos. The model now integrates popular community-created LoRAs directly into its base version, enabling effects like realistic lighting control and new viewpoint generation without additional tuning. The update also strengthens geometric reasoning capabilities and enhances performance for industrial design applications, including batch product design and material replacement tasks. Users can try the model through Qwen Chat with the Image Editing feature, though Alibaba recommends local deployment via ModelScope for best performance. The update addresses earlier limitations like image drift, editing images while preserving the subject's identity and visual characteristics. (Qwen)
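For local use, the model should slot into the standard diffusers workflow. The snippet below is a minimal sketch, assuming the 2511 checkpoint keeps the interface of earlier Qwen-Image-Edit releases; the repo id and pipeline class are guesses based on prior versions, so check the model card before running it.

```python
# Minimal local-inference sketch for Qwen-Image-Edit-2511.
# The repo id and pipeline class are assumptions carried over from
# earlier Qwen-Image-Edit releases; check the model card before use.
import torch
from PIL import Image
from diffusers import QwenImageEditPipeline

pipe = QwenImageEditPipeline.from_pretrained(
    "Qwen/Qwen-Image-Edit-2511",  # assumed repo id, following prior naming
    torch_dtype=torch.bfloat16,
)
pipe.to("cuda")

image = Image.open("portrait.png").convert("RGB")
result = pipe(
    image=image,
    prompt="Relight the portrait with warm golden-hour lighting",
    num_inference_steps=50,
)
result.images[0].save("portrait_relit.png")
```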
Meta’s new lightweight, non-autoregressive vision-language model
A Meta AI team (including now-departed chief scientist Yann LeCun) introduced VL-JEPA, a vision-language model that predicts continuous text embeddings rather than generating tokens one by one like traditional models. The approach uses 50 percent fewer trainable parameters than standard vision-language models while achieving stronger performance with the same vision encoder and training data. At inference time, a lightweight decoder translates predicted embeddings into text only when needed, a selective decoding scheme that cuts decoding operations by a factor of 2.85 while maintaining comparable performance. The model naturally supports multiple tasks including open-vocabulary classification, text-to-video retrieval, and visual question answering without architectural changes. On video classification and retrieval benchmarks, VL-JEPA outperformed CLIP and SigLIP2, and it achieved results comparable to larger models like InstructBLIP and QwenVL despite having only 1.6 billion parameters. (arXiv)
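Selective decoding is easy to picture in code. The sketch below is illustrative only, not Meta's implementation: `predictor`, `decoder`, and `label_bank` are hypothetical stand-ins for a model that maps video plus a query to a continuous text embedding, a lightweight decoder, and a bank of precomputed candidate-label embeddings.

```python
import torch
import torch.nn.functional as F

def answer(predictor, decoder, video, query, label_bank=None, open_ended=False):
    """Illustrative sketch of VL-JEPA-style selective decoding (not Meta's
    code). `predictor`, `decoder`, and `label_bank` are hypothetical."""
    z = predictor(video, query)  # predicted continuous text embedding, shape (d,)
    if not open_ended and label_bank is not None:
        # Closed-set tasks (classification, retrieval): skip decoding entirely
        # and match in embedding space against precomputed label embeddings.
        sims = F.cosine_similarity(z.unsqueeze(0), label_bank.embeddings, dim=-1)
        return label_bank.names[sims.argmax().item()]
    # Open-ended tasks: run the lightweight decoder only when free-form
    # text is actually required.
    return decoder.generate(z)
```

For classification and retrieval, matching in embedding space sidesteps token generation entirely, which is where the reported savings in decoding operations come from.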
New hybrid Falcon reasoning model outperforms for its size
Abu Dhabi’s Technology Innovation Institute (TII) released Falcon H1R 7B, a 7 billion parameter model that combines Transformer and Mamba state-space architectures to achieve reasoning performance rivaling models with up to 47 billion parameters. The hybrid design processes sequences with linear rather than quadratic scaling, allowing the model to generate long chains of thought at nearly double the speed of comparable models: approximately 1,500 tokens per second per GPU at batch size 64. Falcon H1R 7B scored 83.1 percent on the AIME 2025 mathematical reasoning benchmark, outperforming the 15 billion parameter Apriel-v1.6-Thinker and the 32 billion parameter OLMo 3 Think. The model achieved 68.6 percent on the LCB v6 coding benchmark, a score TII claims surpasses all tested models, including those four times its size. TII released the full model on Hugging Face under a modified Apache 2.0 license that permits commercial use with attribution requirements and usage restrictions. (arXiv)
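Since the weights are on Hugging Face, getting started with transformers should look familiar. This is a hedged sketch: the repo id below is an assumption based on TII's Falcon-H1 naming, and chat-template details may differ, so consult the actual model card (and license terms) first.

```python
# Quick-start sketch for Falcon H1R 7B via Hugging Face transformers.
# The repo id is an assumption based on TII's Falcon-H1 naming; confirm
# it (and the license terms) on the actual model card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon-H1R-7B"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "If 3x + 7 = 22, what is x?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Reasoning models emit a long chain of thought before the final answer,
# so leave a generous generation budget.
output = model.generate(inputs, max_new_tokens=2048)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```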
Training large language models to break software issues into pieces
SWE-Lego, a supervised fine-tuning approach, achieves top performance on software engineering tasks using a lightweight training method instead of complex multi-stage approaches. The system combines a curated dataset of 32,000 high-quality task instances and 18,000 validated trajectories with refined training techniques including error masking and difficulty-based curriculum learning. On the SWE-bench Verified benchmark, the 8 billion parameter version reaches 42.2 percent accuracy while the 32 billion parameter model achieves 52.6 percent. The method further improves performance through test-time scaling with a trained verifier, boosting the 8 billion parameter model to 49.6 percent and the 32 billion parameter model to 58.8 percent. These results show that a straightforward fine-tuning approach with careful dataset curation and training refinements can outperform more complex training paradigms for resolving software issues. (Hugging Face)
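Error masking is conceptually simple: keep flawed agent turns in the context so the model sees realistic trajectories, but exclude them from the loss. Below is a minimal sketch of one common way to implement this in PyTorch-style SFT pipelines, not SWE-Lego's actual code; it assumes trajectories arrive pre-tokenized per turn with flags marking erroneous turns.

```python
import torch

IGNORE_INDEX = -100  # PyTorch cross-entropy skips targets with this value

def build_labels(turn_token_ids, turn_is_error):
    """Sketch of error masking for agent-trajectory SFT, in the spirit of
    SWE-Lego (not its actual code). Erroneous turns stay in the input so
    the model sees realistic context, but contribute no gradient."""
    input_ids, labels = [], []
    for tokens, is_error in zip(turn_token_ids, turn_is_error):
        input_ids.append(tokens)
        if is_error:
            labels.append(torch.full_like(tokens, IGNORE_INDEX))
        else:
            labels.append(tokens.clone())
    return torch.cat(input_ids), torch.cat(labels)
```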
Still want to know more about what matters in AI right now?
Read the latest issue of The Batch for in-depth analysis of news and research.
Last week, Andrew Ng proposed a new Turing-AGI Test to better measure AGI capabilities and address the hype around AI, focusing on whether AI can perform work tasks as well as humans.
“AI is on an amazing trajectory of progress. In previous decades, overhyped expectations led to AI winters, when disappointment about AI capabilities caused reductions in interest and funding, which picked up again when the field made more progress. One of the few things that could get in the way of AI’s tremendous momentum is unrealistic hype that creates an investment bubble, risking disappointment and a collapse of interest.”
Read Andrew’s letter here.
In last week’s special issue, we asked AI luminaries about their highest hopes for 2026:
- Open Source Wins: David Cox, VP for AI Models at IBM Research, emphasized the importance of open development in AI to foster innovation and collaboration.
- AI for Scientific Discovery: Adji Bousso Dieng of Princeton University discussed optimizing AI models to better address niche scientific challenges and the long tail of research.
- Education That Works With — Not Against — AI: Juan M. Lavista Ferres, Chief Data Scientist at Microsoft, discussed assignments that effectively assess students’ capabilities in an AI-enhanced learning environment.
- From Prediction to Action: Tanmay Gupta of the Allen Institute explored the development of AI systems designed for long-horizon tasks, moving beyond mere prediction.
- Multimodal Models for Biomedicine: Pengtao Xie from UC San Diego highlighted the need for medical models to integrate visual data, from small molecules to whole organs, to improve healthcare outcomes.
- Chatbots That Build Community: Sharon Zhou of AMD explored how expanding chatbots to serve groups could enhance community building and interpersonal connections.
A special offer for our community
DeepLearning.AI recently launched the first-ever subscription plan for our entire course catalog! As a Pro Member, you’ll immediately enjoy access to:
- Over 150 AI courses and specializations from Andrew Ng and industry experts
- Labs and quizzes to test your knowledge
- Projects to share with employers
- Certificates to testify to your new skills
- A community to help you advance at the speed of AI
Enroll now to lock in a year of full access for $25 per month paid upfront, or opt for month-to-month payments at just $30 per month. Both payment options begin with a one-week free trial. Explore Pro’s benefits and start building today!