Big Updates for GPT-4 Turbo, Gemini 1.5, Mixtral, and More: Plus, AI Helps Rebuild Lost Memories

This week's top AI news and research stories featured Google's Vertex AI Agent Builder, security holes in generated code, a series of policy violations in the GPT Store, and RA-DIT, a fine-tuning procedure that trains an LLM and retrieval model together to improve the LLM’s ability to capitalize on retrieved content. But first:

U.S. and Japanese governments launch major AI research initiatives
The partnership is bolstered by $110 million in funding from major tech companies including Nvidia, Microsoft, and Amazon. The University of Washington and the University of Tsukuba will focus on projects in AI research, entrepreneurship, and workforce development, while Carnegie Mellon University and Tokyo's Keio University will explore advanced AI technologies including robotics and AI-human interaction. (Read more at The Register)

OpenAI introduces GPT-4 Turbo with Vision for general availability via API
The upgraded version incorporates vision and audio capabilities and promises increased speed, affordability, and a larger input context of up to 128,000 tokens. The new API now supports requests for vision recognition and analysis in JSON format, streamlining the integration process for developers. Notable implementations of OpenAI’s updated model include Cognition’s AI coding agent, Healthify’s nutritional analysis tool, and TLDraw’s virtual whiteboard that translates drawings into functional websites. (Find more details at VentureBeat)

Gemini 1.5 Pro launches globally with enhanced audio capabilities and developer features
Gemini 1.5 Pro is now available in over 180 countries through the Gemini API in a public preview. The latest updates include the ability to modify system instructions and a new JSON mode for better control over outputs, as well as the debut of an improved text embedding model for better performance metrics. (Learn more at Google’s blog)

Spotify unveils AI playlist generator 
The “AI Playlist” feature enables users to generate customized playlists based on textual prompts. Accessible via the mobile app, this tool lets users input descriptions like "music to read to on a cold, rainy day" to receive a tailored list of 30 songs. The service is currently in beta and limited to a few geographical regions, with plans to expand in the future. (Read the news at The Verge)

Google launches specialized code-generation models
CodeGemma, an initiative by Google in collaboration with Hugging Face, features a trio of open access, code-specialized language models designed to enhance coding practices across various platforms. The family includes a 2B model focused on infilling and open-ended generation, a 7B model trained on code and natural language, and a 7B instruct model for interactive code-related discussions. The suite is available on Hugging Face's Hub. (Read more at Hugging Face’s blog)
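For readers curious what "infilling" looks like in practice, here is a minimal sketch of how a fill-in-the-middle prompt might be assembled. It assumes the sentinel tokens described in the CodeGemma release materials (`<|fim_prefix|>`, `<|fim_suffix|>`, `<|fim_middle|>`); the helper names `build_fim_prompt` and `splice_completion` are hypothetical, and no model is actually called.

```python
# Sentinel tokens assumed from the CodeGemma release materials; check the
# model card before relying on them.
FIM_PREFIX = "<|fim_prefix|>"
FIM_SUFFIX = "<|fim_suffix|>"
FIM_MIDDLE = "<|fim_middle|>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Hypothetical helper: wrap the code before and after a gap in FIM tokens.

    The model is expected to generate the missing middle after FIM_MIDDLE.
    """
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def splice_completion(prefix: str, completion: str, suffix: str) -> str:
    """Hypothetical helper: place the generated middle back between the two halves."""
    return prefix + completion + suffix


if __name__ == "__main__":
    before = "def mean(values):\n    return "
    after = "\n"
    # This string would be sent to an infilling model as the prompt.
    print(build_fim_prompt(before, after))
    # A stand-in completion, written by hand for illustration only.
    print(splice_completion(before, "sum(values) / len(values)", after))
```

The prompt puts the suffix before the gap marker so the model can condition on the code that follows the cursor as well as the code that precedes it.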

AI project turns personal memories into synthetic photos
The Synthetic Memories project, led by the research and design studio Domestic Data Streamers, is harnessing generative AI capabilities to recreate lost or unphotographed memories. The studio uses AI models like DALL-E to create "memory-based reconstructions" which help individuals, especially from immigrant and refugee backgrounds, visualize past scenes. (Read the report at MIT Technology Review)

Microsoft establishes AI hub in London 
The hub will reportedly be led by Jordan Hoffmann, a distinguished AI scientist formerly with Inflection AI and DeepMind, and will focus on developing advanced language models and related technologies. This move coincides with Microsoft's commitment to invest £2.5 billion in the UK to enhance AI capabilities and infrastructure. (Read the story at TechCrunch)

Mistral AI introduces Mixtral 8x22B 
The 281GB large language model (LLM) is designed to compete with major industry players like OpenAI, Meta, and Google. The open source model boasts a 176 billion parameter size and a 65,000-token context window. (Read more details at ZDNet)

AI-powered reminders help reduce smartphone screen time
Researchers have developed an automated system that learns from smartphone users' behaviors to send personalized pop-up reminders encouraging them to close attention-grabbing apps like TikTok and Instagram. The adaptive AI models, which continue learning from user behavior during deployment, reduced app visit frequency by up to 9%. Although the study is preliminary and had high drop-out rates, the AI interventions show promise in helping users manage their screen time more effectively. Experts suggest that incorporating users' emotional motivations for changing phone usage behaviors could further enhance the AI feedback's impact. (Read an interview with researchers at New Scientist and check out the original paper)

Generative AI adoption boosts artists' productivity and pleases audiences, but may reduce novelty
A study analyzing over 4 million artworks posted by 53,000 users on an unnamed art-sharing website found that artists who adopted AI tools experienced a 25% increase in productivity and a 50% rise in positive reactions to their work. However, the novelty of the subject matter and details in AI-generated artworks decreased compared to those created by traditional methods. The study, conducted by researchers at Boston University, covered the period from January 2022 to May 2023, which saw the release of popular AI image generators like Midjourney, DALL-E, and Stable Diffusion. While the use of AI tools accelerates the ability to produce art, it raises questions about the impact on the creative process and the meaning behind the artworks. (Read more about the study at New Scientist, or peruse the paper itself)

Schools struggle to address AI-generated explicit images of students
School districts across the United States are facing a new challenge as male students use AI-powered apps to generate sexually explicit images of their female classmates. These deepfakes can have severe consequences for the targeted girls, harming their mental health, reputations, and future prospects. As the use of exploitative AI apps in schools is a recent phenomenon, many districts seem unprepared to address the issue effectively, leaving students vulnerable. Experts and affected families are calling for updated school policies and laws to protect students from this form of harassment and abuse. (Read more at The New York Times)

A race to build data centers in the Middle East
The United Arab Emirates and Saudi Arabia are competing to become the regional leader in artificial intelligence. Both countries are investing heavily in building data centers essential for supporting AI technology. The UAE is off to a strong start with 52 operational data centers, while Saudi Arabia has 60, though many have lower power capacities. Despite challenges such as the need for skilled technicians and the high energy requirements of AI servers, both nations are committed to expanding their data center infrastructure to support their AI ambitions and diversify their economies away from oil. (Learn more at Bloomberg)

Researchers question novelty and utility of AI-discovered materials
Google's AI company DeepMind recently announced the discovery of millions of new materials using deep learning techniques, calling it a groundbreaking expansion of the stable materials known to humanity. However, UC Santa Barbara researchers who analyzed a subset of these AI-generated compounds found no strikingly novel or useful materials among them. Critics argue that while DeepMind’s AI methodology shows promise, the specific findings may be oversold and impractical. The debate highlights the challenges of using AI and machine learning to discover truly innovative and impactful materials for concrete use cases. (Read an interview with the researchers at 404 Media, or check out the research paper)

Andrej Karpathy builds a version of GPT-2 directly in C, no Python required
Karpathy, recently of OpenAI, boasted that he could train OpenAI’s older language model using just 1,000 lines of code in a single file. Last year, he undertook a similar project for a small version of Llama 2. He proposes to repeat the demonstration with open source models that have more modern architectures, including Llama 2, Gemma, and Mistral. In a tweet, Karpathy said his goals were primarily educational, but that the project could have practical implications for future work. (Check out the GitHub repository for llm.c)

AI giants skirt rules and butt heads in pursuit of training data
Tech companies like OpenAI, Google, and Meta are going to great lengths to obtain the vast amounts of digital data needed to train their AI models. The hunger for data has grown as researchers discovered that more information leads to better-performing AI systems. With concerns that the industry could exhaust high-quality online data by 2026, companies are exploring the use of synthetic data generated by AI itself to train future models, though the effectiveness of this approach remains uncertain. Meanwhile, AI companies have transcribed copyrighted YouTube videos, considered buying a publishing house, and debated gathering data from across the internet despite potential legal issues. (Read the feature story at The New York Times)

