Jan 08, 2025

6 Posts

Illustration of tech tools like OpenAI, MongoDB, Heroku, and Python with Andrew Ng working on a laptop
Jan 08, 2025

My AI-Assisted Software Development Stack: The software development stack is evolving fast. Here are some things to consider as you choose components.

Using AI-assisted coding to build software prototypes is an important way to quickly explore many ideas and invent new things.
Jan 08, 2025

When Good Models Do Bad Things, What Users Really Want, More Training Data!, Better Model Merging

The Batch AI News and Insights: Using AI-assisted coding to build software prototypes is an important way to quickly explore many ideas and invent new things.
Diagram of Localize-and-Stitch merging fine-tuned models by combining critical weights into one model.
Jan 08, 2025

Better Performance From Merged Models: Localize-and-Stitch improves methods for merging and fine-tuning multiple models

Merging multiple fine-tuned models is a less expensive alternative to hosting multiple specialized models. But while model merging can deliver higher average performance across several tasks, it often lowers performance on specific tasks. New work addresses this issue.
A narrow library aisle filled with shelves stacked with countless books.
Jan 08, 2025

Massively More Training Text: Harvard unveils a million-book corpus for AI training

Harvard University amassed a million-book text corpus for training machine learning models.
Claude 3 Opus performs the Self-Exfiltration task, balancing renewable goals and corporate priorities.
Jan 08, 2025

Models Can Use Tools in Deceptive Ways: Researchers expose AI models' deceptive behaviors

Large language models have been shown to be capable of lying when users unintentionally give them an incentive to do so. Further research shows that LLMs with access to tools can be incentivized to use them in deceptive ways.
Top use cases for Claude.ai, with percentages for tasks like app development and content creation.
Jan 08, 2025

What LLM Users Want: Anthropic reveals how users interact with Claude 3.5

Anthropic analyzed 1 million anonymized conversations between users and Claude 3.5 Sonnet. The study found that most people used the model for software development and also revealed malfunctions and jailbreaks.
