Large Language Models (LLMs)

78 Posts

How Qwen2-Audio performs against the competitors.

Open Models for Math and Audio: Alibaba advances open-weight LLMs with Qwen2 Math and Audio variants

Alibaba followed up its open-weights Qwen2 large language models with specialized variations.
Conceptual illustration of The AI Scientist, an end-to-end LLM-driven scientific discovery process.

AI Agents for AI Research: Agentic workflow generates novel scientific research papers

While some observers argue that large language models can’t produce truly original output, new work prompted them to generate novel scientific research.

Machine Translation Goes Agentic: TransAgents, a system that boosts literary translation with a multi-agent workflow

Literary works are challenging to translate. Their relative length, cultural nuances, idiomatic expressions...

AI Leadership Makes for a Difficult Balance Sheet: OpenAI faces financial growing pains, spending double its revenue

OpenAI may be spending roughly twice as much money as it’s bringing in, a sign of the financial pressures of blazing the trail in commercial applications of AI.

Higher Performance, Lower Prices: AI model prices drop as competition heats up

Prices for access to large language models are falling as providers exploit new efficiencies and compete for new customers.

Google Gets Character.AI Co-Founders: Google acquires Character.AI talent and tech in strategic move

Character.AI followed an emerging pattern for ambitious AI startups, trading its leadership to a tech giant in exchange for funds and a strategic makeover. 

Art Attack: ArtPrompt, a technique that exploits ASCII art to bypass LLM safety measures

ASCII art, a seemingly innocuous form of expression, opens a new vector for jailbreak attacks on large language models (LLMs), inducing them to generate outputs that their developers tuned them to avoid producing.

Synthetic Data Factory: AgentInstruct, a framework for generating diverse synthetic data for LLM fine-tuning

Researchers increasingly fine-tune models on synthetic data, but generated datasets may not be sufficiently diverse. New work used agentic workflows to produce diverse synthetic datasets.

Web Data Increasingly Off Limits: Online publishers crack down on AI training data access

Online publishers are moving to stop AI developers from training models on their content.

Search Gets Conversational: OpenAI launches SearchGPT to rival Google and Microsoft

OpenAI is testing an AI-powered search engine in a bid to compete head-to-head with both Google and its close partner Microsoft Bing. 

The State of the Art Is Open: Meta’s Llama 3.1 outperforms GPT-4 in key areas

Meta raised the bar for large language models with open weights and published details about how it built one that outperforms GPT-4o and Claude 3.5 Sonnet by some measures.

Mini but Mighty: OpenAI’s GPT-4o Mini offers big performance at a small price

A slimmed-down version of OpenAI’s multimodal flagship packs a low-price punch.

Hallucination Detector: Oxford scientists propose effective method to detect AI hallucinations

Large language models can produce output that’s convincing but false. Researchers proposed a way to identify such hallucinations. 

How Open Are Open Models?: Radboud University study ranks AI models on openness

The word “open” can mean many things with respect to AI. A new paper outlines the variations and ranks popular models for openness.

Like LoRA, But for Pretraining: GaLore, a memory-saving method for pretraining and fine-tuning LLMs

Low-rank adaptation (LoRA) reduces memory requirements when fine-tuning large language models, but it isn’t as conducive to pretraining.