The GPT2 chatbot mystery: Plus, Nvidia H100 prices drop

Published May 8, 2024

This week's top AI news and research stories featured GitHub's Copilot Workspace, OpenAI's new licensing deal, an AI system that identifies landmines on battlefields, and an algorithm that accelerates inference in large language models (LLMs) by using small vanilla neural networks to predict which parts of the model to use. But first:

AI-assisted coding transforms computer science education (IEEE Spectrum)
As generative AI tools become more prevalent in software development, computer science students and educators are incorporating the technology and adapting their learning and teaching strategies. While students use AI to solve particular problems and break down complex concepts, educators now emphasize problem decomposition, testing, and debugging skills over syntax alone. However, computer science teachers also caution against overreliance on AI, stressing the need to teach students to be skeptical of results and aware of potential biases in the models.

Tech giants see cloud computing rebound (Reuters)
Amazon, Microsoft, and Alphabet reported strong growth in their cloud computing divisions, driven by increased corporate spending and rising interest in AI. The $270 billion cloud infrastructure market is bouncing back after a slowdown last year, with AI services contributing significantly to the growth of platforms like Azure and Google Cloud. As more businesses adopt AI tools and move their computing needs to the cloud, they consolidate IT spending with the major providers.

Mystery “GPT2” chatbot wows experts (Axios)
A powerful new chatbot recently appeared on testing site LMSYS, impressing AI experts with its advanced capabilities. Although the bot was quickly taken offline, speculation is rampant that it originated from OpenAI. While OpenAI CEO Sam Altman has confirmed the mystery bot is not GPT-4.5, many believe it represents a significant improvement over existing models.

Amazon rebrands CodeWhisperer as Q Developer (TechCrunch)
Q Developer expands on CodeWhisperer's code generation features, assisting with tasks like debugging, upgrading apps, performing security scans, and helping AWS users manage assets and resources. It also introduces Agents, which can autonomously implement features, document code, and manage code upgrades. Q Developer is available for free with limitations, while the premium Q Developer Pro version costs $19 per month and includes IP indemnity. It joins a growing number of increasingly autonomous and comprehensive programming assistants.

ChatGPT adds memory (OpenAI)
ChatGPT now remembers user preferences and context from previous conversations, allowing it to provide more personalized and efficient assistance over time. Users can control their memory settings, instruct ChatGPT to remember or forget details, view and delete specific memories, or turn the feature off completely. For specific applications, ChatGPT can remember style, tone, and format preferences, a developer’s programming languages and frameworks, or frequently accessed company data and visualizations.

U.S. publishes draft guidelines for AI use (Commerce.gov)
The National Institute of Standards and Technology (NIST) released four draft publications aimed at improving the safety, security, and trustworthiness of AI systems. The publications cover managing risks of generative AI, reducing threats to AI training data, promoting transparency in digital content, and proposing a plan for global AI standards development. A new program, NIST GenAI, will evaluate and measure generative AI technologies, including methods to distinguish between human- and machine-created content.

Nvidia H100 prices drop as H200 release approaches (Tom’s Hardware)
Prices for Nvidia’s H100 AI and HPC processors have decreased as supply improves and demand softens in anticipation of the upcoming H200 GPU. Even on the black market in mainland China, where H100 sales moved after U.S. export restrictions, prices are falling as scalpers rush to sell off inventory before the H200's release. Increased supply is expected to improve availability and lower prices for both of Nvidia’s AI chips, making them accessible to more developers.

Anthropic launches iOS app and Team plan (Anthropic)
Anthropic announced two significant updates to its AI assistant, Claude: a Team plan designed for businesses, which includes advanced privacy, security, and admin controls, and an app for iPhones and iPads that enables users to chat with the AI assistant while on the go. Claude joins ChatGPT, Copilot, and Google’s Gemini in offering a mobile app.

Reka releases Vibe-Eval, a challenging multimodal model evaluation suite (Reka)
Vibe-Eval consists of 269 high-quality image-text prompts and ground truth responses, created by AI experts to be challenging for even the most advanced models. Vibe-Eval aims to serve as a standard benchmark for multimodal chat models, complementing existing multiple-choice benchmarks and chatbot arenas. Vibe-Eval’s first round of evaluations put Gemini 1.5 Pro and GPT-4V on top at solving hard problems, ahead of Claude 3 Opus and Reka’s own Core model.

REFORMS offers guidelines for use of AI and machine learning in science (Science Advances)
A group of 19 multidisciplinary researchers proposed a 32-point checklist and two sets of guidelines for AI/ML-based arguments to establish evidence for a scientific claim. The researchers are concerned with data gathering, integrity, and generalizability, as well as establishing appropriate statistical tests to measure computational models’ validity. If adopted, these guidelines could help ensure that AI and ML methods remain useful and widely accepted tools, offering researchers and referees better criteria to evaluate their use.
