Today in AI

Today's big story
News roundup
AI research changing the world
This week on the podcast
Get more with a Pro account

What’s happening in AI right now

A look at some amazing AI research

AI's Cognitive Leap

Recent breakthroughs reveal AI's surprising ability to build internal representations of the world.

The notion that large language models (LLMs) merely string together words based on statistical patterns is being challenged by new research. A series of recent studies suggests these AI systems may be developing a deeper understanding of reality, with implications that could reshape our approach to artificial intelligence.

Building mental models

MIT researchers have made a startling discovery about the cognitive capabilities of LLMs. In an innovative experiment involving simulated robot puzzles, they found that as language models grow more sophisticated, they can generate internal representations of tasks and environments.

The study, which pushed LLMs to solve increasingly complex puzzles, saw the share of correctly solved puzzles reach 92.4%. More importantly, the model appeared to develop its own internal simulation of robot movements and puzzle mechanics. This suggests that LLMs may be developing a form of spatial reasoning and environmental understanding previously thought to be beyond their capabilities.

To validate their findings, the researchers employed a unique "Bizarro World" test. By altering the rules of the puzzle environment, they confirmed that the LLM had actually developed a semantic understanding of the task, rather than simply memorizing patterns.

Peering into AI's "black box"

As AI models grow more complex, the need for transparency and interpretability becomes increasingly crucial. Enter Goodfire, a startup that's taking a revolutionary approach to AI observability. The company has secured $7 million in seed funding to develop tools that allow developers to map, visualize, and edit the internal workings of AI models.

Goodfire's innovative "mechanistic interpretability" approach aims to transform AI models from inscrutable "black boxes" into transparent, manipulable systems. This technology could be a game-changer for AI development, debugging, and deployment across industries.

The startup's founding team, which includes alumni of AI labs such as DeepMind and OpenAI, aims to advance the understanding of AI models while prioritizing safety and reliability. By structuring the company as a public benefit corporation, Goodfire is signaling a commitment to ethical AI development – a crucial consideration as these technologies become more powerful and pervasive.

From proteins to personalized medicine

While MIT's research explores AI's abstract reasoning and Goodfire works on model transparency, other studies are demonstrating AI's practical potential in the life sciences. AI models are making significant strides in protein structure prediction and biological sequence modeling, with far-reaching implications for drug discovery and personalized medicine.

AlphaFold, a pioneering AI system, has dramatically improved the accuracy of protein structure predictions. This breakthrough could accelerate drug development processes and potentially reduce costs. Meanwhile, the application of large language models to biological sequences is further enhancing our understanding of genetic information, opening new avenues for personalized medical insights.

Pushing the boundaries of context

While some researchers explore AI's cognitive depths, others are working to expand how much information a model can take in at once. Efforts to extend the context length of large language models, meaning the amount of text a model can consider in a single pass, have yielded mixed results, highlighting the complexities of scaling AI systems.

Recent experiments with Infini-attention, a technique aimed at dramatically increasing context length, have encountered challenges when scaled to larger models. While initial small-scale tests showed promise, issues with convergence and performance emerged in more extensive applications.

These setbacks underscore the importance of rigorous testing at various scales before drawing conclusions about a technique's effectiveness. They also demonstrate the value of sharing "failed" experiments, as these insights guide future research efforts and prevent duplication of unproductive approaches.

As AI continues to surprise us with its cognitive leaps, it's clear that we're only beginning to scratch the surface of its potential. The coming years promise to be an exciting time of discovery, as we unravel the true capabilities of these increasingly sophisticated systems and work to harness them for the benefit of society.

News roundup

The top stories in AI today.

FUTURE OF WORK

Jobs are changing — are you?

ENTERTAINMENT

Sports, movies, music, content creation

NEW LAUNCHES

The latest features & products in AI innovation.

GADGETS

Computers, phones, wearables & other AI gizmos.

RESEARCH

The biggest papers, ideas & launches

AI research changing the world

The latest breakthroughs and most pivotal papers — broken down in language anyone can understand.

OpenAI study: process-supervised model solves 78% of problems on a MATH test subset

OpenAI researchers conducted this study to compare the effectiveness of outcome supervision and process supervision in training AI models to solve complex math problems. The research was motivated by the need to improve the reliability of large language models in multi-step reasoning tasks, as even state-of-the-art models still regularly produce logical mistakes.

Researchers trained reward models using both outcome supervision (based on final answers) and process supervision (feedback on each reasoning step). They used the MATH dataset, a challenging collection of mathematical problems, to evaluate model performance. The study included both large-scale experiments using GPT-4 and small-scale experiments with synthetic supervision to enable more detailed comparisons and ablations.
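The difference between the two kinds of feedback can be sketched in a few lines of Python. This is a toy illustration, not the study's code: the `Solution` class, the helper names, and the per-step probabilities are all hypothetical, and multiplying per-step probabilities into one score is just one simple way to combine step-level feedback. It shows why a solution that lands on the right answer through faulty steps looks fine under outcome supervision but scores poorly under process supervision.

```python
from dataclasses import dataclass
from math import prod
from typing import List


@dataclass
class Solution:
    steps: List[str]     # the intermediate reasoning steps
    final_answer: str    # the answer the solution ends on


def outcome_label(solution: Solution, correct_answer: str) -> int:
    """Outcome supervision: a single label based only on the final answer."""
    return 1 if solution.final_answer == correct_answer else 0


def process_score(step_probs: List[float]) -> float:
    """Process supervision: combine per-step correctness probabilities.

    Assumption for this sketch: the score is the product of the step
    probabilities, so one doubtful step drags the whole solution down.
    """
    return prod(step_probs)


def best_of_n(candidates: List[Solution], scores: List[float]) -> Solution:
    """Pick the candidate solution with the highest score."""
    return max(zip(candidates, scores), key=lambda pair: pair[1])[0]


# Problem: 3 * (2 + 4). Both candidates end on "18", but one gets there by luck.
sound = Solution(steps=["2 + 4 = 6", "3 * 6 = 18"], final_answer="18")
flawed = Solution(steps=["3 * 2 = 6", "6 + 4 = 10", "10 + 8 = 18"], final_answer="18")

# Hypothetical per-step probabilities, standing in for a trained reward model.
sound_probs = [0.9, 0.95]
flawed_probs = [0.9, 0.2, 0.3]

# Outcome supervision cannot tell the two candidates apart.
outcome_labels = [outcome_label(s, "18") for s in (sound, flawed)]

# Process supervision ranks the sound reasoning first.
chosen = best_of_n(
    [flawed, sound],
    [process_score(flawed_probs), process_score(sound_probs)],
)
```

Here both candidates receive the same outcome label, while the step-level score singles out the solution whose reasoning holds up at every step.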

If widely adopted, process supervision could lead to more reliable and interpretable AI systems across various domains. This could have far-reaching implications for fields such as healthcare, finance, and scientific research, where accurate multi-step reasoning is crucial. More trustworthy AI could accelerate innovation and decision-making processes in these areas.

Get the full breakdown

The AI tool we’re loving right now

The best way to get AI literate? Try the tools!

aomni

An AI-driven platform designed to enhance sales efficiency and strategic account management for B2B sales teams. It integrates advanced AI capabilities to streamline various aspects of the sales process, from prospect research to personalized outreach and strategic planning.

aomni.com?via=getcoai

Who's Not Using All The Buying Signals And Who Should Be

Buying signals are the subtle—and sometimes not so subtle—clues that a prospective customer is ready to purchase. In B2B sales, these signals are often less conspicuous than in consumer sales, but they are always present for those who know where to look.

www.aomni.com/blog/whos-not-using-all-the-buying-signals-and-who-should-be

Try it out!

This week on the podcast

Can’t get enough of our newsletter? Check out our podcast, Future-Proof.

Future-Proof Podcast - CO/AI

Join hosts Anthony, Shane and Francesca as they break down complex concepts, highlight innovative tools and discuss how AI is reshaping our world. It's a smart, accessible look at the AI revolution for both tech enthusiasts and curious newcomers.

getcoai.com/podcast

Get more with a Pro account

Paid members get access to discounts on AI tools, expert-written tutorials, and in-depth industry data and leaderboards.

Become a pro
