Understanding Emergent Capabilities in LLMs: Lessons from Biological Systems
How the fundamental laws of natural systems help explain AI’s unexpected abilities
Image by Michaela, Pixabay.com
Note: This article presents key findings from our recent research paper, “A non-ergodic framework for understanding emergent capabilities in Large Language Models” [link1] [link2]. While the paper offers a comprehensive mathematical framework and detailed experimental evidence, this post aims to make these insights accessible to a broader audience.
However, how many kinds of sentence are there? Say assertion, question and command? There are countless kinds; countless different kinds of use of all the things we call “signs”, “words”, “sentences”. Furthermore, this diversity is not something fixed, given once for all; but new types of language, new language-games, as we may say, come into existence, and others become obsolete and get forgotten. We can get a rough picture of this from the changes in mathematics.
Ludwig Wittgenstein, Philosophical Investigations, first published 1953, 4th ed. 2009, Wiley-Blackwell, UK
Motivation
I remember the summer of 2006 vividly. As I sat at my desk, surrounded by stacks of papers on innovation management theory and the logic of invention, my research had hit a wall. The econometric frameworks for understanding how innovation emerges in organizations felt mechanical and outdated. I was missing something essential about how new possibilities arise from existing capabilities. That’s when I first found Stuart Kauffman’s book ‘Investigations’ ¹. Its ambitious subtitle, "The Nature of Autonomous Agents and the Worlds They Mutually Create", was both appealing and spooky. When I finally began reading, I found myself drawn into Kauffman’s exploration of how complex biological systems perpetually generate novelty. His central idea, the “adjacent possible,” describes how each new innovation in biological systems opens up new possibilities, creating an ever-expanding space of potential futures.
The idea wouldn’t let me go. Kauffman’s mathematical framework for understanding how biological systems explore and expand their possibility spaces through restricted combinations of existing elements rather than random search felt profound. It opened a path to think about innovation not as a linear process but as an organic expansion of possibilities.
Fast forward to 2024 and the rise of increasingly sophisticated large language models such as GPT-4o and Claude 3.5. Current research primarily emphasizes empirical observations and scaling laws: regularities, found both in empirical analyses of LLMs and in many real-world complex systems, that describe how a system’s properties change with size. But prevailing approaches still struggle to explain the actual behavior of large language models. Fulfilling the promises of superintelligence and AGI will require a clear answer to the question of how edge...