
A Conversation with Blaise Agüera y Arcas: On Intelligence, Life, and the Future of AI

March 9, 2026
John Gabrieli Buchet

The following is an edited transcript of a conversation between John Gabrieli Buchet (Poggio Lab, MIT) and Blaise Agüera y Arcas (VP, Google) recorded as part of the Poggio Lab's AI: Foundations - for Academia (and Startups) seminar series, which asks what fundamental principles of intelligence academia can still discover and translate into real-world systems. Blaise will be speaking at MIT on March 16th at 4pm in Singleton Auditorium (46-3002).


What does it mean to call something intelligent - and when did this question get so hard to answer? For Blaise Agüera y Arcas, VP at Google and founder of Paradigms of Intelligence, the answer begins not with LLMs but with the origins of life itself. His book What is Intelligence? (MIT Press, 2025) argues that intelligence is substrate-independent prediction, a property running unbroken from the first self-replicating molecules to modern AI. LLMs aren't imitating intelligence, he contends: they're the real thing.

Here, Agüera y Arcas reflects on the 2021 LaMDA conversation that reshaped his own thinking about AI, the computational-life experiment his team at Google ran to model how life arises and why, and some of the biggest unsolved challenges remaining in AI, along with the approaches he believes may help solve them.


Is Basic Research Still Worth It?

John: All right, so this seminar series starts from a single framing question: Is it still worthwhile to be a researcher in AI? The landscape has been shifting, and many problems that once defined academia are now dominated by scale, data, and incumbent advantage. If scaling laws plus clever engineering continue to drive better results, why bother messing around with theory? And why should we be doing basic research on the principles of intelligence?

Blaise: This is a great question – it came up a couple of years ago at Cosyne, where I was on stage giving a keynote with a colleague from DeepMind and a few computational neuroscientists. I found myself in the weird position of defending computational neuroscience against a couple of computational neuroscientists who were despairing that it was just a scale game. So I'm very much on team basic research – team science, if you like.

First of all, it's definitely not the case that just growing transformers solves all problems. A couple of years ago there was an idea that maybe we didn't need to solve the problem of long-term memory: even though the context window should really be thought of as a short-term memory, maybe scaling it up turns it into a long-term memory. But I think it's become clear to a lot of us that this is not so. We now have LLMs with short-term memories way longer than our own – none of us can hold the entire Lord of the Rings in mind at once, which is roughly the size of a state-of-the-art context window nowadays – but that's not equivalent to having a long-term memory. In fact, you see failure modes in LLMs where you tell them "please stop doing something," or "we're changing tracks now," and they start to perform poorly once too much stuff is in their context windows, because they don't forget – they can't cleanly keep track of the sequence of changes you've made so far. There's been some very recent research showing that the first response of an LLM to a long, complex prompt is sometimes way superior to the result of a long series of interactions. It's the same phenomenon.

So there are clearly things we don't understand – not only things we haven't yet engineered correctly, but things we don't understand about how brains work that would be very helpful to our models, and things we don't understand about how models work that would be very helpful to our understanding of neuroscience.

John: Tomaso Poggio has a metaphor I've personally found very helpful: the way AI is right now, we stand at the place where electricity was after Volta but before Maxwell. Engineering has kind of raced ahead, but to take this revolution to its full logical extent, science needs to step up and develop some theories – so we can have not just engineered artifacts but deep understanding.

Blaise: Yeah, I completely agree. My friend and colleague Benjamin Bratton says something similar: there are times when science is ahead of engineering, or theory is ahead of practice you might say, and times when practice is ahead of theory. The Volta-to-Maxwell transition I think is a good analogy. The one I've tended to use is the steam engine era before Boltzmann – so we've figured out how to engineer heat engines, but we haven't figured out the fundamentals of statistical physics or thermodynamics.

"There are clearly things we don't understand – not only things we haven't yet engineered correctly, but also things we don't understand about how brains work that would be very helpful to our models."

Blaise Agüera y Arcas

The LaMDA Moment: A Paradigm Shift

John: Your book What is Intelligence? has gotten outstanding reviews from everyone from Melanie Mitchell to Terry Sejnowski – and I'd like to add my name to that list and say it was my favorite book of the year. Congratulations on a wonderful book.

Blaise: That's so kind, thank you John. Thank you.

John: It's excellent – creative, insightful, thought-provoking. I highly recommend it for folks who are listening. It's a very ambitious synthesis bringing together AI, neuroscience, physics, computer science, philosophy – trying to articulate a unified theory of intelligence. And in many ways, of course, drawing on the title, it's really just trying to answer a single question: what is intelligence? In the book, you trace it back to a pivotal moment: in 2015, your team at Google was working on next-word prediction for Gboard, the autocomplete for the phone keyboard. And essentially nobody working on that problem, including yourself, believed next-word prediction would ever lead to general intelligence.

Blaise: Right.

John: And then, you write about, in 2021, staying up late chatting with LaMDA, and you started to think this would be a seismic shift, this is really going to change our understanding of intelligence. It seems like you were very surprised by this breakthrough. How did that moment impact your perspective on intelligence?

Blaise: That moment was pretty seismic for me – it really shocked me, and I actually wondered why more people weren't shocked by it. What has become clear since 2021 is that there are still many people – surprisingly many, even computational neuroscientists and AI researchers and engineers – who maintain that we don't have general intelligence in these models, which I find extremely confusing. I don't know how we can claim we don't have general intelligence when the models – though they're certainly jagged – are better at a lot of stuff than you and I are.

Anyway, I absolutely did not expect next-token prediction to solve general intelligence, and I don't personally know anybody who was making that claim in 2015 either, although there had been a tradition of saying that P(future | past) is what brains are all about. One of the more modern instantiations of that was Friston's free energy ideas, but I think the idea goes back to Helmholtz at least: the idea that the reason you have a brain is to do a better job not just of predicting the future, but of making predictions across different possible futures conditional on your own actions in the world, and picking the better future. If you think about it that way, it's almost obvious – of course, if you're not able to act in a way that changes your future, what's the point of having a brain? The point of having a brain is to modulate your actions in some way that will influence your own future happiness and existence. So put that way, it has to be P(future | past) – but somehow between saying that in the abstract and saying "just model a big distribution, do machine learning on that distribution and learn it" – it just didn't seem plausible that a brute force attack on that problem could possibly yield what we think of as being intelligence, even if it was kind of formally correct.

So I always imagined – and I think almost everybody else did too – that there were going to be magical regularizers – semantic pointers, frames, some other special structure – and people had all kinds of theories about what kind of pixie dust would be the key to general intelligence. And I think it's important to emphasize that the transformer doesn't have any magic key in it. It just happens to be an architecture well suited to massively parallel training with really large corpora. There's nothing magical about it. In fact, if you just use very large recurrent models of other sorts, or even convolutional nets with big enough context windows, those things kind of work the same way. They vary in how efficient they are and how well they use computational resources given our current architectures. But it's not that the transformer itself brings something really special. So in that sense it really has been just a scale game. But we needed a scale comparable to what brains can do, which is not trivial.

John: It's interesting – you talk about how, facing this moment of confusion and then surprise, you sort of have two options: you can either accept that this is true intelligence and revise your framework – your paradigm – for intelligence, or you can reject it and say actually this isn't real intelligence, which allows you to stick with your prior assumptions. When you think about paradigm shifts, Thomas Kuhn argued that the first piece of evidence contradicting an existing paradigm is rarely the thing that actually causes people to flip, because people are largely conservative and will find all kinds of ways to patch the old paradigm before they're willing to jump over and say: maybe we actually need a new explanation. I'll admit, I shared your initial intuition, and it took me some time to get there myself. Did you immediately think, "Okay, it's time to rethink what intelligence is?" Or what was that journey like for you – and how did you end up thinking intelligence just boils down to prediction? Were you already primed for that from a neuroscience standpoint?

Blaise: Well, I was certainly primed from a neuroscience standpoint – first of all, to believe in general function approximation. I knew that neural nets are general function approximators, and therefore that probably some arrangement of artificial neural nets would be able to do what brains do. And that was why I had been excited to go to Google in the first place: it was clear that they were well ahead of everyone else at that point, and Google Research was in many ways like the Bell Labs of the 2010s. I really wanted to be where that progress was happening. And visual recognition challenges had already started to succumb to convolutional neural nets – a sign that things we hadn't been able to solve with conventional computer science, with traditional machine vision, were starting to yield to scaled-up neural nets. But it still seemed like a big leap from image categorization to general intelligence, right? To things like: write poetry, rephrase this story so the characters have different philosophical ideas about the universe, write code that performs some function I specify only in natural language. These kinds of tasks don't have a completely well-defined metric; you can't just set up these tasks and learn them in a stereotyped way. All of that seemed like leaps and bounds away – a kind of quantum leap from anything like classification.

So LaMDA and its predecessor, Meena, were really dumb relative to any LLM you can interact with on the web today. But they could already do those things, in a minimal form. They could write a short function that did not appear in the training data and would work. They could analyze a short poem. They could tell a joke and explain why it was funny. And they could do all these things that were absolutely not in the training data and generalize them. They could also do in-context learning - you could define a new term or concept, say, "now try applying this in such and such a way", and they could. So it became clear when I saw those things in LaMDA that the rest was just a matter of more scale and a matter of tweaking. I didn't see any profound qualitative jump needed from there to what I would consider general intelligence.

But that's not the majority view, and it still isn't. Although I should say my experience, anecdotally, has been that in recent years – and maybe even more so in recent months – regular people are perfectly comfortable saying, yes, LLMs are intelligent, AI is obviously intelligent. It's actually been more the specialists, the experts, who have all kinds of qualms about saying that. That might be something Kuhn, as you pointed out, would have predicted – in some sense it's exactly those experts who maybe have more at stake with respect to some preconceptions about what intelligence is. And, a little cynically, I think it might also have to do with the fact that if you're a smarty-pants researcher, you may be more invested in the idea of your own exceptionalism as an intelligent being.

"Regular people are perfectly comfortable saying LLMs are intelligent. It's been more the specialists who have qualms – those experts may have more at stake, more invested in certain preconceptions about what intelligence is."

Blaise Agüera y Arcas

John: Yes. And as you talk about in the book, also in terms of human exceptionalism more broadly – sort of our place in the universe, right? I think it's perhaps a little eerie to see what we think of as our unique capabilities turn out to be maybe not quite as unique as we had thought.

What is Intelligence?: A Guided Tour

John: So let's get into the book. For folks here at MIT who haven't read it, I tried to summarize it, but I found it incompressible. So first of all, go read it. But for now, let me offer my best attempt at some key takeaways for the benefit of listeners who haven't read it, and you can tear it apart and tell me what I missed or how you'd do it differently.

  1. Life is computational. You start by saying life is computational. Self-replication requires computation – going back to von Neumann – but it also emerges naturally in environments where computation is possible.
  2. Intelligence is predictive. We don't know all the details of how intelligence works, but we know what it does: intelligence is about prediction. This is what I call your "strong predictive intelligence hypothesis", which you argue unifies accounts in neuroscience and machine learning.
  3. Brains evolved to enable learning in real time. You then move on to the very beginning of life: the single cell, the bacterium still needs to control and therefore predict its environment to survive. The first brains then emerge to do this better and faster, as part of an arms race to learn and adapt in real time in a cybernetic kind of way.
  4. Theory of mind drove the intelligence explosion. Predictors eventually need to start modeling each other. This sets off an "intelligence explosion" in primates and humans leading to complex agency, free will, and consciousness – all of which, you argue, arise from theory of mind applied recursively to yourself.
  5. Brains are societies. Our sense of self is thus a predictive model – useful but also potentially misleading. The brain is not a unified entity but a "society of sub-intelligences," in the spirit of Marvin Minsky's Society of Mind.
  6. Language compresses our world models. Language was invented to share hidden mental states externally – it is a compression scheme for our mental worlds, capable of expressing all human sensory modalities. Thus, language models are receiving sensory input, they're just receiving it through a different modality.
  7. Chain of thought = Turing universality. You draw an analogy: without chain of thought, language models are analogous to System 1, jumping directly to conclusions; with it, they are able to reason step by step, using the context window as a Turing tape. This makes them Turing-universal.
  8. Therefore, LLMs are intelligent. Even though they may be implemented differently, computation is substrate-independent, and LLMs are fundamentally doing the same thing we are: unsupervised sequence prediction. Critiques that they lack world models and grounding, or can't understand causality, are either overstated or have been outright disproven.
  9. Gaps remain, but are addressable. You name memory, internal monologues, and individuation as specific areas for improvement, but you believe they are likely addressable. Looking to the future, you believe a unifying theory of intelligence based on "dynamically stable symbiotic prediction" is out there to be found, establishing common principles for biological & artificial intelligence.

So there's my best attempt at capturing the thrust of the core argument, for those who haven't read it.

Blaise: Yeah, you did a great job – and you're right, it is hard to compress. It's not a short book - it is 600 pages. And I didn't want to write a 600-page book. I get really irritated with books that could have been the length of a New Yorker article and feel padded out. So I tried to squeeze, but there's a lot in there. And I think you did a very good job compressing.


Why Start a Book About Intelligence with Life?

John: I'm glad to hear that, thank you for humoring me. Let's dive in deeper on a couple of points. I remember I was excited when I saw the title What is Intelligence?, because every definition of intelligence I'd seen was either a grab-bag or far too anthropocentric. Then I opened the book and it starts with abiogenesis – what is life – and it was not at all obvious to me how you ended up there. How did you decide to start a book about intelligence with the question of what is life?

Blaise: Right – I think it might be more obvious to think about a book about intelligence as a subset of a larger "what is life" book, rather than the other way around. So yeah, that wouldn't have been obvious to me a few years ago either. But basically there are a couple of reasons why.

On a practical level, I began working on artificial life experiments a few years ago, and thinking about the origins of life. A lot of the pioneers in AI, computer science, and neuroscience were also the pioneers of artificial life as a discipline. Although artificial life is still very underground – it's roughly where AI was in the 1970s or 1980s – it's actually a lot of the same characters. And I think the reason why is that if you believe we evolved intelligence only very late in the game – that the brain is a specialized organ that has intelligence, but none of our evolutionary forebears did – then you have a tricky question to ask yourself: what is intelligence, in a sense that is distinct from the general capacity to respond to an environment in a way that ensures your own continued existence? Why don't we talk about mushrooms being intelligent, or bacteria being intelligent?

And I began to realize it doesn't make a lot of sense to say mushrooms are not intelligent or bacteria are not intelligent. The mechanisms are obviously different – they don't have neurons – but if you start thinking about things functionally, in terms of modeling the environment, building conditional distributions, etc. – everything that is alive has that. This was the insight behind the cyberneticists in the middle of the 20th century, that sort of pushes the intelligence question back not just to the origins of brains but to the origins of life. Then you have to ask: how do these feedback loops that begin to model environments and control their own future come about in the first place? In that way, the origins of life and the origins of intelligence started to look like they collapsed into the same question.

And then as I began to do these experiments, something else came to light that really made a bunch of lights turn on for me: the importance of symbiogenesis in evolution. Traditional ideas about Darwinian evolution hold that we have a genome, random mutations happen, and it's like throwing spaghetti at the wall – whatever sticks, whatever gives a survival advantage to one particular genome, propagates differentially. And that's how evolution works.

But that neo-Darwinian synthesis is kind of deeply unsatisfying when you start to ask questions about origins – it's not clear how something like that could get started. How does life come about in the first place? And when you start to really dig into origins-of-life questions – and also into why life becomes more complex over time – there's no reason simple life should come first, and more complex life should come second. If you're just making random changes, there's no directionality to that process. You pretty rapidly come upon this idea of major evolutionary transitions, as John Maynard Smith and Eörs Szathmáry called them: the observation that life often combines to form more complex life. Multicellularity is symbiogenesis. Hive creatures like bees and ants are a kind of symbiogenesis. And human culture, which comes about through the cooperation of lots of individual people, is another instance of symbiogenesis.

This gives you an arrow of time, because in order for simple things to come together to make more complex things, the simpler components must have pre-existed the more complex ones. So it really is a constructive process. And that gives you a scaling law. Why are bigger brains more complex than smaller ones? Because they have more parts – more stuff has come together. So scaling laws start to look similar whether you're talking about scaling laws of brains, or of societies – there's been a bunch of work at the Santa Fe Institute and elsewhere about that – or scaling laws of AI models, which is just: throw in more artificial neurons, more parameters, and you get more intelligence. So this symbiogenesis story explains why, when you parallelize more, you get more intelligence. And once you understand the cybernetics story – feedback loops and how they arise – and the symbiogenesis story – how things scale – a lot of the pieces come together, and it starts to make intelligence and life look like two sides of the same coin.


The BFF Experiment: Life from Randomness

John: This process you're describing, it starts with the computational life experiment. We read your team's computational life paper here at the lab, and it provoked some great discussion. Could you tell our listeners about the BFF experiment and what you think it shows?

Blaise: The BFF experiment – the first "BF" stands for "Brainfuck." That's the name of a programming language invented in the '90s by a Swiss graduate student in physics. He named it Brainfuck because it's impossible to program in. It's basically a programming language with eight instructions very close to the fundamental moves of a Turing machine – a simple toy model of a minimal computer that Alan Turing designed as part of the proofs that formed the backbone of his famous 1930s paper that gave birth to computer science. So it's got these eight instructions that essentially move a head back and forth on a tape, increment and decrement bytes on that tape, and execute jumps.

The experiment consists of taking a soup of tapes. The tapes are fixed length – 64 bytes – and start off with random bytes on them. You have, say, a thousand or a few thousand of them in a bucket. You scoop up two tapes at random, stick them end to end, and run the program – whatever it is – on those tapes. It's a slightly modified version of Brainfuck that is self-modifying: there's no separate console, input, or output. Anything they can do is really just about changing the contents of those tapes in situ. In the beginning, of course, they don't have working programs on them at all – just random bytes.

So nothing much happens at first. After running them, you pull them back apart, put them back in the bucket, mix them up, and pick two out again. Repeat. It's just this process of running random programs that can modify themselves. And what you find is that after a few million interactions, as if by magic, programs emerge. These programs become increasingly complex over time and look like self-replicators. They copy themselves and each other – and this happens even when there is no random mutation. It doesn't require random mutation in the neo-Darwinian sense.
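The soup dynamics Blaise describes can be sketched in a few dozen lines. This is a minimal toy in the spirit of BFF, not the published system: the exact opcode set and head semantics here are simplifying assumptions, and a real run needs millions of interactions (and instrumentation for detecting replicators) before anything interesting emerges.

```python
import random

TAPE_LEN = 64    # bytes per tape, as in the experiment described above
SOUP_SIZE = 128  # a small soup; the real experiment uses thousands of tapes

def run_bff(tape, max_steps=256):
    """Run a self-modifying Brainfuck-like program in place.

    Two data heads (h0, h1) move over the same tape that holds the
    program; '.' and ',' copy bytes between the heads, so programs can
    rewrite themselves and each other. Non-opcode bytes are no-ops."""
    ip, h0, h1 = 0, 0, len(tape) // 2
    steps = 0
    while ip < len(tape) and steps < max_steps:
        op = tape[ip]
        if   op == ord('<'): h0 = (h0 - 1) % len(tape)
        elif op == ord('>'): h0 = (h0 + 1) % len(tape)
        elif op == ord('{'): h1 = (h1 - 1) % len(tape)
        elif op == ord('}'): h1 = (h1 + 1) % len(tape)
        elif op == ord('+'): tape[h0] = (tape[h0] + 1) % 256
        elif op == ord('-'): tape[h0] = (tape[h0] - 1) % 256
        elif op == ord('.'): tape[h1] = tape[h0]   # copy h0 -> h1
        elif op == ord(','): tape[h0] = tape[h1]   # copy h1 -> h0
        elif op == ord('[') and tape[h0] == 0:     # jump past matching ']'
            depth = 1
            while depth and ip + 1 < len(tape):
                ip += 1
                if tape[ip] == ord('['): depth += 1
                elif tape[ip] == ord(']'): depth -= 1
        elif op == ord(']') and tape[h0] != 0:     # jump back to matching '['
            depth = 1
            while depth and ip > 0:
                ip -= 1
                if tape[ip] == ord(']'): depth += 1
                elif tape[ip] == ord('['): depth -= 1
        ip += 1
        steps += 1
    return tape

def soup_epoch(soup, rng):
    """One round of random pairwise interactions: concatenate two random
    tapes, run the combined contents as a program on itself, split back."""
    rng.shuffle(soup)
    for i in range(0, len(soup) - 1, 2):
        joined = bytearray(soup[i] + soup[i + 1])
        run_bff(joined)
        soup[i], soup[i + 1] = joined[:TAPE_LEN], joined[TAPE_LEN:]

rng = random.Random(0)
soup = [bytearray(rng.randrange(256) for _ in range(TAPE_LEN))
        for _ in range(SOUP_SIZE)]
for _ in range(10):  # the real experiment runs millions of interactions
    soup_epoch(soup, rng)
```

Note that there is no fitness function, no explicit mutation, and no selection step anywhere in the loop – exactly the point Blaise makes: random encounters between things that can modify each other are enough.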

It's a parable – and maybe more than a parable – about how life arises and why. Basically, life arises because it transforms the matter around it into more life, and in that sense it is more stable than non-living matter. Living things are more stable than non-living things because they actively create more of themselves. And if the barrier to making a self-replicating program is not that high, it will arise by chance. Moreover, you don't need to rely on truly random events like cosmic rays – it's enough to have random encounters between things that can do stuff in the world and change how they behave conditionally on that world.

At a minimum, a programming language needs to be able to change state and to express an if-then conditional. And those requirements are already there in chemistry: a reaction obviously changes the world – reactants become a product – and it's very easy to make conditionals in chemistry as well. So this gives you a really nice way to go from chemistry to something that looks like it replicates. And in fact, since the book was published, there was a cool paper in Science showing that there is an RNA sequence that can replicate itself just like a BFF program – a relatively short sequence, exactly the kind of thing that could have arisen through random interactions in a soup of monomers that can make RNA-like molecules. So I think it's easy for life to arise – it's not that big a leap. And once you have things that can replicate, they can start to combine – and that's how you get this ratchet toward more and more complex things.

But what this also highlights is how important cooperation is. The connection between two things combining and two things cooperating is a very close one. The moment you have two different replicators that in some way favor each other - maybe one creates the conditions for the other to replicate and vice versa – you have the preconditions for them to start working so closely together that they become obligate partners. And that's exactly that symbiotic process – which ratchets up and up – that we've seen now happen on a large scale dozens of times on Earth, and I think is happening on a smaller scale all the time. The more closely you look – at horizontal gene transfer in bacteria, viral symbionts in fungi – when you put on your symbiogenesis eyes, you begin to see it absolutely everywhere. It doesn't look like life is driven by random mutations at all. It looks like it's driven by these purposeful interactions between things.

John: And was that something you expected going into the experiment? Because the first iteration you ran, I believe you were running the experiment with mutation as part of it, and then afterward you tested without it. Was that something you were testing as a hypothesis, or did it come as a surprise?

Blaise: It absolutely came as a surprise. And I should say the team has also done some experiments more recently showing that at least the kind of complexity that arises in BFF – and it's a very simple experiment – is not greater than the complexity that can arise through random selection.

But the key is that you don't need mutation. And that was a big shock - I expected you would need mutation to get anything to happen – and I was really surprised when, turning mutation down to zero, we still got these replicators. So it's a different story from mainstream biology in the 20th century, at least. There are a lot of biologists who acknowledge these self-modification processes now. But I think it's also a paradigm shift in its own right that is still very incomplete in the biology world. They acknowledge that horizontal gene transfer happens and that symbiosis happens, but the full recognition that this is actually the main engine behind evolution – I think that's held by only a minority of biologists still.

"Life arises because it transforms the matter around it into more life, and in that sense it is more stable than non-living matter. Living things are more stable than non-living things because they actively create more of themselves."

Blaise Agüera y Arcas

DNA's Fractal Structure and the Dynamical Second Law

John: I thought one of the most striking parts of the book and the paper is when you link this insight back to the structure of our own DNA – both in terms of its compressibility, and the extent to which DNA is made up of replicators that aren't actually coding for proteins. That parallelism seemed quite remarkable, I thought.

Blaise: Yeah, that's right, I thought so too. The basic observation is that if you're just doing random mutations, you don't expect any particular large-scale structure in the DNA that arises from that. But when you look at our DNA, you see lots and lots of internal copying – sections that are replicated inside our DNA. And the easy way to see that is by looking at the compressibility of bigger and bigger windows. If you try to compress DNA by looking at more and more of it, the compression factor keeps increasing the bigger a window you look at – meaning a dictionary built from x amount of DNA compresses the rest of it better and better as x grows. You never run out of novelty, but your compressibility keeps improving nonetheless.

And that almost fractal-like structure is exactly what you'd expect when you're looking at a being made out of replicators which are made out of replicators which are made out of replicators – replication happening at every scale. And that's a really cool way of thinking about evolution: that we're really composites involving replicators not just down to the level of single cells, or down to viruses and transposons, but down to even smaller units that go all the way down to basic chemical reactions.
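The window-size effect Blaise describes is easy to demonstrate on a toy sequence. The sketch below (my construction, not taken from the paper) grows a fake "genome" by repeatedly copying chunks of itself back into itself – duplication at every scale – and then checks that a bigger compression window helps, whereas for pure noise it doesn't:

```python
import random
import zlib

def compression_ratio(data: bytes) -> float:
    """Compressed size over raw size; lower means more compressible."""
    return len(zlib.compress(data, 9)) / len(data)

rng = random.Random(42)

# Toy "genome": a little novel sequence grown by repeated self-copying,
# standing in for the replicators-made-of-replicators structure of DNA.
genome = bytearray(rng.randrange(256) for _ in range(512))
while len(genome) < 32768:
    start = rng.randrange(len(genome))
    genome.extend(genome[start:start + rng.randrange(64, 2049)])
genome = bytes(genome[:32768])

# Control: the same amount of data with no internal copying at all.
noise = bytes(rng.randrange(256) for _ in range(32768))

small_window = compression_ratio(genome[:2048])  # small window
large_window = compression_ratio(genome)         # 16x larger window
```

For the self-copying sequence, `large_window` comes out far below `small_window` – seeing more of the sequence keeps paying off – while the noise control stays essentially incompressible at any window size. (zlib's 32 KB match window is why the toy genome is capped at 32768 bytes.)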

John: Right, and the final piece on this paper, then, before we move on, is that this process makes life almost inevitable. First, you lay out the von Neumann argument that replicators must be universal Turing machines, establishing that computation is necessary for life and for self-replication to occur. And then this paper is almost the mirror image of that – the other half – which is to say: If computation is possible, is that sufficient to generate life? And I feel like this experiment shows – it doesn't prove, but it indicates – that life is much more likely to arise than we might expect, actually, in a way that's quite counterintuitive. When I read your review talking about "computational life", I was interpreting that initially as more of a metaphor. But the literal way in which you close the original von Neumann argument – I found that impressive.

Blaise: Exactly, yes. Thank you. I think it is both – to be taken seriously and literally. But I should say this is still under-theorized – there's definitely more work to do to make it mathematically rigorous. I hope it will spawn work of that sort. In that sense, I think a lot of what we've done here is kind of like Carnot – we have yet to have our Boltzmann of life. But we're in that stage.

John: Actually, Tomaso asked me about exactly this aspect of the paper – about the sort of "dynamical second law" you talk about, regarding the dynamical stability of replicators as a kind of counterweight to the dismal, static second law. He asked: do you have a mathematical formulation or a proof yet? Because thermodynamics does of course, as you're saying with Boltzmann. Is that something you've worked on already, or do you think there's an opportunity to build something like that?

Blaise: I have worked on that to some degree. There's some unpublished work that builds up some of that theory, though there are still some issues applying it to the BFF system in particular. So I'd say it's still a work in progress. But the framework I use for theorizing combines the Smoluchowski coagulation equations from statistical physics – early 20th-century work that shows that if you have things that can combine into larger things, you get a phase transition – with population dynamics. With those two things, I think you have the building blocks for a pretty general theory of symbiogenesis. But to really do this properly, you need to bridge several different disciplines – statistical physics, theory of computation, and some other parts required for this kind of compositionality and symbiogenesis work. So there's definitely work to do there – we're not done.
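For readers who haven't met them: the classical Smoluchowski coagulation equation tracks the number density n_k of clusters of size k as clusters merge pairwise with a rate kernel K(i, j). In its standard textbook form (not the unpublished extension Blaise describes):

```latex
\frac{d n_k}{d t}
  = \frac{1}{2} \sum_{i + j = k} K(i, j)\, n_i\, n_j
  \;-\; n_k \sum_{j \ge 1} K(k, j)\, n_j
```

The first term counts mergers that create a cluster of size k; the second, clusters of size k lost by merging with anything else. For some kernels – the multiplicative kernel K(i, j) = ij is the classic case – the system gels in finite time, forming a giant cluster: the phase transition he mentions.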

John: That'll be – well, we'll be interested to follow along with that. As you say, it's a very interesting empirical demonstration, and if it shows promise, hopefully it's something that can be formalized over time.

Blaise: I hope so.


Architectural Gaps: Statefulness and Feedback

John: Moving on, there's a second piece I wanted to get into from the book. Throughout the book you highlight deep similarities between biological and artificial intelligence – you argue that they're fundamentally both doing prediction, and that intelligence itself is substrate-independent. But you also flag some architectural differences that seem potentially significant, and I wanted to get your perspective on two of them specifically and how significant you think they might or might not be.

One piece that you emphasize – in terms of its role in biological computation – is statefulness: the maintenance of an internal state over time. You have a section on the Portia spider – excellent Adrian Tchaikovsky reference, by the way – and how maintaining and updating this internal state is absolutely central to biological cognition, to world models, to theory of mind, to long-term planning, to sense of self. Even with a tiny brain, it allows you to do some very impressive things.

Blaise: Right.

John: You then go on to say that, in contrast, feedforward ANNs are timeless and memoryless. For example, AlphaGo has no sense of self because it doesn't have a state. Yet theory of mind, which you emphasize as playing a huge part in the story of human higher cognition, requires maintaining and tracking this internal state. If this isn't something that most feedforward models – including transformers – have natively, is this something we should be worried about? Or is it something you think can actually be worked around, or in some way bolted on?

Blaise: Yeah, it's a great question. So, you know, I mentioned earlier that the reason transformers took off the way they did is precisely because of their ability to be parallelized during training – and that parallelization is possible precisely because they're not recurrent. So there's something about the way they fit with our computing infrastructure that is incompatible, in a way, with the very idea of recurrence. There has been some nice work on state space models and so on that are recurrent, and it seems clear that recurrent models can basically perform as well as transformers – but they are trickier to train in practice, and so I don't think any of the frontier models today are actually using explicit recurrence that way.

However, the big loophole is that transformers have this very large context window, and every time they emit a token, it goes into the context window – and the context window then shifts one step to the right. You can make a mathematical mapping between that scenario and just thinking about the context window as state, or change the topology of your network in such a way that you can turn that into an RNN. So they are stateful in practice when used dynamically, with a context window shifting one step to the right with every token emitted. And that statefulness is really important when you look at how OpenClaw-type agents can work together – not only to do tasks that require keeping track of stuff, but in some cases to do things that require self-improvement, which requires you to change your concept of what you're working on or what you're doing. So they hack around this lack of recurrent state by using files and moving files in and out of their context windows. We're actually seeing models today troubleshooting their lack of long-term memory by making all kinds of really clever hacks with their short-term memory – a little bit like the way somebody with memory problems might use a notebook to develop a different way of living, one that relies on externally keeping track of who they are and what they're doing.

So when I talk about AlphaGo or the old Nvidia self-driving car from 2016, they're not stateless only because they're not recurrent. They're stateless because there is no time-dependent context window that shifts one step to the right with every new input. If your input consists of a single frame of video and you make a decision, but then at the next moment you have another frame and don't have the previous frame, then that really is stateless. But that's not true if the previous frame is now part of your world, as is the action you just took. And that's the way transformers work in practice.
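Blaise's context-window-as-state point can be sketched in a few lines of toy code. All names here are hypothetical illustrations, not any production architecture: a policy that sees only the current frame is genuinely stateless, while one whose input window slides to include its own recent past behaves like a recurrent system.

```python
# Toy illustration of statefulness via a sliding context window.
# Hypothetical names; this models the idea, not a real system.

def stateless_policy(frame):
    # Decision depends only on the current observation (AlphaGo-style).
    return "brake" if frame > 0.5 else "go"

def windowed_policy(frame, window, size=4):
    # The window carries past observations forward: de facto recurrent state.
    past = sum(window) / max(len(window), 1)
    action = "brake" if frame + past > 0.5 else "go"
    window.append(frame)       # the new "token" enters the window...
    if len(window) > size:
        window.pop(0)          # ...and the window shifts one step
    return action

frames = [0.9, 0.1, 0.1, 0.1]  # one hazard, then a quiet road
history = []
decisions_stateless = [stateless_policy(f) for f in frames]
decisions_windowed = [windowed_policy(f, history) for f in frames]
```

The stateless policy forgets the hazard instantly; the windowed one keeps braking for a while afterward, purely because its own past sits in the window – the same loophole that lets a transformer behave statefully without explicit recurrence.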

John: Interesting. And relatedly, when you compared modern deep networks with the brain, there's this question of feedback. For those of us interested in both the brain and artificial intelligence systems: viewing cortex as a purely feedforward convolutional neural network is just, you know, not a very good picture. A very significant chunk of cortex is actually feedback connections that are top-down. In the book I believe you connect this explicitly back to the idea of "perception as controlled hallucination": it's not just bottom-up, CNN-like processing of input data, but you actually have top-down predictions that modulate incoming sensory data. And at one point I think you even say "cortex is basically an RNN, structurally."

Blaise: Yes, that's right.

John: How significant or problematic is it then that transformers lack these types of feedback connections? I think at another point you say that it would be horribly wasteful to imagine a CNN that takes, at every instant, the next image of the scene and processes it again feedforward all the way up – but in many ways, once you throw away your hidden states in a transformer, you're taking in the next token and then just doing it all again, right?

Blaise: That's exactly what you're doing. That's right. So I mean in short, I think this trick – where you keep your outputs in the context window and your history is in the context window as well – is a way of doing recurrence without recurrence. I don't think there's anything you could compute with recurrence that you could not compute with full access to your own past and the world's past, as the transformer has, with a sufficiently big context window. So in that sense, I don't think there's anything fundamentally different computationally about what you could do with recurrent connections that you can't do with a context window that includes those things.

That doesn't mean recurrence is pointless – on the contrary, it seems like it's a really, really good thing for efficiency. The fact that transformers are looking at an entire Lord of the Rings-sized context window to emit every single token is kind of crazy – it seems crazy wasteful. And there's a bunch of work happening – of course, AI researchers are obsessed with making their models more efficient – and there are definitely people working on this problem, and working on architectures that have recurrence of various kinds as well. So I expect it will change, and our models today will look both laughably small and laughably inefficient if we roll forward by 10 years.

John: I think that's right, there's a question of efficiency, but to me there's also an issue of what's learnable in practice. For example, you talk in the book about the cyberneticists and their vision, and parts of it were strong – but ultimately, the series expansions Norbert Wiener was using may have been universal function approximators just like neural networks, and that doesn't necessarily tell you what's actually going to be learnable in practice.

Blaise: Exactly.

John: So my mind goes to: While in theory you may be able to overcome the lack of recurrence by just having really large context windows, it may still affect the types of things transformers are able to learn in practice, even if you have this theoretical equivalence.


World Models and the "Jagged" Landscape of Today's LLM Capabilities

John: This brings me to my last question about the book: You have a section called "Parity Check" where you list distinctions that have been made between language models and the brain, and you bucket them into "probably false" and "probably true." In the "probably false" bucket, you include world models, which has been an issue of interest to me.

When I read closely, I think what you're saying is: Well, language models at this point have been established categorically to have some kind of world model. You're not saying that it's the same kind of model that the brain has, or of the same degree of quality – you're saying that it's more than none.

So Jacob Andreas here at MIT – who I think would agree that language models do have world models – has done some research into this area, and he has a wonderful blog post, in my opinion, talking about a sort of taxonomy of models: a lookup table, a map, an orrery, and finally a simulation – where with a simulator you could ask questions like: if Saturn were knocked out of its orbit, how would the system evolve? So a question that's on my mind – and Melanie Mitchell has written on this as well, in some compelling ways I think – is: let's say that yes, language models certainly have some type of world model, otherwise they wouldn't be able to answer many of the questions they do. But does that world model look more like a combination of local heuristics, or is it a true kind of global causal simulator?

Blaise: Right. And if they're limited in one of those ways, is that an architectural issue?

Well, my short answer would be: the real question we're asking with respect to this hierarchy of models is: how well do they generalize? If you just have a lookup table, it generalizes very poorly. If you have a simulation, it generalizes very well, and it's complete.
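The generalization gap between the two ends of that hierarchy is easy to make concrete. A hypothetical toy, not drawn from the interview or from Andreas's post: a lookup-table "model" of projectile range versus a simulator that encodes the generative law.

```python
import math

# Lookup table: three memorized (launch speed, range) pairs at a 45° angle.
memorized = {10.0: 10.19, 20.0: 40.77, 30.0: 91.74}

def lookup_range(speed):
    # Generalizes poorly: anything outside the table simply fails.
    return memorized.get(speed)

def simulated_range(speed, angle_deg=45.0, g=9.81):
    # Encodes the law R = v^2 * sin(2θ) / g, so it handles unseen speeds,
    # new angles, and counterfactual gravity for free.
    return speed**2 * math.sin(2 * math.radians(angle_deg)) / g

lookup_range(25.0)              # None: never memorized
simulated_range(25.0)           # generalizes to an unseen speed
simulated_range(25.0, g=1.62)   # counterfactual: the same throw on the Moon
```

The table answers only what it has seen; the simulator answers the "what if Saturn were knocked out of its orbit" class of question because it carries the dynamics, not just the outcomes.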

John: Exactly.

Blaise: And you know, I think anybody who says that LLMs do not have world models is just in denial. There are many, many things they do that absolutely require a world model of many kinds – including understanding spatial relationships, and understanding all kinds of things that even pure language models learn to do, because it's all encoded in language. Not necessarily efficiently – which is why you require vast amounts of language in order to sort of figure out these patterns. But with enough pre-training, with enough language, they seem to be able to figure all of these things out. However, clearly their model of 3D space from just processing language is not as good as it would be if they were also trained on all the video on YouTube or something like that. And if they were in turn learning 3D space multimodally by crawling through the world like a baby does, it seems obvious they would be able to learn a bunch of that with much less exposure. So environment matters, feedback matters, volumes matter, multimodality is obviously super useful, and the ability to do experiments in the world gives you a lot of leverage with respect to testing different generalizations, and discarding certain ones.

A lot of the gotchas that I see people cite – look, it doesn't stack these things up the right way, or look at the stupid thing it says about putting the goat in the boat, or whatever – there's a lot of gotchaism in people's accounts on Twitter, or whatever we're supposed to call it these days. And what I find so frustrating about those accounts is that people are the same, right? If you give a basic brain teaser to a few different people, some of them will get it wrong sometimes. And we never say: this one person got this wrong, so clearly people don't have world models. That's just nonsensical. You have to look at a distribution of those problems. We always give people the benefit of the doubt – we know we have world models – and we always give models the opposite of the benefit of the doubt: one failure means they clearly can't do it. See, we told you they couldn't!

This is a general issue I have with a lot of the gotchaism around AI. The models also get better at these things over time. As we scale the models up and do bigger and bigger pre-training, the error rates on a lot of these things go down in a pretty dramatic way. I still hear people talking about gotchas as if models get them wrong every time – when in fact they got them wrong every time a year ago and now get them wrong 1% of the time. So again, a lot of fuzzy, imprecise thinking about metrics.

But setting all of that aside, it is absolutely true that LLMs are jagged – they are remarkably good at some things humans on the whole are very poor at, and remarkably bad at some things humans mostly are pretty good at. They have different strengths and weaknesses, and that shouldn't be surprising. We have very particular brain architectures and histories and stuff that is pre-wired genetically, and all of that matters. It matters because we've evolved in a certain way, with pressure on certain kinds of things. So for me it's less a question of lacking or having this or that and more a matter of: what does the landscape look like? I would also say we are quite jagged too, right? So we're looking at two jagged things whose teeth and hollows don't quite match up – like two different keys.

"We're looking at two jagged things whose teeth and hollows don't quite match up – like two different keys."

Blaise Agüera y Arcas

The Future: Social Scaling, Neuroscience, and What We Still Don't Know

John: Ok, so we've talked about the past and the evolution of intelligence, and now we've talked about the present and the way you see the landscape right now. Looking to the future, if we reject this binary and you're looking instead at a more continuous but jagged landscape: Where do you see the biggest remaining unsolved challenges in AI, and what are some promising approaches? Will future architectures look like scaled-up versions of what we have now, or are new approaches needed?

Blaise: Well, my answers are biased by what the Paradigms of Intelligence team is working on right now. Or maybe another way of putting that is: I'm putting my money where my mouth is – the Paradigms of Intelligence team is working on the things that I think are the biggest opportunities right now.

I think social scaling is really important. This kind of multi-agent moment we just had – with AI agent societies – is very interesting, even if it was problematic in a lot of ways, with humans involved in complex ways that were not well controlled for. James Evans, Benjamin Bratton, and I wrote a piece recently about this. I'm really interested in how multiple agents collaborating can collectively make a bigger intelligence – and this will be unsurprising given the focus on symbiogenesis, on parts making wholes, that we were just talking about. For me, carrying that fractal forward – not just scaling single models but looking at how ensembles of models work together – is really the key: not only to thinking about how intelligence scales further, but also to how modalities combine better and how we should rethink alignment, rather than just saying "Oh, AI should align to human values." Obviously humans have all kinds of different values – what matters is how we work together. Let's understand the basic principles of that. And this even segues into questions about politics and economics: how do we think about thriving in societies that include lots of AIs and lots of people? So for me it's those questions about multiplicity, as opposed to just individual models, that are the most fruitful right now.

John: Speaking of the Google Pi team – I also found interesting the paper on emerging temporal abstractions in autoregressive models you put out recently. Going back to our earlier discussion about the inefficiency of dealing with things at the individual input token level, it feels like part of the idea with this paper is that by digging deeper into the latent space, with this internal reinforcement learning structure, maybe some of these richer higher-level abstractions are already there. We just need to figure out how to control them more efficiently. Is that right? And I'm curious where you see this research direction going.

Blaise: 100%, yes. And by the way, that is very much neuroscience-inspired – it's really about thinking through the relationship between the basal ganglia and the cortex. A lot of our motivations and high-level behaviors are controlled by this rather simple and ancient structure in the center of the brain – the basal ganglia. But the complexities of what all of those actions are, how to do them, etc. are of course based on complex patterns learned by the cortex. And in turn there's an interplay where you can develop simple behavioral patterns that then serve as scaffolding for more complex behavioral patterns to be learned and then controlled in turn by the basal ganglia. Very much inspired by that.

We know that those abstractions exist in LLMs and other neural nets we've built. But that sort of hierarchical control, and the ability to compositionally form new behaviors out of old ones, may not be so easy for the current generation of models because of a limitation in the architecture. That's exactly what we're exploring with that work.

John: That leads into my next question: I believe one of your focus areas at Google Pi is neural computing. As someone who has spent time thinking about both neuroscience and AI, do you still see lessons that AI has to learn from neuroscience? And if so, which lessons? Because I think one of the challenges is, of course, there are many differences between the brain and AI, but as you point out in the book, some of them may not be so significant ultimately toward the functioning and capabilities of the system. So: Are there certain areas that you think might show particular value as we look for inspiration for future neural networks?

Blaise: I don't know, but it is clear to me that we have a great deal left to learn in AI and a great deal left to learn in neuroscience, and one way or the other the things we learn in one area are going to apply to the other. Not everything will apply, obviously – it's not clear that we're always going to have convergent evolution between the two. Transformers are an instance where we haven't had convergent evolution, although there is some interesting work suggesting that the multiplicative interaction, the attention mechanism, may not be unlike what's happening in tripartite synapses, for instance with GABA. So I think the likelihood of learning from that interface remains very high in the future, as it has been for the past 50 years. People at the center of those revolutions have almost always overlapped strongly. The transformer is the only instance I know of where there wasn't a direct inspiration from neuroscience.

So things that I feel like we don't yet fully understand in neuroscience – that as an AI researcher I would absolutely love for us to know about, because that would absolutely give us inspiration – include: exactly how does the hippocampus work? Exactly how do the basal ganglia work? What's the cortical loop all about? What is the learning rule? We still don't know what the damn learning rule is, right? Is there one learning rule or are there multiple ones? Does it involve a different thing happening in cortex versus elsewhere? What are the roles of things happening at multiple scales – at spines and dendrites and so on? I heard of a potential learning rule just this past year involving myelination – differing levels of myelination will change the speed of propagation along an axon, and if you change the order in which spikes arrive at a neuron, you change the response: that's the learning rule there. There are many mechanisms we don't yet understand, both at a cellular level and below, and at a systems level. And as an AI researcher, I would sure like to know all of those things, because I'm quite sure there would be lessons to be learned there.


For Fans of Science Fiction

John: Right, I was going to bring up – I don't think we have time to get into it today, but you had some interesting references in the book about the hippocampus, in particular the Tolman-Eichenbaum model and potential parallels with the transformer. More broadly, if memory does remain a gap or an issue for transformers – looking to biological inspiration there certainly seems like a promising direction. But we'll save that for another time. Final question for today: I noticed you have outstanding taste in science fiction. You referenced both Ted Chiang and Adrian Tchaikovsky – two of my favorite sci-fi authors – in this book. Do you have any other good recommendations for sci-fi fans out there?

Blaise: Oh, now that's a great question. You know, actually, the chapter titles and a bunch of things in the book refer to lots of different science fiction references. I'm glad you like Adrian Tchaikovsky and Ted Chiang, I love them both. Ted is also a friend, even though he's a big AI critic – so we have friendly arguments about that. But I think he's the most gifted sci-fi writer of our generation, certainly for short stories.

Another of my favorites is Kim Stanley Robinson. His writing on ecology is great. He's quite a nontraditional writer – his stuff is not very plot-driven – but I think it's brilliant.

I don't know, there are a lot of sci-fi writers I really like, so it's a bit of a tricky question. But I've always loved Octavia Butler as well.

John: Well, I think unfortunately that's all of our time today – but thank you for those recommendations, I will definitely check them out! For folks here at MIT: if you enjoyed the conversation, I highly recommend you check out What is Intelligence? – we've really just scratched the surface today. The book is thought-provoking, it's exciting, it's a joy to read. So go check it out, and come join us when Blaise comes to visit here at MIT on March 16th. Blaise, thank you again for joining me, and thanks for the conversation – it's really been a pleasure.

Blaise: It has. John, thank you so much for the really great questions – and I'm really looking forward to visiting.


About Blaise Agüera y Arcas
Blaise Agüera y Arcas is a VP and Fellow at Google, where he is the CTO of Technology & Society and founder of Paradigms of Intelligence (Pi). Pi is an organization working on basic research in AI and related fields, especially the foundations of neural computing, active inference, sociality, evolution, and Artificial Life. In 2008, Blaise was awarded MIT's TR35 prize. During his tenure at Google, he has innovated on-device machine learning for Android and Pixel, invented Federated Learning, and founded the Artists and Machine Intelligence program. A frequent public speaker, he has given multiple TED talks and keynoted NeurIPS. He has also authored numerous papers, essays, op-eds, and chapters, as well as two previous books, Who Are We Now? and Ubi Sunt.

Blaise will be speaking at MIT on March 16th at 4pm in Singleton Auditorium (46-3002). His book, What is Intelligence?, is available for free online at the Antikythera project.