

Simulating humans is incredibly hard: done naively, it would require simulating the human brain as an intermediate step. Yet recent advances in AI have exposed a new line of attack, making human simulation somewhat tractable by restricting it to simulating behaviour rather than physical processes. In this short blog post we will cover why approaching the human simulation problem from biology is unwise, how to make it more tractable, and why the reduced problem is still very hard. We will then stress why creating digital replicas of ourselves that can interact with AI agents at machine speed is valuable, and how it will influence the creation of Artificial General Intelligences.

Simulating Humans the Hard Way

Suppose I task you with the following problem: build a digital simulation of yourself. Don’t worry too much about the why (we will cover that later). There are no particular constraints on time or budget; just focus on building the most realistic simulation, with the understanding that the same techniques should be usable to simulate arbitrary humans. Tell me the plan, and I will tell you if it fits what I want.

After some thought (and after doubting my sanity), your first answer might be mind uploading: physically scanning your brain and nervous system into a computer to recreate a disembodied version of yourself.


Even though I have to admit it is an interesting proposal, I forgot to tell you about a constraint you just broke. I am not really interested in “you”, but rather in a “you-like” simulation. I want something tunable, that can be configured to simulate an arbitrary person. A mind upload is too close to actually being “you” for my taste!

The next option on the table might be brain simulation (https://en.wikipedia.org/wiki/Brain_simulation), that is, building a bottom-up computational model of a biologically inspired brain. The main idea is that simulating the core biological building blocks, such as biological neurons, will lead to the emergence of a biological mind. Buying into the idea that intelligence is substrate-independent is not too hard, so this seems viable. But let me add yet another constraint: I would like something we can build reasonably soon, and reverse engineering the brain seems really far off. First of all, we do not have a good grasp of what we want to simulate, the brain and the nervous system (https://www.frontiersin.org/journals/systems-neuroscience/articles/10.3389/fnsys.2023.1147896/full). Our understanding of the behaviour and function of individual neurons is still an active subject of research, with the downside of working at biology speed: experiments are expensive and hard to reproduce. Access to a working brain can only be scaled with non-invasive techniques, and even those are still maturing. So even before tackling the computational problem of simulating a biological brain (see the graph below for an order-of-magnitude estimate of the FLOPS needed to simulate a brain at increasing levels of detail; we are way off!), there is too much uncertainty about what we want to simulate, and resolving it will require expensive physical experiments.

[Figure: order-of-magnitude estimates of the FLOPS required to simulate a brain at increasing levels of detail]
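To get a feel for why we are way off, here is a rough Fermi estimate of the compute needed just for a spiking-level simulation. All numbers below are commonly cited orders of magnitude, not precise measurements, and finer levels of detail (molecular dynamics, for instance) push the requirement up by many more orders of magnitude.

```python
# Back-of-envelope estimate of the FLOPS needed to simulate a human
# brain at the level of individual synaptic events. Every constant is
# an order-of-magnitude assumption, not a measured value.

NEURONS = 8.6e10              # ~86 billion neurons in the human brain
SYNAPSES_PER_NEURON = 1e4     # ~10,000 synapses per neuron
MEAN_FIRING_RATE_HZ = 1.0     # conservative average firing rate
OPS_PER_SYNAPTIC_EVENT = 10   # assumed FLOPs to update one synapse per spike

flops = NEURONS * SYNAPSES_PER_NEURON * MEAN_FIRING_RATE_HZ * OPS_PER_SYNAPTIC_EVENT
print(f"~{flops:.0e} FLOPS sustained, just for a spiking-level model")
```

Even this crude lower bound lands around 10^16 operations per second, sustained, before accounting for neuromodulation, plasticity, or sub-neuronal detail.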

The final nail in the coffin for the brain simulation approach is that such a hard-won simulated brain would be a blank slate, which would need to interact with meaningful sensory stimuli to recreate human intelligence. Simulating all those stimuli, including the social interactions that shape an individual mind, is a hard problem in its own right.

Your final proposal could be the following: what if we give up on simulating humans themselves and limit ourselves to simulating human behaviour? We treat a simulated individual as a black box and only care about accurately predicting its outputs in response to a sensory stimulus. We can learn this mapping function by observing a lot of data. It’s not biologically accurate, but it’s useful. I believe this can work reasonably soon, and that’s what we cover next.

Capturing Behaviour: Machine Learning

Conceptually similar to the transfer function of control theory or Green’s function in physics, we can model human behaviour as a master function which takes as input:

  1. An external stimulus (audio, video, text, etc.)
  2. The environmental context (when, where, what happened before)
  3. A human individual

and returns an output as text, audio and more. While finding an analytic causal model would be ideal, it is much easier to use machine learning to extract correlations from data and obtain an approximation of this function. This is statistical pattern recognition: given some inputs, what is the most likely output, given all the data we have seen?
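The master function above can be written down as a type signature. A minimal sketch, where all the class and field names are illustrative (not from any library), and the body is deliberately left unimplemented since the whole point is to learn it from data:

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class Stimulus:
    modality: str        # "text", "audio", "video", ...
    payload: Any         # the raw content of the stimulus

@dataclass
class Context:
    when: str                             # timestamp or time period
    where: str                            # location / channel
    history: list[str] = field(default_factory=list)  # what happened before

@dataclass
class Individual:
    traits: dict[str, Any]  # demographic / psychographic characterisation

def behaviour(stimulus: Stimulus, context: Context, individual: Individual) -> str:
    """Master behaviour function: an ML model approximates this mapping
    from (stimulus, context, individual) examples; it is not hand-written."""
    raise NotImplementedError("learned from data, not coded by hand")
```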

Even taking the statistical approach seems daunting, but there are some big simplifications that we can make.

A first drastic simplification concerns the output: we give up on the idea of simulating a full biological output in favour of a narrow behavioural output. For instance: what would such an individual say in response to that slogan? On a more ambitious level, one could still aim to build simplified biology-related models by focusing on aggregated neural patterns, such as electroencephalography (EEG) simulations.

A second big simplification concerns the human individual: for a given use case we might be willing to characterise a human by a few demographic traits, say age and gender. At the opposite extreme, a complete model could feature thousands of traits covering an individual’s status, preferences, psychographics and more.
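The two extremes can be made concrete. A tiny sketch, where every trait name is purely illustrative:

```python
# Minimal characterisation: a couple of demographic traits.
minimal_profile = {"age": 34, "gender": "female"}

# Richer characterisation: hundreds or thousands of traits covering
# status, preferences, psychographics, etc. Shown here is only a
# tiny illustrative slice of such a profile.
rich_profile = {
    **minimal_profile,
    "occupation": "teacher",
    "household_size": 3,
    "risk_tolerance": 0.2,        # psychographic score in [0, 1]
    "media_diet": ["podcasts", "short video"],
    # ... thousands more trait keys in a complete model
}
```

The choice of where to sit between these extremes is a trade-off between data availability and simulation fidelity.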

It is not advisable to simplify the stimulus and the environment much, since these are data points we can obtain relatively easily.

Machine learning is the perfect tool for the job: we can leverage the powerful algorithms and optimised hardware already in use for Large Language Models. In fact, if we restrict ourselves to text-in, text-out interactions, Large Language Models can already be leveraged to build primitive human simulations.
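A primitive text-only simulation can be prototyped by conditioning an off-the-shelf chat LLM on a persona description. A minimal sketch: the prompt construction is real Python, while `call_llm` is a placeholder for whatever chat-completion API you use, not a real SDK call.

```python
def build_persona_prompt(traits: dict, context: str) -> str:
    """Compose a system prompt that conditions a chat LLM to answer
    as a specific simulated individual."""
    trait_lines = "\n".join(f"- {k}: {v}" for k, v in traits.items())
    return (
        "You are simulating a real person with these traits:\n"
        f"{trait_lines}\n"
        f"Context: {context}\n"
        "Answer exactly as this person would: briefly, informally, and "
        "admit ignorance on topics this person would not know about."
    )

def simulate_reply(stimulus: str, traits: dict, context: str, call_llm) -> str:
    """call_llm(system=..., user=...) is injected: plug in any chat API here."""
    system = build_persona_prompt(traits, context)
    return call_llm(system=system, user=stimulus)
```

Note the explicit instruction to admit ignorance: as discussed below, a raw LLM will otherwise happily answer about everything at length, which no real person does.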

So why is simulating humans still hard? We lack the main ingredient: data! To capture the desired correlations we would, in principle, need data covering all possible permutations of inputs and outputs: multimodal stimuli, detailed environmental context and tags for each of the individual traits. This combined data just doesn’t exist at the moment; the best data sources we have usually tick only a few of the boxes. For instance, if we are interested in simulating human responses to a text message, we rarely get more than the textual reply, missing out on the context of the response, the unconscious factors leading to it, the emotional states, and so on. Current out-of-the-box LLMs have been trained mainly on non-conversational data (books, websites, etc.), so they are not ideal even for purely textual interactions. Indeed, they are actively bad at human simulation, since they will unrealistically answer about everything, at length!
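What a training example that ticks all the boxes would look like can be made concrete. A sketch of one fully annotated record, with illustrative field names and values (real datasets rarely cover all four parts at once):

```python
import json

# One fully annotated behaviour example: stimulus + context +
# individual traits + observed response.
record = {
    "stimulus": {"modality": "text", "content": "Try our new energy drink!"},
    "context": {
        "timestamp": "2024-05-01T18:30:00Z",
        "channel": "instant_message",
        "history": ["same ad shown twice this week"],
    },
    "individual": {"age": 27, "gender": "male", "caffeine_habit": "heavy"},
    "response": {"modality": "text", "content": "No thanks, trying to cut down."},
}

line = json.dumps(record)          # one line of a JSONL behaviour dataset
assert json.loads(line) == record  # round-trips cleanly
```

Most available data gives you the `response` and a fragment of the `stimulus`; the `context` and `individual` annotations are the expensive, usually missing parts.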

One can imagine classifying levels of human-simulation realism according to the quality of the data that went into training the machine learning models.

  1. Shallow Personas: simulated text-based humans for interactive conversations. LLMs with some prompting and fine-tuning on conversation data are at this level.
  2. Multimodal Personas: as above, but trained on conversational multimodal data such as audio and video.
  3. Agentic Personas: trained on action data to simulate how an individual would react to a given input.
  4. Value Personas: simulated humans with definite preferences, wants and needs. This requires a large amount of heterogeneous data points, covering enough input-output pairs for a given human profile to enable the extrapolation of a value function, that is, what the individual cares about.
  5. Unique Personas: simulated humans with definite personalities, acquired through the equivalent training time of a simulated life. Here it is particularly important to leverage data rich in environmental context, including interactions with other human simulations, spanning a long time frame.
  6. Bio Personas: sophisticated simulations featuring simulated neural brain activity, allowing real-time simulation of unconscious responses such as emotions. These would require adding physiological data to the dataset.

The Impact of Simulated Humans

Lastly (contrary to best practices!) let’s discuss the why: why bother building human simulations?

With the rise of increasingly intelligent AIs, we will need a way to keep up in the digital world, a way to talk to AI agents at machine speed. We need digital replicas of ourselves to let machines understand and talk with us. If we don’t build them, we either forgo the full benefits of automation or things get dangerous quickly. In the first scenario, we are forced to keep humans in the loop, long term, for any impactful action an agent can take; considering that AIs can do inference at millisecond scale and in parallel, we are giving up a 1000x efficiency boost. In the second scenario, as we increase our reliance on intelligent AIs that don’t really understand our values and preferences, we will end up with poor results. Approaching human-level AGI, the risk becomes potentially existential, so it is imperative that we get good at Artificial Human Intelligence (AHI), that is, high-fidelity human simulation, to steer the development of AGIs (where the A usually means Artificial, but here it is more fitting to say Alien). Alien intelligences are in principle poised to reach a higher level of intelligence than humans (however we choose to define it; see here for an attempt), so we should make sure they understand us, at machine speed, during their training runs.

Regarding more down-to-earth applications: simple human simulations can be excellent for debugging digital products and AI agents. Advanced human simulations could revolutionise how we develop, market and sell products by allowing fast, hyper-detailed A/B testing and counterfactuals. At large scale, human simulations could help formalise and accelerate the social sciences (psychology, anthropology, sociology, economics, etc.) and prove instrumental in designing new economic and government policies. Bio Personas could accelerate neuroscience research. Finally, there are obvious applications in entertainment and AI companionship.

This is just the tip of the iceberg of what can be done! In a future part 2 of this blog I may go deeper into how to build useful human simulations in practice. I hope this post was stimulating; please do reach out if you are interested in building or using human simulations!

Thanks to Francesco Corallo and Mathusan Selvarajah for helpful comments on an early version of this article.
