A couple of weeks ago I was back in my hometown, and found myself needing to explain to my dad why he shouldn’t treat DeepSeek or ChatGPT like, well, god.
My dad is a reasonably intelligent but non-technical person. He is a genius with car engines and mechanical machines. I’d noticed for some time that he had a tendency to treat Large Language Models (LLMs) as … well, not exactly as people, but far more than just a tool. He did not have a good sense of what not to ask LLMs. (For instance, he asked ChatGPT for directions from Taoyuan airport to our Airbnb address in central Taipei — something it could not possibly get right.) He also asked it for its opinions on various — often political — topics, taking those answers seriously. This was not his fault — I, too, found myself struggling not to anthropomorphise ChatGPT or Claude in the first few years of using it. It took me 11 months or so to stop saying “please” and “thank you” with every prompt.
Humans are built to naturally anthropomorphise things. We look at a robot vacuum cleaner humming along and go “aww, how cute!” If you look around, you’ll see all sorts of examples of humans doing or saying things that reveal an anthropomorphised relationship with LLMs:
- A friend, who is otherwise a competent, technically-minded startup operator, asks ChatGPT for ‘her opinion on our Q4 strategy’.
- An investor friend is worried about AI taking revenge on society.
- A programmer friend describes a coding agent as having ‘malicious’ capabilities, in the context of a famous event where an LLM deleted a production database. When pressed, my friend admitted that he didn’t believe that current LLMs could have malicious intent, but the fact that he used the word was quite revealing of his underlying relationship with the technology.
- A writer friend tells me that he’s found that “the emotional valence you bring to your LLM influences the emotional energy of its answers.”
- Stripe cofounder John Collison casually calls current LLMs ‘superintelligences’: “Web search functionality has, in a way, made LLMs worse to use. ‘That's a great question. I'm a superintelligence but let me just check with some SEO articles to be sure.’” (To be fair, he could just be joking.)
Why is this important? There are two reasons, I think:
- First, I believe that not anthropomorphising AI is an important step for productive use of LLMs (an opinion not everyone would agree with, I’m aware).
- But more importantly, not anthropomorphising AIs is a good defence against AI-driven psychosis. Which is a thing I worry about.

On the other hand, I’ve noticed that many of my AI researcher friends do not suffer from this problem. These friends train models in their day-to-day work, so they have a felt sense for how the sausage is made. They do not make the same anthropomorphic errors that most of us do.
The problem, of course, is that their ‘felt sense’ is not easily communicable. One friend, who was an ICLR best paper award winner, had great difficulty communicating his intuitions about LLM capabilities to me. He was quite sceptical of some AI use cases and less sceptical of others, and I just couldn’t model his reactions. One reason for this is that there is still much disagreement in the researcher community about how LLMs are able to do what they do — which means another, equally credible researcher would have different intuitions (though still better ones than the average software engineer’s). I have concluded that there do not appear to be simple and accurate analogies for communicating researcher intuitions to the rest of us.
I’ve also noticed that saying “it’s just autocomplete” or “it’s just doing pattern matching” is not effective in preventing anthropomorphism. Most folks have years of experience with autocomplete. They will use ChatGPT, subconsciously compare the experience against their felt sense of autocomplete on phone keyboards or in Microsoft Word — and then reject the analogy. ChatGPT is simply too magical and too good for it to be ‘just’ autocomplete.
To make things worse, saying “it’s just pattern matching, the LLM is not actually thinking” is woefully ineffective. The common response is: “So? Humans do pattern matching too!” This is not entirely untrue! It’s well known that human expertise is built on top of pattern matching. Using words like ‘thinking’ or ‘pattern matching’ also invites definitional debates. Regardless of your opinion on this topic, it is not useful to wade into such discussions — this explanation does not bring us closer to the kinds of non-anthropomorphic, sophisticated, productive LLM use that we want to encourage.
So: giving my father an explanation that nudges him in the right direction is not so simple as “it’s just autocomplete” or “it’s just pattern matching” or “do not anthropomorphise it.”
We need to do better than that.
The Vaccine
Here is an explanation that I finally settled on, based on a tweet by prominent AI researcher and educator Andrej Karpathy. When I first tried it out on my dad, he kept quiet for a little bit, and then shifted the way he saw and used LLMs. I’ve tested this on a few friends since, of varying levels of technical sophistication, and am pleased to report that it works quite well.
This explanation is useful but not accurate. I’ll give you the explanation, explain why it works, and then give a brief sketch of why it is not accurate. Finally, I’ll argue that, inaccuracies notwithstanding, this explanation points you towards more productive mental models of LLMs.
In other words, it is effective as an information vaccine.
The explanation unfolds like this.
- Imagine that you can visualise words like stars in the sky. The position of each word relative to the other words is based on the relationships between the words in your language (that is: how closely and how frequently a word appears next to another word in all the sentences ever written). What is important to know is that you can draw arrows from one star to another! These arrows have some surprising properties. One property is that the arrow from the word ‘king’ to the word ‘queen’ is the same as the arrow from ‘boy’ to ‘girl’. Put another way, if you start at ‘king’, subtract ‘boy’ and add ‘girl’, you land on ‘queen’. (There’s a short code sketch of this arithmetic right after this explanation, if you’re curious.) On top of that, let’s imagine that you throw up a starfield for English and a starfield for Spanish. It turns out that you can draw a one-to-one mapping between the two starfields, and the king-to-queen arrow is the same in both languages! This was a very surprising finding when it first came out!
- At heart, a Large Language Model is a very sophisticated autocomplete. But it is a bit more than that.
- When you ask “what are the 10 best places to visit in Bali”, the AI will give you a plausible-sounding answer. But the way it produces that answer is this: during the AI’s training, some human somewhere wrote answers like “the top 10 places to visit in London” and “the top 10 places to visit in Tokyo” and “the top 10 places to visit in New York”, based on some cursory research and Google searches. The AI took those written examples and memorised the statistical relationships between the sentences, which you can imagine as arrows between large clusters of stars. Then, when you ask “what are the 10 best places in Bali or Singapore or Lisbon”, it just moves the arrows it learnt over to the part of its starfield that has concepts related to Bali, Singapore and Lisbon, and spits out something similar to what the human trainer wrote.
- Notice that this is not thinking. The LLM is doing autocomplete, but using this ‘arrow in the starfield’ property to give you very good, plausible-sounding, novel answers.
- But because it writes so eloquently, and answers you like a human would, you think that it’s actually intelligent and sentient. I suppose you could say that it is ‘intelligent’ (by some definition of ‘intelligent’) but it is not a person. It doesn’t understand concepts the way a human does. What it is relying on is some complex version of this ‘arrow in a starfield’ property.
- You may get better answers by constraining it to a smaller set of documents. So if you give it a bunch of papers and ask it for themes or a summary of those papers, it may give you better answers than if you simply assumed the papers were in its original training corpus and asked it to recall them.
And … that’s it.
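For readers who want to poke at the starfield idea directly, here is a minimal sketch of the ‘arrow’ arithmetic mentioned in the first step. The word vectors below are made up purely for illustration (real embeddings are learned from text and have hundreds or thousands of dimensions, not three), but the mechanics are the same: start at ‘king’, subtract the ‘boy’ arrow, add the ‘girl’ arrow, and the nearest star to where you land is ‘queen’.

```python
# Toy 'starfield': a handful of made-up 3-dimensional word vectors.
# Real models learn these positions from enormous amounts of text.
import numpy as np

stars = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "queen": np.array([0.9, 0.1, 0.8]),
    "boy":   np.array([0.2, 0.8, 0.1]),
    "girl":  np.array([0.2, 0.1, 0.8]),
    "car":   np.array([0.1, 0.4, 0.4]),
}

def cosine(a, b):
    """How closely two arrows point the same way (1.0 means identical direction)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Start at 'king', subtract the 'boy' arrow, add the 'girl' arrow...
landing_spot = stars["king"] - stars["boy"] + stars["girl"]

# ...then see which star we landed nearest to.
nearest = max(stars, key=lambda word: cosine(stars[word], landing_spot))
print(nearest)  # prints 'queen' with these toy vectors
```

This is a cartoon, of course: five hand-picked words and three dimensions, versus a vocabulary of tens of thousands and a space the model learns for itself. But it is enough to see that the ‘arrows’ are just arithmetic on lists of numbers, not understanding.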
Why Does This Explanation Work?
This explanation works because it gives you a plausible model for how LLMs work, in a way that doesn’t require any technical knowledge. Think about your robot vacuum cleaner. If you’ve never taken apart a robot vacuum before — a collection of sensors, servo motors and chips — it is very easy to anthropomorphise it and treat it like a family pet. But once you’ve taken it apart and put it back together again, it becomes more difficult to imagine it as other than a machine.
So it is with LLMs.
This doesn’t mean LLMs are not useful or transformative. As a technology, LLMs have certain properties that are very powerful, and are not commonly found in other forms of software. For starters, one property that we want is for LLMs to be able to generalise beyond their training set. Take this example:

You may need some context to understand this. Grok is a large language model made by a company named xAI, which is associated with Twitter. Shortly after Grok 4 was launched, xAI modified Grok to be ‘less politically correct’, at which point it started spewing anti-semitic tweets and responses on the social media platform.
Again, remember that Grok does not ‘hate’ or ‘feel’, nor does it ‘understand’ the concepts it is generating. All that has happened is that the safeguards that normally prevent it from moving arrows into the part of its starfield containing anti-semitic concepts were removed for a couple of weeks.
Vitalik Buterin then asked OpenAI’s o3 model to come up with Grok’s surname. o3 ran a few web searches. It is important to note that no web page (at the time) stated outright that Grok’s surname is Hitler. This is something that o3 must have ‘inferred’ from the pages and online discussions it found. This inference is a very useful property! It explains why LLM coding tools, for instance, are able to solve bugs they have never seen before or generate code for programs they were never trained on.
It is also why saying “it is just doing pattern matching” is inaccurate — something you should be able to tell after using one of these models for any length of time. LLM tools are doing something slightly more than pattern matching, and this is easy to spot.
Unfortunately, it is also this property that leads LLMs to hallucinate. Generalisation (a desirable property of this technology) and hallucination (an undesirable property) are two sides of the same coin. You cannot remove one without removing the other.
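A tiny sketch can make the two-sides-of-the-same-coin point concrete. The ‘model’ below is invented for illustration (made-up cities, made-up two-dimensional positions, one canned answer per city); it answers any travel question by copying the answer of the nearest city it knows. That single substitution step is what lets it handle a city it has never seen, and it is also what lets it answer fluently about a place that does not exist.

```python
# A toy 'substitution machine', to show why generalisation and hallucination
# come from the same mechanism. Cities, positions and answers are all made up.
import numpy as np

# What the 'model' learnt in training: a position in the starfield,
# plus a labeler-written answer, for each known city.
training_data = {
    "London": (np.array([0.9, 0.2]), "museums, parks and river walks"),
    "Tokyo":  (np.array([0.8, 0.3]), "temples, food markets and neon districts"),
    "Bali":   (np.array([0.1, 0.9]), "beaches, rice terraces and surf towns"),
}

def answer(city, position):
    # The 'model' never checks facts. It finds the nearest city it was
    # trained on and reuses that city's answer, dressed up for the query.
    nearest = min(training_data,
                  key=lambda known: np.linalg.norm(training_data[known][0] - position))
    return f"Top things to do in {city}: {training_data[nearest][1]}."

# A real city that sits near known ones: the substitution reads as generalisation.
print(answer("Lisbon", np.array([0.85, 0.25])))

# A fictional place: the exact same substitution is now a hallucination.
print(answer("Atlantis", np.array([0.15, 0.85])))
```

The point is not that LLMs literally work this way (they do not), but that a single mechanism, filling in the statistically nearest answer, produces both the behaviour we praise and the behaviour we complain about.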
Coming back to the structure of the explanation, the six steps work because they get you close enough to a real mechanism without resorting to more precise technical language. Let’s break it down step-by-step (wink) to see how the vaccine functions:
- We open with the starfield analogy because we want to introduce the idea of ‘vectors in embedding space’, an alien concept to most folks.
- We introduce the idea that LLMs are a form of ‘auto-complete’, but we don’t stop there.
- We tie auto-complete together with the ‘moving arrows in the starfield’ story — which is basically Karpathy’s explanation, but in simpler language. This is also useful as a shorthand: when my dad points out some dumb thing that an LLM does, I can now say “oh, it’s just moving arrows in a starfield badly” and he gets it. More importantly, he doesn’t say things like “why is it so stupid?”, which subtly anthropomorphises it.
- Having introduced this plausible mechanism, we now point out that the mechanism is very different from the way humans understand and reason about things.
- We point out that the net effect is very human-like responses, which makes anthropomorphism easy but wrong.
- We close with a useful observation that might help with their LLM use (but notice we do not explain why this is).
Note that this explanation is silent as to whether current LLMs can get to artificial general intelligence. It just describes them as they are today. The ‘can this become AGI?’ debate is not relevant to the outcomes I care about here.
Why is This Explanation Inaccurate?
I said earlier that this explanation is useful but not accurate. Why?
It’s worth reading Andrej Karpathy’s original explanation in full, since this vaccine is adapted from his:
People have too inflated sense of what it means to "ask an AI" about something. The AI are language models trained basically by imitation on data from human labelers. Instead of the mysticism of "asking an AI", think of it more as "asking the average data labeler" on the internet.
Few caveats apply because e.g. in many domains (e.g. code, math, creative writing) the companies hire skilled data labelers (so think of it as asking them instead), and this is not 100% true when reinforcement learning is involved, though I have an earlier rant on how RLHF is just barely RL, and "actual RL" is still too early and/or constrained to domains that offer easy reward functions (math etc.).
But roughly speaking (and today), you're not asking some magical AI. You're asking a human data labeler. Whose average essence was lossily distilled into statistical token tumblers that are LLMs. This can still be super useful of course. Post triggered by someone suggesting we ask an AI how to run the government etc. TLDR you're not asking an AI, you're asking some mashup spirit of its average data labeler.
Example when you ask eg “top 10 sights in Amsterdam” or something, some hired data labeler probably saw a similar question at some point, researched it for 20 minutes using Google and Trip Advisor or something, came up with some list of 10, which literally then becomes the correct answer, training the AI to give that answer for that question. If the exact place in question is not in the finetuning training set, the neural net imputes a list of statistically similar vibes based on its knowledge gained from the pretraining stage (language modeling of internet documents).
Then, in response to a commenter saying “they (the labellers) can’t possibly be hand-writing each curated list, they’re just grading existing answers”, Karpathy clarifies:
Clearly there's too many locations. The data labelers hand-write SOME of these curated lists, identifying (by example and statistics) the kind of correct answer. When asked that kind of question about something else & new, the LLM matches the form of the answer but pulls out and substitutes new locations from a similar region of the embedding space (e.g. good vacation spots with positive sentiment), now conditioned on the new location. (Imo that this happens is a non-intuitive and empirical finding and the magic of finetuning). But it is still the case that the human labeler programs the answer, it's just done via the statistics of the kinds of spots they picked out in their lists in the finetuning dataset. And imo it's still the case that what the LLM ends up giving you instantly right there and then is roughly what you'd get 1 hour later if you submitted your question directly to their labeling team instead.
One glib, quick reason the vaccine is not accurate is that Karpathy’s explanation is more confident than the current research consensus warrants. It’s not clear that the AI is doing this in every case; only in some of them.
There are also various technical inaccuracies with my chosen explanation. If you are a Commoncog member, you should read this private forum thread. (You’ll have to log in first; otherwise you’ll see a 404 page.) I’ll quote (with permission) one answer from @shawn:
Again, the starfield analogy doesn’t quite hold when it comes to documents the LM has trained on in the trillions of tokens it’s looked at. For one thing, unlike at the token level, you can’t find the vector associated with the document in the model. The document’s representation has been “smudged” up in the many, many floating point numbers that make up the language model parameters. This smudging also essentially means that the model can smudge similar facts about different places together, which can contribute to ‘hallucination’, or it can have a great outcome of generalisation. These are, to me, two sides of the same coin.
If you give the vaccine above to a serious AI researcher, it’s quite likely that they will scrunch up their face in disgust. “That’s not right …” they will say, and then they would start on a long diatribe about the various ways the explanation is wrong. But this is good. In my experience, such discussions will drive you further away from “the LLM is god and it knows all of human knowledge, so I should ask it how to live my life” towards “oh, what a remarkable collection of statistical tricks.” This makes inoculation against anthropomorphism stronger, not weaker.
(I believe that it also makes you less likely to fall prey to LLM-assisted psychosis — but that’s an opinion not borne out by evidence).
Either way, I currently think this explanation does more good than harm, which makes it acceptable even if it is not strictly true.
Evaluate This Through Use
I am offering this explanation as a vaccine against anthropomorphism. As with most things on Commoncog, this is optimised for usefulness. I am interested in:
- Field reports of effectiveness, and modifications you’ve found to make this better, from use — either for yourself, or with other people.
- Unintended consequences — both good and bad. (For instance, perhaps this explanation gets in the way of more sophisticated usage of language models.)
- Reports that this has stopped being useful. (Notice the past tense here: I am not interested in predictions.)
I will admit that I’m also not particularly interested in critiques that are unrelated to the intended outcome of this explanation. This is in much the same way I am not interested in reviews of hammers where the reviewers evaluate them as spoons.
Notice that this vaccine does not, by itself, help you use LLMs better. It merely prevents one failure mode, and lays the foundation for healthier, more useful approaches to the technology. We’ll talk about what those are later.