This is the first in a series of posts about learning in ill-structured, novel domains.
Here’s an interesting question: how does reading history help you with operating in the future?
This question should be familiar, especially if you find yourself running a business, or investing in the markets — experience should, in principle, be the best possible teacher, which of course suggests that other people’s experiences might serve as an ok substitute. But then why study history if everything is path dependent and you’re not likely to experience the exact same things in your life? I often tell friends that I read a huge amount of business biography in order to “expand the set of patterns in my head”; their usual retort is “but then all of that is unlikely to happen again — history never repeats itself; why read business history anyway?” And they kind of have a point.
It turns out that the question we’re interested in is as old as inquiry itself. In Plato’s Meno, for instance, the Athenian philosopher asks: “how can we come to know what we do not already know?” If you want a more precise formulation of that question, the one that comes up in the expertise literature goes something like: “how is it that studying prior cases helps with novel ones?” And it is the expertise literature that we must turn to if we want to find good answers to this question.
One of the first things that you learn when you start digging into expertise research is that experts are better able to handle novelty, especially when compared to novices. And so what we really want to know is the following: how are experts able to perform when faced with novel situations? Can we teach this to novices? And what, exactly, are experts doing differently when they expand the set of cases they have in their heads?
The theory that gives us the best answers to these questions is something called Cognitive Flexibility Theory, originally published in a 1988 paper by Rand Spiro, Paul Feltovich, Michael Jacobson and Richard Coulson.
If you’re a longtime reader of Commonplace, you might already recognise the name. Late last year, I summarised Accelerated Expertise, which remains the single best resource on accelerating skill acquisition that we have available to us today. The authors of Accelerated Expertise argue that all of the accelerated training programs covered in the book build on two core learning theories: Cognitive Transformation Theory (CTT) and Cognitive Flexibility Theory (CFT) — theories that are somewhat unusual in their provenance, since they focus on real world learning in ill-structured domains — as opposed to clean, classroom-based theories that we more commonly see elsewhere. I explored CTT in The Hard Thing About Learning From Experience and Business Expertise: The Importance of Cognitive Agility, and have found it to be a remarkably coherent explanation of learning from experience (and also why some people are not able to do so). But I’ve not dug into CFT until this essay.
Spiro and his colleagues write about the central questions that led to CFT in the Oxford Handbook of Expertise:
How is learning possible under conditions of novelty? (...) Since we are talking about ordinary novelty, commonly encountered in our increasingly complex and rapidly changing world, as well as always found in non-routine cases in professional domains, and frequently found in real-world naturalistic decision-making, this is a milder form of the paradox than that of more radical creative discovery. But it is a paradox nonetheless. If prior case experience does not generalise to new cases (by definition of ill-structuredness), then how can one become prepared to deal with new cases? In other words, how is acquiring adaptive proficiency possible?
CFT gives us a few levers to pull on. It tells us how experts are able to perform well when confronted with novelty; it tells us why experts in ill-structured domains place such high emphasis on studying history, and — most importantly — it tells us how to learn from other people’s experiences.
Let’s dig in.
Learning in Ill-Structured Domains
Cognitive Flexibility Theory was created as a response to more traditional learning theories — ones that focused on ‘schema development’. Such theories usually talk about teaching and then deploying ‘pre-packaged knowledge structures retrieved from long-term memory and instantiated by features of some present context’; whenever you read about ‘chunking’ in chess, for instance, you are, in effect, reading about a schema learning theory. The reductive version of the idea goes something like this: when you are learning chess, what you are doing is essentially storing millions of patterns in your head. These chess patterns form ‘chunks’ that you can manipulate in memory, and so when you play a high-stakes chess game, you retrieve these pre-packaged schemas in order to guide your decisions. (Note that these schemas don’t have to be consciously available — at the highest levels of skill, schema retrieval usually expresses itself as intuition. Rather famously, grandmasters tend to describe these ‘feelings’ about board states as ‘lines of force’ or ‘developing areas of weakness’, or other similarly incoherent tacit things).
Certainly, a great deal of real-world expertise does work like this. If you are a writer or a computer programmer, for instance, it is likely that you have a bag of chunked tricks in your head, and you will reach for those tricks when working on an essay or a computer program.
But Spiro and his colleagues weren’t happy with this explanation of expertise. “What about novelty?” they asked. Spiro writes, again in the Oxford Handbook:
A need was sensed for a complementary theory of knowledge representation and use for more ill-structured domains (emphasis added), along with sufficiently flexible technology for case-based learning to develop such knowledge structures, and the ability to utilise them in building adaptive schemas of the moment for novel situations. (…)
The focus of CFT is adaptiveness, when it is required, rather than expertise. CFT is a theory of learning and instruction for adaptive application of knowledge and experience in any non-routine aspect of a domain (or, for domains that are largely non-routine in the new cases one encounters, as a primary cognitive worldview for that domain, with switching to a more reductive worldview when regularities are detected, rather than the other way around).
The emphasis on ‘ill-structured domains’ is worth digging into. What Spiro et al. mean when they say ‘ill-structured’ is that the domain contains many concepts that are relevant during application, but that the patterns of combination of those concepts are inconsistent ‘across case applications of the same nominal type’. In other words, there is often great variability in what the concepts look like when you are taking action in the domain, even if these concepts are simple in theory.
Think about business, for instance, or medicine, instead of chess or math. In business decision making, you often have a great many variables that you must juggle in your head, but these are highly tailored to your particular context. What industry are you in? What level of maturity is your product lineup at? What does your cash flow look like? Similar business strategies will look very different in practice depending on company maturity, your access to capital, your org structure (which defines your capabilities), and the state of competition in the markets you compete in.
Why should this matter to us? Well, apart from the obvious implication that much of life is ‘ill-structured’, there is also the fact that ill-structured domains demand very different learning strategies, and that we should pay attention to this. In the original 1988 paper, for instance, the researchers write:
The more ill-structured the domain, the poorer the guidance for knowledge application that ‘top-down’ structures will generally provide. That is, the way abstract concepts (theories, general principles, etc.) should be used to facilitate understanding and to dictate action in naturally occurring cases becomes increasingly indeterminate in ill-structured domains. The application of knowledge to cases in an ill-structured domain (i.e., a domain in which cases are individually multidimensional, and irregularly related one to the next) cannot be prescribed in advance by general principles. This is because, in ill-structured domains, there is great variability from case to case regarding which conceptual elements will be relevant and in what pattern of combination. In an ill-structured domain, general principles will not capture enough of the structured dynamics of cases; increased flexibility in responding to highly diverse new cases comes increasingly from reliance on reasoning from precedent cases (emphasis mine).
Thus, examples/cases cannot be assigned the ancillary status of merely illustrating abstract principles (and then being discardable); the cases are key — examples are necessary, and not just nice.
And here we have the first surprising implication of the theory. (I’m fully aware that I haven’t even started talking about the theory itself — this is all in the setup!) The researchers mention this property of ill-structured domains almost as an off-hand remark, as a setup to the pedagogical recommendations they would make in the latter half of their paper. But let’s pause for a moment to consider what they are saying.
The implications, if you think about it, are quite interesting:
- First, the researchers argue that reading case history is central to learning in an ill-structured domain. If the instantiation of abstract principles is hugely variable, then any study of such principles has to be accompanied by a large library of cases expressing those principles in action. We’ve seen a little of this on Commonplace — as an example, I wrote a summary of 7 Powers last year, arguing that it is the best book on business strategy available today. But then I presented a collection of real-world examples that I found surprising and a little weird in my follow-up. It turns out that there are plenty of quirks with the way the Powers play out in the real world; competitive arbitrage is not as efficient as you might think.
- Second, when you are talking to an expert in an ill-structured domain, you shouldn’t be too surprised if the expert reaches for examples to explain herself, instead of articulating a generalised principle. I’ve long struggled with the tension between drawing a bunch of takeaways from any story of a company or a person, and accepting that whatever has happened in that story might never occur again. CFT tells us that experts in ill-structured domains deal with this tension by reasoning from cases (“this aspect reminds me of that time when …”) rather than from principles alone, because it is often too difficult and too reductive to extract a generalisable principle!
- Most importantly, though, this property of ill-structured domains tells us that we need to change the way we think about learning within such domains. The way we are taught concepts in school presupposes that the concept is what is important; examples are merely given to illustrate that concept in action, and then discarded later. This mostly works for regular domains like chess and physics and math. But in ill-structured domains, the authors claim that the cases are everything. And indeed this is how things work in medicine. Spiro et al. write: “Not only is it more difficult to count on top down prescriptions for performance in new cases in an ill-structured domain (i.e., abstract concepts/theories inadequately determine responses to new cases), but there is also considerable indeterminateness in defining conditions for accessing conceptual structures in the first place, to engage the guidance the conceptual structures do offer. (…) It is very difficult for medical students to learn how to get to the basic science concepts from clinical presenting features, partly because of the great variability across clinical cases in the way those concepts get instantiated.” In simpler language: in any ill-structured domain, you cannot easily generalise principles from any particular case; when you are applying concepts, it is easier to reason from past expressions of that concept, instead of working from the abstract principle itself!
As I’ve mentioned, I’ve struggled with this tension between learning from history and accepting that every story is unique — and therefore uncopyable. At the end of What Bill Gurley Saw, for instance, I wrote:
If there's a generalisable question from this investigation, it might be this: if Gurley really did build his career around a single, secret idea, how might you copy that? Unfortunately, I don't have any prescriptions. The world is weird, and sometimes people stumble onto insights that turn out to be ridiculously valuable. (...) The conservative lesson that you might take away from Gurley's story is simply “huh, this is something that can happen”; you file it away in your head and then carry on.
CFT is interesting because it argues that ‘uncopyable’ doesn’t mean ‘no educational value’. The key is to know how to use such case studies. Which brings us to the core theory itself.
Cognitive Flexibility Theory
The central claim in CFT is that experts deal with novelty in two ways:
- First, they reuse fragments of old cases and concepts in new contexts. The researchers call this ‘schema assembly’ — which is a fancier name for the ‘reasoning by analogy to previous cases’ that we’ve talked about, above.
- Second, practitioners with adaptive expertise seem to all share a similar worldview. The researchers call this the ‘adaptive worldview’ — which includes things like rejecting single explanations, holding multiple case-based representations of a concept equally in one’s head, and refraining from reductive models of reality.
We’ll get into these two claims in a second, but you can already see the shape of the theory: since adaptive skill decomposes into these two sub-skills, the job of a CFT-informed learning system is to find ways to train novices along those two lines. You want to find a way to teach schema recombination, and you want to inculcate the adaptive worldview.
Let’s examine these two claims in order.
Reuse of Old Ingredients in Novel Assemblies
How are experts able to perform when faced with novelty? The simple answer is that they draw on fragments of prior cases and recombine them into new schemas on the fly. We’ve already spent some time talking about why this is necessary in ill-structured domains, so it’s no surprise that experts across such domains all seem to tend towards case-based reasoning.
The researchers write:
Cases and concepts are complex. Their parts and aspects can become the basis for old case-concept combinations to be reused in new contexts. Adaptive responses are novel; their ingredients are often not.
In these new contexts concepts can not retain fixed meanings and still retain their adaptive flexibility. Family resemblances, yes; rigid, predefined meanings, no.
To restate the properties:
- In ill-structured domains, there is huge variability in the instantiation of concepts.
- Teaching the concepts is insufficient for application. Instead, the concept is shaped by the actual expression of the cases. If you see a new case that expresses a concept in a novel way, you update the concept in your head, instead of discarding the case.
- Worse, it is very difficult to extract generalisable principles from individual cases.
- Which in turn means that reasoning in an applied, ill-structured domain often means that you must reason from other cases, since principle extraction is just so darned difficult.
I must admit that this is incredibly counter-intuitive to me. I am trained to think in terms of principles and arguments, backed by disposable examples. In fact, I’d go so far as to say that the dominant form of argumentation in both academia and in industry takes on this form: you state your assertion, you back it up with reasoning, and then you provide one or two examples to illustrate the point you’re making. Inverting this style of thinking to focus on cases above principles goes against the grain of my training. It means, for instance, that I shouldn’t distil a complex case down to a set of generalisable principles or takeaways. I should just hold the case in its full complexity in my head, as a collection of concept instantiations. I find this very difficult to do.
And yet … this aspect of CFT helps explain something that I’ve found consistently intriguing about the investor Charlie Munger. The last time we talked about this was in my summary of Range, where I wrote about Epstein’s coverage of analogical thinking:
The key to good analogical thinking is when you can map deep structural similarities between examples drawn from different fields. Epstein asserts that it’s not worth it to just compare surface similarities; the more range you have, the deeper the structural similarities and the larger the set of analogies you may draw on.
Why is this interesting to me? This is interesting because analogical thinking happens to be the primary thinking method that Charlie Munger uses.
Longtime readers of Commonplace will know that I struggle with the whole mental model obsession that has taken the self-help world by storm. But I’ve found Munger’s style of thinking to be more interesting than the actual mental models that he draws from; Munger often reasons by analogy to previous cases, or to the aforementioned ‘mental models’ that he’s taken from multiple undergrad-level domains. In the example I cited in my Range summary, I quoted Munger’s response to a startup pitch:
“We want to win a position in the legal research market with analytics and, as lawyers turn to us for unique insights and easier user interfaces, we can expand our tools and content until we’re a complete alternative to the incumbents. Our technology scales efficiently, so we can also offer lower prices.”
Charlie spotted another pattern. “This reminds me of the ‘Cola Wars’ between Pepsi and Coke. Up until the Great Depression, Pepsi and Coke were priced the same and Coke was dominating the market. But then Pepsi cut their price per ounce by half and their sales took off, with profits doubling too. Price can be a powerful competitive tool when you have a good substitute product.”
I used to think this sort of thinking was sloppy. Good thinkers, I thought, reasoned from first principles; so why isn’t Munger doing that? In fact, how does he succeed by reasoning from analogy so consistently? CFT gives us one explanation: what Munger is doing is giving a concept instantiation. If principles are hugely variable when applied to the real world, then surely the best we can do is to provide an example of that instantiation in action, as an analogy to the case-at-hand. After all, Munger’s underlying point (that “price can be a powerful competitive tool when you have a good substitute product”) is too context dependent to be useful when stated on its own. But by telling it in the context of a story, Munger can make a nuanced point about the interaction between price competition, brand pricing power, and the fungibility of substitute products.
The Adaptive Worldview
The second property that enables experts to deal with novelty is something the researchers term ‘the adaptive worldview’. The authors of CFT point out that while each novel case is (by definition) unique and new, there is a set of meta-features that help experts make sense of and adapt to novelty. These features are essentially two things:
- Ontological features — that is, a set of beliefs about how the world works.
- Epistemological features — a set of beliefs about how to learn about and understand such a world.
This is some high-falutin’ language, but it’s actually quite easy to understand. Imagine a heart attack. If you are like me (that is, a non-doctor) you would probably default to a visualised ‘prototypical example’ of a heart attack. This would most likely be some collection of common cardiac symptoms, like a man or woman falling over, clutching their chest or arm, going blue in the face.
Expert doctors, however, do not have such a prototypical example in their heads. They understand that a prototypical example is reductive — that is, you will automatically try and map what you’re seeing in the real world to that imagined ideal heart attack in your mind’s eye. No, what expert doctors have instead is a large collection of examples to compare against. They understand that case presentation for heart attacks is hugely variable — and dependent on the patient’s unique history, race, age, and gender. After all, some heart attacks can initially present as indigestion. Others can last days.
The key to the adaptive worldview is to accept that:
- In an ill-structured domain, observed phenomena rarely have single causes; multiple concepts can apply to the case-at-hand; current cases can be seen in very different but equally valid ways; and concept instantiations can take on many forms. (This is an ontological feature of the worldview.)
- Therefore, learning about such a domain demands that the practitioner resist reducing everything down to a single explanation or a single prototypical example. (This is an epistemological feature of the worldview.)
I have, of course, simplified the adaptive worldview. I’ll quote Spiro et al directly for the full thing:
(Experts with the adaptive worldview) … pay attention to cases in the variegated richness while de-emphasising the primacy of concepts (which serve a needed subsidiary function to cases in ill-structured domains); use multiple rather than single conceptual relations (as in schemas, prototypes, analogies, perspectives, etc); treat cases as wholes with emergent properties so they are greater than the sum of their parts; increase the attunement to difference and decrease the bias toward seeing similarity; expect unpredictability, irregularity, contingency, indeterminateness; expect to return to earlier cases in new contexts to bring out facets that were hidden in the earlier context — nonlinear revisiting is not repeating; embrace flexibility and openness of knowledge representation over rigidity; stress context dependency over context independence; avoid rigidity in understanding, remaining open instead, with an appreciation for the sometimes limitless range of uses of knowledge in new combinations, for new purposes, in new situations; rely on situation-adaptive assembly of prior knowledge and experience rather than retrieval of intact knowledge structures and procedures from long-term memory … (taken from page 962, The Oxford Handbook of Expertise)
Again, this is highly counter-intuitive to me. I’ve noticed that my mind defaults to prototypical representations of concepts, such as ‘network effects’, or ‘scale advantages’ — I tend to default to just one or two examples of businesses, stored in long-term memory. The adaptive worldview suggests I should be hunting for more cases that express these concepts, but with a special emphasis on cases that appear different from the prototypes I currently hold in my head.
Perhaps more worryingly, because I describe 7 Powers as the ‘best’ book on business strategy available today, I risk rejecting other equally valid if less rigorous takes on the messy reality of business. (I have been on record, after all, as calling Richard Rumelt’s Good Strategy/Bad Strategy a ‘child playing with strategy toys’ … whereas the adaptive worldview suggests that I should reread and look for useful differences, instead of subpar similarities to 7 Powers.)
So what have we covered today?
We’ve taken a look at Cognitive Flexibility Theory, a learning theory that explains how adaptive skill works in ill-structured domains. The observations these researchers have made about ill-structured domains were just as surprising to me as the central claims of the theory; I did not expect it to resolve some of my felt tensions about learning from history.
The theory itself is straightforward. It claims that adaptive skill consists of two things: first, the ability to recombine fragments of previous cases and concepts; second, a common worldview that focuses on adaptation to complexity, instead of reductive explanations or principles for the world.
This piece is the first in a series of essays on Cognitive Flexibility Theory. Next week, we’ll take a look at the pedagogical innovations and learning systems these researchers have built over the past 40 years, and what this tells us about learning in such ill-structured domains.
It’s a lot of work to cover; I hope you’ll enjoy the ride.
Part 2, a more readable restatement of the ideas in this essay, may be found here.
- Chapter 41, The Oxford Handbook of Expertise: Cognitive Flexibility Theory and the Accelerated Development of Adaptive Readiness and Adaptive Response to Novelty — this is a scanned copy of the CFT chapter in the Oxford Handbook; I’ve included this for those of you who don’t want to spend $100 for a 1000 page academic hardcover.
- Cognitive Flexibility Theory: Advanced Knowledge Acquisition in Ill-Structured Domains by Spiro, Coulson and Feltovich — the original 1988 paper.