Saturday, September 03, 2011
The Unpredictability Challenge to Expertise
We almost universally recognize the legitimacy of experts in a number of different domains. In academic fields such as mathematics, the sciences, history, and literature, some people know much more and consistently perform much better than others on tests of ability. The same holds for many professional fields, sports, and games: we recognize that there are experts who outperform the majority of us.
A key finding in modern learning and human performance research has been the discovery of how expertise is acquired. This discovery became possible with the advent of cognitive science, allowing us to model the human brain as an organized collection of information rather than just a collection of behavioral patterns. As we learned about the specific differences between novices and experts in each field, we discovered certain general principles that apply to a very wide range of different fields.
A formidable body of this type of research has overturned the intuitive view that novices and experts differ because some people are simply more naturally talented than others. Experts seem to perform so effortlessly that we tend to attribute great natural ability to them rather than a different kind and degree of experience. Contrary to this intuition, expertise via deliberate practice is our best model so far of individual differences in ability across a wide range of activities. Expertise is a result of experience, and not just any experience, but deliberate practice: we meet challenges in the domain, immerse ourselves in purposeful practice, learn from people who are already good at it, work with good coaches, and receive quality feedback on our performance.
While verifying the legitimacy of expertise in many fields and establishing the central role of deliberate practice, social environment, coaching, and feedback, we have also discovered that there are some fields where our performance doesn’t benefit from these factors.
In spite of the tremendous power of the expertise model, it has its own limitations as well. Expertise lets us detect meaningful patterns of information in particular domains because of the way we organize our own knowledge. This assumes that there are meaningful patterns to detect that human beings are capable of using effectively. This is not always the case.
Where the Experts Fail
In the 1950s, research into medical diagnosis and prognosis revealed something shocking: presumed experts didn’t seem to predict medical outcomes any better than novices. This line of research went on to demonstrate that prognosis and diagnosis in clinical medicine were often not improved by experience when experts relied on informal gathering of data and their trained intuitions.
Professional experience and presumed expertise also seem to make no difference in predicting the outcomes of psychotherapy: psychologists, it turns out, fare no better than less-trained individuals.
If there is a skill to predicting medical and psychotherapeutic outcomes in general, it doesn’t seem to be acquired from the standard professional training, or typically through experience with patients, and it isn’t obvious how else it might be acquired. There is perhaps good reason why some experienced doctors seem very reluctant to make predictions about outcomes, and maybe more of them should heed this lesson.
As a result of the difficulty of prediction in areas like this, novices using simple formal statistical methods have often outperformed the experts in tests in spite of the greater experience and training of the experts (or perhaps in some cases partly because of it).
This is not by any means to imply that statistical methods are always superior to expertise, even in a particular field where clinical experience has proven less than optimal. However, it does give us good reason to pause and reflect on the meaning of this finding for the expertise model. In some domains, the best we can do for prediction is provided by a simple statistical method; and the value of expertise in particular reaches a limit fairly early on in the training for those fields.
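As a rough illustration of what such a “simple formal statistical method” can look like, here is a sketch in the spirit of the unit-weighted linear models discussed later in the notes (Dawes): standardize each predictor and simply add up the z-scores, with no expert judgment involved. The predictor names and numbers below are invented for illustration only.

```python
# A minimal sketch of a "unit-weight" linear prediction rule:
# standardize each predictor and sum the z-scores with equal weights.
# The predictors and data here are hypothetical, for illustration only.
from statistics import mean, stdev

def unit_weight_score(cases, predictors):
    """Score each case by summing the z-scores of its predictors."""
    # Compute the mean and (sample) standard deviation of each predictor.
    stats = {}
    for p in predictors:
        values = [c[p] for c in cases]
        stats[p] = (mean(values), stdev(values))
    # Each case's score is the sum of its standardized predictor values.
    scores = []
    for c in cases:
        z_sum = sum((c[p] - stats[p][0]) / stats[p][1] for p in predictors)
        scores.append(z_sum)
    return scores

# Hypothetical clinical-style data: a higher score predicts a better outcome.
cases = [
    {"test_a": 10, "test_b": 3},
    {"test_a": 14, "test_b": 5},
    {"test_a": 12, "test_b": 4},
]
print(unit_weight_score(cases, ["test_a", "test_b"]))
```

The point of such rules is not sophistication but consistency: the formula weighs the same evidence the same way every time, which informal expert judgment often fails to do.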
What Makes the Value of Expertise Vary So Much?
Research into the value of expertise in different domains shows it to vary with:
1. the level of inference required (moderate levels of inference are more conducive to using expertise than high levels),
2. whether the experience or training available is adequate to confer expertise, and
3. whether the conditions and instruments available allow for the expression of expertise.
What this tells us is that even though expertise helps us make sense of complex situations by recognizing patterns, there is also a limit to how well acquired expertise can help us make better judgments in very complex situations. The more specialized knowledge we need in a field just to understand what is going on, the more likely it is that expertise will fail us when the situation requires a great deal of challenging inference. In the most complex fields at least, it may be that intelligence can also play an important role alongside expertise.
Both intelligence and expertise play some role in every field, but each is more important to some fields than others and at different points in the development and expression of ability. The relevance of intelligence in a field seems to depend to a large degree on the role that abstract reasoning plays in success in that field. The relevance of expertise is more general. The role of expertise in a field depends on how well the situation is made comprehensible to the expert through specialized tools, the quality of their training and experience, and the kinds of conditions in which they have to perform.
Even allowing for a role for intelligence in particularly difficult technical fields requiring very high inference levels, there are fields where neither expertise nor intelligence nor any combination of the two seems to predict performance any better than simple methods.
We’ve discovered that in some areas, experts perform significantly better than non-experts and consistently outperform computer models of various kinds because of their rich background of task-relevant skills and knowledge. In other areas, simple computer models, statistical indexes, and non-experts consistently outperform experts.
Intelligence may play more of a role in ability in highly technical domains where a high level of inference is often required in addition to recognizing important patterns. Domains are apparently not all equal with regard to what it takes to be good at them.
How Surprises Can Negate Expertise
The difference has to do with the varying role of situational understanding in solving problems in different fields, and the role that surprise plays in each field. Fields involving things that move freely, or that scale wildly rather than behaving according to familiar statistical regularities, tend to produce surprises that can’t be managed by intelligence or expertise, alone or in combination. In these areas the requisite intelligence is relatively low and the practical role of expertise is relatively marginal, because reasoning doesn’t help much and it is particularly difficult to acquire the necessary skills even if you can identify them. So in these fields, simple statistical rules can sometimes perform as well as any expert, regardless of IQ.
For examples of fields more or less dominated by surprises think of stockbrokers, risk management advisors, clinical psychologists, counselors, psychiatrists, admissions officers, court judges, economists, financial advisors, and intelligence analysts. Think in general of all the fields where experts fare poorly compared to non-experts, where overconfidence cancels out the benefits of expertise, or where time spent in formal practice has relatively little impact on effective outcomes.
In these fields, formal domain-specific expertise and general intelligence provide relatively little advantage in producing good outcomes compared to simple algorithms, direct local observation, direct experience, practical skills, and domain-general problem-solving skills. Formal expertise and intelligence in these fields especially tend to produce overconfidence more than real predictive ability. There may be some real experts in these fields who fare better than others, but they are particularly difficult to identify and train with formal methods.
Not all fields are dominated by surprises regardless of intelligence and expertise. Think of fields involving things that stay put or else move within strictly defined ranges according to physical laws or arithmetic or statistical relationships. These are much better suited to intelligence and domain-specific expertise because in those fields a better understanding of identifiable patterns and potentially complex information does tend to lead to better prediction of outcomes. Think of theoretical mathematicians and physicists, astronomers, test pilots, firefighters, livestock, grain, and soil judges, accountants, chess masters, insurance analysts (who deal with Gaussian topics like mortality), competitive athletes, and surgeons. Think in general of the many fields studied by expertise researchers where deliberate formal practice yields measurable improvements in results, and where the critical skills can be identified and trained.
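The contrast between these two kinds of fields can be sketched with a quick simulation. In a Gaussian domain (like mortality statistics), no single observation matters much; in a wildly scaling domain, a single extreme observation can dominate everything seen so far, which is exactly why past experience keeps getting overturned. The distributions and parameters below are arbitrary choices for illustration, not drawn from any particular study.

```python
# A rough simulation contrasting a Gaussian domain with a heavy-tailed
# ("scaling") domain: how much of the total does the single largest
# observation account for? All parameters here are arbitrary.
import random

random.seed(42)  # fixed seed so the sketch is reproducible
N = 10_000

gaussian = [random.gauss(100, 15) for _ in range(N)]        # e.g. test-score-like values
heavy_tail = [random.paretovariate(1.1) for _ in range(N)]  # e.g. wealth-like quantities

for name, sample in [("gaussian", gaussian), ("heavy-tailed", heavy_tail)]:
    share = max(sample) / sum(sample)
    print(f"{name}: largest single observation is {share:.2%} of the total")
```

In the Gaussian sample the largest value is a negligible sliver of the total; in the heavy-tailed sample one observation can account for a substantial fraction of everything, so averages learned from experience are poor guides to what comes next.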
We Don’t Learn Well from History
Part of the problem with expertise in fields where surprise plays an important role is that we don’t learn well from history in general. One of our consistent biases is that we systematically overweight the likelihood of events that actually happened, relative to ones that didn’t happen (but could have). This means that we have a very strong predisposition to describe events that happened as if they were fated to happen that way. This also means that we tend to think of our descriptive stories as if they were also explanations, not just descriptions. The remarkable power of stories becomes a disadvantage for explanation because the narrative content tends to replace our ability to analyze cause and effect.
We become experts by being exposed to similar conditions over and over again and learning from consistent patterns in our experience. When a domain is characterized by events that are relatively uncommon yet influential, our confidence in our ability to predict events in that domain tends to grow way out of proportion to our actual ability to predict or explain the course of events. Our hindsight bias (“I knew it all along”) often kicks in to replace our missing explanatory ability.
The human mind is particularly well suited to remembering and making sense of events after the fact by weaving facts into a plausible narrative, and particularly poorly suited to capturing actual frequencies of events in order to use that information in other judgments. Our common sense excels at generating plausible stories for what happens; our expertise then generates trained intuitions that add to our confidence in those explanations, though in some cases not to our explanatory ability. History then leaves us with only a single chain of events to explain, the one that actually happened. We infer from all of this that we are explaining why a sequence of events took place in a particular situation, whereas we have often only described the events, not explained them.
Conclusion: Surprises and Expertise
We saw in the previous section that extremes of arousal can negate some kinds of expertise, especially expertise relying on fine motor skills. We also saw that our mindset can determine whether expert performance is retained during high arousal or fails catastrophically. In addition we saw that our ability to flexibly adapt our responses to novelty in the situation is hampered by high arousal.
Now we see that novelty offers a more general and more serious kind of challenge than just our tendency to lock in to central stimuli under high arousal. When relatively uncommon events tend to be influential in a domain, the power of expertise to help us predict and explain events is severely compromised and often even negated entirely. In these cases we have a compelling natural tendency to tell plausible stories and rely on them as explanations, and additional expertise only serves to increase our overconfidence.
 A June 2008 review of major trends in expertise research: (Charness & Tuffiash, 2008)
 An expert is typically defined for research purposes as someone who consistently performs more than two standard deviations above the mean performance on representative tasks for their domain, assuming that ability in the field can be represented by measurable tasks and that this ability is normally distributed (Ericsson & Charness, 1994). Experts defined in this way are roughly the top 2% of the performers in a field.
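 The arithmetic behind the two-standard-deviation criterion can be checked directly: under a normal distribution, performing more than two standard deviations above the mean puts someone in roughly the top 2.3% (the top 5% would correspond to only about 1.65 standard deviations). A quick check using only Python’s standard library:

```python
# What fraction of a normally distributed population lies more than
# two standard deviations above the mean?
from statistics import NormalDist

top_fraction = 1 - NormalDist().cdf(2.0)  # P(Z > 2) for a standard normal
print(f"Top {top_fraction:.1%} of performers")  # roughly the top 2.3%
```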
 The body of research has been variously summarized in popular books by journalists but a far better source for reviewing the evidence directly is the edited technical article collection: The Cambridge Handbook of Expertise and Expert Performance (Ericsson, Charness, Feltovich, & Hoffman, 2006)
 Classic early research showing the limits of human judgment from experience was done by Paul E. Meehl. Meehl demonstrated the limits of informal aggregation of data and prognostication by presumed experts in clinical situations such as diagnosing patients and predicting medical outcomes (Meehl, 1954). An influential review of research showing the superiority of actuarial vs. clinical judgment appeared in the journal Science in 1989: (Dawes, Faust, & Meehl, 1989)
 (Dawes, 1994)
 (Westen & Weinberger, 2005)
 The level of inference means the amount of specialized individual knowledge needed to understand what is going on. Situations with low levels of inference are understandable by most people, those with high levels of inference are only accurately understood by experts. Even experts utilize their abilities better in situations of lower levels of inference.
 There is a lot of ongoing controversy about various aspects of intelligence measurement and what it can tell us, but one of the things that most theorists agree on regarding individual differences in intelligence measurements is that they seem to correspond in some sense to our capacity to handle complexity. (Neisser, et al., 1996)
 This was one of the main points made by Nassim Nicholas Taleb in his entertaining and sharply ironic book urging epistemological humility in the face of this sort of unpredictability in important domains, The Black Swan (Taleb, 2007)
 For examples in technical literature making this argument more clearly, see: (Lombrozo, 2006), and (Lombrozo, 2007). The point is made even more emphatically in (Dawes, 1979). Formal techniques for causal analysis take the lure of stories explicitly into account by using various methods to compensate for it and force analysts to think in causal terms rather than relying on our more natural instincts for telling stories about what happened. (Gano, 2008)
 A good technical article introducing hindsight bias and the related idea of “creeping determinism” (what happened is what was most likely to happen) is (Fischhoff, 1982)
 There is a more detailed discussion of the difference between stories and explanations in the chapter History is a Fickle Teacher in (Watts, 2011, pp. 108-134)