What’s the optimal strategy to win a count-the-jellybeans-in-the-jar contest? One method is to count the number of jellybeans that would fit along each of the jar’s dimensions, and then multiply to estimate the volume of the jar in jellybeans. But a surprisingly accurate guess is… the average of everyone else’s! Even though everyone will be wrong—some too high, some too low—on average these errors tend to cancel out, leaving an estimate that is predictably more accurate than any given individual’s.
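If you want to see that cancellation for yourself, here is a quick simulation; the jar size and the noisiness of the guessers are numbers I made up, but the effect holds for any errors that are unbiased and independent of one another.

```python
import numpy as np

rng = np.random.default_rng(0)

true_count = 1000                                # pretend the jar holds 1000 jellybeans
guesses = true_count + rng.normal(0, 200, 500)   # 500 guessers, scattered around the truth

typical_individual_error = np.abs(guesses - true_count).mean()
crowd_error = abs(guesses.mean() - true_count)

print(f"typical individual error: {typical_individual_error:.0f} beans")
print(f"error of the averaged guess: {crowd_error:.0f} beans")   # much smaller
```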
This phenomenon is called the wisdom of crowds, and author James Surowiecki argues that it appears whenever a group aggregates individual opinions that are diverse and formed independently of one another. One example is a free market that automatically sets prices so that supply meets demand; another is the success of crowdsourced projects like Wikipedia and the computer operating system Linux.
It’s also the principle at work in the machine learning technique called ensemble averaging, where several different algorithms produce their best guesses at, for example, tomorrow’s temperature, and their various estimates are averaged into one final prediction. Every algorithm will have its own errors, but if these errors are independent of one another they will tend roughly to cancel out.
The basic way machine learning works is that you “train” an algorithm to make predictions by feeding it lots of examples: the algorithm then gradually adjusts its parameters, trying to make its predictions match that training data as closely as possible. Then the algorithm is scored based on how well it performs on “test” data it’s never seen.
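In code, that loop looks something like the following minimal scikit-learn sketch; the synthetic dataset and the choice of model are arbitrary stand-ins, not anything in particular.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge

# A synthetic prediction problem standing in for "tomorrow's temperature".
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = Ridge()
model.fit(X_train, y_train)           # adjust parameters to fit the training examples
print(model.score(X_test, y_test))    # score on test data the model has never seen
```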
So I was surprised to learn that, if you are preparing several different algorithms for ensemble averaging, you don’t train them jointly so that their average fits the training data as closely as possible. Instead, you get better “test scores” if you train each algorithm separately and only average them at the end.
(Why is that? Why, once you average them, shouldn’t you tweak the various algorithms further if it’ll improve their aggregate performance? The basic problem is that you would lose one of the key features that gives crowds their wisdom: independence. Any additional tweaking you do, in order to get a better average, must correlate the errors of the individual algorithms so that they cancel each other out more exactly on the training data. But to get good results on the test data, which the algorithms haven’t seen yet, you need those errors to keep on canceling out, and if they are no longer independent of each other, you have no reason to hope that they will.)
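Here is a rough sketch of the comparison, again with arbitrary models and made-up data: each base model is trained on its own and the ensemble simply averages them, while the “tweaked” version fits combination weights to the training predictions. The tuned weights tend to look better on the training set, but on the held-out test set they often do no better, and sometimes worse.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Ridge, LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_absolute_error

X, y = make_regression(n_samples=400, n_features=8, noise=15.0, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Train each model separately, so their errors stay (roughly) independent.
models = [Ridge(), DecisionTreeRegressor(max_depth=4), KNeighborsRegressor()]
train_preds = np.column_stack([m.fit(X_train, y_train).predict(X_train) for m in models])
test_preds = np.column_stack([m.predict(X_test) for m in models])

# Plain average of independently trained models.
plain_average = test_preds.mean(axis=1)
print("independent + average:", mean_absolute_error(y_test, plain_average))

# "Tweaking for a better average": fit combination weights on the training data.
combiner = LinearRegression().fit(train_preds, y_train)
tuned_average = combiner.predict(test_preds)
print("weights tuned on training data:", mean_absolute_error(y_test, tuned_average))
```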
Since learning about this subtlety in machine learning, I’ve been on the lookout for ways “the wisdom of crowds” can tempt us to shirk our responsibility to be wise as individuals. Here are a few:
- If everyone in a choir sort-of knows what they’re singing, it’s true that together they can muddle through better than the average member could alone. But it’s when everyone has the notes and lyrics down cold that their director can focus on artistic choices for the piece, rather than merely tidying up mistakes.
- Students who work together to solve homework problems sometimes do worse on final exams because they grow to rely on each other’s strengths. A better group strategy is for each of them to solve as much of the homework as they can on their own, and to spend their time together learning from what they solved differently.
- Roommates who take turns with the grocery shopping might both try to save money by making the other buy the toilet paper. (I may be guilty of this.) But it’s better if they each just try to be as frugal as they can with whatever needs purchasing.
But the situation where this comes up most often is in ordinary conversation. I often find myself trying to be the voice of [fill-in-the-blank], filling in whatever viewpoint I think is missing or underrepresented. This usually backfires: the opposing views just get louder to compensate. So whenever I remember to, I try instead to be the voice of nuance and synthesis, affirming the viewpoint I’m hearing even as I point out the ways the real world is more complicated. If each of us tries separately to contribute as accurate a picture of reality as we can, the consensus we reach will be so much truer than if we assume the wisdom of the crowd will fill in what we’re missing.
Another idea from machine learning that you might be interested in is that of experts. It works similarly to your ensemble averaging, but instead of training the algorithms on the whole training set, each algorithm trains on one _example_. Sometimes these experts get a weight attached to them based on the empirical difficulty of predicting that example. At the end, the voices of all the experts are combined to form a consensus on the new data. Having multiple experts tell you what they think and then forming a consensus works, but only if the experts are independent, i.e. they say what they really think instead of changing their prediction based on the others.
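In code, that last consensus step could look something like this toy sketch of weighted voting; the experts, their predictions, and their weights are all invented, just to show the mechanics.

```python
import numpy as np

# A made-up panel of "experts": each one's prediction for a new case (0 or 1),
# and a weight reflecting how much we trust it.
predictions = np.array([1, 0, 1, 1, 0])
weights = np.array([0.9, 0.4, 0.7, 0.6, 0.3])

# Weighted vote: each expert contributes its weight to the option it predicts.
score_for_1 = weights[predictions == 1].sum()
score_for_0 = weights[predictions == 0].sum()
consensus = 1 if score_for_1 > score_for_0 else 0
print("consensus prediction:", consensus)
```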
This has an obvious parallel to real life, and also carries an uplifting message. Say what you really think, but also listen to other people. So we have arrived at common sense through the rigour of science once again!
It is always reassuring when science backs up common sense! This “expert” idea is really cool—is there anything you can point me to for further reading? I tried googling “machine learning experts” but, in retrospect, should have expected that not to work.
The contexts that I know them from are called “Boosting”, “Hedging”, or “Weak Learners”. These contexts are all slightly different, but closely related. If you’re interested, the second chapter of my bachelor’s thesis provides an introduction to these ideas; you can find it at https://github.com/dvente/bscthesis/. The repository also includes some scripts that implement three algorithms in this vein, if you would like to play with them. If you’re more interested in the theory behind these algorithms, you might want to read the thesis of a friend of mine at https://www.math.leidenuniv.nl/scripties/otten.pdf, or google further on these terms. Happy reading!
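To give you a flavour of the “Hedging” idea, here is a rough sketch of multiplicative weights over a panel of invented experts. It is not taken from either thesis, just the general recipe: keep a weight per expert, predict with the weighted average, and shrink the weights of whoever turned out to be wrong.

```python
import numpy as np

rng = np.random.default_rng(0)

# Five made-up experts each issue a forecast in [0, 1] every round;
# expert 2 happens to be the most reliable one.
n_experts, n_rounds, eta = 5, 200, 0.5
weights = np.ones(n_experts)

for _ in range(n_rounds):
    truth = rng.random()
    forecasts = rng.random(n_experts)
    forecasts[2] = truth + rng.normal(0, 0.05)           # the "good" expert

    prediction = np.average(forecasts, weights=weights)  # weighted consensus
    losses = np.abs(forecasts - truth)                    # how wrong each expert was
    weights *= np.exp(-eta * losses)                      # shrink weights of bad experts

print(weights / weights.sum())   # most of the weight ends up on expert 2
```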