One of the best friscos for scientists – words or phrases that identify and expose nonscientists to scientists – is the phrase “Correlation does not imply causation.” Real scientists never utter the phrase.
I really have very little to add to Dan Engber’s superb essay on the topic at Slate, so I encourage all of you to read it. I certainly could not have done nearly as good a job of surveying the history behind the phrase as Dan has.
Correlation certainly does imply causation, not in the technical formal sense of “logically entail” but in the more colloquial sense of “suggest” or “intimate.” Correlation is the first clue for the scientist that there might be causation. It’s just that correlation does not prove causation, mostly because there is no proof in science (another frisco to be discussed in the next post). Correlation is one of the necessary, though not sufficient, conditions for causation.
What civilians fail to understand is that causation is never directly observable; it is never in the data but in the theory. Data (correlational or otherwise) simply support or refute the theory. Data themselves without theory do not tell us anything; data are always equivocal. This is true even of experimental data. It is true that experimental data are far less ambiguous as evidence for the claimed causal relationship, but the claim of causality must still come from the theory, not from the experiment. Even experimental data do not tell us anything, and do not establish causality, in the absence of a theory that guided the experiment’s design.
Somewhere along the line, around the time of the advent of the internet (according to Dan’s article), civilians who don’t know anything about science learned to parrot “Correlation does not imply causation” (or its slight variants like “Correlation does not mean causation” or “Correlation is not causation”) and use it to dismiss any scientific study or data that they personally don’t like.
Dan’s research behind his Slate article has unearthed something else very interesting. Not only are people who say “Correlation does not imply causation” entirely ignorant of science, they, much like people who openly identify themselves as “atheists,” also appear to be very nasty people given to name-calling. Using Google Ngram, Dan shows that the frequency of the phrase “Correlation does not imply causation” in books archived in Google Books has historically increased in tandem with the frequency of epithets like “douchebag” and “numbnuts.”
Of course, this doesn’t necessarily mean that the same people who use the phrase “Correlation does not imply causation” also use the epithets “douchebag” and “numbnuts.” After all, these are correlational data, and correlation does not imply causation. But the historical trend is indeed intriguing, and I for one believe, mostly from experience, that the two are not unrelated. Both are symptoms of the problems of the internet, where anyone without any qualifications or expertise can say anything, and there is absolutely no quality control. As my idol Tina Fey aptly points out, “The internet is the repository of all human garbage. It’s the worst place in the world.” Dan’s final Ngram figure suggests that the phrase “Correlation does not imply causation” occurs mostly in online comments on blogs and articles, usually written by anonymous (READ: unqualified and ignorant) users.
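For readers who want to see why two frequencies rising in tandem are only a clue, here is a minimal sketch in Python. It assumes made-up yearly frequencies, not the actual Google Ngram counts, and the names `trending_series`, `phrase`, and `epithet` are purely illustrative. Two series that each drift upward for unrelated reasons correlate strongly in their levels, while their year-to-year changes should show little relationship.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def trending_series(rng, n, slope):
    """Hypothetical yearly frequencies: a linear trend plus independent noise."""
    return [slope * t + rng.gauss(0, 1) for t in range(n)]


def differences(xs):
    """Year-to-year changes, which remove a linear trend."""
    return [later - earlier for earlier, later in zip(xs, xs[1:])]


rng = random.Random(0)  # fixed seed so the simulated data are reproducible

# Assumption: simulated data standing in for the two Ngram curves.
phrase = trending_series(rng, 100, slope=1.0)
epithet = trending_series(rng, 100, slope=0.5)

r_levels = pearson(phrase, epithet)
r_changes = pearson(differences(phrase), differences(epithet))

print(f"correlation of levels:  {r_levels:.2f}")
print(f"correlation of changes: {r_changes:.2f}")
```

The levels correlate because both series share a trend over time, which is exactly the kind of pattern that calls for a theory before anyone claims a causal link in either direction.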
Follow me on Twitter: @SatoshiKanazawa