against statistical nitpicking in the social sciences
let's try to answer questions, not just say "not good enough"
Lately, there has been a furious controversy in the social sciences over whether the increase in teen mental illness in the US (and in some cases abroad) is caused by smartphones and social media apps. This theory has been around for at least five years, but Jonathan Haidt recently published a book called The Anxious Generation advancing it in a high-profile way. Prof. Candace L. Odgers, a prominent developmental psychologist, gave Haidt’s book a scathing review that can be summed up as saying Haidt’s thesis is “not supported by science.” That’s about as strong a pronouncement as you’ll hear from an academic, and it’s hard not to hear anger and a tinge of jealousy behind those words. Haidt is, after all, a comparative outsider to this particular area of study, and he has just dropped a bomb that prominent developmental psychologists will have to spend a lot of time responding to and accounting for.
The response to Haidt’s book from academics on social media was also hugely negative. There were two common flavors of criticism I saw that I want to discuss.
First, there were a lot of triumphant tweets from academics dunking on Haidt because (they claimed) he misinterpreted or misrepresented a study. For instance, there was some controversy over the interpretation of particular experiments, with at least one viral tweet alleging that because Haidt misinterpreted a study, we should be skeptical of his entire argument.
Second, whatever the merits of the underlying studies, many critics objected that Haidt used simplistic aggregation techniques such as “vote counting,” where you say “12/18 studies on the topic indicated an association” and therefore conclude that there is indeed an association between two things. Many people said Haidt should have used principled meta-analysis methods, which statistically model similar studies and, given some assumptions, produce an estimate of the overall effect indicated by the batch of studies.
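To make the distinction concrete, here is a minimal sketch in Python contrasting vote counting with an inverse-variance random-effects pool in the style of DerSimonian–Laird. The effect sizes and standard errors are made up purely for illustration; none of these numbers come from the actual literature.

```python
import numpy as np

# Hypothetical effect sizes (e.g., standardized mean differences) and
# standard errors for 18 made-up studies -- purely illustrative.
rng = np.random.default_rng(0)
effects = rng.normal(0.15, 0.10, size=18)
ses = rng.uniform(0.05, 0.25, size=18)

# "Vote counting": how many studies individually reach significance?
significant = np.abs(effects / ses) > 1.96
print(f"vote count: {significant.sum()}/{len(effects)} studies significant")

# DerSimonian-Laird random-effects pooling: estimate the between-study
# variance tau^2, then take an inverse-variance weighted average.
w_fixed = 1.0 / ses**2
mu_fixed = np.sum(w_fixed * effects) / np.sum(w_fixed)
q = np.sum(w_fixed * (effects - mu_fixed) ** 2)
tau2 = max(0.0, (q - (len(effects) - 1)) /
           (np.sum(w_fixed) - np.sum(w_fixed**2) / np.sum(w_fixed)))
w_rand = 1.0 / (ses**2 + tau2)
mu_rand = np.sum(w_rand * effects) / np.sum(w_rand)
se_rand = np.sqrt(1.0 / np.sum(w_rand))
print(f"pooled effect: {mu_rand:.3f} (95% CI "
      f"{mu_rand - 1.96 * se_rand:.3f} to {mu_rand + 1.96 * se_rand:.3f})")
```

The vote count and the pooled estimate answer related but different questions: the first asks how many studies cleared a significance threshold, the second asks what overall effect the batch of studies jointly implies.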
Sidebar: I want to also point out the ever-increasing length of economics papers, which in some cases have *tripled* since the 1970s. This trend isn’t limited to economics; papers are getting longer and longer across the social sciences.
I think both the reaction to Haidt and the ever-increasing length of papers point to a nitpicking mentality in the social sciences, which I don’t think is particularly useful.
To focus on Haidt’s thesis that phones cause mental health problems for young people: it seems plausible on its face; you can check several multi-hundred-page open annotated bibliographies (example, full list) for what Haidt claims is evidence; it is presumably a theory backed by the researchers authoring the studies that say “phones cause mental health problems”; and it matches what people will tell you is going on with mental health if you ask them. That doesn’t mean Haidt’s idea is correct. But it does mean you can’t dismiss it by just saying “science doesn’t support this.” This is doubly true once you recognize that you’d expect to see many null results in this kind of research because the phenomenon is hard to study cleanly. That means lots of associational studies, or high-variance studies because the outcomes are hard to measure, or both. It also means citing a null result isn’t necessarily evidence of the absence of an effect, because the null may be high variance or the concept you’re trying to measure may be hard to pin down.
The academic response to Haidt also had a flavor of “you don’t study this, so you can’t comment on it.” While it is true that a generalist coming into a specific area may miss things for lack of context, presumably those are specific things that can be pointed out to counter the generalist’s mistaken argument. If a “non-expert” shows up with what appears to be a body of experimental evidence in peer-reviewed journals and a hypothesis that is generally plausible, you can’t just claim there is a gestalt that comes with being a specialist in the area and dismiss it out of hand. You need to explain to people why it’s wrong. Note that this is different from crank theories that have no basis in the research record and are prima facie implausible. In this situation, we’re talking about a psychologist (Haidt) writing about (developmental) psychology and citing dozens of actual peer-reviewed experiments, and then other psychologists pulling the “non-expert” card on him. Bonkers!
Because many academics seemed certain from the beginning that Haidt was wrong, much of the response to him was on less than solid ground. The first flavor of criticism I mentioned above was nitpicking the interpretation of individual studies. Unless a misinterpretation fundamentally undermines the bigger argument Haidt is advancing, it merely slightly reduces the probability that his argument is right rather than completely undermining it. (Note that I am assuming the criticism of Haidt is correct here.) Say Haidt cited 20 experiments and you think he misinterpreted one. Well, ok, what about the other 19? Or you’d have to argue that the one study is so foundational that the argument does not make sense without it. I can imagine this being the case, but I didn’t see that argument anywhere.
To my second point above, I too would have preferred Haidt to use more principled aggregation techniques for the studies he cites. But let’s not overstate the importance of this. If 15/20 studies show an effect and you do a meta-analysis to aggregate them (which involves non-trivial and likely false assumptions about exchangeability!), you will likely still get an effect. You can’t just say “I would have preferred a hipper aggregation technique”; you have to show that a materially different conclusion would be reached if you used that technique. Such outcomes are not outside the realm of possibility in small-ish-N Bayesian meta-analysis. But the actual, principled, academic way to object to the “study vote counting” Haidt used would be to perform the more principled analysis and show that it changes things. Or show that it doesn’t, but that it adds clarity, so you should use it! I’d also note that other experts re-analyzing the same studies is entirely feasible, thanks to the clarity Haidt provided with his open, collaborative bibliographies.
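And setting up that kind of analysis is not hard. Here is a minimal sketch of the sort of small-N Bayesian random-effects meta-analysis I have in mind, using PyMC; the per-study estimates, standard errors, and priors are all made-up assumptions for illustration, not a claim about what the real studies show.

```python
import numpy as np
import arviz as az
import pymc as pm

# Hypothetical per-study effect estimates and standard errors (made up).
y = np.array([0.20, 0.05, 0.31, -0.02, 0.18, 0.12, 0.27, 0.09,
              0.15, 0.22, -0.05, 0.11, 0.19, 0.08, 0.25, 0.14,
              0.03, 0.17, 0.21, 0.10])
se = np.full_like(y, 0.10)

with pm.Model() as model:
    mu = pm.Normal("mu", mu=0.0, sigma=1.0)        # overall (pooled) effect
    tau = pm.HalfNormal("tau", sigma=0.5)          # between-study heterogeneity
    theta = pm.Normal("theta", mu=mu, sigma=tau, shape=len(y))  # per-study effects
    pm.Normal("obs", mu=theta, sigma=se, observed=y)            # observed estimates
    idata = pm.sample(2000, tune=1000, target_accept=0.95, random_seed=1)

# The posterior for mu is the pooled effect. If it sits clearly away from
# zero, the "hipper" aggregation agrees with the crude vote count; if it
# doesn't, you have an actual substantive objection to make.
print(az.summary(idata, var_names=["mu", "tau"]))
```

The point of showing this is not the particular priors, but that running the more principled model and reporting whether it changes the conclusion is a concrete, checkable contribution in a way that "he should have done a meta-analysis" is not.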
My big takeaway from the blow-up over the Haidt book was that critics of his idea really don’t have much evidence in their favor. Some will say “you can’t prove a negative,” but that’s not really true! You can estimate a precise null in a plausibly generalizable experiment. If such a precise null exists, all critics of Haidt have to do is cite it, and his argument would basically be doomed. You don’t even need a competing idea. But in the absence of such a precise null, I think you need to put forth some plausible competing theory that you think fits the (admittedly muddy) evidence. Because the goal of social science is to explain things, not to nitpick others. And I say this as a reformed nitpicker myself. I was super good at it in grad school. But eventually I realized you don’t explain anything that way. You need to come up with theories, test them out, find out if you’re right or wrong, rinse and repeat.
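To illustrate what I mean by a precise null versus the high-variance nulls I mentioned earlier, here is a tiny sketch with made-up numbers showing how much each kind of result actually rules out:

```python
def ci95(estimate, se):
    """Return a 95% confidence interval for an effect estimate."""
    return estimate - 1.96 * se, estimate + 1.96 * se

# A precise null: tiny point estimate, tiny standard error.
# This rules out anything but a negligible effect.
print("precise null CI:", ci95(0.01, 0.02))   # roughly (-0.03, 0.05)

# A high-variance null: same point estimate, huge standard error.
# This is "not significant" but rules out almost nothing --
# effects as large as +/- 0.5 are still consistent with the data.
print("noisy null CI:  ", ci95(0.01, 0.25))   # roughly (-0.48, 0.50)
```

Only the first kind of null is the kind a critic can cite to doom the argument; the second is just a shrug.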
Both the response to Haidt and ever-increasing paper lengths suggest to me that we are living through a plague of nitpicking in the social sciences, and a legitimization of nitpicking as an academic success strategy. In reality: who cares if you do that 5th robustness check of your diff-in-diff model? Are you really going to overturn the results of the first 4 analyses? IMHO, there are better things for the smartest people on earth to spend their time on, namely collecting data, analyzing it, theorizing, doing math, etc. This doesn’t mean abandoning thoroughness, but it does mean not wasting time just to satisfy cranky paper reviewers or dumb standards in a particular field.
This is a coordination problem; individual researchers can’t solve it on their own. But I would like to see social science journals and organizations take a tougher stance on nitpicking.