Guest post by Peter Henne and Patrick James
The replication debate currently underway in the scientific community is as relevant to research on political violence as it is to disciplines like biology and physics. In fact, we argue that replicability issues surrounding the transparency and validity of data coding could make or break the credibility of our field and its ability to positively influence policymakers. Scholars of terrorism, for instance, should spend less time debating the definition of terrorism or leaning on expert intuitions, and more time transparently and reliably generating high-quality data.
Andrew Gelman summed up the debate, and provided a good perspective on it, in a recent post on his site. Gelman comes down pretty strongly in favor of replication: scholars should manage their labs and experiments in such a way that their studies can be replicated and, if they are replicated by other researchers, the findings will be nearly the same.
This doesn’t seem directly applicable to terrorism research. We tend not to use labs, and experimentalists make up only a minority of terrorism researchers. Replication is a big deal for the findings of quantitative studies (i.e., are these numbers real?), but most journals nowadays either host replication data on their websites or encourage authors to make it available.
The construction of the data itself is where replicability becomes a really big deal. Could someone replicate the coding that produced the data used in an analysis? This is partly an issue of transparency: is the description of the variable or codebook clear enough that someone could understand what a “2” or a “5” for your variable measuring militant group organization really means? It’s also one of reliability. Could someone take the same sources used in the coding, look through the codebook, and come up with the same data used in the final analysis? If the answer to either of these is no, there is a problem.
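The reliability question can be made concrete with a standard intercoder agreement check: have two coders apply the same codebook to the same sources and compare their results. The sketch below (with made-up codes, not data from any actual project) computes Cohen’s kappa, a common measure of agreement beyond chance:

```python
from collections import Counter

def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa: agreement between two coders, corrected for chance."""
    assert len(coder_a) == len(coder_b)
    n = len(coder_a)
    # Observed agreement: share of cases the coders scored identically.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected chance agreement from each coder's marginal label frequencies.
    freq_a, freq_b = Counter(coder_a), Counter(coder_b)
    labels = set(coder_a) | set(coder_b)
    expected = sum(freq_a[l] * freq_b[l] for l in labels) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical example: two coders rate "militant group organization"
# on a 0-2 scale for ten cases, working from the same sources.
coder_a = [0, 1, 2, 2, 1, 0, 1, 2, 0, 1]
coder_b = [0, 1, 2, 1, 1, 0, 1, 2, 0, 2]
print(round(cohens_kappa(coder_a, coder_b), 2))  # 0.7
```

A kappa well below 1 on a pilot sample is a signal that the codebook’s category definitions need tightening before full coding proceeds.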
And this isn’t just a concern for quantitative studies. Qualitative studies do involve a sort of data coding, as they categorize cases and often posit different values of variables in different cases. If people aren’t sure why you put a country into a certain typological cell, your conclusions won’t matter.
We would even argue that this is separate from, and maybe more important than, conceptual validity. We need to properly define the concepts we’re using (religion, terrorism, etc.). But if you’re not sure how someone came to code certain conflicts as “religious,” how are you ever going to debate them on what religion means?
Questions over the replicability of data coding in terrorism studies could have as detrimental an effect on this field as issues with lab work have in the hard sciences. Even if the methods are handled brilliantly, if the variables are coded incorrectly or at least confusingly, a study’s findings will have little impact on the study of terrorism or the broader discipline. This matters for policy studies as well. If we tell people that X makes a civil war more severe, but can’t clearly explain what we mean by X, it’s unlikely our findings will influence policymakers.
So the gold standard for good terrorism studies should not be just fancy methods or brilliant theoretical insights. Instead, it should also be clear, transparent, and replicable data coding procedures. Including detailed descriptions of the variables or a long appendix can be tedious, and opens us up to challenges from critics, but it’s necessary to improve the credibility of our findings.
We’ve tried to embrace these principles in the Profiles of Radicalization in the United States project. The project, run through the National Consortium for the Study of Terrorism and Responses to Terrorism at the University of Maryland, is producing a dataset that will allow us to assess a number of theories on radicalization. For example, many argue group dynamics are important in explaining radicalization. To test this, we use an index measuring the prevalence of group dynamics in a person’s radicalization. This index, in turn, is based on several variables that measure specific types of group dynamics, such as intra-group competition and cliques. When we complete the project, we will include information on how we coded each of these group dynamics indicators. People can — and should — debate whether we conceptualized group dynamics appropriately, but there shouldn’t be any questions about how we actually did it.
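To illustrate the kind of index construction described above — purely as a sketch, with hypothetical variable names rather than the project’s actual codebook — an additive group-dynamics index might simply count which indicators are present in a case:

```python
# Hypothetical binary indicators for one individual's radicalization.
# Variable names are illustrative only, not the project's real codebook.
indicators = {
    "intra_group_competition": 1,  # 1 = present in the sources, 0 = absent
    "clique_membership": 1,
    "charismatic_leader": 0,
    "group_isolation": 0,
}

# Simple additive index: the count of group-dynamics indicators present.
group_dynamics_index = sum(indicators.values())
print(group_dynamics_index)  # 2
```

The substantive point is that a transparent codebook specifies exactly this kind of rule — which indicators feed the index and how each is scored — so a reader can reproduce every value in the final dataset.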
The bottom line is that scholars of terrorism — and political violence more broadly — should be as anxious and troubled by questions of replicability as biologists and physicists. We should be spending just as much time debating how we got to our conclusions as we are debating our conclusions or definitions. Likewise, we can’t rely on the judgment of experts to justify our coding, unless it’s backed up by valid and transparent procedures.
Peter Henne is a postdoctoral fellow at the National Consortium for the Study of Terrorism and Responses to Terrorism at the University of Maryland-College Park. Patrick James is a Radicalization Researcher at the National Consortium for the Study of Terrorism and Responses to Terrorism.