Friday, February 20, 2015

Building new datasets for foreign and security policy analysis

I have long been very frustrated by the lack of a systematic empirical basis for our analysis of foreign and security policy. Newspapers, with their many well-known biases, essentially dictate what the 'facts' are, and 'analysts' and 'scholars' then cherry-pick these 'facts' to buttress whatever theory or ideology they espouse. A number of new datasets and tools hold great promise to revolutionize this state of affairs.



Economic policy analysis is extremely data-rich. Open any (serious) newspaper and the economic coverage is 'thick' with empirical evidence. It's all far from perfect, but at least the field has gone to extraordinary lengths to base its analysis on theory AND fact. Domestic policy analysis is quickly going the same way. Rich datasets with public opinion data, voting behavior (both by people and by politicians), etc. are playing an increasingly dominant role in the way we understand that part of our environment. But foreign and security policy is still overwhelmingly a field in which 'prima donnas' claim to possess deep insights into the underlying currents of what is happening in the world - often taking amazing liberties with 'reality'.

We have been trying quite hard, these past few years at HCSS, to remedy this sad state of (international) affairs. Last year, for instance, we did a study into the many claims that Russia and China were behaving more assertively than ever before. The press is of course full of claims about this, but how can we actually ascertain the longer-term trends?

To answer this question, we used a couple of different research strategies. One of those was to use GDELT, a large (and somewhat controversial) open source dataset containing events that were automatically extracted from a large number of online news media sources with the help of a coding engine named Tabari. By recoding some of these millions of events as 'internationally assertive', the study we published (Assessing Assertions of Assertiveness) was able to show some trends in how Russia and China have engaged in different types of assertiveness from 1979 to 2013.
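To give a feel for what such a recoding step can look like, here is a minimal Python sketch. It is not the coding scheme from the study: the set of CAMEO root codes treated as 'assertive', the field names (`date`, `source`, `target`, `cameo`) and the handful of sample rows are all invented for illustration.

```python
from collections import Counter

# Hypothetical rule, NOT the study's actual scheme: an event counts as
# "internationally assertive" when the source actor is the country of
# interest, the target is a different country, and the CAMEO root code
# (first two digits of the event code) is in this illustrative set.
ASSERTIVE_ROOT_CODES = {"13", "15", "17", "18", "19", "20"}


def is_assertive(event, country):
    """Return True if the event matches the illustrative rule above."""
    return (
        event["source"] == country
        and event["target"] not in ("", country)
        and event["cameo"][:2] in ASSERTIVE_ROOT_CODES
    )


def assertive_events_per_year(events, country):
    """Count matching events per calendar year (dates are YYYYMMDD strings)."""
    counts = Counter()
    for event in events:
        if is_assertive(event, country):
            counts[event["date"][:4]] += 1
    return dict(sorted(counts.items()))


# Invented toy rows, only to exercise the functions; not real GDELT records.
sample_events = [
    {"date": "20080809", "source": "RUS", "target": "GEO", "cameo": "190"},
    {"date": "20080812", "source": "RUS", "target": "RUS", "cameo": "173"},
    {"date": "20120915", "source": "CHN", "target": "JPN", "cameo": "150"},
    {"date": "20130301", "source": "RUS", "target": "UKR", "cameo": "036"},
    {"date": "20130622", "source": "RUS", "target": "UKR", "cameo": "130"},
]

print(assertive_events_per_year(sample_events, "RUS"))
```

In this toy version the domestic event (Russia targeting Russia) and the cooperative one (root code 03) are dropped, so only cross-border events with a code in the chosen set feed into the yearly trend.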


This year, we're using the new EL:DIABLO pipeline, which is based on a new coding engine (Petrarch). I am attaching some preliminary results from ONE newspaper (the Telegraph) for 2014. Petrarch automatically coded a fairly large set of 'events' and categorized those as verbal or material conflict or cooperation. The results nicely visualize the big spike in material negative assertiveness in late February and early March (the events that led to the annexation of Crimea) and around the military intervention in August. Not bad for a fully automated analysis.
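For readers curious about the verbal/material conflict/cooperation split: a minimal Python sketch, assuming the commonly used four-way grouping of CAMEO root codes (01-05 verbal cooperation, 06-08 material cooperation, 09-13 verbal conflict, 14-20 material conflict). Whether our pipeline applies exactly this grouping is an assumption here, and the function names and the (date, code) input format are made up for the example.

```python
from collections import defaultdict


def quad_class(cameo_code):
    """Map a CAMEO event code to one of four classes via its root code.

    Assumes the commonly used grouping of root codes 01-20; this is an
    illustration, not necessarily the grouping used in the pipeline.
    """
    root = int(cameo_code[:2])
    if 1 <= root <= 5:
        return "verbal cooperation"
    if 6 <= root <= 8:
        return "material cooperation"
    if 9 <= root <= 13:
        return "verbal conflict"
    if 14 <= root <= 20:
        return "material conflict"
    raise ValueError(f"unexpected CAMEO root code in {cameo_code!r}")


def monthly_quad_counts(events):
    """Count events per month and class.

    `events` is an iterable of (date, cameo_code) pairs, with dates given
    as YYYYMMDD strings (a hypothetical input format).
    """
    counts = defaultdict(lambda: defaultdict(int))
    for date, code in events:
        counts[date[:6]][quad_class(code)] += 1
    # Convert to plain dicts so the result prints and compares cleanly.
    return {month: dict(classes) for month, classes in sorted(counts.items())}
```

Plotting the 'material conflict' count per month from such a table is what makes a spike like the one described above stand out.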

I also try to tell the story behind these data here:
These tools aren't perfect yet. But they do show a lot of promise... Stay tuned!
