European hostages: mining the data
PARIS, June 26, 2015 - How many times have I heard this refrain since joining AFP nearly 20 years ago: ‘With millions of stories in our archives, we’re sitting on top of an information gold mine’. True enough. If journalism is the first draft of history (as a wise man once said), then global news agencies that really do cover the world, day-by-day, have as complete a rough draft as exists. But mining that data is harder than one might think.
Or at least it was until the dawning of the digital era.
The European Hostage Project, as we wound up calling it, is a double experiment for AFP. On the one hand, it is a serious attempt to extract some of that buried treasure. A couple of things in particular made that possible. The first is the fact that all those news articles, going back more than half a century, were tagged with keywords by the journalists who wrote them. Humans are notoriously fallible at that sort of task, but for the most part it was done consistently. The second is the creation of a computer ‘meta-language’ called XML that labels the different bits and pieces of every story we generate: ‘here is the headline’, ‘here is the dateline’, ‘here is the lead’ (first paragraph), ‘here is the author’, etc. This makes it possible to sift through mountains of data looking for specific things.
To compile a comprehensive database of cases over the last 15 years in which Europeans were taken hostage by non-state groups outside of Europe, we extracted every story in English and French since 2000 that had the keyword ‘hostage’ or ‘otage’, some 40,000 articles in all.
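For readers curious about what that kind of extraction looks like in practice, here is a minimal sketch in Python. It is not the tool we used: the element names (`story`, `headline`, `dateline`, `keyword`), the sample stories and the function names are all invented for illustration, and a real archive's markup would differ.

```python
import xml.etree.ElementTree as ET
from datetime import date

# Hypothetical story markup: these element names are invented for
# illustration and are not the agency's actual schema.
SAMPLE_STORIES = [
    '<story lang="en"><headline>Sample headline A</headline>'
    '<dateline>2009-03-14</dateline>'
    '<keywords><keyword>hostage</keyword><keyword>aid</keyword></keywords></story>',
    '<story lang="fr"><headline>Sample headline B</headline>'
    '<dateline>2004-11-02</dateline>'
    '<keywords><keyword>otage</keyword></keywords></story>',
    '<story lang="en"><headline>Sample headline C</headline>'
    '<dateline>1998-05-20</dateline>'
    '<keywords><keyword>hostage</keyword></keywords></story>',
    '<story lang="en"><headline>Sample headline D</headline>'
    '<dateline>2011-07-08</dateline>'
    '<keywords><keyword>election</keyword></keywords></story>',
]

TARGET_KEYWORDS = {"hostage", "otage"}
START_DATE = date(2000, 1, 1)


def matches(story_xml: str) -> bool:
    """Return True if a story dates from 2000 onward and carries a target keyword."""
    root = ET.fromstring(story_xml)
    published = date.fromisoformat(root.findtext("dateline").strip())
    keywords = {kw.text.strip().lower() for kw in root.iter("keyword") if kw.text}
    return published >= START_DATE and bool(keywords & TARGET_KEYWORDS)


def select_headlines(stories):
    """Collect the headlines of every story that passes the filter."""
    return [
        ET.fromstring(story).findtext("headline").strip()
        for story in stories
        if matches(story)
    ]


if __name__ == "__main__":
    for headline in select_headlines(SAMPLE_STORIES):
        print(headline)
```

Because every story labels its own headline, date and keywords, the filter never has to guess where those pieces are, which is what makes a sweep through tens of thousands of articles feasible.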
From there, AFP’s data journalist Jules Bonnard built and adapted a range of tools so that – and this brings us to the second experiment – a dozen second-year graduate students in journalism from the Institut Francais de Presse could sort and sift through the articles, dividing them by country and compiling detailed files for each hostage event and each individual hostage. All of that data was entered into another tool built especially for data-driven journalism projects by the French start-up Journalism++.
Collaborations between top journalism schools and major media have been an established part of the US media landscape for more than a decade, offering a win-win situation: news organizations can mobilize a large number of motivated young journalists over a period of weeks or months, something they rarely can afford to do with their own reporters, while students get real-world experience under the supervision of seasoned professionals, and a published result. Sometimes these joint efforts have real impact.
We are particularly proud of this effort, which has uncovered patterns over time in the terrible traffic in European hostages, patterns that until now have remained elusive, if not invisible, even to veteran reporters covering the topic. It is a good example of the power of data journalism.
The Hostage Project consists of several elements. An interactive website makes it easy to interrogate the database and to look for patterns: the nationality of hostages; where and by whom they were kidnapped; how long they remained captive and the outcome of their ordeal; which countries have been most affected; etc. There is also an article summarizing the key findings and offering a preliminary analysis.
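To give a sense of the kind of question such a database can answer, here is a rough sketch in Python. It does not reflect how the actual database is structured: the record fields, the helper functions and the three records themselves are placeholders invented for illustration, not real cases or findings.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class HostageRecord:
    """One hostage case. The field names are hypothetical, not the real schema."""
    nationality: str
    country_of_capture: str
    captured: date
    released: Optional[date]  # None when no release date is known
    outcome: str

    def days_held(self) -> Optional[int]:
        """Length of captivity in days, or None if no release date is known."""
        if self.released is None:
            return None
        return (self.released - self.captured).days


# Placeholder records, invented purely to exercise the functions below.
RECORDS = [
    HostageRecord("Nationality A", "Country X", date(2010, 1, 1), date(2010, 1, 31), "released"),
    HostageRecord("Nationality A", "Country Y", date(2012, 6, 1), date(2012, 6, 11), "released"),
    HostageRecord("Nationality B", "Country X", date(2013, 3, 1), None, "unknown"),
]


def count_by(records, field):
    """Count records by the value of one field, e.g. 'nationality'."""
    return Counter(getattr(record, field) for record in records)


def average_days_held(records):
    """Average captivity in days over the records that have a release date."""
    durations = [record.days_held() for record in records]
    known = [days for days in durations if days is not None]
    return sum(known) / len(known) if known else None


if __name__ == "__main__":
    print(count_by(RECORDS, "nationality"))
    print(count_by(RECORDS, "country_of_capture"))
    print(average_days_held(RECORDS))
```

Once each case is reduced to the same handful of fields, counting by nationality or by country, or averaging the length of captivity, becomes a matter of a few lines.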
Finally, there are five companion news features – crafted under the supervision of investigative journalist Eric Pelletier – based on interviews and research that provide context to this most troubling of subjects, ranging from the intimate relationship many hostages develop with objects during detention, to a look at how companies train and prepare their employees for hostage-taking situations.
Marlowe Hood is an AFP journalist and a teacher at the journalism school of the French Press Institute.