The death of a prosecutor, 40,000 audio hearings, 2 years and a team.

 

I. Intro

II. The story (includes a major terrorist attack, the Jewish community, the Iranian government, an accused President and a prosecutor found dead)

III. The collaborative investigation

IV. Analysis

V. Major Findings

VI. The app

VII. Technology

VIII. Impact

IX. Open data

X. Conclusion

 

I. Intro

How and why we built a database of 40,000 phone interceptions with a closed network of 120 collaborators in order to publish a news app of 200 selected audios.

And there is more to come, and more reason to support this effort to listen to and classify 100% of the audios, as the judicial case was reopened in December 2016.

 

II. The story:

Argentina, July 1994: a massive bomb explodes in front of the AMIA, a Jewish center in Buenos Aires. The attack kills 85 people and injures 300.

After a long investigation, in 2006 a judge issues an international arrest warrant for the accused, most of them members of the Iranian government. Immediately, INTERPOL issues red notices to assist national police forces.

The Argentine government repeatedly demands that the accused be put on trial, but Iran refuses to comply.

Then, in January 2013, the Argentine demand for justice takes an unexpected turn during Cristina Kirchner’s presidency, when a Memorandum of Understanding is signed with Iran to jointly investigate the attack.

The prospect of working with Iran when those accused in the attack were members of the Iranian government creates huge public controversy and disputes in the Argentine Congress.

In January 2015, Alberto Nisman, the special prosecutor investigating the terrorist attack, charges Cristina Kirchner and other Argentine authorities with orchestrating a criminal plan with Iran. Nisman claims that the Argentine government intended to cancel the Interpol red notices and guarantee the innocence of the Iranian accused, with the objective of restoring commercial relations with Iran.

Three days after making his accusation public, and hours before he is due to testify in Congress, Nisman is found dead in his apartment. Some claim he committed suicide, while others say it was clearly murder and march through the streets demanding justice. This march is called “the March of Silence”. The court has yet to decide, but for two years Nisman’s case has gripped Argentina.

Nisman’s accusation about the AMIA attack was dismissed multiple times by different judges during Cristina Kirchner’s presidency. Finally, in December 2016, with the new president, Macri, in power, the case is reopened.

The main evidence that Nisman collected to support his accusation consisted of thousands of audio recordings from a tapped phone: exactly 40,354.

 

 III. The Collaborative Investigation

The evidence was leaked, and several media outlets published the whole database or some individual recordings.

But at La Nación Data we decided to combine technology and collaborative work to take on the classification and analysis of every single audio, 40,354 in total. The goal was to find full stories, built from combinations of audios, that could lend verisimilitude to the prosecutor’s hypothesis, to put the audios in context, or even to find and tell new stories.

First, we tried machine learning techniques and voice analytics, without any success.

So we chose to rely on the VozData platform, an open source web app that La Nación developed with the support of Open News and Civicus Alliance. (Open source: Crowdata)

We uploaded the audios in two phases, and users started listening and organizing them based on established categories.

All in all, the project involved two years of classification and a closed network of more than 120 trusted volunteers from different universities, NGOs, countries and backgrounds.

Most of the work was done remotely, but we also encouraged users to take part in four civic marathons held at La Nación, which we called Audiothons, to share knowledge about the case and to analyse thousands of new recordings.

IV. Analysis

Once the initial classification phase was complete, we had a shortlist of more than 2,000 audios. We had to listen to this shortlist again and select the audios that were new findings or that gave context to the ones Nisman had selected. This was done by the LN Data team. We started investigating the database by filtering for specific words in the text that collaborators had typed within the categories, such as tags and additional information.

The 40,000 audio tapes correspond to calls made on the four tapped telephones of Yusuff Khalil, an Iranian agent in Buenos Aires.

Only 10% of these tapes contain metadata identifying the origin telephone number (A) and the destination telephone number (B) of the call and, for mobile phones, the cell that detected it.

The big challenge in screening these phone calls is identifying the voices of the people involved when, because they know each other, they do not identify themselves or they address each other by nicknames. We then have to link the phone number to a person, office, institution, etc. Within our volunteer network and the LN Data team we had “specialists” in particular people, so we passed them the audios we had doubts about and they identified the speaker by ear. We found that some of us are good “hearers” who can identify these voices, a new skill that will be useful in future audio cases.

For this purpose, we set up a telephone guide that included all the numbers, and we filled it in as we identified them.

The voices of some of the people who appear most frequently in the media were easily recognized, but most of the voices were not so easy to identify.

To make this task easier, we filtered a spreadsheet by the destination phone number *2747, which is the voicemail of the cell phone company. By listening to the messages left on the voicemail, we identified the origin phone number and the date and time. Some of the people involved in the calls identified themselves by first and last name when they left a voice message.
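The keyword filtering described above can be sketched in a few lines of Python. This is only an illustration, not the tooling the team actually used; the field names (`audio_id`, `category`, `tags`, `notes`) and the sample rows are hypothetical.

```python
from typing import Iterable, List

# Hypothetical records: one per classified audio, holding the free text
# typed by collaborators. Field names and values are assumptions.
CLASSIFIED = [
    {"audio_id": "a-001", "category": "politics", "tags": "embassy, march",
     "notes": "Talks about a demonstration"},
    {"audio_id": "a-002", "category": "personal", "tags": "family",
     "notes": "Call with a relative"},
    {"audio_id": "a-003", "category": "business", "tags": "trade",
     "notes": "Mentions a MARCH next week"},
]


def filter_by_keywords(records: Iterable[dict], keywords: Iterable[str]) -> List[dict]:
    """Return records whose tags or notes contain any keyword (case-insensitive)."""
    wanted = [k.lower() for k in keywords]
    matches = []
    for record in records:
        # Search the tags and the additional notes together.
        text = f"{record.get('tags', '')} {record.get('notes', '')}".lower()
        if any(k in text for k in wanted):
            matches.append(record)
    return matches


shortlist = filter_by_keywords(CLASSIFIED, ["march"])
print([r["audio_id"] for r in shortlist])
```

A plain substring match like this is deliberately loose: it tolerates inconsistent capitalization in volunteer input, at the cost of some false positives that a human listener then discards.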
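The voicemail filter and the telephone guide can be sketched as follows. This is a minimal Python illustration under stated assumptions: only the voicemail number *2747 comes from the text, while the column names, the sample rows and the `LISTENER_NOTES` structure (what a listener writes down after hearing each message) are hypothetical.

```python
VOICEMAIL_NUMBER = "*2747"  # voicemail number named in the text

# Hypothetical call metadata rows; column names are assumptions.
CALLS = [
    {"audio_id": "a-101", "destination": "*2747"},
    {"audio_id": "a-102", "destination": "1100000009"},
    {"audio_id": "a-103", "destination": "*2747"},
]

# Hypothetical notes taken by a listener for each voicemail audio:
# the caller's number, when the message was left, and the name if stated.
LISTENER_NOTES = {
    "a-101": {"caller_number": "1100000001", "left_at": "2013-01-10 09:15",
              "name": "Person A"},
    "a-103": {"caller_number": "1100000003", "left_at": "2013-01-11 18:40",
              "name": None},
}


def voicemail_calls(calls):
    """Keep only the calls whose destination is the voicemail number."""
    return [c for c in calls if c["destination"] == VOICEMAIL_NUMBER]


def build_phone_guide(calls, notes):
    """Map each caller number heard on voicemail to a name, or None if unknown."""
    guide = {}
    for call in voicemail_calls(calls):
        note = notes.get(call["audio_id"])
        if note is None:
            continue  # nobody has listened to this audio yet
        number = note["caller_number"]
        # Fill the entry if it is new or still unidentified.
        if guide.get(number) is None:
            guide[number] = note["name"]
    return guide


phone_guide = build_phone_guide(CALLS, LISTENER_NOTES)
print(phone_guide)
```

Numbers whose owner is still unknown stay in the guide with an empty name, so they can be completed later when a "specialist" recognizes the voice.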

 

V. Major Findings

For the second anniversary of Nisman’s death, in January 2017, we prepared a special scrollytelling feature that was published on all our platforms, including the TV channel.

We found 4 new stories that made the front page in print, revealing original information discovered in the audios. We also created an interactive app to navigate the recordings by both topic and person, and we received more information through a feedback form that we invited readers to fill in.

1) Iran’s local community paid bail to help a local activist, leader of the pro-Kirchner political movement “Quebracho”

http://www.lanacion.com.ar/1975823-la-comunidad-irani-financio-a-esteche

2) A national senator from the governing party was an active lobbyist for the Iranian government with local businessmen.

http://www.lanacion.com.ar/1975795-nisman-revelan-audios-de-negocios-con-iran

3) Anniversary package “Nisman: two years of listening”, with an interactive timeline, the app launch and a behind-the-scenes piece.

4) Former Army Chief General Milani is possibly linked to an illegal espionage network.

http://www.lanacion.com.ar/1988567-una-escucha-asocia-a-milani-con-el-manejo-de-una-red-ilegal-de-espionaje

 

5) Iran financed a local activist movement, supportive of the Kirchner government, to lead demonstrations in a march against the US Embassy.

http://www.lanacion.com.ar/1995463-cristina-avalo-una-marcha-anti-eeuu-financiada-por-iran

VI. The APP and the Playlists

 

From the original database we included almost 200 audios in an interactive news app that can be navigated by person or topic.

Whole audios were uploaded to avoid complaints that they had been edited to reach conclusions. We highlighted the most relevant parts of the conversations to make listening easier.

Each person and topic has a link to extended information within the app.

So that the platform can be reused in future reporting on updates in the Nisman case trial, any person or topic tag can be embedded as a single playlist in an article.
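One way to picture the tag-based playlists is as a grouping of audios by tag. The Python sketch below is only an illustration of that idea, not the app's code (the source lists HTML and JavaScript libraries as its technology); the field names `people` and `topics` and the sample audios are hypothetical.

```python
from collections import defaultdict

# Hypothetical app entries; each audio carries person tags and topic tags.
AUDIOS = [
    {"audio_id": "a-201", "people": ["Person A"], "topics": ["Topic X"]},
    {"audio_id": "a-202", "people": ["Person A", "Person B"], "topics": ["Topic Y"]},
    {"audio_id": "a-203", "people": ["Person B"], "topics": ["Topic X"]},
]


def build_playlists(audios, field):
    """Group audio ids by every tag found in the given field."""
    playlists = defaultdict(list)
    for audio in audios:
        for tag in audio[field]:
            playlists[tag].append(audio["audio_id"])
    return dict(playlists)


by_person = build_playlists(AUDIOS, "people")
by_topic = build_playlists(AUDIOS, "topics")
print(by_person["Person A"])
print(by_topic["Topic X"])
```

Because each tag maps to its own list of audios, an article could embed just the list for one person or one topic without carrying the rest of the app.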

 

VII. TECHNOLOGY

HTML; JavaScript libraries “Isotope” and “Wavesurfer.js”; Google Spreadsheets; Excel; Google Forms; XMind.

VIII. IMPACT

The publications obtained wide reach on social media and among news outlets.

 

1) Federal Judge Claudio Bonadio requested the investigation as evidence in the trial of former foreign minister Héctor Timerman, who is accused of treason against Argentina in the AMIA case.

 

2) The investigation caught the attention of Argentine Minister of Security Patricia Bullrich, who highlighted the proactive role of the media in investigating, in contrast with the lack of commitment of the judiciary, which repeatedly abandoned and dismissed the investigation.

3) One of the stories, on the Iranians financing bail for a local activist, was a trending topic in Argentina.

4) 23 readers also contacted us directly with more information about the audios through a Google Form that we included in the app.

 

IX. OPEN DATA

We published the database in Google Spreadsheets, including the audios, the main text highlights, the timestamps of the most important selection within each audio, and descriptions and biographies of the subjects.

Go to Google Spreadsheet with audios structured data:

X. CONCLUSION

What we’ve learned at La Nación Data is never to believe a project is impossible, no matter how large. The evidence was there, 40,000 audio files, but everyone said it was impossible to process. So we said: why not?

If you really believe in the power of a community that is there and wants to help, just make it possible: provide the tools, give some basic rules and call on people, and they will help.

This is another case that we hope inspires media to use technology to serve a cause, and proves that real impact and change will come if we learn to collaborate.

Volunteers’ participation, in close interaction with journalism and a very large dataset, produced knowledge. In an industry in which we differentiate ourselves through knowledge (not raw data), we can only think of being sustainable if we learn to open up, change our self-centered mindsets and ask for help.