ICIJ’s first Neo4j Connected Data Fellow left the finance industry to become a journalist, and found himself working with one of the largest data leaks ever.
Manuel Villa was ICIJ’s first-ever Neo4j Connected Data Fellow. The fellowship was inspired by the way graph databases strengthened reporting and helped journalists understand large data sets during the Panama Papers investigation.
I don’t allow myself to become excited about possibilities; only about certainties. It has been a rule I have followed strictly for as long as I can recall.
On a day in the middle of July 2017, in New York City, I came close to breaking that rule.
A notification on my phone alerted me to an email. It was titled “¡Enhorabuena!” – Spanish for “Congratulations!” The sender was Mar Cabra, then-head of the data team at the International Consortium of International Journalists (ICIJ). I almost jumped for joy before reading its contents.
Keeping composure, I opened the email and read it.
Mar informed me I had been chosen for the six-month Neo4j Data Journalism Fellowship at ICIJ, the first of its kind.
Now I jumped. With joy.
Life had taken a few turns for me lately. In 2016 after 11 years (and five months and three days) I left the financial industry. About a year later, I was walking in the door of a nonprofit news organization which, just a few months before, had received the Pulitzer Prize for the now-famous Panama Papers. I took pause, recalling how this happened.
During my previous career, an anxious sensation developed with time. At first, I didn’t know why: I had an interesting and challenging job; one that allowed me to work in Tokyo, New York and London. And I could not complain about financial remuneration. An enviable life in the eyes of many.
As clichéd as it sounds, I eventually worked out why: disenchantment. I had been feeling increasingly disillusioned by the bottom line of what I was dedicating my existence to. Most importantly, I had been fearful to admit it to myself. But after coming to terms with my inner reality, I was done having doubts. I left finance and entered the master’s degree program in investigative journalism at Columbia Journalism School, in New York.
I knew it would be a strenuous shift of career. I was an absolute beginner in journalism, and I would have to learn all skills from scratch.
Or would I?
To my joyous surprise, all those financial databases, the limitless lines of SQL and VBA coding, the countless spreadsheets…
The daily bread in financial risk management had become useful tools for a new field in journalism that had exploded in the past two decades: data journalism. Those skills became the platform from which I dived into data reporting tools, such as Python, R and Neo4j. In retrospect, it is amusing to think that background would eventually land me in a global organization of investigative journalists (who incidentally have reported on the very industry I had left).
For the next few weeks after receiving Mar’s email, however, there was another issue I still had. A minor issue, really: What exactly would I do during those six months? Apparently, I would be “collaborating in data projects.”
I figured it would probably be a rather generic fellowship, helping here and there with several chores and, of course, learning about journalism during the process.
On Monday, August 14, 2017, I arrived at the (tiny) ICIJ office in Washington, D.C., about three blocks north of the White House. I met the local staff: Marina Walker, Emilia Díaz, Martha Hamilton, Sasha Chavkin, Will Fitzgibbon, Scilla Alecci and Rick Sia. I recall asking Emi (Emilia) if she could brief me in more detail on my role, to which she responded Mar would do that in a call from Madrid.
That call came shortly afterwards and, with it, the bombshell revelation.
Mar let me know the Panama Papers – the investigation that secured a Pulitzer Prize for the ICIJ – was not the only trove of millions of confidential documents leaked to journalists ICIJ would be working with. Several months before I arrived, ICIJ had been working on a new leak, amounting to 13.4 million records and originating from two offshore services firms and official company registries of 19 tax haven jurisdictions. It took me a minute to realize the significance of what I was being told.
To help ICIJ organize and transform this massive amount of new data into a Ne04j-based graph database, a visual way of explaining connections within the data, and incorporating it in the existing Offshore Leaks Database.
Lottery winners must feel something akin to the disbelief that overtook me.
While catching up and familiarizing myself – under the strictest of confidentiality – with dozens of stories being developed by partners around the world, I also spent hours analyzing endless lines of Python, Cypher and SQL code and countless data files.
Little by little, the whole structure began to make sense.
Soon I was working with data from several jurisdictions, among them Tonga, Samoa, Cook Islands, Bahamas and more. My role was to work with the leak’s raw data – names of companies and company officers, addresses, dates, etc.– and convert it into a database structured in terms of nodes (companies, officers and addresses) and relationships among those nodes. In a way, it is akin to creating a social network, except that instead of instances such as “Manuel is friends with Miguel” or “Barack is married to Michelle,” what we were interested in building were cases such as “John X is a director of company Y” or “Company A and company B both have registration address Z.”
I also had to run independent checks of work done by Miguel Fiandor and Rigo Carvajal, the two data gurus with whom I interacted most intensively, confirming if I obtained the same results they did (number and type of nodes, relationships, etc.) for each data source. The importance of this cannot be overstressed, since any mistakes could make journalists waste time and resources on false leads – in the best case scenario – or face lawsuits if they end up publishing inaccurate information.
It was poetically ironic to realize the similarities this had with financial risk management, where reporting the wrong risk data to traders can lead them to make the wrong transactions, losing millions – and getting them (and you) fired.
It was poetically ironic to realize the similarities this had with financial risk management, where reporting the wrong risk data to traders can lead them to make the wrong transactions, losing millions.
Besides the vast amount I learned from my immediate responsibilities, the experience of ICIJ’s organization style was invaluable. The members of the data team were, literally, scattered around the globe: from Madrid to Paris, Costa Rica, Malta and Washington, D.C. Because of that and the delicate nature of the subject, work was discussed daily in encrypted emails, chats and online meetings. To my substantial gratification, ICIJ not only included me, but made sure I participated in those confidential communications.
This is something I especially appreciated. I had barely unpacked my bags, and they immediately had complete trust in me. They could have easily – and understandably – decided not to increase the risk to the project by expanding the number of insiders beyond necessity. Strictly speaking, I likely could have performed the technical aspects of my role without knowing about the confidential aspects of the project. But they wanted me to be aware of the full scope of Paradise Papers. They thought the more I knew, the better I could contribute. They also wanted this to be a full journalistic experience for me.
I will always appreciate such trust.
It allowed me to discover something about myself: I have no difficulty in keeping secrets.
From the moment I arrived at ICIJ, for close to three months, my family, friends and other acquaintances had no clue what I was working on – and not because they did not insistently ask. It was not until November 5, about an hour before we were due to publish the first of the Paradise Papers online, that I called my spouse, my parents and siblings.
I still didn’t let them in the secret. I simply advised them to visit ICIJ’s website in an hour. They craved more details. I let them suffer a bit more.
And now, as the song famously goes, the end is near. With the publication of the final batch of companies and officers to the Offshore Leaks Database on February 14, the invaluable six months of the Neo4j fellowship at ICIJ drew to a close. The opportunity of being part of this project would have been unimaginable to me a couple of years back, when I finally parted with my previous career and ventured into journalism, determined to become a muckraker.
My utmost gratitude goes to Ne04j, for sponsoring the fellowship and to all ICIJ members, who, by fully allowing me to be part of the project, gave me not only a unique learning experience that I take with me, but allowed me to be part of the aspiration that encouraged me to switch paths and enter journalism. Borrowing the words of Walter Lippman: To tell the truth and shame the devil.