BEHIND THE SCENES
How our smallest FinCEN Files stories captured the biggest data lessons
Each Confidential Client profile amounts to only a few hundred words, but the huge data lift involved was emblematic of the entire FinCEN Files investigation.
Unlike previous ICIJ investigations such as Luanda Leaks, Mauritius Leaks, or Panama Papers, the FinCEN Files documents didn’t number in the hundreds of thousands or millions.
In fact, the entire investigation was based primarily on a cache of just over 2,600 documents that BuzzFeed News shared with the International Consortium of Investigative Journalists and 108 media partners, including more than 2,100 suspicious activity reports, several hundred spreadsheets, a few dozen Word documents, and emails.
But don’t let the size of the leak fool you. It took months of deliberate, meticulous data work and diligent reporting to eke out some of the investigation’s big stories and findings, like analyzing the more than $2 trillion in suspicious transactions flagged by U.S. banks, or identifying the networks of shell companies used to launder potentially dirty money.
With any data journalism project, it’s important to put a human face to the figures. Our Confidential Clients feature was a selection of businessmen, fraudsters and political leaders whose stories appeared in the leaked files. The aim was to show how global banks continued moving billions of dollars around the world for clients they suspected were funding illicit or illegal activities.
We quickly discovered that each profile offered a microcosm of the huge amount of reporting and analysis that would eventually go into every story across the entire investigation, whether it was a 7,000-word feature on British shell companies or a 300-word profile of a corrupt politician. Here’s what we learned along the way:
The data required reading between the lines
The files we were looking at were not the sort of frank exchanges between a client and their trusted wealth manager, accountant or lawyer that we’ve seen in past investigations. There were no details of intimate client meetings or clear indications of intentions or motivations.
The suspicious activity reports and spreadsheet lists of transactions we accessed were put together by banks’ compliance officers reporting their suspicions to the U.S. Treasury’s Financial Crimes Enforcement Network, known as FinCEN. These officers often had no direct connection to the transactions or the clients themselves, and were often either ill-informed or limited in the amount of detail they could provide FinCEN.
The data we had was also quite patchwork. Most of the files were unstructured narratives outlining officers’ suspicions, written into documents from which we could extract some details about transactions. On occasion, we had access to transactional data in spreadsheets. Those files gave us very detailed information from the point of view of the filing bank; some came in the FinCEN Files with the corresponding reports. But there is no mandatory format for the transactional data to be filed — each bank had chosen its own way of presenting the transactional information. Only 46 of these spreadsheets came with contextual narrative files attached.
For each Confidential Client profile, we had to assess what information we had, what information we needed to get, and then piece it all together.
The data was selective
The FinCEN Files documents represent less than 0.02% of the more than 12 million suspicious activity reports that financial institutions filed between 2011 and 2017. According to BuzzFeed News, some of the records were gathered as part of U.S. congressional investigations into Russian interference in the 2016 U.S. presidential election; others were put together following requests to FinCEN from law enforcement agencies. What we had was only a very small window into the world of a few banks, which were selectively picking the information they deemed worthy of being reported to the U.S. government.
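As a rough check on that percentage, the calculation can be sketched in a few lines. The two inputs are the rounded figures quoted in this article ("more than 2,100" reports in the leak and "more than 12 million" filed overall), so the result is an approximation, not an exact share.

```python
# Rounded figures quoted in the article; both are stated as "more than",
# so the resulting share is only approximate.
sars_in_leak = 2_100
sars_filed_2011_2017 = 12_000_000

# Share of all filed reports that appear in the leak, as a percentage.
share_percent = sars_in_leak / sars_filed_2011_2017 * 100

print(f"{share_percent:.4f}% of all reports filed")
```

With these rounded inputs the share comes out just under the 0.02% threshold cited above.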
It wasn’t possible to pick a bank’s famous client and follow a steady trail of information exchanged between them over the years. Nor was it possible to carefully build a complete picture of each bank’s relationships with their correspondent clients around the world.
The data was full of duplicates
One of the things we like to do in the ICIJ data team is to count things. We count documents, we add up amounts, we calculate date ranges, and we classify entities, such as banks, all in the name of contextualizing information and building understanding. I like to use the analogy of a person collecting shells on the beach: whereas most people would select the most beautiful or original shells and discard the rest, the data team goes through every shell there is, and considers each one against a methodology. In the end, the questions we need to answer are plentiful: is it useful? Is it representative? Can it be trusted? What can we safely say, and what don’t we have enough information about?
But because the suspicious activity reports came from numerous banks and were filed over the course of several years, we ran the risk of having multiple banks reporting the same transactions. Different banks can be involved in the same transaction, or a series of transactions can partly intersect with another series reported by another institution. This risk of double-counting also existed when banks filed multiple reports and failed to explain which transactions they had included in previous reports. We had to cross-reference the transaction data from the suspicious activity reports, the transaction spreadsheets, and FinCEN’s own reports to be able to identify and set aside potentially duplicate records.
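One common way to flag this kind of overlap is to group transactions by a shared key and set aside everything after the first record in each group. The sketch below is an illustration only, not a description of ICIJ's actual pipeline: the `Transaction` fields, the choice of key (date, amount, originator, beneficiary), and the sample records are all hypothetical.

```python
from collections import defaultdict
from datetime import date
from typing import NamedTuple


class Transaction(NamedTuple):
    # Hypothetical fields; real reports vary from bank to bank.
    report_id: str
    filing_bank: str
    tx_date: date
    amount_usd: float
    originator: str
    beneficiary: str


def normalize(name: str) -> str:
    """Upper-case a name and collapse repeated whitespace."""
    return " ".join(name.upper().split())


def match_key(tx: Transaction) -> tuple:
    """Assumed matching key: same day, same amount, same two parties."""
    return (
        tx.tx_date,
        round(tx.amount_usd, 2),
        normalize(tx.originator),
        normalize(tx.beneficiary),
    )


def split_duplicates(transactions):
    """Return (records to count, records set aside as potential duplicates)."""
    groups = defaultdict(list)
    for tx in transactions:
        groups[match_key(tx)].append(tx)
    unique, flagged = [], []
    for group in groups.values():
        unique.append(group[0])      # count the first record seen
        flagged.extend(group[1:])    # set the rest aside for manual review
    return unique, flagged


# Made-up sample: two banks report the same transfer, plus one distinct transfer.
sample = [
    Transaction("SAR-001", "Bank A", date(2015, 3, 2), 250_000.0,
                "Example Trading Ltd", "Sample Holdings Inc"),
    Transaction("SAR-002", "Bank B", date(2015, 3, 2), 250_000.0,
                "EXAMPLE  TRADING LTD", "sample holdings inc"),
    Transaction("SAR-002", "Bank B", date(2015, 3, 9), 75_000.0,
                "Example Trading Ltd", "Sample Holdings Inc"),
]

unique, flagged = split_duplicates(sample)
total = sum(tx.amount_usd for tx in unique)
print(len(unique), "counted;", len(flagged), "set aside; total", total)
```

An exact-match key like this will miss duplicates where a date or party name is recorded differently, so flagged and unflagged records alike would still need a human check before any total is published.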
The data needed more data
Because so many of the rich and powerful rely on shell companies, including for legal reasons, it is often very difficult to identify the individuals who ultimately benefit from operating those companies. The U.S. banks themselves sometimes have a hard time tracking those ultimate beneficial owners (or UBOs), as shown in the FinCEN Files. We had to research which companies were owned by each of the individuals portrayed in the files, and check whether we had additional transactions in the data tied to those companies which hadn’t been identified by the banks.
This happened for transactions tied to Isabel dos Santos, which Standard Chartered flagged as part of a report about various companies called Unitel. Thanks to our previous Luanda Leaks investigation, we were able to identify the transactions that were relevant to the Unitel company partly owned by Isabel dos Santos, and include this amount in her Confidential Clients profile.
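The general idea of matching transactions against a researched list of companies can be sketched as follows. This is a simplified illustration under stated assumptions, not the method used for the profiles: the company names, country codes, field names, and amounts are invented, and pairing a normalized name with a jurisdiction is just one possible way to tell apart companies that share a name.

```python
import re


def normalize(name: str) -> str:
    """Upper-case, replace punctuation with spaces, collapse whitespace."""
    return " ".join(re.sub(r"[^A-Z0-9 ]", " ", name.upper()).split())


# Hypothetical output of ownership research: (normalized name, country code).
known_companies = {(normalize("Acme Telecom S.A."), "AO")}

# Hypothetical transaction records with made-up parties and amounts.
transactions = [
    {"originator": "ACME TELECOM, S.A.", "originator_country": "AO",
     "beneficiary": "Sample Supplier Ltd", "beneficiary_country": "GB",
     "amount_usd": 1_200_000.0},
    # Same name, different jurisdiction: treated as a different company.
    {"originator": "Acme Telecom S.A.", "originator_country": "ST",
     "beneficiary": "Sample Supplier Ltd", "beneficiary_country": "GB",
     "amount_usd": 300_000.0},
    {"originator": "Other Co", "originator_country": "US",
     "beneficiary": "acme telecom s.a.", "beneficiary_country": "AO",
     "amount_usd": 500_000.0},
]


def linked_transactions(records, known):
    """Keep records where either party matches a known (name, country) pair."""
    linked = []
    for tx in records:
        parties = [
            (normalize(tx["originator"]), tx["originator_country"]),
            (normalize(tx["beneficiary"]), tx["beneficiary_country"]),
        ]
        if any(party in known for party in parties):
            linked.append(tx)
    return linked


linked = linked_transactions(transactions, known_companies)
linked_total = sum(tx["amount_usd"] for tx in linked)
print(len(linked), "linked transactions totalling", linked_total)
```

Requiring the jurisdiction to match keeps a same-named company in another country out of the total, which is the kind of disambiguation that outside research makes possible.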
Finally, ICIJ reached out for comment to the individuals, companies, and banks mentioned in the Confidential Clients interactive, including those that were part of the transactions we chose to visualize with each client. In some cases their responses helped contextualize and inform our reporting; in all cases we included their responses alongside their profiles.