
How Tech Made the Pulitzer Prize-Winning Panama Papers Coverage Possible

By Barb Darrow
May 30, 2017, 1:32 PM ET

Reporters are relying more and more on troves of public—or leaked—data to do their jobs. And they also increasingly depend on technology tools to help organize and sift through all of that information.

Case in point: The Panama Papers, a massive leak of financial records from Panamanian law firm Mossack Fonseca obtained by the German newspaper Süddeutsche Zeitung and shared with the International Consortium of Investigative Journalists (ICIJ). That data led to huge scoops last year by journalists around the world that exposed a network of tax havens used by the rich and powerful in government and private industry. The stories led to the resignation of at least one head of state and embarrassed dozens of others including former U.K. prime minister David Cameron and Russian president Vladimir Putin.

But none of those stories would have appeared without a lot of work preparing the data. This was a mother lode: The Panama Papers comprised some 2.6TB of data and 11.5 million documents about Mossack Fonseca clients, many of whom, it turned out, used the law firm and its affiliates to dodge taxes. NSA whistleblower Edward Snowden, who knows a bit about these things, called it the biggest leak in data journalism history. For context, 2.6TB of data would equal the capacity of about 390 DVDs, a stack of which would be nearly 21 feet high.


The data came into the ICIJ’s possession in dribs and drabs, and in many formats. Much of it was email and PDF files of the sort that are created to be printed out and viewed. Such document data is called unstructured because it does not come in the neat rows and columns of traditional databases. For this type of data the ICIJ team used three free, open-source tools: Tesseract to recognize the text in scanned documents (optical character recognition), Apache Tika to extract data from the documents, and Apache Solr to index that data and make it searchable.
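To make "index it and make it searchable" concrete, here is a minimal sketch of an inverted index, the basic idea behind full-text search engines such as Solr. It is neither Solr's API nor the ICIJ's code: the sample documents, their IDs, and the function names are invented for illustration, and the text is assumed to have been extracted already.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs):
    """Map each token to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index

def search(index, query):
    """Return the sorted IDs of documents that contain every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    matches = set.intersection(*(index.get(t, set()) for t in tokens))
    return sorted(matches)

# Hypothetical extracted text; real input would come from OCR and file parsers.
docs = {
    "email-001": "Please register the new company in Panama.",
    "scan-002": "Certificate of incorporation for the company.",
    "pdf-003": "Invoice for registered agent services.",
}
index = build_index(docs)
print(search(index, "company"))
print(search(index, "company Panama"))
```

Because the index is built once up front, each search is a lookup rather than a re-read of every document, which is what makes millions of files practical to query.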

Biggest leak in the history of data journalism just went live, and it's about corruption. https://t.co/dYNjD6eIeZ pic.twitter.com/638aIu8oSU

— Edward Snowden (@Snowden) April 3, 2016

Much of the Mossack Fonseca data also came from a traditional structured, row-and-column database, but it arrived in very raw form rather than as the complete database files that would normally be shared. It’s as if someone sent a list of letters and words instead of a fully formatted Word document: the information is there, but it is not all that useful.

Because of that, the ICIJ tech staff had to take the leaked information and rebuild the original SQL database structure it came from. SQL, or Structured Query Language, is the language used to define how these databases are set up and to request information from them.
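What rebuilding a database structure involves can be sketched with Python's built-in sqlite3 module. The article does not say what form the raw data took or which database system was involved, so everything here is an assumption: the pipe-delimited dump, the table and column names, and the sample rows are all made up.

```python
import csv
import io
import sqlite3

# Hypothetical raw dump: values only, with no schema attached.
raw_dump = """1001|Example Holdings Ltd|BVI|2005-03-14
1002|Sample Harbor Inc|Panama|2009-11-02
1003|Placeholder Trust SA|Panama|2011-06-30
"""

conn = sqlite3.connect(":memory:")

# Step 1: declare the structure the raw values are assumed to have had.
conn.execute(
    """
    CREATE TABLE company (
        company_id    INTEGER PRIMARY KEY,
        name          TEXT NOT NULL,
        jurisdiction  TEXT,
        incorporated  TEXT
    )
    """
)

# Step 2: parse the raw lines and load them into the rebuilt table.
reader = csv.reader(io.StringIO(raw_dump), delimiter="|")
rows = [(int(r[0]), r[1], r[2], r[3]) for r in reader]
conn.executemany("INSERT INTO company VALUES (?, ?, ?, ?)", rows)
conn.commit()

# Step 3: the data can now be queried with ordinary SQL.
panama = conn.execute(
    "SELECT name FROM company WHERE jurisdiction = ? ORDER BY name",
    ("Panama",),
).fetchall()
print(panama)
```

The raw values carry no meaning until step 1 gives each position a name and a type; that schema is the part that has to be reconstructed by hand.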


From there, the ICIJ relied on an open-source version of software from Talend (TLND), known by techies as an “ETL,” or extract, transform, and load, tool. Talend’s technology let the journalists take the row-and-column data structures they had painstakingly rebuilt and pump them into an open-source Neo4j graph database, which let reporters see onscreen icons representing the people and organizations in the original data.

Talend enabled the team to take structured data from different sources and automate the process of putting it all together. “It’s like a recipe. You create a job and get three columns of data from this source, and two from this source, intermix them in SQL form,” Mar Cabra, editor of the ICIJ’s Data & Research Unit, told Fortune. Without a tool like Talend, the team would have had to write a ton of software code to do that.
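The "recipe" Cabra describes can be sketched in plain Python. This is a stand-in for an ETL job, not Talend itself and not the ICIJ's actual job definition; the two sources, their column names, the `OFFICER_OF` relationship label, and the `build_graph_records` function are hypothetical.

```python
# Source A (hypothetical): three columns describing companies.
companies = [
    {"company_id": 1, "name": "Example Holdings Ltd", "jurisdiction": "BVI"},
    {"company_id": 2, "name": "Sample Harbor Inc", "jurisdiction": "Panama"},
]

# Source B (hypothetical): two columns linking officers to companies.
officers = [
    {"officer": "J. Doe", "company_id": 1},
    {"officer": "J. Doe", "company_id": 2},
    {"officer": "R. Roe", "company_id": 2},
]

def build_graph_records(companies, officers):
    """Combine both sources into graph-shaped records: nodes and edges."""
    by_id = {c["company_id"]: c for c in companies}
    nodes = {("Company", c["name"]) for c in companies}
    edges = []
    for link in officers:
        company = by_id.get(link["company_id"])
        if company is None:
            continue  # skip links that point at an unknown company
        nodes.add(("Person", link["officer"]))
        edges.append((link["officer"], "OFFICER_OF", company["name"]))
    return sorted(nodes), edges

nodes, edges = build_graph_records(companies, officers)
for node in nodes:
    print(node)
for edge in edges:
    print(edge)
```

In a real pipeline the final step would write these nodes and edges to the graph database instead of printing them; that load step is omitted here.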

The moments you win a #pulitzer prize – in @ICIJorg office in Washington D.C. What a day. What a year. Long live collaborative journalism!! pic.twitter.com/iE0TFgvP3o

— Bastian Obermayer (@b_obermayer) April 10, 2017

The next step was to use a commercial product called Linkurious, which works with Neo4j to visualize the relationships between the people and organizations mentioned in the data. It creates a sort of interactive flow chart that lets users click on one party to see who that person is connected to based on the Mossack Fonseca data.

If that data were left in SQL form, finding relationships between people and organizations would require writing long and complicated database queries, Cabra said. “In a graph database, if a company is connected to you, and you are connected to other companies, reporters can follow that thread,” she added.
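The "follow that thread" idea can be illustrated with a small breadth-first search over an in-memory graph. This is a plain-Python sketch, not Neo4j or Linkurious, and the names and connections are invented. In a relational database, each additional hop would typically mean another join in the query; here it is one more pass through the same loop.

```python
from collections import defaultdict, deque

# Hypothetical person-to-company links.
edges = [
    ("J. Doe", "Example Holdings Ltd"),
    ("J. Doe", "Sample Harbor Inc"),
    ("R. Roe", "Sample Harbor Inc"),
    ("R. Roe", "Placeholder Trust SA"),
]

def build_adjacency(edges):
    """Store each link in both directions so threads can be followed either way."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def connections(graph, start, max_hops):
    """Breadth-first search: map every node within max_hops of start to its distance."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    del seen[start]
    return seen

graph = build_adjacency(edges)
for name, hops in sorted(connections(graph, "J. Doe", 3).items(), key=lambda kv: kv[1]):
    print(hops, name)
```

Starting from one person, the search reaches that person's companies at one hop, a co-officer at two hops, and the co-officer's other company at three.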


At that point the Mossack Fonseca data trove could be shared with authorized reporters in Germany, the U.S., Spain, and elsewhere. Each reporting team could run its own queries, track down its own leads, and do its own reporting. All of that preparation work ensured they were working from a single source of Mossack Fonseca data.


In April of this year, after more than a year of work and a slew of articles, ICIJ members, including Süddeutsche Zeitung and the Miami Herald, were awarded the Pulitzer Prize for Explanatory Reporting by Columbia University.

© 2026 Fortune Media IP Limited. All Rights Reserved. Use of this site constitutes acceptance of our Terms of Use and Privacy Policy | CA Notice at Collection and Privacy Notice | Do Not Sell/Share My Personal Information