
Researchers Caused an Uproar By Publishing Data From 70,000 OkCupid Users

By Robert Hackett
May 18, 2016, 3:41 PM ET

Earlier this month, Danish researchers published data from the online profiles of nearly 70,000 OkCupid users—including usernames, political leanings, drug usage, and intimate sexual details—creating a privacy firestorm.

The researchers, Emil Kirkegaard and Julius Daugbjerg Bjerrekær, used data scraping software developed by a third contributor, Oliver Nordbjerg, to collect the information for a study that explored, among other things, the thinking of people on the site. They posted the database along with a draft paper on Open Science Framework, a site that encourages open source science research and collaboration.

Unlike recent incidents at Ashley Madison, a site for people seeking extramarital affairs, as well as some adult networks that cater to people with fetishes, the OkCupid research did not involve a security breach. That didn’t stop the ensuing controversy.

“Some may object to the ethics of gathering and releasing this data,” the authors wrote in the draft paper, which has since been pulled. “However, all the data found in the dataset are or were already publicly available, so releasing this dataset merely presents it in a more useful form.”

Online commenters, OkCupid users, the site’s operators, and academics attacked (and, in some cases, threatened) the researchers for making user information public. Some questioned whether such data harvesting, bundling, and broadcasting is justifiable for academic research and whether it crosses ethical and legal lines.

Although the researchers did not release the real names and pictures of the OkCupid users, critics noted that their identities could easily be uncovered from the details provided, such as the usernames. “Your private life is a few big leaks away from being an inescapable matter of public record, once a statistician with BitTorrent gets bored,” Scott Weingart, a digital humanities specialist at Carnegie Mellon University, mused in a post on Twitter (TWTR). He added that it would be easy to identify more than 10,000 of the people in the data dump and link them to their sexual inclinations.

Kirkegaard said that his group posted people’s usernames because it found the data on these self-selected pseudonyms to be scientifically interesting. (What does use of the word “hot” in an alias say about its subject, for example?) He also argued that retaining the information in the dataset would allow certain missing details—like height, profile text, or photos—to be added later.

@esjewett No. Data is already public.

— Emil O W Kirkegaard (@KirkegaardEmil) May 11, 2016

The data, collected from November 2014 to March 2015, is indeed public, sort of. Some of it, such as bios, photos, age, gender, and sexual orientation, is easily accessible through basic Google (GOOG) searches. Answers to some 2,600 of the service’s most popular dating survey questions are restricted to people who are logged into the site and who have answered the same questions.

The site’s users can also set certain answers to “private,” which makes the responses inaccessible to others. In this case, the researchers scraped and presented the data accessible through Google and Q&A responses from individual profiles.

“We thought this was an obvious case of public data scraping so that it would not be a legal problem,” Kirkegaard wrote to Fortune.

Last week, after the appearance of the dataset began inciting an uproar, Open Science Framework, the site that hosted the data, placed it behind a password-protected wall. OkCupid then filed a copyright claim on Friday ordering the site to take it down altogether. The page where the data first appeared was initially changed to read: “Unavailable for legal reasons.” Now it simply states “Content removed.”

The editorial board at Open Differential Psychology, the journal to which the researchers submitted the accompanying paper (and where Kirkegaard is the editor), is currently reviewing the submission, Kirkegaard told the science blog Retraction Watch. “If the journal does not take the paper, we will probably publish it elsewhere,” he said.


OkCupid, owned by InterActiveCorp’s (IAC) Match Group (MTCH), released a statement that complained about the published data. “This was a violation of our terms of service and we sent a take-down notice,” Mathew Traub, a spokesperson for OkCupid, told Fortune in an email. “They appear to have complied.”

Kirkegaard said in a Twitter post that he did not ask the company for permission to collect or publish the data beforehand. Some commenters have argued that the researchers breached research ethics by failing to obtain the consent of the OkCupid users, too, before gathering and republishing their information. They cite, among other things, “code of conduct” guidelines by the American Psychological Association.

Aarhus University in Denmark, the school at which Kirkegaard is a graduate student, distanced itself from the team of students, who undertook the project in their spare time. “The views and actions by student Emil Kirkegaard is not on behalf of AU,” the university said in a statement posted to Twitter. “[H]is actions are entirely his own responsibility.”

This is not the first time someone has scraped the profile data of OkCupid users, of course. At least one individual cleverly “hacked” the dating system to get more romantic matches several years ago. And the site’s co-founder, Christian Rudder, published a treatise on data science that analyzed information from the data-rich dating network. These cases are different, however, from the latest instance of scraping, packaging and releasing profile information publicly.

A better comparison would be a 2008 study out of Harvard University that relied on information culled from Facebook (FB) profiles. The researchers did use some anonymizing techniques, but critics said the protections were not strong enough. The scientists ultimately took down the data.

In a message sent to Fortune, Kirkegaard wrote that he did not rule out the possibility of republishing the data his team collected with more effort put into obscuring the identity of the OkCupid users. Given OkCupid’s interpretation of its terms of service agreement, and its copyright claim, it’s unlikely that the company will sign off on the proposed compromise. As with the Harvard Facebook study, the data may very well remain in limbo.

It’s no surprise that people are sensitive to having their romantic and other interests neatly presented for others to rifle through online, even if done in the name of science. In addition to questions raised about the ethics of certain data science practices, the boundaries of open science research, and the ease of identifying the members of a given dataset, the incident reveals something else, too: People continue to give up vast quantities of their personal data to sites online, expecting privacy.
