Last week, Twitter handed $10 million plus every public Tweet ever generated (and to be generated) on its social network to the Massachusetts Institute of Technology Media Lab. This is a massive data set: more than 500 million Tweets are sent daily.
The grand research experiment will visualize how information spreads. The results will show up as both data visualizations and as mobile apps to “create new forms of public communication and social organization.” Oh, and by the way, there will be at least five more projects like this, as part of the Twitter Data Grants program.
“To date, it has been challenging for researchers outside the company who are tackling big questions to collaborate with us to access our public, historical data,” notes Twitter in its description of the initiative. “Our Data Grants program aims to change that by connecting research institutions and academics with the data they need.”
But you probably don’t even care because, you know, this stuff is public, right? Plus you probably forgot about some random comment you made, what, five years ago. Perhaps that’s why the Twitter disclosure got very minimal media coverage, at least so far.
Meanwhile, Facebook kind of sort of apologized for the experiment it was caught conducting in June: one in which it rearranged items in the News Feed based on the sentiment of words in them in an effort to play with visitors’ emotions. It actually published the results of that study early this year so other data scientists could learn from it.
“Although this subject matter was important to research, we were unprepared for the reaction the paper received when it was published and have taken to heart the comments and criticism,” writes Facebook CTO Michael Schroepfer in a blog post. “It is clear now that there are things we should have done differently.”
What is the result of this lesson? Tighter research guidelines, data ethics training, and a review board made up of senior executives.
By the way, Reuters reported on Friday that one of Facebook’s next experiments will focus on how the social network can play a role in health care. As in, how member communities can help one another with chronic disease management or in spreading the word for important causes, such as why it’s good to sign up as an organ donor.
During a Fortune Brainstorm Tech event this week focused on marketing technologies, Facebook COO Sheryl Sandberg professed her confidence that members are OK with being data guinea pigs, which will keep them from fleeing to competitors like Ello. “People will continue to use Facebook if they understand that we don’t tell who they are to anyone, if they understand that they have control over what they share, and if we build a great product that continues to connect,” she said.
I’ll buy the notion that people will share information, if it results in some tangible benefit that appeals to their self-interest. But the perpetual stream of cyberbreaches at high-profile brands like JPMorgan, Home Depot, and Target has most otherwise blasé consumers paying far more attention to privacy than in the past.
There’s a lesson here for Fortune 500 companies defining data analytics policies: make sure your scientists get some ethics training, because if/when a breach happens the disclosure process will be far more straightforward.
In my mind, Google is no angel when it comes to manipulating data, but at Fortune’s dinner earlier this week, senior marketing executive Lorraine Twohill described the digital data privacy and “ownership” policy most people want to hear: “The consumer owns the data, that’s why I think controls are so important. The user should be able to know what you know about them and how you’re using it. And they should be able to get out if they want to.”
Yes, it’s all about the data, but never forget who really “owns” it.
This item first appeared in the Oct. 3 edition of Data Sheet, Fortune’s daily newsletter on the business of technology.
Correction, October 6, 2014: An earlier version of this post mischaracterized Facebook’s News Feed experiment as one in which it “replaced adjectives in newsfeeds to play with visitors’ emotions.” It actually rearranged items based on adjectives used within them.