Tony Hughes is managing director at Moody’s Analytics.
As an economic statistician who analyzes data for a living, I have been fascinated to observe the recent court case involving hiQ Labs and LinkedIn. My job depends quite crucially on access to interesting databases, and LinkedIn has a remarkable one.
In August 2017, LinkedIn blocked hiQ from accessing the data in the public profiles of its registered users, which hiQ had been using to build algorithms. HiQ took LinkedIn to court, and a U.S. district court judge in the Northern District of California granted a preliminary injunction against LinkedIn. LinkedIn’s attorneys then appealed to the Ninth Circuit Court of Appeals, where the case is now mired in the appeals process. Oral arguments were heard in San Francisco on Thursday.
The case hinges on questions of who owns a piece of data and the circumstances under which the information can be viewed as residing in the public domain, accessible by all and sundry. The appeals court judges may rule that LinkedIn owns exclusive rights to the data, which would not have been compiled without the entrepreneurial talents of LinkedIn’s founders. Conversely, the judges may conclude that since LinkedIn users set their profiles to “public,” placing them in full view of search engines and general web surfers, they are giving companies like hiQ free rein to view and use the data as they see fit.
It’s a knife-edge decision with strong arguments on both sides. Either ruling could have profound implications for how people like you and me interact with data in our daily lives.
If the case goes against LinkedIn, the company will likely have two options: Keep the data public and actively compete in the analytics space, or make the data accessible only to those signed into the service, as the act of signing in would, in theory, limit broader public access to the data. The second of these is perhaps the more interesting.
Most LinkedIn users leverage the site to boost their profiles and careers by looking up and connecting with new business acquaintances or by searching for job opportunities. Personally, I never surf incognito and will normally invite the people I look up to connect. For LinkedIn, which was founded to enable these searches, such a change may be existential. If its data were made private so that LinkedIn profiles could no longer be found through search engines, the utility most users get from the service, namely being “found,” would undoubtedly be disrupted and the network would hold much less appeal.
One can easily imagine rival startups, willing to live with open access to any data they collect, springing up to fill the void. Anyone who remembers MySpace, formerly the dominant player in social networking, will know that large networks can be usurped by upstarts with a more compelling value proposition for users. People like the ability to control the public nature of their LinkedIn profile and can be counted on to migrate in large numbers if the feature is ever withdrawn.
LinkedIn would then hold exclusive rights to a database of rapidly diminishing value.
If, by contrast, LinkedIn prevails in the case, the implications will be monumental. One major positive outcome of such a result would be that social network entrepreneurs would be emboldened to start new and better websites and apps. The industry is only 15 years old, so considerable innovation is still possible. Holding exclusive data rights to a network can be viewed as akin to holding a patent on a new invention. Society bestows these rights to allow inventors to recoup the investments they made and to reward them for the risks they took to get their ideas into the marketplace. The prospect of holding monopoly-pricing power then encourages other individuals and organizations to expend energy to develop their own ideas. The same may well be true for social networks.
“Inventing” a new database is, generally, very hard. Those who come up with clever ways of extracting information, like building a social network, or who expend the laborious effort to collect data from primary sources, should be able to profit handsomely from their exertions. If such data were made public through judicial decree, free riders would abound and private enterprise would have no incentive to collect data in the future.
On the other hand, with LinkedIn in full control of the compiled information, society would suffer in other ways. The world is being flooded with data, and analysts around the world are grappling with ways to link disparate databases across cyberspace. One can posit that profound insights that could enhance social welfare lie at the intersections of these databases. Perhaps LinkedIn data could be tied to health statistics to yield a cure for cancer? This is a stretch, of course, though we can’t definitively rule it out until research is undertaken. Along the way, companies may identify less profound, but nonetheless useful, insights that allow them to profit by making life slightly better for their clients. These innovations would be greatly diminished if competitive pressures were removed.
So we, through our legislators and judges, face an especially cruel choice in cases like this. We can make our data freely available and have no one bother to collect it. Or, we can bestow ownership rights on the data and potentially miss out on beneficial insights gleaned from its analysis.
As a statistician, I hope that large databases continue to be collected so we can explore how such data can be used not only for private benefit but also for public benefit.