You may already know that every time you go online, your browsing history could be exposed to numerous advertisers and data brokers who then send you “targeted” advertisements. But what about visiting the news websites you trust most? Our current research finds that browsing news-related websites actually exposes you to over twice as much tracking as the rest of the web.
The surprising extent to which news organizations subject readers to third-party tracking deserves closer attention. As a society, we often hold news organizations to higher ethical standards. They’re not just businesses; they’re supposed to provide a vital public service, and they depend on public trust. While the ethics of readers unknowingly “paying” for content with their privacy are certainly questionable, the practice is also indicative of the precarious situation the news industry finds itself in. Meanwhile, the rise of ad-blockers – a hindrance to the business model of news websites – has only further complicated matters.
Why the media has turned to online tracking
Ever been stalked by that pair of shoes you clicked on once but didn’t buy? Here’s what’s going on: websites frequently allow third parties (primarily online advertisers) to monitor their readers’ activities and interests. These third parties use external servers that often employ what are known as “trackers,” or pieces of software connected to a “hidden web” that monitors users’ activities.
Ad networks will then show users advertisements deemed “relevant” based on which websites they have previously visited. That pair of shoes you keep seeing is the hidden web in action: even if the websites appear totally different on the surface, underneath they may be connected by a vast network of trackers. And it’s this invisible tracking network that the struggling online news industry has turned to. It’s a story that’s been told time and again: as consumers and advertisers have migrated to the web, the longstanding revenue model for ad-dependent news organizations has come under considerable strain.
In response, many publishers have resorted to various ethically murky practices to replace dried-up revenue sources. This could mean publishing native advertising or allowing companies to track what pages readers visit – which they’ll then use to create “consumer profiles.” There seems to be a strange silence surrounding the ethics of native advertising, but online tracking has come under increased scrutiny from regulators and civil society groups.
Meanwhile, heated debates about invasive digital advertising and tracking flared up recently when Apple allowed ad-blocking in the newest update to the iOS mobile operating system. Because ad-blocking prevents publishers from gaining income they derive when ads are clicked, the CEO of the Interactive Advertising Bureau claimed that “ad blocking is robbery” that could lead to an “internet apocalypse.” Others have suggested that the industry created its own problems with run-amok advertising. A major factor in this predicament is that advertising and behavioral tracking have become so intertwined that users who want to protect their privacy must also block advertisements.
Using “X-ray software” to detect hidden servers
For our study, we were interested in understanding the extent to which news sites use trackers. Using Tim Libert’s open-source software platform webXray, we loaded web pages to detect all of the third-party servers that may collect user data. To get a baseline measure of tracking prevalence, we first analyzed Alexa’s top 100,000 websites. We found that users were exposed to an average of eight external servers on each site. This means that many hidden third parties (again, usually advertisers) may be simultaneously observing an individual’s browsing habits. But even more surprising was our finding that news organizations appear to be among the most active perpetrators of this practice.
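To make the idea of counting "external servers" concrete, here is a minimal sketch of the underlying logic. It is not webXray's actual code: the function names, the example URLs, and the shortcut of treating the last two labels of a hostname as the site's own domain are all assumptions made for illustration (a real tool would consult the Public Suffix List to handle domains such as .co.uk).

```python
from urllib.parse import urlsplit


def base_domain(url):
    """Return a rough 'site' domain for a URL (hypothetical helper).

    Assumption: the last two labels of the hostname identify the site.
    This is wrong for suffixes like .co.uk; it is only a simplification.
    """
    host = urlsplit(url).hostname or ""
    parts = host.lower().split(".")
    return ".".join(parts[-2:])


def third_party_domains(page_url, request_urls):
    """Collect the distinct domains contacted that differ from the page's own."""
    first_party = base_domain(page_url)
    found = set()
    for url in request_urls:
        domain = base_domain(url)
        # Skip relative URLs (no hostname) and the site's own subdomains.
        if domain and domain != first_party:
            found.add(domain)
    return found


# Made-up example: one page and the resources a browser requested while loading it.
page = "https://www.example-news.com/"
requests_seen = [
    "https://static.example-news.com/app.js",     # same site, not counted
    "https://cdn.tracker-one.example/pixel.gif",  # third party
    "https://ads.tracker-two.example/ad.js",      # third party
    "https://cdn.tracker-one.example/beacon",     # same third party again
]

print(sorted(third_party_domains(page, requests_seen)))
# prints ['tracker-one.example', 'tracker-two.example']
```

Averaging that count over many pages gives the kind of per-site figure reported here, under the simplifying assumptions noted above.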
Our investigation has revealed that among the 2,000-plus news-related websites identified by Alexa, readers are, on average, connected to over 19 third-party servers – more than twice as many as on the 100,000 most popular sites. The outlets facilitating this tracking include the most respected names in the news industry, coast-to-coast. A visitor to The New York Times’ homepage is potentially connected to a whopping 44 third-party servers, while visitors to the Los Angeles Times’ website get their browsing history leaked to 32 external servers. And if you’re planning on checking the forecast on AccuWeather before heading out, you can expect to be connected to 48 third-party servers.
Even visitors to public media outlets are not safe. A visitor to NPR’s website will be tracked by Chartbeat, Google, Nielsen Online, Moat and comScore. In general, we found that marquee media brands are no different from the other 2,000 news sites that we examined. It’s a problem that’s endemic to the entire internet media sector. While these findings are preliminary, they’re in line with recent research, and we’ve used an established research methodology to shed light on this practice.
A clown car of trackers
The findings are troubling for a number of reasons. In addition to ethical concerns regarding user consent and privacy rights, excessive tracking is a major cause of news sites’ slow load times as readers are forced to wait for multiple trackers to download. Picture a clown car driving into your living room and a nearly endless parade of marketers hopping out and competing with each other to peek over your shoulder while you surf the web. News websites are taking the internet’s average “creep factor” of privacy invasion and doubling down on it.
Much of this tracking is performed by well-known companies that wield significant presence and power. For example, Google and Facebook have code on 92% and 56%, respectively, of the 2,000-plus news-related sites that we analyzed. More troubling, even those who use privacy-friendly search engines like DuckDuckGo and avoid social media aren’t spared from the internet giants’ tracking, which often occurs silently on the sites they end up surfing each day. While many users may be resigned to the power these companies have over their information, regulators across the world have continued to investigate and penalize them for deceptive and unethical business practices.
While Google and Facebook don’t directly sell user information, we found 67 instances of sites leaking reader information to data brokers Experian and Acxiom. Both of these companies sell personal information on the open market, with little oversight or regulation. Furthermore, modern big data techniques allow seemingly “anonymous” data to be paired with other information such as email addresses, linked back to real names, or even used to discriminate against minorities. Although many companies claim not to sell “personally identifiable information,” this is often based on an antiquated definition of the term unsuited for the big data era.
A peeved public
So why does it matter that news sites engage in this practice? Well, your news-reading habits can be a reflection of who you are and what you’re likely to buy. For example, if you read the business section, you may be more likely to buy luxury goods. Indeed, companies called data brokers bundle people into consumer “segments,” with categories ranging from “power elite” and “american royalty” all the way down to “small town shallow pockets” and “urban survivors.” In the absence of regulation or comprehensive disclosure, it is unclear how web browsing histories make their way into these determinations, but the potential exists.
It should come as no surprise that this practice, as it becomes better understood, doesn’t sit well with the public. Extensive survey research has shown that users are opposed to such invasions of their privacy. At the same time, they feel like there’s nothing they can do to protect themselves. Publishers that complain about the ethics of ad blockers should also consider the ethics of tracking users and their outsize role in widely reviled annoyances such as increasing page load times, invading privacy, sucking up data on limited plans and imposing distracting animations and sounds on the viewer. It’s doubtful that visitors to news sites are aware of this tracking, and the websites provide few if any clues that it’s happening. While modern browsers have implemented a “Do Not Track” setting, many online advertisers choose to ignore it. Even if users make it clear they wish to be left alone, they are still tracked.
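For readers curious about the mechanics: a browser with “Do Not Track” switched on adds the header DNT: 1 to each request it sends. Honoring it is a choice made by whoever receives the request. The sketch below shows, under stated assumptions, how little code that choice requires; the function names and script lists are hypothetical and do not describe any particular publisher's system.

```python
def wants_no_tracking(headers):
    """Return True if the request headers carry 'DNT: 1'.

    Header names are compared case-insensitively, as HTTP requires.
    """
    for name, value in headers.items():
        if name.lower() == "dnt":
            return value.strip() == "1"
    return False


def scripts_to_serve(headers, content_scripts, tracking_scripts):
    """Hypothetical policy: leave out tracking scripts when the reader opts out."""
    if wants_no_tracking(headers):
        return list(content_scripts)
    return list(content_scripts) + list(tracking_scripts)


# Made-up example: the same page served to two different readers.
content = ["article.js"]
trackers = ["ad-network.js", "audience-profile.js"]

print(scripts_to_serve({"DNT": "1"}, content, trackers))
print(scripts_to_serve({"User-Agent": "ExampleBrowser"}, content, trackers))
```

In this sketch the first reader receives only the article script, while the second receives the trackers as well. The barrier to respecting the setting is therefore a business decision, not a technical one.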
Media companies digging their own grave
When publishers don’t give people the tools to opt out of extensive behavioral tracking, they leave readers with only one option to protect their privacy: install an ad blocker. The problem of ad blocking runs much deeper than getting rid of annoying pop-ups; it strikes at the core business model of digital journalism, which often relies on ad revenue.
Publishers sit at a crossroads. They can continue down the path of overly invasive advertising or they can try to correct course. They can ignore user wishes in an attempt to make slightly more money on “targeted” ads, or they can respect their visitors’ “Do Not Track” requests. If news outlets attempt to “block the blockers,” it will only result in a protracted – and unwinnable – war with readers. Regardless, trying to improve upon a fundamentally deceptive model is not the best lesson to draw from this quandary.
Indeed, there are signs that the industry may continue ethically dubious advertising practices by increasing its focus on native advertising. Some commentators are advocating for newspapers to just develop better advertising – with “better” too often meaning fewer animations, not greater respect for reader privacy. Others have, once again, called for paywalls or some type of subscription model. And then there are pessimists who question whether any profit-driven, commercial model of digital journalism will ever provide the quality of news and information that a democracy requires.
To be clear, we do not wish for the economic model supporting online journalism to collapse. Democratic society needs journalism, and we recognize that hard news is expensive to produce. But we can’t let commercial imperatives run roughshod over the public’s right to privacy. News organizations should be ethical exemplars, not the bad apples among online actors.