A couple of nights ago I was shutting down my computer when, in the process of quitting Firefox, a mysterious screen opened before my eyes with what appeared to be a galaxy of circles and triangles, interconnecting and sprouting new satellites as I watched. Closer inspection revealed it to be a graphic visualization of sites I had recently visited, with all the third-party sites they had contacted clustered around them. It was a moment out of WarGames. Had I been hacked? I wondered. But no, it turned out to be Lightbeam, a browser add-on that lets you see in real time how the sites you visit share your information on the Internet, and with whom. I vaguely remembered installing it a while back, but I had never gotten around to using it, hence my initial surprise when it seemingly opened itself in my browser. But once I saw what it revealed, I was riveted.
Let’s look at some sites, I said to myself. I opened The Guardian newspaper, and dozens of third-party connections appeared on the graph. OK, how about the Washington Post and New York Times? Dozens more appeared, forging connections with the other sites on the graph through their own third-party connections. Then, just for yucks, I tried the Brattleboro Reformer site, and 54 more triangles appeared.
Finally, with some trepidation, I loaded iBrattleboro. Wouldn’t you know, with the exception of one little shared image on Tumblr, which we have since removed, we were free and clear. No lines arced out from iBrattleboro to connect it with the rest of the galactic denizens and their vast armies of tracking cookies. Phew! We had escaped the clutches of the evil empire of data merchants — for now.
To confirm a suspicion, I cleared all the cookies from my browser down to the last one, and then loaded the New York Times again. Over 150 cookies instantly appeared, with more added the longer I let the page stand idle. At one point, as I compulsively cleared cookies, more appeared even though I hadn’t returned to the Times tab. Although the page was not in the active tab, it was maintaining a connection with my browser and pushing cookies at me. Not until I closed the tab with the NYT page did the cookies stop rolling in.
I’m in the web biz and have been almost since the beginning. I understand why cookies are used and the cases in which they’re not only acceptable but useful, as when, for instance, I want to stay logged into a site I frequently visit such as iBrattleboro. If I’ve applied for an account on a site, I’m fine with them getting my login from a cookie. But 150 cookies planted by a site where you don’t even have an account and have not signed in is simply unacceptable, not to mention exploitative. The only reason we tolerate it is that the vast majority of us have no idea it’s going on.
Well, folks, it is. The sites you love the most and use the most frequently (with the exception of this one) are pulling data from you with almost every keystroke. It’s called “tracking” and it’s the source of a great deal of revenue to companies as different as Facebook and the New York Times, not to mention the hundreds of other tracking and so-called research companies that serve them and anyone else interested in buying or selling user data.
You can largely solve this problem through browser settings by choosing to accept third-party cookies ‘never’ instead of ‘always’ or ‘from sites you visit.’ That way, if you go to the New York Times, you’ll only get cookies from the New York Times and not from dozens of advertising and data companies. The only drawback to this method is that some sites will refuse to work for you unless you set cookie access wide open by allowing third-party cookies. There’s usually no reason why you need third-party cookies, but some sites want you to take all of them, for the sole reason that if you don’t, they can’t track you. Your only choice in this case is to avoid that site.
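For the curious, here is a rough sketch of the distinction that setting relies on: a cookie is "third party" when it belongs to a different site than the page you are reading. The hostnames are only illustrative, and the helper names `site_of` and `is_third_party` are made up for this example. It also assumes a site is just the last two parts of a hostname, which real browsers do not; they consult the Public Suffix List, so this shortcut gets addresses ending in something like co.uk wrong.

```python
def site_of(host: str) -> str:
    """Naive 'site' for a hostname: its last two labels.

    Assumption: this ignores the Public Suffix List that browsers use,
    so it misjudges suffixes such as co.uk.
    """
    labels = host.lower().strip(".").split(".")
    return ".".join(labels[-2:])


def is_third_party(page_host: str, cookie_host: str) -> bool:
    """True when the cookie belongs to a different site than the page."""
    return site_of(page_host) != site_of(cookie_host)


if __name__ == "__main__":
    page = "www.nytimes.com"  # illustrative page host
    for cookie_host in [".nytimes.com", "ad.doubleclick.net", "www.facebook.com"]:
        label = "third party" if is_third_party(page, cookie_host) else "first party"
        print(f"{cookie_host}: {label}")
```

With third-party cookies set to ‘never,’ the browser would keep only the cookies this check calls first party.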
The cookies I found in a survey of several dozen large sites, including government, education, e-commerce, and media, fell into two general categories: social media and advertising. Among the most frequently seen social media connections were the usual suspects: Google, Facebook, Twitter, LinkedIn, and YouTube. These guys are collecting and ‘monetizing’ our data. Google Analytics is commonly used to tell site owners who is visiting their site, so you’ll often see cookies for that. They’re collecting and selling data too. So are DoubleClick and Google, which power most web advertising campaigns. The rest of the rabble just like to know where you’ve been, so they can add it to your profile and sell it to advertisers and other interested parties.
If you want to take this voyage of discovery for yourself, download the Lightbeam add-on for Firefox and follow the simple instructions for use. I guarantee you’ll be fascinated. You can also try the cookie experiment (perhaps on a browser you don’t use that often if you’re afraid of disrupting other services). Close all tabs, clear cookies, then load one site, and go back and look at your cookies. Did your chosen site dump a few or a lot? Repeat with other sites you like. It’s highly educational. Even if you don’t clear all your cookies, you might enjoy seeing what’s already on your computer. So many tracking beacons, each one phoning home…
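If you would rather count than scroll, the cookie experiment can be sketched in a few lines of Python. This assumes the cookies live in a SQLite database with a `moz_cookies` table that has a `host` column, which is how Firefox's cookies.sqlite has been laid out in the versions I know of; the layout may differ in yours. The function name `cookies_per_host` and the demo rows are invented for illustration.

```python
import sqlite3
from collections import Counter


def cookies_per_host(conn: sqlite3.Connection) -> Counter:
    """Count stored cookies per host.

    Assumption: the database has a moz_cookies table with a host column,
    as in Firefox's cookies.sqlite; the schema may vary by version.
    """
    rows = conn.execute("SELECT host FROM moz_cookies")
    return Counter(host.lstrip(".") for (host,) in rows)


if __name__ == "__main__":
    # Stand-in data so the sketch runs without touching a real profile.
    demo = sqlite3.connect(":memory:")
    demo.execute("CREATE TABLE moz_cookies (host TEXT, name TEXT)")
    demo.executemany(
        "INSERT INTO moz_cookies VALUES (?, ?)",
        [
            (".example.com", "session"),
            ("ads.example.net", "id"),
            ("ads.example.net", "uid"),
        ],
    )
    for host, count in cookies_per_host(demo).most_common():
        print(f"{count:3d}  {host}")
```

To try it on real data, it is probably safest to close the browser, copy cookies.sqlite out of your Firefox profile folder, and point `sqlite3.connect` at the copy rather than the original.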
Information is power, and unaccountable Internet companies should probably be prevented from knowing everything there is to know about us. Simply giving our data to them because we don’t understand what they’re doing is not going to prevent them from misusing that data in the future. Hell, they’re misusing it now. Capitulation is the easy path and very likely the one we’ll collectively take, but like all abdications of that sort, we’ll almost certainly regret it in time. Or not.
Happy browsing, everyone!
(Not for the faint of heart)
In my official recorded test of just how many third-party sites are contacted in a typical browse of the morning news, I loaded six media sites: New York Times, Washington Post, Guardian UK, Business Insider, Brattleboro Reformer, and iBrattleboro.
Here’s what it looked like:
The big glom in the middle consists of the media sites mentioned above with the exception of iBrattleboro. If you look off to the right you’ll see a little dot with an i in it. That’s iBrattleboro. The other five sites contacted over 336 third-party sites between them just by loading their home pages.
Full disclosure regarding iBrattleboro: if there is embedded content displayed on any page, third parties will be contacted and cookies will be planted. In this test, we made sure no embedded content was displayed above the fold. Embedded content refers to things such as YouTube videos and SoundCloud files.
Speaking of cookies, in the test depicted above, I cleared all cookies before running it and then checked them after loading the six sites. There were approximately 190 companies listed, each with one or more cookies. By my estimation, there were at least 300 and probably closer to 400 cookies planted. The browser doesn’t provide an exact count. If you look at the names in the list from the first screen-load, you’ll see that many of them begin with the word ‘ad.’ That should be a clue as to their purpose, although there are others that include the word ‘predictive,’ which is less reassuring.
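The eyeball test described above, scanning the list for names that begin with ‘ad’ or include ‘predictive,’ can be roughed out in code. This is only a sketch: the function name `flag_host` and the sample hosts are invented, and the prefix check is crude enough to flag an innocent name like adobe.example too.

```python
def flag_host(host: str) -> str:
    """Label a cookie host using the two clues from the list of names.

    'ad' means the first part of the name starts with 'ad';
    'predictive' means the word appears anywhere in the name.
    """
    name = host.lower().lstrip(".")
    if "predictive" in name:
        return "predictive"
    if name.split(".")[0].startswith("ad"):
        return "ad"
    return "other"


if __name__ == "__main__":
    # Made-up hosts standing in for the names in the browser's cookie list.
    sample = ["ads.example.net", "adserver.example.com",
              "predictive.example.org", "www.example.com", "adobe.example"]
    for host in sample:
        print(f"{flag_host(host):10s} {host}")
```

A name check like this says nothing certain about what a company does with the data; it only narrows down which entries are worth looking up.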
Click here to learn more about Lightbeam for Firefox.