16 Feb As Data Collecting Grows, Privacy Erodes
New York Times, By NOAM COHEN, February 15, 2009
There are plenty of people who can muster outrage at Alex Rodriguez, the Yankees third baseman who is the latest example of win-at-any-cost athletes. But I’d prefer to see him as at the cutting edge of another scourge — the growing encroachment on privacy.

The way Mr. Rodriguez’s positive steroid test result became public followed a path increasingly common in the computer age: third-party data collection.

We are typically told that personal information is anonymously tracked for one reason — usually something abstract like making search results more accurate, recommending book titles or speeding traffic through the toll booths on the thruways. But it is then quickly converted into something traceable to an individual, and potentially life-changing.

In Mr. Rodriguez’s case, he participated in a 2003 survey of steroid use among Major League Baseball players. No names were to be revealed. Instead, the results were supposed to be used in aggregation — to determine if more than 5 percent of players were cheating — and the samples were then to be destroyed.

It is odd that most of the news coverage described the tests as “anonymous.” If the tests were truly anonymous, of course, Mr. Rodriguez would still be thought of as a clean player — as he long had insisted he was. But when federal prosecutors came calling, as part of a steroid distribution case, it turned out that the “anonymous” samples suddenly had clear labels on them.

As a friend put it in an e-mail message: “Privacy is serious. It is serious the moment the data gets collected, not the moment it is released.”

To Jonathan Zittrain, a professor of Internet law at Harvard, there is an obvious explanation for this kind of repurposing of information — there is so much information out there. Supply creates demand, he argues.

“This is a broader truth about the law,” he writes in an e-mail message. “There are often no requirements to keep records, but if they’re kept, they’re fair game for a subpoena.”

And we are presented with what Professor Zittrain calls the “deadbeat dad” problem. There are government investigators, divorcing spouses, even journalists, who have found creative ways to exploit the material. “So many databases,” he writes, “as simple as highway toll collection records or postal service address changes, lend themselves to other uses, such as finding parents behind on their child support payments.”

Perhaps a more direct explanation is that data collection is part of what Cindy Cohn, the legal director of the Electronic Frontier Foundation, calls “the surveillance business model.” That is, there is money to be made from knowing your customers well — with a depth unimaginable before Internet cookies allowed companies to track obsessively online behavior.

“We took whatever was done offline and put it on steroids,” she said, perhaps with the Rodriguez case in the back of her mind. “It requires compliance with the kind of promises that comes with this kind of data collection.”

The foundation argues that online service providers — social networks, search engines, blogs and the like — should voluntarily destroy what they collect, to avoid the kind of legal controversies the baseball players’ union is now facing. The union is being criticized for failing to act during what apparently was a brief window to destroy the 2003 urine samples before the federal prosecutors claimed them.

“You don’t want to know that stuff,” she says, speaking of the ordinary blogger collecting data on every commenter. “You don’t want to get a subpoena. For ordinary Web sites it is a cost to collect all this data.”

The digital format makes it easy to cling to material that normally would be disposed of or would disintegrate. Storage is cheap and practically limitless. And Ms. Cohn says of the people who dominate the Internet, “the people who design software, in my experience, tend to be pack rats.”

Journalists are sometimes advised to destroy their notes every few months so that they can’t be used in a lawsuit. Yet, somehow you want those notes — you see only how they could set you free, or lead you back to a new story, not prove your guilt.

Even though Google is most frequently viewed as the most worrisome collector of personal data — the mail you send, the documents you write, the books you read — Ms. Cohn said, “I have a higher confidence that Google will do what it says — because Google has lots of people watching them — than other, smaller sites.”

As a legal director, she focuses on holding organizations responsible for their promises to customers. The foundation is suing AT&T for cooperating with the government’s surveillance of telephone calls — “they broke their promises to their customers — ‘we are going to route your phone calls, not be an agent of the state.’ ”

In an online opinion column for The New York Times, Doug Glanville, a former teammate of Mr. Rodriguez who was part of the steroid survey, begins by writing that “there was one clear moment when I wanted to be treated like a number.” It was, he said, “the day in 2003 that I went in for a drug test as a member of the Texas Rangers.”

I’m here to tell him that being treated as a number may be cruel comfort. On the Internet, he can be tracked by investigators quite content to think of him as a number: they call it his IP address.