Yesterday Google took a step in the right direction when it comes to protecting our privacy. Under a policy expected to be implemented before the end of the year, Google will begin anonymizing the search data it collects after 18-24 months, so that it can no longer be identified with any individual user. This is a good step, but does it go far enough?
As is often the case, Danny Sullivan has all the details, but the main point is that for each search you do, Google collects:
- IP Address
- Date & Time
- Query Terms
With the help of your ISP, your IP address can be used to track your computer, and the cookie Google sets can be used to track your computer as it moves from one IP to another. After 18-24 months, Google will now overwrite both values so the query terms in a search can no longer be traced to any particular user.
You may remember this past summer AOL released search data that it believed was anonymous. In a matter of days some of that anonymous data was tied to one very real person. In the AOL case, while no person was associated with any specific query, the queries themselves were connected so that anyone could tell which queries were done by the same person. By making a few educated guesses New York Times reporters were able to figure out the identity of one of the searchers.
Google will be overwriting its data in such a way that no two queries can be associated as having been performed by the same person. Google is still working out the best way to randomly overwrite the search data, which is why the policy change will take several months to put into effect.
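To make the idea concrete, here is a minimal sketch in Python of what overwriting the IP address and cookie could look like. Google has not said how it will do this, so everything here is an assumption for illustration: the `LogEntry` fields, the `RETENTION` window, and the `anonymize` function are hypothetical names, not anything from Google's systems.

```python
import secrets
from dataclasses import dataclass, replace
from datetime import datetime, timedelta

# Hypothetical retention window; the announcement only says 18-24 months.
RETENTION = timedelta(days=18 * 30)


@dataclass(frozen=True)
class LogEntry:
    """One search log record (an assumed layout, not Google's real one)."""
    ip: str
    cookie_id: str
    timestamp: datetime
    query: str


def anonymize(entry: LogEntry, now: datetime) -> LogEntry:
    """Overwrite the identifying fields once an entry is older than RETENTION."""
    if now - entry.timestamp < RETENTION:
        return entry  # still inside the retention window, left untouched
    # Each entry gets its own fresh random values, so two queries from the
    # same user no longer share an identifier (the weakness in the AOL data).
    return replace(
        entry,
        ip=secrets.token_hex(8),
        cookie_id=secrets.token_hex(8),
    )
```

The important choice is that every entry gets its own random values. Replacing each user's IP and cookie with one consistent stand-in would repeat the AOL problem, since all of that user's queries would still be linked to each other. Note that the query text is left alone in this sketch, so a query that itself contains a name or an address would still be revealing.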
Does It Go Far Enough?
Not everyone thinks the anonymization goes far enough. There are those who would argue the data should never be stored in the first place or that at the least it should be made anonymous in much less time than 18-24 months. As an end user of search engines I would prefer my data never be collected, but as a marketer I understand how valuable that data can be.
The current Google solution may not be perfect, but it’s certainly a step in the right direction and it does come across as a reasonable compromise for now. Sure, your data will be stored for a time, but at least a decade from now no one is going to know what you searched for today. That’s more than the people who upload embarrassing videos of themselves to YouTube will be able to say.
But will what you search for today really be unknown a few years from now?
Google will be making your searches anonymous in its server logs, but that’s not the only place your data is stored. Though it’s sometimes hard to believe, Google is not the only search engine. Both Yahoo and MSN are assumed to save their server logs indefinitely at the moment, though given Google’s lead in search I suspect both will fall in line and adopt a policy similar to Google’s in the coming months.
Beyond search engines, all your queries still go through your ISP, so they may very well be sitting on your ISP’s servers long after the 18-24 month wait for them to be anonymized on Google’s servers.
Your own computer also stores your search history, and unless you remember to clear it, anyone with access to the machine can get to it easily.
And let’s not forget that Google saves your search history as part of its personalized search. That search history resides in a different place than the search history Google will be anonymizing when it overwrites parts of the server logs. You do have control over destroying this search history, but if you do, you lose the benefits of personalized search.
The new policy is a good move by Google. It won’t go far enough for privacy advocates, but it’s a step in the right direction and it’s certainly better than Google holding the search data indefinitely. Now let’s hope the other engines follow suit, and our ISPs too.