Researchers at the Massachusetts Institute of Technology (MIT) and Oxford University have discovered that the location stamps on just a handful of Twitter posts can be enough to let a relatively low-tech snooper work out the addresses of your home and workplace.
The tweets themselves might be seemingly harmless – links to cat videos or comments on the latest episode of Great British Railway. The location info comes from geographic coordinates automatically associated with the tweets, and as few as eight tweets over the course of one day are all it takes.
Giving up privacy
Twitter’s location-reporting service is set to ‘off’ by default, but Twitter users often choose to activate it. The discovery, and the subsequent research paper from the two institutions, are part of a more general project at MIT’s Internet Policy Research Initiative. The project is geared towards raising awareness of just how much privacy people may be giving up when they use social media.
Ilaria Liccardi, a research scientist at MIT’s Internet Policy Research Initiative and first author on the paper, said: “Many people have this idea that only machine-learning techniques can discover interesting patterns in location data.
“And they feel secure that not everyone has the technical knowledge to do that. With this study, what we wanted to show is that when you send location data as a secondary piece of information, it is extremely simple for people with very little technical knowledge to find out where you work or live.”
In their study, Liccardi and her colleagues used real tweets from Twitter users in Oxford, England, and Boston, USA. The users consented to the use of their data, and they also confirmed their home and work addresses, their commuting routes and the locations of various leisure destinations from which they had tweeted.
The time and location data associated with the tweets were then presented to a group of 45 random study volunteers, who were asked to try to deduce whether the tweets had originated at the Twitter users’ homes, their workplaces, leisure destinations or locations along their commutes. The participants were not recruited on the basis of any particular expertise in urban studies or the social sciences; they just drew what conclusions they could from location clustering. They had no information about the content of the tweets.
The data were presented in three different forms – a static Google map, in which tweet locations were marked with virtual pins; an animated version of the same map, in which the pins appeared on-screen in chronological order; and a table listing geographical coordinates, street names and times of day.
The maps featured only street names, with no names of businesses, parks, schools or other landmarks. But pins and table rows were colour coded to indicate general time of day – morning, afternoon or evening. The researchers also varied the volume of data that the participants were asked to consider – one day’s, three days’ or five days’ worth.
Participants fared better with map-based representations, correctly identifying Twitter users’ homes roughly 65 percent of the time and their workplaces at closer to 70 percent. Even the tabular representation was informative, however, with accuracy rates of just under 50 percent for homes and a surprisingly high 70 percent for workplaces.
In general, participants also fared better with five days’ worth of data than with three or one. Across all three representations, participants with five days’ worth of data could correctly identify workplaces, for example, with more than 85 percent accuracy. Interestingly, the participants’ performance with three days’ worth of data was generally worse than it was with only one. It could be that, while a single day’s data is likely to be representative of a user’s typical patterns of movement, three days’ worth introduces the possibility of confounding variations, which are ironed out over five days.
Latanya Sweeney, professor of government and technology in residence at Harvard University and a former chief technology officer of the U.S. Federal Trade Commission, said: “Ilaria’s new paper puts two significant bricks in the wall of our privacy understanding.
“First, her survey shows how people can learn sensitive information from seemingly innocuous facts, and, second, people will easily share information they believe is innocuous.”