RFID, GPS Technology and Electronic Surveillance
Mobile Carriers Start to Wrap Their Brains (and Wallets) Around a Gold Mine of Location-Based Usage Data
May 28, 2010
Popular Science - A great article in MIT's Technology Review got me thinking of something that's so obvious, but almost always subconscious: your mobile phone provider knows so much about you. Every time you make a call, send a text or download data, your provider knows who you were talking to and for how long, along with where you were at the time of the connection, accurate to within a mile.
This staggering stream of data is a gold mine for the mobile operators--both from an academic and commercial perspective. Now, they just have to figure out how to make the most of it, answering complex privacy questions along the way.
These data reports, or call detail records (CDRs), have been kept for years, logging each connection with the specific ID of the cell tower that carried it. But only now are computer scientists developing algorithms powerful enough to crunch the data in meaningful and efficient ways.
Tech Review spoke to Ramón Cáceres, a researcher at AT&T, on how his company is putting these newly acquired analytical powers to use. So far, it's mostly academic and civic research. Studying travel habits of AT&T subscribers in New York and Los Angeles, for instance, Cáceres and his team found that the average Manhattanite travels 2.5 miles per day, while AT&T-using Angelenos log five miles. Another study, conducted by researchers at MIT, traced location data for mobile users attending major public events like a Red Sox game, analyzing how and when they travel to and from the stadium. Data such as these are proving incredibly valuable to city planners, who previously had to rely on limited surveys, which are far less accurate.
But as you would imagine, the data is also tremendously useful (and valuable) commercially. Hyper-accurate location tracking could enable more accurate billboard advertising pricing, for instance. Every major network operator is now trying to decide how to use their golden pile of data for more revenue.
And of course, there are privacy concerns. Large analyses of CDRs are usually done anonymously. But one could very easily twist the numbers to, say, discern the home address of any subscriber by analyzing which cell towers their phone connects to most often in the overnight hours.
Massive-scale mobile data analyses such as these are uncharted territory, and while the potential for both revenue and insight is high, it remains to be seen how the privacy implications will shake out. Really interesting stuff--much more over at Technology Review [see story below].
Mobile Data: A Gold Mine for Telcos
A snapshot of our activities, cell phone data attracts both academics and industry researchers.
May 27, 2010
Technology Review - Cell phone companies are finding that they're sitting on a gold mine--in the form of the call records of their subscribers.
Researchers in academia, and increasingly within the mobile industry, are working with large databases showing where and when calls and texts are made and received to reveal commuting habits, how far people travel for public events, and even significant social trends.
With potential applications ranging from city planning to marketing, such studies could also provide a new source of revenue for the cell phone companies.
"Because cell phones have become so ubiquitous, mining the data they generate can really revolutionize the study of human behavior," says Ramón Cáceres, a lead researcher at AT&T's research labs in Florham Park, NJ.If you were an AT&T subscriber and were near Los Angeles or New York between March 15 and May 15 last year, there's a 5 percent chance that your data was crunched by Cáceres and his colleagues in a study of the travel habits of the company's subscribers. The researchers amassed millions of call records from hundreds of thousands of users in 891 zip codes, covering every New York borough, 10 New Jersey counties, as well as Los Angeles, Orange, and Ventura counties in California.
The data set is a collection of call detail records, or CDRs--the standard feedstock of cell phone data mining. A CDR is generated for every voice or SMS connection. Among other things, it shows the origin and destination number, the type and duration of connection, and, most crucially, the unique ID of the cell tower a handset was connected to when a connection was made.
That let the AT&T team know the location of a phone to within a mile radius at the time each CDR was generated, making it possible to determine the distance traveled from home by each cell phone every day. The group found that, on average, people living in Manhattan travel 2.5 miles most days, compared to five miles in Los Angeles.
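As a rough illustration of that kind of calculation, here is a minimal Python sketch. It is not the AT&T team's method, which the article does not spell out. It assumes each record is reduced to a (timestamp, tower ID) pair, that a lookup table of tower coordinates is available, and that a home location is already known; the names `haversine_miles`, `daily_range_from_home`, and `tower_coords` are hypothetical. It reports the farthest point reached from home on each day.

```python
from collections import defaultdict
from datetime import datetime
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lat, lon) points in degrees."""
    lat1, lon1 = map(radians, a)
    lat2, lon2 = map(radians, b)
    h = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * asin(sqrt(h))


def daily_range_from_home(records, tower_coords, home):
    """Return {date: farthest distance from home in miles} for one handset.

    records      -- iterable of (datetime, tower_id) pairs (assumed format)
    tower_coords -- dict mapping tower_id to (lat, lon) (assumed lookup table)
    home         -- (lat, lon) of the handset's presumed home
    """
    farthest = defaultdict(float)
    for timestamp, tower_id in records:
        distance = haversine_miles(home, tower_coords[tower_id])
        day = timestamp.date()
        farthest[day] = max(farthest[day], distance)
    return dict(farthest)


# Made-up towers and records, for illustration only.
towers = {"A": (40.0, -74.0), "B": (41.0, -74.0)}
calls = [
    (datetime(2010, 5, 27, 8, 30), "A"),
    (datetime(2010, 5, 27, 13, 5), "B"),
]
print(daily_range_from_home(calls, towers, home=(40.0, -74.0)))
```

Because a tower ID only places a phone to within roughly a mile, distances computed this way are estimates of the same coarseness.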
"But we also found that when you look at the longest trips people make, people that live in New York go significantly further, 69 miles on a weekday compared to 29 in Los Angeles," Cáceres says.Cáceres hopes to work with city planners, who would usually have to resort to expensive and limited surveys to gather such information.
"This kind of data can help them decide how to invest resources, for example if they want to know where to build a new train or subway station," he says.The AT&T work was presented at a recent workshop in Cambridge, MA, earlier this month as part of the NetSci conference on network science.
For now, Cáceres's group is looking to collaborate rather than commercialize. But cell phone networks are thinking about monetizing their data, says Jean Bolot, a researcher at network operator Sprint. This means a "two-sided" business model where they not only serve end users but also make money through relationships with other businesses.
"This is new in the telco space but not in other areas--look at Google, for example," he says.Since almost everyone has a cell phone, the scale of the data is immense compared to other sources. Mobility patterns might, for example, be used to adjust property or billboard advertising prices.
"Just about every operator on the planet is probably thinking about this right now," says Bolot.Another study, presented by Francesco Calabrese, a research scientist at MIT, and colleagues correlated location traces from roughly a million cell phones in greater Boston with listings of public events such as baseball games and plays, showing how people traveled to attend these events.
"We could partly predict where people will come from for future events," the team wrote in a report on their work, suggesting it could be possible to provide accurate traffic forecasts for special events.The surge of research in this area has been enabled by the development of algorithms that can efficiently handle large networks consisting of millions of links, says Vincent Blondel, a professor of applied mathematics at Université Catholique de Louvain, near Brussels, who organized the Cambridge workshop.
Blondel's research includes an analysis of connections between two million cell phone users in Belgium. It revealed that the French-speaking and Dutch-speaking populations of the country are barely connected by calls and texts.
"This is interesting, since there are already discussions within Belgium about splitting the country in two," says Blondel.Research in this area is typically focused on aggregate information and not individuals, but questions remain about how to protect user privacy, Blondel says. It is standard to remove the names and numbers from a CDR, but correlating locations and call timings with other databases could help identify individuals, he says. In the MIT study, for example, the team could infer the approximate home location of users by assuming it to be where a handset was most located between 10 p.m. and 7 a.m., although they also lumped people together into groups by zip code.
"I feel the scientific community should take responsibility for finding out how to trade off having useful data and protecting privacy," says Blondel.He is investigating the effect of techniques like using approximate rather than exact location information, or blurring the exact time stamps of calls from a data set.