The first thing to point out, I think, because a certain contingent of the internet punditry brigade have decided that this revelation must immediately be turned into a stick with which to further beat the iOS vs Android horse, is that Android caches precisely the same data.
$ ./parse.py cache.cell
db version: 1 total: 41
key                  accuracy  conf.  latitude   longitude  time
240:5:15:983885      1186      75     57.704031  11.910801  04/11/11 20:03:14 +0200
240:5:15:983882      883       75     57.706322  11.911692  04/13/11 01:41:29 +0200
240:5:75:4915956     678       75     57.700175  11.976824  04/13/11 11:52:16 +0200
240:5:75:4915953     678       75     57.700064  11.976629  04/13/11 11:53:09 +0200
240:7:61954:58929    1406      75     57.710205  11.921849  04/15/11 19:46:31 +0200
240:7:15:58929       -1        0      0.000000   0.000000   04/15/11 19:46:32 +0200
240:5:75:4915832     831       75     57.690024  11.998419  04/15/11 16:13:53 +0200
CI         Timestamp    Latitude      Longitude    Accuracy  Confidence
196769687  306858899.8  -33.79464816  151.1914875  1500      90
78808909   307497296.2  -33.89795403  151.2098288  1500      90
78808911   307506560.8  -33.89293759  151.2039136  1500      90
78723747   308008772    -33.79670207  151.1811041  1500      90
78783294   308104094    -33.8842238   151.2066338  1500      90
78742991   308217181.1  -33.79704956  151.1957413  1500      90
78723098   308301684.8  -33.80716884  151.1686881  1500      90
The iPhone’s cache data is slightly truncated to fit within the available width here, but if you’d like to see the full schema for the cell tower location table click here for a PNG or here for an Excel spreadsheet.
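One thing that trips people up when reading the iPhone table is the Timestamp column, which is clearly not a Unix timestamp. The values look like Apple's absolute-time convention, which counts seconds from 1 January 2001 UTC. That reading is my assumption rather than anything stated in the dump itself, and the function name below is my own. A minimal sketch of the conversion:

```python
from datetime import datetime, timedelta, timezone

# Assumption: Timestamp counts seconds since 2001-01-01 00:00:00 UTC
# (Apple's absolute-time reference date), not since the 1970 Unix epoch.
APPLE_EPOCH = datetime(2001, 1, 1, tzinfo=timezone.utc)

def apple_time_to_utc(seconds):
    """Convert an absolute-time value in seconds to a UTC datetime."""
    return APPLE_EPOCH + timedelta(seconds=seconds)

# First row of the iPhone table above.
print(apple_time_to_utc(306858899.8).isoformat())
```

Read this way, the first row falls in late September 2010, which is at least plausible for a phone that had been in use for some months.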
So it’s worth noting that the logs kept on Android and iOS devices record the same fields. In fact, they’re suspiciously identical. Perhaps it’s just because these are the obvious and only metrics you might look for in a location database, or perhaps it’s because both are most likely loosely based on the database schema Skyhook used (both Android and iOS, until relatively recently, relied on Skyhook for cached geolocation services).
Regardless, Google have openly stated that the purpose of their database is to provide aGPS functionality. Given that the contents of the databases are identical it seems entirely reasonable to assume that iOS’s consolidated.db is, indeed, a geolocation cache used for the purposes of aiding aGPS.
The biggest meaningful difference between how Google and Apple handle this functionality is that Android only retains locally the last 50 cell towers you’ve seen. This is a fairly basic trade-off – the contents of the cache on Android are potentially less interesting, but as a result the cache itself is less useful. An Android device needs to get aGPS data from the internet more frequently than an iOS device because its local cell tower location cache is less exhaustive. More internet access means more battery usage and slower GPS lookups.
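To make the trade-off concrete, here is a minimal sketch of a size-capped tower cache. This is not Android's code: the class name, the oldest-first eviction policy and the method names are all my own assumptions, and only the limit of 50 comes from the behaviour described above.

```python
from collections import OrderedDict

class TowerCache:
    """Hypothetical cache that keeps only the most recently seen towers."""

    def __init__(self, max_entries=50):
        self.max_entries = max_entries
        self._entries = OrderedDict()  # oldest entry first

    def record(self, key, lat, lon):
        # Seeing a tower again moves it to the "most recent" end.
        self._entries.pop(key, None)
        self._entries[key] = (lat, lon)
        # Assumed policy: drop the oldest entries once the cap is exceeded.
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)

    def lookup(self, key):
        # None is a cache miss: the device would have to ask the network.
        return self._entries.get(key)
```

With a small cap, a tower you passed last week has probably been evicted, so a lookup for it misses and costs a network round trip. A cache with no cap never misses on a tower it has seen before, at the price of keeping a longer history on the device.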
Google also state that they collect a device ID allowing their own database of cell tower records to theoretically be tracked back to an individual handset. The data appears to be anonymised. I don’t view this as a significant problem either way. I certainly don’t think Google are actually using this data to track individual Android owners and I think doing so would be difficult even though it appears to be possible.
Consolidated.db logs cell tower locations - not your phone’s location
The next thing to make clear is that this cache contains the location of cellular access towers and wifi access points. It does not directly track the location of the phone (and by extension, the user). The following image is pulled from my own consolidated.db file. It’s centered on my home address. Note that there is no marker over my home. The large blob on the left-hand edge is Vodafone’s Australian head office, and the cell tower I was most frequently joined to when my phone was with Vodafone. The various other blobs are nowhere near my house, nor are they spots along my daily commute. They’re simply other towers that the phone has seen:
This bears repeating: The contents of the cache are the locations of cellular access towers. Not the location of the phone. It’s certainly possible to interrogate the contents of the file and determine my rough whereabouts to within about a 2km radius. It’s largely useless for figuring out my whereabouts to any great level of accuracy. You aren’t going to be able to look at the contents of consolidated.db and determine my home or office address.
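If you want to check this against your own cache, one rough approach is to measure how far the nearest cached tower sits from an address you care about. The sketch below uses the standard haversine formula. The function names are mine, and the Earth radius is the usual mean value, so treat the results as approximate.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; fine for rough distances

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def nearest_tower_km(point, towers):
    """Distance from a (lat, lon) point to the closest cached tower."""
    return min(haversine_km(*point, *tower) for tower in towers)
```

Feed `nearest_tower_km` your home coordinates and the (latitude, longitude) pairs from the cache. If the nearest entry is a kilometre or more away, the file isn't going to give up your address.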
It is true that the data is sent to Apple. In addition to providing an offline, locally accessible location cache, this data is used to let Apple build their own central cell tower location database. It’s further used, as set out in the iOS Terms of Service, in the same way as your GPS-determined location – that is, to provide location-aware services for apps. This is non-controversial though – we already knew this, and within the iOS License Agreement it is made clear that your location will be shared for specific purposes, and the OS provides the ability to turn this functionality off. (That first link is worth reading, incidentally, for anyone who was actually surprised to discover that iOS has an offline location cache – it’s Apple’s response to Congress on precisely when, how and why they collect users’ location data. Pages six and seven in particular deal with the use of cell tower location tracking to enable assisted GPS capabilities.)
When location services are enabled on an iOS device the contents of consolidated.db are batched and sent to Apple twice a day. This ceases when location services are disabled – although, of course, the ability to use location-aware apps ceases along with it. Again, this is non-controversial. Android behaves in the same manner (although it seems updates are sent almost in real-time rather than batched). Neither OS shares your location with the OS vendor if location services are turned off.
With location services disabled, iOS ceases to maintain the local location cache. It does not delete the existing consolidated.db, which has, I think, led to some confusion in commentary on this issue. But you can readily test this yourself – examine your own consolidated.db file, disable location services, and take your phone out and about with you. When you return, make another backup and examine the contents of consolidated.db again. The database will not have been updated beyond the point at which location services were disabled.
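The before-and-after comparison can be done with a few lines of Python and the standard sqlite3 module. This is a sketch under assumptions: I'm taking the cell tower table to be named CellLocation with a Timestamp column, in line with the column headers shown earlier, and the function names are my own. Adjust the table name if your copy differs.

```python
import sqlite3

# Assumption: the cell tower table is called CellLocation and has a
# Timestamp column. Check your own file's schema before relying on this.
def latest_timestamp(conn, table="CellLocation"):
    """Return the newest Timestamp in the table, or None if it is empty."""
    if not table.isidentifier():
        raise ValueError("unexpected table name")
    query = f'SELECT MAX(Timestamp) FROM "{table}"'
    return conn.execute(query).fetchone()[0]

def cache_grew(before_conn, after_conn):
    """True if the later copy holds rows newer than the earlier copy."""
    before = latest_timestamp(before_conn)
    after = latest_timestamp(after_conn)
    return after is not None and (before is None or after > before)
```

Open each extracted copy with `sqlite3.connect(path)` and pass the two connections to `cache_grew`. If location services were off for the whole outing, it should return False.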
So with location services disabled there is, indeed, no tracking of your location occurring, either by Apple or by the device itself.
Broader implications of having an offline location cache
It’s certainly true that if I now need to determine your phone’s general location (within the ~2km radius of accuracy set out above) at a specific point in time, say for the purposes of a lawsuit, I can subpoena you for the contents of your consolidated.db file. But then I could always have subpoenaed your mobile phone company for the same information. It’s hard to see how this is a significant problem, or indicative of a privacy issue. It’s also true that if I steal or hack your computer I can probably interrogate your iOS backups to determine the same information – your phone’s general location to within a couple of kilometres. This is completely circumvented by ticking the “encrypt iPhone backups” checkbox within iTunes, so if this is a significant concern to you then you should do just that and worry no more.
If I’ve hacked your computer and am now trying to figure out your home address then consolidated.db is useless to me – the data simply isn’t accurate enough. Much better, instead, for me to check your wireless access point’s MAC address against Google’s publicly query-able wireless AP location database. For my own home router this gives a location accurate to within a few feet.
So given all of the above, what are the actual, real-world problems raised by the existence of iOS’s cell tower location cache? From what I can see, there are none. In fact, I’m so convinced that there are none that I’m sharing my own consolidated.db file for any and all of you to download.
Further, I don’t think it’s sensible to paint this as an iOS vs Android issue – both platforms are doing this, and it seems to be a good engineering solution to the problem of how to rapidly provide location data without the need to resort to internet access or to firing up the GPS radio. Trying to turn this into an issue of platform contention seems, to me, to be misguided.
That said, if there are privacy issues I’ve overlooked then let me know in the comments and I’ll try to address them.