Problem

It is quite common to want to translate an IP address to a location. To avoid complications with country demarcations and character sets, it is also common to express the location as a [lat,long] pair, which works well with maps.
There is a range of possible outcomes when attempting to compute a location. Obviously LAN IPs have no location. Some IPs aren't registered in the mainstream blocks handed to ISPs, so they don't show up in records. Alternatively there may be some “hide me” service, as there is with phone numbers. On random real usage, mostly from the US, this amounted to about 3% of requests. Some IPs seem to have been logged to ships, oil rigs or satellites, as they were in the middle of the ocean (this occurred frequently enough that serious use would require a filter).
Care must be taken when rounding numbers, not least when passing them between language systems. For my use, I rounded everything to 5 decimal places, as I'm not confident in the data below that resolution. I was attempting to demarcate geographic usage, not individual people. It should be noted that for plain end-user DSL, the IP is likely to resolve to the DSLAM, as that is where the edge of the internet's IP network is (and thus one's public IP). For example, my current personal PC resolves to Sky's equipment in WD17, North London. I got much better data from business users, who had their own high-use equipment.
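As a minimal sketch of the two precautions above (skipping addresses that cannot have a location, and rounding consistently), something like the following could sit in front of any lookup. The function names are my own for illustration and are not from any vendor library; only Python's standard `ipaddress` module is used. Formatting the pair as a fixed 5DP string is one way to avoid rounding differences when the value crosses into another language.

```python
import ipaddress


def is_locatable(ip: str) -> bool:
    """Return True only for globally routable addresses.

    LAN, loopback and link-local addresses have no public location,
    so there is no point sending them to a lookup service.
    Raises ValueError if `ip` is not a valid IP address.
    """
    return ipaddress.ip_address(ip).is_global


def round_position(lat: float, lon: float, places: int = 5):
    """Round a [lat,long] pair to a fixed number of decimal places."""
    return (round(lat, places), round(lon, places))


def format_position(lat: float, lon: float) -> str:
    """Render the pair as text with exactly 5DP, vertical axis first."""
    return f"{lat:.5f},{lon:.5f}"


if __name__ == "__main__":
    for ip in ("192.168.1.10", "8.8.8.8"):
        print(ip, "locatable" if is_locatable(ip) else "skip")
    print(format_position(51.4122049, -0.2922836))
```

This does not cover the mid-ocean results; filtering those would need a coastline dataset of some kind, which is outside this sketch.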

Please note that mobile IP is a different set of technology; reporting tools built for the DSL or dialup markets won't work. Due to the higher level of technical infrastructure, better quality data is available, but not generally to the public. The reporting tools will only tell you that your phone is in your current country (the mobile operator is doing all the rest of the traffic routing). Mobile is not covered in this article.

EDIT: Everything I state in this article about DSL is out of date if the target person is using FTTC. With FTTC the DSL supplier knows exactly which 'cabinet' you are using, so it places you to within the nearest 10m or so. FTTC is also a faster DSL service.

Discussion

There are two general architectures for this concern: first, install a big database of lookups on your server; second, poll a third-party server for each lookup needed. The first is needed for high-performance solutions, but the second is better if the IP mappings change frequently. Assuming it is possible to update the database every so often, an installed database is a manageable position.
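To make the installed-database option concrete, here is a rough sketch of how a local lookup can work: hold the IP ranges sorted in memory and binary-search them. This is not how any particular vendor packs their data. The rows are invented for illustration, using the reserved documentation IP ranges with made-up coordinates, and `build_index` and `lookup` are hypothetical names.

```python
import bisect
import ipaddress

# Hypothetical rows: (first IP, last IP, lat, long).
# The ranges are reserved documentation blocks; the coordinates are made up.
ROWS = [
    ("198.51.100.0", "198.51.100.255", 53.48000, -2.24000),
    ("192.0.2.0", "192.0.2.255", 51.41220, -0.29228),
]


def build_index(rows):
    """Convert the rows to integers and sort them by the start of each range."""
    table = sorted(
        (int(ipaddress.ip_address(first)), int(ipaddress.ip_address(last)), lat, lon)
        for first, last, lat, lon in rows
    )
    starts = [row[0] for row in table]
    return starts, table


def lookup(index, ip):
    """Return (lat, long) for the range containing `ip`, or None if unknown."""
    starts, table = index
    key = int(ipaddress.ip_address(ip))
    # Find the last range that starts at or before the address.
    pos = bisect.bisect_right(starts, key) - 1
    if pos < 0:
        return None
    _first, last, lat, lon = table[pos]
    return (lat, lon) if key <= last else None


INDEX = build_index(ROWS)

if __name__ == "__main__":
    print(lookup(INDEX, "192.0.2.17"))
    print(lookup(INDEX, "203.0.113.5"))
```

The trade-off is the one described above: there is no network round trip per lookup, but the answers are only as fresh as the last time the rows were reloaded.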

On the work I do, high performance is generally more important (in my scenario, during peaks, it would need to process >5000 requests per minute). This required that each lookup take <200ms, or it's not possible to process the raw data. These figures are conservative, but this was one service in an entire platform, and couldn't monopolise resources. It should be noted that many of the location databases are mostly focussed on a single country, due to the boundaries I mentioned previously. For a global service, I would probably advise using several mapping services.
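As a back-of-envelope check on those two figures, the arithmetic below estimates how many lookups would be in flight at once at peak. It assumes lookups can run in parallel, which is my assumption rather than a description of the original platform; `workers_needed` is a hypothetical helper.

```python
import math


def workers_needed(requests_per_minute: float, seconds_per_lookup: float) -> int:
    """Average lookups in flight = arrival rate x time per lookup (Little's law)."""
    per_second = requests_per_minute / 60
    return math.ceil(per_second * seconds_per_lookup)


if __name__ == "__main__":
    # 5000 requests/minute is roughly 83 per second; at 0.2s each
    # that is about 17 lookups running concurrently.
    print(workers_needed(5000, 0.2))
```

A slower lookup pushes that number up proportionally, which is why a remote call per request gets expensive for one service sharing a platform.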

Nominated vendor

Due to operational cost and supporting a US customer base, a previous project MT used MaxMind. Since I wrote that code, MaxMind have rebuilt everything for PHP 5.4 and are using GitHub. I would advise looking at their newer services. As some of my usage data was from known sources (we knew who was doing it, and where their office was), I tested the generated location data against what we knew. For business users, it was good quality data. See the references to DSLAM throughout this document; it's not the lookup software that's the problem. In Aus ~ where a lot of this code was written ~ the nearest city is a good guess. Telephony is more complex here, and there are many more people.
If one needs to verify the mapping, put the generated [lat,long] into Google Maps. Please note that mapping systems take the vertical axis first (latitude, then longitude), due to historical technology. Graphics systems take the horizontal axis first (x, then y), due to different historical technology.
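A small sketch of that check: the first helper builds a Google Maps search link from a [lat,long] pair, using the `api=1` and `query` parameters of Google's Maps URLs scheme, and the second swaps the pair into the horizontal-first order a graphics system expects. The helper names are mine, for illustration only.

```python
from urllib.parse import urlencode


def maps_url(lat: float, lon: float) -> str:
    """Build a Google Maps search link; the query is latitude first."""
    query = urlencode({"api": 1, "query": f"{lat:.5f},{lon:.5f}"})
    return "https://www.google.com/maps/search/?" + query


def to_xy(lat: float, lon: float):
    """Reorder [lat,long] (vertical first) into (x, y) (horizontal first)."""
    return (lon, lat)


if __name__ == "__main__":
    print(maps_url(51.41220, -0.29228))
    print(to_xy(51.41220, -0.29228))
```

Getting the order wrong is easy to spot for UK data: swapping the pair puts the marker far from the UK.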

Other vendors reviewed

If I build another system on this technology, I will update this article with more testing and current data. Please read the caveats expressed in the first paragraph. Please read the inline results with care; commercial practice makes this type of analysis imprecise.

According to DSL tools, my exchange is Kingston. Any up-to-date lookup tool should map my IP to [51.41220,-0.29228]. This isn't the location of my PC, but it is the best this style of lookup can achieve. This statement is complicated by the fact that my ISP keeps resetting the DSL, and so changing my IP.

  • Geo Plugin is a frontend for MaxMind. I'm not sure it is useful for services of reasonable scale, due to the number of network connections involved. I infer it is implemented using the older static database, as the values it returns don't match the current MaxMind frontend. It provides a JS interface that is friendly to non-technical people, which would be a good reason to use it.
  • Host IP is a community project, with links to commercial vendors. At present they are unable to resolve my personal desktop IP. They quote fast response times, and do provide a sensible, lean API for lookups. As this is non-commercial, they have a static downloadable database, and suggest refreshing it daily. Before I started to depend on them, I would want to know what their update policies were (i.e. if the assignment of an IP changes, how long would the customer get out-of-date information?).
  • ip2location is a rented service, with a freemium offering. They map my DSL to O2 in Bath. Due to commercial practice this could be true, but it can't be true at the same time as MaxMind's response. The data is a static file. The packing algorithm is less efficient than the one MaxMind uses, as this data is about ten times larger.
  • geobytes charge a small amount per lookup. The site's branding looks like it dates from 1998. It says my DSL is in Manchester.
  • digital point returns the same location as Geo Plugin. I can't find an API on this site, so it doesn't seem to meet requirements.
  • live2support has annoying adverts in the way, and supplies low granularity data. I can't find API access on this either.
  • melissa correctly gets my ISP, and says my DSLAM is in Brighton. It does have an API, but won't admit which language it is implemented in. From the age of the branding, I doubt it would be useful.
  • webyield thinks my DSLAM is in Hemel Hempstead. It doesn't seem to have API access.
  • IP to Lat Lng thinks my DSLAM is Halton again. No API access; probably the MaxMind legacy database.
  • Telize is again saying Halton, but says my timezone is that of Barbados. It looks very REST, but is still legacy MaxMind.
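To put a number on how far off a vendor's answer is, the results above can be compared with the expected exchange location by great-circle distance. This is a minimal sketch using the standard haversine formula. The vendors in the list report a place name, so the Manchester coordinate here is an approximate city-centre value I have assumed for illustration, not something a vendor returned.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def distance_km(a, b):
    """Great-circle (haversine) distance between two (lat, long) pairs in km."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


EXPECTED = (51.41220, -0.29228)   # the exchange location quoted in this article
MANCHESTER = (53.48, -2.24)       # assumed approximate city centre

if __name__ == "__main__":
    print(f"{distance_km(EXPECTED, MANCHESTER):.0f} km from the expected exchange")
```

A threshold on this distance (for example, anything over a few tens of km counts as a miss) would turn the informal comparison in the list into a repeatable test.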

Lastly, if you have an HTML5 browser, you can look this up in the browser itself (Firefox/Linux doesn't like this feature).

It would be a clever idea to use JS geolocation to reduce the input required on address forms. However, it doesn't look like the current generation of technology is reliable enough for this to work.


Geo lookup
