I had fun making it but please note that the current implementation is just a demo and far from a proper production tool.
If you really want to use it then for best possible results you need at least 500 probes per phase.
It could be optimized fairly easily, but not without going over the anonymous user limit, which I tried to avoid.
Start by doing the multi-continent probe, say 3x each. Drop the longest-time probes, add probes near the shortest-time ones, and probe again. Repeat this pattern of probe, assess, drop, and add closer to the target.
You accumulate all data in your orchestrator, so in theory you don't need to deliberately issue multiple probes each round (except for the first) to get statistical power. I would expect this to "chase" the real location continuously instead of in 5 discrete phases.
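A minimal sketch of that probe/assess/narrow loop, assuming hypothetical helpers run_probe() (one RTT measurement in ms), probes_near() (candidate probes near a coordinate) and probe_location() (a probe's known coordinates) -- none of these are from the original tool:

    # Rough sketch of the probe -> assess -> drop -> add-closer loop described above.
    # run_probe(), probes_near() and probe_location() are hypothetical helpers.
    def locate(target, seed_probes, rounds=5, keep=5):
        results = {}                    # probe_id -> lowest RTT seen, kept across all rounds
        active = list(seed_probes)      # round 1: a multi-continent spread, e.g. 3 per continent
        for _ in range(rounds):
            for probe in active:
                rtt = run_probe(probe, target)
                results[probe] = min(rtt, results.get(probe, float("inf")))
            best = sorted(results, key=results.get)[:keep]    # assess: keep the fastest so far
            # drop the slow ones; add fresh probes near the current best for the next round
            active = best + probes_near(probe_location(best[0]), k=keep)
        winner = min(results, key=results.get)
        return probe_location(winner), results[winner]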
I just watched the Veritasium video on potentials and vector fields - the latency is a scalar potential field of sorts, and you could use it to derive a latency gradient.
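Purely as an illustration of that idea (my sketch, not from the comment): with a handful of probes near the current estimate, you could fit a local plane to the latency samples and step in the direction of steepest latency decrease.

    import numpy as np

    # Toy gradient estimate: treat latency as a scalar field sampled at probe positions.
    # xs, ys: probe coordinates in km (some local planar projection); rtts: RTTs in ms.
    def latency_gradient(xs, ys, rtts):
        A = np.column_stack([xs, ys, np.ones(len(xs))])
        coeffs, *_ = np.linalg.lstsq(A, np.asarray(rtts, dtype=float), rcond=None)
        a, b, _ = coeffs        # fitted plane: rtt ~ a*x + b*y + c
        return -a, -b           # steepest-descent direction, i.e. roughly toward the target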
Internet signals generally travel by cable, and the selected route may or may not be the shortest distance.
It's quite possible for traffic between neighboring countries to transit through another continent, sometimes two. And asymmetric routing is also common.
Since this is using traceroute anyway, if you characterize the source nodes, you could probably use a lot fewer nodes and get similar results with something like:
a) probe from a few nodes on different continents (aiming to catch anycast nodes)
b) assuming the end of the trace is similar from all probes, choose probe nodes that are on similar networks, and some other nodes that are geolocated near those nodes.
c) declare the target is closest to the node with the lowest measured latency (after offsetting by that node's characterized first-hop latency); a rough sketch of this follows below.
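A minimal sketch of steps (a)-(c), assuming each probe has already been characterized with a baseline first-hop latency; the probe record layout and measure_rtt() helper are illustrative only:

    # Step (c): pick the probe whose latency to the target, minus that probe's
    # characterized first-hop/access latency, is lowest. measure_rtt() is hypothetical.
    def closest_probe(probes, target):
        best = None
        for p in probes:    # p: {"id": ..., "location": (lat, lon), "first_hop_ms": ...}
            adjusted = measure_rtt(p["id"], target) - p["first_hop_ms"]
            if best is None or adjusted < best[0]:
                best = (adjusted, p)
        return best[1]["location"], best[0]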
You'll usually get the lowest ping times if you can ping from a nearby customer of the same ISP as the target. You can narrow down to that faster if you know your nodes well.
Plenty of developers really like it too though, because that's where Claude learned to use it.
You could do even cooler tricks, like https://github.com/blechschmidt/fakeroute
Pointless? Almost certainly.
I wonder if this thing will start a cat and mouse game with VPNs.
Oversubscription is expected to a certain degree (this is fundamentally the same concept as "statistical multiplexing"). But even oversubscription in itself is not guaranteed to result in bufferbloat -- appropriate traffic shaping (especially to "encourage" congestion control algorithms to back off sooner) can mitigate a lot of those issues. And, it can be hard to differentiate between bufferbloat at the last mile vs within the ISP's backbone.
Aha, that's what you would think, but what if I fake the source IP used for the geolocation ping instead!
Latency variability is a huge issue. We collect both traceroute and ping data, and we observe that there are entire countries that peer at IXPs thousands of miles away on a different continent.
We bought a server from the oldest telecom company in the country and recently activated it. Currently there is 20 ms of latency when traffic is directed towards the second-oldest telecom: the packets have to travel outside the country before coming back in. This is a common phenomenon. So we usually have multiple servers in major cities, since various ASNs have different peering policies.
Because we can map those behaviors and combine algorithms with other data sources, we can make measurement-based geolocation perform well.
We are hoping to support IXPs, internet governance agencies, and major telecoms in identifying and resolving these issues.
I've done some mapping while comparing TURN servers my org hosted on cloud VMs vs a commercial offering, and it's pretty easy to find very different routing from point A to point B. Sometimes it's pretty clear that not every transit network has access to every submarine cable, so traffic from, say, Brazil to South Africa might go from Brazil directly to Africa, or it might go to Florida, then Europe, then Africa. It'd be nice to take a more direct route, but maybe the Brazil -> Africa hop doesn't transit all the way, so BGP prefers the scenic route as it has a shorter AS path.
I didn't have any leverage to motivate routing changes though, so other than saying hmm, that's interesting, there wasn't much to do about it.
To help the system, we are reaching out to IXPs, major telecoms, and peering agencies to advise them on how to peer and make critical internet routing decisions. We want to show them how to engage in data-focused peering, how their IXP is perceived from a broader internet data perspective, and how their packets travel from the IXP across the internet. We hope this collaboration will bring much-needed efficiency to internet routing.
I was on a better connection (gigabit FTTC) and in a better peered location (central London).
>amsterdam
Don't know where precisely in NL they were or what connection type they had. I'd certainly expect a like-for-like Amsterdam wired connection to win, so this was probably something more pedestrian and rural.
Yup. For example, from my city to one of my dedicated servers whose location is fully well-known (in France), I know there's 250 kilometers as the crow flies. Yet if I ping that server and draw a circle around my place (considering the ping travels as fast as light in a vacuum, which we know ain't happening but, hey, it's something) I get a radius of 2000 kilometers. About 8x the distance. I can prove that my IP ain't in the US, but that's still not very precise.
And indeed many servers in the UK, which is twice the distance of my server, consistently give me a lower ping.
TFA's approach, especially using traceroute instead of ping, is nice.
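For reference, the radius in that example falls out of a simple bound: the target cannot be farther away than half the RTT times the propagation speed. A back-of-the-envelope sketch; the ~13 ms RTT below is inferred from the 2000 km figure above, not stated in the comment:

    # Upper bound on distance implied by a measured RTT.
    C_VACUUM_KM_PER_MS = 300.0   # speed of light in vacuum, ~300,000 km/s
    C_FIBER_KM_PER_MS = 200.0    # rule-of-thumb speed in optical fiber, roughly 2/3 c

    def max_distance_km(rtt_ms, speed_km_per_ms=C_FIBER_KM_PER_MS):
        return (rtt_ms / 2.0) * speed_km_per_ms

    # An RTT of ~13 ms gives a ~2000 km radius at vacuum speed, or ~1300 km with the
    # fiber rule of thumb -- still far looser than the 250 km actual distance above.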
though with some key differences that address the limitations mentioned in the thread. The main issues with pure ping-based geolocation are:

- IPs are already geolocated in databases (as you note)
- Routing asymmetries break the distance model
- Anycast/CDNs make single IPs appear in multiple locations
- ICMP can be blocked or deprioritized

My approach used HTTP(S) latency measurements (not ping) with an ML model (SVR) trained on ~39k datapoints to handle internet routing non-linearity, then performed trilateration via optimization. Accuracy was ~600km for targets behind CloudFront - not precise, but enough to narrow attribution from "anywhere" to "probably Europe" for C2 servers.

The real value isn't precision but rather:

- Detecting sandboxes via physically impossible latency patterns
- Enabling geo-fenced malware
- Providing any location signal when traditional IP geolocation fails

Talk: https://youtu.be/_iAffzWxexA
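A rough sketch of the trilateration-by-optimization step (my own illustration, not the talk's code): given probe coordinates and per-probe distance estimates from the latency model, find the point that minimizes the squared disagreement with great-circle distances.

    import numpy as np
    from scipy.optimize import minimize

    EARTH_RADIUS_KM = 6371.0

    def haversine_km(lat1, lon1, lat2, lon2):
        # Great-circle distance between two (lat, lon) points given in degrees.
        lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
        a = np.sin((lat2 - lat1) / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2
        return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

    def trilaterate(probes, est_dists_km):
        # probes: list of (lat, lon); est_dists_km: modelled distance from each probe.
        def loss(x):
            return sum((haversine_km(x[0], x[1], plat, plon) - d) ** 2
                       for (plat, plon), d in zip(probes, est_dists_km))
        x0 = np.mean(probes, axis=0)                        # start from the probe centroid
        return minimize(loss, x0, method="Nelder-Mead").x   # (lat, lon) estimate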
a simple rule of thumb is that a signal using optical fiber for communication will travel at around 200,000 kilometers per second
-- https://en.wikipedia.org/wiki/Optical_fiber

This brute force approach works much better than I expected as long as you have enough probes and a bit of luck.
But of course there are much better and smarter approaches to this, no doubt!
You mention the quality several times in the article, but it's not clear how this is verified. Do you have a set of known-location IP addresses around the world (apart from your home)? Or are we just assuming that latency is a good indicator?
I tested against them, as well as other infrastructure I control that is not part of the network, and compared with the ipinfo results as well.
However, ipinfo still appears to rely on active probing to triangulate geolocation data, which suggests they believe these routing asymmetries can be modeled or averaged out in practice.
The telco DSL and fiber in my metro area all run through a single location where the PPPoE (hiss) concentrator is, and the first-hop latency from DSL interleaving swamps the latency from distance. You can tell someone is in the metro area, but not the county or city.
Cable company customers are a little more locatable; you can probably get the county.
So there's a funky overlap wherein on one ISP you appear closer to city A, and on ISP 2 closer to city B, but it's the same physical address.
Continental classification, I'd think, would be good, as continents appear as coalesced endpoints separated by vast oceans.
---
Our research scientist, Calvin, will be giving a talk at NANOG96 on Monday that delves into active measurement-based IP geolocation.
1. Trilateration mostly doesn't work with internet routing, unlike GPS. Other commenters have covered this in more detail. So the approach described here - to take the closest single measurement - is often the best you can do without prior data. This means you need a crazy high distribution of nodes across cities to get useful data at scale. We run our own servers and also sponsor Globalping and use RIPE Atlas for some measurements (I work for a geo data provider), yet even with thousands of available probes, we can only accurately infer latency-based location for IPs very close to those probes.
2. As such, latency/traceroute measurements are most useful for verifying existing location data. That means for the vast majority of IP space, we rely on having something to compare against (a small example of such a check follows after this list).
3. Traceroute hops are good; the caveat being that you're geolocating a router. RIPE IPmap already locates most public routers with good precision.
4. Overall these techniques work quite well for infrastructure and server IP addresses but less so for eyeball networks.
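To illustrate point 2 (verification rather than discovery), a minimal sanity check of a claimed location against one measured RTT -- the data layout and the fiber rule of thumb here are my assumptions, not the provider's method:

    import math

    C_FIBER_KM_PER_MS = 200.0   # rule-of-thumb signal speed in fiber, roughly 2/3 c

    def haversine_km(lat1, lon1, lat2, lon2):
        lat1, lon1, lat2, lon2 = map(math.radians, (lat1, lon1, lat2, lon2))
        a = (math.sin((lat2 - lat1) / 2) ** 2
             + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
        return 2 * 6371.0 * math.asin(math.sqrt(a))

    def claim_is_plausible(probe_latlon, claimed_latlon, rtt_ms):
        # The RTT puts a hard ceiling on how far away the target can physically be;
        # if the claimed location lies outside that radius, the database entry is suspect.
        max_reach_km = (rtt_ms / 2.0) * C_FIBER_KM_PER_MS
        return haversine_km(*probe_latlon, *claimed_latlon) <= max_reach_km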
https://ping.sx is also a nice comparison tool
20-minute talk at DEFCON
    #include <netinet/in.h>   /* IPPROTO_TCP */
    #include <netinet/tcp.h>  /* struct tcp_info, TCP_INFO (Linux) */
    #include <sys/socket.h>   /* getsockopt, socklen_t */
    struct tcp_info info;
    socklen_t len = sizeof(info);
    getsockopt(sock, IPPROTO_TCP, TCP_INFO, &info, &len);  /* info.tcpi_rtt: smoothed RTT, in microseconds */
tcp_info varies by OS and version, but I think tcpi_rtt is well supported.

How's this different from RIPE ATLAS?
Globalping offers real-time result streaming and a simpler user experience with a focus on integrations: https://globalping.io/integrations
For example you can use the CLI as if you were running a traceroute locally, without even having to register.
And if you need more credits you can simply donate via GitHub Sponsors starting from $1
They are similar, with an overlapping audience, yet they have different goals.
Sometimes the residential ISPs that host the probes may have bad routing due to many factors; how does the algorithm take that into account?
IEEE 802.11mc > Wi-Fi Round Trip Time (RTT) https://en.wikipedia.org/wiki/IEEE_802.11mc#Wi-Fi_Round_Trip...
/? fine time measurement FTM: https://www.google.com/search?q=fine+time+measurement+FTM
Seems the tool relies on ICMP results from various probes. So wouldn't this project become useless if the target device disables ICMP?
I wonder if you can "fake" results by having your gateway/device respond with forged ICMP replies.
Email me if you would like to get some additional credits to test it out, dakulovgr gmail.