Comparing PathsLast modified 2023/12/03 13:51
Strava is a service that allows you to record, analyze and share your sporting activities. Strava RS is a TUI (terminal user interface) client for Strava that I’m working on. It provides an offline, keyboard driven, interface to Strava using data collected fro the Strava API.
One of the feature the Strava app provides is Matched Routes. If an activity you performed closely matches the route of any previous activities, you can list and compare your previous performances.
In Strava this is a premium feature. I wanted to implement this feature in Strava RS, which essentially boils down to: How similar is route A to route B?
I had no idea how to answer this question so, after googling and not finding anything was applicable (or if it was, I didn’t understand), I decided to figure it out by myself.
Disclaimer: there may well be better ways to solve this problem, my solution will be blindingly obvious to some, it’s possibly also wrong. But, well, I got a new graphics tablet and wanted to try it out.
The Problem ¶
-2.44698, 50.62649 -2.44728, 50.62669 -2.44753, 50.62678 -2.44774, 50.6269 -2.44798, 50.62694 -2.44844, 50.62707 ...
The full series would be plotted like this in Strava RS (actually the ParkRun in Weymouth)
⠈⢣ ⢸⡀ ⠑⡆ ⠘⠦⣀⣀⡀ ⠈⠁⠱⡀ ⢀⣀ ⢣ ⡠⠊⠉⠉⡇ ⠉ ⡇ ⢣ ⣎⠖⢄ ⡇ ⠈⢲⡀ ⠧⡀ ⠑⢦ ⠈⠑⢤⡀ ⡳ ⠉⠢⡟
Now, I want to compare my times for all attempts I made on this route. It’s inevitable that the exact set of coordinates for each attempt will deviate slightly.
Three similar and one dissimilar routes
Above we can clearly see that routes 1, 2 and 4 are the same shapes, and that 3 is clearly different. In this case 3 is actually the Dorchester ParkRun:
Weymouth and Dorchester are 5 miles apart
Route 3 doesn’t intersect with our run in Weymouth. Let’s overlay the 3 Weymouth Parkruns:
Three similar routes overlaid on top of one another
As a human we recognise them as the same “route”:
- They share the same geographical area.
- The paths are very close together.
But how to get a computer to recognise this? We can calculate the distance between the points.
Similarity Score ¶
Comparing two similar routes
Adding all the distances gives a score, in this case
0.76 and we can say that
the black route has a similarity score of
0.76 against the green route.
Dorchester and Weymouth are not similar
Dorchester and Weymouth are clearly not the same route giving a similarity
481, so we could say that if the similarity score is less than
1.0 then we have a match! This seems like a good approach to start with.
However, as a human I was able to draw some arbitrary lines to geographically similar points, but unfortunately making the computer do this would be a whole different problem to solve. How do we determine which points to compare?
Two similar routes with different number of coordinates
Route A has 20 coordinates and route B has 15 coordinates (vastly simplified).
We can normalize the routes to have the same number of coordinates, we can then score each coordinate against its counterpart:
Normalized routes with same number of coordinates
So we can now summarise the proposal
- Normalize the two routes so that they have the same number of coordinates.
- Sum the difference between the opposing coordinates to produce a similarity score.
- If the similarity score between two routes is below a given threshold it’s a match!
Now we’re getting somewhere, the number of segments will contribute to the accuracy of the matching. The more segments the greater the matching precision.
As a human I would obviously calculate the length of the path, divide it by the desired number of segments to get the segment length and then at each length interval add a new coordinate… or… wait there are more problems than answers in that solution:
- How to calculate the length?
- How to determine what the new coordinates should be?
Route Length ¶
I have been drawing curvy “analogue” routes, but in reality the route is a series of straight lines:
Routes are a series of straight lines
All we need to do is determine the length of each straight line. But how? Pythagoras to the rescue!
The Pythagorean Theorem
allows us to determine the length of hypotenuse in a right-angled triangle if
we have the lengths of the sides with the formular:
c = √(a² + b²) (
c is the
square root of the sum of
a multiplied by
b multiplied by
Our route can be represented as a bunch of triangles:
And for each triangle we can determine the length
c of the sides
and the sum of all those
c values will be the length of our route.
Given the coordinates (a list of
1.5, 2.5 2.5, 3.6 2.9, 2.9 ...
We can then determine the distance between each coordinate by considering two coordinates as a triangle and determining the length of the sides:
ahas a length of
xᵢ - xᵢ₋₁(e.g.
2.5 - 1.5 = 0.75)
bhas a length of
yᵢ - yᵢ₋₁(e.g.
3.6 - 2.5 = 1.1)
- the length of the line between the first and second coordinates is then
length = √(0.75² + 1.1²) = 1.3313
So now we can calculate the length for each segment and sum them all to provide a length.
Plotting the New Coordinates ¶
Now that we have the length of the original path and can determine the segment length we can “walk” the path and record the coordinates where each segment ends. Surely the only way to do this is with a red carpet:
- Ensure you have a very long (infinitely so if possible) red carpet.
- Cut the carpet to the length of the route.
- Cut the carpet into 100 strips of equal length (increase number of strips for accuracy).
- Walk up to the first coordinate and roll out a strip of carpet matching the direction of the segment.
- Does the carpet extend further than the next coordinate?
- If so cut it at the intersection and place the remaining strip in the direction of the next coordinate.
- Otherwise calculate
[*]the new coordinates based on the direction of the old segment and the length of the carpet from the last intersection and write them down.
The only way to do this (a flowchart)
Having finished you should have a list of the normalized coordinates, but
unfortunately there was a
[*] as we didn’t explain how to calculate the new
[*] Calculating the New Co-ordinates ¶
To determine the new coordinates we can determine the ratio (or percentage) of the segment that is occupied by the slice of red carpet.
- Calculate the remaining length
cfrom the last known carpet-coordinate to the end of the original segment using pythagoras as before:
Calculate the remaining distance
- Divide the carpet slice length
slby the remaining length
cto obtain the ratio or “percentage” (
ratio = sl / c)
- Multiple the last known coordinates by the resulting ratio to obtain the
xᵢ₊₁ = xᵢ * ratioand
yᵢ₊₁ = yᵢ * ratio
Now we have two routes with the same number of coordinates and we can directly compare them:
|Route A||Route B|
|1.0, 0.5||1.1, 0.6|
|2.0, 1.3||1.9, 1.3|
|2.2, 0.3||2.1, 0.2|
To do this we need to determine the distance/length between the adjacent carpet coordinates (using Pythagoras again) and sum them up:
0.38 = sum( √((1.0-1.1)² + (0.5-0.6)²) = 0.14 √((2.0-1.9)² + (1.3-1.3)²) = 0.10 √((2.2-2.1)² + (2.1-2.0)²) = 0.14 )
Given the three points above we get a similarity score of
That’s basically how the route matching works in Strava RS - for any given route you can filter by “similarity”. The route will currently be compared against “all” routes which isn’t very efficient, but this can be vastly improved by pre-filtering by route length and perhaps then geographic boundaries.