Cost-Optimal Paths Dataset

Distances matter to social scientists who study the diffusion of wars or institutions, power projection, and international trade. The standard approach to calculating distances on Earth comes from marine navigation and is several centuries old. These “Great Circle Distances” assume the planet to be a perfect sphere with a fixed radius. This approach works perfectly well in aviation and shipping, i.e. fields where craft move along nearly direct courses to their destinations, following the curvature of the earth. But it is not appropriate for land-based travel. If we only measure the shortest distance between two points on land, we systematically underestimate the effective traveling distance because we ignore infrastructure and terrain. Check out the pictures below.
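For reference, a great circle distance of the kind described above can be computed with the haversine formula. The snippet is only a minimal sketch: the function name `great_circle_km` and the mean radius of 6371 km are my assumptions for illustration, not part of the dataset or the program described here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (the "fixed radius" of the sphere)


def great_circle_km(lat1, lon1, lat2, lon2, radius_km=EARTH_RADIUS_KM):
    """Haversine distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodes.
    return 2 * radius_km * math.asin(min(1.0, math.sqrt(a)))


# Usage: two hypothetical points on the equator, 90 degrees of longitude apart.
quarter_circle = great_circle_km(0.0, 0.0, 0.0, 90.0)
```

Note that the result depends only on the two coordinates and the radius. Nothing about roads, mountains, or rivers enters the calculation, which is exactly the limitation for land-based travel.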

The left picture shows a direct path between two points that are separated by mountains. The right picture shows a realistic traveling path between the same two points that avoids the mountains in favor of easy terrain. This latter, more realistic route is not the shortest path between the points, but it might very well be the fastest. If we assigned a time cost to traveling across each spot on the map, this latter route would be cost-optimal.

In many spatially disaggregated studies, we are therefore stuck with oversimplified distance estimates that systematically underestimate effective traveling distances. The good news is that finding cost-optimal paths in graph structures is a very well-researched problem in computer science, and we only need to stir together some of the established algorithms and a GIS dataset on infrastructure and terrain accessibility.
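To give a feeling for how this works, here is a minimal sketch of one such established algorithm, Dijkstra's algorithm, run on a toy raster. It is not the actual program behind the dataset: the function name `cost_optimal_path`, the 4-connected neighborhood, and the made-up cost grid (1 for easy terrain, 9 for mountains) are all assumptions for illustration.

```python
import heapq


def cost_optimal_path(cost, start, goal):
    """Dijkstra's algorithm on a 4-connected grid.

    cost[r][c] is the (assumed) time cost of entering cell (r, c).
    Returns the list of cells on the cheapest path and its total cost.
    """
    rows, cols = len(cost), len(cost[0])
    best = {start: 0.0}     # cheapest known cost to reach each cell
    prev = {}               # predecessor of each cell on its cheapest path
    queue = [(0.0, start)]  # priority queue ordered by accumulated cost
    while queue:
        dist, cell = heapq.heappop(queue)
        if cell == goal:
            break
        if dist > best.get(cell, float("inf")):
            continue  # stale queue entry, a cheaper route was found already
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                new_dist = dist + cost[nr][nc]
                if new_dist < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = new_dist
                    prev[(nr, nc)] = cell
                    heapq.heappush(queue, (new_dist, (nr, nc)))
    if goal not in best:
        return None, float("inf")
    # Walk the predecessor links back from the goal to recover the path.
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    path.reverse()
    return path, best[goal]


# Toy terrain: 1 = easy terrain, 9 = mountains (made-up values).
grid = [
    [1, 1, 1, 1, 1],
    [1, 9, 9, 9, 1],
    [1, 9, 9, 9, 1],
]
path, total = cost_optimal_path(grid, (2, 0), (2, 4))
```

On this toy grid, the straight route along the bottom row crosses three mountain cells, while the returned path takes the longer detour along the top row at a much lower total cost. A real application would replace the toy grid with travel-time values from a GIS raster.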

To make a long story short, I have done exactly that and used Dijkstra’s algorithm together with an awesome dataset on travel times to major cities to create a small command-line program for calculating cost-optimal distances. So far, I have used this program in combination with Nils Weidmann’s CShapes dataset and calculated cost-optimal distances within countries to capital cities for the post-1945 period. In the figure below, you can see cost-optimal and direct distances within Israel. I only use Israel here because its borders changed over time and you can see these changes in the resulting dataset.

Feel free to download this dataset and get in touch if you have any questions.

Get version 0.1 of the dataset as a point shapefile [here]. Please cite this working paper if you want to use the dataset for your own research:

Sebastian Schutte: Peripheral Groups, Rough Terrain, and Secessionist Civil War. Paper prepared to be presented at the annual meeting of the European Political Science Association, Berlin, Germany, June 21-23, 2012