A compactness index for urban patterns

Apr 24, 2019 26 min read urban-change, new-urban-analytics

Introduction
Shape Compactness
Multiple Polygons, Holes and Other Realities
Moment of Inertia
Urban Compactness Scores
Limitations
Conclusions

Introduction

Urban sprawl is a multi-dimensional phenomenon. Chief among its characteristics are whether an urban area is fragmented (pockmarked development) and whether it is contiguous but of uniformly low density. In this post, I want to demonstrate the use of a relatively novel (but in some ways age-old) indicator of urban form at the landscape level: the area moment of inertia.

Quite a bit of the literature on compactness comes from Political Science, where folks are most concerned about gerrymandering and how to create compact electoral districts. Chief among these works are Polsby and Popper (1991), Schwartzberg (1965) and Reock (1961).

Shape Compactness

There are many ways to describe the compactness of a shape, but we need a standard against which to measure it. For that, we can assume that in 2D space, a circle is the most compact shape for a given area. Note two important points here: 1) we stay in the same number of dimensions (otherwise a point is the most compact), and 2) we need to fix an attribute, area in this case (again, if we didn’t, a point would be the most compact).

One way to define the compactness of a shape is to divide its perimeter \(P\) by its area \(A\) (the perimeter-area ratio) and see how closely it matches the same indicator for a circle. However, since the units of \(P\) and \(A\) differ, it is often useful to use the dimensionless quantity \(P^2/A\). For a circle this quantity is \(4 \pi\). So an index of compactness that is normalised to a circle might be

\[ \frac{4 \pi A}{P^2} \]

This is known as the Polsby-Popper index, though the origins of the idea are much older; see Haggett et al. (1977) for some older references. The index takes values in \((0,1]\), with the least compact shapes taking on values closer to 0.

source: https://www.azavea.com/blog/2016/07/11/measuring-district-compactness-postgis/

Another way to recover the above indicator is to take a ratio of perimeters, with the numerator being the perimeter of the most compact shape possible for the given area \(A\) (a circle) and the denominator being the perimeter of the shape in question (the focal shape). The perimeter of a circle of area \(A\) is \(2 \sqrt{\pi A}\), so the ratio is \(2 \sqrt{\pi A}/P\). It is straightforward to see that this is the square root of the index above (a non-linear but monotonic transformation). This is also how Schwartzberg’s index is related to the Polsby-Popper index.
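The two perimeter-based indices above can be sketched in a few lines of Python. This is only an illustration: the function names and the three example shapes are my own assumptions, not part of the original analysis.

```python
import math

def polsby_popper(area, perimeter):
    """4*pi*A / P^2: equals 1 for a circle, approaches 0 for contorted shapes."""
    return 4 * math.pi * area / perimeter ** 2

def schwartzberg(area, perimeter):
    """Perimeter of the equal-area circle divided by the shape's perimeter."""
    return 2 * math.sqrt(math.pi * area) / perimeter

# Illustrative shapes (hypothetical, not from the post): name -> (area, perimeter)
shapes = {
    "unit circle": (math.pi, 2 * math.pi),
    "unit square": (1.0, 4.0),
    "1 x 10 strip": (10.0, 22.0),
}

for name, (a, p) in shapes.items():
    pp = polsby_popper(a, p)
    sw = schwartzberg(a, p)
    # In this form the Schwartzberg ratio is the square root of Polsby-Popper
    print(f"{name:13s} PP={pp:.3f}  Schwartzberg={sw:.3f}  sqrt(PP)={math.sqrt(pp):.3f}")
```

The last two columns agree for every shape, which is the monotonic relationship described above: the two indices rank shapes identically.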

One disadvantage of the above measures is that the perimeter is very sensitive to errors in the boundary. In particular, if the boundary becomes more rugged (technical term), the perimeter increases dramatically without a concomitant increase in area.

The famous Koch curve. Perimeter increases dramatically with each iteration, but the area changes only marginally.
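To put rough numbers on this, here is a small sketch for the Koch snowflake (the closed version of the curve, starting from an equilateral triangle). It uses the standard closed-form expressions: the perimeter grows by a factor of 4/3 per iteration, while the area converges to 8/5 of the starting triangle. The function name and the unit side length are assumptions made for illustration.

```python
import math

def koch_snowflake(n, side=1.0):
    """Return (perimeter, area) of the Koch snowflake after n iterations."""
    perimeter = 3 * side * (4 / 3) ** n
    area0 = math.sqrt(3) / 4 * side ** 2           # starting triangle
    area = area0 * (8 / 5 - 3 / 5 * (4 / 9) ** n)  # converges to 8/5 * area0
    return perimeter, area

for n in range(6):
    p, a = koch_snowflake(n)
    pp = 4 * math.pi * a / p ** 2  # Polsby-Popper index
    print(f"iteration {n}: perimeter={p:.3f}  area={a:.4f}  Polsby-Popper={pp:.3f}")
```

The area settles quickly while the perimeter keeps growing, so the Polsby-Popper index is driven towards 0 even though the overall footprint barely changes.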

The Reock index is slightly different. It relies on the notion of a circumscribing polygon of the focal shape. We can use a convex hull or a minimum bounding circle as this polygon. The idea is to take the ratio of the area of the focal shape to the area of the circumscribing polygon, because the perimeter is clearly susceptible to fractalisation issues.

Source: https://fisherzachary.github.io/public/r-output.html

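The area of the circumscribing circle is driven by the largest distance between any two points of the shape,

\[\max_{p_i, p_j \in P} \|p_i - p_j\|\] where \(p_i\) and \(p_j\) are points on the boundary of the shape (denoted \(P\) here). The area scales with the square of this distance.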

This property makes the Reock index very sensitive to elongated shapes and to the direction of elongation. Think about the circumscribing circles of the following three shapes.

Source: Polsby and Popper (1991)

This is a particular problem for geospatial datasets because different projections distort shapes differently, and differently in different parts of the world, depending on their underlying assumptions. The convex hull approach slightly mitigates this problem.
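The sensitivity of the Reock index to elongation is easy to see with rectangles, where the minimum bounding circle is known exactly: its diameter is the diagonal. The sketch below is restricted to that special case; the function name and the chosen aspect ratios are illustrative assumptions.

```python
import math

def reock_rectangle(width, height):
    """Reock index of a rectangle, using its minimum bounding circle.

    For a rectangle the bounding circle's diameter equals the diagonal,
    which is also the largest distance between two boundary points.
    """
    diameter = math.hypot(width, height)
    circle_area = math.pi * (diameter / 2) ** 2
    return width * height / circle_area

# Rectangles of equal area (1.0) but increasing elongation
for aspect in (1, 2, 5, 10):
    w, h = math.sqrt(aspect), 1 / math.sqrt(aspect)
    print(f"aspect {aspect:>2}:1  Reock={reock_rectangle(w, h):.3f}")
```

All four rectangles have the same area, yet the index falls quickly as the shape is stretched, because the bounding circle grows with the square of the longest dimension. A projection that stretches a shape along one axis would shift the index in the same way.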

Multiple Polygons, Holes and Other Realities

Until now, we have primarily dealt with single shapes, albeit with fuzzy boundaries. Often, we encounter disjoint polygons that belong to the ‘same feature’, polygons with holes and other esoterica. These shapes are often the result of complicated geoprocessing operations such as unions, dissolves and intersections, or of mapping/surveying errors. None of the above metrics fares particularly well in describing compactness when confronted with these realities. There is little agreement about even simple concepts like perimeters, areas and circumscribing polygons for such shapes.

A partial solution

One way to mitigate these issues, and also to reduce the computational burden, is to work with rasters. While the fuzziness of boundaries is still an issue, the perimeters of ‘raster shapes’ cannot get arbitrarily large for a given resolution (cell size). The other is to rely on the Moment of Inertia (MI) to account for multiple polygons and holes. Tightrope walkers rely on this idea to prevent rotating around (falling from) the rope. See Tatiana-Mosio-Bongonga in Paris above!

Moment of Inertia

Li et al. (2013) proposed using ratios of the area moment of inertia (MI). Two features of this measure make it particularly attractive: 1) MI is decomposable, and 2) the Parallel Axis Theorem applies.

If \(I_z\) is the MI of a shape of area \(A\) about an axis passing through its centroid, the MI about a parallel axis a distance \(d\) away is \(I_z + Ad^2\). Furthermore, for a collection of areas \(K\), the MI is \(\sum_{k \in K} (I_z^k + A_k d_k^2)\), where \(d_k\) is the distance of the centroid of part \(k\) from the overall centroid and \(I_z^k\) is the MI of the individual part w.r.t. its own centroid.

The MI of each square cell (assuming that the raster resolution is the same in both the x and y directions) with resolution/width \(s\) is \(s^4/6\). Thus, the MI for the landscape is

\[\sum_{i \in S} (\frac{s^4}{6} + d_i^2 s^2) \unicode{x1D7D9}_i\] where \(\unicode{x1D7D9}_i = 1\) when the cell \(i\) belongs to the shape, 0 otherwise, and \(d_i\) is the distance from the centre of cell \(i\) to the centroid of the shape.
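As a quick sanity check of this decomposition (a worked example of my own, not part of the original analysis), consider a 2 × 2 block of cells, i.e. a square of side \(2s\). The centre of each cell is at distance \(d_i\) from the centroid of the block with \(d_i^2 = s^2/2\), so

\[\sum_{i=1}^{4} \left(\frac{s^4}{6} + \frac{s^2}{2} s^2\right) = 4 \cdot \frac{4 s^4}{6} = \frac{16 s^4}{6} = \frac{(2s)^4}{6}\]

which matches the MI of a single square of side \(2s\) computed directly. The cell-by-cell sum is therefore exact for any shape made of square cells, not an approximation.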

The MI of the most compact shape, a circle with the same area \(A\), is \(A^2/(2 \pi)\). Thus,

\[IMI := \frac{A^2}{2 \pi \sum_{i \in S} s^2 (\frac{s^2}{6} + d_i^2) \unicode{x1D7D9}_i } \]

Strictly speaking, it is hard to get a perfect circle fashioned out of a raster, so the IMI always lies in \((0,1)\), but for all practical purposes this does not matter much.
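The IMI formula above translates almost directly into code. The following is a minimal pure-Python sketch, assuming the raster is given as rows of truthy/falsy values (True where the cell belongs to the shape). The function name `imi`, the `cell_size` parameter and the test shapes are hypothetical choices for illustration, and this is not the code used to produce the results in this post.

```python
import math

def imi(mask, cell_size=1.0):
    """Index of Moment of Inertia for a raster given as rows of booleans."""
    s = cell_size
    # Centres (x, y) of the cells that belong to the shape, in map units
    centres = [((c + 0.5) * s, (r + 0.5) * s)
               for r, row in enumerate(mask)
               for c, inside in enumerate(row) if inside]
    if not centres:
        raise ValueError("mask contains no cells belonging to the shape")
    n = len(centres)
    cx = sum(x for x, _ in centres) / n
    cy = sum(y for _, y in centres) / n
    # Parallel axis theorem: each cell's own MI (s^4/6) plus cell area * d_i^2
    mi_shape = sum(s ** 4 / 6 + s ** 2 * ((x - cx) ** 2 + (y - cy) ** 2)
                   for x, y in centres)
    area = n * s ** 2
    # MI of the equal-area circle, A^2 / (2*pi), over the MI of the shape
    return area ** 2 / (2 * math.pi * mi_shape)

# Illustrative shapes (hypothetical): filled square, rasterised disc,
# disc with a hole, and two disjoint patches belonging to one 'feature'
square = [[True] * 50 for _ in range(50)]
disc = [[(x - 100) ** 2 + (y - 100) ** 2 <= 90 ** 2 for x in range(201)]
        for y in range(201)]
ring = [[30 ** 2 <= (x - 100) ** 2 + (y - 100) ** 2 <= 90 ** 2 for x in range(201)]
        for y in range(201)]
two_blobs = [[c < 25 or c >= 175 for c in range(200)] for _ in range(50)]

for name, shape in [("square", square), ("disc", disc),
                    ("ring", ring), ("two blobs", two_blobs)]:
    print(f"{name:10s} IMI={imi(shape):.3f}")
```

Holes and disjoint parts need no special handling, since every cell simply contributes its own term to the sum. A filled square scores \(3/\pi \approx 0.955\) regardless of cell size, and the index is unchanged if the whole raster is rescaled. For real rasters a vectorised version (for example with NumPy) would be much faster, but the arithmetic is the same.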

Urban Compactness Scores

To look at urban compactness scores, I use the 2011 NLCD data. I select the urban landcover categories and remove the roads and other features that make urban landscapes seem connected. In particular, the landcover data is processed to eliminate roads and small stringy patches. See other posts for more details.

Urban landcover shown in black

The urban compactness scores are shown in the following map. You can also download the data to use it in your own work.

It is not surprising that the more compact counties are urban. This is more clearly revealed in the following box plot. Large central metro counties are substantially more compact than small metros or rural areas. But there is significant variation in the compactness of counties.

It is useful to see if there is a geographic pattern in these compactness scores relative to the size of the metro. Of all the large central metro counties, on average, those in the West North Central division are the most compact, while Mountain counties are the least compact. This result can partially be explained by the variation in topography: terrain ruggedness potentially prevents compact development. In the large fringe metro category, Middle Atlantic counties are, on average, the most compact.

Limitations

Table 1: Most Compact Counties according to IMI
County	State	IMI
East North Central
Marion	Indiana	0.84
DuPage	Illinois	0.79
Franklin	Ohio	0.78
East South Central
Shelby	Tennessee	0.67
Fayette	Kentucky	0.64
Davidson	Tennessee	0.56
Middle Atlantic
Kings	New York	0.82
Nassau	New York	0.76
Bronx	New York	0.73
Mountain
Salt Lake	Utah	0.76
Ada	Idaho	0.61
Bernalillo	New Mexico	0.60
New England
Kent	Rhode Island	0.43
Hartford	Connecticut	0.42
Hampden	Massachusetts	0.42
Pacific
Orange	California	0.56
Multnomah	Oregon	0.53
Sacramento	California	0.49
South Atlantic
Roanoke	Virginia	0.84
Charlottesville	Virginia	0.75
Salem	Virginia	0.74
West North Central
St. Louis	Missouri	0.73
Ramsey	Minnesota	0.72
Hennepin	Minnesota	0.67
West South Central
Dallas	Texas	0.73
Tarrant	Texas	0.69
Bexar	Texas	0.69

The above table throws up some interesting names that point to some of the limitations of IMI. Dallas and Tarrant counties are the most compact in the West South Central division according to IMI, despite the reputation of the Dallas-Fort Worth metropolitan area as being among the most sprawling. Likewise for Orange County in California. Both Roanoke and Charlottesville are independent cities in the Commonwealth of Virginia (county equivalents in Census parlance) and are relatively small compared to ‘real’ counties. To see this, explore the following bivariate graph.

You can click on each type of county to toggle its visibility. There is no real relationship between urban population density and IMI. Some dense places (e.g. New York, NY) have very small IMI, and some low-density medium metropolitan areas have high compactness (e.g. Roanoke, VA). Thus the two indicators are essentially orthogonal.

A side note here: according to Wikipedia, Brown County, IN is one of the least dense counties in Indiana. However, it shows up on the right side of the x-axis (high density). This is because density in this graph is calculated as total population divided by urban land area from the land cover data (not gross density).

As with any measure, we need to understand its limitations as well as its usefulness. I want to point out two main issues. The first is that this is an area-based moment of inertia, and thus areas that are ‘low density’ but contiguous would score high. The second is that the shape of the landscape (county) becomes quite important, especially in more heavily urban counties. To see this, we only have to look at the difference in IMI between New York County (Manhattan) and Kings County (Brooklyn).
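The second issue can be made concrete with a worked example (my own illustration, using an idealised rectangle rather than actual county boundaries). A completely built-out rectangle of width \(w\) and height \(h\) has area \(A = wh\) and MI \(wh(w^2 + h^2)/12\) about its centroid, so

\[IMI = \frac{(wh)^2}{2 \pi \, wh (w^2 + h^2)/12} = \frac{6 wh}{\pi (w^2 + h^2)}\]

For a square (\(w = h\)) this gives \(3/\pi \approx 0.95\), while for a hypothetical strip with \(h = 6w\) it gives \(36/(37 \pi) \approx 0.31\). A long, narrow county that is urbanised wall to wall is therefore capped at a low IMI purely because of the shape of its boundary.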