Visualizing a building shapes taxonomy - The City Summit 🏙️🗻 project
Back story
While working on new features for the OvertureMaestro library, I started to explore the dataset of buildings that is provided by Overture Maps. Remembering an old project focusing on assembling multiple images of women of different ethnicities (see here) the question came to mind - What would the “average” shape of a building in a given city look like?
So, as is my habit, I started on a side project instead of finishing the functionality I was currently working on.
The results of this project can be accessed on Streamlit: https://city-summit.streamlit.app/
You can also run the code locally using this Jupyter notebook.
Getting the data
Downloading the Overture Maps data for the buildings is quite easy. I have used the OvertureMaestro
library, but you can also access it with official overturemaps-py
library as well as using DuckDB SQL query.
from overturemaestro import convert_geometry_to_geodataframe, geocode_to_geometry
buildings = convert_geometry_to_geodataframe(
"buildings", "building", geometry_filter=geocode_to_geometry("Paris")
)
This code will download the data locally as a GeoParquet and return a GeoDataFrame for the user. You can also get the path to the file instead using convert_geometry_to_parquet
function.
First tests
To begin with, I knew I had to solve two problems: how to centre all the buildings so that they were in the same space, and how to rotate them in a similar orientation so that the edges of the buildings go mainly vertical and horizontal.
Translating building vertices
For the first part, I have used the Azimuthal Equidistant projection, that maintains the distances and azimuths. Each building was projected based on their centroid.
# Geometry projection to AEQD
import pyproj as proj
from shapely.ops import transform
from shapely import Polygon
# Example building
g = Polygon(
[
[17.0251207, 51.0607852],
[17.0250685, 51.0607947],
[17.0250616, 51.0607808],
[17.025115, 51.060771],
[17.0251207, 51.0607852],
]
)
centroid = g.centroid
crs_wgs = proj.Proj("epsg:4326")
aeqd_proj = proj.Proj(
f"+proj=aeqd +lat_0={centroid.y} +lon_0={centroid.x} +datum=WGS84 +units=m"
)
project = proj.Transformer.from_proj(crs_wgs, aeqd_proj, always_xy=True).transform
geom_proj = transform(project, g)
geom_proj # notice that values are close to zero - above and below, these are in meters
# POLYGON ((2.0498206612070975 0.2566105251930882, -1.6097078406850456 1.3134800558941384, -2.093439190884303 -0.2328869958116028, 1.6502174436552757 -1.3231316731725968, 2.0498206612070975 0.2566105251930882))
Finding the best building rotation angle
The second part was trickier. How can I determine when the building is “straight” or “right”?
My first idea was to rotate the building for each angle individually from 0 to 89 and for each angle calculate the azimuths of edges, weighted by their length and create some kind of score to determine the best rotation angle.
After thinking things through, and after reviewing the Shapely library documentation, I found a simpler solution.
Using the minimum_rotated_rectangle
function, I can get a minimum rectangle surrounding a given building, a de facto rotated bounding box. What if I just rotate this rectangle so that it is equal to the real bounding box? The idea was that this configuration should keep the footprint of a building the smallest and most edges should be in horizontal and vertical orientations.
To get the right angle, I iterated all the points of the resulting rectangle in turn, counted the azimuth between two points and rotated the shape by the negative value obtained. But in this way I will obtain 4 angles, which will be two pairs in the horizontal and vertical axis. How do I choose one angle from these 4? I decided that I wanted the rotated building to be longer than taller (width >= height), and for the centroid to be in the middle, or to be above half the height of the rectangle.
import math
from shapely import affinity
# Rotation logic
coords = list(geometry.minimum_rotated_rectangle.exterior.coords)
for i in range(len(coords) - 1):
p1 = coords[i]
p2 = coords[i + 1]
angle = math.degrees(
math.atan2(p2[1] - p1[1], p2[0] - p1[0])
) # https://stackoverflow.com/questions/42258637/how-to-know-the-angle-between-two-points
rotated_geometry = affinity.rotate(geometry, angle=-angle, origin="centroid")
minx, miny, maxx, maxy = rotated_geometry.bounds
width = maxx - minx
height = maxy - miny
# Building is higher than it is wider
if round(height, 2) > round(width, 2):
continue
# The centroid is in the lower half of the rectangle
if abs(round(maxy, 2)) > abs(round(miny, 2)):
continue
return rotated_geometry
raise RuntimeError("Rotation not found")
Now that I think about it, it would be possible to choose the right angle of rotation without rotating the building, based on the length and angle between the diagonals, but I would rather stay with the current implementation.
Basic visualizations
To start, I used the plot()
function from the GeoPandas library displaying only the edges of the buildings.
To make these less cluttered, I divided the set of buildings into small and large based on the total area. I also added a slight background colouring for the buildings.
I have also experimented with leaving only vertices, without edges.
Switch to the heightmap
For me, those previous results weren’t satisfactory, so I started experimenting with the rasterio
library and the rasterize
functionality for switching vector data into raster data.
import numpy as np
from affine import Affine
from rasterio.features import MergeAlg, rasterize
# 1 px will equal to 1 meter,
# higher values will increase number of pixels per meter
resolution = 1
minx, miny, maxx, maxy = geoseries.total_bounds
canvas_width = int(np.ceil(maxx - minx)) * resolution
canvas_height = int(np.ceil(maxy - miny)) * resolution
canvas = rasterize(
shapes=geoseries,
fill=0,
# Add padding of 2 pixels on each side
out_shape=(canvas_height + 4, canvas_width + 4),
# Add values together from many shapes
merge_alg=MergeAlg.add,
transform=(
# Add 2 pixels to move away from edge
Affine.translation(xoff=minx - 2, yoff=miny - 2)
# Scale buildings to fit bigger canvas
* Affine.scale(1 / resolution)
),
)
heightmap = np.flipud(canvas) # flip canvas vertically
Plotting the heightmap
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import LogNorm
fig, axes = plt.subplots(1, 2, figsize=(20, 10), dpi=300)
heightmap_masked = np.ma.masked_where(heightmap == 0, heightmap)
axes[0].imshow(heightmap_masked)
axes[0].set_axis_off()
axes[0].set_title("Linear colour scale")
axes[1].imshow(heightmap_masked, norm=LogNorm())
axes[1].set_axis_off()
axes[1].set_title("Log colour scale")
plt.show()
As you can see, the centre of the heightmap is dominated by extremely high values. This makes sense; after all, most buildings in cities have a fairly small footprint. To discover more detail, it is useful to display the values on a logarithmic scale.
3D visualization
Having a heightmap, I began to wonder what it would look like in 3D. Using the Surface plot function from the Plotly library, I was able to generate an interactive visualisation for the calculated data.
Loading the palette
The next step to improve the look was the ability to change the colour palette. For this I used the PyPalettes library with which you can easily access thousands of palettes from various sources.
from pypalettes import load_cmap
import numpy as np
loaded_cm = load_cmap("ag_Sunset", cmap_type="continuous")
# resample to 256 values
interpolated_colours = loaded_cm(np.linspace(0, 1, 256))
# replace first colour with white or black
white = np.array([1, 1, 1, 1])
interpolated_colours[:1, :] = white
# transform the palette to a plotly-compatible format:
# a list of rgb(a) css values
plotly_map = list(
map(
lambda c: f"rgb({int(c[0] * 255)}, {int(c[1] * 255)}, {int(c[2] * 255)})",
interpolated_colours[:, :3]
)
)
Polishing the visualization
To make the plot prettier I’ve done these additional steps:
- Applied logarithm on the heightmap
- Removed the axes
- Applied the new palette
import numpy as np
import plotly.graph_objects as go
# Calculate the ratio to keep proper aspect
y_ratio = heightmap.shape[0] / heightmap.shape[1]
# Applying logarithm on 0 will return -inf
# Replace these values with -1
with np.errstate(divide="ignore"):
z = np.log(heightmap)
z[np.isneginf(z)] = -1
# Camera position
x_eye, y_eye, z_eye = 0.8, 0.8, 1.5
fig = go.Figure(data=[go.Surface(z=z, colorscale=plotly_map)])
fig.update_traces(
showscale=False,
contours_z=dict(show=True, usecolormap=True, project_z=False),
)
fig.update_scenes(aspectratio=dict(x=1, z=1, y=y_ratio))
fig.update_layout(
title=dict(text=f"<br>Paris", font=dict(size=20)),
autosize=True,
scene_camera_eye=dict(x=x_eye, y=y_eye, z=z_eye),
margin=dict(l=5, r=5, b=5, t=5),
scene=dict(
xaxis=dict(visible=False),
yaxis=dict(visible=False),
# Clip the values to be above the slightly above -1
zaxis=dict(visible=False, range=[-0.99, None]),
),
)
Just for comparison, I have also created a visualization for buildings without rotating them.
Streamlit implementation
I wanted to share this project on the internet, but just showing the visualizations or animations wouldn’t cut it for me. In my open-source endevours I’m strongly advocating for exploring the areas you personally know, so I wanted to enbale everyone to easily select their home city and create the summit on the spot without any hassle.
I have never deployed any Python app on the internet, but I heard about Streamlit and Gradio. After quick search on the internet, I decided to give Streamlit a shot.
Creating the app
I was pleasantly surprised by how easy it is to create a Streamlit application deployed in the Community Cloud. You log int with your GitHub account, select a template (I went with the Blank one), choose a name / domain and everything is created for you. Then you can start with coding using GitHub Codespaces right away. It was also my first time using the Codespaces and I must admit that it is a very convenient process (though I am an avid VS Code user).
Coding experience
Streamlit documentation is really well maintained with interactive examples. I was able to quickly draft a conditional form for the user with state caching and values validation (mainly Nominatim geocoding errors or limiting the max area of interest).
Porting the code from my notebook required some changes, mainly to the tracking functions that displayed progress - I was using a track
function from rich.progress
, but had to wrap it with manual st.progress
calls in every loop iteration.
One of the limiting factors was resources. It’s not oficially mentioned in the documentation (or I didn’t find it), but I have found an answer on Streamlit Discuss forum that a Community Cloud app has up to 2 CPUs, 1 GB of memory and 50 GB of storage.
I had to completely remove multiprocessing from the app to avoid code throttling, because Streamlit displays errors if resources usage is maxed out. I even had to make a quick patch to the OvertureMaestro
library to allow passing a max number of multiprocessing workers.
To speed up the calculations, I opted to replace geometry projection from Azimuthal Equidistant to Universal Transverse Mercator (UTM) Coordinate Reference System (CRS), because it can be applied on the whole dataset of buildings at once and the difference in precision isn’t that important for this project.
Additionally, to lower the memory usage, I’ve implemented batch processing for downloaded buildings data. OvertureMaestro
is downloading Overture Maps data into a GeoParquet file. Instead of loading it all to a GeoDataFrame
with read_parquet
function, I’m iterating over Arrow batches with PyArrow
library to keep the memory usage low.
Overall I was happy with how fast I was able to push out a finalized app to the public. Automatic theme adjustment for the Plotly chart with st.plotly_chart
was also really handy. All required calculations are saved on disk, so consequent runs do not have to redo all the steps (until the app is stopped because of no activity).
I’ve also included the option to download the generated heightmap as a numpy array if anyone wants to experiment with it.
The code is publicly available on the City Summit 🏙️🗻 GitHub repository.
Summary
What I learned
The main thing I learned during this project was how to write apps with Streamlit.
The data wrangling part (getting all the building shapes into a common space and finding the proper rotation) was a nice mind-puzzle to figure out, but wasn’t as hard to achieve with currently available tools in Python.
For the visualization part, I initially tried to use Matplotlib 3D surface plot functions, but the results weren’t satisfactory, so I switched to Plotly.
As an additional feature, I was thinking about transforming the generated heightmap into an STL file for 3D printing purposes if anyone wishes to do that. I’ve checked some sources on how to transform heightmaps into 3D objects, but decided not to pursue this path for now.
In the end, I am satisfied with the visualisations generated, and with the interactive application I deployed on Streamlit.
Used libraries
- OvertureMaestro - for downloading Overture Maps data for a given city and for geocoding the string to a geometry.
- GeoPandas - for geospatial operations on a dataset of downloaded buildings and I/O.
- Shapely - for rotations and translations of a building geometry.
- Rasterio - for rastering the buildings dataset into a heightmap.
- Plotly - for displaying 3D surface plot using heightmap data.
- NumPy - for 2D array (heightmap) manipulation.
- PyArrow - for parquet file batch processing.
- PyPalettes - for easy access to many palletes.
- Matplotlib - for transforming palettes.
- streamlit-folium - a third party Streamlit component for displaying a Folium map, used for preview geocoded geometry to the user.
- Lonboard - for displaying geocoded geometry in the Jupyter Notebook.
Social media mentions
Streamlit:
LinkedIn:
- First results
- First 3D matplotlib visualization
- Rotating 3D Plotly animation
- Streamlit deploy mention
- New features update
Bluesky: