Volumetric City Modelling I

The physical urban space is a 3-dimensional composition of built and unbuilt volumes. Generating a digital (3D) model of this environment creates a virtual cityscape that is useful for the study of settlement systems and assists in the meaningful assembly of relevant information. A Digital Twin superimposes climate, weather, pollution, and/or other contextual data from the ‘fluid domain’ (the surrounding void or air envelope) over a watertight, ‘form’-based model made up of buildings, pavements, trees, lamp posts, etc. Mapping spatial urban character thus aids in developing a clear understanding of the morphological factors that influence local intra-urban climate. This article is the first part of a dive into the crucial domain of volumetric 3D city modelling. Urban climate modelling will be taken up in subsequent articles.

Data gathered expressly for the purpose of generating a 3D model will, for the most part, be accurate. The same cannot be said for existing datasets, which inevitably require editing before they can be reused. Building footprints are comparatively easy to obtain, but their height information is not. Data archives on urban morphology, especially in countries like India, face challenges of harmonization, quality, and availability. The data is generally collected by the city, state, and/or centre and may be sourced from them. Private agencies are an integral part of the international simulation market but have yet to take hold in India. Passive collection techniques are traditionally and widely employed to complete such databases, although newer active methods, such as photogrammetry and laser scanning, are popular in professional and academic circles. Both active and passive methods are expensive in their own right, which can sometimes hurt project viability. Primary data collection (passive) is time consuming, secondary sources (passive) do not produce particularly accurate results, and scanning (active) requires significant capital investment. Another way of categorizing modelling methods is to group them into manual, semi-automatic, and automatic approaches. Passive modelling of individual objects falls under manual modelling. Processes like photogrammetry require custom tweaking at various levels, manual or automated, and are referred to as semi-automatic. Active methods that use pattern recognition and image processing to assemble scanned points are the most technologically demanding and constitute automatic techniques.

Passive or Manual Techniques

Passive height prediction methods are broadly based on built parameters, geometric properties, and demographic attributes. Common physical and socio-economic variables collected from primary and secondary sources include building use, year of construction, footprint, storeys above ground, net internal area, population density, household size, income, etc. The popular techniques within each of these categories are briefly summarized below.

Predictors using Built Parameters:

Number of Storeys: Assuming a storey height based on the use and age of a building and multiplying it by the number of floors gives the total height of the building in question. The effect of these assumptions on the quality of the outcome is, however, unclear and remains unreported. Floor count is part of an individual building’s dataset but is seldom available for an entire sector or neighbourhood. This amplifies any deviations in calculated heights.
Local Regulations: Building regulations are put in place to relieve strain on urban infrastructure, balance the local ecosystem, enhance the aesthetic quality of a neighbourhood, manage traffic congestion, and the like. Height restrictions are an integral part of these regulations and are frequently exploited since real estate is the primary cash cow of an urban economy. This leads to the assumption that every building is built to its maximum permissible height, which forms the basis of this method of estimation. Field data, however, reveals that it overestimates heights by around 20%.
Net Internal Area/ Carpet Area/ Built-Up Area: Being highly case-specific, this is an attractive parameter to consider. It is readily available for most projects, especially bigger and newer ones. An estimation would use the relation between built-up volume and footprint to ascertain height by way of the number of storeys. The parameters are important as real estate metrics but do not capture gross volume above ground all that well. The outcome therefore suffers from under- or over-estimation.
Shadows/ Sun Ephemeris: The height of a building may be estimated from the length of its shadow and the solar altitude at a given location, date, and time. This method specifically requires orthorectified aerial and/or satellite imagery to work with, preferably from the same sweep. Challenges include overlapping shadows (as in tiled satellite images from different batches), interfering vegetation (mostly in RGB representations), and distortions due to sloped terrain (from non-orthorectified images). The quality of such predictions is often reported and therefore verifiable.
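
The storey-count and shadow-based estimates above reduce to simple arithmetic, which the following minimal Python sketch illustrates. The storey heights in ASSUMED_STOREY_HEIGHT_M and both function names are hypothetical and chosen only for illustration; the shadow formula assumes flat ground and an orthorectified image, as noted above.

```python
import math

# Assumed floor-to-floor heights in metres, keyed by building use.
# These are illustrative placeholders, not surveyed values.
ASSUMED_STOREY_HEIGHT_M = {"residential": 3.0, "commercial": 4.0}


def height_from_storeys(storeys: int, use: str) -> float:
    """Height = number of storeys x assumed storey height for the given use."""
    return storeys * ASSUMED_STOREY_HEIGHT_M[use]


def height_from_shadow(shadow_length_m: float, solar_altitude_deg: float) -> float:
    """Height = shadow length x tan(solar altitude), assuming flat ground."""
    if not 0 < solar_altitude_deg < 90:
        raise ValueError("solar altitude must lie strictly between 0 and 90 degrees")
    return shadow_length_m * math.tan(math.radians(solar_altitude_deg))


# A four-storey residential block under the assumed storey height.
print(height_from_storeys(4, "residential"))

# A 10 m shadow with the sun 45 degrees above the horizon.
print(round(height_from_shadow(10.0, 45.0), 2))
```

Any error in the assumed storey height scales with the floor count, which is one way the deviations mentioned above get amplified.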

Using Geometric Properties:

Geometric attributes: Parameters such as a building’s footprint area, its shape complexity, number of neighbouring buildings, etc. are noteworthy metrics that help assess the volumetric properties of a neighbourhood.

Demographic attributes are also found to directly influence building height and may be used to predict the same. They however do not suffice individually and need to be combined with similar metrics to produce a reliable result. Some such predictors are:

Population Density: A large population in a small neighbourhood may translate to high rise buildings or crammed low rise accommodations. The two are indicative of very different social classes, so density may be converted to building height only after assessing the area’s socio-economic profile.
Average Household Size: In combination with gross population, family size helps determine the number of households and, consequently, dwelling units. This, in addition to building footprint and/or area, can help estimate a building’s height.
Average Income: A supplementary parameter, it helps gauge what a community or neighbourhood can afford.
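
As a rough illustration of how household size and population might be chained into a height estimate, consider the sketch below. The function name, the average dwelling area of 60 m², and the 3 m storey height are all assumptions made for this example and are not taken from any dataset.

```python
import math


def storeys_from_demographics(
    population: int,
    household_size: float,
    total_footprint_m2: float,
    unit_area_m2: float = 60.0,  # assumed average dwelling area
) -> int:
    """Estimate the average storey count needed to house a population."""
    # Households, and hence dwelling units, from population and family size.
    dwelling_units = math.ceil(population / household_size)
    # Floor area those dwellings would occupy.
    floor_area_m2 = dwelling_units * unit_area_m2
    # Stack that floor area over the available footprint.
    return max(1, math.ceil(floor_area_m2 / total_footprint_m2))


ASSUMED_STOREY_HEIGHT_M = 3.0  # illustrative value

storeys = storeys_from_demographics(1200, 4, 6000.0)
print(storeys, storeys * ASSUMED_STOREY_HEIGHT_M)
```

The result is an average over the whole footprint, which is why such predictors need to be combined with others before they say much about any individual building.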

Most of these methods consider a single predictor which is seldom acquired with the intention of producing 3D models. This, combined with the fact that such databases have rampant missing values, hinders the production of usable results. To combat this, new methods are being developed which use machine learning techniques, such as Random Forests, to filter and combine useful predictors. Intuitively, this is expected to improve results since a variable’s shortcomings are significantly reduced when it is used in conjunction with others. In a study by Biljecki, Ledoux, & Stoter (2017) on generating 3D city models (https://bit.ly/3FUST4U), estimations using only built attributes show low Mean Absolute Errors (MAEs), particularly those with storey data (~1-1.3 m). Models selectively using demographic parameters were found to have considerably higher MAEs, while those using geometry lay somewhere between the two. In the study, built and geometric attributes together produced an MAE of ~0.9 m and all parameters combined generated an MAE of ~0.8 m. A comparison between variables reveals number of storeys, building age, and net internal area as the most useful predictors, a combination of which also has an MAE of ~0.8 m. CityGML sets its positional and height accuracy benchmark at 5 m or less, which renders all of the above options usable.
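
For readers unfamiliar with the metric, the sketch below shows how a Mean Absolute Error is computed and compared against the 5 m benchmark mentioned above. The height values are made up for illustration and are not taken from the study; the function and constant names are likewise hypothetical.

```python
def mean_absolute_error(predicted, observed):
    """Average of the absolute differences between predicted and observed values."""
    if len(predicted) != len(observed) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(predicted)


# Height accuracy benchmark quoted in the text, in metres.
BENCHMARK_M = 5.0

# Made-up building heights in metres, for illustration only.
predicted = [9.0, 12.5, 6.0, 15.0]
observed = [10.0, 12.0, 7.0, 14.0]

mae = mean_absolute_error(predicted, observed)
print(f"MAE = {mae:.3f} m, within benchmark: {mae <= BENCHMARK_M}")
```

An MAE is an average, so individual buildings may still deviate by more than the benchmark even when the overall figure sits comfortably below it.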

There are many advantages to manual methods, not the least of which is an abundance of data. Where data is limited or missing, however, it might seem advantageous to depend entirely on active techniques. Despite being capital intensive and technologically demanding, these methods produce high quality datasets and can cover larger ground considerably faster. They are also scientifically and technically exquisite processes and very interesting to unfold.

A Digital Twin is composed of at least a morphological model and a weather simulation. Both are independently vast areas of study. The resultant rich virtual environment mimics our chosen biome. It could potentially assess multiple climate scenarios with volumetric correctness and accurate material traits. This is as important for disaster mitigation and relief as it is for building healthy and sustainable settlements. Urban Climate Maps are a 2-dimensional version of the same idea and have been around for about half a century now. They too link physical and climatic parameters within the urban canopy layer to generate a model. Their use is widespread in the domains of physical planning, social vulnerability mapping, public decision-making, sustainable design and management, and the like. These are the same areas where digital twins are expected to add value.

A vector outlining the top view of a building is its footprint. The most popular open source platform for built information, OpenStreetMap (OSM), holds extensive road data for India although the same is not true for her buildings. OSM has been discussed in detail in another article (https://bit.ly/32TOuAO). In the absence of this most basic database, both 2D and 3D built information over an area of interest needs to be mapped individually. This is not reassuring and points towards a predominant reliance on passive techniques.

Sources include physical surveys and volunteered geodata (primary) as well as cadastral and census datasets (secondary). Drawing from such rich databases allows for multiple initial constants which helps confirm findings. Because height information is collected directly, primary sources are more reliable than secondary.

Two of the most popular technologies used are photographs and light pulses. Photogrammetry uses a collection of overlapping photographs to assess depth. Since only partial height information is present between said photographs, it is popularly described as being based on a 2.5D map. Laser scanning uses light pulses to measure distance, thereby generating a 3D point cloud. This is a more active technique than photogrammetry since it directly builds the volume. Both processes are ‘technology’ sensitive, which is usually what drives up prices.
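
To make the idea of measuring distance with light pulses concrete, the sketch below assumes a pulsed time-of-flight scanner, in which range is half the round-trip travel time multiplied by the speed of light. Real scanners may use other ranging principles, and the function name is hypothetical.

```python
# Speed of light in vacuum, in metres per second.
SPEED_OF_LIGHT_M_PER_S = 299_792_458


def range_from_round_trip(round_trip_time_s: float) -> float:
    """Distance to a surface from the round-trip time of a light pulse."""
    if round_trip_time_s < 0:
        raise ValueError("round-trip time cannot be negative")
    # The pulse travels to the surface and back, hence the division by two.
    return SPEED_OF_LIGHT_M_PER_S * round_trip_time_s / 2


# A pulse that returns after one microsecond hit a surface roughly 150 m away.
print(round(range_from_round_trip(1e-6), 3))
```

Each such range, combined with the direction in which the pulse was fired, yields one point of the point cloud.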

Most authorities use FAR (Floor Area Ratio) for development control. It is the ratio of built-up area (BUA) to plot area and thus entails most of the same limitations as BUA. For instance, it includes basements, which do not translate to building ‘height’, and it can be divided among multiple structures within the same plot. This inevitably leads to overestimation.
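
The overestimation can be seen with a small worked example. The sketch below assumes a single structure on the plot with a uniform footprint on every floor; the function name, the plot figures, and the 3 m storey height are illustrative assumptions.

```python
def storeys_from_far(
    far: float,
    plot_area_m2: float,
    footprint_m2: float,
    basement_area_m2: float = 0.0,
) -> float:
    """Storeys above ground implied by FAR, plot area, and footprint."""
    # FAR is built-up area divided by plot area, so BUA = FAR x plot area.
    built_up_area_m2 = far * plot_area_m2
    # Basement floor area counts towards BUA but adds no height.
    above_ground_area_m2 = built_up_area_m2 - basement_area_m2
    return above_ground_area_m2 / footprint_m2


ASSUMED_STOREY_HEIGHT_M = 3.0  # illustrative value

# Ignoring the basement attributes every square metre to height.
naive = storeys_from_far(2.0, 1000.0, 500.0)
# Accounting for a single basement level of the same footprint.
corrected = storeys_from_far(2.0, 1000.0, 500.0, basement_area_m2=500.0)

print(naive * ASSUMED_STOREY_HEIGHT_M, corrected * ASSUMED_STOREY_HEIGHT_M)
```

In this example a single basement level accounts for a full storey of difference between the two estimates.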

Net Internal Area (NIA), Carpet Area (CA), and Built-Up Area (BUA) are different from each other, as most building professionals would know. NIA excludes internal and external walls but includes circulation; CA covers habitable areas as well as water closets and baths but not balconies and external circulation areas. These two parameters underestimate the total built volume. BUA and Super-BUA are more appropriate in this regard but include underground construction as well, which leads to overestimation.

As the name suggests, a Digital Terrain Model (DTM) displays geographic terrain information. It is used for flood and drainage modelling, geomorphological studies, etc. Since a settlement is predominantly built above the ground, a DTM is not especially useful when addressing the urban canopy layer.

CityGML is a standardized data model for the storage, representation, and exchange of 3D urban objects. It defines classes and inter-relationships between said objects based on their geometric and semantic properties. Based on the Level of Detail (LoD) of a model, the standard outlines acceptable usage and application.