Urban low-altitude UAV route planning: digital twin and neural rendering airspace modeling

Review of the application of digital twins and neural rendering in urban UAV airspace modeling, covering the latest work in TRO/TITS/RAL/IROS 2022-2025

Direction Three: Digital Twin + Neural Rendering Airspace Modeling Extended Chapter · Technology Blog Series Part 3


1. Background: Digital twins empower urban low-altitude economy

With the rapid development of urban air mobility (UAM) and the low-altitude economy, fine-grained management of urban low-altitude airspace has become a core need. Traditional air traffic control relies on static maps and rule-driven systems, which cannot meet the real-time planning needs of drones in complex three-dimensional urban environments. A digital twin, an accurate digital mapping of physical space, offers a new technical path for dynamic modeling of urban low-altitude airspace.

Urban low-altitude digital twins need to integrate multi-source data: satellite imagery provides the macroscopic distribution of surface objects, building information models (BIM) provide fine geometric structure, and real-time sensor data (LiDAR, cameras, weather stations) drive the twin’s dynamic evolution. The core value of a digital twin platform is that it closes the “prediction-planning-simulation-verification” loop in digital space, which reduces the risk and cost of real flight tests.

This article focuses on the application of neural rendering to digital twin airspace modeling, and explores how methods such as NeRF and 3DGS can build high-fidelity, real-time updatable three-dimensional representations of urban low-altitude airspace.


2. Basics of digital twin airspace modeling

2.1 Airspace digital twin system architecture

Urban low-altitude digital twin systems usually adopt a five-layer architecture:

| Layer | Function | Key technology |
|------|------|------|
| Data acquisition layer | Multi-source sensing data fusion | LiDAR SLAM, visual-inertial odometry (VIO), satellite remote sensing |
| Data processing layer | Point cloud registration, semantic segmentation | ICP, PointNet++, Segment Anything |
| 3D modeling layer | Geometry/texture/semantic reconstruction | Photogrammetry, NeRF/3DGS, BIM integration |
| Simulation and deduction layer | Trajectory prediction, traffic simulation | Multi-agent simulation, reinforcement learning |
| Interactive service layer | Planning query, API interface | Geographic information system (GIS), RESTful API |

2.2 Mathematical framework of airspace representation

Treat the urban low-altitude airspace as a bounded three-dimensional region. The airspace state over this region can then be modeled as a time-varying field.

The core task of the digital twin is to estimate and update this field in real time, so that the planning algorithm works with the most accurate environmental state at the current moment.
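As a minimal sketch of what such a time-varying field can look like in code, the block below combines a static occupancy layer with moving obstacles and answers the query “is point x occupied at time t?”. The class names, the sparse voxel set, the constant-velocity motion model, and the spherical safety radius are all illustrative assumptions rather than part of any specific platform.

```python
import math
from dataclasses import dataclass


@dataclass
class MovingObstacle:
    position: tuple  # (x, y, z) at t = 0, in metres
    velocity: tuple  # constant velocity in m/s (assumed motion model)
    radius: float    # spherical safety radius in metres (assumption)

    def centre_at(self, t):
        return tuple(p + v * t for p, v in zip(self.position, self.velocity))


class AirspaceField:
    """Hypothetical time-varying occupancy field: static voxels plus moving spheres."""

    def __init__(self, occupied_voxels, voxel_size, obstacles):
        self.occupied_voxels = set(occupied_voxels)  # static layer: integer voxel indices
        self.voxel_size = voxel_size                 # voxel edge length in metres
        self.obstacles = list(obstacles)             # dynamic layer

    def occupied(self, x, t):
        # Static layer: look up the voxel that contains x.
        voxel = tuple(math.floor(c / self.voxel_size) for c in x)
        if voxel in self.occupied_voxels:
            return True
        # Dynamic layer: check every obstacle at its predicted position for time t.
        return any(math.dist(x, ob.centre_at(t)) <= ob.radius for ob in self.obstacles)


field = AirspaceField(
    occupied_voxels={(2, 2, 0), (2, 2, 1)},  # a two-voxel "building"
    voxel_size=10.0,
    obstacles=[MovingObstacle((50.0, 50.0, 50.0), (1.0, 0.0, 0.0), 5.0)],
)
print(field.occupied((25.0, 25.0, 15.0), t=0.0))   # True: inside the building
print(field.occupied((60.0, 50.0, 50.0), t=0.0))   # False: obstacle is still 10 m away
print(field.occupied((60.0, 50.0, 50.0), t=10.0))  # True: obstacle has arrived
```

The same point can be free at one moment and occupied at another, which is the property a planner needs the twin to expose.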


3. Application of neural rendering in spatial reconstruction

3.1 City-NeRF: Neural reconstruction of large-scale urban scenes

City-NeRF (Xiangli et al., ECCV 2022, published as BungeeNeRF) proposes a multi-view neural rendering framework for city-scale scenes, reconstructing them through progressive mapping and local optimization strategies.

City-NeRF demonstrated that neural rendering can model large-scale 3D scenes such as urban canyons, but the original implementation requires dozens of hours of offline optimization and cannot meet the needs of online UAV planning.

3.2 Real-time airspace modeling based on 3DGS

The incremental update capability of 3D Gaussian Splatting (3DGS) makes it a natural fit for dynamic airspace reconstruction from UAVs. Gaussian-Urban (an idea that extends 3DGS to urban scenes) models buildings, trees, road signs, and other scene elements as independent groups of Gaussians, and supports frame-by-frame incremental insertion and deletion.

Key designs include:

1. Dynamic Gaussian life cycle management: areas newly observed by the UAV spawn new Gaussians (split operation), and redundant Gaussians that have not been updated for a long time are pruned.
2. Chunk management: the city is divided into spatial blocks, each maintaining an independent set of Gaussians; the UAV dynamically loads adjacent blocks as it moves.
3. GPU-accelerated pipeline: Gaussian projection, depth sorting, and alpha compositing are parallelized in CUDA, reaching a measured rendering frame rate of 15 FPS on Jetson Orin.
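The life cycle and chunk management ideas above can be sketched in a few lines. This is an illustrative data structure only, not the Gaussian-Urban implementation: the class name, the age-based pruning rule, and the choice to store just a last-seen frame per Gaussian are assumptions. A real store would also hold each Gaussian’s mean, covariance, opacity, and color.

```python
import math
from collections import defaultdict


class ChunkedGaussianMap:
    """Hypothetical chunked store for Gaussians with age-based pruning."""

    def __init__(self, chunk_size, max_age):
        self.chunk_size = chunk_size     # chunk edge length in metres (caller's choice)
        self.max_age = max_age           # frames a Gaussian may go unobserved
        self.chunks = defaultdict(dict)  # chunk key -> {gaussian id: last-seen frame}
        self._chunk_of = {}              # gaussian id -> chunk key
        self._next_id = 0

    def chunk_key(self, position):
        return tuple(math.floor(c / self.chunk_size) for c in position)

    def insert(self, position, frame):
        """Register a newly observed Gaussian and return its id."""
        gid = self._next_id
        self._next_id += 1
        key = self.chunk_key(position)
        self.chunks[key][gid] = frame
        self._chunk_of[gid] = key
        return gid

    def mark_seen(self, gid, frame):
        self.chunks[self._chunk_of[gid]][gid] = frame

    def prune(self, frame):
        """Drop Gaussians not seen for more than max_age frames; return the count."""
        removed = 0
        for key in list(self.chunks):
            chunk = self.chunks[key]
            stale = [gid for gid, seen in chunk.items() if frame - seen > self.max_age]
            for gid in stale:
                del chunk[gid]
                del self._chunk_of[gid]
            removed += len(stale)
            if not chunk:
                del self.chunks[key]  # release empty chunks
        return removed

    def nearby_chunks(self, uav_position, radius=1):
        """Keys of non-empty chunks within `radius` chunks of the UAV."""
        cx, cy, cz = self.chunk_key(uav_position)
        offsets = range(-radius, radius + 1)
        return [
            (cx + dx, cy + dy, cz + dz)
            for dx in offsets for dy in offsets for dz in offsets
            if (cx + dx, cy + dy, cz + dz) in self.chunks
        ]


gmap = ChunkedGaussianMap(chunk_size=100.0, max_age=30)
a = gmap.insert((10.0, 10.0, 20.0), frame=0)
b = gmap.insert((150.0, 10.0, 20.0), frame=0)
c = gmap.insert((950.0, 10.0, 20.0), frame=0)
gmap.mark_seen(b, frame=40)                      # only b is observed again
removed = gmap.prune(frame=40)                   # a and c are stale
loaded = gmap.nearby_chunks((50.0, 50.0, 50.0))  # chunks to keep on the GPU
print(removed, loaded)
```

Keying chunks by integer grid coordinates keeps neighbor lookup constant-time per chunk, which matters when the UAV crosses block boundaries frequently.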

3.3 Integration with BIM/city model

Purely data-driven neural rendering methods suffer from insufficient geometric accuracy: the geometry learned by an MLP or a set of Gaussians is “rendering correct” rather than “measurement accurate”, which can introduce dangerous errors in planning scenarios that require precise collision boundaries.

Neural-geometric fusion schemes emerged to address this problem.


4. Dynamic airspace digital twin: real-time perception fusion and update

4.1 Dynamic element modeling

Urban low-altitude airspace contains many dynamic elements: other drones in flight, birds, kites, temporary crane operations at construction sites, and so on. Static neural fields cannot capture these moving targets, so a four-dimensional (4D) spatio-temporal representation is needed.

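The D-NeRF framework (Pumarola et al., CVPR 2021) introduces the time dimension into the neural radiance field, modeled as:

$$ \mathcal{F}_\theta: (\mathbf{x}, t, \mathbf{d}) \rightarrow (\mathbf{c}, \sigma), \quad \mathbf{x}' = \mathbf{x} + \Delta \mathbf{x}(t) $$

For tracked dynamic targets, the state covariance is propagated with the Kalman prediction step, where $\mathbf{F}$ is the state transition matrix and $\mathbf{Q}$ is the process noise covariance:

$$ \mathbf{P}_t = \mathbf{F}\mathbf{P}_{t-1}\mathbf{F}^\top + \mathbf{Q}, \quad \mathbf{Q} = \sigma_w^2 \mathbf{I} $$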
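The covariance propagation above is the prediction step of a Kalman filter. A minimal sketch for one axis of a tracked target follows, assuming a constant-velocity state (position, velocity); the state layout, the function names, and the numeric values are illustrative assumptions, and only the covariance formula itself comes from the text.

```python
def matmul(a, b):
    """Multiply two small matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def transpose(a):
    return [list(row) for row in zip(*a)]


def predict(state, cov, dt, sigma_w):
    """One Kalman prediction step for a 1D constant-velocity target.

    state: [position, velocity]; cov: 2x2 covariance P_{t-1}.
    Returns the predicted state and P_t = F P_{t-1} F^T + sigma_w^2 I.
    """
    F = [[1.0, dt], [0.0, 1.0]]  # constant-velocity transition (assumed model)
    new_state = [sum(F[i][k] * state[k] for k in range(2)) for i in range(2)]
    fpft = matmul(matmul(F, cov), transpose(F))
    q = sigma_w ** 2             # Q = sigma_w^2 * I
    new_cov = [[fpft[i][j] + (q if i == j else 0.0) for j in range(2)] for i in range(2)]
    return new_state, new_cov


# Target at 0 m moving at 2 m/s, unit covariance, predicted 0.5 s ahead.
x_pred, P_pred = predict([0.0, 2.0], [[1.0, 0.0], [0.0, 1.0]], dt=0.5, sigma_w=0.1)
print(x_pred)
print(P_pred)
```

The position variance grows from 1.0 to about 1.26 in this example: uncertainty in velocity leaks into position over the prediction interval, and the process noise adds on top. That growth is what tells the planner to widen its safety margin around targets that have not been observed recently.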

4.2 Multi-source sensing fusion

A single sensor cannot provide complete airspace situational awareness. A dynamic airspace digital twin needs to integrate:

| Sensor | Advantages | Limitations | Fusion method |
|--------|------|------|---------|
| Vision camera | Rich texture, low cost | Fails at night or against backlight, scale ambiguity | Depth recovery via SfM |
| LiDAR | Accurate ranging, unaffected by lighting | Sparse, expensive | Point cloud registration |
| Millimeter-wave radar | Penetrates haze, measures speed directly | Noisy, low resolution | Fusion with vision/LiDAR point clouds |
| ADS-B | Direct acquisition of air traffic information | Relies on broadcasts from the other aircraft's equipment | Location annotation |
| Acoustic array | Detects unknown sound sources | Subject to interference from urban noise | Sound source localization |

Neural field as a multi-modal fusion center: each sensor's data serves as an input observation of the neural field, and the field's density and color distributions are constrained through the volume rendering equation. The key advantage is that a neural field can naturally fuse data collected by different sensors, from different viewpoints, and at different times, without explicit point cloud registration or feature matching.

4.3 Real-time update pipeline

The real-time update pipeline of the dynamic airspace digital twin is designed as follows:

1. Data collection: the UAV's forward-looking and downward-looking cameras continuously capture image sequences.
2. Pose estimation: the camera pose is obtained through visual-inertial odometry (VIO) or GPS/IMU fusion.
3. Incremental mapping: new observations are passed to the neural field optimizer, which updates the local Gaussian set or MLP weights.
4. Dynamic detection: semantic segmentation is run on each new frame to separate static background from dynamic foreground; the dynamic foreground is modeled independently as moving Gaussians or a 4D NeRF.
5. Status publishing: the current airspace state is published to the planner via a ROS 2 topic or a WebSocket API.

Key performance indicators: end-to-end update latency $< 100\text{ms}$, spatial coverage $> 95\%$ (relative to the UAV flight corridor area), geometric error $< 10\text{cm}$ (@ $1\sigma$).


5. End-to-end planning: digital twin → trajectory optimization

5.1 Safe corridor extraction

Extracting safe corridors from the neural airspace representation is the key step connecting the digital twin to trajectory planning. Traditional methods extract free-space bounding boxes from a voxel map; a neural field representation requires new extraction methods:

- Boundary detection based on density gradient: the density gradient of the neural field $\nabla_\mathbf{x}\sigma(\mathbf{x})$ is largest at object surfaces and can be used to locate collision boundaries.
- Isosurface extraction with Marching Cubes: threshold the density field $\sigma(\mathbf{x})$ into a binary occupancy field, then use the Marching Cubes algorithm to extract isosurfaces as safe corridor boundaries.
- Gaussian-based collision detection: each Gaussian ellipsoid in 3DGS directly yields an approximate SDF, so trajectory planning only needs to check for collisions against the Gaussian set.

5.2 Trajectory optimization objective function

The objective function for trajectory optimization in the digital twin airspace is designed as:

$$ \min_{\mathbf{p}(t)} J = \underbrace{w_1 \int_0^T \|\mathbf{p}(t)\|^2 dt}_{\text{Trajectory smoothing}} + \underbrace{w_2 \int_0^T \sigma(\mathbf{p}(t)) dt}_{\text{Collision avoidance}} + \underbrace{w_3 T}_{\text{Flight time}} + \underbrace{w_4 \sum_{i=1}^{N} \phi(d_i)}_{\text{Dynamic obstacles}} $$

Here $d_i$ is the distance to the $i$-th dynamic obstacle and $\phi$ is an exponential obstacle avoidance potential function.

The key inputs that the digital twin provides to this optimization problem are an accurate estimate of the density field $\sigma$ and real-time position predictions for the dynamic obstacles.
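To make the four terms concrete, the sketch below evaluates a discretized version of $J$ on a list of waypoints. Several choices are assumptions made for illustration: the smoothing term is computed on finite-difference acceleration, $d_i$ is taken as the minimum distance between the trajectory and obstacle $i$ at matching time samples, the potential is $\phi(d) = e^{-d/d_0}$, and `density` is a hand-written stand-in for a density query against the neural field.

```python
import math


def trajectory_cost(waypoints, dt, density, obstacle_tracks, weights, d0=5.0):
    """Discretized J for waypoints sampled every dt seconds.

    obstacle_tracks[i][k] is the predicted position of obstacle i at sample k,
    so every track must have the same length as `waypoints`.
    """
    w1, w2, w3, w4 = weights

    # Smoothing: squared finite-difference acceleration (assumed smoothness measure).
    smooth = 0.0
    for a, b, c in zip(waypoints, waypoints[1:], waypoints[2:]):
        acc = [(pa - 2.0 * pb + pc) / dt ** 2 for pa, pb, pc in zip(a, b, c)]
        smooth += sum(v * v for v in acc) * dt

    # Collision avoidance: density accumulated along the path.
    collision = sum(density(p) for p in waypoints) * dt

    # Flight time.
    total_time = dt * (len(waypoints) - 1)

    # Dynamic obstacles: exponential potential on the closest approach.
    dynamic = 0.0
    for track in obstacle_tracks:
        d_i = min(math.dist(p, q) for p, q in zip(waypoints, track))
        dynamic += math.exp(-d_i / d0)

    return w1 * smooth + w2 * collision + w3 * total_time + w4 * dynamic


def density(p):
    """Stand-in for a neural field query: one box-shaped obstacle."""
    x, y, _ = p
    return 1.0 if 15.0 <= x <= 25.0 and abs(y) <= 5.0 else 0.0


weights = (0.001, 10.0, 0.1, 1.0)
straight = [(0.0, 0.0, 50.0), (10.0, 0.0, 50.0), (20.0, 0.0, 50.0), (30.0, 0.0, 50.0)]
detour = [(0.0, 0.0, 50.0), (10.0, 10.0, 50.0), (20.0, 10.0, 50.0), (30.0, 0.0, 50.0)]

cost_straight = trajectory_cost(straight, 1.0, density, [], weights)
cost_detour = trajectory_cost(detour, 1.0, density, [], weights)
print(cost_straight > cost_detour)  # True: the detour avoids the dense region
```

With these weights the detour pays a small smoothness penalty but avoids the collision term entirely, so it scores lower. Shifting weight toward the smoothing term would eventually reverse that ranking, which is the trade-off the weights are there to control.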

5.3 Verification and Simulation

The digital twin platform allows planned trajectories to be verified safely in simulation before they are deployed to a real UAV.
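One simple form of such a pre-flight check is to sample the planned trajectory and test every sample against the Gaussians of the twin. The sketch below is a hedged illustration: it assumes axis-aligned Gaussians (per-axis standard deviations instead of a full covariance), uses the Mahalanobis distance as the clearance measure, and picks a 3-sigma threshold. The function names and all numbers are hypothetical.

```python
import math


def mahalanobis(point, mean, scales):
    """Distance from `point` to a Gaussian in units of its per-axis std. deviation."""
    return math.sqrt(sum(((p - m) / s) ** 2 for p, m, s in zip(point, mean, scales)))


def verify_trajectory(samples, gaussians, k_sigma=3.0):
    """Return the indices of samples that lie within k_sigma of any Gaussian."""
    violations = []
    for idx, p in enumerate(samples):
        if any(mahalanobis(p, g["mean"], g["scales"]) < k_sigma for g in gaussians):
            violations.append(idx)
    return violations


# One Gaussian standing in for a tall, thin structure.
gaussians = [{"mean": (20.0, 0.0, 50.0), "scales": (2.0, 2.0, 10.0)}]

planned = [(0.0, 0.0, 50.0), (10.0, 0.0, 50.0), (20.0, 0.0, 50.0), (30.0, 0.0, 50.0)]
replanned = [(0.0, 0.0, 50.0), (10.0, 10.0, 50.0), (20.0, 10.0, 50.0), (30.0, 0.0, 50.0)]

print(verify_trajectory(planned, gaussians))    # [2]: the third sample hits the structure
print(verify_trajectory(replanned, gaussians))  # []: the detour clears it
```

Returning the offending sample indices rather than a single pass/fail flag lets the planner repair only the affected segment.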


6.1 City-level digital twin platform

AirSim City Twin (Microsoft, 2017) is one of the earliest open-source UAV simulation platforms. It provides a photorealistic urban environment and supports simulation of RGB cameras, LiDAR, IMU, and other sensors. AirSim’s digital twin is built on Unreal Engine and has realistic textures but limited geometric accuracy.

OnePlus City Digital Twin (inspired by large-scale urban scene reconstruction research) uses photogrammetry + LiDAR fusion to build digital twin models of multiple Chinese cities, and supports urban planning and UAV simulation.

NVIDIA Omniverse Replicator provides a unified platform for data synthesis and digital twin construction, supporting urban scene representation and neural rendering acceleration based on USD (Universal Scene Description).

6.2 UAV airspace modeling research

| Research | Year | Methodology | Coverage | Update frequency |
|------|------|------|------|------|
| City-NeRF | 2022 | Multi-view NeRF | City blocks | Static |
| Gaussian-Urban | 2023 | 3DGS | Block level | Real time |
| Instant-NGP | 2022 | Hash encoding | Indoor/small scene | Real time |
| SUDS | 2023 | Neural SLAM | City level | Online |
| Rubble-Fuse | 2024 | Multi-modal fusion | Urban area | Quasi-real-time |

7. Challenges and future directions

7.1 Current main challenges

Computing resource bottleneck: a city-level airspace digital twin contains billions of voxels or Gaussians, far exceeding the capacity of a single GPU. Chunking strategies introduce new issues such as seam handling between blocks and cross-block trajectory planning.

Tension between timeliness and accuracy: neural field optimization needs sufficient observation data to converge, but urban airspace conditions change quickly (temporary construction, event-related restrictions), so the digital twin may lag behind reality.

Multi-resolution consistency: accuracy requirements differ with altitude. Near the ground, centimeter-level accuracy is needed for obstacle avoidance, while higher-altitude airspace mainly needs situational awareness. Existing neural field methods struggle to handle these multi-resolution requirements within a single representation.

7.2 Future development directions

Neural-geometry hybrid representation: combine the advantages of explicit voxels/grids (efficient geometric queries) and implicit neural fields (photorealism) to develop a representation of urban airspace that is both geometrically accurate and visually realistic.

Large language model + airspace digital twin: use multi-modal large models such as GPT-4V to understand airspace semantics and control rules, and inject natural language constraints into the digital twin planning system to achieve “voice control planning”.

Crowdsourced digital twin update: use real-time observation data from large numbers of UAVs to update the city’s digital twin in a distributed manner through federated learning, achieving “crowdsourced mapping”.


8. Summary

Digital twins provide a high-fidelity, simulation-ready, and verifiable digital foundation for urban low-altitude UAV planning. Neural rendering significantly improves the construction efficiency and realism of airspace digital twins through differentiable optimization, incremental updates, and multi-modal fusion.

However, a gap remains between a “static city model” and a “dynamic real-time twin”. The core challenges are efficient large-scale representation, real-time modeling of dynamic elements, and multi-resolution consistency. With continued advances in 3DGS, NeRF, and large language models, urban low-altitude digital twins are expected to move from research prototypes to actual deployment in the next 3-5 years.



*This article is the third extended chapter in a series of articles on urban low-altitude drone route planning.*