Hi! I am Kaizhen Tan (Chinese name: 谭楷蓁). I am currently a master’s student in Artificial Intelligence at Carnegie Mellon University. I received my bachelor’s degree in Information Systems from Tongji University, where I built a solid foundation in programming, data analysis, and machine learning, complemented by interdisciplinary training in business and organizational systems.

My research sits at the intersection of Urban Management and Spatial Intelligence. I aim to build systems that are both spatially intelligent and socially aware, providing computational solutions to make cities more adaptive, inclusive, and human-centric.

Driven by the vision of harmonizing artificial intelligence with urban ecosystems, my research agenda is organized around four key areas:

🤖 1. Embodied Urbanism & Collaborative Governance

Core Question: How should embodied intelligent systems move through cities, and how can humans govern and collaborate with them at scale?

  • Robot-Friendly Urban Space: Redesign streets and building interiors for robot operations, setting standards for siting, infrastructure, protocols, and responsibility boundaries.
  • Embodied Navigation: Vision-based navigation and task execution in real urban scenes, with accessibility-aware routing on ground-robot and drone platforms.
  • Low-Altitude Governance: Derive operable air corridors from demand signals, then validate them in 3D city models against privacy, noise, crowd, and other constraints.
  • Emerging Urban Devices & Deployment: Study city-scale rollout of robots, low-altitude systems, wearables, and BCI-like devices, focusing on deployment bottlenecks.
  • Social Acceptance & Ethics: Model how different groups perceive risks and capabilities, guiding interaction design, rollout strategy, and public communication.

🎨 2. Social Sensing & Human-Environment Interaction

Core Question: How can local narratives, culture, and human behavior be measured and integrated into models to support urban renewal and governance?

  • AI-Enhanced Geospatial Analysis: Use large-scale spatial analytics to link urban form, environment, and mobility with human behavior and public service outcomes.
  • Pedestrian-Oriented Design: Analyze walking experiences and accessibility barriers, integrating the mobility needs of people with disabilities into urban governance decisions.
  • Urban Perception & Visual Aesthetics: Quantify streetscape perception and neighborhood imagery to inform design choices and regeneration priorities.
  • Socio-Cultural Computing: Incorporate place-based narratives into LLM-enabled applications for communication, interaction, and inclusive governance.

🏙️ 3. Self-Evolving Urban Digital Twins & Agents

Core Question: How can we build a self-evolving urban digital twin that stays continuously updated, hosts agents, and supports sustainable and equitable city governance?

  • Urban Foundation Models: Fuse remote sensing, street-level data, trajectories, IoT, text, and graphs into unified urban representations.
  • Measurement & Sensing Loops: Develop scalable metrics and updating pipelines, using robots, drones, and wearables as new data sources for continuous urban sensing.
  • Localization & Mapping: Advance geo-localization and semantic SLAM across point clouds, meshes, and 3D Gaussian representations for interactive 3D cities.
  • Urban Agents: Build task agents for planning and public services, including map-LLM systems, spatial RAG, policy QA, and travel assistance.
  • Policy Sandbox & Systems: Use the twin for what-if simulation, risk assessment, and execution checks, supported by efficient retrieval, caching, and rendering.

🚀 4. Spatial Intelligence & Foundation World Models

Core Question: How can world models support reliable spatial reasoning and actionable decision-making for physical agents?

  • World Models & Architecture: Study generative, predictive, and representation-learning paradigms for forecasting and planning in the physical world.
  • Embodied Representations: Unify geometry, semantics, physics, and action into shared representations, with a path toward richer embodied modalities.
  • Long-term Memory & Self-Evolution: Build lifelong learning mechanisms with stability, forgetting control, and safety constraints for long-horizon autonomy.
  • Neural-Inspired Reasoning: Improve interpretability and robustness of spatial reasoning, exploring 3D-aware encoders and alternatives to standard transformers.

🔥 News

  • 2026.01: 🎉 Our abstract, co-authored with Prof. Fan Zhang, was accepted to the XXV ISPRS Congress 2026. See you in Toronto!
  • 2025.12: 🎉 Our paper, led by my senior labmate Dr. Weihua Huan and co-authored with Prof. Wei Huang at Tongji University, was accepted by GIScience & Remote Sensing. I am honored to contribute as second author; big congratulations to Dr. Huan!
  • 2025.10: 🔭 Joined Prof. Yu Liu and Prof. Fan Zhang’s team at Peking University as a remote research assistant.
  • 2025.08: 🎉 Delivered an oral presentation at Hong Kong Polytechnic University after our paper was accepted to the Global Smart Cities Summit cum The 4th International Conference on Urban Informatics (GSCS & ICUI 2025).
  • 2025.07: 🎉 My undergraduate thesis was accepted by the 7th Asia Conference on Machine Learning and Computing (ACMLC 2025).
  • 2025.06: 🎓 Graduated from Tongji University—grateful for the journey and excited to continue my studies at CMU.
  • 2025.04: 🔭 Completed the SITP project under the supervision of Prof. Yujia Zhai in the College of Architecture and Urban Planning.
  • 2025.01: 💼 Joined Shanghai Artificial Intelligence Laboratory as an AI Product Manager Intern.
  • 2024.09: 🌏 Conducted research at A*STAR in Singapore under the supervision of Dr. Yicheng Zhang and Dr. Sheng Zhang.
  • 2024.04: 🔭 Began my academic journey at Prof. Wei Huang’s lab in the College of Surveying and Geo-Informatics, Tongji University.

📖 Education

  • Carnegie Mellon University, 2025.08 – 2026.08, M.S. in Artificial Intelligence Systems Management
  • Tongji University, 2021.09 – 2025.06, B.Mgt. in Information Management and Information Systems

💼 Experience

🔭 Research Experience

  • 2025.10 - 2026.04, Research Assistant, Institute of Remote Sensing and Geographic Information System, Peking University, China
  • 2024.09 - 2024.12, Research Officer, A*STAR Institute for Infocomm Research, Singapore
  • 2024.04 - 2025.04, Research Assistant, College of Architecture and Urban Planning, Tongji University, China
  • 2024.04 - 2024.12, Research Assistant, College of Surveying and Geo-Informatics, Tongji University, China

💻 Professional Experience

📝 Publications

XXV ISPRS Congress

UrbanVGGT: Scalable Sidewalk Width Estimation from Street View Images

Kaizhen Tan, Fan Zhang

[abstract]

  • Accepted at XXV ISPRS Congress 2026
  • Leveraged street-view imagery and VGGT-based 3D reconstruction to estimate metrically scaled sidewalk widths, built the SV-SideWidth dataset, and filled OpenStreetMap gaps to support equitable assessment of pedestrian infrastructure.

Computational Urban Science

Decoding Tourist Perception in Historic Urban Quarters with Multimodal Social Media Data: An AI-Based Framework and Evidence from Shanghai

Kaizhen Tan, Yufan Wu, Yuxuan Liu, Haoran Zeng

[arXiv] [slides]

  • Under Review at Computational Urban Science
  • Developed an AI-powered multimodal framework to analyze tourist perception in historic Shanghai quarters, integrating image segmentation, color theme analysis, and sentiment mining for heritage-informed urban planning.

ACMLC 2025

Multimodal Deep Learning for Modeling Air Traffic Controllers Command Lifecycle and Workload Prediction in Terminal Airspace

Kaizhen Tan

[arXiv] [slides] [github]

  • Published in 7th Asia Conference on Machine Learning and Computing (ACMLC 2025)
  • Designed a multimodal deep learning framework linking ATCO voice commands with aircraft trajectories to model workload dynamics, enabling intelligent command generation and scheduling support.

GIScience & Remote Sensing

A Spatiotemporal Adaptive Local Search Method for Tracking Congestion Propagation in Dynamic Networks

Weihua Huan, Kaizhen Tan, Xintao Liu, Shoujun Jia, Shijun Lu, Jing Zhang, Wei Huang

[paper]

  • Published in GIScience & Remote Sensing (JCR Q1; IF = 6.9).
  • Proposed a spatiotemporal adaptive local search (STALS) method combining dynamic graph learning and spatial analytics to model and mitigate large-scale urban traffic congestion propagation.

🔬 Projects


BlindNav: YOLO+LLM for Real-Time Navigation Assistance for Blind Users

Kaizhen Tan, Yufan Wang, Yixiao Li, Hanzhe Hong, Nicole Lyu

[report] [github]

  • BlindNav is a real-time, camera-based navigation assistant that uses YOLO for street-scene detection and a local LLM to turn those signals into concise voice guidance for blind and low-vision pedestrians.
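
  For a sense of the pipeline shape, here is a minimal illustrative sketch of the detect-describe-speak loop, assuming the ultralytics YOLO, OpenCV, and pyttsx3 libraries. It is not the BlindNav code itself: the yolov8n.pt weights, the summarize_detections helper, and the query_local_llm stub are placeholders standing in for the project's actual street-scene detector and local LLM client.

  ```python
  # Minimal, illustrative sketch of a YOLO -> local LLM -> text-to-speech loop.
  # Not the BlindNav implementation: the detector weights and query_local_llm()
  # below are assumptions for demonstration only.
  import cv2                      # camera capture
  import pyttsx3                  # offline text-to-speech
  from ultralytics import YOLO    # object detection

  detector = YOLO("yolov8n.pt")   # generic pretrained detector (stand-in for a street-scene model)
  tts = pyttsx3.init()


  def summarize_detections(result) -> str:
      """Turn one YOLO result into a short textual scene description."""
      if result.boxes is None or len(result.boxes) == 0:
          return "no obstacles detected"
      labels = [result.names[int(c)] for c in result.boxes.cls]
      return ", ".join(sorted(set(labels)))


  def query_local_llm(scene: str) -> str:
      """Hypothetical stand-in for a locally hosted LLM client that would turn
      the scene description into one short walking instruction."""
      return f"Detected ahead: {scene}. Please proceed with caution."


  cap = cv2.VideoCapture(0)       # default camera
  while cap.isOpened():
      ok, frame = cap.read()
      if not ok:
          break
      result = detector(frame, verbose=False)[0]   # detect objects in the current frame
      guidance = query_local_llm(summarize_detections(result))
      tts.say(guidance)           # speak the guidance aloud
      tts.runAndWait()
  cap.release()
  ```

  A local LLM is presumably used so that imagery stays on-device and latency remains low, both of which matter for a real-time assistive tool.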

💬 Presentations

  • 2026.07 - XXV ISPRS Congress 2026
    UrbanVGGT: Scalable Sidewalk Width Estimation from Street View Images
    Toronto, Canada

  • 2025.08 - Global Smart Cities Summit cum The 4th International Conference on Urban Informatics (GSCS & ICUI 2025)
    A Multidimensional AI-powered Framework for Analyzing Tourist Perception in Historic Urban Quarters: A Case Study in Shanghai
    Hong Kong Polytechnic University (PolyU), Hong Kong SAR, China

  • 2025.07 - 7th Asia Conference on Machine Learning and Computing (ACMLC 2025)
    Multimodal Deep Learning for Modeling Air Traffic Controllers Command Lifecycle and Workload Prediction in Terminal Airspace
    Hong Kong SAR, China