Urban Microclimate Forecasting System

This is a research project I led as part of the Georgia Tech VIP-SMUR (Vertically Integrated Projects — Sustainable Microclimate Urban Research) group. The goal: predict temperature and relative humidity at hyper-local resolution — not city-scale or even block-scale, but at roughly 6-meter spacing across the entire Georgia Tech campus (~3.5 km²), using nothing more than 16 weather stations and publicly available geospatial data.

📄 In Preparation — Building and Environment (SCI Q1, IF ≈ 7.1)

Georgia Tech VIP-SMUR Project, Fall 2025. I led the model architecture design, training pipeline, inference system, and geospatial visualization pipeline — responsible for modules 3 and 4 of the project.

Temp RMSE: 0.43°C
Temp R²: 0.9928
RH RMSE: 1.28%
Grid Points: 100,283
Model Parameters: 337,742
PyTorch · Multi-Scale LSTM · Multi-Head Attention (4 heads) · Gated Residual Networks · Regression Kriging · PyKrige · GeoPandas · OSMnx · Random Forest · SLURM / A100 GPU

Model Architecture

The architecture combines four components chosen specifically for the microclimate forecasting problem: adaptive feature gating, learned input weighting, parallel temporal processing at two scales, and multi-head attention for long-range dependencies.

Core Components

Gated Residual Networks (GRN)
Two FC layers → sigmoid gate → learned interpolation between transformed and original input. Stabilizes training, maintains gradient flow in deep networks. Layer normalization applied after gated sum.
Variable Selection Network
Time-averaged input → GRN → softmax feature weights → applied element-wise to every timestep. The model learns solar altitude matters far more than minute-of-hour — adaptively, per sample.
Multi-Head Self-Attention (4 heads)
After LSTM processing, models long-range temporal dependencies across the output sequence. Attends to specific historical events (e.g., temperature spike 6 hours ago) beyond what LSTM hidden state captures.
Learnable Position Encoding
Temporal embeddings providing the attention layer with sequence position context independent of the LSTM's hidden state.
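
The gating and variable-selection components above can be sketched in PyTorch as follows. This is a minimal illustration rather than the project's actual code: the class names, the hidden size of 32, and the ELU activation are assumptions, since the text does not state them.

```python
import torch
import torch.nn as nn


class GatedResidualNetwork(nn.Module):
    """Sketch of a GRN: two FC layers, a sigmoid gate, then LayerNorm."""

    def __init__(self, dim: int, hidden_dim: int):
        super().__init__()
        self.fc1 = nn.Linear(dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, dim)
        self.gate = nn.Linear(dim, dim)  # produces the sigmoid gate
        self.norm = nn.LayerNorm(dim)
        self.act = nn.ELU()  # assumption: the activation is not stated in the text

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.fc2(self.act(self.fc1(x)))      # transformed input
        g = torch.sigmoid(self.gate(x))          # gate values in (0, 1)
        return self.norm(g * h + (1.0 - g) * x)  # gated sum, then layer norm


class VariableSelection(nn.Module):
    """Sketch: time-averaged input -> GRN -> softmax weights per feature."""

    def __init__(self, n_features: int, hidden_dim: int):
        super().__init__()
        self.grn = GatedResidualNetwork(n_features, hidden_dim)

    def forward(self, x: torch.Tensor):
        # x has shape (batch, timesteps, features)
        weights = torch.softmax(self.grn(x.mean(dim=1)), dim=-1)  # (batch, features)
        return x * weights.unsqueeze(1), weights  # same weights at every timestep


vsn = VariableSelection(n_features=12, hidden_dim=32)
out, weights = vsn(torch.randn(8, 144, 12))
```

The returned `weights` tensor holds one weight per feature for each sample, which is what makes per-sample inspection of feature relevance possible.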

Key Design Decisions (with Rationale)

1. Multi-scale parallel LSTM branches. Rather than a single sequential LSTM, we use two parallel branches: a short-term branch (1 LSTM layer, hidden_dim/2) and a long-term branch (2 LSTM layers, hidden_dim/2), concatenated back to the full dimension. Rationale: microclimate evolution has two overlapping timescales — rapid fluctuations (a cloud passing, a building shadow) on minutes, and diurnal cycles (sunrise heating, nighttime cooling) on hours. A single LSTM collapses both; parallel branches preserve them separately.
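
A minimal PyTorch sketch of the two parallel branches, assuming a hidden size of 64 (the actual value is not given here) and a hypothetical class name:

```python
import torch
import torch.nn as nn


class MultiScaleLSTM(nn.Module):
    """Sketch: short-term (1 layer) and long-term (2 layers) LSTM branches."""

    def __init__(self, input_dim: int, hidden_dim: int):
        super().__init__()
        half = hidden_dim // 2
        self.short = nn.LSTM(input_dim, half, num_layers=1, batch_first=True)
        self.long = nn.LSTM(input_dim, half, num_layers=2, batch_first=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        short_out, _ = self.short(x)  # (batch, timesteps, hidden_dim / 2)
        long_out, _ = self.long(x)    # (batch, timesteps, hidden_dim / 2)
        # Concatenate along the feature axis to restore the full dimension.
        return torch.cat([short_out, long_out], dim=-1)


branches = MultiScaleLSTM(input_dim=12, hidden_dim=64)
seq = branches(torch.randn(4, 144, 12))
```

Both branches read the same input sequence; only their depth differs, so the output keeps one feature vector per timestep for the attention layer that follows.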

2. Single-step point prediction. Given the past 24 hours (144 timesteps × 10 min), the model predicts one timestep ahead (+10 min). Longer forecasts roll forward iteratively, feeding each prediction back as input. This keeps every prediction inside the horizon the model was trained on and produces cleaner inputs for downstream Regression Kriging (no multi-horizon uncertainty propagation).
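
The iterative roll can be sketched as below. The column layout (temperature and RH in columns 0 and 1, followed by 10 time and solar features) and the function names are assumptions for illustration. The time and solar features of future steps are computable from the timestamp alone, so only the two weather columns come from the model.

```python
import numpy as np


def rollout(predict_next, window, future_exog, steps):
    """Roll a one-step model forward for `steps` timesteps.

    window:      (L, F) array of past inputs; columns 0-1 are temperature
                 and RH (assumed layout).
    future_exog: (steps, F - 2) time/solar features for each future step.
    """
    window = window.copy()
    preds = []
    for k in range(steps):
        y = np.asarray(predict_next(window))           # predicted (temp, RH)
        preds.append(y)
        new_row = np.concatenate([y, future_exog[k]])  # prediction + known features
        window = np.vstack([window[1:], new_row])      # drop oldest, append newest
    return np.array(preds)


# Toy stand-in for the trained network: persistence (repeat the last observation).
def persistence(w):
    return w[-1, :2]


window = np.zeros((144, 12))
window[-1, :2] = [25.0, 60.0]
preds = rollout(persistence, window, np.zeros((6, 10)), steps=6)  # one hour ahead
```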

3. No station identifiers. No station IDs anywhere in the input. This forces the model to learn location-independent patterns (physics-based features), enabling generalization to unseen stations — validated by our held-out station protocol.

Input Features (12 dimensions, physics-aware)

| Category | Features | Encoding |
|---|---|---|
| Weather observations | Temperature, Relative Humidity | Raw (autoregressive signal) |
| Solar geometry | Altitude angle, Azimuth angle | Sin/cos trigonometric |
| Hour of day | 24-hour diurnal cycle | Sin/cos (period = 24h) |
| Day of year | 365-day seasonal cycle | Sin/cos (period = 365d) |
| Minute | Sub-hourly resolution | Sin/cos (period = 60 min) |

All temporal features encoded as sin/cos pairs to avoid discontinuities (hour 23 → hour 0). Solar angles encoded trigonometrically so the model receives physically meaningful sun-position signals rather than raw degree values.
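
A small example of why the sin/cos encoding removes the wrap-around discontinuity: in encoded space, hour 23 and hour 0 are exactly as far apart as hour 11 and hour 12. The helper names are illustrative only.

```python
import math


def cyclical(value, period):
    """Map a periodic value onto the unit circle as a (sin, cos) pair."""
    angle = 2.0 * math.pi * value / period
    return math.sin(angle), math.cos(angle)


def encoded_gap(a, b, period):
    """Euclidean distance between two values after cyclical encoding."""
    return math.dist(cyclical(a, period), cyclical(b, period))


midnight_gap = encoded_gap(23, 0, 24)  # across the 23 -> 0 wrap
midday_gap = encoded_gap(11, 12, 24)   # an ordinary one-hour step

# Solar angles given in degrees are converted to radians before sin/cos.
altitude_sin = math.sin(math.radians(30.0))
```

With raw hour values the first gap would be 23 and the second 1; after encoding both are the same length.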


Data Engineering

~947,000 warm-season observations (April–September 2015–2019) from 16 stations at 10-minute intervals. The preprocessing pipeline:

1. Warm-season filter
2. Physical constraint cleaning
3. Per-station segmentation (60-min gap threshold)
4. Resample + interpolate gaps ≤60 min
5. Sliding window extraction (L=144, no boundary crossing)
6. Chronological split (90/10 per station) + 1 station held out
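
Steps 3 and 5 can be sketched in plain Python as follows. The sketch assumes timestamps are given in minutes and that each segment is already on a regular 10-minute grid (that is, step 4 has been applied); the function names are hypothetical.

```python
def split_segments(times_min, max_gap=60):
    """Split sorted timestamps (in minutes) wherever the gap exceeds max_gap.

    Returns half-open (start, end) index ranges, one per segment.
    """
    if not times_min:
        return []
    segments, start = [], 0
    for i in range(1, len(times_min)):
        if times_min[i] - times_min[i - 1] > max_gap:
            segments.append((start, i))
            start = i
    segments.append((start, len(times_min)))
    return segments


def window_indices(segments, length=144):
    """List (input_start, input_end, target_index) triples.

    Each window and its target stay inside one segment, so no window
    crosses a segment boundary.
    """
    out = []
    for seg_start, seg_end in segments:
        for s in range(seg_start, seg_end - length):
            out.append((s, s + length, s + length))
    return out


# 150 regular samples, a 120-minute outage, then 100 more samples.
times = list(range(0, 1500, 10)) + list(range(1610, 2610, 10))
segments = split_segments(times)
windows = window_indices(segments)
```

The second segment is shorter than 145 samples, so it contributes no windows; only the first segment yields training examples.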

Scaling: Features use RobustScaler (median-centered, IQR-normalized — resistant to sporadic outliers). Targets use MinMaxScaler (range [0,1]). Both fit on training data only.
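
A minimal scikit-learn sketch of this scaling scheme on synthetic arrays (the data and variable names are placeholders). The point it illustrates is that both scalers are fit on the training split only and then reused unchanged on the test split.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, RobustScaler

rng = np.random.default_rng(0)
X_train, X_test = rng.normal(size=(900, 12)), rng.normal(size=(100, 12))
y_train = rng.uniform(10.0, 35.0, size=(900, 2))
y_test = rng.uniform(10.0, 35.0, size=(100, 2))

# Fit on training data only; the test split never influences the statistics.
x_scaler = RobustScaler().fit(X_train)                      # median / IQR
y_scaler = MinMaxScaler(feature_range=(0, 1)).fit(y_train)  # min / max

X_train_s = x_scaler.transform(X_train)
X_test_s = x_scaler.transform(X_test)
y_train_s = y_scaler.transform(y_train)

# Model outputs live in scaled space; invert to recover physical units.
y_roundtrip = y_scaler.inverse_transform(y_scaler.transform(y_test))
```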

Training: AdamW (β₁=0.9, β₂=0.999) + weight decay 1e-3, initial LR 1e-3, ReduceLROnPlateau (factor=0.5, patience=7), gradient clipping max_norm=1.0, early stopping patience=20. Xavier uniform init for linear layers; orthogonal init for LSTM (critical for preventing vanishing gradients in the 2-layer long-term branch). Trained in ~23 minutes on A100 GPU (Phoenix cluster).
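
The training configuration above maps onto PyTorch roughly as follows. The `TinyForecaster` model, the MSE loss, the zero bias initialization, the random data, and the 5-epoch cap are stand-ins for illustration; only the optimizer, scheduler, clipping, initialization scheme, and patience values come from the description.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class TinyForecaster(nn.Module):
    """Hypothetical stand-in for the real network (much smaller)."""

    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(12, 16, num_layers=2, batch_first=True)
        self.head = nn.Linear(16, 2)

    def forward(self, x):
        out, _ = self.lstm(x)
        return self.head(out[:, -1])  # predict from the last timestep


def init_weights(module):
    if isinstance(module, nn.Linear):
        nn.init.xavier_uniform_(module.weight)
        nn.init.zeros_(module.bias)  # assumption: bias init is not stated
    elif isinstance(module, nn.LSTM):
        for name, param in module.named_parameters():
            if "weight" in name:
                nn.init.orthogonal_(param)


model = TinyForecaster()
model.apply(init_weights)
optimizer = torch.optim.AdamW(
    model.parameters(), lr=1e-3, betas=(0.9, 0.999), weight_decay=1e-3
)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=7
)
loss_fn = nn.MSELoss()  # assumption: the loss function is not stated

x_train, y_train = torch.randn(64, 144, 12), torch.rand(64, 2)
x_val, y_val = torch.randn(16, 144, 12), torch.rand(16, 2)

best_val, stale, patience = float("inf"), 0, 20
for epoch in range(5):  # small cap for the demo
    model.train()
    optimizer.zero_grad()
    loss_fn(model(x_train), y_train).backward()
    torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
    optimizer.step()

    model.eval()
    with torch.no_grad():
        val_loss = loss_fn(model(x_val), y_val).item()
    scheduler.step(val_loss)  # halves the LR after 7 epochs without improvement
    if val_loss < best_val:
        best_val, stale = val_loss, 0
    else:
        stale += 1
        if stale >= patience:  # early stopping
            break
```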


Results

Held-Out Station Evaluation

| Variable | RMSE | MAE | MAPE | R² | Max Error |
|---|---|---|---|---|---|
| Temperature | 0.43°C | 0.29°C | 1.15% | 0.9928 | 6.98°C |
| Relative Humidity | 1.28% | 0.90% | 1.51% | 0.9952 | 22.23% |

Sub-0.5°C temperature RMSE and R² > 0.99 on a station the model has never seen during training. That level of generalization is consistent with the model learning physics (solar-driven diurnal cycles, humidity-temperature coupling) rather than memorizing station-specific patterns.

Overfitting Analysis

| Variable | Train RMSE | Test RMSE | Test/Train Ratio |
|---|---|---|---|
| Temperature | 0.41°C | 0.43°C | 1.057 |
| Relative Humidity | 1.19% | 1.28% | 1.074 |

Both ratios fall below 1.10, which I treat as excellent generalization. The bands used: below 1.10 = excellent; 1.10–1.20 = good; above 1.20 = potential overfitting.

Feature Importance (Permutation-Based)

| Rank | Feature | RMSE Increase When Permuted |
|---|---|---|
| 1 | Relative Humidity | +2789% |
| 2 | Temperature | +372% |
| 3 | Hour (sin) | +7.0% |
| 4 | Solar altitude (sin) | +6.5% |
| 5 | Hour (cos) | +2.4% |
| 6 | Solar azimuth (sin) | +2.0% |

Permuting the solar-angle features alone degrades RMSE by 2.0–6.5%, which suggests the model uses sun position to predict temperature evolution, not just correlations with time-of-day.
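
Permutation importance can be sketched as below on a toy problem. The linear toy model and two-dimensional input are assumptions made to keep the example short; for windowed inputs of shape (samples, 144, 12), the same idea applies by shuffling one feature across samples.

```python
import numpy as np


def rmse(y, yhat):
    return float(np.sqrt(np.mean((y - yhat) ** 2)))


def permutation_importance(predict, X, y, rng):
    """Percent RMSE increase when each feature column is shuffled in turn."""
    base = rmse(y, predict(X))
    increases = {}
    for j in range(X.shape[1]):
        Xp = X.copy()
        Xp[:, j] = rng.permutation(Xp[:, j])  # break the feature-target link
        increases[j] = 100.0 * (rmse(y, predict(Xp)) - base) / base
    return increases


rng = np.random.default_rng(0)
X = rng.normal(size=(2000, 3))
y = 3.0 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(scale=0.1, size=2000)


# Toy stand-in for a trained model: depends strongly on feature 0,
# weakly on feature 1, and not at all on feature 2.
def toy_model(A):
    return 3.0 * A[:, 0] + 0.1 * A[:, 1]


importance = permutation_importance(toy_model, X, y, rng)
```

A feature the model ignores shows no increase, while a dominant input produces an increase of several thousand percent, which is the same pattern as the gap between the autoregressive inputs and the time features in the table.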


Geospatial Inference: Stations → Campus Maps

The model outputs predictions at 16 discrete station locations. To predict at every campus point, I use Regression Kriging (PyKrige) — combining Random Forest regression on spatial covariates with kriging of the residuals.

9 real GIS covariates from OpenStreetMap + elevation data:

  • Distance features: Distance to nearest building, park, library, parking lot, footway
  • Elevation features: Ground elevation, building height, total elevation
  • Shadow features: Monthly shadow ratio (selected by target timestamp's month — computed from building geometry and solar angles)

1. Model predicts at 16 stations
2. Random Forest regression on 9 GIS covariates
3. Kriging of residuals (spherical variogram)
4. 100,283 grid point predictions
5. OSMnx campus boundary overlay
6. 300 DPI publication maps

After kriging, Random Forest feature importances are extracted to analyze which spatial characteristics most influence microclimate variation — connecting predictions back to actionable urban planning insights.


Extreme Scenario Mapping

An automated scenario generator scans the full warm-season dataset, identifies extreme-condition timestamps (hot/cold/dry/humid, filtered for ≥10 stations reporting), and runs the full inference → kriging → visualization pipeline for each:
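
The timestamp selection can be sketched in plain Python as follows. The record layout and function name are assumptions, and the sketch assumes at most one record per station per timestamp.

```python
from collections import defaultdict
from statistics import mean


def find_extremes(records, min_stations=10):
    """records: iterable of (timestamp, station_id, temp_c, rh_pct) tuples."""
    by_time = defaultdict(list)
    for ts, _station, temp, rh in records:
        by_time[ts].append((temp, rh))

    # Keep only timestamps with enough stations reporting, then average.
    averages = {
        ts: (mean(t for t, _ in obs), mean(r for _, r in obs))
        for ts, obs in by_time.items()
        if len(obs) >= min_stations
    }
    return {
        "hottest": max(averages, key=lambda ts: averages[ts][0]),
        "coldest": min(averages, key=lambda ts: averages[ts][0]),
        "driest": min(averages, key=lambda ts: averages[ts][1]),
        "most_humid": max(averages, key=lambda ts: averages[ts][1]),
    }


records = (
    [("t1", s, 30.0, 40.0) for s in range(10)]
    + [("t2", s, 10.0, 90.0) for s in range(10)]
    + [("t3", s, 45.0, 5.0) for s in range(3)]  # too few stations, ignored
)
extremes = find_extremes(records)
```

The `t3` timestamp has the most extreme readings but only three stations reporting, so the station-count filter excludes it.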

🔥 Hottest: 37.30°C avg, 34.6% RH (June 25, 2016 16:10)
❄️ Coldest: 8.45°C avg, 89.6% RH (May 6, 2017 06:30)
🏜️ Driest: 18.13°C avg, 18.0% RH (April 4, 2015 18:30)
💧 Most Humid: 16.73°C avg, 99.5% RH (April 18, 2015 04:40)

Each comparison map shows side-by-side temperature and humidity predictions with station observations overlaid as square markers. All output maps are 300 DPI (~1.7–2.1 MB PNG) at 4800×3600 px resolution.


HPC Integration

| Stage | Time | Resources |
|---|---|---|
| Training (16 stations) | ~26 min | A100 GPU, 34 GB RAM |
| Inference (100,283 points) | ~2 min 10 sec | A100 GPU |
| Visualization (3 maps, 300 DPI) | ~1 min 31 sec | CPU |
| All 4 extreme scenarios | ~3 min 4 sec | A100 GPU |

The project includes full SLURM job scripts for each stage, plus a comprehensive HPC tutorial I wrote for onboarding other VIP-SMUR team members (VPN setup, SSH, job submission, file transfer).


What I Learned

The most valuable lesson was about model design as scientific hypothesis testing. Every architectural decision encodes an assumption: parallel LSTM branches assume multiple temporal scales matter; removing station IDs assumes the model learns location-independent physics; using solar angles assumes that sun position causally drives temperature evolution. The performance metrics then support or undercut each assumption. When permuting the solar altitude feature raised RMSE by 6.5%, that supported the solar-angle assumption. This feedback loop, where the model is simultaneously a prediction tool and a scientific instrument, fundamentally changed how I think about model engineering.