- 15,078 Lines of Code
- 25+ Docker Containers
- 7 Microservices
- 8 Kafka Topics
- 4 Storage Tiers
- 45+ API Endpoints
- 97.5% Cost Savings
- 4 ML Models
- 20+ Engineered Features
Full System Architecture
9 Layers
graph TB
subgraph LAYER1["LAYER 1: EXTERNAL DATA SOURCES"]
NASA["NASA FIRMS\nSatellite Fire Detection\nReal-Time Polling 30s"]
NOAA["NOAA Weather API\nMeteorological Data\nLive Streaming 8 partitions"]
COPER["Copernicus ERA5\nWeather Reanalysis\nBatch 6-hourly GRIB2"]
IOT["IoT Sensors\nMQTT Air Quality\nLive Streaming 12 partitions"]
LANDSAT["Landsat Thermal\nSatellite Imagery\nBatch Daily GeoTIFF"]
end
subgraph LAYER2["LAYER 2: INGESTION & VALIDATION"]
CONN1["FIRMS Connector\nStreamManager"]
CONN2["NOAA Connector\nStreamManager"]
CONN3["Copernicus Connector\nStreamManager"]
CONN4["MQTT Connector\nStreamManager"]
CONN5["Landsat Connector\nStreamManager"]
VAL["Avro Schema Validator\n4 Schemas\nReal-time Validation"]
DLQ["Dead Letter Queue\nExponential Backoff\nAutomatic Retry"]
QUAL["Data Quality Engine\nCompleteness Check\nConsistency Validation"]
end
subgraph LAYER3["LAYER 3: EVENT STREAMING"]
K1["wildfire-nasa-firms\n4 partitions\nFire detections"]
K2["wildfire-weather-processed\n8 partitions\nWeather data"]
K3["wildfire-iot-sensors\n12 partitions\nSensor readings"]
K4["wildfire-satellite-metadata\n4 partitions\nImage metadata"]
K5["wildfire-weather-alerts\n4 partitions\nNWS alerts"]
KAFKA_INFO["Apache Kafka Cluster\nZSTD Compression\n10,000 events/sec\nBinary imagery support"]
end
subgraph LAYER4["LAYER 4: MULTI-TIER STORAGE"]
subgraph ONPREM["On-Premises Storage"]
subgraph HOT_TIER["HOT TIER 0-7 days"]
PG_PRIMARY["PostgreSQL Primary\nPostGIS Enabled\n50TB NVMe SSD\nQuery SLA: less than 100ms"]
PG_REPLICA1["Standby Replica 1\nStreaming Replication\n30s lag"]
PG_REPLICA2["Standby Replica 2\nStreaming Replication\n30s lag"]
end
subgraph WARM_TIER["WARM TIER 7-90 days"]
MINIO1["MinIO Node 1\n50TB HDD\nParquet + Snappy"]
MINIO2["MinIO Node 2\n50TB HDD\nErasure Coding"]
MINIO3["MinIO Node 3\n50TB HDD\nN/2 Fault Tolerant"]
MINIO4["MinIO Node 4\n50TB HDD\nQuery SLA: less than 500ms"]
end
end
subgraph CLOUD["AWS Cloud Storage US-West-2"]
subgraph COLD_TIER["COLD TIER 90-365 days"]
S3_IA["S3 Standard-IA\n5PB Capacity\nMulti-AZ\nQuery SLA: less than 5s\n$0.004/GB/month"]
end
subgraph ARCHIVE_TIER["ARCHIVE TIER 365+ days"]
GLACIER["S3 Glacier Deep Archive\n100PB Capacity\n12-hour retrieval\n7-year retention FISMA\n$0.00099/GB/month"]
end
end
AIRFLOW["Apache Airflow\nLifecycle Orchestration\n4 DAGs"]
BACKUP["Veeam Backup\nCross-Region Replication\nRPO: 15 min, RTO: 60 min"]
end
subgraph LAYER_AI["LAYER 5: ENSEMBLE AI ENGINE"]
subgraph ML_FEATURES["Feature Engineering"]
FEAT_PIPE["Feature Pipeline\n20+ Engineered Features\nFWI, NDVI, THI, Drought"]
FEAT_STORE["Feature Store\nRedis + PostgreSQL\nSub-millisecond Cache"]
end
subgraph ML_MODELS["Model Ensemble — Parallel Inference"]
RF["Random Forest\nBaseline Classifier\n100 Estimators\nWeight: 20%"]
LSTM_M["LSTM Predictor\n3-Layer + Attention\n7-Day Forecast\nWeight: 25%"]
CNN_M["CNN Satellite Analyzer\nU-Net Encoder-Decoder\n7-Ch Multi-Spectral\nWeight: 20%"]
FIRESAT["FireSat Real-Time\nActive Fire Detection\nFRP + Proximity\nWeight: 35%"]
end
ENSEMBLE["Ensemble Meta-Learner\nConfidence-Adjusted\nWeighted Voting\n4 Models Combined"]
subgraph ML_OUTPUT["Prediction & Serving"]
PRED_PIPE["Prediction Pipeline\nAsync Priority Queue\n4 Workers\nSub-100ms Inference"]
SPREAD["Fire Spread Model\n3D Risk Fields\nPerimeter Prediction\nEvacuation Zones"]
end
MODEL_REG["MLflow Registry\nExperiment Tracking\nModel Versioning"]
end
subgraph LAYER6["LAYER 6: API GATEWAY & SECURITY"]
KONG["Kong API Gateway\nPort 8080\nRate Limiting: 1000 req/hr\nRequest Validation\nResponse Caching 15-min TTL"]
SEC["Security Service\nPort 8005\nOAuth2/OIDC\nJWT Tokens"]
RBAC["5-Role RBAC\nFire Chief\nAnalyst\nScientist\nAdmin\nField Responder"]
MFA["Multi-Factor Auth\nTOTP Google Auth\nRequired: Admin + Scientist"]
AUDIT["Comprehensive Audit\nAll Access Logged\nCorrelation IDs\nFISMA Compliant"]
end
subgraph LAYER7["LAYER 7: DATA CLEARING HOUSE"]
DCH["Data Clearing House API\nPort 8006\n45+ REST Endpoints\nFastAPI Auto-Docs"]
META["Metadata Catalog\nPort 8003\nDataset Discovery\nSchema Docs\nLineage Tracking"]
QUAL_SVC["Data Quality Service\nPort 8004\nValidation Rules\nProfiling Reports\nAnomaly Detection"]
VIZ["Visualization Service\nPort 8007\nChart.js, D3.js\nLeaflet Maps\nPlotly"]
EXPORT["Export Service\nFormats:\nCSV, JSON, GeoJSON\nParquet, Shapefile"]
REDIS["Redis Cache\nPort 6379\nHigh Hit Rate\n15-min TTL\nSession Management"]
end
subgraph LAYER8["LAYER 8: USER DASHBOARDS"]
CHIEF["Fire Chief Dashboard\nPort 3001\n8 Widgets\nReal-Time Operations\nActive Fire Maps\nResource Status"]
ANALYST["Data Analyst Portal\nPort 3002\n10 Widgets\nHistorical Trends\nStatistical Reports\nQuery Builder"]
SCIENTIST["Data Scientist Workbench\nPort 3003\n12 Widgets\nJupyter Notebooks\nML Model Training\nAPI Explorer"]
ADMIN_DASH["Admin Console\nPort 3004\nUser Management\nSystem Config\nAudit Logs\nPerformance Metrics"]
INTEGRATIONS["External Integrations\nPower BI Connector\nEsri ArcGIS\nTableau\nPython/R SDK"]
end
subgraph LAYER9["LAYER 9: MONITORING & OPERATIONS"]
GRAFANA["Grafana Dashboards\nPort 3010\n33+ KPIs\n5 Dashboards\nAuto-refresh: 30s"]
PROM["Prometheus\nPort 9090\nMetrics Collection\nSystem + Business KPIs\nSLA Tracking"]
ELK["Elasticsearch + Kibana\nPorts 9200, 5601\nCentralized Logging\nApplication + Access Logs\n90-day retention"]
CLOUDWATCH["AWS CloudWatch\nCloud Metrics\nS3 + Glacier Monitoring\nCost Tracking"]
end
NASA --> CONN1
NOAA --> CONN2
COPER --> CONN3
IOT --> CONN4
LANDSAT --> CONN5
CONN1 --> VAL
CONN2 --> VAL
CONN3 --> VAL
CONN4 --> VAL
CONN5 --> VAL
VAL -->|Valid| QUAL
VAL -->|Invalid| DLQ
DLQ -->|Retry| VAL
QUAL --> K1
QUAL --> K2
QUAL --> K3
QUAL --> K4
QUAL --> K5
K1 -.-> KAFKA_INFO
K2 -.-> KAFKA_INFO
K3 -.-> KAFKA_INFO
K1 --> PG_PRIMARY
K2 --> PG_PRIMARY
K3 --> PG_PRIMARY
K4 --> PG_PRIMARY
K5 --> PG_PRIMARY
PG_PRIMARY --> PG_REPLICA1
PG_PRIMARY --> PG_REPLICA2
PG_PRIMARY -->|Day 7\nAirflow DAG| AIRFLOW
AIRFLOW -->|Export Parquet\n78% compression| MINIO1
MINIO1 <--> MINIO2
MINIO2 <--> MINIO3
MINIO3 <--> MINIO4
MINIO1 -->|Day 90\nAirflow DAG| AIRFLOW
AIRFLOW -->|Transfer to Cloud| S3_IA
S3_IA -->|Day 365\nAirflow DAG| AIRFLOW
AIRFLOW -->|Archive| GLACIER
PG_PRIMARY -.->|Continuous| BACKUP
MINIO1 -.->|Continuous| BACKUP
BACKUP -.->|Cross-Region| S3_IA
PG_PRIMARY --> KONG
MINIO1 --> KONG
S3_IA --> KONG
GLACIER --> KONG
KONG --> SEC
SEC --> RBAC
SEC --> MFA
SEC --> AUDIT
RBAC --> DCH
RBAC --> META
RBAC --> QUAL_SVC
RBAC --> VIZ
DCH --> REDIS
DCH --> EXPORT
META --> REDIS
DCH --> CHIEF
DCH --> ANALYST
DCH --> SCIENTIST
DCH --> ADMIN_DASH
DCH --> INTEGRATIONS
PG_PRIMARY --> PROM
MINIO1 --> PROM
KAFKA_INFO --> PROM
DCH --> PROM
KONG --> PROM
S3_IA --> CLOUDWATCH
GLACIER --> CLOUDWATCH
PROM --> GRAFANA
CLOUDWATCH --> GRAFANA
PG_PRIMARY --> ELK
DCH --> ELK
KONG --> ELK
AUDIT --> ELK
KAFKA_INFO --> FEAT_PIPE
PG_PRIMARY --> FEAT_PIPE
MINIO1 --> CNN_M
FEAT_PIPE --> FEAT_STORE
FEAT_STORE --> RF
FEAT_STORE --> LSTM_M
FEAT_STORE --> CNN_M
FEAT_STORE --> FIRESAT
RF --> ENSEMBLE
LSTM_M --> ENSEMBLE
CNN_M --> ENSEMBLE
FIRESAT --> ENSEMBLE
ENSEMBLE --> PRED_PIPE
ENSEMBLE --> SPREAD
PRED_PIPE --> KONG
ENSEMBLE --> PROM
MODEL_REG -.-> GRAFANA
PRED_PIPE --> ELK
classDef layer1Style fill:#e74c3c,stroke:#c0392b,stroke-width:2px,color:#fff
classDef layer2Style fill:#e67e22,stroke:#d35400,stroke-width:2px,color:#fff
classDef layer3Style fill:#f39c12,stroke:#e67e22,stroke-width:2px,color:#fff
classDef layer4HotStyle fill:#27ae60,stroke:#229954,stroke-width:2px,color:#fff
classDef layer4WarmStyle fill:#e67e22,stroke:#d35400,stroke-width:2px,color:#fff
classDef layer4ColdStyle fill:#3498db,stroke:#2980b9,stroke-width:2px,color:#fff
classDef layer4ArchiveStyle fill:#9b59b6,stroke:#8e44ad,stroke-width:2px,color:#fff
classDef layer5Style fill:#e84393,stroke:#d63031,stroke-width:2px,color:#fff
classDef layer6Style fill:#00b894,stroke:#00a085,stroke-width:2px,color:#fff
classDef layer7Style fill:#0984e3,stroke:#0770c9,stroke-width:2px,color:#fff
classDef layer8Style fill:#6c5ce7,stroke:#5a4ad9,stroke-width:2px,color:#fff
classDef aiModelStyle fill:#7c3aed,stroke:#6d28d9,stroke-width:2px,color:#fff
classDef aiPipelineStyle fill:#0891b2,stroke:#0e7490,stroke-width:2px,color:#fff
classDef aiEnsembleStyle fill:#dc2626,stroke:#b91c1c,stroke-width:2px,color:#fff
classDef aiRegistryStyle fill:#ca8a04,stroke:#a16207,stroke-width:2px,color:#fff
classDef orchestrationStyle fill:#fdcb6e,stroke:#f9b54a,stroke-width:2px,color:#000
class NASA,NOAA,COPER,IOT,LANDSAT layer1Style
class CONN1,CONN2,CONN3,CONN4,CONN5,VAL,DLQ,QUAL layer2Style
class K1,K2,K3,K4,K5,KAFKA_INFO layer3Style
class PG_PRIMARY,PG_REPLICA1,PG_REPLICA2 layer4HotStyle
class MINIO1,MINIO2,MINIO3,MINIO4 layer4WarmStyle
class S3_IA layer4ColdStyle
class GLACIER layer4ArchiveStyle
class KONG,SEC,RBAC,MFA,AUDIT layer5Style
class DCH,META,QUAL_SVC,VIZ,EXPORT,REDIS layer6Style
class CHIEF,ANALYST,SCIENTIST,ADMIN_DASH,INTEGRATIONS layer7Style
class GRAFANA,PROM,ELK,CLOUDWATCH layer8Style
class RF,LSTM_M,CNN_M,FIRESAT aiModelStyle
class FEAT_PIPE,FEAT_STORE,PRED_PIPE,SPREAD aiPipelineStyle
class ENSEMBLE aiEnsembleStyle
class MODEL_REG aiRegistryStyle
class AIRFLOW,BACKUP orchestrationStyle
Architecture Layers
Legend
Data Ingestion Layer
- 5 active data source connectors (NASA FIRMS, NOAA, Copernicus ERA5, IoT sensors, Landsat)
- Real-time, batch, and streaming ingestion modes
- Avro schema validation with quality checks
- Dead Letter Queue with automatic retry
- Apache Kafka: 8 topics (the 5 shown in the architecture diagram total 32 partitions)
- High throughput: 10,000 events/second sustained
- ZSTD compression to shrink message payloads and cut network transfer time
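The Dead Letter Queue retry behavior listed above can be sketched as follows. This is a minimal illustration, not the project's implementation: the source only states that the DLQ uses exponential backoff with automatic retry, so the base delay, growth factor, retry limit, cap, and the names `backoff_delays` and `retry_with_backoff` are all assumptions.

```python
from typing import Callable, List


def backoff_delays(base_seconds: float = 1.0, factor: float = 2.0,
                   max_retries: int = 5, cap_seconds: float = 60.0) -> List[float]:
    """Return the wait before each retry: base * factor**attempt, capped.

    All default values are hypothetical; the source gives no numbers.
    """
    return [min(base_seconds * factor ** attempt, cap_seconds)
            for attempt in range(max_retries)]


def retry_with_backoff(validate: Callable[[dict], bool], record: dict,
                       delays: List[float],
                       sleep: Callable[[float], None]) -> bool:
    """Re-run validation after each delay and stop at the first success.

    `sleep` is injected so callers can pass time.sleep in production
    or a stub in tests.
    """
    for delay in delays:
        sleep(delay)
        if validate(record):
            return True
    # Retries exhausted: the record would stay in the DLQ for inspection.
    return False
```

Passing `time.sleep` as `sleep` gives real waits; passing a list's `append` method records the schedule without blocking, which is convenient for checking the delays.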
Storage Layer
- 4-tier hybrid architecture: HOT → WARM → COLD → ARCHIVE
- Cost-optimized storage: $405/month for 10TB
- HOT tier: <100ms query SLA on PostgreSQL+PostGIS
- WARM tier: <500ms query SLA on MinIO Parquet
- 78% compression on Parquet exports with Snappy
- Fully automated lifecycle management (Apache Airflow)
- High availability: RPO 15min, RTO 60min
- 7-year retention compliance (FISMA, NIST 800-53)
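The tier boundaries above (HOT 0-7 days, WARM 7-90 days, COLD 90-365 days, ARCHIVE 365+ days) can be expressed as a small lookup. This is a sketch under assumptions: the function name `storage_tier` is hypothetical, and treating each boundary day as belonging to the colder tier is inferred from the diagram's "Day 7", "Day 90", and "Day 365" migration labels rather than stated outright.

```python
def storage_tier(age_days: float) -> str:
    """Map a record's age in days to the tier it should live in.

    Boundaries follow the architecture diagram; a record that is exactly
    7, 90, or 365 days old is assumed to have already migrated.
    """
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days < 7:
        return "HOT"      # PostgreSQL + PostGIS, query SLA under 100 ms
    if age_days < 90:
        return "WARM"     # MinIO Parquet, query SLA under 500 ms
    if age_days < 365:
        return "COLD"     # S3 Standard-IA, query SLA under 5 s
    return "ARCHIVE"      # S3 Glacier Deep Archive, 12-hour retrieval
```

An Airflow lifecycle task could call a function like this per partition to decide whether a migration is due, though how the project's DAGs actually make that decision is not described in the source.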
Data Clearing House
- 45+ REST API endpoints with FastAPI auto-documentation
- 4 role-based dashboards: Fire Chief, Analyst, Scientist, Admin
- 5-role RBAC with least privilege access control
- OAuth2/OIDC authentication with MFA for elevated roles
- Redis cache with configurable TTL (15 minutes in the architecture above)
- Export formats: CSV, JSON, GeoJSON, Parquet, Shapefile
- High uptime with low API latency
- Complete audit trail with FISMA compliance
- Power BI, Esri ArcGIS, Tableau integration
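The interaction between the 5-role RBAC and MFA for elevated roles can be sketched as a single gate. The role names and the rule that Admin and Scientist require MFA come from the architecture diagram; the `Session` type, its fields, and `is_authenticated` are hypothetical names for illustration, and per-role permissions are omitted because the source does not list them.

```python
from dataclasses import dataclass

# The five roles named in the architecture diagram.
ROLES = {"fire_chief", "analyst", "scientist", "admin", "field_responder"}
# The diagram marks MFA as required for Admin and Scientist.
MFA_REQUIRED_ROLES = {"admin", "scientist"}


@dataclass(frozen=True)
class Session:
    """Hypothetical view of a request after JWT verification."""
    role: str
    token_valid: bool
    mfa_verified: bool = False


def is_authenticated(session: Session) -> bool:
    """Accept a session only if its token is valid and, for elevated
    roles, a second factor has also been verified."""
    if session.role not in ROLES:
        return False
    if not session.token_valid:
        return False
    if session.role in MFA_REQUIRED_ROLES:
        return session.mfa_verified
    return True
```

Keeping the MFA requirement in one set means a role can be promoted to "elevated" without touching the gate logic.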
Ensemble AI Engine
- 4-model ensemble with confidence-adjusted weighted voting
- CNN satellite + LSTM temporal + RF baseline + FireSat real-time
- 20+ engineered features (FWI, NDVI, THI, drought index)
- <100ms real-time inference pipeline
- 7-day fire risk forecasting with temporal attention
- 3D risk fields + fire perimeter prediction + evacuation zones
- MLflow experiment tracking and model versioning
- Confidence-adjusted prediction scoring
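One plausible reading of "confidence-adjusted weighted voting" is to scale each model's base weight by its reported confidence and then normalize. The base weights (20%, 25%, 20%, 35%) come from the architecture diagram; the formula itself, the dictionary keys, and the name `ensemble_risk` are assumptions, since the source does not define how confidence enters the vote.

```python
from typing import Dict, Tuple

# Base weights from the architecture diagram.
BASE_WEIGHTS: Dict[str, float] = {
    "random_forest": 0.20,
    "lstm": 0.25,
    "cnn": 0.20,
    "firesat": 0.35,
}


def ensemble_risk(predictions: Dict[str, Tuple[float, float]]) -> float:
    """Combine per-model outputs into one risk score.

    `predictions` maps a model name to (risk probability, confidence),
    both in [0, 1]. Each model's effective weight is its base weight
    times its confidence (an assumed rule), and the result is the
    weighted mean of the risk probabilities.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for name, (risk, confidence) in predictions.items():
        weight = BASE_WEIGHTS[name] * confidence
        weighted_sum += weight * risk
        total_weight += weight
    if total_weight == 0.0:
        raise ValueError("no model reported non-zero confidence")
    return weighted_sum / total_weight
```

Under this rule, a model that reports zero confidence drops out of the vote entirely and the remaining weights are renormalized, so a temporarily blind satellite feed would not drag the score toward its stale output.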
Key Differentiators
What Sets ClimaIQ Apart
- Complete end-to-end implementation across all system layers
- Production-ready features: DLQ, backpressure, Avro schemas, MFA
- Meets and exceeds SLA targets with measurable results
- Significant cost optimization while maintaining performance
- Real-world data sources (not mock data)
- Infrastructure as Code with Terraform
- Comprehensive test coverage and code quality
- 4-model ensemble AI with confidence-adjusted weighted voting
- 7-day fire risk forecasting with LSTM temporal attention
- Real-time satellite analysis via U-Net CNN (7-channel multi-spectral)