15,078 Lines of Code
25+ Docker Containers
7 Microservices
8 Kafka Topics
4 Storage Tiers
45+ API Endpoints
97.5% Cost Savings
4 ML Models
20+ Engineered Features

Full System Architecture

9 Layers
graph TB
    subgraph LAYER1["LAYER 1: EXTERNAL DATA SOURCES"]
        NASA["NASA FIRMS\nSatellite Fire Detection\nReal-Time Polling 30s"]
        NOAA["NOAA Weather API\nMeteorological Data\nLive Streaming 8 partitions"]
        COPER["Copernicus ERA5\nWeather Reanalysis\nBatch 6-hourly GRIB2"]
        IOT["IoT Sensors\nMQTT Air Quality\nLive Streaming 12 partitions"]
        LANDSAT["Landsat Thermal\nSatellite Imagery\nBatch Daily GeoTIFF"]
    end

    subgraph LAYER2["LAYER 2: INGESTION & VALIDATION"]
        CONN1["FIRMS Connector\nStreamManager"]
        CONN2["NOAA Connector\nStreamManager"]
        CONN3["Copernicus Connector\nStreamManager"]
        CONN4["MQTT Connector\nStreamManager"]
        CONN5["Landsat Connector\nStreamManager"]
        VAL["Avro Schema Validator\n4 Schemas\nReal-time Validation"]
        DLQ["Dead Letter Queue\nExponential Backoff\nAutomatic Retry"]
        QUAL["Data Quality Engine\nCompleteness Check\nConsistency Validation"]
    end

    subgraph LAYER3["LAYER 3: EVENT STREAMING"]
        K1["wildfire-nasa-firms\n4 partitions\nFire detections"]
        K2["wildfire-weather-processed\n8 partitions\nWeather data"]
        K3["wildfire-iot-sensors\n12 partitions\nSensor readings"]
        K4["wildfire-satellite-metadata\n4 partitions\nImage metadata"]
        K5["wildfire-weather-alerts\n4 partitions\nNWS alerts"]
        KAFKA_INFO["Apache Kafka Cluster\nZSTD Compression\n10,000 events/sec\nBinary imagery support"]
    end

    subgraph LAYER4["LAYER 4: MULTI-TIER STORAGE"]
        subgraph ONPREM["On-Premises Storage"]
            subgraph HOT_TIER["HOT TIER 0-7 days"]
                PG_PRIMARY["PostgreSQL Primary\nPostGIS Enabled\n50TB NVMe SSD\nQuery SLA: less than 100ms"]
                PG_REPLICA1["Standby Replica 1\nStreaming Replication\n30s lag"]
                PG_REPLICA2["Standby Replica 2\nStreaming Replication\n30s lag"]
            end
            subgraph WARM_TIER["WARM TIER 7-90 days"]
                MINIO1["MinIO Node 1\n50TB HDD\nParquet + Snappy"]
                MINIO2["MinIO Node 2\n50TB HDD\nErasure Coding"]
                MINIO3["MinIO Node 3\n50TB HDD\nN/2 Fault Tolerant"]
                MINIO4["MinIO Node 4\n50TB HDD\nQuery SLA: less than 500ms"]
            end
        end
        subgraph CLOUD["AWS Cloud Storage US-West-2"]
            subgraph COLD_TIER["COLD TIER 90-365 days"]
                S3_IA["S3 Standard-IA\n5PB Capacity\nMulti-AZ\nQuery SLA: less than 5s\n$0.004/GB/month"]
            end
            subgraph ARCHIVE_TIER["ARCHIVE TIER 365+ days"]
                GLACIER["S3 Glacier Deep Archive\n100PB Capacity\n12-hour retrieval\n7-year retention FISMA\n$0.00099/GB/month"]
            end
        end
        AIRFLOW["Apache Airflow\nLifecycle Orchestration\n4 DAGs"]
        BACKUP["Veeam Backup\nCross-Region Replication\nRPO: 15 min, RTO: 60 min"]
    end

    subgraph LAYER_AI["LAYER 5: ENSEMBLE AI ENGINE"]
        subgraph ML_FEATURES["Feature Engineering"]
            FEAT_PIPE["Feature Pipeline\n20+ Engineered Features\nFWI, NDVI, THI, Drought"]
            FEAT_STORE["Feature Store\nRedis + PostgreSQL\nSub-millisecond Cache"]
        end
        subgraph ML_MODELS["Model Ensemble — Parallel Inference"]
            RF["Random Forest\nBaseline Classifier\n100 Estimators\nWeight: 20%"]
            LSTM_M["LSTM Predictor\n3-Layer + Attention\n7-Day Forecast\nWeight: 25%"]
            CNN_M["CNN Satellite Analyzer\nU-Net Encoder-Decoder\n7-Ch Multi-Spectral\nWeight: 20%"]
            FIRESAT["FireSat Real-Time\nActive Fire Detection\nFRP + Proximity\nWeight: 35%"]
        end
        ENSEMBLE["Ensemble Meta-Learner\nConfidence-Adjusted\nWeighted Voting\n4 Models Combined"]
        subgraph ML_OUTPUT["Prediction & Serving"]
            PRED_PIPE["Prediction Pipeline\nAsync Priority Queue\n4 Workers\nSub-100ms Inference"]
            SPREAD["Fire Spread Model\n3D Risk Fields\nPerimeter Prediction\nEvacuation Zones"]
        end
        MODEL_REG["MLflow Registry\nExperiment Tracking\nModel Versioning"]
    end

    subgraph LAYER6["LAYER 6: API GATEWAY & SECURITY"]
        KONG["Kong API Gateway\nPort 8080\nRate Limiting: 1000 req/hr\nRequest Validation\nResponse Caching 15-min TTL"]
        SEC["Security Service\nPort 8005\nOAuth2/OIDC\nJWT Tokens"]
        RBAC["5-Role RBAC\nFire Chief\nAnalyst\nScientist\nAdmin\nField Responder"]
        MFA["Multi-Factor Auth\nTOTP Google Auth\nRequired: Admin + Scientist"]
        AUDIT["Comprehensive Audit\nAll Access Logged\nCorrelation IDs\nFISMA Compliant"]
    end

    subgraph LAYER7["LAYER 7: DATA CLEARING HOUSE"]
        DCH["Data Clearing House API\nPort 8006\n45+ REST Endpoints\nFastAPI Auto-Docs"]
        META["Metadata Catalog\nPort 8003\nDataset Discovery\nSchema Docs\nLineage Tracking"]
        QUAL_SVC["Data Quality Service\nPort 8004\nValidation Rules\nProfiling Reports\nAnomaly Detection"]
        VIZ["Visualization Service\nPort 8007\nChart.js, D3.js\nLeaflet Maps\nPlotly"]
        EXPORT["Export Service\nFormats:\nCSV, JSON, GeoJSON\nParquet, Shapefile"]
        REDIS["Redis Cache\nPort 6379\nHigh Hit Rate\n15-min TTL\nSession Management"]
    end

    subgraph LAYER8["LAYER 8: USER DASHBOARDS"]
        CHIEF["Fire Chief Dashboard\nPort 3001\n8 Widgets\nReal-Time Operations\nActive Fire Maps\nResource Status"]
        ANALYST["Data Analyst Portal\nPort 3002\n10 Widgets\nHistorical Trends\nStatistical Reports\nQuery Builder"]
        SCIENTIST["Data Scientist Workbench\nPort 3003\n12 Widgets\nJupyter Notebooks\nML Model Training\nAPI Explorer"]
        ADMIN_DASH["Admin Console\nPort 3004\nUser Management\nSystem Config\nAudit Logs\nPerformance Metrics"]
        INTEGRATIONS["External Integrations\nPower BI Connector\nEsri ArcGIS\nTableau\nPython/R SDK"]
    end

    subgraph LAYER9["LAYER 9: MONITORING & OPERATIONS"]
        GRAFANA["Grafana Dashboards\nPort 3010\n33+ KPIs\n5 Dashboards\nAuto-refresh: 30s"]
        PROM["Prometheus\nPort 9090\nMetrics Collection\nSystem + Business KPIs\nSLA Tracking"]
        ELK["Elasticsearch + Kibana\nPorts 9200, 5601\nCentralized Logging\nApplication + Access Logs\n90-day retention"]
        CLOUDWATCH["AWS CloudWatch\nCloud Metrics\nS3 + Glacier Monitoring\nCost Tracking"]
    end

    NASA --> CONN1
    NOAA --> CONN2
    COPER --> CONN3
    IOT --> CONN4
    LANDSAT --> CONN5

    CONN1 --> VAL
    CONN2 --> VAL
    CONN3 --> VAL
    CONN4 --> VAL
    CONN5 --> VAL

    VAL -->|Valid| QUAL
    VAL -->|Invalid| DLQ
    DLQ -->|Retry| VAL

    QUAL --> K1
    QUAL --> K2
    QUAL --> K3
    QUAL --> K4
    QUAL --> K5

    K1 -.-> KAFKA_INFO
    K2 -.-> KAFKA_INFO
    K3 -.-> KAFKA_INFO

    K1 --> PG_PRIMARY
    K2 --> PG_PRIMARY
    K3 --> PG_PRIMARY
    K4 --> PG_PRIMARY
    K5 --> PG_PRIMARY

    PG_PRIMARY --> PG_REPLICA1
    PG_PRIMARY --> PG_REPLICA2

    PG_PRIMARY -->|Day 7\nAirflow DAG| AIRFLOW
    AIRFLOW -->|Export Parquet\n78% compression| MINIO1
    MINIO1 <--> MINIO2
    MINIO2 <--> MINIO3
    MINIO3 <--> MINIO4

    MINIO1 -->|Day 90\nAirflow DAG| AIRFLOW
    AIRFLOW -->|Transfer to Cloud| S3_IA

    S3_IA -->|Day 365\nAirflow DAG| AIRFLOW
    AIRFLOW -->|Archive| GLACIER

    PG_PRIMARY -.->|Continuous| BACKUP
    MINIO1 -.->|Continuous| BACKUP
    BACKUP -.->|Cross-Region| S3_IA

    PG_PRIMARY --> KONG
    MINIO1 --> KONG
    S3_IA --> KONG
    GLACIER --> KONG

    KONG --> SEC
    SEC --> RBAC
    SEC --> MFA
    SEC --> AUDIT

    RBAC --> DCH
    RBAC --> META
    RBAC --> QUAL_SVC
    RBAC --> VIZ

    DCH --> REDIS
    DCH --> EXPORT
    META --> REDIS

    DCH --> CHIEF
    DCH --> ANALYST
    DCH --> SCIENTIST
    DCH --> ADMIN_DASH
    DCH --> INTEGRATIONS

    PG_PRIMARY --> PROM
    MINIO1 --> PROM
    KAFKA_INFO --> PROM
    DCH --> PROM
    KONG --> PROM

    S3_IA --> CLOUDWATCH
    GLACIER --> CLOUDWATCH

    PROM --> GRAFANA
    CLOUDWATCH --> GRAFANA

    PG_PRIMARY --> ELK
    DCH --> ELK
    KONG --> ELK
    AUDIT --> ELK

    KAFKA_INFO --> FEAT_PIPE
    PG_PRIMARY --> FEAT_PIPE
    MINIO1 --> CNN_M

    FEAT_PIPE --> FEAT_STORE
    FEAT_STORE --> RF
    FEAT_STORE --> LSTM_M
    FEAT_STORE --> CNN_M
    FEAT_STORE --> FIRESAT

    RF --> ENSEMBLE
    LSTM_M --> ENSEMBLE
    CNN_M --> ENSEMBLE
    FIRESAT --> ENSEMBLE

    ENSEMBLE --> PRED_PIPE
    ENSEMBLE --> SPREAD
    PRED_PIPE --> KONG

    ENSEMBLE --> PROM
    MODEL_REG -.-> GRAFANA
    PRED_PIPE --> ELK

    classDef layer1Style fill:#e74c3c,stroke:#c0392b,stroke-width:2px,color:#fff
    classDef layer2Style fill:#e67e22,stroke:#d35400,stroke-width:2px,color:#fff
    classDef layer3Style fill:#f39c12,stroke:#e67e22,stroke-width:2px,color:#fff
    classDef layer4HotStyle fill:#27ae60,stroke:#229954,stroke-width:2px,color:#fff
    classDef layer4WarmStyle fill:#e67e22,stroke:#d35400,stroke-width:2px,color:#fff
    classDef layer4ColdStyle fill:#3498db,stroke:#2980b9,stroke-width:2px,color:#fff
    classDef layer4ArchiveStyle fill:#9b59b6,stroke:#8e44ad,stroke-width:2px,color:#fff
    classDef layer6Style fill:#e84393,stroke:#d63031,stroke-width:2px,color:#fff
    classDef layer7Style fill:#00b894,stroke:#00a085,stroke-width:2px,color:#fff
    classDef layer8Style fill:#0984e3,stroke:#0770c9,stroke-width:2px,color:#fff
    classDef layer9Style fill:#6c5ce7,stroke:#5a4ad9,stroke-width:2px,color:#fff
    classDef aiModelStyle fill:#7c3aed,stroke:#6d28d9,stroke-width:2px,color:#fff
    classDef aiPipelineStyle fill:#0891b2,stroke:#0e7490,stroke-width:2px,color:#fff
    classDef aiEnsembleStyle fill:#dc2626,stroke:#b91c1c,stroke-width:2px,color:#fff
    classDef aiRegistryStyle fill:#ca8a04,stroke:#a16207,stroke-width:2px,color:#fff
    classDef orchestrationStyle fill:#fdcb6e,stroke:#f9b54a,stroke-width:2px,color:#000

    class NASA,NOAA,COPER,IOT,LANDSAT layer1Style
    class CONN1,CONN2,CONN3,CONN4,CONN5,VAL,DLQ,QUAL layer2Style
    class K1,K2,K3,K4,K5,KAFKA_INFO layer3Style
    class PG_PRIMARY,PG_REPLICA1,PG_REPLICA2 layer4HotStyle
    class MINIO1,MINIO2,MINIO3,MINIO4 layer4WarmStyle
    class S3_IA layer4ColdStyle
    class GLACIER layer4ArchiveStyle
    class KONG,SEC,RBAC,MFA,AUDIT layer6Style
    class DCH,META,QUAL_SVC,VIZ,EXPORT,REDIS layer7Style
    class CHIEF,ANALYST,SCIENTIST,ADMIN_DASH,INTEGRATIONS layer8Style
    class GRAFANA,PROM,ELK,CLOUDWATCH layer9Style
    class RF,LSTM_M,CNN_M,FIRESAT aiModelStyle
    class FEAT_PIPE,FEAT_STORE,PRED_PIPE,SPREAD aiPipelineStyle
    class ENSEMBLE aiEnsembleStyle
    class MODEL_REG aiRegistryStyle
    class AIRFLOW,BACKUP orchestrationStyle
                

Architecture Layers

Legend
Data Ingestion Layer
  • 5 active data source connectors (NASA, NOAA, Copernicus, IoT, Landsat)
  • Real-time, batch, and streaming ingestion modes
  • Avro schema validation with quality checks
  • Dead Letter Queue with automatic retry
  • Apache Kafka: 8 topics, 25 partitions
  • High throughput: 10,000 events/second sustained
  • ZSTD compression to shrink payloads and cut network transfer time
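The Dead Letter Queue behaviour listed above (exponential backoff, automatic retry) can be sketched roughly as follows. This is a minimal illustration, not the project's connector code: the function names, the base delay, the cap, and the retry count are all assumptions chosen for the example.

```python
from collections import deque
from typing import Callable, Deque, List


def backoff_delays(base: float = 1.0, factor: float = 2.0,
                   max_retries: int = 5, cap: float = 60.0) -> List[float]:
    """Delay (seconds) before each retry: base * factor**attempt, capped at `cap`."""
    return [min(cap, base * factor ** attempt) for attempt in range(max_retries)]


def process_with_dlq(record: dict,
                     validate: Callable[[dict], bool],
                     dead_letters: Deque[dict],
                     max_retries: int = 3,
                     sleep: Callable[[float], None] = lambda seconds: None) -> bool:
    """Validate a record; on failure retry with backoff, then park it in the DLQ.

    `sleep` is injectable so the sketch runs instantly; pass time.sleep
    to actually wait between attempts.
    """
    if validate(record):
        return True
    for delay in backoff_delays(max_retries=max_retries):
        sleep(delay)
        if validate(record):
            return True
    # All retries exhausted: keep the record for later inspection or replay.
    dead_letters.append(record)
    return False


if __name__ == "__main__":
    dlq: Deque[dict] = deque()
    ok = process_with_dlq({"sensor": "aq-17"}, lambda r: "pm25" in r, dlq, max_retries=2)
    print(ok, list(dlq))
```

Capping the delay keeps a persistently failing record from stalling a worker for long periods; whether the real pipeline caps or adds jitter is not stated in the source.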
Storage Layer
  • 4-tier hybrid architecture: HOT → WARM → COLD → ARCHIVE
  • Cost-optimized storage: $405/month for 10TB
  • HOT tier: <100ms query SLA on PostgreSQL+PostGIS
  • WARM tier: <500ms query SLA on MinIO Parquet
  • 78% compression on Parquet export with Snappy
  • Fully automated lifecycle management (Apache Airflow)
  • High availability: RPO 15min, RTO 60min
  • 7-year retention compliance (FISMA, NIST 800-53)
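The age-based tiering described above (HOT 0-7 days, WARM 7-90, COLD 90-365, ARCHIVE 365+) amounts to a lookup on record age. The sketch below is illustrative only. The published ranges overlap at days 7, 90, and 365, so it assumes a record moves to the colder tier on the boundary day itself, and the function names are hypothetical.

```python
from datetime import date

# Upper bound (exclusive, in days) for each tier, warmest first.
# Assumption: a record exactly 7, 90, or 365 days old belongs to the colder tier.
TIER_BOUNDARIES = [(7, "HOT"), (90, "WARM"), (365, "COLD")]


def tier_for_age(age_days: int) -> str:
    """Return the storage tier for a record that is `age_days` old."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    for upper_bound, tier in TIER_BOUNDARIES:
        if age_days < upper_bound:
            return tier
    return "ARCHIVE"


def tier_for_record(created: date, today: date) -> str:
    """Convenience wrapper that derives the age from two dates."""
    return tier_for_age((today - created).days)


if __name__ == "__main__":
    for age in (0, 7, 90, 365):
        print(age, tier_for_age(age))
```

A lifecycle job could run a check like this once per day and move any record whose tier has changed since the previous run.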
Data Clearing House
  • 45+ REST API endpoints with FastAPI auto-documentation
  • 4 role-based dashboards: Fire Chief, Analyst, Scientist, Admin
  • 5-role RBAC with least privilege access control
  • OAuth2/OIDC authentication with MFA for elevated roles
  • Redis cache with configurable TTL
  • Export formats: CSV, JSON, GeoJSON, Parquet, Shapefile
  • High uptime with low API latency
  • Complete audit trail with FISMA compliance
  • Power BI, Esri ArcGIS, Tableau integration
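As a rough illustration of the 5-role RBAC with MFA for elevated roles, the check below denies any request from an Admin or Scientist session that has not completed MFA, then falls back to a per-role permission set. The five roles and the MFA requirement come from the architecture diagram; the permission strings and the `Session` type are hypothetical placeholders, not the service's actual policy.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    FIRE_CHIEF = "fire_chief"
    ANALYST = "analyst"
    SCIENTIST = "scientist"
    ADMIN = "admin"
    FIELD_RESPONDER = "field_responder"


# Roles that must complete MFA, per the diagram ("Required: Admin + Scientist").
MFA_REQUIRED = {Role.ADMIN, Role.SCIENTIST}

# Hypothetical least-privilege permission map, for illustration only.
PERMISSIONS = {
    Role.FIRE_CHIEF: {"read:fires", "read:resources"},
    Role.ANALYST: {"read:fires", "export:data"},
    Role.SCIENTIST: {"read:fires", "export:data", "train:models"},
    Role.ADMIN: {"read:fires", "manage:users", "read:audit"},
    Role.FIELD_RESPONDER: {"read:fires"},
}


@dataclass(frozen=True)
class Session:
    role: Role
    mfa_verified: bool = False


def is_allowed(session: Session, permission: str) -> bool:
    """Deny elevated roles without MFA, then check the role's permission set."""
    if session.role in MFA_REQUIRED and not session.mfa_verified:
        return False
    return permission in PERMISSIONS[session.role]


if __name__ == "__main__":
    print(is_allowed(Session(Role.ADMIN), "manage:users"))
    print(is_allowed(Session(Role.ADMIN, mfa_verified=True), "manage:users"))
```

Checking MFA before the permission lookup means an elevated session without a second factor is rejected uniformly, regardless of which endpoint it targets.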
Ensemble AI Engine
  • 4-model ensemble with confidence-adjusted weighted voting
  • CNN satellite + LSTM temporal + RF baseline + FireSat real-time
  • 20+ engineered features (FWI, NDVI, THI, drought index)
  • <100ms real-time inference pipeline
  • 7-day fire risk forecasting with temporal attention
  • 3D risk fields + fire perimeter prediction + evacuation zones
  • MLflow experiment tracking and model versioning
  • Confidence-adjusted prediction scoring
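One plausible reading of "confidence-adjusted weighted voting" is sketched below. The base weights are the ones shown in the architecture diagram (Random Forest 20%, LSTM 25%, CNN 20%, FireSat 35%). How confidence enters the vote is not specified in the source, so scaling each weight by the model's reported confidence and renormalizing is an assumption, as are the function and key names.

```python
from typing import Dict, Tuple

# Base ensemble weights as listed in the diagram.
WEIGHTS: Dict[str, float] = {
    "random_forest": 0.20,
    "lstm": 0.25,
    "cnn": 0.20,
    "firesat": 0.35,
}


def ensemble_risk(predictions: Dict[str, Tuple[float, float]]) -> float:
    """Combine per-model (risk_probability, confidence) pairs into one score.

    Assumed scheme: effective weight = base weight * confidence, and the
    result is the weighted mean of the risk probabilities.
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for model, (probability, confidence) in predictions.items():
        effective_weight = WEIGHTS[model] * confidence
        weighted_sum += effective_weight * probability
        total_weight += effective_weight
    if total_weight == 0.0:
        raise ValueError("no model reported a non-zero confidence")
    return weighted_sum / total_weight


if __name__ == "__main__":
    score = ensemble_risk({
        "random_forest": (0.60, 0.8),
        "lstm": (0.70, 0.9),
        "cnn": (0.55, 0.5),
        "firesat": (0.90, 1.0),
    })
    print(round(score, 3))
```

Under this scheme a model that reports low confidence contributes proportionally less, and a model with zero confidence drops out of the vote entirely.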

Key Differentiators

What Sets ClimaIQ Apart

  • Complete end-to-end implementation across all system layers
  • Production-ready features: DLQ, backpressure, Avro schemas, MFA
  • Meets and exceeds SLA targets with measurable results
  • Significant cost optimization while maintaining performance
  • Real-world data sources (not mock data)
  • Infrastructure as Code with Terraform
  • Comprehensive test coverage and code quality
  • 4-model ensemble AI with confidence-adjusted weighted voting
  • 7-day fire risk forecasting with LSTM temporal attention
  • Real-time satellite analysis via U-Net CNN (7-channel multi-spectral)