Canonical analytics-core package for OpenLMIS/OLMIS, consumed by the reporting-stack platform.
This package provides the Debezium connector config, dbt models/tests, and Superset assets for baseline OpenLMIS reporting. Adopters fork or extend it for their own deployments; the reference country extension lives in mw-reporting.
olmis-analytics-core/
manifest.yaml # Package metadata (name, type, includes)
connect/
openlmis-postgres-cdc.json # Debezium connector config template
dbt/
dbt_project.yml # dbt package config
models/
staging/
stg_facilities.sql # Current-state: referencedata.facilities
stg_programs.sql # Current-state: referencedata.programs
stg_geographic_zones.sql # Current-state: referencedata.geographic_zones
stg_requisitions.sql # Current-state: requisition.requisitions
schema.yml # Tests for all staging models
marts/
mart_facility_directory.sql # Facilities + geographic zone hierarchy
mart_requisition_summary.sql # Requisitions + facility/program/zone names
schema.yml # Tests for all mart models
superset/
assets/
metadata.yaml # Bundle metadata
databases/reporting_clickhouse.yaml # ClickHouse connection (no password)
datasets/reporting_clickhouse/
mart_requisition_summary.yaml # Dataset on the requisition mart
charts/requisitions_by_status.yaml # Pie chart: requisitions by status
dashboards/olmis_requisition_overview.yaml # Dashboard with the chart
Declares the package identity and what it provides:
name: olmis-analytics-core
type: core # core package — provides ingestion config
platform_version: ">=1.0.0"
includes:
- connect # Debezium connector config
- dbt # dbt models and tests
- superset # Superset dashboard assetsSee the platform's docs/architecture.md for the full schema.
connect/openlmis-postgres-cdc.json is a Debezium PostgreSQL connector template. Environment variables (${SOURCE_PG_HOST}, ${SOURCE_PG_PASSWORD}, ${DEBEZIUM_TOPIC_PREFIX}, etc.) are substituted at registration time via envsubst.
Key settings:
- JSON converters for ClickHouse compatibility
- Snapshot mode templated as
${DEBEZIUM_SNAPSHOT_MODE}(defaultwhen_needed— runs a fresh snapshot when stored offsets point to an LSN no longer in WAL, otherwise behaves likeinitial). Override tono_datafor the bootstrap initial-load workflow where data is preloaded viamake bootstrap-importand Debezium should only record the LSN baseline. - Source signal channel enabled (
signal.enabled.channels=source,signal.data.collection=public.debezium_signal) — Debezium reads signal rows for ad-hoc incremental snapshots triggered bymake snapshot-tables. - Heartbeat every 10 seconds to
public.reporting_heartbeat - Table allowlist from
${SOURCE_PG_TABLE_ALLOWLIST}(the signal table is auto-appended byregister-connector.shso users only manage data tables)
To customize for your system: copy this file, change the table allowlist and topic prefix, adjust decimal/time handling if needed.
Each staging model reconstructs current state from the append-only CDC event stream using a ranked CTE pattern:
- Partition by primary key (from JSON
afterpayload) - Order by
ts_ms desc, _ingested_at desc(latest event wins) - Filter out deletes (
op != 'd') - Extract typed columns with
JSONExtractString,JSONExtractBool,toUUID
| Model | Source table | Primary key |
|---|---|---|
stg_facilities |
referencedata.facilities |
id (UUID) |
stg_programs |
referencedata.programs |
id (UUID) |
stg_geographic_zones |
referencedata.geographic_zones |
id (UUID), with parent_id self-join |
stg_requisitions |
requisition.requisitions |
id (UUID), with FK to facilities and programs |
Marts join staging views into analytics-ready ClickHouse MergeTree tables:
| Model | Description | Key joins |
|---|---|---|
mart_facility_directory |
Facilities enriched with geographic zone hierarchy | stg_facilities → stg_geographic_zones (zone + parent zone) |
mart_requisition_summary |
Requisitions with facility, program, and zone names | stg_requisitions → stg_facilities → stg_geographic_zones + stg_programs |
All models have tests in their respective schema.yml:
- Integrity:
not_null+uniqueon primary keys - Relationships: foreign keys validated (e.g.,
facility_id→stg_facilities.id) - Accepted values: enumeration fields (e.g., requisition status)
The superset/assets/ directory contains a complete asset bundle:
- Database: ClickHouse connection to the
curatedschema (password omitted per secrets policy — patched at import time from env vars) - Dataset:
mart_requisition_summarywith typed columns and aCOUNT(*)metric - Chart: "Requisitions by Status" pie chart
- Dashboard: "OLMIS Requisition Overview" containing the chart
Assets use stable UUIDs for idempotent re-imports. Import with make superset-import.
To create a core package for a different adopter system:
- Copy this directory as a starting point
- Update the connector config: change the table allowlist, topic prefix, and database connection settings
- Replace the staging models: one per source table, using the ranked CTE pattern for current-state reconstruction
- Design your marts: join staging models into the tables your dashboards need
- Add tests:
not_null/uniqueon PKs,relationshipson FKs,accepted_valueson enums - Create Superset assets: author charts/dashboards in the UI, export as YAML, commit to
superset/assets/ - Set
ANALYTICS_CORE_PATHin.envto point to your package
For detailed guidance on each step, see the platform's docs/usage-guide.md.
Clone this repository, then point the platform's .env at wherever you cloned it:
# absolute path or relative to the platform repo root
ANALYTICS_CORE_PATH=/path/to/openlmis-reporting