A Data-Metadata Framework for Water Treatment Plants
Acquirium is a framework for storing, managing, querying, and integrating data and metadata for water treatment systems. It combines knowledge graphs and time series data to support analysis, monitoring, and experimentation.
From PyPI:
pip install acquirium
Optional extras for specific drivers:
pip install "acquirium[mqtt]" # MQTT ingestion driver
pip install "acquirium[xlsx]" # Excel ingestion driver
pip install "acquirium[watertap]" # WaterTAP simulation driver
Or with uv:
uv pip install acquirium
For development from a clone:
git clone https://github.com/DataDrivenCPS/acquirium.git
cd acquirium
python -m venv .venv && source .venv/bin/activate
pip install -e .
# or: uv sync
Acquirium ships a single CLI entry point. Start the server and any configured drivers with:
acquirium server --config acquirium.toml
A sample acquirium.toml is included at the repository root. Key sections:
- [server]: bind host/port, choice of timeseries backend (DuckDB or TimescaleDB), data directory.
- [driver]: connection defaults applied to all drivers (server URL, port, tick interval).
- [[drivers]]: drivers to start alongside the server.
By default the server stores data on local disk — an embedded Oxigraph RDF store and a single DuckDB file under data_dir. No external services are required for a fresh install. For multi-worker or production deployments, switch the config to timeseries_backend = "timescale" and point pg_dsn at a Postgres + TimescaleDB instance.
Override the bind host/port from the CLI if needed:
acquirium server --config acquirium.toml --host 127.0.0.1 --port 8000
acquirium server --config acquirium.toml --reload # uvicorn auto-reload
To run only [[drivers]] against a remote Acquirium server (no FastAPI on this host), set:
[server]
enabled = false
and configure [driver].server_url / server_port to point at the remote instance. Then:
acquirium server --config acquirium.toml
When enabled = false, the server subcommand starts only the drivers.
A compose.yaml is provided for an all-in-one local stack (Acquirium + TimescaleDB + Grafana):
make up # start
make up ACQUIRIUM_RECREATE=true # wipe data + start
make down # stop
By default each Docker run resets the system. To preserve data across runs, set ACQUIRIUM_RECREATE=false in compose.yaml.
The watertap extra installs the Python packages needed for the built-in WaterTAP driver:
pip install "acquirium[watertap]"
acquirium server --config acquirium.toml # with a [[drivers]] entry for WaterTAP
Some WaterTAP setups also require native extensions that are not installed by the extra:
pyomo download-extensions
python -m pip install setuptools && pyomo build-extensions
idaes get-extensions
For a full demo (WaterTAP + streaming simulator + API examples):
make watertap-up
uv run scripts/api_example.py
# or open notebooks/watertap-single-pump.ipynb
make watertap-down
Acquirium supports user logs attached to entities in the system. See scripts/logging_example.py:
acquirium server --config acquirium.toml &
python scripts/logging_example.py
Acquirium uses a text matcher to map natural-language input to ontology URIs (classes, predicates, units, quantity kinds). The match algorithm uses semantic embedding similarity powered by FastEmbed (default model: BAAI/bge-small-en-v1.5). Each ontology concept is represented by one or more surface strings, embedded and stored in an in-memory vector index. At query time the input phrase is embedded and compared against the index using cosine similarity.
There are two separate matchers, each with its own index:
- Graph matcher: indexes classes and predicates from user-inserted RDF graphs. Surface strings are derived from rdfs:label values and CamelCase/underscore-split local names.
- QUDT matcher: indexes units and quantity kinds from the QUDT ontology, which ships bundled inside the acquirium package and is registered at the versionless canonical IRIs https://qudt.org/vocab/unit and https://qudt.org/vocab/quantitykind. Override either by adding a { source = "...", as = "<canonical IRI>" } entry to [ontologies] sources in acquirium.toml. Surface strings include rdfs:label, skos:prefLabel, skos:altLabel, symbols, UCUM codes, and split local names.
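To make "split local names" concrete, here is a rough sketch of how a surface string could be derived from a URI's local name and combined with its labels. The function names `split_local_name` and `surface_strings`, the lowercasing step, and the example.org URIs are all assumptions for illustration; Acquirium's actual splitting rules may differ.

```python
import re


def split_local_name(uri: str) -> str:
    """Turn the last URI segment into a space-separated, lowercase phrase."""
    # Take whatever follows the final '#' or '/'.
    local = re.split(r"[#/]", uri.rstrip("#/"))[-1]
    # Underscores become word breaks.
    local = local.replace("_", " ")
    # CamelCase boundary: lowercase or digit followed by uppercase.
    local = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", local)
    # Acronym boundary: "ROMembrane" -> "RO Membrane".
    local = re.sub(r"(?<=[A-Z])(?=[A-Z][a-z])", " ", local)
    # Lowercasing is an assumption made here to simplify deduplication.
    return " ".join(local.lower().split())


def surface_strings(uri: str, labels: tuple[str, ...] = ()) -> list[str]:
    """Combine label values with the split local name, dropping duplicates."""
    seen: list[str] = []
    for text in [*labels, split_local_name(uri)]:
        normalized = " ".join(text.lower().split())
        if normalized and normalized not in seen:
            seen.append(normalized)
    return seen


if __name__ == "__main__":
    # Hypothetical URIs, not taken from any shipped ontology.
    print(surface_strings("http://example.org/ont#CentrifugalPump", ("Centrifugal Pump",)))
    print(surface_strings("http://example.org/ont/has_flow_rate"))
```

Indexing several surface strings per concept lets a query such as "flow rate" reach a predicate even when its label is worded differently from its local name.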
Both indexes are cached to disk and updated incrementally when graphs change. Results can be filtered by kind (class, predicate, unit, quantity_kind) and are ranked by cosine similarity, deduplicated to the highest-scoring surface per URI. See scripts/text_matcher_example.py for usage.
pytest tests/unit # unit tests only
make test # full suite (Docker required)
Acquirium is under active development. Planned work is tracked in improvements.md. Bug reports and feature requests are welcome; please open an issue.