Ingestion module for foundational DP #276
/gemini review
Code Review
This pull request introduces the cdf_ingestion_foundation module, which provides a framework for orchestrating two-phase ingestion workflows (population and contextualization) across PI, OPC-UA, and SAP source systems. It includes a Python generator script to build workflow versions from task snippets, various SQL transformations, and authorization group definitions. Review feedback primarily addresses Python style guide violations in the build script—such as import sorting, the need for typed data structures (dataclasses/Pydantic), and proper logging—as well as security recommendations to restrict overly broad wildcard scopes in the authorization group capabilities.
BergsethCognite
left a comment
Should we remove the Transformation examples here - maybe just use this module to set up the generic auth groups?
depends on PR #275
This module owns the ingestion workflow, transformation definitions, auth groups, and data model configuration tooling for the Foundation Deployment Pack. It lives under `modules/common/` and is registered in `packages.toml` as part of `dp:foundation`.

What's included
Two-phase workflow driven entirely by config flags — no YAML editing required when toggling a source on or off:
Which phases and tasks are included is controlled by enabledSources, enabledContextualization, and dataModelVariant in default.config.yaml.
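As a rough illustration of flag-driven task selection, the sketch below maps config values to the tasks of each phase. It is not the module's code: the task names, the `IngestionConfig` dataclass, `select_tasks`, and the assumption that `enabledSources` and `enabledContextualization` are lists of strings are all hypothetical. The effect of `dataModelVariant` is not described here, so the sketch carries the value without acting on it.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical task names; the real per-task snippets may be named differently.
POPULATION_TASKS = {
    "pi": "populate_pi",
    "opcua": "populate_opcua",
    "sap": "populate_sap",
}
CONTEXTUALIZATION_TASKS = {
    "entity_matching": "ctx_entity_matching",
    "file_annotation": "ctx_file_annotation",
}


@dataclass(frozen=True)
class IngestionConfig:
    """Assumed shape of the flags read from default.config.yaml."""

    enabled_sources: list[str] = field(default_factory=list)
    enabled_contextualization: list[str] = field(default_factory=list)
    # Carried along only; how the variant changes the task set is not modeled.
    data_model_variant: Optional[str] = None


def select_tasks(config: IngestionConfig) -> dict[str, list[str]]:
    """Return the tasks for each phase, keeping the order given in the config."""
    population = [
        POPULATION_TASKS[source]
        for source in config.enabled_sources
        if source in POPULATION_TASKS
    ]
    contextualization = [
        CONTEXTUALIZATION_TASKS[name]
        for name in config.enabled_contextualization
        if name in CONTEXTUALIZATION_TASKS
    ]
    return {"population": population, "contextualization": contextualization}
```

Under these assumptions, turning a source off means removing it from `enabled_sources`; the task list shrinks accordingly and no workflow YAML is edited by hand.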
Key files
- `scripts/build_workflow.py`: generates `wf_ingestion_v1.WorkflowVersion.yaml` from per-task snippets based on the active config. Run with `--check` in CI to detect drift.
- `scripts/configure_datamodel.py`: writes DM-variant variable overrides (schemaSpace, view names, instanceSpace) into all discovered `config.<env>.yaml` files, covering both contextualization modules (`cdf_entity_matching`, `cdf_file_annotation`) and source system modules (`cdf_pi_foundation`, `cdf_sap_foundation`, `cdf_opcua_foundation`, `cdf_files_foundation`).