Ingestion module for foundational DP #276

Open
Aashutosh-cognite wants to merge 6 commits into foundational-dp-cleanup from foundational-dp-ingestion

@Aashutosh-cognite Aashutosh-cognite commented May 20, 2026

depends on PR #275

This module owns the ingestion workflow, transformation definitions, auth groups, and data model configuration tooling for the Foundation Deployment Pack. It lives under modules/common/ and is registered in packages.toml as part of dp:foundation.

What's included

Two-phase workflow driven entirely by config flags — no YAML editing required when toggling a source on or off:

  • Phase 1 (Population) — transformation tasks for PI, OPC-UA, and SAP run in parallel, landing data into the active DM views (ISATimeSeries, ISAAsset, Equipment, WorkOrder, Operation for ISA; FunctionalLocation, TimeSeriesData, Files for CFIHOS).
  • Phase 2 (Contextualization) — relationship transforms run after population completes, setting Equipment.asset and Operation.workOrder properties.

Which phases and tasks are included is controlled by enabledSources, enabledContextualization, and dataModelVariant in default.config.yaml.
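To make the flag-driven selection concrete, here is a minimal sketch of how the three flags could map to a task plan. It assumes `enabledSources` is a list of source keys and `enabledContextualization` is a boolean; the actual shapes in default.config.yaml may differ. The view names come from the description above, while every task id and the `select_tasks` helper are hypothetical placeholders, not the module's real identifiers.

```python
from __future__ import annotations

# View names are taken from the PR description.
VIEWS_BY_VARIANT: dict[str, list[str]] = {
    "ISA": ["ISATimeSeries", "ISAAsset", "Equipment", "WorkOrder", "Operation"],
    "CFIHOS": ["FunctionalLocation", "TimeSeriesData", "Files"],
}

# Hypothetical task ids, one group per source system (assumption).
POPULATION_TASKS: dict[str, list[str]] = {
    "pi": ["tr_pi_timeseries"],
    "opcua": ["tr_opcua_timeseries"],
    "sap": ["tr_sap_assets", "tr_sap_equipment"],
}

# Hypothetical ids for the two relationship transforms (assumption).
CONTEXTUALIZATION_TASKS: list[str] = [
    "tr_equipment_to_asset",
    "tr_operation_to_workorder",
]


def select_tasks(config: dict) -> dict[str, list[str]]:
    """Return the views and task ids implied by the three config flags."""
    variant = config.get("dataModelVariant", "ISA")
    if variant not in VIEWS_BY_VARIANT:
        raise ValueError(f"Unknown dataModelVariant: {variant!r}")

    # Phase 1: keep only tasks whose source is enabled, in config order.
    population = [
        task
        for source in config.get("enabledSources", [])
        for task in POPULATION_TASKS.get(source, [])
    ]

    # Phase 2: included only when the flag is set; it would depend on phase 1.
    contextualization = (
        list(CONTEXTUALIZATION_TASKS)
        if config.get("enabledContextualization", False)
        else []
    )

    return {
        "views": VIEWS_BY_VARIANT[variant],
        "population": population,
        "contextualization": contextualization,
    }
```

With this shape, turning a source off is a one-line change to `enabledSources`, and the generated workflow simply omits that source's tasks.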

Key files

  • scripts/build_workflow.py — generates wf_ingestion_v1.WorkflowVersion.yaml from per-task snippets based on the active config. Run with --check in CI to detect drift.
  • scripts/configure_datamodel.py — writes DM-variant variable overrides (schemaSpace, view names, instanceSpace) into all discovered config.<env>.yaml files, covering both contextualization modules (cdf_entity_matching, cdf_file_annotation) and source system modules (cdf_pi_foundation, cdf_sap_foundation, cdf_opcua_foundation, cdf_files_foundation).
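For reviewers unfamiliar with the `--check` pattern, the sketch below shows one common way a generator can detect drift: render the file in memory, compare it with what is committed, and return a non-zero exit code on mismatch instead of writing. This is not the code in scripts/build_workflow.py; `render_workflow`, `check_or_write`, and the `--output` flag are hypothetical names used only for illustration.

```python
from __future__ import annotations

import argparse
from pathlib import Path


def render_workflow(snippets: list[str]) -> str:
    """Join per-task YAML snippets into one document (stand-in for the real renderer)."""
    return "tasks:\n" + "".join(snippets)


def check_or_write(target: Path, rendered: str, check: bool) -> int:
    """Return 0 if the file is up to date or was written, 1 if --check finds drift."""
    current = target.read_text(encoding="utf-8") if target.exists() else None
    if check:
        # Check mode never writes; a missing or different file counts as drift.
        return 0 if current == rendered else 1
    if current != rendered:
        target.write_text(rendered, encoding="utf-8")
    return 0


def main(argv: list[str] | None = None) -> int:
    parser = argparse.ArgumentParser(description="Generate the ingestion workflow file.")
    parser.add_argument("--check", action="store_true", help="fail if the file is stale")
    # --output is a hypothetical flag, added here so the sketch is easy to try out.
    parser.add_argument(
        "--output", type=Path, default=Path("wf_ingestion_v1.WorkflowVersion.yaml")
    )
    args = parser.parse_args(argv)
    rendered = render_workflow(["  - externalId: example_task\n"])
    return check_or_write(args.output, rendered, args.check)
```

In CI, the job would call the script with `--check` and pass the return value of `main()` to `sys.exit`, so a stale generated file fails the build.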

@Aashutosh-cognite

/gemini review

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces the cdf_ingestion_foundation module, which provides a framework for orchestrating two-phase ingestion workflows (population and contextualization) across PI, OPC-UA, and SAP source systems. It includes a Python generator script to build workflow versions from task snippets, various SQL transformations, and authorization group definitions. Review feedback primarily addresses Python style guide violations in the build script—such as import sorting, the need for typed data structures (dataclasses/Pydantic), and proper logging—as well as security recommendations to restrict overly broad wildcard scopes in the authorization group capabilities.
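As an illustration of the typed-data-structure and logging feedback, a frozen dataclass could replace raw dict access for the three config flags. This is only a sketch of the pattern, not the reviewed code: `IngestionConfig`, `from_mapping`, and the logger name are hypothetical, and the lowercase source keys are an assumption about how `enabledSources` is written.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass

# Logger name is an assumption; the point is logging calls rather than print().
logger = logging.getLogger("build_workflow")

KNOWN_SOURCES = frozenset({"pi", "opcua", "sap"})  # assumed key spelling
KNOWN_VARIANTS = frozenset({"ISA", "CFIHOS"})


@dataclass(frozen=True)
class IngestionConfig:
    """Typed view of the three flags that drive workflow generation."""

    enabled_sources: tuple[str, ...] = ()
    enabled_contextualization: bool = False
    data_model_variant: str = "ISA"

    @classmethod
    def from_mapping(cls, raw: dict) -> IngestionConfig:
        """Validate a parsed config mapping and convert it to a typed object."""
        sources = tuple(raw.get("enabledSources", ()))
        unknown = sorted(set(sources) - KNOWN_SOURCES)
        if unknown:
            raise ValueError(f"Unknown sources in enabledSources: {unknown}")

        variant = raw.get("dataModelVariant", "ISA")
        if variant not in KNOWN_VARIANTS:
            raise ValueError(f"Unknown dataModelVariant: {variant!r}")

        config = cls(
            enabled_sources=sources,
            enabled_contextualization=bool(raw.get("enabledContextualization", False)),
            data_model_variant=variant,
        )
        logger.info("Loaded ingestion config: %s", config)
        return config
```

Validating once at load time means a misspelled source fails immediately with a clear message instead of silently producing a workflow with missing tasks.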

Comment thread modules/common/cdf_ingestion_foundation/scripts/build_workflow.py
Comment thread modules/common/cdf_ingestion_foundation/scripts/build_workflow.py Outdated
Comment thread modules/common/cdf_ingestion_foundation/scripts/build_workflow.py Outdated
Comment thread modules/common/cdf_ingestion_foundation/scripts/build_workflow.py
@Aashutosh-cognite Aashutosh-cognite force-pushed the foundational-dp-cleanup branch from 7e0e65e to 7425757 Compare May 21, 2026 05:07
@Aashutosh-cognite Aashutosh-cognite force-pushed the foundational-dp-ingestion branch from 0347e26 to c6ed831 Compare May 21, 2026 05:08
@Aashutosh-cognite Aashutosh-cognite force-pushed the foundational-dp-cleanup branch from 7425757 to a14c818 Compare May 22, 2026 03:08
@Aashutosh-cognite Aashutosh-cognite requested a review from a team as a code owner May 22, 2026 03:08

@BergsethCognite BergsethCognite left a comment

Should we remove the Transformation examples here - maybe just use this module to set up the generic auth groups?
