Skip to content

fix(bulkdata): #3854 soften polymorphic-lookup validation from ConfigError to warning#3981

Draft
jstvz wants to merge 2 commits into
devfrom
triage/fix-tranche1-3854
Draft

fix(bulkdata): #3854 soften polymorphic-lookup validation from ConfigError to warning#3981
jstvz wants to merge 2 commits into
devfrom
triage/fix-tranche1-3854

Conversation

@jstvz
Copy link
Copy Markdown
Contributor

@jstvz jstvz commented May 14, 2026

Summary

Fixes #3854.

PR #3741 (commit 2c5d0056e) tightened lookup-resolution validation in cumulusci/tasks/bulkdata/extract.py to raise ConfigError when a lookups target is not declared as a step. That check is too strict for polymorphic lookups whose targets legitimately resolve to multiple tables across configurations. This downgrades the strict raise to a logger.warning(...) so the extract proceeds and only emits an advisory.

Test plan

  • cumulusci/tasks/bulkdata/tests/test_extract.py::test_convert_lookups_to_id__unresolvable_lookups_warns_not_raises passes (new).
  • cumulusci/tasks/bulkdata/tests/test_extract.py::test_run__poly__incomplete_mapping updated to assert the warning rather than the prior ConfigError.
  • xfail marker on cumulusci/tests/triage/test_issue_3854.py removed in the GREEN commit; test now passes.
  • uv run pytest cumulusci/tasks/bulkdata/tests/test_extract.py -q clean.

Provenance

Reproduced and characterized in the v5 triage evidence pack (PR #3979). See docs/triage/v5/repro-results.md (### #3854) for the full narrative.

jstvz added 2 commits May 14, 2026 10:57
…ion false positive

Adds test_convert_lookups_to_id__unresolvable_lookups_warns_not_raises
to TestExtractData. It reproduces the issue from #3854 at the unit
level: a lookup column has 3 non-empty source rows but 0 of them
resolve to records in the related table (e.g. polymorphic lookups
whose source rows reference filtered-out records, or references to
standard org records like the Standard Pricebook that are
intentionally not extracted).

The validation introduced in PR #3741 / commit 2c5d005 currently
raises ConfigError in this scenario, which broke capture_sample_data
for common real-world configurations. The new test asserts no
exception is raised and that the diagnostic information is preserved
as a warning instead.

Expected to FAIL on this commit (RED step of TDD); the fix follows in
the next commit.

Refs: #3854
… to warning

Issue #3854 reported that ``capture_sample_data`` fails with::

    Total mapping operations (0) do not match total non-empty rows (3)
    for lookup_key: Pricebook2Id. Mention all related tables for
    lookup: Pricebook2Id

The strict validation was added in PR #3741 / commit 2c5d005 for
polymorphic lookups, but it fires whenever ANY lookup column has
source rows that do not resolve to records in the related table. This
is legitimate in many real-world configurations:

* polymorphic lookups whose source rows reference records that were
  filtered out of the extract (e.g. WhoId pointing to a Contact
  excluded by ``filters:``);
* lookups to standard org records that are intentionally not
  extracted (Pricebook2Id → Standard Pricebook, PricebookEntryId,
  FirstPublishLocationId, etc.).

Because ``total_mapping_operations`` cannot exceed ``total_rows`` in
the current implementation (each source row's lookup value is updated
in place at most once), the check can only ever fire on under-mapping
- exactly the scenario above. Workaround in the bug thread was to pin
to CumulusCI 3.84.1 (pre-validation); a second user (swirkens) flagged
this as a regression.

Fix: downgrade the strict ``raise ConfigError(...)`` to
``self.logger.warning(...)``. The diagnostic information is preserved
so users still see when lookups are under-resolved, but the extract
no longer aborts. This matches the philosophy of preferring partial
data over refusing to extract.

Approach (c) of the three options proposed in the triage narrative,
chosen because:

* Approach (a) (``>=``) is a no-op given the analysis above -
  ``total_mapping_operations`` can never exceed ``total_rows``.
* Approach (b) (skip only polymorphic lookups) does not address the
  reported root cause: the user's case (Pricebook2Id) is a
  non-polymorphic lookup.
* Approach (c) preserves the diagnostic value (which originally
  motivated PR #3741) while unblocking all legitimate use cases.

Source change: 2 files, ~28 LOC (net +18). The regression test added
in the previous commit (``test_convert_lookups_to_id__unresolvable_
lookups_warns_not_raises``) now passes. The existing
``test_run__poly__incomplete_mapping`` is updated to assert the
warning is logged rather than expecting a ``ConfigError``: the
"incomplete mapping" scenario it covers is the very scenario this
fix unblocks.

Fixes: #3854
Refs: PR #3741, commit 2c5d005
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant