OSIDB-4923 - Add helper function to pull cpe list by darakian · Pull Request #1253 · RedHatProductSecurity/osidb

darakian · 2026-04-23T21:25:39Z

Adds a helper function to extract NVD CPE data and adds it to the list of data returned. Partial implementation for OSIDB-4923.

I didn't see where the function response2result is being tested, so I didn't add a test, but happy to do so if that can be pointed out.

This PR does not store the data elsewhere. The eventual location will be one of the new fields discussed in OSIDB-4923

coderabbitai · 2026-04-23T21:25:51Z

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: e19c05c2-280d-4c8e-8ec6-3f1aaefcfe09

📥 Commits

Reviewing files that changed from the base of the PR and between a36d034 and 9169a82.

📒 Files selected for processing (1)

collectors/nvd/tests/test_collectors.py

📝 Walkthrough

Adds extraction of CPE "criteria" values from NVD CVE records into per-CVE results under a new nvd_cpes field, and adds a VCR-backed integration test for CPE loading (the test currently contains an unconditional failing assertion).

Changes

Cohort / File(s)	Summary
NVD CPE Extraction `collectors/nvd/collectors.py`	Adds local `get_cpe_list` helper to iterate `data.cpe` when present and return an empty list otherwise; `response2result` now includes a new `"nvd_cpes"` key containing the extracted CPE criteria list for each CVE.
Tests — NVD integration `collectors/nvd/tests/test_collectors.py`	Adds a VCR-backed test `test_cpe_load` that sets up a `Flaw` for `CVE-2020-1234`, runs NVD collection, checks for a `cpe` attribute on the resulting model, but ends with an unconditional `assert False` (test forced to fail).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately describes the main change: adding a helper function to extract and include CPE data from NVD responses.
Description check	✅ Passed	The description is directly related to the changeset, explaining the purpose of the helper function and its integration into the NVD data extraction process.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.


coderabbitai

🧹 Nitpick comments (1)

collectors/nvd/collectors.py (1)

68-80: Consider simplifying and deduplicating CPEs.

Two optional improvements:

The helper can be collapsed into a comprehension; the outer len(data.cpe) > 0 guard is redundant with the final emptiness check.
A single CVE can reference the same criteria across multiple configurations/nodes, producing duplicates. Unless downstream consumers rely on multiplicity, deduping here avoids surprises once this is persisted per OSIDB-4923.

♻️ Proposed simplification

-        def get_cpe_list(data: CVE) -> Optional[List[str]]:
-            """
-            Return a list of CPEs from the CVE `data`
-            """
-            cpe_list = []
-            if "cpe" in data and len(data.cpe) > 0:
-                for entry in data.cpe:
-                    cpe_list.append(entry.criteria)
-
-            if len(cpe_list) > 0:
-                return cpe_list
-            else:
-                return None
+        def get_cpe_list(data: CVE) -> Optional[List[str]]:
+            """
+            Return a deduplicated list of CPE criteria strings from the CVE `data`.
+            """
+            if "cpe" not in data:
+                return None
+            cpe_list = list({entry.criteria for entry in data.cpe})
+            return cpe_list or None
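
One caveat about the proposed set comprehension: Python sets are unordered, so it deduplicates but does not keep the criteria in first-seen order. If order matters to downstream consumers, `dict.fromkeys` (also mentioned in the agent prompt below) is one order-preserving option. The following is a minimal sketch, not the PR's code; `Entry` and `dedupe_criteria` are hypothetical names, and `Entry` only models the `criteria` attribute used in the diff.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Entry:
    """Hypothetical stand-in for one CPE entry; only `criteria` is modeled."""

    criteria: str


def dedupe_criteria(entries: Iterable[Entry]) -> List[str]:
    """Return unique criteria strings in first-seen order."""
    # dict keys are unique and keep insertion order (Python 3.7+).
    return list(dict.fromkeys(entry.criteria for entry in entries))


entries = [
    Entry("cpe:2.3:a:example:app:1.0:*:*:*:*:*:*:*"),
    Entry("cpe:2.3:a:example:lib:2.0:*:*:*:*:*:*:*"),
    Entry("cpe:2.3:a:example:app:1.0:*:*:*:*:*:*:*"),  # duplicate of the first
]
unique = dedupe_criteria(entries)
```

An empty input yields an empty list, so the same helper works whether or not a CVE has any CPE entries.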

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed.

In `@collectors/nvd/collectors.py` around lines 68 - 80, The get_cpe_list helper
should be simplified and deduplicated: replace the manual loop and redundant
len(data.cpe) guard in get_cpe_list with a comprehension that extracts
entry.criteria for each entry in data.cpe, then deduplicate (e.g., via an
ordered set or dict.fromkeys) and return the deduped list or None if empty;
update references to data.cpe and entry.criteria accordingly so callers still
receive Optional[List[str]].

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 5c96f460-e084-4870-8c77-bd1f571c6b8e

📥 Commits

Reviewing files that changed from the base of the PR and between a71cec7 and 7c1734f.

📒 Files selected for processing (1)

collectors/nvd/collectors.py

Jincxz

Some comments regarding the helper function.

There aren't tests directly for response2result but there are tests for the NVD collector at https://github.com/RedHatProductSecurity/osidb/blob/master/collectors/nvd/tests/test_collectors.py

You can add tests for the NVDQuerier, and response2result, there if you would like. Testing the collectors and the data being written to flaw(s) will have to come after the field is added.

Jincxz · 2026-04-23T22:37:15Z

+            Return a list of CPEs from the CVE `data`
+            """
+            cpe_list = []
+            if "cpe" in data and len(data.cpe) > 0:


This can be reduced to hasattr(data, "cpe") since we are checking if the field exists. An empty data.cpe will just mean that the following loop wouldn't run.

Oh good call. Changed that in fa056a0
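
To illustrate the point: iterating over an empty sequence is a no-op, so an existence check alone is enough. This is a rough sketch rather than the committed code; `SimpleNamespace` is a hypothetical stand-in for the CVE object, and whether the real object exposes `cpe` this way depends on the NVD client library.

```python
from types import SimpleNamespace
from typing import List


def get_cpe_list(data) -> List[str]:
    """Collect `criteria` values from `data.cpe` when the attribute exists."""
    cpe_list = []
    if hasattr(data, "cpe"):
        # An empty data.cpe simply means the loop body never runs.
        for entry in data.cpe:
            cpe_list.append(entry.criteria)
    return cpe_list


# Hypothetical stand-ins for CVE objects, not real NVD records.
with_cpes = SimpleNamespace(
    cpe=[SimpleNamespace(criteria="cpe:2.3:a:example:app:1.0:*:*:*:*:*:*:*")]
)
empty_cpes = SimpleNamespace(cpe=[])
no_cpe_attr = SimpleNamespace()
```

All three inputs go through the same code path without a separate length check.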

Jincxz · 2026-04-23T22:38:45Z

+            if len(cpe_list) > 0:
+                return cpe_list
+            else:
+                return None


I think it should be fine with just returning the cpe_list as is (potentially an empty list). The new field will likely have a default of an empty list and a None value might produce an unexpected result when writing to it. Will make adjustments later if that is no longer the case.

Fair enough. Updated in bb2d409
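
A small sketch of why an empty list is easier on callers than `None`. The eventual field and write path are not part of this PR, so `append_new` is a purely hypothetical consumer used for illustration.

```python
from typing import List


def append_new(stored: List[str], collected: List[str]) -> List[str]:
    """Hypothetical write step: add collected values not already stored."""
    return stored + [value for value in collected if value not in stored]


stored = ["cpe:2.3:a:example:app:1.0:*:*:*:*:*:*:*"]

# An empty list needs no special handling by the caller.
unchanged = append_new(stored, [])

# None would have to be guarded at every call site, since it is not iterable.
try:
    append_new(stored, None)  # type: ignore[arg-type]
    none_failed = False
except TypeError:
    none_failed = True
```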

Elkasitu · 2026-04-27T09:21:05Z

+            Return a list of CPEs from the CVE `data`
+            """
+            cpe_list = []
+            if "cpe" in data and len(data.cpe) > 0:


cpe doesn't seem to be a valid top-level attribute per NVD's Vulnerability API schema, I would add a test or two as Jin suggested.
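
For context, in the NVD CVE API 2.0 JSON the CPE match strings are nested under `configurations`, then `nodes`, then `cpeMatch`, with the string itself in `criteria`. Whether the client object used by the collector also exposes a flattened `cpe` attribute is not established in this thread. The sketch below walks the raw structure; the sample dict is hand-written and trimmed, not real NVD data, and `criteria_from_raw` is a hypothetical helper name.

```python
from typing import Any, Dict, List

# Hand-written, trimmed sample in the shape of an NVD API 2.0 `cve` object.
sample_cve: Dict[str, Any] = {
    "id": "CVE-0000-0000",  # placeholder ID
    "configurations": [
        {
            "nodes": [
                {
                    "operator": "OR",
                    "negate": False,
                    "cpeMatch": [
                        {
                            "vulnerable": True,
                            "criteria": "cpe:2.3:a:example:app:1.0:*:*:*:*:*:*:*",
                        },
                        {
                            "vulnerable": True,
                            "criteria": "cpe:2.3:a:example:app:1.1:*:*:*:*:*:*:*",
                        },
                    ],
                }
            ]
        }
    ],
}


def criteria_from_raw(cve: Dict[str, Any]) -> List[str]:
    """Collect every cpeMatch `criteria` string from a raw CVE dict."""
    found: List[str] = []
    for config in cve.get("configurations", []):
        for node in config.get("nodes", []):
            for match in node.get("cpeMatch", []):
                criteria = match.get("criteria")
                if criteria:
                    found.append(criteria)
    return found


criteria = criteria_from_raw(sample_cve)
```

A test against a recorded response would confirm which of the two shapes the collector actually receives.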

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

collectors/nvd/tests/test_collectors.py (1)
239-273: ⚠️ Potential issue | 🔴 Critical

Test is broken in multiple critical ways and also breaks the next test.

This new test is non-functional, and its placement also breaks the test that previously consumed the parametrize decorator:

Misplaced @pytest.mark.parametrize (lines 239–259): This decorator was originally attached to test_reset_flag_on_removal and parametrizes old_flag, new_flag. By inserting test_cpe_load between the decorator and test_reset_flag_on_removal, the parametrize now applies to test_cpe_load(self) (which doesn’t accept those params → pytest collection error) while test_reset_flag_on_removal loses its parametrization (its old_flag, new_flag parameters become unresolved fixtures and the test fails). Move the new test below test_reset_flag_on_removal, or above the parametrize block.

assert False (line 273): unconditional failure — must be removed before merge.

assert hasattr(flaw, "cpe") (line 272): the change in collectors/nvd/collectors.py only adds nvd_cpes to a transient dict in response2result; nothing in this PR persists CPEs onto the Flaw model, and per the codebase the only cpe field lives on PSModule, not Flaw. As written, this assertion will always fail. Either assert against the collector’s returned dict (e.g., nvdc.get_cve(cve_id) and inspect nvd_cpes), or defer the test until the storage field on Flaw from OSIDB-4923 lands.

Missing @pytest.mark.enable_signals: every other VCR-backed collect-driven test in this class uses it; without it, behavior may diverge from the other integration tests.
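
The decorator placement issue can be reproduced without pytest. In this stdlib-only sketch, `parametrize` is a hypothetical stand-in that just records the argument names on the function it decorates; it is not `pytest.mark.parametrize`, but it attaches to the next `def` in the same way.

```python
def parametrize(argnames):
    """Hypothetical stand-in for pytest.mark.parametrize: records arg names."""

    def decorator(func):
        func.params = argnames
        return func

    return decorator


class BrokenLayout:
    @parametrize("old_flag,new_flag")
    def test_cpe_load(self):  # inserted under the decorator, so it takes it
        pass

    def test_reset_flag_on_removal(self, old_flag, new_flag):  # now undecorated
        pass


class FixedLayout:
    def test_cpe_load(self):
        pass

    @parametrize("old_flag,new_flag")
    def test_reset_flag_on_removal(self, old_flag, new_flag):
        pass
```

In `BrokenLayout` the parameters end up on the wrong test; in `FixedLayout` the decorator sits directly above the function that accepts `old_flag` and `new_flag`.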
🔧 Suggested restructuring
-    @pytest.mark.parametrize(
-        "old_flag,new_flag",
-        [
-            (
-                Flaw.FlawNistCvssValidation.APPROVED,
-                Flaw.FlawNistCvssValidation.APPROVED,
-            ),
-            (
-                Flaw.FlawNistCvssValidation.REJECTED,
-                Flaw.FlawNistCvssValidation.APPROVED,
-            ),
-            (
-                Flaw.FlawNistCvssValidation.REQUESTED,
-                Flaw.FlawNistCvssValidation.APPROVED,
-            ),
-            (
-                Flaw.FlawNistCvssValidation.NOVALUE,
-                Flaw.FlawNistCvssValidation.NOVALUE,
-            ),
-        ],
-    )
-    @pytest.mark.vcr
-    def test_cpe_load(self):
-        """
-        Test that CPE values are correctly loaded in the Flaw model.
-        """
-        cve_id = "CVE-2020-1234"
-        FlawFactory(cve_id=cve_id)
-
-        nvdc = NVDCollector()
-        nvdc.collect(cve_id)
-
-        flaw = Flaw.objects.get(cve_id=cve_id)
-        assert hasattr(flaw, "cpe")
-        assert False
-
-    def test_reset_flag_on_removal(self, old_flag, new_flag):
+    @pytest.mark.vcr
+    @pytest.mark.enable_signals
+    def test_cpe_load(self):
+        """
+        Test that CPE values are correctly extracted by the NVD querier.
+        """
+        cve_id = "CVE-2020-1234"
+
+        nvdc = NVDCollector()
+        result = nvdc.get_cve(cve_id)
+        assert result, "no result returned for CVE"
+        assert "nvd_cpes" in result[0]
+        assert isinstance(result[0]["nvd_cpes"], list)
+
+    @pytest.mark.parametrize(
+        "old_flag,new_flag",
+        [
+            (Flaw.FlawNistCvssValidation.APPROVED, Flaw.FlawNistCvssValidation.APPROVED),
+            (Flaw.FlawNistCvssValidation.REJECTED, Flaw.FlawNistCvssValidation.APPROVED),
+            (Flaw.FlawNistCvssValidation.REQUESTED, Flaw.FlawNistCvssValidation.APPROVED),
+            (Flaw.FlawNistCvssValidation.NOVALUE, Flaw.FlawNistCvssValidation.NOVALUE),
+        ],
+    )
+    def test_reset_flag_on_removal(self, old_flag, new_flag):
Re: the PR description — response2result is exercised indirectly by every nvdc.collect(...) test in this file (e.g., test_collect_cve, test_collect_batch); a focused unit test can be added by calling NVDQuerier().get_cve("CVE-2020-1234") against a recorded cassette and asserting on nvd_cpes directly, without needing the full collector pipeline.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@collectors/nvd/tests/test_collectors.py` around lines 239 - 273, The new
test_cpe_load breaks pytest collection and is incorrect: move test_cpe_load so
it does not sit between the pytest.mark.parametrize(...) block and
test_reset_flag_on_removal (or relocate the parametrize above/below the new
test) so the parametrize still applies to test_reset_flag_on_removal; remove the
stray assert False; replace the failing Flaw assertion (hasattr(flaw, "cpe"))
with an assertion against the collector result (call
NVDCollector().get_cve(cve_id) or use NVDQuerier().get_cve("CVE-2020-1234") and
assert on the returned nvd_cpes key); and add `@pytest.mark.enable_signals` and
`@pytest.mark.vcr` to the test to match other collect tests.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Enterprise

Run ID: 3b8d98bd-d053-422a-a43c-c1790e0980bd

📥 Commits

Reviewing files that changed from the base of the PR and between 7c1734f and a36d034.

📒 Files selected for processing (2)

collectors/nvd/collectors.py
collectors/nvd/tests/test_collectors.py

JakubFrejlach · 2026-04-28T11:16:57Z

+    @pytest.mark.vcr
+    def test_cpe_load(self):


This test won't work without the actual cassette recorded. https://github.com/RedHatProductSecurity/osidb/blob/master/docs/developer/DEVELOP.md#testing-external-apis

darakian · 2026-04-28T18:24:36Z

Closing this out as the work has been de-prioritized. NVD's reduced scope and unclear SLA make this less viable for our goals.

Add helper function to pull cpe list

7c1734f

Adds a helper function to extract NVD CPE data and adds it to the list of data returned. Partial implementation for OSIDB-4923.

coderabbitai Bot reviewed Apr 23, 2026

Jincxz self-requested a review April 23, 2026 21:29

Jincxz requested changes Apr 23, 2026

Elkasitu reviewed Apr 27, 2026

darakian added 4 commits April 27, 2026 11:19

Return an empty list instead of a None value

bb2d409

reduced cpe field check

fa056a0

Remove optional value

6bb043b

Stub test. Need to work test on another machine

a36d034

coderabbitai Bot reviewed Apr 27, 2026

Move test and look for length for the moment

9169a82

Elkasitu added the technical For PRs that introduce changes not worthy of a CHANGELOG entry label Apr 28, 2026

JakubFrejlach reviewed Apr 28, 2026

darakian closed this Apr 28, 2026
