test(source-hubspot): add synthetic CRM Search pagination boundary tests by ZaneHyattAB · Pull Request #81335 · airbytehq/airbyte

ZaneHyattAB · 2026-06-30T16:45:19Z

What

Investigation of OC airbytehq/oncall#12852: HubSpot contacts missing/duplicate records at CRM Search pagination boundaries.

This PR adds tests only — no production code changes. The tests document and prove suspected failure modes in the current master code. Some tests intentionally assert broken behavior (e.g., master returning None when it should continue paginating). These are not "passing because master is correct" — they pass because they assert what master currently does, which is believed to be wrong.

Two prior draft PRs proposed independent fixes:

airbytehq/airbyte#79666: hypothesized that int(id)+1 with GTE creates lexicographic gaps at the 10k boundary
airbytehq/airbyte#80745: hypothesized that short/empty pages with a valid paging.next.after cursor cause premature pagination stop

How

34 synthetic unit tests in test_crm_search_pagination_boundaries.py, organized into two parts:

Part A — Connector-code tests (exercise real HubspotCRMSearchPaginationStrategy):

A.1: Short/empty page early-stop — documents that master stops when last_page_size < page_size even if paging.next.after exists (the #80745 bug)
A.2: 10k boundary int(id)+1 conversion — documents master's ID conversion behavior with various ID shapes (the #79666 bug)
A.3: Same-timestamp records at page/10k boundaries
A.4: RECORDS_LIMIT threshold edge cases
A.5: Full pagination sequences with patched RECORDS_LIMIT
A.6: Combined scenario showing how the short-page bug masks the 10k boundary bug
A.7: Verification that #80745 does NOT fix the ID gap (the two bugs are independent)
A.8: String cursor edge case

Part B — Pure string/lexicographic demonstrations (no connector code called):

B.1: Proves GTE str(int("7198")+1) = GTE "7199" skips IDs like "719869649082" (supports #79666 hypothesis)
B.2: Manifest template rendering creates string comparison
B.3: Adjacent 30-day slice boundaries — confirms no overlap or gap (slice logic is NOT a source of missing records)

What the tests prove vs. what they do not prove

Proved:

Master stops pagination prematurely on short/empty pages with valid cursors
Master's int(id)+1 + GTE creates lexicographic gaps with mixed-length IDs
The two bugs are independent — fixing one does not fix the other
The short-page bug can mask the ID-gap bug (fires first, prevents reaching 10k boundary)
30-day slice boundaries are NOT a source of missing/duplicate records

Not proved (would require real API testing):

Whether HubSpot actually returns short pages with valid cursors in production
The exact frequency of mixed-length ID scenarios in real portals
Whether other undiscovered boundary conditions exist

Review guide

airbyte-integrations/connectors/source-hubspot/unit_tests/test_crm_search_pagination_boundaries.py — the only file changed

User Impact

No user impact — tests only, no production code changes.

Can this PR be safely reverted and rolled back?

YES 💚

Requested by: ZaneHyattAB

Link to Devin session: https://app.devin.ai/sessions/8024cac44d9249e1ad4a6011e17b27df

Adds comprehensive unit tests covering HubSpot CRM Search pagination boundary behavior to investigate OC #12852 (missing/duplicate contacts). Tests cover: - Short page + paging.next.after present (PR #80745 bug) - Empty page + paging.next.after present - 10k boundary int() conversion with various ID shapes - Lexicographic gap demonstration (PR #79666 hypothesis) - Same-timestamp records at pagination boundaries - Adjacent 30-day slice boundary behavior (GTE/LTE filters) - Full pagination sequence with patched RECORDS_LIMIT - Combined bug interaction scenarios - Verification of both PR hypotheses Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>

devin-ai-integration · 2026-06-30T16:45:23Z

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

Disable automatic comment, CI, and merge conflict monitoring

github-actions · 2026-06-30T16:45:48Z

👋 Greetings, Airbyte Team Member!

Here are some helpful tips and reminders for your convenience.

💡 Show Tips and Tricks

PR Slash Commands

Airbyte Maintainers (that's you!) can execute the following slash commands on your PR:

🛠️ Quick Fixes
- /format-fix - Fixes most formatting issues.
- /bump-version - Bumps connector versions, scraping changelog description from the PR title.
  - Bump types: patch (default), minor, major, major_rc, rc, promote.
  - The rc type is a smart default: applies minor_rc if stable, or bumps the RC number if already RC.
  - The promote type strips the RC suffix to finalize a release.
  - Example: /bump-version type=rc or /bump-version type=minor
- /bump-progressive-rollout-version - Alias for /bump-version type=rc. Bumps with an RC suffix and enables progressive rollout.
❇️ AI Testing and Review (internal link: AI-SDLC Docs):
- /ai-prove-fix - Runs prerelease readiness checks, including testing against customer connections.
- /ai-canary-prerelease - Rolls out prerelease to 5-10 connections for canary testing.
- /ai-review - AI-powered PR review for connector safety and quality gates.
📝 AI Documentation:
- /ai-docs-review - AI-powered documentation review for PRs with connector changes.
- /ai-create-docs-pr - Creates a documentation PR for connector changes, stacked on the current PR.
🚀 Connector Releases:
- /publish-connectors-prerelease - Publishes pre-release connector builds (tagged as {version}-preview.{git-sha}) for all modified connectors in the PR.
☕️ JVM connectors:
- /update-connector-cdk-version connector=<CONNECTOR_NAME> - Updates the specified connector to the latest CDK version.
  Example: /update-connector-cdk-version connector=destination-bigquery
🐍 Python connectors:
- /poe connector source-example lock - Run the Poe lock task on the source-example connector, committing the results back to the branch.
- /poe source example lock - Alias for /poe connector source-example lock.
- /poe source example use-cdk-branch my/branch - Pin the source-example CDK reference to the branch name specified.
- /poe source example use-cdk-latest - Update the source-example CDK dependency to the latest available version.
⚙️ Admin commands:
- /force-merge reason="<REASON>" - Force merges the PR using admin privileges, bypassing CI checks. Requires a reason.
  Example: /force-merge reason="CI is flaky, tests pass locally"

📚 Show Repo Guidance

Helpful Resources

Breaking Changes Guide - Breaking changes, migration guides, and upgrade deadlines
Developing Connectors Locally
Managing Connector Secrets
On-Demand Regression Tests
#connector-ci-issues
#connector-publish-updates
#connector-build-statuses

📝 Edit this welcome message.

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>

github-actions · 2026-06-30T17:18:15Z

`source-hubspot` Connector Test Results

240 tests 232 ✅ 12m 57s ⏱️
2 suites 8 💤
2 files 0 ❌

Results for commit 3fcaabb.

♻️ This comment has been updated with latest results.

- Split tests into Part A (connector code) and Part B (string demos) - Rename confusing test names (e.g. 'create_overlap' -> 'no_overlap_and_no_gap') - Add docstrings linking each group to PR #79666 or #80745 - Clarify that tests document broken master behavior, not correct behavior - Use owner/repo#number references consistently Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>

ZaneHyattAB self-assigned this Jun 30, 2026

Octavia Squidington III (octavia-squidington-iii) added the connectors/source/hubspot label Jun 30, 2026

github-code-quality Bot found potential problems Jun 30, 2026

View reviewed changes

fix: resolve constant comparison lint warning in test

37bb34e

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>

github-code-quality Bot found potential problems Jul 2, 2026

View reviewed changes

Comment thread ...e-integrations/connectors/source-hubspot/unit_tests/test_crm_search_pagination_boundaries.py Fixed

fix: resolve self-comparison lint warning in slice boundary test

3fcaabb

Co-Authored-By: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

test(source-hubspot): add synthetic CRM Search pagination boundary tests#81335

test(source-hubspot): add synthetic CRM Search pagination boundary tests#81335
ZaneHyattAB wants to merge 4 commits into
masterfrom
devin/1782837632-hubspot-crm-search-pagination-boundary-tests

ZaneHyattAB commented Jun 30, 2026 •

edited by devin-ai-integration Bot

Loading

Uh oh!

devin-ai-integration Bot commented Jun 30, 2026

Uh oh!

github-actions Bot commented Jun 30, 2026

PR Slash Commands

Helpful Resources

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented Jun 30, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Uh oh!

Conversation

ZaneHyattAB commented Jun 30, 2026 • edited by devin-ai-integration Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What

How

What the tests prove vs. what they do not prove

Review guide

User Impact

Can this PR be safely reverted and rolled back?

Uh oh!

devin-ai-integration Bot commented Jun 30, 2026

🤖 Devin AI Engineer

Uh oh!

github-actions Bot commented Jun 30, 2026

👋 Greetings, Airbyte Team Member!

PR Slash Commands

Helpful Resources

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

github-actions Bot commented Jun 30, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

source-hubspot Connector Test Results

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ZaneHyattAB commented Jun 30, 2026 •

edited by devin-ai-integration Bot

Loading

github-actions Bot commented Jun 30, 2026 •

edited

Loading

`source-hubspot` Connector Test Results