fix: support new webpack chunk format for ondemand.s lookup #416

steverex169 wants to merge 1 commit into d60:main from
Conversation
Reviewer's Guide

Updates ondemand JavaScript asset discovery to handle Twitter/X's new webpack chunk mapping format while keeping backward compatibility with the previous inline hash format.

Class diagram for ClientTransaction get_indices hash resolution

classDiagram
class ClientTransaction {
+home_page_response
+get_indices(home_page_response, session, headers)
}
class RegexUtilities {
+ON_DEMAND_FILE_REGEX
+CHUNK_NAME_REGEX
+INDICES_REGEX
}
ClientTransaction ..> RegexUtilities : uses
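In Python, the regex collaborators named in the diagram map to module-level compiled patterns along these lines. The two patterns below are copied from the diff in this PR; INDICES_REGEX does not appear in the excerpt, so it is left out rather than guessed:

```python
import re

# Old inline format: 'ondemand.s': 'hash' (pattern as it appears in the PR diff)
ON_DEMAND_FILE_REGEX = re.compile(
    r"""['|\"]{1}ondemand\.s['|\"]{1}:\s*['|\"]{1}([\w]*)['|\"]{1}""",
    flags=(re.VERBOSE | re.MULTILINE),
)

# New webpack format: a numeric chunk ID maps to the chunk name,
# and a separate map holds the hash under the same chunk ID
CHUNK_NAME_REGEX = re.compile(r'(\d+):"ondemand\.s"')
```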
Flow diagram for ondemand.s hash resolution in get_indices

flowchart TD
A["Start get_indices"] --> B["Validate response and select home_page_response"]
B --> C["Convert response to string response_str"]
C --> D["Search response_str with ON_DEMAND_FILE_REGEX"]
D --> E{Old format match?}
E -->|Yes| F["Extract file_hash from on_demand_file.group(1)"]
F --> M["Build ondemand.s URL with file_hash"]
E -->|No| G["Search response_str with CHUNK_NAME_REGEX"]
G --> H{Chunk ID match?}
H -->|No| L["file_hash remains None"]
H -->|Yes| I["Extract chunk_id from chunk_id_match.group(1)"]
I --> J["Compile hash_pattern using chunk_id"]
J --> K["Iterate all hash_pattern matches in response_str"]
K --> N{Valid hash candidate?}
N -->|Yes| O["Set file_hash to candidate value"]
N -->|No| P["Continue iterating matches"]
P --> K
L --> Q{file_hash is set?}
O --> Q
F --> Q
Q -->|No| R["Abort: cannot resolve ondemand.s hash"]
Q -->|Yes| M
M --> S["GET ondemand.s file via session.request"]
S --> T["Extract key_byte_indices with INDICES_REGEX"]
T --> U["Return key_byte_indices"]
No actionable comments were generated in the recent review.
📝 Walkthrough

Updated webpack manifest parsing in the transaction module to handle a new response format. Added a chunk-name regex and fallback logic in

Changes

Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~12 minutes
Pre-merge checks: 2 passed, 1 failed (1 warning).
Twitter changed the x.com HTML structure from the old format:

    'ondemand.s': 'hash'

to a new webpack chunk map format:

    chunk_id:"ondemand.s"  (name map)
    chunk_id:"hash"        (separate hash map)

The old ON_DEMAND_FILE_REGEX no longer matches, causing "Couldn't get KEY_BYTE indices" on every API call. This fix detects both formats: it tries the old regex first, then falls back to extracting the chunk ID from the name map and resolving its hash from the separate hash map.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
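As a quick illustration of the two layouts (the chunk ID and hash below are invented, and the patterns are simplified versions of the PR's regexes), the old regex fires only on the first snippet, while the chunk-ID fallback is needed for the second:

```python
import re

# hypothetical snippets; real chunk IDs and hashes differ
old_html = "'ondemand.s':'a1b2c3d4e5f6'"
new_html = '{20113:"ondemand.s"} ... {20113:"a1b2c3d4e5f6"}'

ON_DEMAND_FILE_REGEX = re.compile(r"""['"]ondemand\.s['"]:\s*['"](\w*)['"]""")
CHUNK_NAME_REGEX = re.compile(r'(\d+):"ondemand\.s"')

assert ON_DEMAND_FILE_REGEX.search(old_html)       # old format: direct hit
assert not ON_DEMAND_FILE_REGEX.search(new_html)   # old regex misses new format

# new format: chunk ID from the name map, hash from the separate hash map;
# the name entry itself cannot match (\w+) because of the dot in "ondemand.s"
chunk_id = CHUNK_NAME_REGEX.search(new_html).group(1)
file_hash = re.search(rf'{chunk_id}:"(\w+)"', new_html).group(1)
```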
Force-pushed from f367875 to a7099a7 (Compare)
Hey - I've found 1 issue, and left some high level feedback:

- The new chunk/hash extraction logic relies on a fairly loose `hash_pattern` that will match any `{chunk_id:"..."}` pair; consider tightening this (e.g., via surrounding context or restricting the object scope) to reduce the risk of accidentally picking up unrelated values.
- The heuristic `val != 'ondemand' and len(val) <= 12` is somewhat opaque and fragile; extracting these constants into named variables or adding a small helper with a descriptive name would make the intent and constraints clearer and easier to adjust when Twitter changes formats again.
- The code currently recompiles `hash_pattern` on every call to `get_indices`; if this pattern is stable, precompiling it (or using a function that builds it once per `chunk_id`) would avoid repeated compilation and make the code more consistent with the other module-level regexes.
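One way to act on the third point, sketched here rather than taken from the project, is to memoize the compiled per-chunk pattern so repeated get_indices calls with the same chunk_id reuse it:

```python
import re
from functools import lru_cache


@lru_cache(maxsize=8)
def hash_pattern_for(chunk_id: str) -> re.Pattern:
    """Compile the hash-map lookup pattern once per chunk_id."""
    return re.compile(rf'{re.escape(chunk_id)}:"(\w+)"')
```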
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The new chunk/hash extraction logic relies on a fairly loose `hash_pattern` that will match any `{chunk_id:"..."}` pair; consider tightening this (e.g., via surrounding context or restricting the object scope) to reduce the risk of accidentally picking up unrelated values.
- The heuristic `val != 'ondemand' and len(val) <= 12` is somewhat opaque and fragile; extracting these constants into named variables or adding a small helper with a descriptive name would make the intent and constraints clearer and easier to adjust when Twitter changes formats again.
- The code currently recompiles `hash_pattern` on every call to `get_indices`; if this pattern is stable, precompiling it (or using a function that builds it once per `chunk_id`) would avoid repeated compilation and make the code more consistent with the other module-level regexes.
## Individual Comments
### Comment 1
<location path="twikit/x_client_transaction/transaction.py" line_range="59-64" />
<code_context>
+ if chunk_id_match:
+ chunk_id = chunk_id_match.group(1)
+ hash_pattern = re.compile(rf'{chunk_id}:"([\w]+)"')
+ all_matches = list(hash_pattern.finditer(response_str))
+ file_hash = None
+ for m in all_matches:
+ val = m.group(1)
+ if val != 'ondemand' and len(val) <= 12:
+ file_hash = val
+ break
+ else:
</code_context>
<issue_to_address>
**suggestion (performance):** Collecting all matches into a list is unnecessary and slightly wasteful for large responses.
Because you only need the first matching `val` that satisfies `val != 'ondemand' and len(val) <= 12`, you can iterate directly over `hash_pattern.finditer(response_str)` and break on the first suitable match instead of building `all_matches` as a list. This avoids the intermediate list and reduces work/memory usage for large `response_str` values.
</issue_to_address>
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@twikit/x_client_transaction/transaction.py`:
- Line 18: CHUNK_NAME_REGEX is too strict and only matches unquoted, no-space
forms like 20113:"ondemand.s"; update CHUNK_NAME_REGEX to allow optional
single/double quotes around the numeric key and the value and permit arbitrary
spacing around the colon (e.g. use a pattern like
r'["\']?(\d+)["\']?\s*:\s*["\']?ondemand\.s["\']?' as the new regex), and apply
the same tolerant regex update to the other similar regexes/usages referenced
around lines 55-58 so all key/value formatting variants (quoted keys, spaces)
are matched.
- Around line 61-64: The loop that assigns file_hash is rejecting candidates by
a hard-coded length check ("len(val) <= 12"), which can drop valid webpack chunk
hashes; remove that arbitrary constraint in the block that iterates over
all_matches (the for m in all_matches loop) and instead accept any
non-'ondemand' match (val != 'ondemand') or replace the check with a proper
validation (e.g., match against a hex/base62 regex or a configurable
max_hash_length) before assigning file_hash; update references to file_hash
accordingly so downstream logic performs definitive validation rather than
relying on the 12-character heuristic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 1aafa861-b8d8-4e35-8df6-540f0cdd32d9
📒 Files selected for processing (1)
twikit/x_client_transaction/transaction.py
ON_DEMAND_FILE_REGEX = re.compile(
    r"""['|\"]{1}ondemand\.s['|\"]{1}:\s*['|\"]{1}([\w]*)['|\"]{1}""", flags=(re.VERBOSE | re.MULTILINE))
# New webpack format: chunk ID maps to name, separate hash map
CHUNK_NAME_REGEX = re.compile(r'(\d+):"ondemand\.s"')
Make chunk-ID regex tolerant to key/value formatting variants.
The current pattern only matches 20113:"ondemand.s" exactly. If the runtime emits quoted keys or spacing (e.g., "20113": "ondemand.s"), this will fail and break index resolution again.
Proposed robust pattern update
-CHUNK_NAME_REGEX = re.compile(r'(\d+):"ondemand\.s"')
+CHUNK_NAME_REGEX = re.compile(
+ r"""['"]?(\d+)['"]?\s*:\s*['"]ondemand\.s['"]"""
+)
...
- hash_pattern = re.compile(rf'{chunk_id}:"([\w]+)"')
+ hash_pattern = re.compile(
+ rf"""['"]?{re.escape(chunk_id)}['"]?\s*:\s*['"]([A-Za-z0-9]+)['"]"""
+    )

Also applies to: 55-58
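Under the proposed tolerant pattern, key/value variants with quotes and spacing all match (sample inputs invented):

```python
import re

# the tolerant pattern proposed in the review comment above
CHUNK_NAME_REGEX = re.compile(r"""['"]?(\d+)['"]?\s*:\s*['"]ondemand\.s['"]""")

for sample in ('20113:"ondemand.s"',      # current unquoted form
               '"20113": "ondemand.s"',   # quoted key with spacing
               "'20113' : 'ondemand.s'"): # single quotes
    m = CHUNK_NAME_REGEX.search(sample)
    assert m and m.group(1) == "20113"
```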
for m in all_matches:
    val = m.group(1)
    if val != 'ondemand' and len(val) <= 12:
        file_hash = val
Avoid hard-coding max hash length (<= 12) for candidate selection.
Webpack chunk hashes are not guaranteed to stay at or below 12 chars. This heuristic can silently reject valid hashes and reintroduce the "Couldn't get KEY_BYTE indices" failure.
Safer candidate filter
- if val != 'ondemand' and len(val) <= 12:
+ # prefer hex-like hash candidates; tolerate future length changes
+ if re.fullmatch(r"[0-9a-fA-F]{6,64}", val):
file_hash = val
break
Problem

Twitter changed the structure of x.com's HTML, breaking the `ClientTransaction.get_indices()` method for all users.

The old format was:

The current format uses a webpack chunk map split across two objects:

The existing `ON_DEMAND_FILE_REGEX` no longer matches, causing this error on every single API call:

Fix

- `CHUNK_NAME_REGEX` to extract the chunk ID from the name map

Testing

Verified locally against live x.com: list scraping, tweet fetching, and search all work correctly after the fix.

Summary by Sourcery

Bug Fixes:

Summary by CodeRabbit