fix: benchmark Flare loading and broken SmolLM2 GGUF URLs #308

Merged: sauravpanda merged 1 commit into main from fix/benchmark-flare-and-model-urls on Apr 17, 2026

@sauravpanda (Owner) commented Apr 17, 2026

Summary

  • Flare WASM parse fix: @sauravpanda/flare@0.2.0 has a wasm-pack codegen bug where /* done */ inside a JSDoc block prematurely closes the /** */ comment, causing "Unexpected token '*'". Benchmark now patches the source before blob-importing.
  • Transformers.js tok/s: Previously reported 0.0 because Transformers.js is a batch (non-streaming) engine, so TTFT equals total time and decode time is 0. The benchmark now computes tokens/totalTime.
  • SmolLM2 GGUF URLs: HuggingFaceTB repo returns 401. Switched to bartowski's public repos for all SmolLM2 variants (135M Q8_0/Q4_K_M, 360M Q8_0).
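
The Flare parse failure described above can be reproduced in isolation. This is a minimal sketch, assuming the fix is a literal string replacement; `patchNestedComment`, `parses`, and the sample source are hypothetical and not taken from the benchmark.

```javascript
// Hypothetical helper: neutralize the nested block comment so the enclosing
// JSDoc block is no longer closed early. Line comments are used throughout
// because this snippet is itself about block-comment terminators.
function patchNestedComment(source) {
  return source.replaceAll("/* done */", "(done)");
}

// Made-up sample that reproduces the failure mode.
const brokenSample = [
  "/**",
  " * Frees the handle. /* done */",
  " * @returns {void}",
  " */",
  "function free() {}",
].join("\n");

// Returns true if the source parses as a function body (it is never run).
function parses(source) {
  try {
    new Function(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

console.log(parses(brokenSample));                     // false
console.log(parses(patchNestedComment(brokenSample))); // true
```

The first `*/` ends the JSDoc block, so the parser sees the following ` * @returns` line as code, which is where the "Unexpected token '*'" error comes from.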

Test plan

  • 62/62 tests pass
  • npm run build clean
  • MLC benchmark: 94 tok/s
  • Transformers.js: 16.4 tok/s (was 0.0)
  • Flare: WASM init succeeds, model URL now returns 302
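
The tok/s change can be sketched as follows. This is an illustration only, assuming timings in milliseconds; the function name, its parameters, and the numbers in the usage lines are made up and are not the benchmark's measurements.

```javascript
// Hypothetical throughput helper.
// Batch engines return all tokens at once, so TTFT equals total time and the
// decode window is zero; dividing by it cannot give a meaningful rate.
function tokensPerSecond({ tokens, totalMs, ttftMs, streaming }) {
  if (tokens <= 0 || totalMs <= 0) return 0;
  const totalRate = tokens / (totalMs / 1000);
  if (!streaming) return totalRate; // batch: tokens / totalTime
  const decodeMs = totalMs - ttftMs;
  // Streaming: tokens / decodeTime, falling back to the total-time rate
  // if no decode window was measured.
  return decodeMs > 0 ? tokens / (decodeMs / 1000) : totalRate;
}

// Illustrative numbers only.
console.log(tokensPerSecond({ tokens: 40, totalMs: 2500, ttftMs: 2500, streaming: false })); // 16
console.log(tokensPerSecond({ tokens: 100, totalMs: 1200, ttftMs: 200, streaming: true }));  // 100
```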

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Improvements
    • Updated SmolLM2 model sources to point to new repository for better availability.
    • Enhanced accuracy of benchmark performance metrics, including token counting and generation speed measurements.
    • Improved WASM module loading reliability with fallback error handling.

- Fix Flare WASM parse error: @sauravpanda/flare@0.2.0 has a nested
  block comment "/* done */" inside a JSDoc block that prematurely
  closes the /** */ comment. Benchmark now patches this before import.
- Fix Transformers.js 0 tok/s: it runs batch (non-streaming), so
  tok/s is now calculated as tokens/totalTime instead of tokens/decodeTime.
- Fix SmolLM2 GGUF URLs: HuggingFaceTB repo returns 401, switched to
  bartowski's public repos for 135M (Q8_0, Q4_K_M) and 360M (Q8_0).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@sauravpanda merged commit e4e53f5 into main on Apr 17, 2026
5 of 10 checks passed

coderabbitai Bot commented Apr 17, 2026

Caution

Review failed

The pull request is closed.

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: eb623811-3b53-4903-8c30-4036a2ea4f1d

📥 Commits

Reviewing files that changed from the base of the PR and between 39b9d47 and a0e0558.

📒 Files selected for processing (2)
  • examples/benchmark/index.html
  • src/config/models/flare-models.json

📝 Walkthrough

Updated Flare model registry URLs to point to the bartowski repository and refactored the benchmark's token accounting logic and WASM module loading from a direct CDN import to a fetch-patch-blob approach with fallback error handling.

Changes

| Cohort / File(s) | Summary |
|---|---|
| Flare Model Configuration (src/config/models/flare-models.json) | Updated repo and url fields for three SmolLM2 Flare models (smollm2-135m-flare, smollm2-135m-flare-q4, smollm2-360m-flare) to point to bartowski/SmolLM2-*-Instruct-GGUF repository with adjusted filename casing (e.g., Q8_0.gguf, Q4_K_M). |
| Benchmark Index & Token Accounting (examples/benchmark/index.html) | Refactored runTransformersInference to change token counting from input-length-based derivation to output tokenization; replaced callback logic to track prevTokenCount and update TTFT on token increase; recalculated tokensPerSec from total time; added fallback token estimation. Reworked Flare WASM module loading to fetch flare_web.js from CDN, patch import.meta.url and remove nested comments, import via blob URL, and initialize with dedicated wasmUrl; added error handling with CDN fallback and blob cleanup. |
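
For the model URL change, a small check along these lines could confirm that a GGUF file is publicly downloadable before it goes into the registry. This is a sketch under assumptions: the helper names are hypothetical, the filename in the usage comment is a guess at the naming scheme rather than a value from flare-models.json, and `checkModelUrl` is meant for Node 18+, where `fetch` with `redirect: "manual"` exposes the redirect status (browsers return an opaque response with status 0 instead).

```javascript
// Build a Hugging Face "resolve" download URL for a file in a model repo.
function buildHfResolveUrl(repo, filename, revision = "main") {
  return `https://huggingface.co/${repo}/resolve/${revision}/${encodeURIComponent(filename)}`;
}

// 200 or a redirect (such as the 302 reported in the test plan) counts as
// downloadable; 401 means the file needs authentication.
function isPubliclyDownloadable(status) {
  return status === 200 || (status >= 300 && status < 400);
}

// Network probe; not called here so the snippet stays side-effect free.
async function checkModelUrl(url) {
  const res = await fetch(url, { method: "HEAD", redirect: "manual" });
  return { status: res.status, ok: isPubliclyDownloadable(res.status) };
}

// Usage (hypothetical filename):
// checkModelUrl(buildHfResolveUrl(
//   "bartowski/SmolLM2-135M-Instruct-GGUF",
//   "SmolLM2-135M-Instruct-Q8_0.gguf",
// )).then(console.log);
```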

Sequence Diagram(s)

sequenceDiagram
    participant Browser
    participant CDN
    participant Blob as Blob URL
    participant WASM as Flare WASM
    
    Browser->>CDN: Fetch flare_web.js
    CDN-->>Browser: Return JS source
    Browser->>Browser: Patch import.meta.url<br/>& remove comments
    Browser->>Blob: Create blob URL<br/>from patched source
    Blob-->>Browser: Return blob URL
    Browser->>Browser: Import patched module<br/>via blob URL
    Browser->>WASM: Initialize with wasmUrl
    WASM-->>Browser: Module ready
    Browser->>Browser: Revoke blob URL
    
    alt Error during blob import
        Browser->>CDN: Fallback: fetch<br/>from CDN directly
        CDN-->>Browser: Return module
    end
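
The sequence above can be sketched in browser JavaScript roughly as follows. This is not the benchmark's actual code: `patchFlareSource` and `loadFlare` are hypothetical names, both URLs are left as parameters because the CDN paths are not given here, and the sketch assumes the module's default export is the wasm-pack `init` function that accepts a WASM URL.

```javascript
// Pure patch step, kept separate so it can be tested without a browser.
function patchFlareSource(source, moduleUrl) {
  return source
    // Inside a blob module, import.meta.url is the blob URL, so relative
    // lookups would break; pin it to the original module location.
    .replaceAll("import.meta.url", JSON.stringify(moduleUrl))
    // Neutralize the nested block comment that ends the JSDoc block early.
    .replaceAll("/* done */", "(done)");
}

async function loadFlare(jsUrl, wasmUrl) {
  let blobUrl = null;
  try {
    const res = await fetch(jsUrl);
    if (!res.ok) throw new Error(`Fetch failed with status ${res.status}`);
    const patched = patchFlareSource(await res.text(), jsUrl);
    blobUrl = URL.createObjectURL(new Blob([patched], { type: "text/javascript" }));
    const mod = await import(blobUrl);
    await mod.default(wasmUrl); // assumed wasm-pack init(wasmUrl)
    return mod;
  } catch (err) {
    // Fallback path from the diagram: import the unpatched module directly.
    console.warn("Patched load failed, falling back to direct import:", err);
    const mod = await import(jsUrl);
    await mod.default(wasmUrl);
    return mod;
  } finally {
    // Release the blob once the module has been evaluated (or has failed).
    if (blobUrl) URL.revokeObjectURL(blobUrl);
  }
}
```

Passing `wasmUrl` explicitly means initialization does not depend on the patched `import.meta.url`, which is kept only as a second line of defense.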

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • BrowserAI#301: Introduced Flare integration; this PR updates the model URLs and refactors the WASM loading/token accounting logic for the same Flare integration.

Suggested labels

size/L

Poem

🐇 Hop and patch, we go!
Flare's URLs now flow,
Blob tricks and token counts,
WASM loads where it mounts,
Bartowski's path steals the show!
