feat: add engine benchmark demo and fix Flare package name #307
sauravpanda merged 1 commit into main
- Add examples/benchmark/ with a standalone HTML demo that benchmarks the MLC (WebGPU), Transformers.js (ONNX/WASM), and Flare (GGUF/WASM) engines side by side on the same model and prompt. Measures model load time, TTFT, and tokens/sec with configurable runs and warmup.
- Rename @aspect/flare to @sauravpanda/flare throughout the codebase. The package is published on npm as @sauravpanda/flare@0.2.0, not under the @aspect scope.

Closes #299

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
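For readers unfamiliar with the metrics named above, here is a minimal, engine-agnostic sketch of how TTFT and tokens/sec could be derived from a token stream. It is not the code from this PR: `RunMetrics`, `measureStream`, and the injectable `now` clock are hypothetical names, each streamed chunk is assumed to be one token, and tokens/sec is assumed to cover only the decode phase after the first token.

```typescript
// Hypothetical result shape; field names are illustrative, not taken from the PR.
interface RunMetrics {
  ttftMs: number;       // time from request start to the first token
  totalMs: number;      // time from request start to the end of the stream
  tokenCount: number;   // streamed chunks, each assumed to be one token
  tokensPerSec: number; // decode throughput, excluding the first-token wait
}

// Consume any async token stream and time it. `now` is injectable so the
// function can be exercised with a fake clock instead of real timers.
async function measureStream(
  stream: AsyncIterable<string>,
  now: () => number = () => performance.now(),
): Promise<RunMetrics> {
  const start = now();
  let firstTokenAt: number | null = null;
  let tokenCount = 0;

  for await (const _token of stream) {
    if (firstTokenAt === null) firstTokenAt = now();
    tokenCount += 1;
  }

  const end = now();
  // An empty stream has no first token, so TTFT is reported as NaN.
  const ttftMs = firstTokenAt === null ? NaN : firstTokenAt - start;
  // Assumption: throughput counts the tokens after the first one, divided
  // by the time elapsed since the first token arrived.
  const decodeMs = firstTokenAt === null ? 0 : end - firstTokenAt;
  const tokensPerSec = decodeMs > 0 ? ((tokenCount - 1) / decodeMs) * 1000 : 0;

  return { ttftMs, totalMs: end - start, tokenCount, tokensPerSec };
}
```

Separating TTFT from decode throughput keeps prompt processing and model warm-up cost from skewing the tokens/sec figure. Whether the demo draws the line in the same place is not stated here.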
Walkthrough
This PR adds a comprehensive browser-based benchmark UI.
Sequence Diagram(s)
sequenceDiagram
actor User
participant UI as Benchmark UI
participant EngMgr as Engine Manager
participant MLC as MLC Engine
participant TransJS as Transformers.js
participant Flare as Flare Engine
participant ModelCDN as Model CDN
participant Results as Results Display
User->>UI: Select model & engines
User->>UI: Configure parameters
User->>UI: Run benchmark
UI->>EngMgr: Load selected engines
par Engine Loading
EngMgr->>MLC: Initialize WebGPU
EngMgr->>TransJS: Initialize ONNX/WASM
EngMgr->>Flare: Initialize WASM/GGUF
end
par Model Loading
MLC->>ModelCDN: Fetch TVM artifact
TransJS->>ModelCDN: Fetch ONNX model
Flare->>ModelCDN: Fetch GGUF model
end
UI->>EngMgr: Execute warmup runs
EngMgr->>MLC: Warmup inference
EngMgr->>TransJS: Warmup inference
EngMgr->>Flare: Warmup inference
UI->>EngMgr: Run benchmark iterations
loop Per Engine Per Run
EngMgr->>MLC: Streaming inference (measure TTFT, tokens/sec)
EngMgr->>TransJS: Streaming inference (measure TTFT, tokens/sec)
EngMgr->>Flare: Streaming inference (measure TTFT, tokens/sec)
end
EngMgr->>UI: Aggregate metrics
UI->>Results: Display comparison charts & tables
User->>UI: Export results as JSON
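The warmup, per-engine run loop, aggregation, and JSON export steps in the diagram can be sketched roughly as follows. This illustrates the flow only and is not the PR's implementation: `BenchEngine`, `Summary`, `benchmark`, and `toJson` are hypothetical names, each run is assumed to yield a single number (for example tokens/sec), and engines are assumed to run one after another so they do not compete for the same CPU or GPU.

```typescript
// Hypothetical engine interface; the engines wired up in the PR may differ.
interface BenchEngine {
  name: string;
  runOnce(prompt: string): Promise<number>; // one metric per run, e.g. tokens/sec
}

interface Summary {
  engine: string;
  runs: number[];
  mean: number;
  median: number;
}

function median(values: number[]): number {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

async function benchmark(
  engines: BenchEngine[],
  prompt: string,
  warmup: number,
  runs: number,
): Promise<Summary[]> {
  if (runs < 1) throw new RangeError("runs must be at least 1");
  const out: Summary[] = [];
  for (const engine of engines) {
    // Warmup results are discarded; they only prime caches and compiled kernels.
    for (let i = 0; i < warmup; i++) await engine.runOnce(prompt);

    const samples: number[] = [];
    for (let i = 0; i < runs; i++) samples.push(await engine.runOnce(prompt));

    const mean = samples.reduce((sum, v) => sum + v, 0) / samples.length;
    out.push({ engine: engine.name, runs: samples, mean, median: median(samples) });
  }
  return out;
}

// Serialize the aggregated results for export.
const toJson = (summaries: Summary[]): string => JSON.stringify(summaries, null, 2);
```

Reporting the median alongside the mean makes a single slow run (for example, one interrupted by garbage collection) less likely to distort the comparison.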
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
Summary
- examples/benchmark/: Standalone HTML page that compares the MLC (WebGPU), Transformers.js (ONNX/WASM), and Flare (GGUF/WASM) engines side by side. Measures model load time, TTFT, and tokens/sec with configurable runs, warmup, and prompt. Includes a comparison table and bar charts.
- @aspect/flare → @sauravpanda/flare throughout the codebase. The package is published on npm as @sauravpanda/flare@0.2.0.

Test plan
- npm run build succeeds
- http://localhost:3456 via npx serve

Closes #299
🤖 Generated with Claude Code