Local-first desktop AI for private document intelligence
Chat with a local GGUF model, ingest PDFs into a local vector database, and optionally switch on Gemini hybrid mode for stronger document-grounded answers.
Most AI note and document tools make an uncomfortable tradeoff: either your data leaves your machine, or the local experience is too weak to be useful. I built DocuSage to challenge that tradeoff.
DocuSage is a local-first desktop assistant that keeps ingestion, embeddings, retrieval, and storage on-device. Then, for users who want better reasoning quality on document questions, it offers an optional Gemini-powered hybrid mode. That gives the project a practical engineering balance: privacy-sensitive retrieval stays local, while answer generation can scale up when quality matters more than strict offline operation.
- Local general chat with a GGUF model powered by mistral.rs
- Local RAG pipeline for PDF ingestion, chunking, embeddings, and retrieval
- Hybrid Gemini mode that uses the current free-tier `gemini-2.5-flash` endpoint for higher-quality document Q&A
- Source-aware answers grounded in retrieved document excerpts
- Settings panel for Gemini API key with local key storage in the desktop client
- Streaming responses with a stop-generation control for better UX during long outputs
- Multi-session chat history with persistent local conversation state
- Windows release pipeline via GitHub Actions for desktop installer generation
┌──────────────────────────────────────────────────────────────────┐
│                        React 19 Frontend                         │
│                 Vite · TypeScript · Tailwind CSS                 │
│                                                                  │
│  General Chat UI   RAG Chat UI   Settings UI   Session Manager   │
└──────────────────────────────┬───────────────────────────────────┘
                               │ Tauri invoke
┌──────────────────────────────▼───────────────────────────────────┐
│                       Rust Backend (Tauri)                       │
│                                                                  │
│  commands.rs                                                     │
│    - load_model                                                  │
│    - chat_general                                                │
│    - chat_rag                                                    │
│    - chat_gemini_rag                                             │
│    - ingest_document                                             │
└───────────────┬───────────────────────────────┬──────────────────┘
                │                               │
        ┌───────▼────────┐               ┌──────▼─────────────────┐
        │ Local LLM Path │               │ Local RAG Path         │
        │ mistral.rs     │               │ pdf-extract            │
        │ GGUF inference │               │ fastembed              │
        │ general chat   │               │ LanceDB                │
        └────────────────┘               └──────────┬─────────────┘
                                                    │
                                   ┌────────────────▼────────────────┐
                                   │ Optional Hybrid Answer Engine   │
                                   │ Gemini 2.5 Flash generateContent│
                                   │ sends only retrieved excerpts   │
                                   └─────────────────────────────────┘
| Area | Choice | Reason |
|---|---|---|
| Desktop shell | Tauri v2 | Native desktop UX with lower overhead than Electron |
| Local model runtime | mistral.rs | Direct GGUF inference inside Rust backend |
| Embeddings | fastembed + BAAI/bge-small-en-v1.5 | Fast local embedding generation |
| Vector store | LanceDB | Persistent local retrieval with simple Rust integration |
| Hybrid answering | Gemini 2.5 Flash | Better synthesis quality for document-grounded answers |
| Frontend | React 19 + Vite | Fast iteration and responsive chat UI |
- The user sends a prompt from the desktop UI.
- Tauri invokes the Rust `chat_general` command.
- The backend builds the prompt with chat history and streams tokens from the local GGUF model.
- The UI renders the response incrementally.
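The general chat flow above can be sketched in plain Rust. This is a minimal illustration, not the code in `commands.rs`: the `Turn` type, the `role: content` prompt layout, and both function names are assumptions made for the example, and the real chat template depends on the loaded model. The token source stands in for mistral.rs, and the `emit` callback stands in for whatever forwards tokens to the UI.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// One turn of conversation history (hypothetical type, not from the project).
pub struct Turn {
    pub role: &'static str, // "user" or "assistant"
    pub content: String,
}

/// Flatten the history plus the new user message into a single prompt.
/// The plain "role: content" layout is an assumption for illustration.
pub fn build_prompt(history: &[Turn], user_message: &str) -> String {
    let mut prompt = String::new();
    for turn in history {
        prompt.push_str(&format!("{}: {}\n", turn.role, turn.content));
    }
    prompt.push_str(&format!("user: {}\nassistant:", user_message));
    prompt
}

/// Forward tokens to `emit` one at a time, stopping early once `stop` is set.
/// Returns the text emitted so far, so a stopped answer can still be saved.
pub fn stream_tokens<I, F>(tokens: I, stop: &AtomicBool, mut emit: F) -> String
where
    I: IntoIterator<Item = String>,
    F: FnMut(&str),
{
    let mut out = String::new();
    for token in tokens {
        // Check the flag before each token so a stop request takes effect quickly.
        if stop.load(Ordering::Relaxed) {
            break;
        }
        emit(&token);
        out.push_str(&token);
    }
    out
}
```

Sharing the stop flag as an `AtomicBool` lets a separate "stop generation" request flip it while the token loop is running, without locking.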
- A PDF is selected and parsed locally.
- Text is chunked, embedded, and stored in LanceDB.
- A user question is embedded and matched against the local vector store.
- The top chunks are assembled into grounded context.
- DocuSage answers either with the local model or, if configured, through `chat_gemini_rag` using Gemini 2.5 Flash.
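The retrieval steps above can be sketched without any external crates. In DocuSage the embeddings come from fastembed and storage and search are handled by LanceDB; the sketch below swaps those for plain `Vec<f32>` vectors and a brute-force cosine scan so the logic is visible. Word-based chunking, the overlap parameter, and all function names are assumptions made for the example.

```rust
/// Split text into chunks of at most `max_words` words, with `overlap`
/// words shared between neighbouring chunks. Sizes are illustrative.
pub fn chunk_words(text: &str, max_words: usize, overlap: usize) -> Vec<String> {
    assert!(max_words > overlap, "overlap must be smaller than chunk size");
    let words: Vec<&str> = text.split_whitespace().collect();
    let step = max_words - overlap;
    let mut chunks = Vec::new();
    let mut start = 0;
    while start < words.len() {
        let end = (start + max_words).min(words.len());
        chunks.push(words[start..end].join(" "));
        if end == words.len() {
            break;
        }
        start += step;
    }
    chunks
}

/// Cosine similarity of two vectors; returns 0.0 if either has zero length.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// Indices of the `k` stored vectors most similar to `query`, best first.
/// A brute-force stand-in for the vector store's nearest-neighbour search.
pub fn top_k(query: &[f32], stored: &[Vec<f32>], k: usize) -> Vec<usize> {
    let mut scored: Vec<(usize, f32)> = stored
        .iter()
        .enumerate()
        .map(|(i, v)| (i, cosine(query, v)))
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.into_iter().take(k).map(|(i, _)| i).collect()
}

/// Join the selected chunks into a numbered context block for the prompt.
pub fn build_context(chunks: &[String], picked: &[usize]) -> String {
    picked
        .iter()
        .enumerate()
        .map(|(n, &i)| format!("[{}] {}", n + 1, chunks[i]))
        .collect::<Vec<_>>()
        .join("\n")
}
```

Numbering the excerpts in the context block is one simple way to let the answer cite its sources, in line with the source-aware answers described earlier.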
- Local mode: documents, embeddings, retrieval, and generation all stay on-device.
- Hybrid mode: documents are still indexed and searched locally; only the retrieved excerpts and user question are sent to Gemini for final synthesis.
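The privacy boundary between the two modes can be made explicit in code. The sketch below is a hypothetical model of that boundary, not the project's implementation: `AnswerMode`, `select_mode`, `outbound_text`, and the instruction wording are all assumptions. It shows that in local mode nothing is produced for sending, while in hybrid mode the outgoing text is built only from the retrieved excerpts and the question.

```rust
/// Which engine produces the final answer (hypothetical type).
pub enum AnswerMode {
    Local,
    Hybrid,
}

/// Choose hybrid mode only when a non-empty API key is configured.
/// Keying the choice on the API key is an assumption for this sketch.
pub fn select_mode(api_key: Option<&str>) -> AnswerMode {
    match api_key {
        Some(k) if !k.trim().is_empty() => AnswerMode::Hybrid,
        _ => AnswerMode::Local,
    }
}

/// Text that would leave the machine for one question, or `None` when
/// nothing leaves. Full documents and embeddings are never inputs here.
pub fn outbound_text(mode: &AnswerMode, question: &str, excerpts: &[String]) -> Option<String> {
    match mode {
        AnswerMode::Local => None,
        AnswerMode::Hybrid => {
            let mut body = String::from("Answer using only these excerpts:\n");
            for (i, excerpt) in excerpts.iter().enumerate() {
                body.push_str(&format!("[{}] {}\n", i + 1, excerpt));
            }
            body.push_str(&format!("Question: {}", question));
            Some(body)
        }
    }
}
```

Keeping the outbound payload behind a single function like this makes the "only retrieved excerpts" guarantee easy to audit and to test.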
- Node.js 18+
- npm
- Rust stable toolchain
- Tauri v2 system prerequisites
cd DocuSage
npm install
npm run tauri dev

Place a `.gguf` file in one of these directories:
| Platform | Default Path |
|---|---|
| Windows | Documents\DocuSage\models\ |
| macOS | ~/Documents/DocuSage/models/ |
| Linux | ~/Documents/DocuSage/models/ |
Or configure MODEL_PATH in DocuSage/src-tauri/.env:
MODEL_PATH=D:\DocuSage\models

- Launch the app.
- Open the Settings panel from the header.
- Paste your Gemini API key.
- Ask a question in document mode to route the final answer through Gemini 2.5 Flash.
cd DocuSage
npm run tauri build

DocuSage/
├── src/
│ ├── App.tsx
│ ├── lib/api.ts
│ └── main.tsx
├── src-tauri/
│ ├── src/
│ │ ├── lib.rs
│ │ ├── commands.rs
│ │ ├── rag.rs
│ │ └── main.rs
│ ├── Cargo.toml
│ └── tauri.conf.json
├── public/
└── package.json
| Variable | Default | Description |
|---|---|---|
| `MODEL_PATH` | `~/Documents/DocuSage/models` | Directory containing `.gguf` files |
| `USE_GPU` | `0` | Enables GPU acceleration where supported |
| `CHAT_TEMPLATE` | auto-detected | Custom chat template override |
| `TOK_MODEL_ID` | auto-detected | Tokenizer model id override |
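As a rough illustration of how these variables and their defaults could be resolved, here is a minimal sketch. The `Config` struct and `load_config` function are hypothetical, and treating `USE_GPU=1` as "enabled" is an assumption, since only the default `0` is documented. The lookup is passed in as a closure so the logic does not depend on how the `.env` file is actually loaded.

```rust
use std::path::PathBuf;

/// Resolved settings (hypothetical struct mirroring the table above).
pub struct Config {
    pub model_path: PathBuf,
    pub use_gpu: bool,
    pub chat_template: Option<String>, // None means auto-detect
    pub tok_model_id: Option<String>,  // None means auto-detect
}

/// Build a `Config` from a lookup function, e.g. `|k| std::env::var(k).ok()`.
/// `home` is the user's home directory, used for the default model path.
pub fn load_config<F>(get: F, home: &str) -> Config
where
    F: Fn(&str) -> Option<String>,
{
    // Treat unset and blank values the same way.
    let non_empty = |k: &str| get(k).filter(|v| !v.trim().is_empty());
    let model_path = non_empty("MODEL_PATH")
        .map(PathBuf::from)
        .unwrap_or_else(|| {
            PathBuf::from(home)
                .join("Documents")
                .join("DocuSage")
                .join("models")
        });
    Config {
        model_path,
        // Assumption: "1" enables the GPU; anything else leaves it off.
        use_gpu: non_empty("USE_GPU").map(|v| v.trim() == "1").unwrap_or(false),
        chat_template: non_empty("CHAT_TEMPLATE"),
        tok_model_id: non_empty("TOK_MODEL_ID"),
    }
}
```

In an application this would typically be called once at startup with `load_config(|k| std::env::var(k).ok(), &home_dir)`, where `home_dir` is obtained however the app already locates the user's home directory.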
MIT