DocuSage

Local-first desktop AI for private document intelligence
Chat with a local GGUF model, ingest PDFs into a local vector database, and optionally switch on Gemini hybrid mode for stronger document-grounded answers.

Why I Built It

Most AI note and document tools make an uncomfortable tradeoff: either your data leaves your machine, or the local experience is too weak to be useful. I built DocuSage to challenge that tradeoff.

DocuSage is a local-first desktop assistant that keeps ingestion, embeddings, retrieval, and storage on-device. Then, for users who want better reasoning quality on document questions, it offers an optional Gemini-powered hybrid mode. That gives the project a practical engineering balance: privacy-sensitive retrieval stays local, while answer generation can scale up when quality matters more than strict offline operation.

Screenshot

Features

Local general chat with a GGUF model powered by mistral.rs
Local RAG pipeline for PDF ingestion, chunking, embeddings, and retrieval
Hybrid Gemini mode that uses the current free-tier gemini-2.5-flash endpoint for higher-quality document Q&A
Source-aware answers grounded in retrieved document excerpts
Settings panel for Gemini API key with local key storage in the desktop client
Stop generation and streaming responses for better UX during long outputs
Multi-session chat history with persistent local conversation state
Windows release pipeline via GitHub Actions for desktop installer generation

Architecture

┌──────────────────────────────────────────────────────────────────┐
│                        React 19 Frontend                        │
│                 Vite · TypeScript · Tailwind CSS                │
│                                                                  │
│  General Chat UI   RAG Chat UI   Settings UI   Session Manager   │
└──────────────────────────────┬───────────────────────────────────┘
                               │ Tauri invoke
┌──────────────────────────────▼───────────────────────────────────┐
│                       Rust Backend (Tauri)                       │
│                                                                  │
│  commands.rs                                                     │
│  - load_model                                                    │
│  - chat_general                                                  │
│  - chat_rag                                                      │
│  - chat_gemini_rag                                               │
│  - ingest_document                                               │
└───────────────┬───────────────────────────────┬──────────────────┘
                │                               │
        ┌───────▼────────┐              ┌──────▼─────────────────┐
        │ Local LLM Path │              │ Local RAG Path         │
        │ mistral.rs     │              │ pdf-extract            │
        │ GGUF inference │              │ fastembed              │
        │ general chat   │              │ LanceDB                │
        └────────────────┘              └──────────┬─────────────┘
                                                   │
                                  ┌────────────────▼────────────────┐
                                  │ Optional Hybrid Answer Engine   │
                                  │ Gemini 2.5 Flash generateContent│
                                  │ sends only retrieved excerpts   │
                                  └─────────────────────────────────┘

Key Technical Decisions

Area	Choice	Reason
Desktop shell	Tauri v2	Native desktop UX with lower overhead than Electron
Local model runtime	mistral.rs	Direct GGUF inference inside Rust backend
Embeddings	fastembed + BAAI/bge-small-en-v1.5	Fast local embedding generation
Vector store	LanceDB	Persistent local retrieval with simple Rust integration
Hybrid answering	Gemini 2.5 Flash	Better synthesis quality for document-grounded answers
Frontend	React 19 + Vite	Fast iteration and responsive chat UI

How It Works

General Chat

The user sends a prompt from the desktop UI.
Tauri invokes the Rust chat_general command.
The backend builds the prompt with chat history and streams tokens from the local GGUF model.
The UI renders the response incrementally.

Document Q&A

A PDF is selected and parsed locally.
Text is chunked, embedded, and stored in LanceDB.
A user question is embedded and matched against the local vector store.
The top chunks are assembled into grounded context.
DocuSage answers either with the local model or, if configured, through chat_gemini_rag using Gemini 2.5 Flash.

Privacy Model

Local mode: documents, embeddings, retrieval, and generation all stay on-device.
Hybrid mode: documents are still indexed and searched locally; only the retrieved excerpts and user question are sent to Gemini for final synthesis.

Getting Started

Prerequisites

Node.js 18+
npm
Rust stable toolchain
Tauri v2 system prerequisites

Run Locally

cd DocuSage
npm install
npm run tauri dev

GGUF Model Location

Place a .gguf file in one of these directories:

Platform	Default Path
Windows	`Documents\DocuSage\models\`
macOS	`~/Documents/DocuSage/models/`
Linux	`~/Documents/DocuSage/models/`

Or configure MODEL_PATH in DocuSage/src-tauri/.env:

MODEL_PATH=D:\DocuSage\models

Optional Gemini Hybrid Setup

Launch the app.
Open the Settings panel from the header.
Paste your Gemini API key.
Ask a question in document mode to route the final answer through Gemini 2.5 Flash.

Production Build

cd DocuSage
npm run tauri build

Project Structure

DocuSage/
├── src/
│   ├── App.tsx
│   ├── lib/api.ts
│   └── main.tsx
├── src-tauri/
│   ├── src/
│   │   ├── lib.rs
│   │   ├── commands.rs
│   │   ├── rag.rs
│   │   └── main.rs
│   ├── Cargo.toml
│   └── tauri.conf.json
├── public/
└── package.json

Environment Variables

Variable	Default	Description
`MODEL_PATH`	`~/Documents/DocuSage/models`	Directory containing `.gguf` files
`USE_GPU`	`0`	Enables GPU acceleration where supported
`CHAT_TEMPLATE`	auto-detected	Custom chat template override
`TOK_MODEL_ID`	auto-detected	Tokenizer model id override

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
.github		.github
DocuSage		DocuSage
.gitignore		.gitignore
Light-mode (1).png		Light-mode (1).png
Light-mode (2).png		Light-mode (2).png
README.md		README.md
RECRUITER_QA.md		RECRUITER_QA.md
Screenshot 2026-03-08 140928.png		Screenshot 2026-03-08 140928.png

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

DocuSage

Why I Built It

Screenshot

Features

Architecture

Key Technical Decisions

How It Works

General Chat

Document Q&A

Privacy Model

Getting Started

Prerequisites

Run Locally

GGUF Model Location

Optional Gemini Hybrid Setup

Production Build

Project Structure

Environment Variables

License

About

Uh oh!

Releases 12

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

DocuSage

Why I Built It

Screenshot

Features

Architecture

Key Technical Decisions

How It Works

General Chat

Document Q&A

Privacy Model

Getting Started

Prerequisites

Run Locally

GGUF Model Location

Optional Gemini Hybrid Setup

Production Build

Project Structure

Environment Variables

License

About

Topics

Resources

Uh oh!

Stars

Watchers

Forks

Releases 12

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages