ShadeRipper/aether-rag-cli

Aether RAG CLI

In ancient cosmology, aether was the fifth element — the invisible medium believed to fill all space, through which light and meaning travelled. Not a thing you could hold. The substrate that made transmission possible.

This tool works the same way. It moves through your research corpus silently, surfaces the fragments most relevant to your question, and returns a structured answer with citations. Your documents stay local: parsing, embedding, and retrieval run on your machine, and only your question and the retrieved chunks are sent to the cloud for the final synthesis.

No local LLM. No Docker. No Ollama. Everything runs on your machine except the final synthesis step (Claude API).


Install

# 1. Navigate to this folder
cd aether-rag-cli

# 2. Create a virtual environment
python -m venv .venv
.\.venv\Scripts\Activate.ps1     # Windows
# source .venv/bin/activate       # macOS/Linux

# 3. Install dependencies and register the `aether` command
pip install -e .

# 4. Add your API key
copy .env.example .env           # Windows
# cp .env.example .env           # macOS/Linux
# Open .env and set: ANTHROPIC_API_KEY=sk-ant-...

The first run downloads the embedding model (~80 MB) and the reranker (~80 MB) from Hugging Face. This happens only once.


Usage

Every session

cd aether-rag-cli
.\.venv\Scripts\Activate.ps1     # Windows
# source .venv/bin/activate       # macOS/Linux

Step 1 — Ingest your research files

aether ingest ./my-papers

Point it at any folder. It recurses into subfolders and picks up PDF, DOCX, TXT, MD, and PPTX files. Files that were already ingested are detected by hash and skipped on re-runs.
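The skip-on-re-run behaviour can be pictured with a minimal sketch. This is not the tool's actual code: the SHA-256 choice, the in-memory `seen` set, and the names `file_digest` and `files_to_ingest` are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path

# File extensions listed in this README.
SUPPORTED = {".pdf", ".docx", ".txt", ".md", ".pptx"}

def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB blocks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()

def files_to_ingest(folder: Path, seen: set[str]) -> list[Path]:
    """Walk `folder` recursively and return supported files not yet in `seen`.

    `seen` is a hypothetical store of digests from earlier runs; it is
    updated in place, so identical content is only returned once.
    """
    todo = []
    for path in sorted(folder.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in SUPPORTED:
            continue
        digest = file_digest(path)
        if digest not in seen:
            seen.add(digest)
            todo.append(path)
    return todo
```

Hashing the bytes rather than comparing file names means a renamed file is still recognised, while an edited file is picked up again. A real implementation would need to persist the digests between runs.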

Step 2 — Ask a question

# Free dry-run: retrieves chunks, shows what would be sent to the LLM, estimates cost
aether query "What are the recurring themes across these papers?" --dry-run

# Live synthesis (requires API key)
aether query "What are the recurring themes across these papers?"

All commands

aether ingest <folder>                     # parse, chunk, embed, store
aether query "<question>"                  # retrieve + synthesize
aether query "<question>" --dry-run        # retrieve only, no API call, free
aether query "<question>" --top-n 5        # limit chunks sent to LLM
aether tokens                              # show token usage from last query
aether reset                               # wipe the vector store

Supported file types: PDF, DOCX, TXT, MD, PPTX


Output

Answer:

The corpus surfaces three consistent tensions: [1] users who trust AI financial
tools tend to outsource judgment rather than build it [2], while tools designed
around explanation rather than recommendation show the opposite effect [3]...

Sources:
  - financial_literacy_meta.pdf (chunk 3, relevance: 0.91)
  - trust_ai_outcomes.md (chunk 7, relevance: 0.87)

Tokens used: 847 input / 312 output
Estimated cost: $0.0051

How it works

Files → MarkItDown (parse) → RecursiveCharacterTextSplitter (~400 tokens/chunk)
     → sentence-transformers (embed locally) → ChromaDB (store on disk)

Query → dense retrieval + BM25 keyword search → RRF fusion
      → cross-encoder reranking → top 10 chunks
      → Claude API (only step that costs tokens) → answer + citations
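The RRF fusion step in the query pipeline above merges the dense and BM25 result lists by rank rather than by raw score. A minimal sketch of reciprocal rank fusion follows; the constant `k = 60` is the commonly used default and an assumption here, as are the function name `rrf_fuse` and the chunk IDs. The tool's own constant and data structures may differ.

```python
from collections import defaultdict

def rrf_fuse(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Fuse several ranked lists of chunk IDs with reciprocal rank fusion.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by chunk ID for determinism.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))

# Hypothetical result lists, best match first.
dense = ["c3", "c1", "c7"]
bm25 = ["c1", "c9", "c3"]

fused = rrf_fuse([dense, bm25])
print([chunk_id for chunk_id, _ in fused])  # ['c1', 'c3', 'c9', 'c7']
```

Chunks found by both retrievers rise to the top, and because only ranks are used, the dense similarity scores and BM25 scores never need to be put on a common scale.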

Because only the top-ranked chunks are sent to the Claude API instead of the whole corpus, token cost is a fraction of full-context injection: roughly 1,250× cheaper per query at scale.
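One way to read the 1,250× figure is as a ratio of input tokens. The README does not say how the figure was derived, so the sketch below is only an illustration: it uses the top 10 chunks and ~400 tokens per chunk from the pipeline above, and the 5,000,000-token corpus is a hypothetical size chosen because it reproduces the ratio. It ignores the question, prompt overhead, and output tokens.

```python
def context_tokens(top_n: int = 10, tokens_per_chunk: int = 400) -> int:
    """Approximate tokens of retrieved context sent per query."""
    return top_n * tokens_per_chunk

def savings_ratio(corpus_tokens: int, top_n: int = 10,
                  tokens_per_chunk: int = 400) -> float:
    """Input-token ratio of sending the whole corpus vs. retrieved chunks."""
    return corpus_tokens / context_tokens(top_n, tokens_per_chunk)

# Hypothetical corpus size (assumption): 5,000,000 tokens.
print(context_tokens())          # 4000
print(savings_ratio(5_000_000))  # 1250.0
```

Under these assumptions the ratio grows linearly with corpus size, which is consistent with the "at scale" qualifier: a small corpus yields a much smaller saving.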


Requirements


License

MIT — see LICENSE

About

Local-first research synthesis CLI. Hybrid retrieval, cross-encoder reranking, cited answers.
