In ancient cosmology, aether was the fifth element — the invisible medium believed to fill all space, through which light and meaning travelled. Not a thing you could hold. The substrate that made transmission possible.
This tool works the same way. It moves through your research corpus silently, surfaces the fragments most relevant to your question, and returns a structured answer with citations. Your documents stay local. Only your question and the top-ranked chunks are sent to the cloud, for the final synthesis step.
No local LLM. No Docker. No Ollama. Everything runs on your machine except the final synthesis step (Claude API).
# 1. Navigate to this folder
cd aether-rag-cli
# 2. Create a virtual environment
python -m venv .venv
.\.venv\Scripts\Activate.ps1 # Windows
# source .venv/bin/activate # macOS/Linux
# 3. Install dependencies and register the `aether` command
pip install -e .
# 4. Add your API key
copy .env.example .env # Windows (macOS/Linux: cp .env.example .env)
# Open .env and set: ANTHROPIC_API_KEY=sk-ant-...
The first run downloads the embedding model (~80 MB) and reranker (~80 MB) from Hugging Face. This happens only once.
cd aether-rag-cli
.\.venv\Scripts\Activate.ps1
aether ingest ./my-papers
Points at any folder. Recurses. Picks up PDF, DOCX, TXT, MD, PPTX. Already-ingested files are detected by hash and skipped on re-runs.
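The skip-on-re-run behaviour can be pictured with a short sketch. This is not the tool's actual code: the function names, the SHA-256 choice, and the in-memory `seen` set are assumptions for illustration. Only the extension list comes from this README.

```python
import hashlib
from pathlib import Path

# Extensions taken from the supported file types listed in this README.
SUPPORTED = {".pdf", ".docx", ".txt", ".md", ".pptx"}


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB blocks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for block in iter(lambda: f.read(1 << 20), b""):
            h.update(block)
    return h.hexdigest()


def files_to_ingest(folder: Path, seen: set[str]) -> list[Path]:
    """Walk `folder` recursively and return supported files not yet seen.

    `seen` is a hypothetical set of digests from earlier runs. It is
    updated in place, so duplicates within a single run are skipped too.
    """
    pending = []
    for path in sorted(folder.rglob("*")):
        if not path.is_file() or path.suffix.lower() not in SUPPORTED:
            continue
        digest = file_digest(path)
        if digest in seen:
            continue  # content already ingested, skip it
        seen.add(digest)
        pending.append(path)
    return pending
```

Hashing file contents rather than comparing names or timestamps means a renamed or copied file is still recognised, while an edited file gets a new digest and is picked up again.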
# Free dry-run: retrieves chunks, shows what would be sent to the LLM, estimates cost
aether query "What are the recurring themes across these papers?" --dry-run
# Live synthesis (requires API key)
aether query "What are the recurring themes across these papers?"
aether ingest <folder> # parse, chunk, embed, store
aether query "<question>" # retrieve + synthesize
aether query "<question>" --dry-run # retrieve only, no API call, free
aether query "<question>" --top-n 5 # limit chunks sent to LLM
aether tokens # show token usage from last query
aether reset # wipe the vector store
Supported file types: PDF, DOCX, TXT, MD, PPTX
Answer:
The corpus surfaces three consistent tensions: [1] users who trust AI financial
tools tend to outsource judgment rather than build it [2], while tools designed
around explanation rather than recommendation show the opposite effect [3]...
Sources:
- financial_literacy_meta.pdf (chunk 3, relevance: 0.91)
- trust_ai_outcomes.md (chunk 7, relevance: 0.87)
Tokens used: 847 input / 312 output
Estimated cost: $0.0051
Files → MarkItDown (parse) → RecursiveCharacterTextSplitter (~400 tokens/chunk)
→ sentence-transformers (embed locally) → ChromaDB (store on disk)
Query → dense retrieval + BM25 keyword search → RRF fusion
→ cross-encoder reranking → top 10 chunks
→ Claude API (only step that costs tokens) → answer + citations
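The RRF fusion step in the query path merges the dense and BM25 rankings into one list. Below is a minimal sketch of Reciprocal Rank Fusion in general, not the tool's own implementation. The function name, the chunk IDs, and the constant `k=60` (a commonly used default) are assumptions.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of chunk IDs with Reciprocal Rank Fusion.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed, commonly used constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by chunk ID for determinism.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


dense = ["c3", "c7", "c1"]  # hypothetical dense-retrieval order
bm25 = ["c7", "c9", "c3"]   # hypothetical BM25 order
print(rrf_fuse([dense, bm25]))  # ['c7', 'c3', 'c9', 'c1']
```

Because RRF uses only rank positions, the dense similarity scores and BM25 scores never need to be put on a common scale. Chunks that rank well in both lists rise to the top before the cross-encoder reranks them.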
Token cost is a fraction of full-context injection — roughly 1,250× cheaper per query at scale.
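One way to read that ratio: if per-query cost scales with input tokens, the saving is roughly the corpus size divided by the retrieved context size. The sketch below uses the top 10 chunks at about 400 tokens each from the pipeline above. The 5,000,000-token corpus is a hypothetical figure chosen so the arithmetic lands on 1,250; this README does not state how the number was derived.

```python
def context_ratio(corpus_tokens: int, top_n: int = 10, chunk_tokens: int = 400) -> float:
    """Ratio of full-corpus input tokens to retrieved-context input tokens."""
    retrieved = top_n * chunk_tokens  # about 4,000 tokens sent per query
    return corpus_tokens / retrieved


# Hypothetical corpus size, not a measured value.
print(context_ratio(5_000_000))  # 1250.0
```

The ratio grows with corpus size and shrinks as `--top-n` goes up, so the saving on a small corpus will be far lower.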
- Python 3.9+
- An Anthropic API key (for live synthesis — dry-run is free)
MIT — see LICENSE