dharminnagar/obsidian-research-agent

Obsidian Research Orchestrator

The Obsidian Research Orchestrator is an automated system designed to transform diverse information sources—including URLs, PDFs, YouTube videos, and raw text—into a deeply synthesized, interconnected knowledge base within an Obsidian vault.

This tool follows a structured "Deep Synthesis" workflow to produce atomic notes that are educational, self-contained, and properly cited.

Demo video: research-agent.mov

Features

  • Multi-Source Ingestion: Concurrent fetching from web pages, PDF documents, and YouTube transcripts.
  • Deep Synthesis: Analysis of cross-source patterns, identification of core concepts, and mapping of thematic connections.
  • Atomic Note Generation: Creation of comprehensive (800–1200 word) notes for every major concept identified.
  • Automated Organization:
    • MOC.md: A Map of Content serving as a central index with Mermaid mindmaps.
    • Mindmap.md: A standalone visual overview of the research topic.
    • Key-Quotes.md: A centralized repository of verbatim, verified quotes with full citations.
  • Standardized Output: Strict adherence to Obsidian's YAML frontmatter and WikiLink conventions.
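As a rough illustration of the "Standardized Output" feature, the sketch below renders a note with YAML frontmatter and WikiLinks. The function name `render_note` and the frontmatter keys (`title`, `tags`, `sources`) are assumptions made for this example; the actual templates live in SKILL.md and may differ.

```python
import json


def render_note(title: str, tags: list[str], sources: list[str],
                body: str, related: list[str]) -> str:
    """Render an Obsidian-style note: YAML frontmatter, body, then WikiLinks.

    Field names here are illustrative, not taken from SKILL.md.
    """
    # json.dumps yields a double-quoted string that is also a valid YAML scalar,
    # so titles containing colons or quotes do not break the frontmatter.
    lines = ["---", f"title: {json.dumps(title)}", "tags:"]
    lines += [f"  - {tag}" for tag in tags]
    lines.append("sources:")
    lines += [f"  - {json.dumps(src)}" for src in sources]
    lines += ["---", "", f"# {title}", "", body.strip(), ""]
    if related:
        lines.append("## Related")
        # Obsidian resolves [[Name]] to the note with that file name.
        lines += [f"- [[{name}]]" for name in related]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_note(
        "Spaced Repetition",
        ["learning"],
        ["https://example.com/article"],
        "Body text goes here.",
        ["Active Recall"],
    ))
```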

Project Structure

  • scripts/
    • fetch_url.py: Scrapes and cleans web content.
    • fetch_pdf.py: Extracts text from PDF documents.
    • fetch_youtube.py: Retrieves and processes YouTube video transcripts.
    • requirements.txt: Python dependencies for the fetching scripts.
  • SKILL.md: Detailed orchestration logic, workflow phases, and file templates.
  • research-prompt.txt: The foundational instructions for the research agent.
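To show how the three fetch scripts divide the work, here is a minimal sketch that maps a source string to one of them. The helper `pick_script` and the host list are hypothetical and not part of the repository; the README does not say whether fetch_pdf.py accepts remote URLs, so this sketch only routes local `.pdf` paths to it.

```python
from pathlib import Path
from urllib.parse import urlparse

# Hostnames treated as YouTube in this sketch (an assumption, not from the repo).
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def pick_script(source: str) -> str:
    """Return the path of the fetch script that matches the source type."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        host = (parsed.hostname or "").lower()
        if host in YOUTUBE_HOSTS:
            return "scripts/fetch_youtube.py"
        # Every other web address goes to the generic page scraper.
        return "scripts/fetch_url.py"
    if Path(source).suffix.lower() == ".pdf":
        return "scripts/fetch_pdf.py"
    raise ValueError(f"Unsupported source: {source!r}")


if __name__ == "__main__":
    for src in ("https://example.com/article", "/path/to/document.pdf"):
        print(src, "->", pick_script(src))
```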

Installation

Follow these steps to set up the project and its dependencies:

  1. Clone the repository to your local machine:

    git clone https://github.com/dharminnagar/obsidian-research-agent.git
    cd obsidian-research-agent
  2. Create and activate a Python virtual environment:

    python3 -m venv venv
    source venv/bin/activate
  3. Install the required Python libraries:

    pip install -r scripts/requirements.txt

Integration as an Agent Skill

To use this project as a skill for an AI agent, move or symlink the project directory into your local skills folder.

  1. Create the skills directory if it does not exist:

    mkdir -p ~/.agents/skills/
  2. Move or symlink the project folder into the skills directory:

    ln -s "$(pwd)" ~/.agents/skills/obsidian-research-orchestrator

Once installed, the agent can activate the skill by referencing the instructions in SKILL.md to begin a research session.
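If you want to confirm the link was created correctly, a quick check like the following may help. It is only a sketch: the function name `skill_is_installed` is made up for this example, and it assumes the skill is considered installed when SKILL.md is reachable under the skill folder.

```python
from pathlib import Path


def skill_is_installed(skills_dir: Path,
                       name: str = "obsidian-research-orchestrator") -> bool:
    """Return True if SKILL.md is reachable inside the named skill folder.

    Path.is_file() follows symlinks, so this works for both moved and
    symlinked project directories.
    """
    return (skills_dir / name / "SKILL.md").is_file()


if __name__ == "__main__":
    skills = Path.home() / ".agents" / "skills"
    print("installed" if skill_is_installed(skills) else "not found")
```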

Usage

Automated Research

When triggered as a skill, the agent will:

  1. Ingest provided sources using the scripts in the scripts/ directory.
  2. Perform a mandatory synthesis phase to map concepts and connections.
  3. Iteratively write the MOC, atomic notes, and quote repository to the specified Obsidian vault path.
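The third step can be pictured with a small sketch that builds a MOC.md body containing a Mermaid mindmap and a WikiLink index. The function `render_moc` and the section headings are assumptions for illustration; the real layout is defined by the templates in SKILL.md. Concept names containing brackets or parentheses may need escaping for Mermaid, which this sketch does not handle.

```python
def render_moc(topic: str, concepts: list[str]) -> str:
    """Build a Map of Content: a Mermaid mindmap plus links to each note."""
    fence = "`" * 3  # assembled at runtime so the backticks do not end this block
    lines = [
        f"# {topic}",
        "",
        "## Mindmap",
        "",
        f"{fence}mermaid",
        "mindmap",
        f"  root(({topic}))",
    ]
    # In Mermaid mindmaps, indentation defines the parent/child hierarchy.
    lines += [f"    {concept}" for concept in concepts]
    lines += [fence, "", "## Notes", ""]
    lines += [f"- [[{concept}]]" for concept in concepts]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_moc("Learning", ["Spaced Repetition", "Active Recall"]))
```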

Manual Script Execution

The fetching scripts can also be used independently to extract text from sources:

# Extract text from a URL
python3 scripts/fetch_url.py "https://example.com/article"

# Extract text from a PDF
python3 scripts/fetch_pdf.py "/path/to/document.pdf"

# Extract transcript from a YouTube video
python3 scripts/fetch_youtube.py "https://www.youtube.com/watch?v=VIDEO_ID"
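To run several of these commands at once, something like the sketch below could be used. It assumes each script prints the extracted text to standard output and exits non-zero on failure, which the README does not state; `run_fetch` and `fetch_all` are hypothetical helpers, not part of the repository.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_fetch(script: str, source: str) -> str:
    """Run one fetch script and return its stdout (assumed to be the text)."""
    result = subprocess.run(
        [sys.executable, script, source],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def fetch_all(jobs: list[tuple[str, str]], runner=run_fetch, max_workers: int = 4):
    """Run (script, source) jobs concurrently.

    Returns two dicts keyed by source: fetched text and error messages.
    """
    results: dict[str, str] = {}
    errors: dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(runner, script, source): source
                   for script, source in jobs}
        for future, source in futures.items():
            try:
                results[source] = future.result()
            except Exception as exc:  # one failed source should not stop the rest
                errors[source] = str(exc)
    return results, errors


if __name__ == "__main__":
    texts, failed = fetch_all([
        ("scripts/fetch_url.py", "https://example.com/article"),
        ("scripts/fetch_pdf.py", "/path/to/document.pdf"),
    ])
    print(f"{len(texts)} fetched, {len(failed)} failed")
```

Threads are sufficient here because each job spends its time waiting on a subprocess rather than running Python code.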

License

MIT License - see the LICENSE file for details.


Made with ❤️ by Dharmin Nagar

If this helps you, consider giving it a ⭐ on GitHub!

About

Give it a topic and some URLs, PDFs, or YouTube links. It fetches everything, searches the web for more context, synthesizes across all sources, and writes structured notes directly into your Obsidian vault.
