DocJarvis - AI Medical Assistant

Medical Disclaimer: DocJarvis is an AI-assisted tool for educational and informational purposes only. It does not constitute medical advice, diagnosis, or treatment. Always consult a qualified healthcare professional.

DocJarvis is a multilingual, voice-first medical consultation assistant built on a CrewAI multi-agent architecture. It takes a patient through symptom collection, AI-driven diagnosis, medical recommendations, and prescription generation, with a Human-In-The-Loop (HITL) doctor review step implemented via Gmail MCP (Model Context Protocol) before any prescription is finalised.

Architecture Overview

The system has two parallel API tracks:

V1: single-agent, stateless endpoints for simple integrations and the legacy frontend flow.
V2: the full multi-agent workflow. All new development targets V2.

Multi-agent System

The V2 pipeline uses five CrewAI agents, each with a dedicated tool set and a scoped role:

Agent	Role	Tools	Step(s)
`speech_processor`	Transcribes audio and synthesises TTS responses	`TextToSpeechTool`	2, 8
`translator`	Translates between patient language and English	`TranslationTool`	3, 4
`qna_generator`	Generates exactly 3 focused diagnostic questions	`QuestionGenerationTool`	4
`medication`	Produces evidence-based medication recommendations	`MedicationTool`	7
`prescription_specialist`	Generates prescriptions and manages Gmail MCP review loop	`PrescriptionTool`, `GMailMCPSendTool`, `GMailMCPReadTool`	9, 10

Agents are pre-instantiated module-level singletons (medical_agents.py) and are loaded into MedicalCrew at startup. The crew validates its configuration during the FastAPI lifespan and logs a warning (without blocking startup) if initialisation fails.
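The non-blocking startup check described above can be sketched as follows. This is a minimal illustration using only the standard library: MedicalCrewSketch, validate(), and make_lifespan() are hypothetical names, and the project's real MedicalCrew may expose a different interface.

```python
import logging
from contextlib import asynccontextmanager

logger = logging.getLogger("docjarvis.startup")


class MedicalCrewSketch:
    """Hypothetical stand-in for MedicalCrew; not the project's class."""

    def __init__(self, agents):
        self.agents = list(agents)
        self.ready = False

    def validate(self):
        # Fail loudly when no agents were registered.
        if not self.agents:
            raise ValueError("no agents configured")
        self.ready = True


def make_lifespan(crew):
    """Build a lifespan context manager that tolerates a failed crew check."""

    @asynccontextmanager
    async def lifespan(app):
        try:
            crew.validate()
        except Exception as exc:
            # Log and carry on: a broken crew must not block startup.
            logger.warning("Crew initialisation failed: %s", exc)
        yield  # the application serves requests while suspended here

    return lifespan
```

A context manager of this shape can be passed to FastAPI through its lifespan argument, for example FastAPI(lifespan=make_lifespan(crew)). Endpoints that depend on the crew would then need to check crew.ready themselves.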

Agent Tools

All agent tools inherit from CrewAI's BaseTool. Because FastAPI runs in an async event loop and CrewAI tools call _run() synchronously, async service calls (LLMs, TTS) are dispatched to a dedicated ThreadPoolExecutor via a _run_async() helper. This avoids "RuntimeError: This event loop is already running".
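One way such a helper could look is sketched below. It is an assumption about the approach, not the project's actual _run_async(): run_async, synthesize, and TextToSpeechToolSketch are illustrative names, and the tool class deliberately does not subclass CrewAI's BaseTool so the example stays self-contained.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

# Dedicated pool so tool calls never touch the server's own event loop.
_TOOL_EXECUTOR = ThreadPoolExecutor(max_workers=4, thread_name_prefix="tool-async")


def run_async(coro):
    """Run a coroutine to completion on a worker thread and return its result.

    asyncio.run() creates a fresh event loop inside the worker thread, so this
    is safe to call from a thread that already has a running loop.
    """
    return _TOOL_EXECUTOR.submit(asyncio.run, coro).result()


async def synthesize(text: str) -> str:
    """Hypothetical async TTS service call."""
    await asyncio.sleep(0)
    return f"audio:{text}"


class TextToSpeechToolSketch:
    """Stand-in for a BaseTool subclass; only the sync _run() hook is shown."""

    def _run(self, text: str) -> str:
        return run_async(synthesize(text))
```

Note that .result() blocks the calling thread until the coroutine finishes. If _run() is invoked directly on the event loop thread, the loop is stalled for that duration, so the crew itself is best started from a worker thread as well.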

Tech Stack

Backend

Component	Technology
Framework	FastAPI 0.100+ with async lifespan
LLM	Google Gemini 2.5 Flash via `langchain-google-genai`
Agent orchestration	CrewAI
STT	Google Speech Recognition (`speech_recognition`)
TTS	Google TTS (`gTTS`) + `pydub` for format conversion
Translation	`deep-translator` (GoogleTranslator) + `langdetect`
Session store (dev)	In-memory dict
Session store (prod)	Redis (`redis-py` async)
Tracing	OpenTelemetry (OTLP gRPC exporter)
LLM tracing	LangSmith
MCP	Gmail MCP server (custom `GMailMCPClient`)
Config	Pydantic Settings v2
Runtime	Python 3.11, Uvicorn

Frontend

Component	Technology
Framework	React 19 + TypeScript 5.5
Build	Vite 7
State	Zustand 5 with `devtools` + `persist` middleware
Styling	Tailwind CSS 3.4
Audio capture	`MediaRecorder` API (WebM/Opus → server STT)
STT (Q&A phase)	Web Speech Recognition API
TTS (intro)	Web Speech Synthesis API
HTTP	Fetch API (custom `V1ApiClient` / `V2ApiClient`)
Testing	Vitest + Testing Library

Infrastructure

Component	Technology
Reverse proxy	Nginx (TLS 1.2/1.3, HTTP/2, SSE support)
Containerisation	Docker + Docker Compose
CI	GitHub Actions
Metrics / traces	OpenTelemetry Collector → OTLP endpoint

Project Structure

docjarvis/
├── backend/                      # Python FastAPI backend
│   ├── src/
│   │   ├── api/
│   │   │   ├── __init__.py
│   │   │   ├── main.py          # FastAPI app
│   │   │   ├── schemas.py
│   │   │   ├── routes/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── diagnosis.py
│   │   │   │   ├── health_checks.py
│   │   │   │   ├── helpers.py
│   │   │   │   ├── monitoring.py
│   │   │   │   ├── prescription.py
│   │   │   │   ├── sessions.py
│   │   │   │   ├── workflow_routes.py
│   │   │   └── middleware/
│   │   │   │   ├── __init__.py
│   │   │   │   ├── error_handler.py
│   │   │   │   └── logging.py
│   │   ├── config/
│   │   │   ├── __init__.py
│   │   │   ├── monitoring.py
│   │   │   ├── settings.py
│   │   ├── core/
│   │   │   ├── __init__.py
│   │   │   ├── diagnosis.py
│   │   │   ├── llm_manager.py
│   │   │   ├── mcp_client.py
│   │   │   ├── prescription.py
│   │   │   ├── crew_ai/
│   │   │   │   ├── tools/
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   ├── gmail_mcp_tools.py
│   │   │   │   │   ├── medical_tools.py
│   │   │   │   ├── workflows/
│   │   │   │   │   ├── __init__.py
│   │   │   │   │   ├── mcp_workflow.py
│   │   │   │   │   └── session_workflow.py
│   │   │   │   ├── __init__.py
│   │   │   │   ├── constants.py
│   │   │   │   ├── medical_agents.py
│   │   │   │   └── medical_crew.py
│   │   ├── monitoring/
│   │   │   ├── __init__.py
│   │   │   ├── cache_manager.py
│   │   │   ├── dashboard.py
│   │   │   ├── load_balancer.py
│   │   │   ├── performance_monitor.py
│   │   ├── services/
│   │   │   ├── __init__.py
│   │   │   ├── session_store.py
│   │   │   ├── speech.py
│   │   │   ├── translation.py
│   │   ├── utils/
│   │   │   ├── __init__.py
│   │   │   ├── backstories.py
│   │   │   ├── consts.py
│   │   │   ├── exceptions.py
│   │   │   ├── file_handler.py
│   │   │   ├── helpers.py
│   │   │   └── task_descriptions.py
│   ├── tests/
│   │   ├── conftest.py
│   │   ├── integration/
│   │   │   ├── test_monitoring_health.py
│   │   │   ├── test_session_lifecycle.py
│   │   │   ├── test_sessions_api.py
│   │   │   └── test_workflow_routes.py
│   │   └── unit/
│   │   │   ├── test_cache_manager.py
│   │   │   ├── test_consts.py
│   │   │   ├── test_diagnosis.py
│   │   │   ├── test_helpers.py
│   │   │   ├── test_mcp_workflow.py
│   │   │   ├── test_monitoring.py
│   │   │   ├── test_session_store.py
│   │   │   └── test_session_workflow.py
│   ├── pyproject.toml
│   ├── requirements.txt
│   └── Dockerfile
├── frontend/                     # React TypeScript frontend
│   ├── src/
│   │   ├── api/
│   │   │   ├── client.ts        # API client
│   │   ├── components/
│   │   │   ├── consultation/
│   │   │   │   ├── index.ts
│   │   │   │   ├── ConversationDisplay.tsx
│   │   │   │   ├── ConversationPane.tsx
│   │   │   │   ├── PatientForm.tsx
│   │   │   │   ├── PrescriptionPane.tsx
│   │   │   │   ├── PrescriptionReview.tsx
│   │   │   │   ├── VoiceConsultation.tsx
│   │   │   ├── layout/
│   │   │   │   ├── index.ts
│   │   │   │   ├── Header.tsx
│   │   │   │   ├── Footer.tsx
│   │   │   ├── speech/
│   │   │   │   ├── index.ts
│   │   │   │   ├── SpeechControls.tsx
│   │   │   │   ├── VoiceInput.tsx
│   │   │   └── ui/ # Reusable UI components
│   │   │   │   ├── index.ts
│   │   │   │   ├── Alert.tsx
│   │   │   │   ├── Button.tsx
│   │   │   │   ├── Card.tsx
│   │   │   │   ├── Input.tsx
│   │   │   │   ├── ProgressBar.tsx
│   │   │   │   ├── Select.tsx
│   │   │   │   ├── Spinner.tsx
│   │   │   │   └── TextArea.tsx
│   │   ├── hooks/
│   │   │   ├── index.ts
│   │   │   ├── useAudioRecording.ts
│   │   │   ├── useLocalStorage.ts
│   │   │   ├── useSpeechRecognition.ts
│   │   │   ├── useSpeechSynthesis.ts
│   │   ├── utils/
│   │   │   ├── constants.ts
│   │   │   ├── consultationStore.ts
│   │   │   └── index.ts
│   │   ├── App.tsx
│   │   ├── main.tsx
│   │   └── index.css
│   ├── public/
│   ├── Dockerfile
│   ├── env.d.ts
│   ├── index.html
│   ├── nginx.conf
│   ├── package.json
│   ├── package-lock.json
│   ├── tailwind.config.js
│   ├── tsconfig.json
│   ├── tsconfig.node.json
│   └── vite.config.ts
├── .github/
│   └── workflows/
│       ├── ci.yml
│       └── deploy.yml
├── .gitignore
├── .pylintrc
├── Pytest.ini
├── docker-compose.yml
├── otel-config.yaml
├── package.json
└── README.md

API Reference

Full interactive docs are available at http://localhost:8000/docs in debug mode.

V1 — Legacy Single-Agent

Method	Endpoint	Description
`POST`	`/api/v1/sessions/`	Create a session
`GET`	`/api/v1/sessions/{id}`	Get full session state
`POST`	`/api/v1/sessions/{id}/answer`	Submit a text answer
`POST`	`/api/v1/sessions/{id}/transcribe`	Submit audio (STT + next question)
`POST`	`/api/v1/sessions/{id}/complete`	Complete session and get medication
`POST`	`/api/v1/sessions/{id}/complete/stream`	Streaming medication (SSE)
`DELETE`	`/api/v1/sessions/{id}`	Delete session
`POST`	`/api/v1/diagnosis/questions`	Generate questions from complaint text
`POST`	`/api/v1/prescription/{id}/generate`	Generate prescription document
`GET`	`/api/v1/prescription/{id}/download`	Download prescription file

V2 — CrewAI Multi-Agent Workflow

All V2 workflow endpoints accept multipart/form-data (FastAPI Form parameters).

Method	Endpoint	Step	Description
`POST`	`/api/v2/workflow/welcome-audio`	1	Generate TTS welcome audio
`POST`	`/api/v2/workflow/process-initial-symptom`	2–4	STT → translation → 3 diagnostic questions
`POST`	`/api/v2/workflow/answer-question/{id}`	5–6	Record a Q&A answer
`POST`	`/api/v2/workflow/generate-recommendations/{id}`	7	CrewAI diagnosis + pharmacist agents
`POST`	`/api/v2/workflow/recommendations-audio`	8	TTS of recommendations
`POST`	`/api/v2/workflow/generate-prescription/{id}`	9–10	Generate PDF + Gmail MCP send
`POST`	`/api/v2/workflow/doctor-response`	MCP	Parse doctor's APPROVE/MODIFY/REJECT reply
`GET`	`/api/v2/workflow/session/{id}/status`	—	Poll session progress
`DELETE`	`/api/v2/workflow/session/{id}`	—	Delete session
`GET`	`/api/v2/workflow/health`	—	Crew health check
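Because the V2 workflow endpoints expect multipart/form-data rather than JSON, a client has to encode its fields accordingly. The sketch below builds such a request with the standard library only and does not send it. The field name language is a placeholder: the real form parameter names are not listed here and should be taken from the Swagger UI.

```python
import uuid
from urllib.request import Request


def build_form_request(url: str, fields: dict) -> Request:
    """Encode text fields as multipart/form-data and return a POST request."""
    boundary = uuid.uuid4().hex
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")  # blank line separates part headers from the value
        lines.append(str(value))
    lines.append(f"--{boundary}--")  # closing boundary
    lines.append("")
    body = "\r\n".join(lines).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )


# "language" is a hypothetical field name used for illustration only.
request = build_form_request(
    "http://localhost:8000/api/v2/workflow/welcome-audio",
    {"language": "en"},
)
```

Passing the resulting object to urllib.request.urlopen() would perform the call against a running backend. Audio uploads additionally need a filename and a Content-Type header on the file part, which this sketch omits.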

V2 — Monitoring

Method	Endpoint	Description
`GET`	`/api/v2/monitoring/dashboard`	Full metrics dashboard
`GET`	`/api/v2/monitoring/performance`	Agent P50/P95/P99 latency + error rates
`GET`	`/api/v2/monitoring/cache`	Cache hit rate and per-agent config
`POST`	`/api/v2/monitoring/cache/clear`	Clear all agent caches
`POST`	`/api/v2/monitoring/cache/clear/{agent}`	Clear single agent cache
`GET`	`/api/v2/monitoring/load-balancing`	Concurrency load per agent
`GET`	`/api/v2/monitoring/agents`	Per-agent health status
`GET`	`/api/v2/monitoring/health`	Overall system health score

V2 — Health

Method	Endpoint	Description
`GET`	`/api/v2/health/ready`	Kubernetes readiness probe (LLM, crew, cache, load balancer)
`GET`	`/api/v2/health/deep`	Full diagnostic: agents, resources, MCP
`GET`	`/api/v2/health/startup`	Post-init startup check
`GET`	`/health`	Root health (used by load balancer)
`GET`	`/ready`	Root readiness (used by load balancer)

Getting Started

Prerequisites

Requirement	Version
Python	3.11+
Node.js	20+
Docker + Docker Compose	24+
Google Cloud account	—
Gmail account (for MCP)	—

Environment Variables

Create a .env file in the project root from the template below:

# -- LLM -------------------------
GOOGLE_API_KEY=your_google_api_key
# -- Application -------------------------
ENVIRONMENT=dev # dev | staging | prod
# -- Session Store -------------------------
# Leave blank to use in-memory store (dev).
# Set to redis://... for production
REDIS_URL=
# -- MCP / GMail -------------------------
# Gmail MCP server endpoint (e.g. a locally running MCP server or hosted URL)
GMAIL_SERVER=http://localhost:3001
# -- Doctor's email address for prescription review -------------------------
DOCTOR_EMAIL=doctor@hospital.com
# -- LangSmith (optional) -------------------------
LANGSMITH_API_KEY=
LANGSMITH_PROJECT=docjarvis
LANGSMITH_TRACING=false
# -- OpenTelemetry (optional) -------------------------
OTEL_ENABLED=false
OTEL_EXPORTER_ENDPOINT=http://localhost:4317
OTEL_SERVICE_NAME=docjarvis-backend
# -- Frontend -------------------------
VITE_API_URL_V1=http://localhost:8000/api/v1
VITE_API_URL_V2=http://localhost:8000/api/v2

Local Development

Backend:

cd backend
python -m venv .venv
source .venv/bin/activate          # Windows: .venv\Scripts\activate
pip install -r requirements.txt
uvicorn src.api.main:app --reload --host 0.0.0.0 --port 8000

The API will be live at http://localhost:8000. Swagger UI is available at http://localhost:8000/docs (debug mode only).

Frontend:

The Vite dev server requires a local TLS certificate because the MediaRecorder and Web Speech APIs need HTTPS (a secure context), which some browsers enforce even on localhost. Generate one with mkcert:

mkcert -install
mkcert localhost
# Move the generated files into the frontend directory:
mv localhost.pem localhost-key.pem frontend/
cd frontend
npm install
npm run dev

Docker Compose

# Build and start all services
docker compose up --build

# Backend only
docker compose up backend

# With Redis session store
REDIS_URL=redis://redis:6379 docker compose up

Services:

Service	Port	Description
`backend`	`8000`	FastAPI app
`frontend`	`443`	Nginx + React (HTTPS)
`redis`	`6379`	Session store (prod profile)

Workflow: End-to-End Flow

MCP Integration

DocJarvis implements a HITL review step using the Gmail MCP server. No prescription is finalised without explicit doctor approval.

How it works

The prescription_specialist CrewAI agent calls GMailMCPSendTool to send a formatted HTML email to DOCTOR_EMAIL containing the prescription and review instructions.
The email body lists the structured commands the doctor can reply with:
- APPROVE #<review_id>: approve as written
- MODIFY #<review_id> - <changes>: approve with modifications
- REJECT #<review_id> - <reason>: reject
The MCPWorkflowManager polls Gmail via GMailMCPReadTool every 10 seconds for replies (configurable via POLL_INTERVAL_SECONDS).
On receipt, _parse_action() uses regex matching to extract the command and routes it to the appropriate outcome handler.
The frontend PrescriptionReview component lets users paste the doctor's reply directly, as a fallback for environments where polling is unavailable.
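The command parsing step can be illustrated with a short sketch. It is not the project's _parse_action(): the DoctorDecision type, the pattern, and the assumption that review IDs consist of letters, digits, and underscores are all illustrative.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DoctorDecision:
    action: str      # "APPROVE", "MODIFY" or "REJECT"
    review_id: str
    detail: str      # requested changes or rejection reason; may be empty


# The command must start its own line, so quoted lines ("> APPROVE ...")
# copied from the original email are ignored.
_COMMAND_RE = re.compile(
    r"^[ \t]*(?P<action>APPROVE|MODIFY|REJECT)[ \t]+#(?P<review_id>[A-Za-z0-9_]+)"
    r"(?:[ \t]*-[ \t]*(?P<detail>[^\r\n]+?))?[ \t]*\r?$",
    re.IGNORECASE | re.MULTILINE,
)


def parse_action(body: str) -> Optional[DoctorDecision]:
    """Return the first command found in an email body, or None."""
    match = _COMMAND_RE.search(body)
    if match is None:
        return None
    return DoctorDecision(
        action=match.group("action").upper(),
        review_id=match.group("review_id"),
        detail=(match.group("detail") or "").strip(),
    )
```

Returning None for unrecognised replies lets the caller keep polling or fall back to manual entry instead of guessing the doctor's intent. Whether a MODIFY or REJECT without a reason should be accepted is a policy decision this sketch leaves to the caller.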

MCP server setup

The Gmail MCP server must be running and accessible at GMAIL_SERVER. Refer to the MCP server's own documentation for OAuth2 credential setup. The GMailMCPClient connects on first use and reconnects if disconnected.
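The connect-on-first-use and reconnect behaviour follows a common pattern, sketched here with an injected transport. LazyClientSketch and FakeConnection are hypothetical and do not reflect the real GMailMCPClient interface.

```python
class FakeConnection:
    """Demo transport used only to exercise the sketch."""

    def __init__(self):
        self._open = True
        self.sent = []

    def is_open(self):
        return self._open

    def close(self):
        self._open = False

    def send(self, message):
        self.sent.append(message)
        return len(self.sent)


class LazyClientSketch:
    """Connects on first use and reconnects when the link has dropped."""

    def __init__(self, connect):
        self._connect = connect  # callable that returns a new connection
        self._conn = None
        self.connect_count = 0

    def _ensure_connected(self):
        # Open a connection only when none exists or the old one is closed.
        if self._conn is None or not self._conn.is_open():
            self._conn = self._connect()
            self.connect_count += 1
        return self._conn

    def send(self, message):
        return self._ensure_connected().send(message)
```

Deferring the connection this way means a missing or slow MCP server does not delay application startup. The cost is that the first send pays the connection latency.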

Monitoring and Observability

OpenTelemetry

When OTEL_ENABLED=true, the backend exports traces and metrics via OTLP gRPC to OTEL_EXPORTER_ENDPOINT. FastAPI and HTTPX are auto-instrumented via FastAPIInstrumentor and HTTPXClientInstrumentor. Custom metrics recorded:

Metric	Type	Description
`docjarvis.session.created`	Counter	Sessions created
`docjarvis.session.completed`	Counter	Sessions completed
`docjarvis.llm.requests`	Counter	LLM API calls
`docjarvis.llm.errors`	Counter	LLM API errors
`docjarvis.llm.latency`	Histogram (ms)	LLM request latency
`docjarvis.session.duration`	Histogram (s)	Consultation duration
`agent_execution_duration`	Histogram (ms)	Per-agent execution time
`cache_operations`	Counter	Cache hits / misses / sets
`agent_concurrent_load`	Histogram	Real-time agent concurrency

LangSmith

Set LANGSMITH_TRACING=true and provide LANGSMITH_API_KEY + LANGSMITH_PROJECT to trace all LangChain/LLM calls in the LangSmith dashboard.

Monitoring Dashboard

The V2 dashboard endpoint (GET /api/v2/monitoring/dashboard) returns a comprehensive JSON payload covering:

Per-agent performance (P50/P95/P99, success rate, error count)
Cache statistics (hit rate, LRU eviction count, per-agent TTL config)
Load balancer state (current concurrency, queue depth per agent)
System metrics (CPU, memory, disk via psutil)
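For readers unfamiliar with the P50/P95/P99 figures, the sketch below shows one way to derive them from raw latency samples with the standard library. The function name and the choice of the inclusive quantile method are assumptions. The project's performance monitor may compute them differently.

```python
import statistics


def latency_percentiles(samples_ms):
    """Return P50/P95/P99 for a list of latency samples in milliseconds."""
    if len(samples_ms) < 2:
        # statistics.quantiles() needs at least two data points.
        value = float(samples_ms[0]) if samples_ms else 0.0
        return {"p50": value, "p95": value, "p99": value}
    # n=100 yields 99 cut points; index k-1 holds the k-th percentile.
    cuts = statistics.quantiles(samples_ms, n=100, method="inclusive")
    return {"p50": cuts[49], "p95": cuts[94], "p99": cuts[98]}


summary = latency_percentiles(list(range(101)))  # samples 0..100 ms
```

Tail percentiles such as P99 highlight slow outliers that an average would hide. That is usually why they are reported per agent.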

Per-Agent Thresholds

Agent	Latency threshold	Error rate threshold
`stt`	3,000 ms	3%
`translation`	2,000 ms	2%
`qa`	4,000 ms	5%
`diagnosis`	6,000 ms	5%
`prescription`	5,000 ms	5%
`tts`	4,000 ms	3%

Breaches trigger a performance_alerts counter increment and a WARNING log entry.
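A threshold check of this kind might look like the sketch below. The threshold values come from the table above, but everything else is assumed: the function name, the strict greater-than comparison, and the use of a plain Counter in place of the real performance_alerts metric.

```python
import logging
from collections import Counter

logger = logging.getLogger("docjarvis.monitoring")

# agent -> (latency threshold in ms, error-rate threshold as a fraction)
THRESHOLDS = {
    "stt": (3000, 0.03),
    "translation": (2000, 0.02),
    "qa": (4000, 0.05),
    "diagnosis": (6000, 0.05),
    "prescription": (5000, 0.05),
    "tts": (4000, 0.03),
}

counters = Counter()  # stand-in for the performance_alerts counter


def check_agent(agent, latency_ms, error_rate):
    """Return the list of breached thresholds for one agent."""
    max_latency_ms, max_error_rate = THRESHOLDS[agent]
    breaches = []
    if latency_ms > max_latency_ms:
        breaches.append("latency")
    if error_rate > max_error_rate:
        breaches.append("error_rate")
    for kind in breaches:
        # One counter increment and one WARNING entry per breach.
        counters["performance_alerts"] += 1
        logger.warning("%s breached its %s threshold", agent, kind)
    return breaches
```

Which latency figure is compared against the threshold (for example the mean or the P95) is not stated above, so the caller decides what to pass as latency_ms.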

CI/CD

The pipeline is defined in .github/workflows/ci.yml and runs on every push to main and on pull requests targeting main.

backend-test ──┐
               ├──▶ docker-build
frontend-test ─┘

backend-test:

Python 3.11 setup with pip cache
Install requirements.txt + pytest pytest-asyncio pytest-cov pylint httpx
Pylint lint (non-blocking)
pytest --cov=src --cov-report=xml
Upload coverage to Codecov

frontend-test:

Node 20 setup with npm cache
npm ci
ESLint + TypeScript typecheck
Vitest with coverage
vite build (validates the production bundle)

docker-build (runs only after both test jobs pass):

docker build ./backend
docker build ./frontend

Configuration Reference

All backend configuration is managed through src/config/settings.py (Pydantic Settings v2). Values are read from environment variables or from the .env file.

Variable	Default	Description
`GOOGLE_API_KEY`	—	Required. Google Generative AI API key
`ENVIRONMENT`	`dev`	`dev` \| `staging` \| `prod`
`DEBUG`	`true`	Enables Swagger UI at `/docs`
`HOST`	`0.0.0.0`	Bind address
`PORT`	`8000`	Bind port
`WORKERS`	`4`	Uvicorn workers (ignored in debug mode)
`GEMINI_MODEL`	`gemini-2.5-flash`	Model identifier
`LLM_TEMPERATURE`	`0.2`	Generation temperature
`LLM_MAX_TOKENS`	`2048`	Max output tokens
`REDIS_URL`	`""`	Redis connection string (empty = in-memory store)
`SESSION_TTL`	`3600`	Redis session TTL in seconds
`GMAIL_SERVER`	`""`	Gmail MCP server endpoint
`DOCTOR_EMAIL`	`""`	Recipient for prescription review emails
`LANGSMITH_API_KEY`	`""`	LangSmith API key
`LANGSMITH_PROJECT`	`""`	LangSmith project name
`LANGSMITH_TRACING`	`false`	Enable LangSmith tracing
`OTEL_ENABLED`	`false`	Enable OpenTelemetry export
`OTEL_SERVICE_NAME`	`""`	OTel service name
`OTEL_EXPORTER_ENDPOINT`	`""`	OTLP gRPC endpoint
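To make the defaults and the REDIS_URL rule concrete, here is a small sketch that loads a few of the variables above and picks the session backend. It uses a plain dataclass in place of Pydantic Settings so that it runs without extra dependencies. SettingsSketch, load_settings, and session_backend are illustrative names, not the contents of settings.py.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class SettingsSketch:
    environment: str
    debug: bool
    port: int
    redis_url: str
    session_ttl: int

    @property
    def session_backend(self) -> str:
        # An empty REDIS_URL selects the in-memory store.
        return "redis" if self.redis_url else "memory"


def _as_bool(raw: str) -> bool:
    return raw.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env=None) -> SettingsSketch:
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return SettingsSketch(
        environment=env.get("ENVIRONMENT", "dev"),
        debug=_as_bool(env.get("DEBUG", "true")),
        port=int(env.get("PORT", "8000")),
        redis_url=env.get("REDIS_URL", "").strip(),
        session_ttl=int(env.get("SESSION_TTL", "3600")),
    )
```

Passing a mapping explicitly, as in load_settings({"REDIS_URL": "redis://redis:6379"}), keeps the function easy to test without touching the process environment.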

Deployment

Tech Stack

Layer	Technology
Frontend	React 19, TypeScript, Vite, Zustand, Tailwind CSS
Backend	FastAPI, Python 3.11, CrewAI, Gemini 2.5 Flash
AI / LLM	Google Gemini via LangChain, CrewAI multi-agent
Speech	Web Speech API (STT), Web Speech Synthesis + gTTS (TTS)
Email	Gmail API (OAuth 2.0) for prescription review
Frontend Hosting	Vercel
Backend Hosting	Google Cloud Run
Container Registry	Google Artifact Registry
CI	GitHub Actions
CD	GitHub Actions → Cloud Run + Vercel

CI/CD Deployment Flow

Environment Variables

Variable	Description
`GOOGLE_API_KEY`	Gemini API key
`DOCTOR_EMAIL`	Recipient email for prescription review
`GMAIL_CREDENTIALS_B64`	Base64-encoded `credentials.json`
`GMAIL_TOKEN_B64`	Base64-encoded `token.json`
`ENVIRONMENT`	`prod`
`CREWAI_TRACING_ENABLED`	`false`
`CREWAI_DISABLE_TELEMETRY`	`true`
`VITE_API_URL_V1`	`https://your-backend.run.app/api/v1`
`VITE_API_URL_V2`	`https://your-backend.run.app/api/v2`
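The two Base64-encoded variables have to be turned back into files before a Gmail client can read them. How the backend does this is not described here, so the following is only a plausible sketch, and write_secret_file is a hypothetical helper.

```python
import base64
import os
from pathlib import Path


def write_secret_file(env_name, target, env=None):
    """Decode a Base64 environment variable into a file.

    Returns False when the variable is unset or empty, True after writing.
    """
    env = os.environ if env is None else env
    encoded = env.get(env_name, "")
    if not encoded:
        return False
    # validate=True rejects characters outside the Base64 alphabet.
    data = base64.b64decode(encoded, validate=True)
    path = Path(target)
    path.write_bytes(data)
    path.chmod(0o600)  # restrict to the owner where the platform supports it
    return True
```

At container startup this could be called once per secret, for example write_secret_file("GMAIL_CREDENTIALS_B64", "credentials.json"). The newline stripping that the encoding commands elsewhere in this document apply (tr -d '\n') matters here, because strict validation fails on embedded line breaks.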

GitHub Secrets Required

GCP_SA_KEY                 — GCP service account JSON key
GCP_PROJECT_ID             — GCP project ID
GOOGLE_API_KEY             — Gemini API key
DOCTOR_EMAIL               — Doctor email for prescription review
GMAIL_CREDENTIALS_B64      — base64 -i credentials.json | tr -d '\n'
GMAIL_TOKEN_B64            — base64 -i token.json | tr -d '\n'
VERCEL_TOKEN               — Vercel API token
VITE_API_URL_V1            — Cloud Run backend URL /api/v1
VITE_API_URL_V2            — Cloud Run backend URL /api/v2

Manual First Deploy (one-time, before CD is active)

# 1. Authenticate
gcloud auth login
gcloud config set project YOUR_PROJECT_ID

# 2. Enable APIs
gcloud services enable run.googleapis.com artifactregistry.googleapis.com

# 3. Create Artifact Registry repository
gcloud artifacts repositories create docjarvis \
  --repository-format=docker \
  --location=us-central1

# 4. Build and push
gcloud auth configure-docker us-central1-docker.pkg.dev
docker buildx build \
  --platform linux/amd64 \
  -t us-central1-docker.pkg.dev/YOUR_PROJECT_ID/docjarvis/backend:latest \
  --push ./backend

# 5. Deploy
gcloud run deploy docjarvis-backend \
  --image us-central1-docker.pkg.dev/YOUR_PROJECT_ID/docjarvis/backend:latest \
  --platform managed \
  --region us-central1 \
  --allow-unauthenticated \
  --memory 2Gi --cpu 2 --timeout 300 \
  --port 8080 --startup-cpu-boost

# 6. Get URL
gcloud run services describe docjarvis-backend \
  --region us-central1 \
  --format 'value(status.url)'

After this, all subsequent deploys happen automatically via the CD pipeline on every push to main.
