feat: add exo-cli management tool for controlling a running cluster #1729

Open
ecohash-co wants to merge 2 commits into exo-explore:main from ecohash-co:feat/exo-cli

Conversation

@ecohash-co
Contributor

Motivation

The exo command starts a node — it's a long-running daemon. There's currently no CLI tool for managing a running cluster. This adds exo-cli as a separate entrypoint following the same pattern as kubectl/kubelet or obsidian-cli/obsidian.

Closes #1728
Depends on #1727 (cluster management API endpoints)

Changes

New package: src/exo/cli/ (4 modules, ~500 lines + 300 lines of tests)

Module             Purpose
client.py          Sync HTTP client wrapping /v1/cluster/* endpoints
format.py          Human-friendly table formatters for each endpoint
main.py            Argparse CLI with subcommands
tests/test_cli.py  20 tests (parser, formatters, client URL construction)
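
As a rough illustration of what a sync client over /v1/cluster/* could look like using only the stdlib, here is a minimal sketch. The class name `ClusterClient`, its methods, the `localhost` default, and any path other than the /v1/cluster prefix are assumptions for illustration, not the code in this PR; port 52415 is taken from the curl example in this description.

```python
import json
import urllib.request


class ClusterClient:
    """Hypothetical sync client; names and defaults are illustrative only."""

    def __init__(self, host="localhost", port=52415, timeout=10.0):
        # Every management endpoint is assumed to live under /v1/cluster.
        self.base_url = f"http://{host}:{port}/v1/cluster"
        self.timeout = timeout

    def url(self, path=""):
        """Build the full URL for a path relative to /v1/cluster."""
        if not path:
            return self.base_url
        return f"{self.base_url}/{path.lstrip('/')}"

    def get(self, path=""):
        """Issue a blocking GET and decode the JSON response body."""
        with urllib.request.urlopen(self.url(path), timeout=self.timeout) as resp:
            return json.loads(resp.read().decode("utf-8"))
```

Keeping URL construction in its own method makes it testable without a network connection, which matches the "client URL construction" tests listed above.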

New entrypoint in pyproject.toml:

[project.scripts]
exo = "exo.main:main"
exo-cli = "exo.cli.main:main"   # new

Commands

exo-cli status                            Cluster overview (nodes, models, memory)
exo-cli health                            Quick liveness check (exits 1 if unhealthy)
exo-cli nodes                             List all nodes
exo-cli nodes <id>                        Single node detail
exo-cli models                            Loaded models + active downloads
exo-cli models status <name>              Poll model readiness
exo-cli models load [--wait] <name>       Load model (auto-placement)
exo-cli models unload <name>              Unload by name
exo-cli models swap [--wait] <old> <new>  Atomic unload-then-load
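
The command tree above maps naturally onto nested argparse subparsers. The following is a sketch under assumptions, not the parser from this PR: the function name `build_parser`, the `dest` names, and the default host and port are hypothetical.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the command list; names are assumptions."""
    parser = argparse.ArgumentParser(prog="exo-cli")
    parser.add_argument("--host", default="localhost")
    parser.add_argument("--port", type=int, default=52415)
    parser.add_argument("--json", action="store_true")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("status")
    sub.add_parser("health")

    nodes = sub.add_parser("nodes")
    nodes.add_argument("node_id", nargs="?")  # omitted: list all nodes

    models = sub.add_parser("models")
    # The nested subcommand is optional, so a bare "models" lists models.
    actions = models.add_subparsers(dest="action")

    status = actions.add_parser("status")
    status.add_argument("name")

    load = actions.add_parser("load")
    load.add_argument("--wait", action="store_true")
    load.add_argument("name")

    unload = actions.add_parser("unload")
    unload.add_argument("name")

    swap = actions.add_parser("swap")
    swap.add_argument("--wait", action="store_true")
    swap.add_argument("old")
    swap.add_argument("new")
    return parser
```

With this layout, `build_parser().parse_args(["--host", "atlas", "models", "swap", "--wait", "a", "b"])` yields a namespace with `command="models"`, `action="swap"`, and `wait=True`.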

Key features

  • --wait flag blocks until async operations complete — eliminates polling loops in scripts
  • --json flag for machine-readable output (pipe to jq)
  • --host/--port to target any node in the cluster
  • Human-friendly table output by default with progress bars for memory
  • Zero new dependencies — uses stdlib urllib and argparse
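
To make the two output modes concrete, here is a small sketch of a formatter that switches between JSON and a table with a text progress bar for memory. The function names and the response fields (`id`, `ram_used`, `ram_total`) are hypothetical; the real response schema is defined by the endpoints in #1727.

```python
import json


def memory_bar(used, total, width=20):
    """Render a text progress bar such as '[#####-----]  50.0%'."""
    if total <= 0:
        return "[" + "-" * width + "]   n/a"
    # Clamp so that bad data never overflows the bar.
    fraction = min(max(used / total, 0.0), 1.0)
    filled = round(fraction * width)
    return f"[{'#' * filled}{'-' * (width - filled)}] {fraction * 100:5.1f}%"


def render_nodes(nodes, as_json=False):
    """Return JSON when as_json is set, otherwise a human-friendly table."""
    if as_json:
        return json.dumps(nodes, indent=2)
    lines = [f"{'NODE':<12} MEMORY"]
    for node in nodes:
        bar = memory_bar(node["ram_used"], node["ram_total"])
        lines.append(f"{node['id']:<12} {bar}")
    return "\n".join(lines)
```

Returning a string rather than printing keeps the formatters easy to test against sample response data.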

Example: day/night model rotation cron

#!/bin/bash
# 11pm: swap to large model for overnight batch reasoning
exo-cli --host atlas models swap --wait \
  "Qwen3-30B-A3B-4bit" "mlx-community/MiniMax-M1-80B-A45B-4bit"

# Run deep batch inference...
curl http://atlas:52415/v1/chat/completions -d '{...}'

# 6am: swap back to fast model for daytime use
exo-cli --host atlas models swap --wait \
  "MiniMax-M1-80B-A45B-4bit" "mlx-community/Qwen3-30B-A3B-4bit"

Example: health check in monitoring

# Nagios/healthcheck — exits 0 if healthy, 1 if not
exo-cli --host atlas health

# JSON for Grafana/scripts
exo-cli --host atlas --json status | jq '.ram_used_percent'
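
The exit-code contract (0 if healthy, 1 if not) could be implemented roughly as follows. This is a sketch only: the function name and the `healthy` response field are assumptions, and an unreachable node is treated as unhealthy.

```python
import urllib.error


def health_exit_code(check):
    """Map a health check to a process exit code.

    `check` is any callable returning the decoded health payload, for
    example a bound client method. The "healthy" key is a hypothetical field.
    """
    try:
        payload = check()
    except (urllib.error.URLError, OSError, ValueError):
        # Connection failures and malformed responses count as unhealthy.
        return 1
    return 0 if payload.get("healthy") else 1
```

A `main()` would then finish with `sys.exit(health_exit_code(...))`, so that monitoring tools see the result through the process status alone.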

Why It Works

The CLI is a pure HTTP client — it talks to the cluster management API endpoints from #1727 and does no server-side work. Each command maps 1:1 to an API endpoint. The --wait flag uses GET /v1/cluster/models/{id}/status in a polling loop with configurable interval and timeout.
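
The polling loop behind --wait can be sketched as below. The function name, the `ready` field in the status payload, and the default interval and timeout are assumptions; `fetch_status` stands in for a GET to /v1/cluster/models/{id}/status.

```python
import time


def wait_until_ready(fetch_status, interval=2.0, timeout=300.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status() until it reports ready or the timeout expires.

    The "ready" key is a hypothetical field of the status response.
    `sleep` and `clock` are injectable so the loop can be tested quickly.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status.get("ready"):
            return status
        if clock() >= deadline:
            raise TimeoutError(f"model not ready after {timeout} seconds")
        sleep(interval)
```

Using a monotonic clock for the deadline avoids surprises if the system clock is adjusted while a long load is in progress.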

Test Plan

Automated Testing

20 tests in src/exo/cli/tests/test_cli.py:

  • TestParser (11 tests) — all subcommands, flags, global options
  • TestFormatters (7 tests) — each formatter with realistic API response data
  • TestClientURLs (2 tests) — default and custom host/port

Full suite: 266 passed, 0 failed across the entire repo.

Manual Testing

Not yet tested against a live cluster, because the CLI depends on the #1727 endpoints. The HTTP client uses standard urllib; the interesting logic is in the argument parsing and output formatting, both of which are covered by the automated tests above.

Adds `exo-cli` as a separate entrypoint for managing a running exo
cluster over HTTP, analogous to kubectl/kubelet or obsidian-cli/obsidian.

Commands:
  exo-cli status                          Cluster overview
  exo-cli health                          Quick liveness (exits 1 if down)
  exo-cli nodes [<id>]                    List or inspect nodes
  exo-cli models                          Loaded models + downloads
  exo-cli models status <name>            Poll readiness
  exo-cli models load [--wait] <name>     Load with auto-placement
  exo-cli models unload <name>            Unload by name
  exo-cli models swap [--wait] <old> <new>  Atomic model swap

Key features:
- --wait flag blocks until async ops complete (no polling loops in scripts)
- --json flag for machine-readable output
- --host/--port to target any node
- Human-friendly table output by default
- Zero new dependencies (stdlib urllib + argparse)

Closes exo-explore#1728
Depends on exo-explore#1727 (cluster management API endpoints)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ecohash-co
Contributor Author

Two updates based on feedback on #1728:

  1. We'll rename the command from exo-cli to exo per @AlexCheema's suggestion — much cleaner.
  2. Pausing work on this per @Evanev7's note about waiting for daemonization to land first. Makes sense to build the CLI against the final daemon interface rather than iterate against a moving target.

Will rebase and update once the team is ready. Ping us anytime.

Development

Successfully merging this pull request may close these issues.

feat: exo-cli — management CLI for controlling a running exo cluster

1 participant