Batch remote execution: target profiles + K8s tooling (PR 2) #34

Open
GondekNP wants to merge 8 commits into main from feat/batch-run

@GondekNP

Summary

Part of the batch remote execution plan (#31). This branch accumulates all batch PRs: the PR1 ingest work is already merged, and this adds PR2.

  • Target profile system (joshpy/targets.py): CRUD for ~/.josh/targets/<name>.json — the shared config format between josh and joshpy. Includes TargetProfile, HttpTargetConfig, KubernetesTargetConfig dataclasses, snake↔camel serialization for Java interop, and resolve_minio_creds() with profile > env var hierarchy.
  • Devcontainer K8s tooling: SHA256-pinned gcloud SDK 526.0.0 + kubectl v1.31.4 installed in the Docker image for K8s target interaction.
  • BATCH_INTEGRATION.md updated with validated access patterns and shipped PR1 status.
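The snake↔camel serialization mentioned above can be illustrated with a minimal sketch. This is not the code in joshpy/targets.py: the `ExampleHttpConfig` dataclass, its field names, and the helper functions are hypothetical, chosen only to show how a Python-side snake_case dataclass might round-trip through camelCase JSON for a Java consumer.

```python
import json
import re
from dataclasses import asdict, dataclass
from typing import Any, Callable


def snake_to_camel(name: str) -> str:
    """Convert 'endpoint_url' to 'endpointUrl'."""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def camel_to_snake(name: str) -> str:
    """Convert 'endpointUrl' to 'endpoint_url' (no special handling of acronyms)."""
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def convert_keys(value: Any, convert: Callable[[str], str]) -> Any:
    """Recursively rewrite dict keys, leaving non-dict values untouched."""
    if isinstance(value, dict):
        return {convert(k): convert_keys(v, convert) for k, v in value.items()}
    if isinstance(value, list):
        return [convert_keys(v, convert) for v in value]
    return value


@dataclass
class ExampleHttpConfig:
    # Hypothetical fields; the real HttpTargetConfig may differ.
    endpoint_url: str
    api_key: str


profile = ExampleHttpConfig(endpoint_url="https://example.invalid", api_key="secret")

# Python -> JSON with camelCase keys for the Java side.
as_json = json.dumps(convert_keys(asdict(profile), snake_to_camel))

# JSON -> Python dataclass with snake_case fields.
restored = ExampleHttpConfig(**convert_keys(json.loads(as_json), camel_to_snake))
```

A regex-based `camel_to_snake` like this one splits on every capital letter, so keys containing acronyms would need explicit handling in a real implementation.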

Related: josh#416 — normalize JSON key casing (snake_case vs camelCase mix).

Test plan

  • 900 unit tests pass (pixi run test), 17 integration tests deselected
  • 32 new target profile tests (dataclass construction, serialization round-trip, CRUD, credential resolution)
  • No new lint or mypy errors
  • Rebuild devcontainer to verify gcloud/kubectl install

🤖 Generated with Claude Code

GondekNP and others added 6 commits April 16, 2026 18:39
Enable recovering simulation results from MinIO into the RunRegistry by
label.  DuckDB reads CSVs directly from S3 via httpfs — no local download
needed.  Also provides a download=True fallback via stageFromMinio.

- configure_s3(): reusable DuckDB httpfs + S3 credential setup
- CellDataLoader.load_csv(): accepts s3:// URLs alongside local Paths
- ingest_results(): label lookup → export path discovery → S3 read → load
- SweepManager.ingest(): convenience wrapper
- StageFromMinioConfig + JoshCLI.stage_from_minio(): download fallback
- Fix pre-existing test regression from 3e487fe (data file extension)
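As a rough illustration of what a reusable httpfs + S3 setup step might involve, the sketch below builds the DuckDB statements for an S3-compatible endpoint such as MinIO. It is not the actual configure_s3() implementation: the function name `s3_setup_statements`, its parameters, and the example bucket path are assumptions. The `SET s3_*` options and `read_csv` are standard DuckDB httpfs features.

```python
from typing import List


def s3_setup_statements(
    endpoint: str, access_key: str, secret_key: str, use_ssl: bool = False
) -> List[str]:
    """Build the SQL statements that point DuckDB's httpfs extension at an S3 endpoint."""

    def quote(value: str) -> str:
        # Escape single quotes so values are safe inside a SQL string literal.
        return "'" + value.replace("'", "''") + "'"

    return [
        "INSTALL httpfs",
        "LOAD httpfs",
        f"SET s3_endpoint = {quote(endpoint)}",  # host:port, without a scheme
        f"SET s3_access_key_id = {quote(access_key)}",
        f"SET s3_secret_access_key = {quote(secret_key)}",
        "SET s3_url_style = 'path'",  # MinIO is typically addressed path-style
        f"SET s3_use_ssl = {'true' if use_ssl else 'false'}",
    ]


statements = s3_setup_statements("localhost:9000", "minio-user", "minio-pass")

# Applying the statements requires the duckdb package and a reachable endpoint:
#
#   import duckdb
#   con = duckdb.connect()
#   for stmt in statements:
#       con.execute(stmt)
#   con.sql("SELECT * FROM read_csv('s3://example-bucket/run/export.csv')")
```

Keeping statement construction separate from execution makes the setup easy to unit-test without a live MinIO instance.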

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Move os, tempfile, StageFromMinioConfig, configure_s3 imports to
  module level instead of inside methods
- Extract monolithic ingest_results() into focused helpers:
  _resolve_ingest_metadata(), _get_josh_source(),
  _configure_minio_access(), _load_ingest_replicates()
- Fix test mock patch path to match new top-level import

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- TestStageFromMinio: mock JarManager.get_jar so tests don't require
  a local JAR file (they already mock subprocess.run)
- TestDiffCLI.test_main_view: mock _launch_ide so test doesn't require
  VS Code's `code` CLI in PATH

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Escalating integration tests against a real MinIO service container:
- Level 1: DuckDB httpfs write/read to MinIO
- Level 2: Josh JAR simulation exports to MinIO, Python reads back
- Level 3: CellDataLoader.load_csv from s3:// URLs
- Level 4: End-to-end ingest_results() by label from MinIO
- Level 5: Partial/interrupted sweep recovery (missing replicates)
- Edge cases: bad credentials, missing bucket, namespace isolation

Infrastructure:
- tests/conftest.py with shared fixtures (minio_conn, minio_registry, seed_csv, etc.)
- tests/fixtures/minio_export.josh minimal test simulation
- pytest 'integration' marker registered in pyproject.toml
- pixi tasks: 'test' excludes integration, 'test-integration' runs only integration
- CI workflow with unit-tests + integration-tests jobs (MinIO service container)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Fixes zizmor alerts:
- Pin bitnamilegacy/minio to SHA digest (unpinned image reference)
- Add persist-credentials: false to checkout (credential persistence)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Target profiles (`~/.josh/targets/<name>.json`) are the shared config
format between josh and joshpy for batch remote execution.  This adds
CRUD operations, snake↔camel serialization for Java interop, and a
credential resolution hierarchy (profile > env vars).
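A minimal sketch of a "profile > env vars" resolution order follows. It is not the actual resolve_minio_creds() code: the function name, the profile keys, and the environment variable names (`MINIO_ACCESS_KEY`, `MINIO_SECRET_KEY`) are assumptions made for illustration.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_creds_sketch(
    profile: Mapping[str, Optional[str]],
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, str]:
    """Return (access_key, secret_key), preferring profile values over env vars."""
    # Key and variable names here are hypothetical.
    access = profile.get("minio_access_key") or env.get("MINIO_ACCESS_KEY")
    secret = profile.get("minio_secret_key") or env.get("MINIO_SECRET_KEY")
    if not access or not secret:
        raise ValueError("MinIO credentials not found in profile or environment")
    return access, secret


fake_env = {"MINIO_ACCESS_KEY": "env-user", "MINIO_SECRET_KEY": "env-pass"}

# The profile value wins where present; the env var fills the gap.
mixed = resolve_creds_sketch({"minio_access_key": "profile-user"}, env=fake_env)
```

Passing the environment in as a mapping keeps the lookup order testable without mutating `os.environ`.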

Also installs gcloud SDK + kubectl in the devcontainer for K8s target
interaction.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

GondekNP and others added 2 commits April 17, 2026 18:31
 PR 3)

Three new config dataclasses and JoshCLI methods that wrap the
joshsim batch commands: batchRemote (dispatch to HTTP/K8s targets),
preprocessBatch (remote preprocessing), and stageToMinio (upload
staging). All follow the existing stage_from_minio/run_remote patterns.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Wires batch_remote into the sweep loop with two modes:

- Blocking (default): each job calls batchRemote, blocks until the JAR
  finishes polling internally, same pattern as run_remote().
- Async (batch_no_wait=True): dispatches all jobs with --no-wait,
  parses JSON job IDs, then polls via pollBatch until all complete.
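The async mode above (dispatch everything, then poll until all jobs finish) can be sketched as follows. This is an assumption-laden illustration, not the joshpy implementation: `run_async_batch`, the injected `dispatch` and `poll` callables, and the status strings are all hypothetical stand-ins for the real batchRemote/pollBatch wrappers.

```python
import time
from typing import Callable, Dict, Iterable, List

TERMINAL_STATUSES = ("complete", "failed")  # hypothetical status values


def run_async_batch(
    jobs: Iterable[str],
    dispatch: Callable[[str], str],
    poll: Callable[[str], str],
    poll_interval: float = 5.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, str]:
    """Dispatch every job without waiting, then poll until all reach a terminal status."""
    # Phase 1: fire off all jobs and collect their IDs.
    job_ids: List[str] = [dispatch(job) for job in jobs]

    # Phase 2: poll the outstanding IDs until none remain or the deadline passes.
    pending = set(job_ids)
    statuses: Dict[str, str] = {}
    deadline = clock() + timeout
    while pending:
        for job_id in sorted(pending):
            status = poll(job_id)
            if status in TERMINAL_STATUSES:
                statuses[job_id] = status
        pending.difference_update(statuses)
        if not pending:
            break
        if clock() >= deadline:
            raise TimeoutError(f"jobs still pending after {timeout}s: {sorted(pending)}")
        sleep(poll_interval)
    return statuses
```

Injecting `sleep` and `clock` is a design choice for testability; the blocking mode needs no such loop because each call returns only after the job has finished.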

New: PollBatchConfig + poll_batch(), to_batch_remote_config(),
batch_remote/target/batch_no_wait/poll_interval/batch_timeout params
threaded through run_sweep(), SweepManager.run(), run_adaptive_sweep().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>