Batch remote execution: target profiles + K8s tooling (PR 2) #34
Open
Enable recovering simulation results from MinIO into the RunRegistry by label. DuckDB reads CSVs directly from S3 via httpfs, so no local download is needed. Also provides a download=True fallback via stageFromMinio.

- configure_s3(): reusable DuckDB httpfs + S3 credential setup
- CellDataLoader.load_csv(): accepts s3:// URLs alongside local Paths
- ingest_results(): label lookup → export path discovery → S3 read → load
- SweepManager.ingest(): convenience wrapper
- StageFromMinioConfig + JoshCLI.stage_from_minio(): download fallback
- Fix pre-existing test regression from 3e487fe (data file extension)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
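As a rough illustration of what a reusable httpfs + S3 setup step could look like, the sketch below builds the DuckDB statements that would point httpfs at a MinIO endpoint. The function names `build_s3_setup_sql` and `is_s3_url` are hypothetical and are not taken from joshpy. The `SET` options are standard DuckDB httpfs settings, and path-style URLs with SSL disabled are assumptions that fit a typical local MinIO.

```python
from typing import List


def build_s3_setup_sql(endpoint: str, access_key: str, secret_key: str,
                       use_ssl: bool = False) -> List[str]:
    """Return DuckDB statements that enable httpfs and point it at MinIO.

    Hypothetical helper: the real configure_s3() may differ.
    """
    def quote(value: str) -> str:
        # Double any single quotes so the value is a valid SQL string literal.
        return "'" + value.replace("'", "''") + "'"

    return [
        "INSTALL httpfs;",
        "LOAD httpfs;",
        f"SET s3_endpoint={quote(endpoint)};",
        f"SET s3_access_key_id={quote(access_key)};",
        f"SET s3_secret_access_key={quote(secret_key)};",
        "SET s3_url_style='path';",  # assumption: MinIO served with path-style URLs
        f"SET s3_use_ssl={'true' if use_ssl else 'false'};",
    ]


def is_s3_url(source: str) -> bool:
    """True when the source should be read remotely instead of as a local path."""
    return source.startswith("s3://")
```

Each statement could then be passed to a DuckDB connection's `execute()`, after which a query such as `SELECT * FROM read_csv_auto('s3://bucket/key.csv')` reads the CSV without a local copy.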
- Move os, tempfile, StageFromMinioConfig, configure_s3 imports to module level instead of inside methods
- Extract monolithic ingest_results() into focused helpers: _resolve_ingest_metadata(), _get_josh_source(), _configure_minio_access(), _load_ingest_replicates()
- Fix test mock patch path to match new top-level import

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- TestStageFromMinio: mock JarManager.get_jar so tests don't require a local JAR file (they already mock subprocess.run)
- TestDiffCLI.test_main_view: mock _launch_ide so the test doesn't require VS Code's `code` CLI in PATH

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Escalating integration tests against a real MinIO service container:

- Level 1: DuckDB httpfs write/read to MinIO
- Level 2: Josh JAR simulation exports to MinIO, Python reads back
- Level 3: CellDataLoader.load_csv from s3:// URLs
- Level 4: End-to-end ingest_results() by label from MinIO
- Level 5: Partial/interrupted sweep recovery (missing replicates)
- Edge cases: bad credentials, missing bucket, namespace isolation

Infrastructure:

- tests/conftest.py with shared fixtures (minio_conn, minio_registry, seed_csv, etc.)
- tests/fixtures/minio_export.josh minimal test simulation
- pytest 'integration' marker registered in pyproject.toml
- pixi tasks: 'test' excludes integration, 'test-integration' runs only integration
- CI workflow with unit-tests + integration-tests jobs (MinIO service container)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Fixes zizmor alerts:

- Pin bitnamilegacy/minio to SHA digest (unpinned image reference)
- Add persist-credentials: false to checkout (credential persistence)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Target profiles (`~/.josh/targets/<name>.json`) are the shared config format between josh and joshpy for batch remote execution. This adds CRUD operations, snake↔camel serialization for Java interop, and a credential resolution hierarchy (profile > env vars).

Also installs gcloud SDK + kubectl in the devcontainer for K8s target interaction.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
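The two mechanisms named above, key-casing conversion and the profile-over-environment credential lookup, can be sketched roughly as follows. This is a minimal illustration and not the code in `joshpy/targets.py`. The helper names, the field names `minio_access_key` / `minio_secret_key`, and the environment variable names are all assumptions made for the example.

```python
import os
import re
from typing import Any, Callable, Mapping, Optional, Tuple


def snake_to_camel(name: str) -> str:
    """minio_access_key -> minioAccessKey"""
    head, *rest = name.split("_")
    return head + "".join(part.capitalize() for part in rest)


def camel_to_snake(name: str) -> str:
    """minioAccessKey -> minio_access_key"""
    # Insert an underscore before every uppercase letter except a leading one.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()


def convert_keys(obj: Any, convert: Callable[[str], str]) -> Any:
    """Recursively rename dict keys; other values pass through unchanged."""
    if isinstance(obj, dict):
        return {convert(k): convert_keys(v, convert) for k, v in obj.items()}
    if isinstance(obj, list):
        return [convert_keys(v, convert) for v in obj]
    return obj


def resolve_creds(
    profile: Mapping[str, Any],
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[Optional[str], Optional[str]]:
    """Prefer values stored in the profile, fall back to environment variables.

    Field and variable names here are hypothetical.
    """
    env = os.environ if env is None else env
    access = profile.get("minio_access_key") or env.get("MINIO_ACCESS_KEY")
    secret = profile.get("minio_secret_key") or env.get("MINIO_SECRET_KEY")
    return access, secret
```

Under this scheme a profile would be held in snake_case on the Python side and passed through `convert_keys(profile, snake_to_camel)` before being written as JSON for the Java side.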
This was referenced Apr 17, 2026
PR 3) Three new config dataclasses and JoshCLI methods that wrap the joshsim batch commands: batchRemote (dispatch to HTTP/K8s targets), preprocessBatch (remote preprocessing), and stageToMinio (upload staging). All follow the existing stage_from_minio/run_remote patterns.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Wires batch_remote into the sweep loop with two modes:

- Blocking (default): each job calls batchRemote and blocks until the JAR finishes polling internally, the same pattern as run_remote().
- Async (batch_no_wait=True): dispatches all jobs with --no-wait, parses JSON job IDs, then polls via pollBatch until all complete.

New: PollBatchConfig + poll_batch(), to_batch_remote_config(), and batch_remote/target/batch_no_wait/poll_interval/batch_timeout params threaded through run_sweep(), SweepManager.run(), run_adaptive_sweep().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
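The async mode described above amounts to a dispatch-all-then-poll loop. The sketch below shows one way such a loop could be structured. It is not the joshpy implementation: `dispatch` and `poll` are hypothetical callables standing in for the batchRemote `--no-wait` and pollBatch calls, and the status strings `"complete"` and `"error"` are assumed for illustration.

```python
import time
from typing import Callable, Dict, Iterable


def dispatch_then_poll(
    jobs: Iterable[str],
    dispatch: Callable[[str], str],   # hypothetical: submits a job, returns its job ID
    poll: Callable[[str], str],       # hypothetical: returns a status string for a job ID
    poll_interval: float = 5.0,
    timeout: float = 3600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, str]:
    """Submit every job first, then poll until all reach a terminal status."""
    pending = {dispatch(job): job for job in jobs}  # job ID -> original job
    results: Dict[str, str] = {}
    deadline = time.monotonic() + timeout

    while pending:
        for job_id in list(pending):
            status = poll(job_id)
            if status in ("complete", "error"):  # assumed terminal statuses
                results[job_id] = status
                del pending[job_id]
        if not pending:
            break
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{len(pending)} job(s) still pending at timeout")
        sleep(poll_interval)

    return results
```

Dispatching everything before the first poll lets the remote target run jobs concurrently, whereas the blocking mode waits on each job in turn.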
Summary
Part of the batch remote execution plan (#31). This branch accumulates all batch PRs (PR1 ingest already merged, this adds PR2).
joshpy/targets.py): CRUD for `~/.josh/targets/<name>.json`, the shared config format between josh and joshpy. Includes `TargetProfile`, `HttpTargetConfig`, and `KubernetesTargetConfig` dataclasses, snake↔camel serialization for Java interop, and `resolve_minio_creds()` with a profile > env var hierarchy.

Related: josh#416, which normalizes JSON key casing (snake_case vs camelCase mix).
Test plan
pixi run test), 17 integration tests deselected

🤖 Generated with Claude Code