HDDS-11838. Top-level shell to allow access to admin/debug/sh commands #10134

Open
will-sh wants to merge 3 commits into apache:master from will-sh:HDDS-11838

@will-sh will-sh commented Apr 25, 2026

What changes were proposed in this pull request?

HDDS-11838. Introduce interactive shell mode for Ozone CLI

This PR introduces a new top-level interactive mode for the Ozone CLI (ozone --interactive). It provides a persistent shell in which users can run multiple ozone subcommands (such as sh, admin, and debug) sequentially within the same JVM instance, which avoids paying the JVM startup cost for each individual command.

Approach used to solve the issue:

  • Interactive Shell Skeleton: Created OzoneInteractiveShell.java, based on Picocli and JLine3, to provide an ozone> prompt.
  • Subcommand Registration: Statically added commands from the common module (sh, tenant, s3) and dynamically injected commands from other modules (admin, debug, fs, repair) using reflection, to avoid circular dependencies across Maven modules.
  • Execution Isolation (Exception Handling): Modified the GenericCli base class to intercept ExitUtils.terminate(). In interactive mode, if a single command fails, the exception is caught and printed without exiting the JVM, so the REPL persists and awaits the next input.
  • Shell Script Integration: Updated the hadoop-ozone/dist/src/shell/ozone/ozone startup script to recognize the --interactive flag, set OzoneInteractiveShell as the main class, and dynamically load all necessary classpaths (ozone-cli-admin, ozone-cli-debug, ozone-cli-repair, ozone-tools).
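
The reflection-based registration described above can be illustrated with a minimal, self-contained sketch. It is not the code from this PR: the class DynamicSubcommandSketch, its Map-based registry (a stand-in for the Picocli command tree), and the two-argument addDynamicSubcommand signature are assumptions made for illustration. The idea is only that a command class is looked up by name at runtime and skipped quietly if its module is not on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative sketch only; names and structure are hypothetical. */
final class DynamicSubcommandSketch {

  /** Stand-in for the real command registry (assumption: a plain map). */
  private final Map<String, Object> subcommands = new LinkedHashMap<>();

  /**
   * Loads the class by name and registers a new instance under the given
   * command name. Returns false instead of failing when the class cannot
   * be loaded or instantiated, e.g. because its module is not on the classpath.
   */
  boolean addDynamicSubcommand(String name, String className) {
    try {
      Class<?> clazz = Class.forName(className);
      Object instance = clazz.getDeclaredConstructor().newInstance();
      subcommands.put(name, instance);
      return true;
    } catch (ReflectiveOperationException | LinkageError e) {
      // Optional module is missing or unusable: skip this subcommand.
      return false;
    }
  }

  Map<String, Object> getSubcommands() {
    return subcommands;
  }

  public static void main(String[] args) {
    DynamicSubcommandSketch shell = new DynamicSubcommandSketch();
    // java.util.ArrayList stands in for a command class that is present.
    System.out.println(shell.addDynamicSubcommand("present", "java.util.ArrayList"));
    // A class name that does not exist simulates a module that is absent.
    System.out.println(shell.addDynamicSubcommand("absent", "org.example.NoSuchCommand"));
    System.out.println(shell.getSubcommands().keySet());
  }
}
```

The trade-off of this style is that a misspelled class name fails silently at runtime rather than at compile time.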

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-11838

How was this patch tested?

Manual Verification Steps:

  1. GitHub CI passed.
  2. Compiled the project successfully using mvn clean install -DskipTests.
  3. Started the CLI in Docker:
$ cd hadoop-ozone/dist/target/ozone-*/compose/ozone
$ docker compose down
$ docker compose up -d

Verified that the ozone> prompt correctly appeared.
Executed multiple commands sequentially within the REPL environment:

$ docker exec -it ozone-om-1 bash
bash-5.1$ ozone --interactive
ozone> debug version
{
  "components" : {
    "client" : {
      "componentVersion" : {
        "name" : "BUCKET_LAYOUT_SUPPORT",
        "protoValue" : 3
      }
    },
    "datanode" : {
      "componentVersion" : {
        "name" : "STREAM_BLOCK_SUPPORT",
        "protoValue" : 3
      }
    },
    "om" : {
      "componentVersion" : {
        "name" : "ATOMIC_CREATE_IF_NOT_EXISTS",
        "protoValue" : 12
      }
    }
  },
  "ozone" : {
    "revision" : "de379b4f3fbdad038335107e5a6a362419f27a82",
    "url" : "https://github.com/will-sh/ozone.git",
    "version" : "2.2.0-SNAPSHOT"
  }
}
ozone> sh volume create vol1
ozone> admin om roles
java.lang.IllegalArgumentException: There is no Ozone Manager service ID specified, but there are either zero, or more than one service IDconfigured.
	at org.apache.hadoop.ozone.admin.om.OMAdmin.getTheOnlyConfiguredOmServiceIdOrThrow(OMAdmin.java:133)
	at org.apache.hadoop.ozone.admin.om.OMAdmin.createOmClient(OMAdmin.java:111)
	at org.apache.hadoop.ozone.admin.om.OmAddressOptions$AbstractServiceIdMixin.newClient(OmAddressOptions.java:41)
	at org.apache.hadoop.ozone.admin.om.OmAddressOptions$OptionalServiceIdMixin.newClient(OmAddressOptions.java:77)
	at org.apache.hadoop.ozone.admin.om.GetServiceRolesSubcommand.call(GetServiceRolesSubcommand.java:66)
	at org.apache.hadoop.ozone.admin.om.GetServiceRolesSubcommand.call(GetServiceRolesSubcommand.java:39)
	at picocli.CommandLine.executeUserObject(CommandLine.java:2041)
	at picocli.CommandLine.access$1500(CommandLine.java:148)
	at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2461)
	at picocli.CommandLine$RunLast.handle(CommandLine.java:2453)
	at picocli.CommandLine$RunLast.handle(CommandLine.java:2415)
	at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2273)
	at picocli.CommandLine$RunLast.execute(CommandLine.java:2417)
	at picocli.CommandLine.execute(CommandLine.java:2170)
	at picocli.shell.jline3.PicocliCommands.invoke(PicocliCommands.java:287)
	at org.jline.console.impl.SystemRegistryImpl.execute(SystemRegistryImpl.java:1228)
	at org.jline.console.impl.SystemRegistryImpl.execute(SystemRegistryImpl.java:1274)
	at org.apache.hadoop.ozone.shell.REPL.<init>(REPL.java:86)
	at org.apache.hadoop.ozone.shell.OzoneInteractiveShell.main(OzoneInteractiveShell.java:64)
ozone> exit

Verified the execution isolation: intentionally executed failing commands (e.g., creating a volume that already exists, or running a command with missing OM configuration) to confirm that the interactive shell prints the error message and recovers to accept the next command without terminating the JVM.
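
A rough sketch of the isolation behavior verified above, under stated assumptions: IsolatedReplSketch and runIsolated are hypothetical names, and the sketch uses only the JDK rather than GenericCli, Picocli, or JLine3. It shows the general pattern of catching a command's exception, recording it, and returning a non-zero code so the loop can continue.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;

/** Illustrative sketch only; not the GenericCli change from this PR. */
final class IsolatedReplSketch {

  /**
   * Runs a single command. If it throws, the error is recorded and a
   * non-zero code is returned; the exception never propagates to the caller.
   */
  static int runIsolated(Callable<Integer> command, List<String> errors) {
    try {
      return command.call();
    } catch (Exception e) {
      errors.add(e.toString());
      return 1;
    }
  }

  public static void main(String[] args) {
    List<String> errors = new ArrayList<>();
    Callable<Integer> failing = () -> {
      throw new IllegalArgumentException("no service ID");
    };
    Callable<Integer> succeeding = () -> 0;

    // The failing command does not stop the loop; the next one still runs.
    for (Callable<Integer> command : List.of(failing, succeeding)) {
      int exitCode = runIsolated(command, errors);
      System.out.println("exit code: " + exitCode);
    }
    errors.forEach(System.err::println);
  }
}
```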

@errose28 errose28 requested a review from adoroszlai April 27, 2026 16:26

@adoroszlai adoroszlai left a comment

Thanks a lot @will-sh for the patch.

In addition to inline comments, please also add the new command to ozone help here:

ozone_add_subcommand "admin" client "Ozone admin tool"
ozone_add_subcommand "debug" client "Ozone debug tool"
ozone_add_subcommand "repair" client "Ozone repair tool"

Comment on lines +81 to +83
if (cmd.getOut() != null) {
  cmd.getOut().println("Command executed with exit code: " + exitCode);
}

This looks like extra output, why is this added?

Comment on lines +226 to +233
OZONE_RUN_ARTIFACT_NAME="ozone-cli-shell"
OZONE_OPTS="${OZONE_OPTS} ${RATIS_OPTS} ${OZONE_MODULE_ACCESS_ARGS}"
# Add all CLI classpaths to support all subcommands dynamically
for cp_file in "ozone-cli-admin" "ozone-cli-debug" "ozone-cli-repair" "ozone-tools"; do
  if [[ -f "${OZONE_HOME}/share/ozone/classpath/${cp_file}.classpath" ]]; then
    ozone_add_classpath_from_file "${OZONE_HOME}/share/ozone/classpath/${cp_file}.classpath"
  fi
done

Can we create a new submodule to depend on all other CLI modules in the POM? Benefits:

  • can add all subcommands statically in OzoneInteractiveShell, without reflection
  • avoid the need for custom classpath management

Comment on lines +332 to +335
if [[ "$OZONE_SUBCMD" == "--interactive" ]]; then
  OZONE_SUBCMD="interactive"
fi

Let's not add --interactive for simplicity.

if (cmd.getOut() != null) {
  cmd.getOut().println("Command executed with exit code: " + exitCode);
}
if (System.getProperty("ozone.interactive.shell") == null) {

Please define a constant for "ozone.interactive.shell" and use in both places.
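
One possible shape for this, as a sketch rather than a prescription: the class name InteractiveModeSketch, the constant name INTERACTIVE_SHELL_PROPERTY, and the isInteractive() helper are all hypothetical. Only the property key "ozone.interactive.shell" and the null check come from the diff.

```java
/** Illustrative sketch only; class, constant, and method names are hypothetical. */
final class InteractiveModeSketch {

  /** Single definition of the property key, shared by the code that sets it and the code that reads it. */
  static final String INTERACTIVE_SHELL_PROPERTY = "ozone.interactive.shell";

  private InteractiveModeSketch() {
  }

  /** Mirrors the null check in the diff: any value counts as "interactive". */
  static boolean isInteractive() {
    return System.getProperty(INTERACTIVE_SHELL_PROPERTY) != null;
  }

  public static void main(String[] args) {
    // Setter side, e.g. when the interactive shell starts up.
    System.setProperty(INTERACTIVE_SHELL_PROPERTY, "true");
    // Reader side, e.g. when deciding whether a failure may terminate the JVM.
    System.out.println("interactive: " + isInteractive());
    System.clearProperty(INTERACTIVE_SHELL_PROPERTY);
    System.out.println("interactive: " + isInteractive());
  }
}
```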

// Dynamically add subcommands from other modules to avoid circular dependencies.
addDynamicSubcommand(topCmd, "admin", "org.apache.hadoop.ozone.admin.OzoneAdmin");
addDynamicSubcommand(topCmd, "debug", "org.apache.hadoop.ozone.debug.OzoneDebug");
addDynamicSubcommand(topCmd, "fs", "org.apache.hadoop.fs.ozone.OzoneFsShell");

fs does not appear in help, probably because it's not implemented as GenericCli:

bash-5.1$ ozone interactive
ozone> help
 -  ozone registry
Summary: admin  Developer tools for Ozone Admin operations
         debug  Developer tools for Ozone Debug operations
         repair Advanced tool to repair Ozone. Check the --help output of the subcommand for
         s3     Shell for S3 specific operations
         sh     Shell for Ozone object store
         tenant Shell for multi-tenant specific operations

I think we can omit it for now, and maybe refactor OzoneFsShell later.
