Added Try Catch block in openBatchSession #7409

Open
oh0873 wants to merge 7 commits into apache:master from oh0873:hoonoh/openBatchSessionExceptionCatch

oh0873 commented Apr 16, 2026

Why are the changes needed?

In Kyuubi Batch v2, the Batch Service creates a batchExecutor to submit batch jobs. However, when openBatchSession fails, the executor dies and never recovers.

This leaves jobs stuck in the PENDING state, with no executor left to submit jobs in the INITIALIZED state.

This PR adds error handling around openBatchSession.

It also fixes the withUpdateCount call so that the state transition correctly changes fromState to targetState.
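The failure mode described above can be reproduced in miniature. The following is a minimal sketch, not the Kyuubi code: `Job`, `unprotectedLoop`, and the stand-in `openBatchSession` are hypothetical names used only for illustration, and the job states are plain strings. It shows how one exception escaping a submit loop leaves the failing job in PENDING and later jobs in INITIALIZED.

```scala
import scala.collection.mutable
import scala.util.Try

object SubmitLoopFailureSketch {
  // Hypothetical stand-in for a batch job's metadata row.
  final case class Job(id: String, var state: String)

  // Hypothetical stand-in for openBatchSession: fails for ids starting with "bad".
  def openBatchSession(job: Job): Unit = {
    if (job.id.startsWith("bad")) {
      throw new IllegalStateException(s"cannot open session for ${job.id}")
    }
    job.state = "RUNNING"
  }

  // Unprotected submit loop: pick a job, mark it PENDING, then open the session.
  def unprotectedLoop(queue: mutable.Queue[Job]): Unit = {
    while (queue.nonEmpty) {
      val job = queue.dequeue()
      job.state = "PENDING"
      openBatchSession(job) // an exception here escapes and ends the loop
    }
  }

  def run(): Seq[Job] = {
    val jobs = Seq(
      Job("ok-1", "INITIALIZED"),
      Job("bad-2", "INITIALIZED"),
      Job("ok-3", "INITIALIZED"))
    val queue = mutable.Queue(jobs: _*)
    // Try(...) simulates the executor thread dying: the loop is never restarted.
    Try(unprotectedLoop(queue))
    jobs
  }
}
```

In this sketch, `SubmitLoopFailureSketch.run()` leaves `bad-2` in PENDING and `ok-3` in INITIALIZED, because nothing resumes the loop after the exception.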

How was this patch tested?

Tested in our environment: openBatchSession exceptions, such as exceeding the per-user connection limit or a DB connection error, no longer leave jobs stuck in the PENDING state. Instead, all such jobs are marked as ERROR.

Was this patch authored or co-authored using generative AI tooling?

Test case and bug finding were assisted with Cursor agent.

Oh, Hoon added 3 commits April 16, 2026 10:01
---
**Work Item:** #11316434 #11406267

---

**Summary**
--

Kyuubi Batch Service creates a `batchExecutor` to submit batch jobs. However, when `openBatchSession` fails, the executor dies and never recovers. This can leave jobs stuck in the PENDING state, with no executor left to submit jobs in the INITIALIZED state.

We added error handling logic to handle `openBatchSession` failures.

**Problem**
--

`openBatchSession` may fail if connections.per.users is exceeded or a metadata query fails. When `openBatchSession` fails, the executor dies and does not recover unless we restart the Kyuubi server pods.

---

**Approach**
---

Add a try block to catch exceptions when `openBatchSession` fails. We move the job from PENDING to FAILED so that PENDING jobs do not occupy submitter executors for a long period of time.

**Code Change**
--

Fixed incorrect behavior of `withUpdateCount`.

Added a try/catch block in `KyuubiBatchService` to catch all open batch session exceptions, after which it continues to the next job.
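A rough sketch of this catch-and-continue pattern is shown below. It is not the code in `KyuubiBatchService`: `Job`, `guardedLoop`, and the stand-in `openBatchSession` are hypothetical, states are plain strings, and the ERROR target state is taken from the PR description rather than from the actual diff.

```scala
import scala.collection.mutable
import scala.util.control.NonFatal

object GuardedSubmitLoopSketch {
  // Hypothetical stand-in for a batch job's metadata row.
  final case class Job(id: String, var state: String)

  // Hypothetical stand-in for openBatchSession: fails for ids starting with "bad".
  def openBatchSession(job: Job): Unit = {
    if (job.id.startsWith("bad")) {
      throw new IllegalStateException(s"cannot open session for ${job.id}")
    }
    job.state = "RUNNING"
  }

  // Guarded submit loop: a failure is handled per job, so the loop keeps going.
  def guardedLoop(queue: mutable.Queue[Job]): Unit = {
    while (queue.nonEmpty) {
      val job = queue.dequeue()
      job.state = "PENDING"
      try {
        openBatchSession(job)
      } catch {
        case NonFatal(_) =>
          // Assumed equivalent of transitioning PENDING -> ERROR in the metadata store.
          job.state = "ERROR"
      }
    }
  }

  def run(): Seq[Job] = {
    val jobs = Seq(
      Job("ok-1", "INITIALIZED"),
      Job("bad-2", "INITIALIZED"),
      Job("ok-3", "INITIALIZED"))
    guardedLoop(mutable.Queue(jobs: _*))
    jobs
  }
}
```

Matching on `NonFatal` rather than `Throwable` is a design choice in this sketch: it lets fatal errors such as `OutOfMemoryError` propagate while ordinary failures are recorded and skipped.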

---

**Concern**

It's not clear whether we should set the failed job to ERROR or INITIALIZED (INITIALIZED would allow it to retry).

**Test**

Related work items: #11406267

Copilot AI left a comment

Pull request overview

This PR improves Kyuubi Batch v2 resiliency by preventing the batch submitter thread(s) from dying permanently when openBatchSession fails, and fixes an argument-order bug in JDBC metadata state transitions.

Changes:

  • Add error-handling around batch picking/opening in KyuubiBatchService to keep the submit loop alive and attempt to mark failed scheduled batches as ERROR.
  • Fix JDBCMetadataStore.transformMetadataState parameter ordering so state transitions update the intended column/value.
  • Add a REST suite test intended to cover failures during metadata update while opening a batch session.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| kyuubi-server/src/main/scala/org/apache/kyuubi/server/KyuubiBatchService.scala | Wraps submit loop with try/catch and attempts to fail scheduled batches on open failures. |
| kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/MetadataManager.scala | Adds failScheduledBatch helper to transition PENDING -> ERROR. |
| kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/jdbc/JDBCMetadataStore.scala | Corrects SQL parameter order in transformMetadataState. |
| kyuubi-server/src/test/scala/org/apache/kyuubi/server/api/v1/BatchesResourceSuite.scala | Adds a test scenario for openBatchSession failing during metadata update. |
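To illustrate why parameter order matters in a conditional state transition, here is a minimal in-memory sketch. It does not reproduce the SQL or signatures in `JDBCMetadataStore`: the `updateState` helper, its parameter order, and the `table` map are assumptions standing in for an update of the form "set state to a new value where identifier and current state match".

```scala
import scala.collection.mutable

object StateTransitionSketch {
  // In-memory stand-in for the metadata table: identifier -> state.
  val table: mutable.Map[String, String] = mutable.Map.empty

  // Assumed shape of a compare-and-set update with positional parameters:
  // set state = newState where identifier = id and state = expectedState.
  // Returns the number of rows affected (0 or 1).
  def updateState(newState: String, id: String, expectedState: String): Int =
    if (table.get(id).contains(expectedState)) {
      table(id) = newState
      1
    } else {
      0
    }

  // Correct binding: the target state is written, the from state is the guard.
  def transformCorrect(id: String, fromState: String, targetState: String): Boolean =
    updateState(targetState, id, fromState) == 1

  // Swapped binding: the guard looks for targetState, so nothing matches.
  def transformSwapped(id: String, fromState: String, targetState: String): Boolean =
    updateState(fromState, id, targetState) == 1
}
```

With a row `"b1" -> "PENDING"`, `transformSwapped("b1", "PENDING", "ERROR")` updates nothing in this sketch, while `transformCorrect` with the same arguments moves the row to ERROR. This is only one way such a bug can show up; the actual defect fixed in the PR may differ in detail.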

Comment thread kyuubi-server/src/main/scala/org/apache/kyuubi/server/KyuubiBatchService.scala Outdated
oh0873 and others added 4 commits April 20, 2026 09:41
Grammar

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

2 participants