Adds backlog reporting support for non-fnapi based SDF's.#38346
Adds backlog reporting support for non-fnapi based SDF's.#38346acrites wants to merge 4 commits intoapache:masterfrom
Conversation
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request introduces a mechanism for non-FnApi based Splittable DoFns (SDFs) to report their source backlog. By invoking the GetSize method on the residual restriction, the runner can now capture the remaining work size. This information is propagated through the execution context and included in the Dataflow backend's CommitWorkRequest, which allows the Dataflow service to make more informed autoscaling decisions based on the actual remaining work. Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. Footnotes
|
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request implements backlog reporting for Splittable DoFns (SDF) by introducing a getSize mechanism to calculate and propagate backlog bytes to the runner. Key changes include updates to the SplittableProcessElementInvoker and StepContext to handle backlog data, along with an implementation for unbounded sources. Review feedback suggests refining exception handling in the getSize method and addresses the potential brittleness of filtering out legacy default backlog values.
| } | ||
|
|
||
| @GetSize | ||
| public double getSize( |
There was a problem hiding this comment.
| getWorkItem().getWorkToken(), | ||
| activeReader); | ||
| activeReader = null; | ||
| } else if (backlogBytes != UnboundedReader.BACKLOG_UNKNOWN && backlogBytes != 1L) { |
There was a problem hiding this comment.
There was a problem hiding this comment.
Code Review
This pull request introduces the ability for Splittable DoFns (SDF) to report backlog bytes, which is essential for accurate autoscaling in runners like Dataflow. The changes include adding a backlogBytes field to the SDF Result object, updating the StepContext to allow reporting these values, and implementing getSize for UnboundedSourceAsSDFWrapperFn. Feedback is provided regarding potential data loss when casting backlog values from double to long in the Dataflow worker and a suggested optimization in Read.java to avoid expensive reader creation for empty sources.
|
|
||
| @Override | ||
| public void setBacklogBytes(double backlogBytes) { | ||
| StreamingModeExecutionContext.this.backlogBytes = (long) backlogBytes; |
There was a problem hiding this comment.
The backlogBytes value is cast from double to long here. In flushState() (line 532), a value of 1L is explicitly ignored to avoid legacy default values. If a user-reported backlog is between 1.0 and 2.0 (e.g., 1.5), it will be truncated to 1L and subsequently ignored. While a backlog of 1 is typically negligible for autoscaling, this truncation could lead to unexpected behavior if the user-reported value is intended to be significant.
| try (UnboundedReader<OutputT> reader = | ||
| restriction.getSource().createReader(pipelineOptions, restriction.getCheckpoint())) { |
There was a problem hiding this comment.
Creating and starting a reader in getSize can be expensive for some UnboundedSource implementations. As an optimization, consider checking if the source is an instance of EmptyUnboundedSource and returning 0.0 immediately to avoid unnecessary reader creation.
if (restriction.getSource() instanceof EmptyUnboundedSource) {
return 0.0;
}
try (UnboundedReader<OutputT> reader =
restriction.getSource().createReader(pipelineOptions, restriction.getCheckpoint())) {|
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment |
… the API in DoFn.java, but some tests are using it.
Non FnApi-based SDF's now call GetSize on the residual restriction to get source backlog. For the DataflowRunner, this backlog is sent in the CommitWorkRequest sent to the backend, which will allow for better autoscaling decisions.
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, commentfixes #<ISSUE NUMBER>instead.CHANGES.mdwith noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.