Improve logging in boot.go to facilitate future triaging#38342

Open
shunping wants to merge 4 commits into apache:master from shunping:debug-boot

Collaborator

@shunping shunping commented Apr 30, 2026

We recently had difficulty triaging Dataflow customer issues related to Python environments.

This PR adds more logging to boot.go, the SDK image entrypoint, which should help with triaging.

Related internal bug: 491352862

@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request improves the observability of the SDK container boot process by adding more granular logging. These changes facilitate easier debugging of environment-specific issues by providing clearer insights into pipeline configurations, experiment settings, and the state of Python dependencies before and after package installation.

Highlights

  • Enhanced Logging: Added detailed logging for PipelineOptions and Experiments to improve visibility during the boot process.
  • Dependency Tracking: Updated the runtime dependency logging to include a phase identifier and capture all installed packages using 'pip freeze --all'.
  • Pre-installation Diagnostics: Integrated a pre-installation dependency check to better diagnose environment states before package installation.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request enhances logging within the Python container boot process by recording pipeline options, experiments, and runtime dependencies at both pre-installation and post-installation phases. It also updates the dependency logging to use pip freeze --all for better visibility. A security concern was identified regarding the logging of the entire PipelineOptions object, as it may inadvertently expose sensitive credentials such as API keys or access tokens.
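
One possible mitigation for the concern above is to redact option values before they reach the log. The following is a minimal sketch, not what this PR implements: it assumes the options are available as a flat map of strings, and both `redactOptions` and the `sensitiveMarkers` list are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// sensitiveMarkers is a hypothetical list of substrings that flag an option
// name as likely to hold a secret. It is an assumption, not taken from Beam.
var sensitiveMarkers = []string{"password", "secret", "token", "api_key", "credential"}

// redactOptions returns a copy of opts in which the value of every option
// whose name contains a sensitive marker is replaced by a placeholder.
// The input map is left unmodified.
func redactOptions(opts map[string]string) map[string]string {
	out := make(map[string]string, len(opts))
	for name, value := range opts {
		out[name] = value
		lower := strings.ToLower(name)
		for _, marker := range sensitiveMarkers {
			if strings.Contains(lower, marker) {
				out[name] = "<redacted>"
				break
			}
		}
	}
	return out
}

func main() {
	opts := map[string]string{
		"job_name":    "wordcount",
		"db_password": "hunter2",
		"experiments": "use_runner_v2",
	}
	safe := redactOptions(opts)

	// Sort the names so the log lines come out in a stable order.
	names := make([]string, 0, len(safe))
	for name := range safe {
		names = append(names, name)
	}
	sort.Strings(names)
	for _, name := range names {
		fmt.Printf("%s=%s\n", name, safe[name])
	}
}
```

Matching on name substrings is a heuristic and will miss secrets stored under innocuous names, so an allowlist of options that are known to be safe to log may be the more conservative choice.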

Comment thread sdks/python/container/boot.go Outdated
@shunping
Collaborator Author

r: @tvalentyn

@shunping shunping changed the title Add some more logging into boot.go to facilitate future environment debugging Add more logging into boot.go to facilitate future environment debugging Apr 30, 2026
@github-actions
Contributor

Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment assign set of reviewers

@shunping shunping changed the title Add more logging into boot.go to facilitate future environment debugging Add more logging into boot.go to facilitate future triaging Apr 30, 2026
- bufLogger.Printf(ctx, "Logging runtime dependencies:")
- args = []string{"-m", "pip", "freeze"}
+ bufLogger.Printf(ctx, "Logging runtime dependencies (%s):", phase)
+ args = []string{"-m", "pip", "freeze", "--all"}
Collaborator Author

@shunping shunping Apr 30, 2026

The "--all" argument ensures that the versions of pip, setuptools, etc. are included in the output.
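
The change under discussion can be sketched in isolation as follows. This is an illustrative stand-alone program rather than the code in boot.go: `freezeArgs` and `freezeHeader` are hypothetical helper names, and `python3` is an assumed interpreter name on the PATH.

```go
package main

import (
	"fmt"
	"os/exec"
)

// freezeArgs builds the interpreter arguments for listing installed packages.
// With includeAll set, "--all" is appended so that packaging tools such as
// pip and setuptools also appear in the output.
func freezeArgs(includeAll bool) []string {
	args := []string{"-m", "pip", "freeze"}
	if includeAll {
		args = append(args, "--all")
	}
	return args
}

// freezeHeader labels the dependency listing with the phase it was taken in.
func freezeHeader(phase string) string {
	return fmt.Sprintf("Logging runtime dependencies (%s):", phase)
}

func main() {
	fmt.Println(freezeHeader("pre-installation"))

	// "python3" is an assumption; the real entrypoint may resolve the
	// interpreter differently.
	out, err := exec.Command("python3", freezeArgs(true)...).CombinedOutput()
	if err != nil {
		fmt.Println("pip freeze failed:", err)
		return
	}
	fmt.Print(string(out))
}
```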

@shunping shunping requested a review from tvalentyn April 30, 2026 15:21
Comment thread sdks/python/container/boot.go Outdated
bufLogger := tools.NewBufferedLogger(logger)
bufLogger.Printf(ctx, "Installing setup packages ...")

if err := logRuntimeDependencies(ctx, bufLogger, "pre-installation"); err != nil {
Contributor

What is the use-case for this?

Collaborator Author

@shunping shunping Apr 30, 2026

While we already log dependencies after installation, boot.go exits immediately if any installation step fails.

Adding a pre-installation call ensures we capture the environment state regardless of whether the installation succeeds. This is useful for reproducing and triaging environment-specific failures from customers.
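
The ordering argument can be illustrated with a small sketch. It assumes a simplified shape in which the snapshot step cannot fail (the real `logRuntimeDependencies` returns an error); `installWithSnapshots`, `errSimulated`, and the callback parameters are hypothetical names used only for this example.

```go
package main

import (
	"errors"
	"fmt"
)

// errSimulated stands in for a failing installation step in this sketch.
var errSimulated = errors.New("simulated pip error")

// installWithSnapshots records the environment before running install, and
// again afterwards only if install succeeded. snapshot and install are
// hypothetical callbacks standing in for the real logging and pip steps.
func installWithSnapshots(snapshot func(phase string), install func() error) error {
	snapshot("pre-installation")
	if err := install(); err != nil {
		// Returning early mirrors a boot process that stops on failure.
		// The pre-installation snapshot has already been emitted by now.
		return fmt.Errorf("installation failed: %w", err)
	}
	snapshot("post-installation")
	return nil
}

func main() {
	logPhase := func(phase string) {
		fmt.Printf("Logging runtime dependencies (%s):\n", phase)
	}
	failing := func() error { return errSimulated }

	if err := installWithSnapshots(logPhase, failing); err != nil {
		fmt.Println(err)
	}
}
```

With a failing install step, only the pre-installation snapshot is logged before the error is reported, which is the case the extra call is meant to cover.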

Contributor

'pre-installation' and 'post-installation' sound a bit cryptic for an end user who is not a Beam dev, and the output may be a bit verbose.

How about we think of a way to enable debug logging for boot.go and only print the pre-installation env if debug logging is enabled? Then we can ask affected customers to run their pipeline with debug logging enabled if necessary.

Collaborator Author

@shunping shunping Apr 30, 2026

The new logs only add 4 lines and are already emitted at the DEBUG level, which allows users to filter them out as needed.

Given this minimal footprint, I’d prefer not to add the complexity of a new configuration mechanism or flag to boot.go, so that the boot logic stays as simple as possible.

[screenshot]

> 'pre-installation' and 'post-installation' sound a bit cryptic for an end user who is not a Beam dev.

I used the term "installation" because the line "Installing setup packages ..." precedes these logs (see the screenshot above), but I am open to a better term.

Contributor

sg, thanks!

Contributor

@tvalentyn tvalentyn Apr 30, 2026

I do think that the pre-installation output is confusing unless you know why you need to look at it; most of the time, you need to look at the final list after all the installations.

Contributor

Perhaps we could change the log messages as follows:

Installing setup packages -> Installing additional runtime dependencies if any are specified in --requirements_file, --setup_file or --extra_package options.

post-installation -> post-installation (final runtime environment)

Collaborator Author

@shunping shunping May 2, 2026

Great. I changed the labels to "initial runtime environment" and "final runtime environment". I also made some small edits to existing messages to keep them consistent and concise. PTAL

@shunping shunping changed the title Add more logging into boot.go to facilitate future triaging Improve logging in boot.go to facilitate future triaging May 2, 2026
Comment thread sdks/python/container/boot.go Outdated