ti_*: Use num_failure_retries instead of unattended mode #18404
Draft
kcreddy wants to merge 1 commit into elastic:main
Replace settings.unattended: true with settings.num_failure_retries: -1
in all ti_* managed transforms. Unlike unattended mode which retries all
failures indefinitely (masking irrecoverable errors),
num_failure_retries: -1 retries only recoverable failures while still
surfacing genuinely irrecoverable ones to users.

Three packages (ti_anyrun, ti_flashpoint, ti_strider) that were added
after the original unattended PR (elastic#16535) had no failure
resilience at all and now get num_failure_retries: -1 added.

Requires elastic/package-spec#1124 (add num_failure_retries to the
transform settings schema).

[git-generate]
for transform in $(find packages/ti_*/ -type f -name transform.yml \
    -path '*/elasticsearch/transform/*'); do
  yq -i 'del(.settings.unattended)' "$transform"
  yq -i '.settings.num_failure_retries = -1' "$transform"
done

for transform in $(git diff --name-only packages/ | \
    grep 'transform\.yml$'); do
  current=$(yq '._meta.fleet_transform_version' "$transform")
  next=$(echo "$current" | awk -F. '{printf "%d.%d.%d",$1,$2+1,0}')
  yq -i "._meta.fleet_transform_version = \"$next\"" "$transform"
done

for pkg in $(git diff --name-only packages/ | cut -d/ -f1,2 | \
    sort -u); do
  cd "$pkg"
  elastic-package changelog add \
    --description "Use num_failure_retries instead of unattended mode for transform failure recovery." \
    --type enhancement --next minor \
    --link "elastic#18404"
  cd ../../
done

Made-with: Cursor
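The version bump in the second loop of the git-generate script relies on a single awk expression: it splits the version on dots, increments the minor component, and resets the patch component to 0. A minimal sketch of that arithmetic in isolation follows; the `bump_minor` function name is a hypothetical wrapper for illustration and is not part of the commit.

```shell
#!/bin/sh
# Hypothetical helper (not part of the commit): wraps the awk expression
# used in the git-generate script to compute the next minor version.
bump_minor() {
  # -F. splits on dots; $1 is major, $2 is minor; patch is reset to 0.
  echo "$1" | awk -F. '{printf "%d.%d.%d",$1,$2+1,0}'
}

bump_minor "0.3.1"; echo   # prints 0.4.0
bump_minor "1.9.4"; echo   # prints 1.10.0
```

Because `%d` formats each component as an integer, a minor version of 9 becomes 10 rather than rolling over into the major component.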
Force-pushed from 2fc72f0 to f441298.
Proposed commit message
Summary
Switches all ti_* managed transforms from settings.unattended: true to settings.num_failure_retries: -1.

unattended: true retries all failures indefinitely, including irrecoverable ones, which masks real problems from users. num_failure_retries: -1 retries only recoverable failures indefinitely (network blips, transient cluster instability) while still surfacing irrecoverable errors.

This covers 52 transforms across 23 packages. Three packages (ti_anyrun, ti_flashpoint, ti_strider) were added after the original unattended PR (#16535) and had no failure resilience at all; they now get num_failure_retries: -1 for the first time.

Changes per package
For each affected transform:
- Removed settings.unattended: true
- Added settings.num_failure_retries: -1
- Bumped _meta.fleet_transform_version (minor bump triggers reinstall)

Checklist
- changelog.yml file.

Related issues
- unattended: true)
- Replace unattended: true with num_failure_retries: -1 in all managed transforms #18403
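
One way a reviewer might sanity-check the result of this change is to scan the transform files and confirm that none still sets unattended and that each one sets num_failure_retries: -1. The sketch below is a rough, text-based check only: the `check_transforms` function is hypothetical, it uses grep rather than a YAML parser (so it does not verify that the key sits under settings), and like the git-generate script it assumes file paths contain no whitespace.

```shell
#!/bin/sh
# Hypothetical check (not part of the PR): report transform.yml files that
# still set `unattended` or lack `num_failure_retries: -1`.
# Usage: check_transforms DIR...   Returns non-zero if any file is flagged.
check_transforms() {
  rc=0
  for f in $(find "$@" -type f -name transform.yml \
      -path '*/elasticsearch/transform/*'); do
    # Flag any remaining `unattended:` key.
    if grep -Eq '^[[:space:]]*unattended:' "$f"; then
      echo "still sets unattended: $f"
      rc=1
    fi
    # Flag files without a `num_failure_retries: -1` line.
    if ! grep -Eq '^[[:space:]]*num_failure_retries:[[:space:]]*-1[[:space:]]*$' "$f"; then
      echo "missing num_failure_retries: -1: $f"
      rc=1
    fi
  done
  return $rc
}
```

From the repository root, this could be run as `check_transforms packages/ti_*/`; no output and a zero exit status would suggest every matched transform was updated.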