Skip to content

[FLINK-39436][table] Allow late data in PTFs#27935

Open
fhueske wants to merge 3 commits intomasterfrom
fhueske-FLINK-39436-Allow-late-data-in-PTFs
Open

[FLINK-39436][table] Allow late data in PTFs#27935
fhueske wants to merge 3 commits intomasterfrom
fhueske-FLINK-39436-Allow-late-data-in-PTFs

Conversation

@fhueske
Copy link
Copy Markdown
Contributor

@fhueske fhueske commented Apr 15, 2026

What is the purpose of the change

Previously, late events (rowtime <= watermark) were silently dropped before reaching PTF eval(), and timer registrations for times <= watermark were also silently dropped. This change removes both restrictions.
These are breaking changes that are part of FLIP-565 which was approved.

I checked that the docs don't need to be adjusted because the earlier behavior of PTFs dropping late data was not documented. The new behavior is aligned with other functions in Flink.

Brief change log

  • ProcessTableRunner: remove the early-return guard in processEval() so that late events are passed to the PTF's eval() method.

  • WritableInternalTimeContext: remove the watermark check in registerOnTimeInternal() so that timers can be registered for past times. Such timers fire immediately at the next watermark advance, including when registered from within onTimer(). The previous guard also had an unintended side effect: any call to replaceNamedTimer() with a past time would delete the existing timer entry but then silently drop the new registration, leaving the named timer in a state where it appeared un-registered but the old timer was gone.

  • AbstractProcessTableOperator: remove the fired named timer's state entry before invoking onTimer() to prevent stale entries from accumulating in the named timers map state.

Verifying this change

Tests are updated to reflect the new semantics:

  • PROCESS_LATE_EVENTS: demonstrates that late events enter eval(), can register timers (including for past times), and that such timers fire immediately at the next watermark advance.

  • PROCESS_ROW_LATE_EVENTS (new): verifies the same for row-semantics PTFs.

  • PROCESS_OPTIONAL_ON_TIME / PROCESS_NAMED_TIMERS: updated to reflect that timer registrations for past times are no longer dropped. Previously, once the watermark passed the registered time, the timer was silently discarded; now it fires immediately.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? n/a

@flinkbot
Copy link
Copy Markdown
Collaborator

flinkbot commented Apr 15, 2026

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@fhueske fhueske requested a review from twalthr April 15, 2026 17:12
@fhueske fhueske marked this pull request as ready for review April 15, 2026 17:13
Copy link
Copy Markdown
Contributor

@gustavodemorais gustavodemorais left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR looks good in general to me!

Two nits: can you rename the PR title and commit title to use this format [FLINK-39436][table]?

"+I[Bob, {Processing input row +I[Bob, 6, 1970-01-01T00:00:00.006Z] at time 6 watermark null}, 1970-01-01T00:00:00.006Z]",
"+I[Bob, {Timer timeout2 fired at time 2 watermark 9223372036854775807}, 1970-01-01T00:00:00.002Z]",
"+I[Bob, {Clearing all timers at time 2 watermark 9223372036854775807}, 1970-01-01T00:00:00.002Z]")
"+I[Bob, {Timer timeout2 fired at time 5 watermark 9223372036854775807}, 1970-01-01T00:00:00.005Z]",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this is basically the bugfix, right? We were incorrectly firing timeout2 at 2 and now we fire corrently at 5

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not really a bug fix. It's an artifact of allowing late timer registration.

"+I[Alice, {Processing input row +I[Alice, 1, 1970-01-01T00:00:00.001Z] at time 1 watermark null}, 1970-01-01T00:00:00.001Z]",
"+I[Bob, {Processing input row +I[Bob, 2, 1970-01-01T00:00:00.002Z] at time 2 watermark null}, 1970-01-01T00:00:00.002Z]")
.consumedAfterRestore(
"+I[Bob, {Timer timeout1 fired at time 1 watermark 9223372036854775807}, 1970-01-01T00:00:00.001Z]",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing that was a bit confusing for me is why we have watermark 9223372036854775807 here? Do we emit this max watermark only for testing or do our sources actually emit this when all values have been read? I wondered if that basically makes the user not know if something is late anymore since

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sources emit this when all values have been read. It will flush all remaining timers and windows and mark the job as complete.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. This is an unrelated topic to the PR so definitely not a blocker: but I guess it makes sense for users to be aware of this so they can write their logic accordingly as well? Maybe it's already documented somewhere and then there's nothing to do 🙂

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And for this PR: maybe add a short comment Sources emit 9223372036854775807 as a MAX watermark after all values have been read. It makes understanding the test a bit easier

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAICT this is a property of the testing framework.
I can see that this looks surprising, but I'm not sure if it makes sense to put a comment everywhere the framework is being used.

public static final TableTestProgram PROCESS_ROW_LATE_EVENTS =
TableTestProgram.of(
"process-row-late-events",
"test that late events enter a row-level PTF")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"test that late events enter a row-level PTF")
"test that late events enter a PTF with row semantics")

@gustavodemorais
Copy link
Copy Markdown
Contributor

One thing worth noting: removing the watermark guard means past-time timers registered from onTimer() fire immediately (within the same advanceWatermark loop). But it also means something like this lead to infinite loops:

 public void onTimer(OnTimerContext ctx) {
      // "remind me again in the past" = fires immediately = infinite loop
      ctx.timeContext(Long.class).registerOnTime(0L);
  }

AI says this is the same behavior as standard Flink KeyedProcessFunction, so the design is consistent - just wanted to flag that PTF users were previously protected from this and they may shoot themselves in the foot more easily now. Maybe worth a short note in the timers section of ptfs.md along the lines of "timers registered at or below the current watermark fire immediately at the next watermark advance; avoid unconditional re-registration of past-time timers from onTimer() as this will cause an infinite loop."

Copy link
Copy Markdown
Contributor

@twalthr twalthr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR @fhueske. I added some minor comments. Could we update the JavaDoc in ProcessTableFunction and the documentation for late events. Late events deserve a dedicated section, before * <h2>Efficiency and Design Principles</h2>.

Comment on lines 157 to 165
// Remove the fired timer's state entry immediately to prevent stale entries from
// accumulating. Without this, entries for fired timers would persist until the same
// timer name is re-registered or the state is explicitly cleared.
if (namedTimersMapState != null && namedTimer != VoidNamespace.INSTANCE) {
namedTimersMapState.remove((StringData) namedTimer);
}
processTableRunner.ingestTimerEvent(
timer.getKey(),
namedTimer == VoidNamespace.INSTANCE ? null : (StringData) namedTimer,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch! some nit comments

Suggested change
// Remove the fired timer's state entry immediately to prevent stale entries from
// accumulating. Without this, entries for fired timers would persist until the same
// timer name is re-registered or the state is explicitly cleared.
if (namedTimersMapState != null && namedTimer != VoidNamespace.INSTANCE) {
namedTimersMapState.remove((StringData) namedTimer);
}
processTableRunner.ingestTimerEvent(
timer.getKey(),
namedTimer == VoidNamespace.INSTANCE ? null : (StringData) namedTimer,
boolean isNamedTimer = namedTimer != VoidNamespace.INSTANCE;
// Remove the fired timer's state entry immediately to prevent stale entries from
// accumulating. Without this, entries for fired timers would persist until the same
// timer name is re-registered or the state is explicitly cleared.
if (isNamedTimer) {
namedTimersMapState.remove((StringData) namedTimer);
}
processTableRunner.ingestTimerEvent(
timer.getKey(),
isNamedTimer ? (StringData) namedTimer: null,

// Register at wm+1 to always target the immediate next watermark: the timer fires
// exactly once per watermark advance, and each new row re-registers the timer for the
// following watermark step, demonstrating repeated timer re-registration.
long timer = wm == null || wm < 0 ? 1 : wm + 1;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

wouldn't this be more correct and allowing for watermarks before epoch.

Suggested change
long timer = wm == null || wm < 0 ? 1 : wm + 1;
long timer = wm == null ? Long.MIN_VALUE + 1: wm + 1;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is just a function for testing purposes and we assert against the behavior that we implement.
So I think it doesn't really matter.
Or is there an (edge) case that would be covered with your suggestion that isn't with the current version?

@github-actions github-actions bot added the community-reviewed PR has been reviewed by the community. label Apr 16, 2026
@fhueske fhueske force-pushed the fhueske-FLINK-39436-Allow-late-data-in-PTFs branch from 857e77c to a08f393 Compare April 16, 2026 17:18
@fhueske fhueske changed the title [FLINK-39436] [sql] Allow late data in PTFs [FLINK-39436][table] Allow late data in PTFs Apr 16, 2026
@fhueske
Copy link
Copy Markdown
Contributor Author

fhueske commented Apr 16, 2026

Thanks for your reviews @gustavodemorais and @twalthr.
I've addressed your comments and updated the PR.

Copy link
Copy Markdown
Contributor

@gustavodemorais gustavodemorais left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR looks good to me! Thanks, @fhueske

  • I think you have to run spotless to make the CI green

@fhueske fhueske force-pushed the fhueske-FLINK-39436-Allow-late-data-in-PTFs branch from a08f393 to 57b157d Compare April 17, 2026 06:38
Copy link
Copy Markdown
Contributor

@twalthr twalthr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update @fhueske, I had two last comments. Should be good in the next iteration.

{{< /tab >}}
{{< /tabs >}}

### Handling of Late Records
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copy this to the Chinese docs as well

### Handling of Late Records

A late record is a record with a time attribute value that is less than or equal to the current
watermark. PTFs handle late records just like non-late records by calling the `eval()` method.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

keep docs in sync:

Suggested change
watermark. PTFs handle late records just like non-late records by calling the `eval()` method.
watermark. PTFs handle late records just like non-late records by calling the `eval()` method. If the {@code on_time} argument is specified, the late timestamp is preserved in
* the output

fhueske added 3 commits April 17, 2026 19:42
Previously, late events (rowtime <= watermark) were silently dropped
before reaching PTF eval(), and timer registrations for times <=
watermark were also silently dropped. This change removes both
restrictions:

- ProcessTableRunner: remove the early-return guard in processEval()
  so that late events are passed to the PTF's eval() method.

- WritableInternalTimeContext: remove the watermark check in
  registerOnTimeInternal() so that timers can be registered for past
  times. Such timers fire immediately at the next watermark advance,
  including when registered from within onTimer(). The previous guard
  also had an unintended side effect: any call to replaceNamedTimer()
  with a past time would delete the existing timer entry but then
  silently drop the new registration, leaving the named timer in a
  state where it appeared un-registered but the old timer was gone.

- AbstractProcessTableOperator: remove the fired named timer's state
  entry before invoking onTimer() to prevent stale entries from
  accumulating in the named timers map state.

Tests are updated to reflect the new semantics:

- PROCESS_LATE_EVENTS: demonstrates that late events enter eval(),
  can register timers (including for past times), and that such timers
  fire immediately at the next watermark advance.

- PROCESS_ROW_LATE_EVENTS (new): verifies the same for row-semantics
  PTFs.

- PROCESS_OPTIONAL_ON_TIME / PROCESS_NAMED_TIMERS: updated to reflect
  that timer registrations for past times are no longer dropped.
  Previously, once the watermark passed the registered time, the timer
  was silently discarded; now it fires immediately.
@fhueske fhueske force-pushed the fhueske-FLINK-39436-Allow-late-data-in-PTFs branch from 57b157d to 1781123 Compare April 17, 2026 17:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

community-reviewed PR has been reviewed by the community.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants