feat(traces): migrate to embed.TraceConsumer + ProducerModule with realistic per-span emission timing (PIPE-1024)#234
Conversation
| case <-drainCtx.Done(): | ||
| g.logger.Warn("Traces generator stop drain bounded", | ||
| zap.Duration("bound", stopDrainTimeout), | ||
| zap.Error(drainCtx.Err()), | ||
| ) | ||
| return drainCtx.Err() |
There was a problem hiding this comment.
Stop() drain timeout fires but background goroutine still blocks on g.wg.Wait() until pending
time.AfterFunc callbacks complete. Generator object held in memory until last pending span emits. PR description explicitly intended "in-flight emissions abandoned", actual behavior: emissions complete eventually, just after Stop
returns. Not a leak; document the lingering-goroutine behavior or cancel pending AfterFunc on stop.
There was a problem hiding this comment.
Good catch — the doc was misleading. Took the doc-only path you suggested: rewrote the Stop docstring to honestly describe the actual semantic. Emissions that don't fire within the drain bound continue asynchronously after Stop returns; the Generator stays reachable until they resolve. Not a leak (every callback eventually runs and the wg drains), but it's worth being explicit that "Stop returned a timeout error" doesn't mean "no further emissions will reach the consumer". Also noted in the docstring that callers needing strict no-post-Stop emissions can wrap the supplied consumer with a guard.
73fa41c to
f3df873
Compare
f6ebb9d to
7dc72c2
Compare
f3df873 to
45aabb7
Compare
7dc72c2 to
069816e
Compare
…alistic per-span emission timing (PIPE-1024)
45aabb7 to
c130826
Compare
069816e to
2178b70
Compare

Proposed Change
traces.Newnow takes aConfigstruct (Logger, Workers, Rate, Hostname, Consumer, Seed);StartbecomesStart(ctx context.Context) error(embed.ProducerModule shape);Name()returns"traces".Realistic per-span emission timing. Rather than batching all 2–5 spans of a trace into a single
ConsumeTracescall, each span is scheduled individually at its ownEndTimeviatime.AfterFunc, so the consumer observes spans arriving at staggered wall-clock moments the way a real distributed system delivers them. This is a load-bearing design choice for downstream distributed-blitz simulation where traces span multiple blitz instances. Pending emissions are tracked in async.WaitGroupsoStopattempts to drain them within a bound ofctxor 20s (whichever fires first). On bound elapsed, pendingtime.AfterFunccallbacks continue to fire on their original schedule — their emissions complete asynchronously after Stop returns, and the Generator stays reachable until they resolve. Not a leak, but documented explicitly in theStopdocstring so callers aren't surprised; callers who need strict no-post-Stop emissions can wrap the supplied consumer with a guard.Trace shape. Each trace = 1 HTTP server root + 1–4 children randomly drawn from {DB query, cache lookup, internal processing, downstream HTTP call, input validation} — yielding 2–5 spans per trace. Variety + child count seed-controlled. The fixed root + db + optional proc shape is gone.
New
Seed int64field onTracesGeneratorConfig(and ontraces.Config), mirroring FIX's precedent.Seed >= 0deterministically seeds workerNwithSeed+N;Seed < 0falls back totime.Now().UnixNano()+Nper worker. YAML omitted ⇒ randomize: dispatch layer translates YAML zero-value (or explicitseed: 0) into the negative-randomize path so YAML users get stochastic data by default. TraceID / SpanID always come fromcrypto/randregardless of Seed — global uniqueness is preserved across blitz instances. Per-machine identity determinism (hostname) is governed separately by PIPE-1036'senvironment.seed_config.Hostname. New optional
Hostnamefield ontraces.Config+ YAML — when empty, a deterministic Linux-style hostname is generated from Seed viadatagen.GenerateHostname, matching the hostmetrics convention so records from both signals attribute consistently to the same simulated machine. Every emitted span'sMetadata.Resourceis populated viaresource.Default("traces")withhost.nameoverridden by this hostname.Count tracker semantics.
SetCountTracker: one Acquire equals one trace, not one span. Documented on the method.The legacy
generator.TraceGeneratorinterface and the correspondinginternal/service/service.gotype-switch case are deleted in this PR —traceswas the sole implementer.Tests
traces_test.gousesrequire.Eventually(notime.Sleep), newmockTraceConsumercaptures spans + arrival times, new nil-consumer error case ("TraceConsumer cannot be nil").TestSpansEmittedAtEndTimeproves arrival time ≥ EndTime per-span — holds under both abort and drain semantics.TestSeedDeterminismfor deterministic Name/Kind across runs.TestTraceStructurevalidates the new root + 1–4 children shape, with every child parented to root.stubTraceGenfromservice_test.go(dead path).Part of PIPE-1024. Stacked on #233.