node 22's built-in WebSocket couldn't scale to 2000 concurrent
connections: ~15% of connects failed with a generic "WS error" and
the p99 upgrade latency was 22 seconds. Those weren't server-side
failures; the client was the bottleneck. With the same server code
and a different client, the numbers changed completely:
metric            built-in   ws pkg
---------------   --------   ------
total errors      412        0
ws connect errs   312        0
ws connect p90    11973ms    302ms
ws connect p99    21924ms    397ms
throughput        83 r/s     99 r/s
Also fixes the "extra respondents" issue: the built-in WebSocket
can't send Cookie headers, so returning users always created fresh
respondents on reconnect. With the ws package we can set headers
and capture Set-Cookie from the upgrade response, so reconnects
reuse the same rid. Server-side respondent count now matches the
expected user count exactly.
Changes:
- add ws + @types/ws devDependencies
- createWsClient uses ws package with Error-typed callbacks and
HTTP 'unexpected-response' event for real diagnostics (status
codes, error messages) instead of a generic 'WS error' label
- capture rid_{slug} cookie from upgrade response, reuse on all
subsequent reconnects in the same client
- returning user flow passes the cookie on explicit reconnect so
they come back as the same respondent
- load test queries the server's /results endpoint at the end and
prints "Server-side respondents: X total ✓" to verify the
expected user count matches persisted state
- remove debug logging from cookie capture
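The cookie capture and reuse described above can be sketched as two pure helpers. This is a minimal sketch, assuming the cookie is named `rid_<slug>` and that the `Set-Cookie` values from the upgrade response are available as a string array; `extractRidCookie` and `cookieHeaderFor` are hypothetical names, not the ones used in createWsClient.

```typescript
// Hypothetical helper: pull the rid_<slug> value out of the Set-Cookie
// header values seen on the WebSocket upgrade response.
function extractRidCookie(setCookie: string[] | undefined, slug: string): string | undefined {
  const name = `rid_${slug}`;
  for (const raw of setCookie ?? []) {
    // Only the leading "name=value" pair matters; attributes such as
    // Path or HttpOnly follow after the first ';'.
    const pair = (raw.split(";")[0] ?? "").trim();
    const eq = pair.indexOf("=");
    if (eq > 0 && pair.slice(0, eq) === name) {
      return pair.slice(eq + 1);
    }
  }
  return undefined;
}

// Hypothetical helper: build the Cookie request header for a reconnect.
function cookieHeaderFor(slug: string, rid: string): string {
  return `rid_${slug}=${rid}`;
}
```

On reconnect the client would pass `cookieHeaderFor(slug, rid)` as the `Cookie` header so the server sees the same respondent.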
Researched how leading survey tools (typeform, surveymonkey, jotform,
tally, google forms, microsoft forms) present results per question
type. The previous results page had several well-known anti-patterns:
pie charts as a toggle, word clouds for text, chart.js for simple bars,
no low-sample guards, and a single generic bar chart for every choice
type regardless of what was actually being measured.
Per-type visualisations:
ChoiceResult (radio/dropdown/checkbox)
- horizontal bars sorted by count
- "most common" / "most selected" callout
- checkbox mode uses respondent-based percentages (not
response-based) and shows "avg selections per respondent", the
stat everyone actually wants
- top answer is visually highlighted
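To illustrate respondent-based percentages for checkbox questions, here is a minimal sketch. It assumes each respondent's answer arrives as an array of selected option labels; `checkboxStatsSketch` is a hypothetical stand-in and may not match the real computeCheckboxStats signature.

```typescript
interface OptionStat {
  option: string;
  count: number;
  pctOfRespondents: number; // share of respondents who picked this option
}

// Each inner array is one respondent's selected options.
function checkboxStatsSketch(responses: string[][]): { rows: OptionStat[]; avgSelections: number } {
  const respondents = responses.length;
  const counts = new Map<string, number>();
  let totalSelections = 0;
  for (const selected of responses) {
    // De-duplicate within one respondent so an option counts at most once per person.
    new Set(selected).forEach((option) => {
      counts.set(option, (counts.get(option) ?? 0) + 1);
      totalSelections += 1;
    });
  }
  const rows: OptionStat[] = [];
  counts.forEach((count, option) => {
    rows.push({ option, count, pctOfRespondents: respondents === 0 ? 0 : (100 * count) / respondents });
  });
  rows.sort((a, b) => b.count - a.count); // most selected first
  return { rows, avgSelections: respondents === 0 ? 0 : totalSelections / respondents };
}
```

Because the denominator is the number of respondents rather than the number of selections, the percentages can sum to more than 100.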
RatingResult
- big mean score with rendered stars
- "satisfaction" top-box % (rated 4+ of 5)
- distribution bars ordered high-to-low
- low-sample notice below n=5
NpsResult
- big NPS score colour-coded by industry bands
- four stats: promoters/passives/detractors/responses
- classic 3-segment stacked bar
- per-score distribution (0-10) showing the underlying shape
- "need 10+ responses for a meaningful NPS" notice
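For reference, the NPS score and the three segments follow the standard definition (promoters score 9-10, passives 7-8, detractors 0-6). A minimal sketch, assuming scores are integers 0..10 and using the 10-response threshold from the notice above; `npsSketch` is a hypothetical name.

```typescript
interface NpsSummary {
  promoters: number;
  passives: number;
  detractors: number;
  responses: number;
  nps: number | null; // null when there are no responses
  lowSample: boolean; // true below the 10-response threshold
}

function npsSketch(scores: number[]): NpsSummary {
  let promoters = 0;
  let passives = 0;
  let detractors = 0;
  for (const s of scores) {
    if (s >= 9) promoters += 1;
    else if (s >= 7) passives += 1;
    else detractors += 1;
  }
  const responses = scores.length;
  // NPS = %promoters - %detractors, reported as a whole number from -100 to 100.
  const nps = responses === 0 ? null : Math.round((100 * (promoters - detractors)) / responses);
  return { promoters, passives, detractors, responses, nps, lowSample: responses < 10 };
}
```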
LikertResult
- diverging stacked bar centred on the neutral midpoint (the gold
standard for agreement scales — biggest differentiator vs most
survey tools)
- agree/neutral/disagree/mean headline stats
- inline legend
NumberResult
- histogram with auto-bucketing (integer-aware)
- mean/median/min/max/count stat strip, with median tonally
highlighted because it's less skewed by outliers
- raw dot list for low-sample data
TextResult (text/textarea/email)
- n-gram frequency (bigrams + trigrams, stopword filtered) —
replaces the word cloud which is decorative rather than
analytic. actual phrases with counts are much more useful.
- email mode shows unique/duplicate counts and top domain
breakdown instead of n-grams
- paginated response list with client-side search
- textarea responses truncate past 200 chars with "show more"
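The n-gram frequency idea can be sketched as follows. This is an illustration only: the stopword list is a tiny placeholder, stopwords are dropped before n-grams are formed, and phrases seen only once are hidden. All three are assumptions, and `ngramCountsSketch` is a hypothetical name rather than the real computeNgrams.

```typescript
// Placeholder stopword list; a real one would be much longer.
const STOPWORDS = new Set(["the", "a", "an", "and", "or", "of", "to", "is", "it", "was", "in"]);

function ngramCountsSketch(texts: string[], sizes: number[] = [2, 3]): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const text of texts) {
    // Lowercase, split on non-word characters, drop empties and stopwords.
    const tokens = text
      .toLowerCase()
      .split(/[^a-z0-9']+/)
      .filter((t) => t.length > 0 && !STOPWORDS.has(t));
    for (const n of sizes) {
      for (let i = 0; i + n <= tokens.length; i++) {
        const gram = tokens.slice(i, i + n).join(" ");
        counts.set(gram, (counts.get(gram) ?? 0) + 1);
      }
    }
  }
  const out: Array<[string, number]> = [];
  counts.forEach((count, gram) => {
    if (count > 1) out.push([gram, count]); // keep only repeated phrases
  });
  // Most frequent first; ties broken alphabetically for stable output.
  return out.sort((a, b) => b[1] - a[1] || a[0].localeCompare(b[0]));
}
```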
Shared:
- HorizontalBar component (pure CSS, no chart.js) — faster to
render and gives much finer layout control for row-per-option
- StatStrip for consistent stat display
- LowSampleNotice with configurable threshold
- analytics-utils gains computeRating, computeLikert, computeNumber,
bucketNumbers, computeNgrams, computeTextStats, computeEmailStats,
computeCheckboxStats, npsDistribution
Results page header:
- 4-up kpi strip (total / completed / completion / live) replacing
the previous 3-card layout
- gradients and progress bar for visual clarity
Removed obsolete components: BarChart, PieChart, ChartTypeToggle,
WordCloud, NpsScoreCard. Chart.js is no longer a dependency for the
per-question results (still used for Timeline/Dropoff overviews).
QuestionResult.svelte is now a thin dispatcher that picks the right
component based on question.type.
Tests: 381 unit + 81 e2e passing, production build clean,
0 svelte-check errors.
follow practice used by typeform, tally, jotform: the overview card shows a small sample (top 5 by frequency then length) with a link to the dedicated responses tab, instead of duplicating the full browser and search that already exist there.
for email questions the actual addresses ARE the value, unlike free text. split out a new emailresult component with:
- lead-quality split (corporate vs free vs disposable) from hardcoded domain lists
- disposable/role-based/invalid flag badges per address
- gmail-aware normalisation (strip dots and +tags) so duplicates get properly grouped
- filter chips + search + pagination over the full deduped list
- copy-filtered-list and mailto:?bcc= actions for quick lead export
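The gmail-aware normalisation can be sketched like this. A minimal sketch, assuming only `gmail.com` gets the dot and +tag treatment and that obviously malformed input maps to `null`; `normalizeEmailSketch` is a hypothetical name and the real normalizeEmail may validate more strictly.

```typescript
function normalizeEmailSketch(raw: string): string | null {
  const email = raw.trim().toLowerCase();
  const at = email.lastIndexOf("@");
  // Reject input with no local part or no domain.
  if (at <= 0 || at === email.length - 1) return null;
  let local = email.slice(0, at);
  const domain = email.slice(at + 1);
  if (domain === "gmail.com") {
    // Gmail ignores everything after '+' and all dots in the local part.
    const plus = local.indexOf("+");
    if (plus >= 0) local = local.slice(0, plus);
    local = local.split(".").join("");
  }
  if (local.length === 0) return null;
  return `${local}@${domain}`;
}
```

Grouping addresses by the normalised form is what lets `j.doe+news@gmail.com` and `jdoe@gmail.com` count as one lead.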
clicking a search row fetches the respondent and shows every answer inline, with the matching question highlighted so it's obvious where the search hit landed.
…rework
fast tier (5s) stays pure in-memory as before. new slow tier fires once per minute and runs the sql-backed aggregations (timeline, dropoff, completion times) ONCE per DO and fans the result out to every connected viewer in a single push, so n viewers never become n queries.
timeline:
- add minute granularity (backend + wire op + ws protocol)
- auto-pick initial granularity from the observed timestamp span
- client-side gap fill anchored to now so a single-point survey renders as a proper line across the time axis instead of a dot on the left
dropoff:
- pure-css funnel (no more chart.js for this one): bar width is share of the starting cohort, rhs label is drop from previous question
- live updates via slow analytics push
completion-time histogram:
- brand new chart. server groups durations into 8 fixed buckets with running mean/median/min/max. green bar is the bucket containing the median.
- live updates via slow analytics push
…s for histogram
- revert the 2-col grid that made both charts cramped; back to full-width stacked cards
- completion-time histogram now uses chart.js (same dep as timeline, already loaded on the page) so the bars actually render. the previous pure-css approach relied on percentage heights on flex-1 children, which collapse when the column has no intrinsic cross-axis height
enter-to-advance: QuestionCard's keydown handler used to only fire for input[type=text], so email/number/etc didn't advance on enter. now it advances on enter for every focus target except textarea (newline), button and link (native activation).
per-question timing:
- new answer_ms column on answers (migration + DO schema + additive alter for existing DO instances)
- shared AnswerInput + client PendingSave carry an optional answerMs
- the survey loader stamps 'question shown at' in a plain non-reactive object via a $effect on engine.currentQuestion.id; handleAnswer computes elapsed and puts it on the buffered save
- ws submit-answers and the http service layer clamp to [0, 24h], then store alongside the answer row
- new getAnswerDurationsByQuestion repo query + getQuestionTimings service method compute median/mean per question in js
- slow-tier analytics broadcast now includes question timings so the chart updates once a minute without fan-out
- new QuestionTimingChart (chart.js horizontal bar) showing median seconds per question with tooltip showing median/mean/sample size
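The [0, 24h] clamp on answer durations could look roughly like this. A minimal sketch: rounding to whole milliseconds and mapping non-numeric input to `null` are assumptions, and the helper is named `clampAnswerMsSketch` to make clear it is not necessarily the shared clampAnswerMs.

```typescript
// 24 hours in milliseconds, the upper bound quoted in the commit message.
const MAX_ANSWER_MS_SKETCH = 24 * 60 * 60 * 1000;

// Assumption: anything that is not a finite number is treated as "no timing".
function clampAnswerMsSketch(value: unknown): number | null {
  if (typeof value !== "number" || !isFinite(value)) return null;
  return Math.min(MAX_ANSWER_MS_SKETCH, Math.max(0, Math.round(value)));
}
```

Running the same helper on the WS path and the HTTP path keeps a client-supplied duration from skewing the per-question medians with negative or absurdly large values.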
question timings:
- backend extended to compute p5/p25/p50/p75/p95 + min/max per question via nearest-rank percentile on the sorted duration array
- QuestionTimingChart rewritten as a pure-css horizontal box plot: whiskers p5-p95, iqr box p25-p75, median tick, caps at p5 and p95
- shared axis anchored to max-of-p95 x 1.05 so a single slow outlier doesn't squash every other row
number histogram:
- NumberResult had the same broken flex-1 percentage-height bar pattern as the completion-time chart used to have. explicit pixel heights + a fixed-height parent fix the invisible bars
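The nearest-rank percentile and the shared axis bound can be sketched as below. This is an illustration under assumptions: `percentileSketch`, `summarizeDurations` and `sharedAxisMax` are hypothetical names, and the input is a plain array of durations in milliseconds.

```typescript
// Nearest-rank percentile over an ascending-sorted array:
// rank = ceil(p/100 * N), 1-indexed, clamped to [1, N].
function percentileSketch(sorted: number[], p: number): number {
  if (sorted.length === 0) throw new Error("percentile of empty array");
  const rank = Math.ceil((p * sorted.length) / 100);
  const index = Math.min(sorted.length, Math.max(1, rank)) - 1;
  return sorted[index] as number;
}

interface TimingSummary {
  min: number;
  p5: number;
  p25: number;
  p50: number;
  p75: number;
  p95: number;
  max: number;
}

function summarizeDurations(durationsMs: number[]): TimingSummary {
  const sorted = durationsMs.slice().sort((a, b) => a - b); // numeric sort on a copy
  return {
    min: sorted[0] as number,
    p5: percentileSketch(sorted, 5),
    p25: percentileSketch(sorted, 25),
    p50: percentileSketch(sorted, 50),
    p75: percentileSketch(sorted, 75),
    p95: percentileSketch(sorted, 95),
    max: sorted[sorted.length - 1] as number,
  };
}

// Shared x-axis bound: the largest p95 across rows plus 5% headroom.
function sharedAxisMax(rows: TimingSummary[]): number {
  let maxP95 = 0;
  for (const row of rows) maxP95 = Math.max(maxP95, row.p95);
  return maxP95 * 1.05;
}
```

Anchoring the axis to p95 rather than max means one respondent who left a tab open for an hour only stretches their own whisker cap, not the scale of every row.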
unit (analytics-utils.test.ts): normalizeEmail (gmail dots/+tags, validation), classifyDomain, computeEmailSummary (dedupe, flags, topDomains), computeNgrams, computeTextStats, computeCheckboxStats, computeRating, computeLikert, computeNumber + bucketNumbers. 52 new tests bring the unit test count from 381 to 433.
e2e:
- Enter-to-advance: text/email/number inputs submit on Enter, textarea keeps Enter as a newline
- Search tab: clicking a result row expands the full respondent detail, shows the match chip on the originating question, and collapses again on re-click
- Overview tab: completion-time and per-question-timing charts render
- new shared constants.ts entries: MAX_ANSWER_MS, clampAnswerMs, percentile, BROADCAST_FAST_INTERVAL_MS, BROADCAST_SLOW_TICKS_PER_CYCLE, COMPLETION_TIME_BUCKETS. Same magic numbers were scattered across ws-handler, ws-broadcaster and respondent.service before.
- respondent.service.getQuestionTimings / getCompletionTimes now share a single nearest-rank percentile helper instead of re-declaring a local closure in each method. The completion-time bucket definition lives in constants so it's testable and the service just clones + counts it.
- ws-handler replaces three ADMIN_OPS/EDITOR_OPS/VIEWER_OPS sets + three inline hasMinRole checks with a single declarative OP_ROLES map. Ops not listed are public (respondent survey-taking); any new op just adds one line to the table. The dispatch becomes a single conditional.
- ws-handler's inline answer_ms clamping is replaced with the shared helper so the WS fast-path and the HTTP service path validate identically.
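The declarative role table might look something like this. A minimal sketch: the op names are invented for illustration, the viewer < editor < admin ordering is an assumption, and `isOpAllowed` is a hypothetical stand-in for the single conditional in the dispatch.

```typescript
type Role = "viewer" | "editor" | "admin";

// Assumed ordering: a higher rank implies every lower role's permissions.
const ROLE_RANK: Record<Role, number> = { viewer: 1, editor: 2, admin: 3 };

// Hypothetical op names; the real table lives in ws-handler.
const OP_ROLES_SKETCH = new Map<string, Role>([
  ["get-results", "viewer"],
  ["update-question", "editor"],
  ["delete-survey", "admin"],
]);

function isOpAllowed(op: string, role: Role | null): boolean {
  const required = OP_ROLES_SKETCH.get(op);
  if (required === undefined) return true; // unlisted ops are public
  return role !== null && ROLE_RANK[role] >= ROLE_RANK[required];
}
```

One trade-off worth noting: because unlisted ops default to public, forgetting to register a new privileged op fails open, so the table deserves a test that enumerates every op.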
…lidate slug cache on mutate
Two fixes from the DO auth review:
1) Strip X-WS-Role / X-Respondent-Id / X-Authenticated from inbound client requests before forwarding to the DO. The worker is the only thing allowed to set these headers. Previously, `new Headers(request.headers)` copied client-supplied values, and X-Respondent-Id was only overwritten when the worker found a valid rid cookie. A client sending the header with no cookie could claim an arbitrary respondent ID on the WS upgrade and the HTTP path. Fixed by deleting all three internal headers up front in stripInternalHeaders() and then explicitly setting the ones the worker has verified.
2) Invalidate the per-isolate slug→id cache when a survey is deleted or its slug/password_hash mutates. Previously the 60s TTL could serve a stale route after delete, and if the slug got reused by a fresh survey the cache would still point at the old DO. Now invalidateSlugCacheBySurveyId() drops matching entries as part of the delete path and the catalog-sync path. Other isolates still rely on TTL; keeping that short (60s) bounds the window.
e2e: added a regression test that sends a forged X-Respondent-Id with no cookie and verifies the DO ignores it and allocates a fresh respondent.
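The header-stripping fix can be sketched with the standard Fetch `Headers` class. The signatures here are assumptions: the real stripInternalHeaders() may take a Request, and `buildForwardHeaders` is a hypothetical wrapper showing the "delete first, then set only what was verified" order.

```typescript
const INTERNAL_HEADERS = ["X-WS-Role", "X-Respondent-Id", "X-Authenticated"];

// Copy the inbound headers and drop every internal header the client may have forged.
function stripInternalHeaders(incoming: Headers): Headers {
  const headers = new Headers(incoming);
  for (const name of INTERNAL_HEADERS) headers.delete(name);
  return headers;
}

// Hypothetical wrapper: only values the worker itself verified are set again.
function buildForwardHeaders(incoming: Headers, verifiedRid: string | null): Headers {
  const headers = stripInternalHeaders(incoming);
  if (verifiedRid !== null) headers.set("X-Respondent-Id", verifiedRid);
  return headers;
}
```

Deleting unconditionally, rather than overwriting only when a cookie is present, is what closes the "header with no cookie" gap.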
The chained Enter-key presses raced with the 200ms question transition animation and the QuestionCard keydown listener teardown/remount cycle — locators were firing while the old card was unmounting and the new one hadn't mounted yet. Switched to role-based heading locators, added waitForTransition() between each advance, and factored the boilerplate into a startSurveyAtFirstQuestion() helper. The textarea test now uses the Next button to get to the textarea question so the feature-under-test (Enter behaviour) is exercised in isolation.
…ator
TextareaQuestion was not passing question.placeholder down to the Textarea component, so any placeholder configured on a textarea question was silently ignored. The enter-to-advance e2e test hit this when trying to locate the textarea by placeholder and timed out. Fixed the component to forward the placeholder prop and switched the e2e test to a role-based textarea locator so it no longer depends on the placeholder being rendered.
text input debounce:
- TextQuestion, TextareaQuestion, EmailQuestion, NumberQuestion all
debounce their oninput callback by 300ms (clearTimeout on every
keystroke, fire once the user pauses). onDestroy flushes the last
value so advancing via Enter or clicking Next doesn't lose input.
- client.ts bufferAnswer no longer increments unflushedCount when the
same questionId is already buffered — repeated updates to a text
field were reaching the FLUSH_THRESHOLD of 4 after just 4 keystrokes
and triggering an unnecessary network flush mid-typing.
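The bufferAnswer fix can be illustrated with a small buffer class. A minimal sketch, assuming answers are keyed by questionId and a flush callback receives the batch; `AnswerBufferSketch` and its method shapes are hypothetical, only the threshold of 4 comes from the commit message.

```typescript
const FLUSH_THRESHOLD_SKETCH = 4; // value quoted in the commit message

type AnswerBatch = Array<[string, string]>;

class AnswerBufferSketch {
  private pending = new Map<string, string>();
  private unflushedCount = 0;
  private onFlush: (batch: AnswerBatch) => void;

  constructor(onFlush: (batch: AnswerBatch) => void) {
    this.onFlush = onFlush;
  }

  bufferAnswer(questionId: string, value: string): void {
    // Only a question that is not already buffered counts towards the threshold;
    // repeated edits to the same field just overwrite the pending value.
    if (!this.pending.has(questionId)) this.unflushedCount += 1;
    this.pending.set(questionId, value);
    if (this.unflushedCount >= FLUSH_THRESHOLD_SKETCH) this.flushNow();
  }

  flushNow(): void {
    const batch: AnswerBatch = [];
    this.pending.forEach((value, questionId) => batch.push([questionId, value]));
    this.pending.clear();
    this.unflushedCount = 0;
    if (batch.length > 0) this.onFlush(batch);
  }
}
```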
dashboard pagination + search:
- backend GET /api/surveys now accepts search, offset, limit params.
returns { surveys, total } instead of a bare array. repository uses
LIKE on title for search, standard offset/limit for pagination,
COUNT(*) for the total.
- frontend dashboard uses listSurveysPaginated with 12 per page.
debounced search bar (300ms), paginator with prev/next, empty-state
distinguishes 'no surveys yet' from 'no matching surveys'.
- existing listSurveys() updated to parse the new response shape so
callers that don't need pagination still work.
the survey page had a branch gap: if loading failed (API error, network
issue) the error was shown in a floating toast with a dismiss button.
dismissing hid the error but the engine/showWelcome/etc were never
initialized, so none of the other template branches matched → blank
page.
fixes:
- error toast only shows for non-fatal errors (i.e. while the user is
actively answering and a save fails). fatal load errors are handled
by the main if/else chain which shows the error inline with a 'try
again' reload button.
- added a catch-all {:else} branch at the bottom of the template that
shows 'something went wrong' + retry. this makes it impossible for
the page to ever render as completely blank regardless of state.
the 'Loading...' text was nearly invisible and the user reported a blank page. adds a proper animated spinner + 'Loading survey...' text so the loading state is clearly visible while the onMount async completes (fetch survey + WS connect + resume).
when a respondent answered every question but never hit submit (tab crash, browser close, network drop), the resume returns a nextQuestionIndex equal to the total question count. the engine was initialized with an out-of-bounds index, so currentQuestion was undefined, the progress bar rendered but no question card or section header — producing the blank-with-blue-bar state. now if nextQuestionIndex >= questions.length, the loader calls postComplete() automatically since the answers are already saved and there's nothing left to show.
auto-completing was wrong — the user might have been mid-answer on a free-text field when their tab crashed. now we cap the resume index to the last valid question so they land back on their final answer and can review before hitting submit.
the 300ms input debounce I added earlier created a gap: if the user typed within the last 300ms before navigating away, the pending timer hadn't fired yet so the latest value wasn't in the answer buffer when flushBufferSync sent the beacon.
fix: new useDebouncedAnswer() helper that:
- debounces oninput by 300ms (same as before)
- registers a pre-flush hook via svelte context so the survey loader can call it BEFORE flushBufferSync on beforeunload
- flushes on component destroy (normal question-to-question navigation)
all 4 debounced components (text, textarea, email, number) now use the shared helper instead of inline setTimeout/onDestroy boilerplate.
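The core of the debounce-plus-flush behaviour can be sketched without any Svelte wiring. This is an illustration only: `createDebouncedAnswer` is a hypothetical name, and the context registration and onDestroy hookup of the real useDebouncedAnswer() are left out. The returned `flush` is what a pre-flush hook or destroy handler would call.

```typescript
function createDebouncedAnswer<T>(emit: (value: T) => void, delayMs: number = 300) {
  let timer: ReturnType<typeof setTimeout> | undefined;
  let pending: { value: T } | undefined;

  // Emit the latest pending value immediately (if any) and cancel the timer.
  function flush(): void {
    if (timer !== undefined) {
      clearTimeout(timer);
      timer = undefined;
    }
    if (pending !== undefined) {
      const value = pending.value;
      pending = undefined;
      emit(value);
    }
  }

  // Record the latest value and restart the quiet-period timer.
  function input(value: T): void {
    pending = { value };
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(flush, delayMs);
  }

  return { input, flush };
}
```

Calling `flush()` before the beacon is sent guarantees that a keystroke made inside the last 300ms still reaches the answer buffer.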
the resume logic scanned questions in order and returned the first one without an answer. if the user skipped an optional question mid-survey and answered later ones, they'd be sent back to the skipped question on refresh — losing their place entirely. now both the WS handler and the HTTP service find the LAST answered question and resume from the one after it. the client-side loader already clamps out-of-bounds indices so all-answered respondents still land on the final question.
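The resume rule, together with the client-side clamp from the earlier fix, can be sketched as two small functions. The names and the inputs (an ordered list of question ids plus the set of answered ids) are assumptions for illustration.

```typescript
// Server side (sketch): resume after the LAST answered question,
// not at the first unanswered one.
function nextQuestionIndex(questionIds: string[], answered: Set<string>): number {
  let lastAnswered = -1;
  questionIds.forEach((id, index) => {
    if (answered.has(id)) lastAnswered = index;
  });
  return lastAnswered + 1; // may equal questionIds.length when everything is answered
}

// Client side (sketch): cap the index to the last valid question so an
// all-answered respondent lands on their final answer.
function clampResumeIndex(index: number, questionCount: number): number {
  return Math.min(Math.max(0, index), Math.max(0, questionCount - 1));
}
```

With questions q1..q4 and answers for q1 and q3 only, the old first-unanswered scan would resume at q2, while this version resumes at q4.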
SurveyShell's seenSectionIds started empty on every mount, so resuming mid-survey always showed the section interstitial screen before the question — even for sections the user had already passed through. now buildInitialSeenSections() marks every section up to and including the current question's section as 'seen' at mount time. fresh starts (index 0, no answers) still show the first section header normally.
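A possible shape for the seen-sections seeding is shown below. This is a sketch under assumptions: questions carry an optional `sectionId`, and a "fresh start" is detected as index 0 with no answers; the real buildInitialSeenSections() may take different arguments.

```typescript
interface QuestionRef {
  id: string;
  sectionId: string | null; // null for questions outside any section
}

function buildInitialSeenSectionsSketch(
  questions: QuestionRef[],
  currentIndex: number,
  hasAnswers: boolean,
): Set<string> {
  const seen = new Set<string>();
  // Fresh start: leave the set empty so the first section header still shows.
  if (currentIndex === 0 && !hasAnswers) return seen;
  // Resume: mark every section up to and including the current question's section.
  for (let i = 0; i <= currentIndex && i < questions.length; i++) {
    const sectionId = questions[i]?.sectionId;
    if (sectionId) seen.add(sectionId);
  }
  return seen;
}
```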
single shared validation module (shared/answer-validation.ts) imported by both client and server: no duplication, identical rules on both sides.
built-in validation per question type:
- text/textarea: minLength, maxLength, minWords, maxWords, custom regex pattern with custom error message
- email: format check + optional allowed-domains list
- number: valid number, min/max range, integerOnly, step (multiples)
- rating: integer 1..scaleMax
- nps: integer 0..10
- likert: must be one of the 5 valid labels
- radio: must be a defined option, Other requires otherText
- checkbox: valid options, minSelections/maxSelections, Other text
- dropdown: must be a defined option
client: QuestionCard.handleNext() calls validateAnswer instead of the old required-only check. email question now passes raw input so the format error surfaces properly (was swallowing invalid emails as '').
server: ws-handler submit-answers and respondent.service submitBatch both validate every answer before INSERT. invalid answers get a 400 with a descriptive error message.
builder: QuestionEditor gains config sections for all user-settable rules (text minLength/words/pattern, email allowedDomains, number integerOnly/step, checkbox min/maxSelections).
50 unit tests cover every type + edge case.
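To give a feel for the shared rules, here is a sketch of the number and nps checks. Everything in it is illustrative: the result type, the error strings and the function names are assumptions, and the real validateAnswer in shared/answer-validation.ts covers every question type.

```typescript
type ValidationResult = { ok: true } | { ok: false; error: string };

interface NumberRules {
  min?: number;
  max?: number;
  integerOnly?: boolean;
  step?: number; // value must be a multiple of step
}

function fail(error: string): ValidationResult {
  return { ok: false, error };
}

function validateNumberSketch(raw: string, rules: NumberRules = {}): ValidationResult {
  const value = Number(raw);
  // Number("") is 0, so blank input has to be rejected explicitly.
  if (raw.trim() === "" || !isFinite(value)) return fail("Enter a valid number");
  if (rules.integerOnly && Math.floor(value) !== value) return fail("Enter a whole number");
  if (rules.min !== undefined && value < rules.min) return fail(`Must be at least ${rules.min}`);
  if (rules.max !== undefined && value > rules.max) return fail(`Must be at most ${rules.max}`);
  if (rules.step !== undefined && rules.step > 0) {
    // Compare the quotient to its nearest integer to tolerate float error (0.3 / 0.1).
    const quotient = value / rules.step;
    if (Math.abs(quotient - Math.round(quotient)) > 1e-9) return fail(`Must be a multiple of ${rules.step}`);
  }
  return { ok: true };
}

function validateNpsSketch(value: number): ValidationResult {
  if (Math.floor(value) !== value || value < 0 || value > 10) return fail("Score must be an integer from 0 to 10");
  return { ok: true };
}
```

Because the same module runs on client and server, a rule change shows up in the inline error and in the 400 response at the same time.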
moved the validation config into a collapsible section matching the skip logic pattern — both are now chevron toggle buttons sitting side by side in a single row above the question's bottom border. validation toggle only appears for question types that have configurable rules (text, textarea, email, number, checkbox).
No description provided.