enh(Foundation): optimize numeric conversions with digit-pairs algorithm and std::to_chars/from_chars #5301

Open
matejk wants to merge 16 commits into main from 5292-number-formatter-perf

Contributor

@matejk matejk commented Apr 6, 2026

Summary

Rewrites integer and floating-point conversion in NumericString.h/.cpp for
significantly improved performance, using modern C++17/C++20 techniques while
maintaining full API compatibility.

Integer formatting (intToStr)

Based on the implementation suggested by @aleks-f,
replaces the old division-based loop with the two-digits-at-a-time algorithm
using a 200-byte digit-pairs lookup table (as used by {fmt}, jemalloc, Facebook Folly).
Shift-and-mask for hex and power-of-2 bases. Writes directly into the result buffer
in reverse, then reverses in-place -- no intermediate buffer copy.
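As a rough illustration of the technique described above, here is a minimal sketch of two-digits-at-a-time decimal formatting with a 200-byte lookup table, plus shift-and-mask for hex. The names `writeDecimal`, `writeHex` and `kDigitPairs` are hypothetical and this is not the PR's `intToStr`; it only assumes an unsigned 64-bit input and a caller-supplied buffer, and it omits sign, width, prefix and thousand-separator handling.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// 200-byte table: entry i (0..99) holds the two ASCII digits of i.
static const char kDigitPairs[] =
    "00010203040506070809"
    "10111213141516171819"
    "20212223242526272829"
    "30313233343536373839"
    "40414243444546474849"
    "50515253545556575859"
    "60616263646566676869"
    "70717273747576777879"
    "80818283848586878889"
    "90919293949596979899";

// Hypothetical helper: writes `value` in decimal into buf[0..capacity) and
// returns the number of characters written, or 0 if the buffer is too small.
// No NUL terminator is written.
inline std::size_t writeDecimal(std::uint64_t value, char* buf, std::size_t capacity)
{
    assert(buf != nullptr);
    std::size_t n = 0;
    // One division per two digits; pairs are emitted least significant first.
    while (value >= 100)
    {
        if (n + 2 > capacity) return 0;
        const unsigned idx = static_cast<unsigned>(value % 100) * 2;
        value /= 100;
        buf[n++] = kDigitPairs[idx + 1]; // ones digit
        buf[n++] = kDigitPairs[idx];     // tens digit
    }
    if (value >= 10)
    {
        if (n + 2 > capacity) return 0;
        const unsigned idx = static_cast<unsigned>(value) * 2;
        buf[n++] = kDigitPairs[idx + 1];
        buf[n++] = kDigitPairs[idx];
    }
    else
    {
        if (n + 1 > capacity) return 0;
        buf[n++] = static_cast<char>('0' + value);
    }
    std::reverse(buf, buf + n); // digits were written in reverse order
    return n;
}

// Hypothetical helper: base 16 via shift-and-mask, no division needed.
inline std::size_t writeHex(std::uint64_t value, char* buf, std::size_t capacity)
{
    assert(buf != nullptr);
    static const char digits[] = "0123456789ABCDEF";
    std::size_t n = 0;
    do
    {
        if (n == capacity) return 0;
        buf[n++] = digits[value & 0xF];
        value >>= 4;
    } while (value != 0);
    std::reverse(buf, buf + n);
    return n;
}
```

Both helpers write straight into the caller's buffer and reverse in place, so no intermediate buffer is needed; the capacity checks mirror the idea of treating the size parameter as a bound.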

Integer parsing (strToInt)

Replaced the manual digit loop with std::from_chars (C++17). Range-based
(begin, end) core overload avoids strlen. Thousand-separator slow path only
triggers when from_chars stops at a non-digit character.
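The fast-path/slow-path split described above could look roughly like the following sketch. `parseDecimal` is a hypothetical helper, not the PR's `strToInt`; it assumes base 10 only, treats `thSep == 0` as "no separator configured", and does not skip whitespace or validate separator positions.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string>
#include <system_error>

// Hypothetical helper: parses the whole range [begin, end) as a base-10
// integer. Separators are stripped and the parse retried only when the
// fast path stops early on the configured thousand separator.
template <typename Int>
bool parseDecimal(const char* begin, const char* end, Int& result, char thSep = 0)
{
    assert(begin <= end);
    if (begin == end) return false;

    Int value{};
    // Fast path: a single from_chars call over a known range, no strlen.
    auto [ptr, ec] = std::from_chars(begin, end, value, 10);
    if (ec == std::errc() && ptr == end)
    {
        result = value;
        return true;
    }
    // Overflow, no digits at all, or stopped on something other than thSep.
    if (ec != std::errc() || thSep == 0 || *ptr != thSep) return false;

    // Slow path: copy without separators, then parse again.
    std::string cleaned;
    cleaned.reserve(static_cast<std::size_t>(end - begin));
    for (const char* p = begin; p != end; ++p)
    {
        if (*p != thSep) cleaned.push_back(*p);
    }
    const char* cb = cleaned.data();
    const char* ce = cb + cleaned.size();
    auto [ptr2, ec2] = std::from_chars(cb, ce, value, 10);
    if (ec2 != std::errc() || ptr2 != ce) return false;
    result = value;
    return true;
}
```

Inputs without separators never allocate; only strings that actually contain the separator pay for the copy.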

Float/double formatting (floatToStr, doubleToStr)

When POCO_HAS_FLOAT_CHARCONV is defined (GCC 11+, MSVC 19.24+, non-Apple
libc++ 20+, or Apple Clang with macOS 26.0+ deployment target), uses
std::to_chars for shortest representation and std::to_chars(fixed) + adjustPrecision for fixed-precision formatting. Falls back to bundled
double-conversion library on older compilers.
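For readers unfamiliar with the two `std::to_chars` modes mentioned above, here is a small sketch. The helper names `shortest` and `fixedPrecision` are hypothetical, and the code assumes a standard library with floating-point `std::to_chars`. For brevity the fixed-precision helper uses the `(fixed, precision)` overload; the PR instead calls `std::to_chars(fixed)` and applies its own `adjustPrecision` step, which is not reproduced here.

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Hypothetical helper: shortest representation that round-trips.
inline std::string shortest(double value)
{
    char buf[64];
    const auto res = std::to_chars(buf, buf + sizeof(buf), value);
    assert(res.ec == std::errc()); // 64 bytes is enough for any shortest double
    return std::string(buf, res.ptr);
}

// Hypothetical helper: fixed notation with `precision` digits after the
// decimal point. Returns an empty string if the buffer is too small.
inline std::string fixedPrecision(double value, int precision)
{
    char buf[512];
    const auto res = std::to_chars(buf, buf + sizeof(buf), value,
                                   std::chars_format::fixed, precision);
    if (res.ec != std::errc()) return std::string();
    return std::string(buf, res.ptr);
}
```

Neither call depends on the global locale, which is one reason the charconv functions are attractive for this kind of code.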

To enable on Apple Clang, set the deployment target:
```bash
cmake -B build -DCMAKE_OSX_DEPLOYMENT_TARGET=26.0
```

Caveat: std::to_chars(fixed) + manual precision adjustment is slower than
double-conversion's integrated ToFixed for fixed-precision formatting on macOS
Apple Clang (0.6x for format(double, prec=2)). This is a tradeoff: the
std::to_chars path eliminates ~6800 lines of bundled double-conversion dependency
when the feature is available. As stdlib implementations improve, this gap should
close. The double-conversion fallback remains available for older compilers where
it is the only option.

Float/double parsing (strToFloat, strToDouble)

When POCO_HAS_FLOAT_CHARCONV is defined, uses std::from_chars with manual
inf/nan handling (matching double-conversion's exact-match semantics).
Falls back to strtod for out-of-range values (extreme underflow/overflow)
where from_chars reports errc::result_out_of_range.
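A minimal sketch of the parsing strategy described above, under stated assumptions: `parseDouble` is a hypothetical helper (not the PR's `strToDouble`), the accepted special-value spellings are assumed to be exactly `inf`, `-inf` and `nan`, and the `strtod` fallback assumes the "C" locale.

```cpp
#include <cassert>
#include <charconv>
#include <cmath>
#include <cstdlib>
#include <limits>
#include <string>
#include <system_error>

// Hypothetical helper: parses the whole string as a double.
inline bool parseDouble(const std::string& s, double& result)
{
    if (s.empty()) return false;

    // Exact-match special values (assumed spellings).
    if (s == "inf")  { result = std::numeric_limits<double>::infinity();  return true; }
    if (s == "-inf") { result = -std::numeric_limits<double>::infinity(); return true; }
    if (s == "nan")  { result = std::numeric_limits<double>::quiet_NaN(); return true; }

    const char* begin = s.data();
    const char* end = begin + s.size();
    double value = 0.0;
    const auto res = std::from_chars(begin, end, value);
    if (res.ptr != end) return false; // trailing garbage or nothing parsed

    if (res.ec == std::errc::result_out_of_range)
    {
        // from_chars may leave `value` unmodified on range errors, so let
        // strtod produce the clamped result (zero or +/-HUGE_VAL).
        char* stop = nullptr;
        value = std::strtod(s.c_str(), &stop);
        if (stop != s.c_str() + s.size()) return false;
        result = value;
        return true;
    }
    if (res.ec != std::errc()) return false;
    // from_chars also accepts spellings such as "INF" or "nan(...)";
    // reject them so only the exact matches above succeed.
    if (!std::isfinite(value)) return false;
    result = value;
    return true;
}
```

With this shape, an input such as `123e-10000000` still yields a usable value (effectively zero) instead of a parse failure.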

Other changes

  • POCO_HAS_FLOAT_CHARCONV feature detection macro in Config.h
  • Float/double functions in NumericString.h remain Foundation_API exported
    (required by inline NumberFormatter methods compiled into other libraries)
  • if constexpr in isIntOverflow and safeMultiply (C++17 consistency)
  • [[nodiscard]] on strToInt, intToStr, safeMultiply
  • intToStr uses size parameter as buffer capacity for bounds checking
  • poco_bugcheck() in unreachable switch default
  • Removed obsolete MSVC #pragma warning(disable: 4146)
  • Moved <cmath> from header to .cpp, removed unused <memory>
  • tryParseHex/tryParseHex64 now skip leading whitespace before 0x prefix
  • Benchmarks rewritten to use NumberFormatter/NumberParser public API with
    baselines (std::to_string, snprintf, strtol, strtod, sscanf) and round-trip
    correctness validation

Performance (5M random values, macOS Apple Clang arm64 / Linux GCC arm64)

NumberFormatter -- Integer

| Operation | macOS speedup | Linux speedup |
|---|---|---|
| format(small 0-999) | 1.2x | 1.2x |
| format(int) | 1.7x | 2.3x |
| format(int, w=15) | 1.5x | 1.9x |
| format(-int) | 1.5x | 1.9x |
| formatHex(int) | 1.4x | 1.7x |
| format(Int64) | 2.7x | 1.7x |
| format(UInt64) | 2.3x | 1.6x |
| formatHex(Int64) | 2.3x | 1.4x |

NumberFormatter -- Float/Double

| Operation | macOS speedup | Linux speedup |
|---|---|---|
| format(float) shortest | 1.8x | 2.5x |
| format(double) shortest | ~same | ~same |
| format(double, prec=2) | 0.6x (regression) | 0.9x |
| format(double, w=15, prec=4) | ~same | ~same |

NumberParser -- Integer

| Operation | macOS speedup | Linux speedup |
|---|---|---|
| parse(uint dec) | 1.4x | 2.2x |
| parseHex(uint) | 1.2x | 3.7x |
| parseUnsigned64 | 1.9x | 2.8x |

NumberParser -- Float/Double

| Operation | macOS speedup | Linux speedup |
|---|---|---|
| tryParseFloat(double) | 1.2x | 1.5x |

vs standard library

| vs snprintf | Integer | Float |
|---|---|---|
| macOS | 2.5-4x faster | 2-3x faster |
| Linux | 2.8-4.3x faster | 2.5-4.3x faster |

| vs sscanf | Integer | Float |
|---|---|---|
| macOS | 3-5x faster | 1.9x faster |
| Linux | 3-5x faster | 2.7x faster |

Known regressions

  • format(double, prec=2) on macOS: 0.6x vs main. Root cause: std::to_chars(fixed) +
    adjustPrecision is slower than double-conversion's ToFixed. This is the cost of
    eliminating the double-conversion dependency when POCO_HAS_FLOAT_CHARCONV is available.
    On compilers without charconv support, the double-conversion path is unchanged.

Test plan

  • Foundation tests: 946 pass on macOS C++20, macOS C++17 (fallback), Linux GCC
  • JSON tests: 48 pass on macOS and Linux (including extreme underflow 123e-10000000)
  • Benchmarks on macOS arm64 and Linux arm64 with 5M random values
  • Round-trip correctness validation (format -> parse, parse -> strtol/strtod)

@matejk matejk added this to the Release 1.15.2 milestone Apr 6, 2026
@matejk matejk force-pushed the 5292-number-formatter-perf branch from 1ae9e17 to 2c8cfd7 Compare April 6, 2026 17:16
@matejk matejk marked this pull request as draft April 6, 2026 18:18
Comment thread Foundation/include/Poco/NumericString.h Fixed
Member

@aleks-f aleks-f left a comment

given the tradeoff, it seems that the best approach is hybrid - for <1000 the old way, >1000 the new one. This is essentially the same pattern used by small string optimization or SpinlockMutex

@matejk matejk changed the title from "enh(Foundation): optimize NumberFormatter and NumberParser with std::to_chars/from_chars" to "enh(Foundation): optimize numeric conversions with digit-pairs algorithm and std::to_chars/from_chars" Apr 7, 2026
Contributor Author

matejk commented Apr 7, 2026

POCO Numeric Conversion Benchmark Report

Comparison of main branch vs numeric-conversion-performance-improvements branch.

  • 5 million random values per test, fixed seed for reproducibility
  • macOS: Apple Clang, arm64, RelWithDebInfo, deployment target 26.0
  • Linux: GCC (OrbStack), arm64, RelWithDebInfo

NumberFormatter — Integer Formatting

| Operation | macOS main (ms) | macOS current (ms) | macOS speedup | Linux main (ms) | Linux current (ms) | Linux speedup |
|---|---|---|---|---|---|---|
| format(small 0-999) | 53.8 | 45.4 | 1.2x | 66.4 | 53.7 | 1.2x |
| formatHex(small) | 57.3 | 48.7 | 1.2x | 70.3 | 63.0 | 1.1x |
| format(int) | 165.2 | 99.8 | 1.7x | 218.6 | 95.1 | 2.3x |
| format(int, w=15) | 188.1 | 128.0 | 1.5x | 200.5 | 107.8 | 1.9x |
| format0(int, w=15) | 195.9 | 133.3 | 1.5x | 213.9 | 115.5 | 1.9x |
| format(-int) | 135.9 | 93.5 | 1.5x | 162.7 | 84.6 | 1.9x |
| formatHex(int) | 73.3 | 51.6 | 1.4x | 110.9 | 65.8 | 1.7x |
| formatHex(int, w=10) | 76.2 | 62.3 | 1.2x | 100.8 | 73.6 | 1.4x |
| formatHex(int, prefix) | 86.4 | 65.0 | 1.3x | 101.6 | 74.7 | 1.4x |
| format(unsigned) | 110.5 | 70.4 | 1.6x | 122.8 | 74.0 | 1.7x |
| formatHex(unsigned) | 75.8 | 82.7 | 0.9x | 85.5 | 66.1 | 1.3x |
| format(Int64) | 332.0 | 121.3 | 2.7x | 392.5 | 230.1 | 1.7x |
| format(UInt64) | 318.5 | 136.1 | 2.3x | 392.4 | 243.1 | 1.6x |
| formatHex(Int64) | 174.7 | 76.1 | 2.3x | 258.0 | 189.9 | 1.4x |
| formatHex(UInt64, w=20) | 203.4 | 106.6 | 1.9x | 294.5 | 211.0 | 1.4x |

vs standard library baselines (current branch)

| Operation | macOS POCO | macOS snprintf | macOS to_string | Linux POCO | Linux snprintf | Linux to_string |
|---|---|---|---|---|---|---|
| format(int) | 99.8 | 276.7 | 70.0 | 95.1 | 301.9 | 70.3 |
| format(int, w=15) | 128.0 | 325.4 |  | 107.8 | 328.7 |  |
| format0(int, w=15) | 133.3 | 332.4 |  | 115.5 | 321.5 |  |
| formatHex(int) | 51.6 | 204.7 |  | 65.8 | 237.2 |  |

POCO is 2.5-4x faster than snprintf for all integer formatting options.
std::to_string is faster for plain decimal (no width/hex options available).

NumberFormatter — Float/Double Formatting

| Operation | macOS main (ms) | macOS current (ms) | macOS speedup | Linux main (ms) | Linux current (ms) | Linux speedup |
|---|---|---|---|---|---|---|
| format(double) | 314.2 | 324.8 | ~same | 437.0 | 444.0 | ~same |
| format(double, prec=2) | 204.5 | 322.5 | 0.6x | 334.3 | 385.9 | 0.9x |
| format(double, prec=6) | 281.4 | 296.9 | ~same | 383.6 | 375.2 | ~same |
| format(double, w=15, prec=4) | 360.3 | 366.2 | ~same | 328.4 | 341.4 | ~same |
| format(float) | 514.2 | 280.4 | 1.8x | 975.4 | 387.4 | 2.5x |
| format(float, prec=2) | 202.2 | 224.6 | 0.9x | 331.9 | 329.4 | ~same |

vs standard library baselines (current branch)

| Operation | macOS POCO | macOS snprintf | macOS to_string | Linux POCO | Linux snprintf | Linux to_string |
|---|---|---|---|---|---|---|
| format(double) | 324.8 | 770.6 | 992.7 | 444.0 | 1115.2 | 1524.3 |
| format(double, prec=2) | 322.5 | 955.1 |  | 385.9 | 1344.5 |  |
| format(double, w=15, prec=4) | 366.2 | 934.8 |  | 341.4 | 1466.7 |  |
| format(float) | 280.4 | 677.1 |  | 387.4 | 1134.3 |  |

POCO is 2-4x faster than snprintf for all float formatting options.
format(float) shortest representation improved 1.8-2.5x vs main (std::to_chars).
format(double, prec=2) regressed on macOS (0.6x) — std::to_chars(fixed) + adjustPrecision
is slower than double-conversion's integrated ToFixed for this case.

NumberParser — Integer Parsing

| Operation | macOS main (ms) | macOS current (ms) | macOS speedup | Linux main (ms) | Linux current (ms) | Linux speedup |
|---|---|---|---|---|---|---|
| parse(small dec) | 82.0 | 73.0 | 1.1x | 65.7 | 48.9 | 1.3x |
| parse(uint dec) | 155.7 | 110.1 | 1.4x | 163.1 | 74.5 | 2.2x |
| parse(int neg) | 179.2 | 149.6 | 1.2x | 153.1 | 103.6 | 1.5x |
| parseHex(uint) | 229.7 | 185.1 | 1.2x | 222.6 | 60.1 | 3.7x |
| parseHex(0x-prefixed) | 242.0 | 170.5 | 1.4x | 236.5 | 73.5 | 3.2x |
| parseUnsigned64(uint64) | 319.2 | 172.3 | 1.9x | 329.0 | 118.4 | 2.8x |

vs standard library baselines (current branch)

| Operation | macOS POCO | macOS strtol | macOS sscanf | Linux POCO | Linux strtol | Linux sscanf |
|---|---|---|---|---|---|---|
| parse(uint dec) | 110.1 | 78.3 | 386.4 | 74.5 | 88.1 | 316.8 |
| parseHex(uint) | 185.1 | 183.7 | 575.3 | 60.1 | 183.0 | 390.8 |
| parseUnsigned64 | 172.3 | 133.8 |  | 118.4 | 151.6 |  |

POCO matches or beats strtol on most operations. 3-5x faster than sscanf.
Linux hex parsing is 3x faster than strtoul thanks to std::from_chars.

NumberParser — Float/Double Parsing

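| Operation | macOS main (ms) | macOS current (ms) | macOS speedup | Linux main (ms) | Linux current (ms) | Linux speedup |
|---|---|---|---|---|---|---|
| tryParseFloat(float) | 289.0 | 257.1 | 1.1x | 323.0 | 224.7 | 1.4x |
| tryParseFloat(double) | 287.5 | 249.4 | 1.2x | 333.3 | 225.5 | 1.5x |
| tryParseFloat(fixed) | 297.1 | 244.5 | 1.2x | 343.9 | 224.5 | 1.5x |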

vs standard library baselines (current branch)

| Operation | macOS POCO | macOS strtod | macOS sscanf | Linux POCO | Linux strtod | Linux sscanf |
|---|---|---|---|---|---|---|
| tryParseFloat(double) | 249.4 | 128.9 | 478.6 | 225.5 | 326.6 | 611.3 |

macOS: POCO is 1.9x slower than raw strtod (string copy + separator stripping overhead),
but 1.9x faster than sscanf.
Linux: POCO is 1.4x faster than strtod and 2.7x faster than sscanf.

Summary

Improvements vs main

| Category | macOS | Linux |
|---|---|---|
| Integer formatting (int) | 1.5-1.7x faster | 1.7-2.3x faster |
| Integer formatting (Int64) | 2.3-2.7x faster | 1.4-1.7x faster |
| Float format(float) shortest | 1.8x faster | 2.5x faster |
| Integer parsing | 1.1-1.9x faster | 1.3-3.7x faster |
| Float parsing | 1.1-1.2x faster | 1.4-1.5x faster |

Regressions vs main

| Category | macOS | Linux | Cause |
|---|---|---|---|
| format(double, prec=2) | 0.6x (322 vs 205 ms) | 0.9x (386 vs 334 ms) | std::to_chars(fixed) + adjustPrecision slower than double-conversion's integrated ToFixed |
| format(float, prec=2) | 0.9x (225 vs 202 ms) | ~same (329 vs 332 ms) | Same as above, less pronounced for float |
| formatHex(unsigned) macOS | 0.9x (83 vs 76 ms) |  | Noise or minor overhead from reverse-based hex vs old forward-write |

Slower than standard library baselines

| Operation | macOS POCO | macOS baseline | macOS ratio | Linux POCO | Linux baseline | Linux ratio |
|---|---|---|---|---|---|---|
| format(int) vs to_string | 99.8 | 70.0 | 0.7x | 95.1 | 70.3 | 0.7x |
| format(unsigned) vs to_string | 70.4 | 44.2 | 0.6x | 74.0 | 58.8 | 0.8x |
| format(Int64) vs to_string | 121.3 | 79.7 | 0.7x | 230.1 | 120.5 | 0.5x |
| tryParseFloat vs strtod | 249.4 | 128.9 | 0.5x |  |  |  |
| parse(small dec) vs strtol | 73.0 | 44.1 | 0.6x | 48.9 | 41.4 | 0.8x |
| parse(uint dec) vs strtol | 110.1 | 78.3 | 0.7x |  |  |  |

Root causes:

  • format(int) vs std::to_string: std::to_string writes directly into a string
    with no sign/prefix/width handling. POCO writes into a char buffer, reverses, then
    copies to std::string — the extra copy + reversal adds ~30-40% overhead. This is
    inherent to the richer API (width, fill, prefix, hex, thSep support).
  • tryParseFloat vs strtod (macOS): POCO's tryParseFloat(string) copies the input
    string, strips thousand separators and 'f' suffix, replaces decimal separator, then
    calls std::from_chars. The raw strtod operates directly on the C string with no
    preprocessing. On Linux, GCC's std::from_chars is fast enough to offset this overhead.
  • parse(small dec) vs strtol: NumberParser::parse dispatches through tryParse →
    strToInt(string) → whitespace skip → strToInt(range) → std::from_chars. The
    function call chain adds overhead vs a direct strtol call. For small numbers where
    the parsing itself is trivial, this overhead dominates.

vs standard library (current branch)

| Comparison | Integer formatting | Float formatting | Integer parsing | Float parsing |
|---|---|---|---|---|
| vs snprintf | 2.5-4.3x faster | 2-4.3x faster |  |  |
| vs std::to_string | 1.1-1.4x slower | 3-3.4x faster |  |  |
| vs strtol/strtod |  |  | ~same to 3.7x faster | 1.9x slower (macOS) to 1.4x faster (Linux) |
| vs sscanf |  |  | 3-5x faster | 1.9-2.7x faster |

Optimization opportunities

format(double, prec=2) regression (0.6x on macOS):
The std::to_chars(fixed) + adjustPrecision path is slower than double-conversion's
integrated ToFixed for fixed-precision formatting. Options:

  1. Fall back to double-conversion for FixedStr — restores performance but requires
    always compiling double-conversion even when charconv is available.
  2. Accept the tradeoff — std::to_chars path eliminates ~6800 lines of bundled
    dependency when POCO_HAS_FLOAT_CHARCONV is defined.
  3. Wait for stdlib improvements — Apple Clang's std::to_chars(fixed, precision) may
    improve in future releases.

tryParseFloat vs strtod (1.9x slower on macOS):
The overhead is in string preprocessing (copy, trim, separator stripping). Options:

  1. Add a fast path in tryParseFloat(string) that skips preprocessing when decSep='.'
    and thSep=',' and the string contains no separators (the common case) — check for
    separator characters before copying.
  2. Provide a tryParseFloat(const char*, double&) overload that skips all preprocessing
    for callers who know their input is clean.
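The first option above (check for separator characters before copying) might be sketched as a cheap pre-check like the one below. `needsPreprocessing` is a hypothetical helper, not part of POCO; it assumes the preprocessing steps are exactly those listed in this report (trimming, thousand-separator stripping, 'f' suffix removal, decimal-separator replacement) and deliberately errs on the side of the slow path.

```cpp
#include <cassert>
#include <cctype>
#include <string_view>

// Hypothetical pre-check: returns true when the input might need the
// copy-and-clean slow path, false when it can be handed to from_chars as-is.
inline bool needsPreprocessing(std::string_view s, char decSep = '.', char thSep = ',')
{
    assert(decSep != thSep);
    if (s.empty()) return false;

    const auto isSpace = [](char c) {
        return std::isspace(static_cast<unsigned char>(c)) != 0;
    };
    if (isSpace(s.front()) || isSpace(s.back())) return true; // needs trimming
    if (s.back() == 'f') return true;                         // possible 'f' suffix
    if (decSep != '.') return true;                           // non-default decimal separator
    if (thSep != 0 && s.find(thSep) != std::string_view::npos) return true;
    return false;
}
```

A caller would copy and clean the string only when this returns true; note that an input such as "inf" also ends in 'f' and therefore takes the slow path, which is conservative but safe.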

format(int) vs std::to_string (0.7x the speed of std::to_string):
The overhead is from writing to a char buffer, reversing, then copying to std::string.
Not actionable without removing width/padding/hex/prefix support. Users who need raw
speed for plain decimal can use std::to_string directly.

parse(small dec) vs strtol (0.6-0.8x the speed of strtol):
The overhead is the dispatch chain (tryParse → strToInt(string) → strToInt(range) →
from_chars). Not actionable without flattening the API. The overhead is constant
per call and becomes negligible for larger numbers.

Techniques used

  • Integer formatting: two-digits-at-a-time with digit-pairs lookup table,
    shift-and-mask for hex/power-of-2 bases
  • Integer parsing: std::from_chars (C++17)
  • Float shortest: std::to_chars default representation (when POCO_HAS_FLOAT_CHARCONV),
    double-conversion fallback for older compilers
  • Float fixed-precision: std::to_chars(fixed) + manual adjustPrecision with
    round-half-up rounding
  • Float parsing: std::from_chars (when POCO_HAS_FLOAT_CHARCONV),
    double-conversion fallback

@matejk matejk force-pushed the 5292-number-formatter-perf branch from e10d9e8 to 4d7619f Compare April 7, 2026 17:31
Contributor Author

matejk commented Apr 7, 2026

> given the tradeoff, it seems that the best approach is hybrid - for <1000 the old way, >1000 the new one. This is essentially the same pattern used by small string optimization or SpinlockMutex

I pushed an (almost) completely rewritten formatting/parsing implementation. The benchmarks also indicate that in many cases the code does use (or it would make sense to use) the standard library-provided implementations.

Comment thread Foundation/testsuite/src/StringTest.cpp Fixed
@matejk matejk force-pushed the 5292-number-formatter-perf branch from 1991226 to 64678a7 Compare April 7, 2026 20:44
@matejk matejk requested a review from aleks-f April 8, 2026 08:47
@aleks-f aleks-f marked this pull request as ready for review April 8, 2026 08:52
@matejk matejk modified the milestones: Release 1.15.2, Release 1.16.0 Apr 8, 2026
@matejk matejk force-pushed the 5292-number-formatter-perf branch from 605bd9e to 8b97c51 Compare April 13, 2026 07:28
@aleks-f aleks-f requested a review from Copilot April 13, 2026 14:05
Contributor

Copilot AI left a comment

Pull request overview

This PR modernizes and speeds up Foundation numeric string conversions by rewriting NumericString integer/float parsing/formatting to use a digit-pairs algorithm and std::to_chars/from_chars when available, while updating NumberFormatter/NumberParser to use the new internal APIs and refreshing conversion benchmarks/tests.

Changes:

  • Reworked integer formatting/parsing (intToStr/strToInt) with digit-pairs + std::from_chars fast path and a thousand-separator slow path.
  • Added float/double std::to_chars/from_chars implementations behind POCO_HAS_FLOAT_CHARCONV with double-conversion fallback.
  • Updated NumberFormatter/NumberParser call sites and rewrote conversion benchmarks/tests accordingly.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 9 comments.

| File | Description |
|---|---|
| Foundation/testsuite/src/StringTest.h | Removes direct NumericString/stream parsing helpers; adds benchmarkIntToStr() declaration. |
| Foundation/testsuite/src/StringTest.cpp | Rewrites conversion benchmarks to use NumberFormatter/NumberParser APIs and adds validation/baselines. |
| Foundation/testsuite/src/NumberParserTest.cpp | Adjusts whitespace-related expectation in parse error tests. |
| Foundation/src/NumericString.cpp | Implements charconv-based float/double conversions and refactors padding/format helpers. |
| Foundation/src/NumberParser.cpp | Routes parsing through updated strToInt overloads (including range-based hex paths). |
| Foundation/src/NumberFormatter.cpp | Checks intToStr return value before appending to output strings. |
| Foundation/include/Poco/NumericString.h | Makes header “internal-only”, rewrites strToInt/intToStr, and refactors helper templates/constants. |
| Foundation/include/Poco/NumberFormatter.h | Includes NumericString.h via private-include guard and handles intToStr failure. |
| Foundation/include/Poco/Config.h | Adds POCO_HAS_FLOAT_CHARCONV feature detection for float to_chars/from_chars. |

Comment thread Foundation/include/Poco/NumericString.h
Comment thread Foundation/include/Poco/NumericString.h Outdated
Comment thread Foundation/testsuite/src/StringTest.cpp Outdated
Comment thread Foundation/testsuite/src/StringTest.cpp Outdated
Comment thread Foundation/include/Poco/NumericString.h Outdated
Comment thread Foundation/src/NumberParser.cpp Outdated
Comment thread Foundation/src/NumberParser.cpp Outdated
Comment thread Foundation/testsuite/src/StringTest.cpp Outdated
Comment thread Foundation/include/Poco/NumericString.h
Contributor Author

matejk commented Apr 13, 2026

Response to review comments

POCO_NUMERIC_STRING_PRIVATE #error guard: This is intentional. NumericString.h has always been an internal implementation detail -- the functions are called only by NumberFormatter and NumberParser. The #error prevents accidental direct inclusion by downstream code that should use the public API (NumberFormatter.h / NumberParser.h) instead. This is not an API break because the header was never part of the public API surface.

isSafeIntCast<F, T> template parameter order: Pre-existing design, not changed by this PR. Both isSafeIntCast<F, T> and isIntOverflow<To, From> require explicit template arguments. Swapping the order would break existing callers.
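To illustrate why template parameter order matters at call sites, here is a small sketch using hypothetical stand-ins (`fitsIn`, `fitsInSourceFirst`); these are not POCO's `isSafeIntCast` or `isIntOverflow`, whose bodies are not shown in this thread. With the target type first, the source type can be deduced from the argument; with the source type first, both must be spelled out. The sketch assumes non-bool integral types.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Target type first: From is deduced from the argument.
template <typename To, typename From>
constexpr bool fitsIn(From value)
{
    static_assert(std::is_integral_v<To> && std::is_integral_v<From>,
                  "integral types only");
    if constexpr (std::is_signed_v<From> == std::is_signed_v<To>)
    {
        // Same signedness: a plain range comparison is safe.
        return value >= std::numeric_limits<To>::min()
            && value <= std::numeric_limits<To>::max();
    }
    else if constexpr (std::is_signed_v<From>)
    {
        // signed -> unsigned: reject negatives, then compare as unsigned.
        return value >= 0
            && static_cast<std::make_unsigned_t<From>>(value)
                   <= std::numeric_limits<To>::max();
    }
    else
    {
        // unsigned -> signed: compare against To's max as an unsigned value.
        return value <= static_cast<std::make_unsigned_t<To>>(
                            std::numeric_limits<To>::max());
    }
}

// Source type first: nothing useful can be deduced, so callers must write
// both template arguments explicitly.
template <typename From, typename To>
constexpr bool fitsInSourceFirst(From value)
{
    return fitsIn<To, From>(value);
}
```

A call then reads `fitsIn<std::int8_t>(300)` in the first form versus `fitsInSourceFirst<int, std::int8_t>(300)` in the second; changing the order of an existing template is a source-breaking change for callers that pass both arguments explicitly.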

Foundation_API on float/double functions: Restored. These must remain exported because inline NumberFormatter methods (compiled into other libraries like JSON, Net, Data) call them.

All other review findings have been addressed in the latest push.

@matejk matejk force-pushed the 5292-number-formatter-perf branch 2 times, most recently from 41695b7 to ddff866 Compare April 13, 2026 17:51
Member

aleks-f commented Apr 13, 2026

> Response to review comments
>
> POCO_NUMERIC_STRING_PRIVATE #error guard: This is intentional. NumericString.h has always been an internal implementation detail -- the functions are called only by NumberFormatter and NumberParser. The #error prevents accidental direct inclusion by downstream code that should use the public API (NumberFormatter.h / NumberParser.h) instead. This is not an API break because the header was never part of the public API surface.

I'm not sure what is considered "public API surface", but these functions certainly were visible and accessible to users so far. We use strToInt directly and there may be other users using it.

Perhaps it could have been better thought initially, but at this point I don't see a rationale for hiding it like this - if someone wants to use it directly, there is no harm.

Contributor Author

matejk commented Apr 13, 2026

> I'm not sure what is considered "public API surface", but these functions certainly were visible and accessible to users so far. We use strToInt directly and there may be other users using it.
>
> Perhaps it could have been better thought initially, but at this point I don't see a rationale for hiding it like this - if someone wants to use it directly, there is no harm.

Mark as deprecated if used from outside the library?

Member

aleks-f commented Apr 13, 2026

> > I'm not sure what is considered "public API surface", but these functions certainly were visible and accessible to users so far. We use strToInt directly and there may be other users using it.
> > Perhaps it could have been better thought initially, but at this point I don't see a rationale for hiding it like this - if someone wants to use it directly, there is no harm.
>
> Mark as deprecated if used from outside the library?

I wouldn't. There are some utility functions there that may come handy, like safeMultiply, locale etc. Maybe we could even add string param to decimal and thousand separator functions, defaulting to empty for global locale. We needed that recently and I forgot it was there and recommended the std atrocity.

Contributor Author

matejk commented Apr 14, 2026

Removed private guard.

@matejk matejk force-pushed the 5292-number-formatter-perf branch from a50cef0 to e420b4d Compare April 16, 2026 20:33
matejk and others added 16 commits April 20, 2026 20:53
…hars

Rewrite intToStr as inline template in NumericString.h with two paths:
- Fast path (no padding/prefix/thSep): converts directly into the
  result buffer using std::to_chars (decimal) or shift-and-mask (hex).
  Fully inlined, no PLT calls.
- Slow path: converts to temp buffer then calls formatIntResult() in
  Foundation for padding, prefix, case, and thousand separators.

Move NumberFormatter::format/formatHex back to inline in the header
so they benefit from the inlined intToStr fast path.

Make NumericString.h a private header (include guard enforced via
POCO_NUMERIC_STRING_PRIVATE). Add NumberFormatter/intToStr benchmark.
…d tryParseFloat allocation

- isSafeIntCast: change return type from T& to bool (was UB)
- strToInt: reject bare "+" or "-" without digits (returned true with 0)
- tryParseFloat: pass std::string directly to strToDouble instead of
  s.c_str() which caused needless const char* -> std::string conversion
Use std::from_chars as the primary parsing path for strToInt. For
inputs without thousand separators (the common case), from_chars
handles sign, overflow, and all bases directly — no manual per-digit
loop needed.

When from_chars stops early on a non-digit character and a thousand
separator is configured (base 10 only), strip separator characters
and retry from_chars on the cleaned string.

Also fix: use static_cast<unsigned char> for std::isspace to avoid
UB on signed char with values > 0x7F.
Test specific formatting scenarios (padding, prefix, hex, octal,
thousand separator, binary, edge values) across int/uint/int64/uint64,
plus NumberFormatter throughput with random data.
Change all NumberParser::tryParse* calls from strToInt(s.c_str(), ...)
to strToInt(s, ...) or strToInt(begin, end, ...) to use the known
string length directly, avoiding a redundant strlen scan.
…numbers

Remove per-case constant-value intToStr benchmark (compiler can
constant-fold, not representative of real usage). Add random small
numbers (0-999) to both intToStr and strToInt benchmarks. Use
std::string overload in strToInt benchmarks to avoid strlen.
Replace std::to_chars-based intToStr with a direct implementation using
the "digit pairs" technique: extract two decimal digits per iteration
via division by 100 and a 200-byte lookup table, halving the number of
expensive integer divisions.

- Base 10: two-digits-at-a-time with integrated thousand separator support
- Power-of-2 bases (2, 4, 8, 16): shift-and-mask, no division
- Other bases (3, 5, 6, 7, 9, 11-15): division with digit table lookup
- Write directly into result buffer, reverse in-place (no temp buffer)

Benchmarks (1M random values, macOS Apple Clang):
- small int (0-999) dec:  2.1x faster than std::to_chars
- int hex:               1.9x faster
- Int64 hex:             1.4x faster
- int dec / Int64 dec:   ~same or slightly faster
- Guard width parameter against buffer overflow (throw RangeException early)
- Replace unreachable default with poco_bugcheck() in power-of-2 switch
- Use if constexpr in isIntOverflow and safeMultiply (C++17 consistency)
- Merge split namespace Impl blocks
- Remove unused <algorithm> include
- Fix misleading whitespace test: parse(" 123") now correctly succeeds
…ars for floats

Use std::to_chars/from_chars for float/double formatting and parsing
when available (GCC 11+, MSVC 19.24+, non-Apple libc++ 17+, Apple
Clang with macOS 26.0+ deployment target). Falls back to bundled
double-conversion library on older compilers.

- Add POCO_HAS_FLOAT_CHARCONV feature detection macro in Config.h
- Conditional compilation in NumericString.cpp
- toShortestStr helper respects lowDec/highDec for notation selection
- strToFloatImpl handles inf/nan matching manually
- std::to_chars uses round-half-to-even (IEEE 754) vs round-half-up
…anup

- Fix UB in toShortestStr for NaN/infinity (log10 of non-finite values)
- Fix log10 rounding near powers of 10 with power-of-10 verification
- Add missing empty-string guard to strToFloat(const std::string&)
- Deduplicate float/double string overloads via template helpers
- Simplify pad(): replace unique_ptr<string> with plain string
- Remove obsolete MSVC #pragma warning(disable: 4146)
- Move <cmath> from header to .cpp, remove unused <memory>
- Fix POCO_MAX_INT_STRING_LEN comment
- Add [[nodiscard]] to strToInt, intToStr, safeMultiply
- Add comment documenting 'f' suffix stripping intent
- Use std::to_chars(fixed) + adjustPrecision instead of slow
  std::to_chars(fixed, precision) for floatToFixedStr/doubleToFixedStr
- Fix adjustPrecision carry propagation for negative numbers
- Merge floatToStrImpl/floatToFixedStrImpl into floatToStrCommon
- Add const to all non-mutated variables in NumericString.h and .cpp
- Rewrite benchmarks to use NumberFormatter/NumberParser public API
  with baselines (std::to_string, snprintf, strtol, strtod, sscanf)
- Add round-trip correctness validation to all benchmark sections
- Remove unused parseStream helper and MemoryStream dependency from
  test header
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
- Restore Foundation_API on float/double functions in NumericString.h
  (inline NumberFormatter methods need exported symbols in shared lib)
- Fix POCO_HAS_FLOAT_CHARCONV: raise libc++ threshold from 170000 to
  200000 (libc++ 17-19 have float from_chars overloads deleted; only
  libc++ 20+ / LLVM 20 supports them — fixes Android NDK r28)
- Fix [[nodiscard]] warnings: use intToStr return value to guard
  str.append in NumberFormatter.cpp and NumberFormatter.h
- Update double-conversion comment to mention POCO_HAS_FLOAT_CHARCONV
- Fix POCO_HAS_FLOAT_CHARCONV: use __APPLE__ instead of
  __apple_build_version__ to handle Homebrew LLVM which uses system
  libc++ with Apple availability annotations
- Fix libc++ version threshold: 170000 -> 200000 (libc++ 17-19 have
  float from_chars overloads deleted, only libc++ 20+ supports them)
- Fix from_chars underflow: fall back to strtod on result_out_of_range
  (GCC leaves result unmodified on range error); fixes JSON parsing
  of extreme values like "123e-10000000"
- Fix benchmark validation on Windows: use strtoul for unsigned
  comparison (strtol overflows 32-bit long for values > INT_MAX)
- Fix [[nodiscard]] warnings: use intToStr return value in
  NumberFormatter.cpp/.h to guard str.append
- Replace non-ASCII characters in comments with ASCII equivalents
- Update double-conversion comment to mention POCO_HAS_FLOAT_CHARCONV
- Swap template parameters from <F, T> to <To, From> matching
  isIntOverflow<To, From>. Allows isSafeIntCast<TargetType>(value)
  with From deduced from the argument.
- Use consistent naming (To/From) across all cast functions.
- Return type was already fixed from T& to bool (the old T& returning
  bool was UB); [[nodiscard]] ensures callers check the result.
- Remove unused isSafeIntCast using declaration from StringTest.
Following PR review feedback, NumericString.h is no longer guarded
against direct inclusion. The header remains an internal implementation
detail of NumberFormatter/NumberParser, but removing the #error guard
allows flexibility for downstream code and tests.
@matejk matejk force-pushed the 5292-number-formatter-perf branch from e420b4d to 61cdc0e Compare April 20, 2026 18:53