enh: reduce binary size across all libraries #5292 (#5297)

Merged: matejk merged 3 commits into main from 5292-reduce-binary-size on Apr 7, 2026

@matejk (Contributor) commented Apr 5, 2026

Summary

  • Move = default constructors/destructors from headers to .cpp files for 42 non-template classes across Util, Prometheus, ActiveRecord, MongoDB, Net, and Redis
  • Add extern template declarations for BasicEvent types used by AbstractConfiguration
  • Add -Wl,-dead_strip on macOS and visibility-aware -ffunction-sections/gc-sections
  • Document binary size reduction options in README
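
The first bullet can be illustrated with a minimal sketch. `Widget` is a hypothetical class invented for this example (it is not one of the 42 classes touched by the PR), and the header and source file are shown together in one listing only for brevity. The idea is that defaulting a special member on its out-of-class definition makes it non-inline, so its code is emitted in a single translation unit instead of in every one that includes the header.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// ---- Widget.h (hypothetical header) ----
class Widget
{
public:
    Widget();   // declared only; no "= default" in the header
    ~Widget();  // declared only

    void set(const std::string& key, const std::string& value);
    std::size_t size() const;

private:
    std::map<std::string, std::string> _values;
};

// ---- Widget.cpp (hypothetical source file) ----
// Defaulted on the out-of-class definition: the functions are no longer
// implicitly inline, so the member construction/destruction code
// (including the std::map teardown) is generated here only.
Widget::Widget() = default;
Widget::~Widget() = default;

void Widget::set(const std::string& key, const std::string& value)
{
    _values[key] = value;
}

std::size_t Widget::size() const
{
    return _values.size();
}
```

Callers are unaffected: `Widget w; w.set("a", "b");` compiles exactly as before, and only the place where the special members are emitted changes.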

Binary Size (stripped, default options, RelWithDebInfo)

macOS arm64 (Apple Clang)

| Library | main | fix | diff | % |
|---|---|---|---|---|
| Util | 647,104 | 620,656 | -26,448 | -4.1% |
| Foundation | 3,372,384 | 3,347,584 | -24,800 | -0.7% |
| DataODBC | 1,269,088 | 1,244,496 | -24,592 | -1.9% |
| Data | 2,821,152 | 2,788,992 | -32,160 | -1.1% |
| Net | 1,431,824 | 1,414,000 | -17,824 | -1.2% |
| JSON | 467,248 | 460,480 | -6,768 | -1.4% |
| MongoDB | 433,520 | 426,976 | -6,544 | -1.5% |
| TOTAL (17 libs) | 15,227,944 | 15,077,608 | -150,336 | -1.0% |

Linux x86_64 (GCC)

| Library | main | fix | diff | % |
|---|---|---|---|---|
| Util | 793,752 | 728,464 | -65,288 | -8.2% |
| TOTAL (17 libs) | 17,395,512 | 17,330,504 | -65,008 | -0.4% |

On Linux with default visibility, GCC already deduplicates weak template symbols effectively. The Util savings (-65 KB) come from the BasicEvent extern template declarations eliminating duplicate instantiations across 11 Configuration subclasses.
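
As a rough illustration of the extern template technique described above, here is a minimal sketch. `Event` is a hypothetical stand-in template written for this example, not `Poco::BasicEvent`, and the header and source parts are shown in one listing for brevity. The explicit instantiation declaration asks other translation units not to instantiate the template implicitly, and the explicit instantiation definition provides the single copy.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// ---- Event.h (hypothetical header) ----
template <typename TArgs>
class Event
{
public:
    void subscribe(std::function<void(const TArgs&)> handler)
    {
        _handlers.push_back(std::move(handler));
    }

    void notify(const TArgs& args) const
    {
        for (const auto& handler : _handlers) handler(args);
    }

private:
    std::vector<std::function<void(const TArgs&)>> _handlers;
};

// Explicit instantiation declaration: translation units that include the
// header rely on the one instantiation defined elsewhere instead of
// generating their own copy.
extern template class Event<std::string>;

// ---- Event.cpp (hypothetical source file) ----
// Explicit instantiation definition: the single place where the members of
// Event<std::string> are instantiated. It must follow the declaration above.
template class Event<std::string>;
```

The compiler may still inline small member functions at call sites, so the actual saving depends on the toolchain and optimization level.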

Closes #5292

Test plan

  • Foundation tests (939 pass)
  • Util tests (240 pass)
  • Prometheus tests (21 pass)
  • MongoDB tests (89 pass)
  • Redis tests (57 pass)
  • Builds on macOS (Apple Clang) and Linux (GCC)
  • CI

@matejk matejk added this to the Release 1.15.2 milestone Apr 5, 2026
@matejk (Contributor, Author) commented Apr 5, 2026

Binary Size Comparison: 1.14.2 vs main vs fix branch

Stripped library sizes in bytes. "fix" = code changes only (default visibility).
"fix+hidden" = code changes + -DCMAKE_CXX_VISIBILITY_PRESET=hidden -DCMAKE_VISIBILITY_INLINES_HIDDEN=ON.

macOS arm64 (Apple Clang, RelWithDebInfo)

| Library | 1.14.2 | main | fix | fix+hidden | fix-main | hidden-main |
|---|---|---|---|---|---|---|
| ActiveRecord | 50,800 | 50,608 | 50,752 | 51,248 | +144 | +640 |
| Crypto | 361,176 | 361,992 | 358,504 | 356,904 | -3,488 | -5,088 |
| Data | 3,102,680 | 3,151,848 | 3,119,592 | 3,183,640 | -32,256 | +31,792 |
| DataODBC | 1,435,384 | 1,428,872 | 1,404,168 | 1,441,368 | -24,704 | +12,496 |
| DataSQLite | 1,823,336 | 1,838,776 | 1,837,416 | 1,847,224 | -1,360 | +8,448 |
| Encodings | 945,208 | 945,080 | 945,016 | 945,240 | -64 | +160 |
| Foundation | 3,233,544 | 3,850,760 | 3,825,368 | 3,741,672 | -25,392 | -109,088 |
| JSON | 525,720 | 524,568 | 517,784 | 524,072 | -6,784 | -496 |
| JWT | 277,112 | 258,216 | 256,440 | 260,296 | -1,776 | +2,080 |
| MongoDB | 298,392 | 487,848 | 483,656 | 463,096 | -4,192 | -24,752 |
| Net | 1,589,144 | 1,692,696 | 1,674,968 | 1,666,696 | -17,728 | -26,000 |
| NetSSL | 410,376 | 412,120 | 410,024 | 409,976 | -2,096 | -2,144 |
| Prometheus | 198,008 | 196,440 | 201,432 | 202,040 | +4,992 | +5,600 |
| Redis | 175,864 | 202,248 | 201,944 | 200,360 | -304 | -1,888 |
| Util | 624,008 | 722,424 | 699,096 | 695,896 | -23,328 | -26,528 |
| XML | 762,856 | 776,280 | 774,456 | 774,952 | -1,824 | -1,328 |
| Zip | 304,152 | 304,856 | 302,552 | 302,136 | -2,304 | -2,720 |
| TOTAL | 16,117,760 | 17,205,632 | 17,063,168 | 17,066,816 | -142,464 | -138,816 |

| Configuration | vs main | % |
|---|---|---|
| fix (default visibility) | -139 KB | -0.8% |
| fix + visibility hidden | -136 KB | -0.8% |

Linux x86_64 (GCC, RelWithDebInfo)

| Library | 1.14.2 | main | fix | fix+hidden | fix-main | hidden-main |
|---|---|---|---|---|---|---|
| ActiveRecord | 67,888 | 67,904 | 67,912 | 67,856 | +8 | -48 |
| Crypto | 333,504 | 333,520 | 333,520 | 333,152 | +0 | -368 |
| Data | 3,230,232 | 3,296,176 | 3,296,176 | 2,631,448 | +0 | -664,728 |
| DataODBC | 1,517,216 | 1,517,920 | 1,517,920 | 1,318,064 | +0 | -199,856 |
| DataSQLite | 1,724,328 | 1,724,560 | 1,724,560 | 1,657,680 | +0 | -66,880 |
| Encodings | 921,320 | 921,320 | 921,320 | 921,312 | +0 | -8 |
| Foundation | 3,099,048 | 3,759,304 | 3,759,304 | 3,492,192 | +0 | -267,112 |
| JSON | 529,680 | 529,600 | 529,600 | 463,008 | +0 | -66,592 |
| JWT | 266,608 | 267,368 | 267,368 | 266,664 | +0 | -704 |
| MongoDB | 332,368 | 531,872 | 531,952 | 465,448 | +80 | -66,424 |
| Net | 1,523,176 | 1,724,312 | 1,724,312 | 1,657,448 | +0 | -66,864 |
| NetSSL | 400,224 | 400,704 | 400,704 | 400,408 | +0 | -296 |
| Prometheus | 199,896 | 200,328 | 200,504 | 200,184 | +176 | -144 |
| Redis | 200,048 | 200,240 | 200,256 | 200,024 | +16 | -216 |
| Util | 662,560 | 793,752 | 728,464 | 662,056 | -65,288 | -131,696 |
| XML | 727,824 | 794,240 | 794,240 | 794,040 | +0 | -200 |
| Zip | 332,352 | 332,392 | 332,392 | 332,056 | +0 | -336 |
| TOTAL | 16,068,272 | 17,395,512 | 17,330,504 | 15,863,040 | -65,008 | -1,532,472 |

| Configuration | vs main | % |
|---|---|---|
| fix (default visibility) | -63 KB | -0.4% |
| fix + visibility hidden | -1,497 KB | -8.8% |

Note: on Linux with default visibility, GCC already deduplicates weak template symbols
effectively, so the code-level improvements yield modest savings. The visibility-hidden
option produces dramatic savings because gc-sections can then discard all
unreferenced hidden symbols from shared libraries.
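
To make the visibility point concrete, here is a small sketch of what a source file looks like when a library is built with hidden default visibility (`-fvisibility=hidden`, the compiler flag behind `CMAKE_CXX_VISIBILITY_PRESET=hidden`). `DEMO_API` and the functions in namespace `demo` are hypothetical names made up for this example; POCO uses its own export macros.

```cpp
#include <cassert>  // only needed for the usage check mentioned below

// DEMO_API is a hypothetical export macro for this sketch.
#if defined(__GNUC__) && !defined(_WIN32)
    #define DEMO_API __attribute__((visibility("default")))
#else
    #define DEMO_API
#endif

namespace demo {

// Internal helper that is referenced by the exported function, so it is kept.
int scale(int x) { return x * 2; }

// Internal helper that nothing references. With hidden visibility it is not
// part of the shared library's exported interface, so when the build also
// uses -ffunction-sections and linker section GC it becomes a candidate for
// removal.
int unusedHelper(int x) { return x - 1; }

// Explicitly exported entry point: remains visible to users of the library
// even when everything else defaults to hidden.
DEMO_API int publicEntryPoint(int x) { return scale(x) + 1; }

} // namespace demo
```

For example, `assert(demo::publicEntryPoint(2) == 5);` holds. With default visibility every non-static function is exported and therefore counts as referenced, which is consistent with the small default-visibility savings reported for Linux above.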

Changes Applied

  1. Moved = default out-of-line: 42 non-template classes across Util (22),
    Prometheus (14), ActiveRecord (1), MongoDB (3), Net (1), Redis (2).

  2. BasicEvent explicit instantiations (Util): extern template for the three
    BasicEvent types used by AbstractConfiguration.

  3. CMake dead-code stripping: macOS always gets -Wl,-dead_strip. When
    CMAKE_CXX_VISIBILITY_PRESET=hidden is set, -ffunction-sections -fdata-sections
    and linker gc are enabled automatically.

  4. README: documented binary size reduction options for users.

@aleks-f (Member) commented Apr 5, 2026

I would like to see some benchmarks on intToStr. That function was hand-written for a reason: it is used often in Var and is meant to be faster than the standard implementations.

Comment thread Net/src/HTTPAuthenticationParams.cpp Fixed
Move = default constructors/destructors from headers to .cpp files
for 42 non-template classes across Util (22), Prometheus (14),
ActiveRecord (1), MongoDB (3), Net (1), and Redis (2) to eliminate
template bloat from implicit instantiations in every translation unit.

Add extern template declarations for BasicEvent types used by
AbstractConfiguration to prevent duplicate instantiations across
Util's 11 Configuration subclasses.
@aleks-f (Member) left a comment

The intToStr change seems to be mostly a pessimization. Benchmark code here.

Aggregate (58 cases across 4 types)

| Metric | Value |
|---|---|
| Total OLD | 1933.22 ms |
| Total NEW | 2442.94 ms |
| Delta | +509.72 ms (+26.4%) |
| NEW faster | 8 cases |
| OLD faster | 49 cases |
| ~Same | 1 case |
| Best NEW gain | -53.2% (uint64/int64_max) |
| Worst NEW loss | +168.0% (int/zero) |

Per-type breakdown

| Type | OLD | NEW | Change |
|---|---|---|---|
| int | 480.09 ms | 687.61 ms | +43.2% |
| uint | 413.37 ms | 525.80 ms | +27.2% |
| int64 | 575.95 ms | 689.48 ms | +19.7% |
| uint64 | 463.80 ms | 540.05 ms | +16.4% |

@aleks-f (Member) commented Apr 6, 2026

Here is an optimized version of the old intToStr, with a comparison to the old and new (std::to_chars) versions.

| Type | OLD | NEW | OPT | OPT/OLD | OPT/NEW |
|---|---|---|---|---|---|
| int | 490.30 | 692.71 | 419.53 | -14.4% | -39.4% |
| uint | 404.68 | 521.96 | 342.54 | -15.4% | -34.4% |
| int64 | 545.73 | 699.65 | 466.30 | -14.6% | -33.4% |
| uint64 | 470.13 | 543.89 | 369.13 | -21.5% | -32.1% |
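
For readers who want to reproduce this kind of comparison, the sketch below shows the general shape of such a micro-benchmark. It is not the benchmark linked above and not POCO's `intToStr`: `handRolledToString`, `toCharsToString`, and `millisecondsFor` are hypothetical helpers written for this example, and the hand-rolled loop is only a simple stand-in for a manually written converter. It assumes a C++17 compiler for `std::to_chars`.

```cpp
#include <cassert>
#include <charconv>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <string>
#include <system_error>

// Hand-written conversion: fills a stack buffer from the back, digit by digit.
std::string handRolledToString(std::int64_t value)
{
    char buf[21];  // enough for the sign plus 19 digits
    char* end = buf + sizeof(buf);
    char* p = end;
    // Work on the unsigned magnitude so INT64_MIN does not overflow.
    std::uint64_t mag = value < 0
        ? 0 - static_cast<std::uint64_t>(value)
        : static_cast<std::uint64_t>(value);
    do
    {
        *--p = static_cast<char>('0' + mag % 10);
        mag /= 10;
    } while (mag != 0);
    if (value < 0) *--p = '-';
    return std::string(p, end);
}

// Standard-library conversion via std::to_chars.
std::string toCharsToString(std::int64_t value)
{
    char buf[21];
    auto result = std::to_chars(buf, buf + sizeof(buf), value);
    return result.ec == std::errc{} ? std::string(buf, result.ptr) : std::string();
}

// Times `iterations` conversions and returns the elapsed milliseconds.
template <typename Fn>
double millisecondsFor(Fn convert, int iterations)
{
    std::size_t sink = 0;  // accumulated so the loop has an observable result
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i)
        sink += convert(static_cast<std::int64_t>(i) * 7919).size();
    auto stop = std::chrono::steady_clock::now();
    volatile std::size_t keep = sink;
    (void)keep;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Calling `millisecondsFor(handRolledToString, n)` and `millisecondsFor(toCharsToString, n)` with the same `n` gives two timings to compare. Results depend heavily on compiler, flags, and input distribution, so they should not be expected to match the tables above.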

@matejk (Contributor, Author) commented Apr 6, 2026

I will remove the numeric parser/formatter changes from this PR. A new PR is open for them: #5301.

macOS: add -Wl,-dead_strip for non-Debug builds (Apple linker handles
it natively on Mach-O atoms).

All platforms: when CMAKE_CXX_VISIBILITY_PRESET=hidden is set, enable
-ffunction-sections -fdata-sections and linker gc for non-Debug builds.
Use generator expressions for proper multi-config generator support.

ICF (--icf=safe) is enabled on non-Apple platforms when gold or lld
is detected.
matejk force-pushed the 5292-reduce-binary-size branch from b2c8f25 to 1bb1c47 on April 6, 2026 18:47

Potential fix for code scanning alert no. 118: Inconsistent definition of copy constructor and assignment ('Rule of Two')

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
matejk force-pushed the 5292-reduce-binary-size branch from ab2087a to 04fb9a7 on April 7, 2026 08:19
@matejk matejk merged commit 1aa0624 into main Apr 7, 2026
106 checks passed
@matejk matejk deleted the 5292-reduce-binary-size branch April 7, 2026 17:07
matejk added a commit that referenced this pull request Apr 9, 2026
* enh: move defaulted constructors/destructors out-of-line #5292

Move = default constructors/destructors from headers to .cpp files
for 42 non-template classes across Util (22), Prometheus (14),
ActiveRecord (1), MongoDB (3), Net (1), and Redis (2) to eliminate
template bloat from implicit instantiations in every translation unit.

Add extern template declarations for BasicEvent types used by
AbstractConfiguration to prevent duplicate instantiations across
Util's 11 Configuration subclasses.

* enh(cmake): add dead-code stripping and visibility-aware gc-sections

macOS: add -Wl,-dead_strip for non-Debug builds (Apple linker handles
it natively on Mach-O atoms).

All platforms: when CMAKE_CXX_VISIBILITY_PRESET=hidden is set, enable
-ffunction-sections -fdata-sections and linker gc for non-Debug builds.
Use generator expressions for proper multi-config generator support.

ICF (--icf=safe) is enabled on non-Apple platforms when gold or lld
is detected.

* Potential fix for code scanning alert no. 118: Inconsistent definition of copy constructor and assignment ('Rule of Two')

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>

---------

Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
matejk added a commit that referenced this pull request Apr 9, 2026
…lution

The cherry-pick of PRs #5297 and #5276 resolved conflicts by taking
the incoming single-brace namespace close (} // namespace Poco::X),
but this branch still uses old-style dual namespace opens
(namespace Poco { namespace X {), requiring two closing braces.
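
The brace mismatch described in this commit message can be shown with a short sketch. The `Demo` namespaces and `answer` functions are hypothetical names; the point is only that a C++17 nested namespace definition opens one scope, while the older dual-open style opens two and therefore needs two closing braces.

```cpp
#include <cassert>  // only needed for the usage check mentioned below

// C++17 nested namespace definition: one opening brace, one closing brace.
namespace Demo::Modern {

int answer() { return 1; }

} // namespace Demo::Modern

// Old-style dual namespace open: two opening braces, so two closing braces.
// Closing with a single brace here would leave namespace Demo open and break
// everything that follows in the file.
namespace Demo { namespace Legacy {

int answer() { return 2; }

} } // namespace Demo::Legacy
```

Both styles declare the same kind of nested scope, so `assert(Demo::Modern::answer() == 1 && Demo::Legacy::answer() == 2);` holds; only the number of closing braces differs.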

Development

Successfully merging this pull request may close these issues.

enh: reduce binary size growth since 1.14.2

3 participants