CLDSRV-898: handle checksums in CompleteMultipartUpload by leif-scality · Pull Request #6170 · scality/cloudserver

leif-scality · 2026-05-15T21:01:27Z

Calculate the final object checksum and compare it with the one sent in the headers.
Check that all parts have the correct checksum and checksum type.
Store the final checksum when the type is FULL_OBJECT (COMPOSITE checksums will be stored by https://scality.atlassian.net/browse/S3C-10399).

claude · 2026-05-15T21:05:58Z

LGTM

Review by Claude Code

codecov · 2026-05-15T21:06:39Z

❌ 1 Tests Failed:

Tests completed	Failed	Passed	Skipped
9179	1	9178	0

View the top 1 failed test(s) by shortest run time

"before each" hook for "should retrieve a part put after part copied from MPU"::GET object With v4 signature With PartNumber field uploadPartCopy "before each" hook for "should retrieve a part put after part copied from MPU"

Stack Traces | 6.22s run time

Connection timed out after 5000 ms

bert-e · 2026-05-21T13:19:02Z

Hello leif-scality,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Available options

name	description	privileged	authored
`/after_pull_request`	Wait for the given pull request id to be merged before continuing with the current one.
`/bypass_author_approval`	Bypass the pull request author's approval	⭐
`/bypass_build_status`	Bypass the build and test status	⭐
`/bypass_commit_size`	Bypass the check on the size of the changeset `TBA`	⭐
`/bypass_incompatible_branch`	Bypass the check on the source branch prefix	⭐
`/bypass_jira_check`	Bypass the Jira issue check	⭐
`/bypass_peer_approval`	Bypass the pull request peers' approval	⭐
`/bypass_leader_approval`	Bypass the pull request leaders' approval	⭐
`/approve`	Instruct Bert-E that the author has approved the pull request.		✍️
`/create_pull_requests`	Allow the creation of integration pull requests.
`/create_integration_branches`	Allow the creation of integration branches.
`/no_octopus`	Prevent Wall-E from doing any octopus merge and use multiple consecutive merge instead
`/unanimity`	Change review acceptance criteria from `one reviewer at least` to `all reviewers`
`/wait`	Instruct Bert-E not to run until further notice.

Available commands

name	description	privileged
`/help`	Print Bert-E's manual in the pull request.
`/status`	Print Bert-E's current status in the pull request `TBA`
`/clear`	Remove all comments from Bert-E from the history `TBA`
`/retry`	Re-start a fresh build `TBA`
`/build`	Re-start a fresh build `TBA`
`/force_reset`	Delete integration branches & pull requests, and restart merge process from the beginning.
`/reset`	Try to remove integration branches unless there are commits on them which do not appear on the source branch.

Status report is not available.

bert-e · 2026-05-21T13:19:08Z

Incorrect fix version

The Fix Version/s in issue CLDSRV-898 contains:

None

Considering where you are trying to merge, I ignored possible hotfix versions and I expected to find:

9.4.0

Please check the Fix Version/s of CLDSRV-898, or the target
branch of this pull request.

claude · 2026-05-21T13:28:33Z

LGTM

Well-designed separation of CompleteMPU's final-object checksum semantics from body-digest validation. The callNext double-callback guard in processParts is a good defensive pattern for the mixed callback/promise flow. Test coverage is thorough across all algorithm/type combinations, including edge cases for default MPUs, legacy MPUs, and soft-null vs InternalError distinction.

Review by Claude Code

jonathan-gramain · 2026-05-21T18:47:36Z

+    const parts = jsonList.Part || [];
+    for (let i = 0; i < parts.length; i++) {
+        const part = parts[i];


Suggested change

-    const parts = jsonList.Part || [];
-    for (let i = 0; i < parts.length; i++) {
-        const part = parts[i];
+    for (const part of (jsonList.Part || [])) {

jonathan-gramain · 2026-05-21T18:49:38Z

+            if (tag !== expectedTag) {
+                const algoLabel = tag.replace(/^Checksum/, '').toLowerCase();
+                return errorInstances.BadDigest.customizeDescription(
+                    `The ${algoLabel} you specified for part ${partNumber} ` + 'did not match what we received.',


Suggested change

-                    `The ${algoLabel} you specified for part ${partNumber} ` + 'did not match what we received.',
+                    `The ${algoLabel} you specified for part ${partNumber} did not match what we received.`,

jonathan-gramain · 2026-05-21T18:51:46Z

+            return errorInstances.InvalidRequest.customizeDescription(
+                `The upload was created using a ${mpuAlgo} checksum. ` +
+                    'The complete request must include the checksum for each ' +
+                    `part. It was missing for part ${partNumber} in the request.`,


Suggested change

-                    `part. It was missing for part ${partNumber} in the request.`,
+                    `part, it is missing for part ${partNumber} in the request.`,

The current error message is identical to the one sent by AWS:

<Error> <Code>InvalidRequest</Code> <Message>The upload was created using a sha256 checksum. The complete request must include the checksum for each part. It was missing for part 1 in the request.</Message> <RequestId>2CMRHS7YQJ418XRD</RequestId><HostId>Z+nL7Rh7aglZj/H2D2aesp/j38QjuvN/wpgur86e7/0P6nohBelRVXaXY+mWSkQ dGaPIKHTmmg0iPUkaWjIWFTHmMJdNZolP</HostId> </Error>

jonathan-gramain · 2026-05-21T18:57:15Z

There's already a test file multipartUpload.js, wouldn't it be better to add the tests there? Or if a refactor into multiple files is beneficial, maybe port some related tests from there to this new file to avoid scattering the complete MPU tests into multiple files.

Moved the tests to multipartUpload.js. This also triggered the new prettier linter on that file; I added the formatting as a separate commit.

jonathan-gramain · 2026-05-21T18:58:37Z

+const DummyRequest = require('../DummyRequest');
+const { cleanup, DummyRequestLogger, makeAuthInfo } = require('../helpers');
+
+const SPLITTER = '..|..';


Suggested change

-const SPLITTER = '..|..';
+const SPLITTER = constants.splitter;

jonathan-gramain · 2026-05-21T18:59:02Z

+
+// XML element name AWS uses for each algorithm in CompleteMultipartUpload's
+// per-part body.
+const TAG_BY_ALGO = {


Shouldn't this be in constants too?

The algorithms object already contains the tag of each algorithm. This object exists only so the tests can verify that the tag was not changed in the algorithms object.

jonathan-gramain · 2026-05-21T19:01:13Z

+                    assert.strictEqual(
+                        err.description,
+                        'One or more of the specified parts could not be ' +
+                            'found.  The part may not have been uploaded, or ' +


Is the double space intentional? Also not very fond of matching against such a long error message that may vary, but that's okay I think.

AWS returns two spaces. For all the final API/XML errors, I return exactly what AWS returns:

<Error><Code>InvalidPart</Code> <Message>One or more of the specified parts could not be found. The part may not have been uploaded, or the specified entity tag may not match the part's entity tag.</Message> <UploadId>CTFNKyL2hI6n6irH0zbVWDzdPZ4n2ueJceRh1juCeuL2X5HOjrvCmXQMEqaoAatEWTEa3pWWxC7t9lOStMzjo0nJb4pv8Ct6oT Hv2n8mggVXRQ8RxiXyVyt3.3zpY98HVsZd.ozihhJ1HdUjLCkJtwJMQdNBd4fSdG9drS80vdg-</UploadId> <PartNumber>1</PartNumber> <ETag>0ebf9257a12e808d107b2ed1a826c122</ETag> <RequestId>DAQAPVMCMSY4PDPV</RequestId> <HostId>XS/2re3ieUQRTKdANLtZv14qyB2h3LjVHnmVrvjP0cj1PazPO16KkArQMtBLBy8S4mmLzQkuXZc=</HostId></Error>

jonathan-gramain · 2026-05-22T01:53:43Z

+    // `x-amz-checksum-type` and `x-amz-checksum-algorithm` are configuration
+    // headers (MPU completeness mode / SDK algorithm hint), not value
+    // headers. They must not count toward the "value header" tally.
+    const valueHeaders = Object.keys(headers).filter(
+        h => h.startsWith('x-amz-checksum-') && h !== 'x-amz-checksum-type' && h !== 'x-amz-checksum-algorithm',
+    );


If the list of supported headers is known, and the list of supported algorithms is short, it may be cleaner to extract each possible header directly, or otherwise to filter with something like [list-of-supported-headers].includes(h).

That would change the behavior (probably in a good way) when the client sends one or more unsupported checksums along with multiple valid ones: in that case we probably want to return AlgoNotSupported in priority over MultipleChecksumTypes. That's more of a nitpick, though; either way should work.

AWS checks MultipleChecksumTypes before AlgoNotSupported, so the current behavior matches AWS.

AWS ignores only x-amz-checksum-type and x-amz-checksum-algorithm: they do not count toward the checksum count, and they never trigger AlgoNotSupported. If we send x-amz-checksum-BAD, on the other hand, we get AlgoNotSupported. If we send x-amz-checksum-BAD + x-amz-checksum-ZZZ, we get MultipleChecksumTypes.

So the order is:

1. Check for MultipleChecksumTypes.
2. Check for AlgoNotSupported.
3. Check the actual checksum value; return BadDigest on mismatch.

I added a commit to ignore x-amz-checksum-type and x-amz-checksum-algorithm in the other functions; I didn't know this behavior existed.
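
To make the validation order described above concrete, here is a minimal sketch, not the code from this PR. The function name `checkChecksumHeaders`, the `SUPPORTED_ALGOS` list, the plain `{ code }` error objects, and the `computeDigest` helper are all hypothetical and exist only for illustration; only the error codes and the two ignored header names come from the discussion.

```javascript
'use strict';

const crypto = require('crypto');

const PREFIX = 'x-amz-checksum-';
// Configuration headers: they are not checksum values and are never counted.
const CONFIG_HEADERS = new Set(['x-amz-checksum-type', 'x-amz-checksum-algorithm']);
// Illustrative list only; the real list lives in the project's own code.
const SUPPORTED_ALGOS = ['crc32', 'crc32c', 'crc64nvme', 'sha1', 'sha256'];

// Hypothetical digest helper: only sha256 is wired up in this sketch.
function computeDigest(algo, body) {
    if (algo !== 'sha256') {
        throw new Error(`digest for ${algo} is not implemented in this sketch`);
    }
    return crypto.createHash('sha256').update(body).digest('base64');
}

// Returns null on success, or a plain { code } object describing the failure.
function checkChecksumHeaders(headers, body) {
    const valueHeaders = Object.keys(headers).filter(
        h => h.startsWith(PREFIX) && !CONFIG_HEADERS.has(h),
    );
    if (valueHeaders.length === 0) {
        return null; // nothing to validate
    }
    // 1. More than one value header, supported or not.
    if (valueHeaders.length > 1) {
        return { code: 'MultipleChecksumTypes' };
    }
    // 2. A single value header with an unknown algorithm.
    const algo = valueHeaders[0].slice(PREFIX.length);
    if (!SUPPORTED_ALGOS.includes(algo)) {
        return { code: 'AlgoNotSupported' };
    }
    // 3. The value itself.
    if (computeDigest(algo, body) !== headers[valueHeaders[0]]) {
        return { code: 'BadDigest' };
    }
    return null;
}
```

With this ordering, two unknown headers such as `x-amz-checksum-bad` and `x-amz-checksum-zzz` yield MultipleChecksumTypes, a single unknown header yields AlgoNotSupported, and `x-amz-checksum-type` never affects the count.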

jonathan-gramain · 2026-05-22T02:04:50Z

+            partInputs.map(p => p.value),
+        );
+    } else if (type === 'FULL_OBJECT') {
+        result = computeFullObjectMPUChecksum(algorithm, partInputs);


Wondering what is the worst case run time of computeFullObjectMPUChecksum, to know if it shouldn't be an async function so it can yield to the event loop regularly? Assuming the worst case is 10K parts and it's just combining CRCs, I think it should not take more than one or a few ms, so should be fine I think, but better check if not done yet.

A test was added in the previous PR for testing the worst case, and it takes a couple of milliseconds so it should be ok
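
For reference, a rough way to check this kind of worst case is to time a synchronous call over 10,000 inputs. The sketch below is only a measuring harness: `foldChecksums` is a hypothetical stand-in workload (a rotate-and-XOR fold), not `computeFullObjectMPUChecksum`, and `timeSync` is an illustrative helper name.

```javascript
'use strict';

// Stand-in workload, NOT a real CRC combine: folds 32-bit values
// with a one-bit rotate followed by XOR.
function foldChecksums(values) {
    let acc = 0;
    for (const v of values) {
        acc = (((acc << 1) | (acc >>> 31)) ^ v) >>> 0;
    }
    return acc;
}

// Runs fn synchronously and reports how long it blocked the event loop.
function timeSync(fn) {
    const start = process.hrtime.bigint();
    const result = fn();
    const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
    return { result, elapsedMs };
}

// 10,000 pseudo-random 32-bit inputs, matching the maximum part count.
const inputs = Array.from({ length: 10000 }, (_, i) => (i * 2654435761) >>> 0);
const timing = timeSync(() => foldChecksums(inputs));
console.log(`folded ${inputs.length} values in ${timing.elapsedMs.toFixed(3)} ms`);
```

If the measured time for the real function stayed in the low milliseconds, keeping it synchronous would be consistent with the reasoning above.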

jonathan-gramain · 2026-05-22T02:14:17Z

+                let nextCalled = false;
+                const callNext = (...args) => {
+                    if (nextCalled) {
+                        log.error('processParts: swallowed late callNext after next already invoked', {


Not a big fan of this style of defensive programming: for me, either we know that next may be called multiple times by design and we accept it as part of the normal behavior (e.g. with jsutil.once wrapping the inner next callback where we know it may occur), or we should raise an exception (async.* typically does). This type of error will probably be lost in the logs without anyone noticing and finish as if jsutil.once was used in the first place, potentially hiding real issues.

The original code didn't have such error handling, so I think it should be fine to assume it will not happen if the logic introduced ensures a single callback call.
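
The two alternatives mentioned in this comment can be sketched as follows. Both wrappers are hypothetical illustrations written for this thread: `once` is not the actual `jsutil.once` implementation, and `onceStrict` with its error text is only an assumption about what a fail-loudly variant could look like.

```javascript
'use strict';

// Lenient variant: repeated calls are accepted by design and ignored.
function once(fn) {
    let called = false;
    return (...args) => {
        if (called) {
            return undefined;
        }
        called = true;
        return fn(...args);
    };
}

// Strict variant: a second call is treated as a bug and throws.
function onceStrict(fn) {
    let called = false;
    return (...args) => {
        if (called) {
            throw new Error('callback was already called');
        }
        called = true;
        return fn(...args);
    };
}

const received = [];
const lenientNext = once(value => received.push(value));
lenientNext('first');
lenientNext('second'); // silently ignored

const strictNext = onceStrict(value => received.push(value));
strictNext('third');
let threwOnSecondCall = false;
try {
    strictNext('fourth');
} catch (err) {
    threwOnSecondCall = true; // the double call surfaces immediately
}
```

The lenient wrapper documents that a double call is expected, while the strict wrapper makes an unexpected one impossible to miss; logging and continuing sits between the two and gives neither guarantee.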

claude · 2026-05-22T13:07:18Z

LGTM

The checksum handling for CompleteMultipartUpload is well-implemented: the split between md5-only body validation and final-object checksum assertion correctly matches AWS behavior. The callNext guard in processParts properly prevents double-callbacks in the mixed callback/promise flow. Per-part validation, final-object computation, and header assertion are thorough. Tests cover the full (algorithm, type) matrix including edge cases (default MPU, legacy MPU, missing parts).

Review by Claude Code

claude · 2026-05-22T13:34:11Z

+                                return next(typeErr, destBucket);
+                            }
+                            const mpuType = storedMetadata.checksumType;
+                            if (!mpuType || headerTypeUpper !== mpuType.toUpperCase()) {


When mpuType is falsy (legacy MPU created before the checksum feature, where storedMetadata.checksumType is undefined), this error message reads "The upload was created using the undefined checksum mode." — leaking an internal implementation detail to the client. Consider splitting the !mpuType and mismatch cases into separate error messages.

— Claude Code
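
One possible shape for the split suggested above is sketched here. The function name `checkChecksumType`, the plain error objects, the `InvalidRequest` code, and the wording of the legacy-MPU message are assumptions made for illustration; only the "The upload was created using the ... checksum mode." prefix is taken from the message quoted in the comment.

```javascript
'use strict';

// Returns null when the header matches the stored type, otherwise a plain
// { code, description } object. storedType is undefined for legacy MPUs.
function checkChecksumType(headerType, storedType) {
    if (!storedType) {
        // Legacy MPU: never interpolate the missing value into the message.
        return {
            code: 'InvalidRequest',
            description: 'The upload was created without a checksum mode, ' +
                'so a checksum type cannot be specified on completion.',
        };
    }
    if (headerType.toUpperCase() !== storedType.toUpperCase()) {
        return {
            code: 'InvalidRequest',
            description: `The upload was created using the ${storedType} checksum mode. ` +
                `The complete request specified ${headerType}.`,
        };
    }
    return null;
}
```

Keeping the two branches separate means the string "undefined" can no longer reach the client, and each case can later be aligned with whatever AWS returns for it.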

claude · 2026-05-22T13:34:18Z

completeMultipartUpload.js:352 — When storedMetadata.checksumType is undefined (legacy MPU created before the checksum feature), the error message interpolates undefined into user-facing text: "The upload was created using the undefined checksum mode." Split the !mpuType guard into a separate branch with an appropriate message.

Review by Claude Code

claude · 2026-05-22T13:37:46Z

 */
+// Validate a Content-MD5 header against the buffered body. Returns null on
+// success, an error object otherwise.
+function validateContentMd5(headers, body) {


The JSDoc block above (validateChecksumsNoChunking on lines 332-337) now documents validateContentMd5 instead, since the new function was inserted between the comment and its intended target at line 361. Move the JSDoc down to line 361 or add a proper JSDoc here for validateContentMd5.

— Claude Code

claude · 2026-05-22T13:37:53Z

LGTM — solid implementation with thorough test coverage.

One minor nit:
- Orphaned JSDoc in validateChecksums.js:332-337: the validateChecksumsNoChunking JSDoc now sits above the new validateContentMd5 function instead of its intended target.

Review by Claude Code

leif-scality force-pushed the improvement/CLDSRV-902-calculate-final-checksum-from-parts branch 2 times, most recently from 2493909 to 538c16c Compare May 21, 2026 09:18

Base automatically changed from improvement/CLDSRV-902-calculate-final-checksum-from-parts to development/9.4 May 21, 2026 13:18

leif-scality force-pushed the improvement/CLDSRV-898-complete-mpu-checksums-2 branch from 27b4a43 to a90bd94 Compare May 21, 2026 13:24

jonathan-gramain reviewed May 21, 2026

jonathan-gramain reviewed May 22, 2026

leif-scality added 8 commits May 22, 2026 14:56

CLDSRV-898: validate per-part checksums and x-amz-checksum-type

f5dc3cd

CLDSRV-898: CompleteMPU calculate and validate final checksum with ch…

3b3a33e

…ecksum header

CLDSRV-898: CompleteMPU store FULL_OBJECT checksum in object metadata

2588321

CLDSRV-898: CompleteMPU set checksum value and type in response XML body

7eeb08a

CLDSRV-898: CompleteMPU checksum functional tests

d5a59a4

CLDSRV-898: reject Checksum<X> field on a default MPU checksum

7e8aef6

CLDSRV-898: reject CompleteMPU if explicit checksum but checksum miss…

ff5b400

…ing in part metadata

CLDSRV-898: stop treating x-amz-checksum-algorithm/type as checksum d…

e468700

…igests

leif-scality force-pushed the improvement/CLDSRV-898-complete-mpu-checksums-2 branch from a90bd94 to e468700 Compare May 22, 2026 13:01

leif-scality added 2 commits May 22, 2026 15:22

CLDSRV-898: prettier format multipartUpload.js

ffd39d7

CLDSRV-898: merge completeMultipartUpload tests into multipartUpload.js

6e735e4

leif-scality force-pushed the improvement/CLDSRV-898-complete-mpu-checksums-2 branch from 46b6d78 to 6e735e4 Compare May 22, 2026 13:32

claude Bot reviewed May 22, 2026

3 participants
