[FLINK-39482][filesystem] Support configurable maxConnections in S3ClientProvider#27970
Samrat002 wants to merge 3 commits into apache:master
Conversation
@gaborgsomogyi PTAL
config.set(NativeS3FileSystemFactory.BULK_COPY_MAX_CONCURRENT, 32);
config.set(NativeS3FileSystemFactory.MAX_CONNECTIONS, 10);
Just for my own understanding: BULK_COPY_MAX_CONCURRENT drives bulk operation concurrency, but what exactly does MAX_CONNECTIONS drive? (I read the config explanation, but it's a bit cloudy.)
They operate at different layers:
- s3.connection.max: the HTTP connection pool size in the underlying HTTP clients (Apache for sync ops, Netty for async ops). This is a shared pool: every S3 API call (GetObject, PutObject, HeadObject, ListObjectsV2, etc.) borrows a connection from it and returns it when done. This maps directly to https://github.com/apache/flink/blob/FLINK-39482/flink-filesystems/flink-s3-fs-native/src/main/java/org/apache/flink/fs/s3native/S3ClientProvider.java#L391 on the Netty async client and to maxConnections on the Apache sync client.
- s3.bulk-copy.max-concurrent: how many S3 download operations NativeS3BulkCopyHelper fires in parallel during state restore (the batch size in copyFiles).
The root cause of FLINK-39482 is that these two layers interact: S3TransferManager uses multipart downloads for files larger than 8 MB, and each part takes a separate HTTP connection from the shared pool. So maxConcurrentCopies=16 files × ~4 parts each ≈ 64 HTTP connections needed, but the pool only has 50 → acquire timeout → opaque SdkClientException.
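The arithmetic above can be sketched as a small helper. This is purely illustrative (the class and method names are not from the PR; the 32 MB file size and 8 MB part size are assumptions matching the numbers in the thread):

```java
// Illustrative model of the pool-exhaustion math from the discussion above.
public class ConnectionMath {
    // How many HTTP connections a batch of multipart downloads can demand at once.
    static int connectionsNeeded(long fileSizeBytes, long partSizeBytes, int concurrentCopies) {
        int partsPerFile = (int) Math.ceil((double) fileSizeBytes / partSizeBytes);
        return concurrentCopies * partsPerFile;
    }

    public static void main(String[] args) {
        // 32 MB files split into ~8 MB multipart parts, 16 concurrent copies:
        int needed = connectionsNeeded(32L << 20, 8L << 20, 16);
        System.out.println(needed); // 64, which exceeds the default pool size of 50
    }
}
```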
Does this configuration make sense, or are there any suggestions to simplify it?
…est, rename method and use key
After the nit fix + other comments resolution it's good to go from my perspective.
Intended to merge this unless comments arrive
What is the purpose of the change
This pull request prevents S3 connection pool exhaustion during RocksDB state restore when using the Native S3 filesystem. When NativeS3BulkCopyHelper fires concurrent downloads via S3TransferManager, each multipart download can consume multiple HTTP connections. With the default pool size of 50 and a batch concurrency of 16, the pool can be exhausted, causing downloads to hang until the SDK's acquire timeout expires. This results in opaque SdkClientException failures during checkpoint restore.
The fix introduces a configurable s3.connection.max option, clamps s3.bulk-copy.max-concurrent to the connection pool size, and raises the connection acquisition timeout from the SDK default to the user-configured connection timeout.
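The clamping behavior described above can be sketched as follows. This is a hedged illustration, not the PR's actual code; the class and method names are assumptions:

```java
// Hypothetical sketch of clamping bulk-copy concurrency to the connection pool size.
public class BulkCopyClamp {
    // Never issue more parallel copies than the pool could serve.
    static int effectiveConcurrency(int configuredConcurrency, int maxConnections) {
        return Math.min(configuredConcurrency, maxConnections);
    }

    public static void main(String[] args) {
        System.out.println(effectiveConcurrency(16, 10)); // clamped down to 10
        System.out.println(effectiveConcurrency(16, 50)); // stays at 16
    }
}
```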
Brief change log
- Added configurable maxConnections.
- Unit tests for connection pool exhaustion.
Verifying this change
- chains, as well as false-positive resistance
- factory
- end-to-end through factory → filesystem → bulk copy helper
Does this pull request potentially affect one of the following parts:
Documentation