
fix(performance): Use bounded HTTP Range requests for indexed BAM queries#1998

Open
TechIsCool wants to merge 1 commit into samtools:develop from TechIsCool:index-aware-range-requests

Conversation

@TechIsCool

Problem

When reading remote BAM files with an index, htslib seeks to each chunk's start offset but issues unbounded Range requests. The server advertises gigabytes of Content-Length even though we only need kilobytes:

| Seek | Unbounded Request | Server Advertises | Actual Data Required |
|------|-------------------|-------------------|----------------------|
| chr1 | `bytes=8224425-` | 17.2 GB | 141 KB |
| chr2 | `bytes=1631423494-` | 15.6 GB | 123 KB |
| chr7 | `bytes=7287649006-` | 9.9 GB | 167 KB |

The client terminates each request early, but early termination isn't free: data already in flight still transfers. We have also found that being specific about the bytes needed improves S3 responsiveness.

Solution

The BAM index already contains chunk end offsets. Pass them through to the HTTP layer:

```
hts_itr_next()
  → bgzf_seek_limit(fp, chunk.start, SEEK_SET, chunk.end)  // NEW: chunk.end
    → hfile_set_readahead_limit(fp->fp, compressed_limit)
      → CURLOPT_RANGE "bytes=X-Y" instead of CURLOPT_RESUME_FROM_LARGE
```

Verify with:

```sh
samtools view --verbosity 10 -X <bam> <bai> <regions> 2>&1 | grep "Range:"

# Before: Range: bytes=7287649006-
# After:  Range: bytes=7287649006-7287816262
```

EC2 Benchmark (35 MB/s bandwidth)

Environment: EC2 m8azn.medium (up to 25 Gbps bandwidth), us-east-1
Test file: s3://1000genomes/.../NA12878.mapped.ILLUMINA.bwa.CEU.exome.20121211.bam (17.3 GB)
Measurement: Wall clock time + actual wire transfer via /sys/class/net/<iface>/statistics/rx_bytes

S3 appears to optimize resource allocation for bounded requests: the time improvement exceeds the bandwidth savings, suggesting bounded requests are served more efficiently, not just transferred more economically.

| Query | Unbounded | Bounded | Bandwidth | Time |
|-------|-----------|---------|-----------|------|
| 1 region | 1.04 MB, 0.67 s | 0.72 MB, 0.38 s | 31% less | 1.8x faster |
| 5 regions | 1.94 MB, 1.33 s | 1.28 MB, 0.45 s | 34% less | 2.9x faster |
| 10 regions | 3.88 MB, 2.19 s | 2.55 MB, 1.11 s | 34% less | 2.0x faster |
| chr22 (275 MB) | 275.8 MB, 7.17 s | 275.2 MB, 5.07 s | ~same | 1.4x faster |

Per-Request Comparison (5 regions)

| Seek | Unbounded (before) `Range: bytes=X-` | Bounded (after) `Range: bytes=X-Y` |
|------|--------------------------------------|------------------------------------|
| chr1 | Content-Length: 17.2 GB | Content-Length: 141 KB |
| chr2 | Content-Length: 15.6 GB | Content-Length: 123 KB |
| chr3 | Content-Length: 14.3 GB | Content-Length: 155 KB |
| chr5 | Content-Length: 12.2 GB | Content-Length: 122 KB |
| chr7 | Content-Length: 9.9 GB | Content-Length: 167 KB |
| Total advertised | 69.2 GB | 708 KB |

Local Benchmark (2 MB/s bandwidth)

Environment: macOS M1 (up to 100 Mbps), California
Test file: s3://1000genomes/.../NA12878.mapped.ILLUMINA.bwa.CEU.exome.20121211.bam (17.3 GB)

| Regions | Unbounded | Bounded | Speedup |
|---------|-----------|---------|---------|
| 1 region | 2.6 s | 2.5 s | 1.04x |
| 5 regions | 7.0 s | 4.5 s | 1.55x |
| 10 regions | 13.0 s | 7.5 s | 1.74x |

Reproduction

```sh
# Measure wall time and wire transfer (on EC2/Linux)
IFACE=$(ls /sys/class/net/ | grep -v lo | head -1)
BEFORE=$(cat /sys/class/net/$IFACE/statistics/rx_bytes)
START=$(date +%s.%N)
samtools view --verbosity 10 -X \
  https://s3.amazonaws.com/1000genomes/phase3/data/NA12878/exome_alignment/NA12878.mapped.ILLUMINA.bwa.CEU.exome.20121211.bam \
  https://s3.amazonaws.com/1000genomes/phase3/data/NA12878/exome_alignment/NA12878.mapped.ILLUMINA.bwa.CEU.exome.20121211.bam.bai \
  1:1000000-1000100 2:5000000-5000100 3:10000000-10000100 \
  5:50000000-50000100 7:117188547-117188800 >/dev/null 2>&1
END=$(date +%s.%N)
AFTER=$(cat /sys/class/net/$IFACE/statistics/rx_bytes)
echo "Time: $(echo "$END - $START" | bc)s, Bytes: $((AFTER - BEFORE))"
```

When reading indexed BAM files from remote URLs (HTTP, S3, etc.),
seeking to a chunk offset would then read unbounded to EOF. For
small queries against large files, this downloads far more data
than needed.

This adds bgzf_seek_limit() which accepts the chunk end offset from
the BAM index, enabling bounded Range requests (bytes=X-Y) instead
of unbounded ones (bytes=X-) in the libcurl backend.

Changes:
- hfile.h/hfile.c: Add readahead_limit field and setter
- bgzf.h/bgzf.c: Add bgzf_seek_limit() that passes limit to hfile
- hfile_libcurl.c: Use CURLOPT_RANGE with bounds when limit is set
- hts.c: Call bgzf_seek_limit() with chunk end in hts_itr_next()

The limit is cleared after each hseek(), so it only affects reads
immediately following a seek.

Signed-off-by: David Beck <techiscool@gmail.com>
@daviesrob (Member)

This changes the public hFILE structure, so it would require an ABI bump if merged. It would be good to avoid that if possible.

This problem has already been fixed in the s3 plug-in, albeit in a different way. If you try using s3:// URLs instead of https://, you'll find that the amount of data requested is much smaller, and the time taken is similar to what you report for your solution.
