
fix(performance): Use bounded HTTP Range requests for indexed BAM queries#1998

Open
TechIsCool wants to merge 1 commit into samtools:develop from TechIsCool:index-aware-range-requests

Conversation

@TechIsCool

Problem

When reading remote BAM files with an index, htslib seeks to each chunk's start offset but issues unbounded Range requests. The server advertises gigabytes of Content-Length even though we only need kilobytes:

| Seek | Unbounded Request | Server Advertises | Actual Data Required |
|------|-------------------|-------------------|----------------------|
| chr1 | `bytes=8224425-` | 17.2 GB | 141 KB |
| chr2 | `bytes=1631423494-` | 15.6 GB | 123 KB |
| chr7 | `bytes=7287649006-` | 9.9 GB | 167 KB |

The client terminates each request early, but early termination isn't free: data already in flight still transfers. We have also found that being specific about the bytes needed improves S3 responsiveness.

Solution

The BAM index already contains chunk end offsets. Pass them through to the HTTP layer:

```
hts_itr_next()
  → bgzf_seek_limit(fp, chunk.start, SEEK_SET, chunk.end)  // NEW: chunk.end
    → hfile_set_readahead_limit(fp->fp, compressed_limit)
      → CURLOPT_RANGE "bytes=X-Y" instead of CURLOPT_RESUME_FROM_LARGE
```

Verify with:

```sh
samtools view --verbosity 10 -X <bam> <bai> <regions> 2>&1 | grep "Range:"

# Before: Range: bytes=7287649006-
# After:  Range: bytes=7287649006-7287816262
```

EC2 Benchmark (35 MB/s bandwidth)

Environment: EC2 m8azn.medium (up to 25 Gbps bandwidth), us-east-1
Test file: s3://1000genomes/.../NA12878.mapped.ILLUMINA.bwa.CEU.exome.20121211.bam (17.3 GB)
Measurement: Wall clock time + actual wire transfer via /sys/class/net/<iface>/statistics/rx_bytes

S3 appears to optimize resource allocation for bounded requests: the time improvement exceeds the bandwidth savings, suggesting bounded requests are served more efficiently, not just transferred more economically.

| Query | Unbounded | Bounded | Bandwidth | Time |
|-------|-----------|---------|-----------|------|
| 1 region | 1.04 MB, 0.67 s | 0.72 MB, 0.38 s | 31% less | 1.8x faster |
| 5 regions | 1.94 MB, 1.33 s | 1.28 MB, 0.45 s | 34% less | 2.9x faster |
| 10 regions | 3.88 MB, 2.19 s | 2.55 MB, 1.11 s | 34% less | 2.0x faster |
| chr22 (275 MB) | 275.8 MB, 7.17 s | 275.2 MB, 5.07 s | ~same | 1.4x faster |

Per-Request Comparison (5 regions)

| Seek | Unbounded (before) `Range: bytes=X-` | Bounded (after) `Range: bytes=X-Y` |
|------|--------------------------------------|------------------------------------|
| chr1 | Content-Length: 17.2 GB | Content-Length: 141 KB |
| chr2 | Content-Length: 15.6 GB | Content-Length: 123 KB |
| chr3 | Content-Length: 14.3 GB | Content-Length: 155 KB |
| chr5 | Content-Length: 12.2 GB | Content-Length: 122 KB |
| chr7 | Content-Length: 9.9 GB | Content-Length: 167 KB |
| Total advertised | 69.2 GB | 708 KB |

Local Benchmark (2 MB/s bandwidth)

Environment: macOS M1 (up to 100 Mbps), California
Test file: s3://1000genomes/.../NA12878.mapped.ILLUMINA.bwa.CEU.exome.20121211.bam (17.3 GB)

| Regions | Unbounded | Bounded | Speedup |
|---------|-----------|---------|---------|
| 1 region | 2.6 s | 2.5 s | 1.04x |
| 5 regions | 7.0 s | 4.5 s | 1.55x |
| 10 regions | 13.0 s | 7.5 s | 1.74x |

Reproduction

```sh
# Measure wall time and wire transfer (on EC2/Linux)
IFACE=$(ls /sys/class/net/ | grep -v lo | head -1)
BEFORE=$(cat /sys/class/net/$IFACE/statistics/rx_bytes)
START=$(date +%s.%N)
samtools view --verbosity 10 -X \
  https://s3.amazonaws.com/1000genomes/phase3/data/NA12878/exome_alignment/NA12878.mapped.ILLUMINA.bwa.CEU.exome.20121211.bam \
  https://s3.amazonaws.com/1000genomes/phase3/data/NA12878/exome_alignment/NA12878.mapped.ILLUMINA.bwa.CEU.exome.20121211.bam.bai \
  1:1000000-1000100 2:5000000-5000100 3:10000000-10000100 \
  5:50000000-50000100 7:117188547-117188800 >/dev/null 2>&1
END=$(date +%s.%N)
AFTER=$(cat /sys/class/net/$IFACE/statistics/rx_bytes)
echo "Time: $(echo "$END - $START" | bc)s, Bytes: $((AFTER - BEFORE))"
```

When reading indexed BAM files from remote URLs (HTTP, S3, etc.),
seeking to a chunk offset would then read unbounded to EOF. For
small queries against large files, this downloads far more data
than needed.

This adds bgzf_seek_limit() which accepts the chunk end offset from
the BAM index, enabling bounded Range requests (bytes=X-Y) instead
of unbounded ones (bytes=X-) in the libcurl backend.

Changes:
- hfile.h/hfile.c: Add readahead_limit field and setter
- bgzf.h/bgzf.c: Add bgzf_seek_limit() that passes limit to hfile
- hfile_libcurl.c: Use CURLOPT_RANGE with bounds when limit is set
- hts.c: Call bgzf_seek_limit() with chunk end in hts_itr_next()

The limit is cleared after each hseek(), so it only affects reads
immediately following a seek.

Signed-off-by: David Beck <techiscool@gmail.com>
@daviesrob (Member)

This changes the public hFILE structure, so it would require an ABI bump if merged. It would be good to avoid that if possible.

This problem has already been fixed in the s3 plug-in, albeit in a different way. If you try using s3:// URLs instead of https://, you'll find that the amount of data requested is much smaller, and the time taken is similar to what you report for your solution.
