
attributed_text: Generalize attributed text storage and segment ranges #624

Merged
waywardmonkeys merged 4 commits into linebender:main from waywardmonkeys:attributed-text-storage-chunks on May 25, 2026

Conversation

@waywardmonkeys
Contributor

This updates attributed_text so attributed content can be backed by non-contiguous text storage while keeping range validation explicit and reusable.

Goals:

  • Support chunk-readable text storage for rope-like or sparse storage backends.
  • Store validated TextRange values in AttributedText.
  • Have attribute segment iteration yield validated ranges instead of raw byte ranges.
  • Provide a combined segment view for callers that want the range and active spans together.
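
To make the last two goals concrete, here is a minimal sketch of what a combined segment view could look like: each segment carries its byte range together with the spans active over it. The `Segment` type and the `segments` function are hypothetical and use plain `Range<usize>` values. The crate's `AttributeSegments` is a sweep-line iterator over validated `TextRange` values, whereas this sketch simply splits at every span boundary and rescans the spans for each piece.

```rust
use core::ops::Range;

/// Hypothetical combined view: a byte range plus the indices of the spans
/// that cover it. Not the crate's `AttributeSegment` type.
#[derive(Debug, PartialEq, Eq)]
pub struct Segment {
    pub range: Range<usize>,
    pub active: Vec<usize>,
}

/// Splits `0..len` at every span boundary and reports, for each piece,
/// which spans fully cover it. Assumes every span lies within `0..len`.
pub fn segments(len: usize, spans: &[Range<usize>]) -> Vec<Segment> {
    // Collect every position where the set of active spans can change.
    let mut cuts: Vec<usize> = vec![0, len];
    for span in spans {
        cuts.push(span.start);
        cuts.push(span.end);
    }
    cuts.sort_unstable();
    cuts.dedup();

    // Each pair of adjacent cut points bounds one segment.
    cuts.windows(2)
        .map(|w| {
            let (start, end) = (w[0], w[1]);
            let active: Vec<usize> = spans
                .iter()
                .enumerate()
                .filter(|(_, s)| s.start <= start && end <= s.end)
                .map(|(i, _)| i)
                .collect();
            Segment { range: start..end, active }
        })
        .collect()
}
```

With spans `0..5` and `3..8` over ten bytes, this yields four segments, and the middle segment `3..5` reports both spans as active.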

Non-goals:

  • This does not add a concrete rope implementation.
  • This does not resolve attribute conflicts or add style cascade behavior.
  • This does not change downstream Parley layout behavior.

Concepts

  • TextStorage: the storage abstraction for text bytes and UTF-8 boundary checks.
  • TextChunk: a borrowed contiguous slice from storage, with its global TextRange.
  • TextRange: a validated byte range whose endpoints are in bounds and on UTF-8 boundaries.
  • AttributeSegments: the sweep-line segment iterator over applied attribute spans.
  • ActiveSpans: the active attribute spans for the most recently yielded segment.
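
As an illustration of the `TextRange` invariant described above, the following sketch validates a byte range against a plain `&str`. It is a simplified stand-in, not the crate's API: the constructor signature and the `Option` return type are assumptions, and the real type validates against a `TextStorage` rather than a string slice.

```rust
use core::ops::Range;

/// Hypothetical stand-in for the crate's `TextRange`: a byte range that has
/// already been checked against some text.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct TextRange {
    start: usize,
    end: usize,
}

impl TextRange {
    /// Returns `Some` only if the range is ordered, in bounds, and both
    /// endpoints fall on UTF-8 character boundaries of `text`.
    pub fn new(text: &str, range: Range<usize>) -> Option<Self> {
        let ok = range.start <= range.end
            && range.end <= text.len()
            && text.is_char_boundary(range.start)
            && text.is_char_boundary(range.end);
        ok.then_some(Self {
            start: range.start,
            end: range.end,
        })
    }

    /// Start of the range, in bytes.
    pub fn start(&self) -> usize {
        self.start
    }

    /// End of the range (exclusive), in bytes.
    pub fn end(&self) -> usize {
        self.end
    }
}
```

For the text `"héllo"`, where `é` occupies bytes 1 and 2, the range `1..3` is accepted while `1..2` is rejected because it ends inside a character.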

Changes

  • Add TextChunk and TextStorage::chunks for reading validated ranges from contiguous or chunked text storage.
  • Add TextStorage::as_str, returning Some(&str) only for contiguous storage.
  • Change AttributedText::as_str to return Option<&str>.
  • Store attribute ranges internally as TextRange.
  • Change attribute query and segment APIs to expose TextRange values.
  • Add AttributeSegments::next_segment and AttributeSegment for callers that want range + active spans as one borrowed view.
  • Ensure ActiveSpansIter reports an exact size_hint, matching its ExactSizeIterator implementation.
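
The `as_str` change means callers can no longer assume a single `&str` is available. A rough sketch of how a caller might cope is shown below: borrow when the storage is contiguous, and otherwise assemble the text from chunks. The `Storage` trait, the `Contiguous` and `Chunked` types, and the `text_of` helper are all hypothetical simplifications; in particular, the crate's `TextStorage::chunks` takes a `TextRange` and yields `TextChunk` values rather than returning a `Vec<&str>`.

```rust
use std::borrow::Cow;

/// Hypothetical, simplified storage trait; the crate's `TextStorage` differs.
pub trait Storage {
    /// `Some` only when the whole text is one contiguous slice.
    fn as_str(&self) -> Option<&str>;
    /// The contiguous pieces of the text, in order.
    fn chunks(&self) -> Vec<&str>;
}

/// Text held in a single allocation.
pub struct Contiguous(pub String);

/// Text held as separate pieces, as a rope-like backend might do.
pub struct Chunked(pub Vec<String>);

impl Storage for Contiguous {
    fn as_str(&self) -> Option<&str> {
        Some(self.0.as_str())
    }
    fn chunks(&self) -> Vec<&str> {
        vec![self.0.as_str()]
    }
}

impl Storage for Chunked {
    fn as_str(&self) -> Option<&str> {
        None // No single slice covers the whole text.
    }
    fn chunks(&self) -> Vec<&str> {
        self.0.iter().map(String::as_str).collect()
    }
}

/// Borrows the text when possible and copies it together otherwise.
pub fn text_of<S: Storage>(storage: &S) -> Cow<'_, str> {
    match storage.as_str() {
        Some(s) => Cow::Borrowed(s),
        None => Cow::Owned(storage.chunks().concat()),
    }
}
```

Code that only needs to read the text piece by piece can skip the copy entirely and iterate over the chunks instead.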

@waywardmonkeys requested review from taj-p and tomcur on May 22, 2026 16:45
@waywardmonkeys
Contributor Author

This was done with the assistance of Codex (GPT 5.5, xhigh).

@waywardmonkeys
Contributor Author

Perhaps also GPT 5.4 as I've had this around for a while.

@waywardmonkeys enabled auto-merge on May 23, 2026 03:50
@waywardmonkeys force-pushed the attributed-text-storage-chunks branch from 159b4f8 to e9af936 on May 23, 2026 17:06
Member

@tomcur left a comment

The new TextChunk is nice for non-contiguously stored text! The docs for TextStorage::chunks and TextStorage::as_str should probably link to each other for discoverability. I've not gone through everything with a fine-toothed comb, but the main changes look good.

TextStorage is no longer dyn-compatible, but that probably does not matter. Orthogonal to this PR, it may be worth deciding how the crate should handle invalid TextRanges throughout (panic, clamp, or something else?).

TextRange::new(self, range)
}

/// Iterates over borrowed chunks covering `range`.

Perhaps emphasize chunks are contiguous and non-empty.

Suggested change
- /// Iterates over borrowed chunks covering `range`.
+ /// Iterates over contiguous, non-empty chunks of text covering `range`.

Comment on lines +222 to +260
        ChunkedTextChunks {
            chunks: self.chunks.iter(),
            chunk_start: 0,
            range,
        }
    }
}

#[derive(Clone, Debug)]
struct ChunkedTextChunks<'a> {
    chunks: core::slice::Iter<'a, &'static str>,
    chunk_start: usize,
    range: TextRange,
}

impl<'a> Iterator for ChunkedTextChunks<'a> {
    type Item = TextChunk<'a>;

    fn next(&mut self) -> Option<Self::Item> {
        for chunk in self.chunks.by_ref() {
            let chunk = *chunk;
            let chunk_start = self.chunk_start;
            let chunk_end = chunk_start + chunk.len();
            self.chunk_start = chunk_end;

            let start = self.range.start().max(chunk_start);
            let end = self.range.end().min(chunk_end);
            if start < end {
                let local_start = start - chunk_start;
                let local_end = end - chunk_start;
                return Some(TextChunk::new(
                    TextRange::new_unchecked(start, end),
                    &chunk[local_start..local_end],
                ));
            }
        }

        None
    }

The following perhaps reads a bit more clearly.

Suggested change
        let mut chunk_start = 0;
        self.chunks.iter().filter_map(move |&chunk| {
            let offset = chunk_start;
            let start = range.start().max(chunk_start);
            chunk_start += chunk.len();
            let end = range.end().min(chunk_start);
            (start < end).then(|| {
                let text = &chunk[start - offset..end - offset];
                TextChunk::new(TextRange::new_unchecked(start, end), text)
            })
        })
    }
}
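
For readers who want to try the suggested `filter_map` form in isolation, here is a self-contained sketch. The `TextRange`, `TextChunk`, and `ChunkedText` definitions are minimal stand-ins written for this example (their fields, accessors, and the `impl Iterator` return type are assumptions), so only the body of `chunks` mirrors the suggestion above.

```rust
/// Minimal stand-in for the crate's validated range type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct TextRange {
    start: usize,
    end: usize,
}

impl TextRange {
    /// Builds a range without validation; callers must uphold the invariants.
    pub fn new_unchecked(start: usize, end: usize) -> Self {
        Self { start, end }
    }
    pub fn start(&self) -> usize {
        self.start
    }
    pub fn end(&self) -> usize {
        self.end
    }
}

/// Minimal stand-in for a borrowed chunk with its global range.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct TextChunk<'a> {
    range: TextRange,
    text: &'a str,
}

impl<'a> TextChunk<'a> {
    pub fn new(range: TextRange, text: &'a str) -> Self {
        Self { range, text }
    }
    pub fn range(&self) -> TextRange {
        self.range
    }
    pub fn as_str(&self) -> &'a str {
        self.text
    }
}

/// Hypothetical chunked storage used only for this example.
pub struct ChunkedText {
    chunks: Vec<&'static str>,
}

impl ChunkedText {
    pub fn new(chunks: Vec<&'static str>) -> Self {
        Self { chunks }
    }

    /// Yields the non-empty intersection of each stored chunk with `range`.
    pub fn chunks(&self, range: TextRange) -> impl Iterator<Item = TextChunk<'_>> {
        // Global byte offset of the chunk about to be visited.
        let mut chunk_start = 0;
        self.chunks.iter().filter_map(move |&chunk| {
            let offset = chunk_start;
            let start = range.start().max(chunk_start);
            chunk_start += chunk.len();
            let end = range.end().min(chunk_start);
            // Skip chunks that do not overlap the requested range.
            (start < end).then(|| {
                let text = &chunk[start - offset..end - offset];
                TextChunk::new(TextRange::new_unchecked(start, end), text)
            })
        })
    }
}
```

Like the hand-written iterator it replaces, this version keeps visiting stored chunks after the end of the range has been passed; stopping early would be a separate change.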

@waywardmonkeys added this pull request to the merge queue on May 25, 2026
Merged via the queue into linebender:main with commit 3c9f6b3 on May 25, 2026
47 of 48 checks passed
@waywardmonkeys deleted the attributed-text-storage-chunks branch on May 25, 2026 12:33
nicoburns pushed a commit to nicoburns/parley that referenced this pull request May 26, 2026