Move html character converting mechanism to regexp-handling #1570
h[encoding] = {
  :close_dquote => encode_fallback('”', encoding, '"'),
  :close_squote => encode_fallback('’', encoding, '\''),
  :copyright => encode_fallback('©', encoding, '(c)'),
I think the HTML character should be the entity `&copy;`, not the literal ©, but that's a separate issue.
🚀 Preview deployment available at: https://0733bf1b.rdoc-6cd.pages.dev (commit: d8f23d9)
    @insquotes = true
  end
end
TO_HTML_CHARACTERS[quote.encoding][type] if type
Could `TO_HTML_CHARACTERS[quote.encoding]` ever be nil and cause a NoMethodError?
`TO_HTML_CHARACTERS = Hash.new do |h, encoding| ... end` generates a new hash whenever the key is missing, so the lookup never returns nil.
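The default-block behavior described in this reply can be sketched as follows. This is a minimal illustration, not the PR's actual code: the constant name `CHARACTERS`, the reduced set of keys, and the body of `encode_fallback` are assumptions pieced together from the diff fragments visible in this conversation.

```ruby
# Assumed stand-in for the helper seen in the diff: encode +character+ into
# +encoding+, substituting +fallback+ when the character has no representation.
def encode_fallback(character, encoding, fallback)
  character.encode(encoding, fallback: { character => fallback },
                   undef: :replace, replace: fallback)
end

# Hypothetical table. The block runs on the first lookup of a missing key and
# stores the result in the hash, so CHARACTERS[encoding] is never nil.
CHARACTERS = Hash.new do |h, encoding|
  h[encoding] = {
    close_dquote: encode_fallback("\u201D", encoding, '"'),
    copyright:    encode_fallback("\u00A9", encoding, '(c)'),
  }
end

utf8_table  = CHARACTERS[Encoding::UTF_8]    # built on first access
ascii_table = CHARACTERS[Encoding::US_ASCII] # falls back to ASCII spellings
```

Because the block assigns into `h` before returning, a second lookup for the same encoding reuses the stored hash instead of rebuilding it.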
Force-pushed from 6eb654a to d64c2b8.
Pull request overview
This PR moves “smart punctuation” (quotes/dashes/ellipsis/(c)/(r) conversions) from RDoc::Text’s HTML post-processing into RDoc::Markup::ToHtml’s inline regexp-handling so conversions apply only to plain-text nodes (and avoid rewriting already-generated HTML).
Changes:
- Removed `RDoc::Text#to_html_characters` (and related helpers) and stopped post-processing rendered HTML.
- Added regexp-based handling in `RDoc::Markup::ToHtml` for HTML-character aliases and quote conversion, scoped to inline/plain-text processing.
- Updated snippet and crossref behaviors/tests to reflect the new conversion location and boundaries.
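To illustrate why scoping the conversion to plain-text nodes matters, here is a small standalone sketch. It does not use RDoc's classes: the `[type, content]` node representation, the `REPLACEMENTS` table, and the method names are all hypothetical, chosen only to contrast the two approaches.

```ruby
# Hypothetical replacement table; the formatter's real table is not shown here.
REPLACEMENTS = {
  '(c)' => "\u00A9", # copyright sign
  '(r)' => "\u00AE", # registered sign
  '...' => "\u2026", # ellipsis
  '--'  => "\u2013", # en dash
}.freeze
PATTERN = Regexp.union(REPLACEMENTS.keys)

# Replace every ASCII alias in +text+ with its typographic character.
def convert_text(text)
  text.gsub(PATTERN, REPLACEMENTS)
end

# Nodes are [type, content] pairs. Only :text nodes are converted;
# already-generated :html is passed through untouched.
def render(nodes)
  nodes.map do |type, content|
    type == :text ? convert_text(content) : content
  end.join
end

nodes = [
  [:text, 'Copyright (c) 2024 -- see '],
  [:html, '<a href="page--1.html">'],
  [:text, 'docs...'],
  [:html, '</a>'],
]

scoped        = render(nodes)                         # href stays intact
postprocessed = convert_text(nodes.map(&:last).join)  # href gets rewritten
```

In the post-processed variant the `--` inside the `href` attribute is rewritten along with the visible text, which is the kind of damage the old approach had to guard against.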
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| lib/rdoc/text.rb | Removes the old HTML post-processing conversion mechanism from RDoc::Text. |
| lib/rdoc/markup/to_html.rb | Introduces encoding-aware character mapping + regexp-handling and quote conversion in the HTML formatter. |
| lib/rdoc/markup/to_html_snippet.rb | Aligns snippet truncation ellipsis and HTML conversion with the new formatter behavior. |
| test/rdoc/rdoc_text_test.rb | Removes tests that were specific to the deleted RDoc::Text conversion API. |
| test/rdoc/markup/to_html_test.rb | Adds coverage for the new formatter-level conversion behavior and encoding fallback. |
| test/rdoc/markup/to_html_snippet_test.rb | Updates expected truncation output to use the new ellipsis behavior. |
| test/rdoc/markup/to_html_crossref_test.rb | Updates expectations where generated crossref text is no longer post-processed into smart quotes. |
Comments suppressed due to low confidence (1)
lib/rdoc/text.rb:172
`RDoc::Text#to_html`/`#to_html_characters` were removed here. If any external callers still depend on these helpers (they previously handled quote/dash/ellipsis substitutions post-render), this is a breaking API change. Consider leaving a compatibility method that forwards to the new regexp-handling implementation (or raising a clearer error/deprecation) to avoid silent behavioral differences across versions.
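One possible shape for such a compatibility method is sketched below. Everything here is hypothetical: `LegacyTextHelpers`, `convert_plain_text`, and the warning text are illustrative names, not part of RDoc, and the forwarding target only stands in for whatever the new implementation exposes.

```ruby
# Hypothetical module; not part of RDoc.
module LegacyTextHelpers
  # Stand-in for the new conversion code (assumed, greatly simplified).
  def convert_plain_text(text)
    text.gsub('(c)', "\u00A9")
  end

  # Deprecated entry point kept so existing callers keep working.
  # Kernel#warn with uplevel: 1 points the message at the caller's line.
  def to_html_characters(text)
    warn 'to_html_characters is deprecated; conversion now happens in the formatter',
         uplevel: 1
    convert_plain_text(text)
  end
end

class LegacyCaller
  include LegacyTextHelpers
end

result = LegacyCaller.new.to_html_characters('(c) 2024')
```

The trade-off is that a shim like this keeps old callers running but can only approximate the previous post-render behavior, so a warning that names the replacement is arguably the more important part.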
##
# Wraps +txt+ to +line_len+
def wrap(txt, line_len = 76)
res = []
| :undef => :replace, :replace => fallback) | ||
| end | ||
|
|
||
| ## |
`Text#to_html_characters` was a post-processing step that converted ASCII quotes and marks to multibyte characters. Post-processing HTML to do that is not a good idea; converting plain-text nodes is better.
Force-pushed from d64c2b8 to d8f23d9.
`Text#to_html_characters` was a post-processing step that converted ASCII quotes and marks to multibyte characters. Post-processing HTML to do that is not a good idea; converting plain-text nodes with regexp-handling is better.