107 changes: 104 additions & 3 deletions doc/markup_reference/markdown.md
@@ -98,7 +98,9 @@ Use triple backticks with an optional language identifier:
end
```

Supported language for syntax highlighting: `ruby`, `rb` (alias to `ruby`), and `c`.
Supported languages for syntax highlighting: `ruby` (alias `rb`) is highlighted server-side;
`c` and `bash`/`sh`/`shell`/`console` are highlighted client-side with JavaScript.
Any other info string is accepted and added as a CSS class, but the block is not highlighted.
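
The mapping described above can be summarized as a small lookup. This is only a sketch for illustration: `HIGHLIGHT_MODE` and `highlight_mode` are hypothetical names, not part of RDoc's API, and the sketch assumes info strings are matched exactly as written.

```ruby
# Hypothetical summary of the info-string handling described above.
# Not RDoc's implementation; the constant and method names are invented.
HIGHLIGHT_MODE = {
  "ruby"    => :server,
  "rb"      => :server,
  "c"       => :client,
  "bash"    => :client,
  "sh"      => :client,
  "shell"   => :client,
  "console" => :client,
}.freeze

# Returns :server, :client, or :none for a fenced block's info string.
# Assumption: exact, case-sensitive matching.
def highlight_mode(info_string)
  HIGHLIGHT_MODE.fetch(info_string.to_s, :none)
end
```

Under these assumptions, `highlight_mode("python")` falls through to `:none`, which corresponds to "accepted as a CSS class but not highlighted".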

### Blockquotes

@@ -420,6 +422,9 @@ For example:
* [Link to Blockquotes](#blockquotes)
* [Link to Anchor Links](#anchor-links)

When multiple headings produce the same anchor, RDoc appends `-1`, `-2`, etc.
to subsequent duplicates, matching GitHub's behavior.
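
The numbering rule can be sketched as follows. This is a minimal illustration, not RDoc's implementation: `dedupe_anchors` is a hypothetical helper, and it assumes the anchors have already been derived from the heading text.

```ruby
# Hypothetical sketch of duplicate-anchor numbering; not RDoc's code.
# Input: anchors in document order. Output: anchors with -1, -2, ...
# appended to every repeat after the first occurrence.
def dedupe_anchors(anchors)
  seen = Hash.new(0)
  anchors.map do |anchor|
    count = seen[anchor]
    seen[anchor] += 1
    count.zero? ? anchor : "#{anchor}-#{count}"
  end
end

dedupe_anchors(%w[setup usage setup setup])
# => ["setup", "usage", "setup-1", "setup-2"]
```

The sketch does not handle a heading whose own anchor already happens to be `setup-1`; how RDoc resolves that case is not covered here.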

## Footnotes

### Reference Footnotes
@@ -535,7 +540,7 @@ See [rdoc.rdoc](rdoc.rdoc) for complete directive documentation.
| Headings | `= Heading` | `# Heading` |
| Bold | `*word*` | `**word**` |
| Italic | `_word_` | `*word*` |
| Monospace | `+word+` | `` `word` `` |
| Monospace | `+word+` or `` `word` `` | `` `word` `` |
| Links | `{text}[url]` | `[text](url)` |
| Code blocks | Indent beyond margin | Indent 4 spaces or fence |
| Block quotes | `>>>` | `>` |
@@ -551,8 +556,104 @@ See [rdoc.rdoc](rdoc.rdoc) for complete directive documentation.

3. **Footnotes are collapsed** - Multiple paragraphs in a footnote become a single paragraph.

4. **Syntax highlighting** - Only `ruby` and `c` are supported for fenced code blocks.
4. **Syntax highlighting** - Only `ruby`/`rb` (server-side) and `c`, `bash`/`sh`/`shell`/`console` (client-side) receive syntax highlighting. Other info strings are accepted but not highlighted.

5. **Fenced code blocks** - Only triple backticks are supported. Tilde fences (`~~~`) are not supported because they conflict with strikethrough syntax. Fences of four or more backticks (used for nesting) are also not supported.

6. **Auto-linking** - RDoc automatically links class and method names in output, even without explicit link syntax.

## Comparison with GitHub Flavored Markdown (GFM)

This section compares RDoc's Markdown implementation with the
[GitHub Flavored Markdown Spec](https://github.github.com/gfm/) (Version 0.29-gfm, 2019-04-06).

### Block Elements

| Feature | GFM | RDoc | Notes |
|---------|:---:|:----:|-------|
| ATX Headings (`#`) | ✅ | ✅ | Both support levels 1-6, optional closing `#` |
| Setext Headings | ✅ | ✅ | `=` for H1, `-` for H2 |
| Paragraphs | ✅ | ✅ | Full match |
| Indented Code Blocks | ✅ | ✅ | 4 spaces or 1 tab |
| Fenced Code (backticks) | ✅ 3+ | ⚠️ 3 only | RDoc doesn't support 4+ backticks for nesting |
| Fenced Code (tildes) | ✅ `~~~` | ❌ | Conflicts with strikethrough syntax |
| Info strings (language) | ✅ any | ⚠️ limited | `ruby`/`rb`, `c`, and `bash`/`sh`/`shell`/`console` highlighted; others accepted as CSS class |
| Blockquotes | ✅ | ✅ | Full match, nested supported |
| Lazy Continuation | ✅ | ⚠️ | Continuation text is included in the blockquote, but the line break is lost (it becomes a space) |
| Bullet Lists | ✅ | ✅ | `*`, `+`, `-` supported |
| Ordered Lists | ✅ `.` `)` | ⚠️ `.` only | RDoc doesn't support `)` delimiter; numbers are always renumbered from 1 |
| Nested Lists | ✅ | ✅ | 4-space indentation |
| Tables | ✅ | ✅ | Full alignment support |
| Thematic Breaks | ✅ | ✅ | `---`, `***`, `___` |
| HTML Blocks | ✅ 7 types | ⚠️ | See below |

#### HTML Blocks

GFM defines 7 types of HTML blocks:

| Type | Description | GFM | RDoc | Notes |
|------|-------------|:---:|:----:|-------|
| 1 | `<script>`, `<pre>` | ✅ | ✅ | |
| 1 | `<style>` | ✅ | ❌ | Available via `css` extension (disabled by default) |
| 2 | HTML comments `<!-- -->` | ✅ | ✅ | |
| 3 | Processing instructions `<? ?>` | ✅ | ❌ | |
| 4 | Declarations `<!DOCTYPE>` | ✅ | ❌ | |
| 5 | CDATA `<![CDATA[ ]]>` | ✅ | ❌ | |
| 6 | Block-level tags | ✅ | ⚠️ | |
| 7 | Any complete open/close tag | ✅ | ❌ | |

RDoc uses a whitelist of block-level tags defined in
[lib/rdoc/markdown.kpeg](https://github.com/ruby/rdoc/blob/master/lib/rdoc/markdown.kpeg)
(see `HtmlBlockInTags`). HTML5 semantic elements like `<article>`, `<section>`,
`<nav>`, `<header>`, `<footer>` are not supported.

### Inline Elements

| Feature | GFM | RDoc | Notes |
|---------|:---:|:----:|-------|
| Emphasis `*text*` `_text_` | ✅ | ⚠️ | Intraword emphasis not supported (see [Notes](#notes-and-limitations)) |
| Strong `**text**` `__text__` | ✅ | ✅ | Full match |
| Combined `***text***` | ✅ | ✅ | Full match |
| Code spans | ✅ | ✅ | Multiple backticks supported |
| Inline links | ✅ | ✅ | Full match |
| Reference links | ✅ | ✅ | Full match |
| Link titles | ✅ | ⚠️ | Parsed but not rendered |
| Images | ✅ | ✅ | Full match |
| Autolinks `<url>` | ✅ | ✅ | Full match |
| Hard line breaks | ✅ | ⚠️ | 2+ trailing spaces only; backslash `\` at EOL not supported |
| Backslash escapes | ✅ | ⚠️ | Subset of GFM's escapable characters (e.g., `~` not escapable) |
| HTML entities | ✅ | ✅ | Named, decimal, hex |
| Inline HTML | ✅ | ⚠️ | `<b>` converted to `<strong>`, `<i>` to `<em>`; `<strong>` itself is escaped |

### GFM Extensions

| Feature | GFM | RDoc | Notes |
|---------|:---:|:----:|-------|
| Strikethrough `~~text~~` | ✅ | ✅ | Full match |
| Task Lists `[ ]` `[x]` | ✅ | ❌ | Not supported |
| Extended Autolinks | ✅ | ⚠️ | See below |
| Disallowed Raw HTML | ✅ | ❌ | No security filtering |

#### GFM Extended Autolinks

GFM automatically converts certain text patterns into links without requiring
angle brackets (`<>`). RDoc also auto-links URLs and `www.` prefixes through
its cross-reference system, but the behavior differs from GFM.

GFM recognizes these patterns:

- `www.example.com` — text starting with `www.` followed by a valid domain
- `https://example.com` — URLs starting with `http://` or `https://`
- `user@example.com` — valid email addresses

RDoc auto-links `www.` prefixes and `http://`/`https://` URLs much as GFM does.
However, bare email addresses such as `user@example.com` are not auto-linked;
write `<user@example.com>` instead.
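
The rule as stated can be expressed as a small predicate. This is a sketch of the documented behavior only: `bare_autolink_in_rdoc?` is a hypothetical name, the patterns are deliberately loose (they do not validate domains), and RDoc's actual matching lives in its cross-reference system.

```ruby
# Hypothetical predicate mirroring the prose above; not RDoc's code.
# True for bare text that the document says RDoc auto-links
# (http://, https://, or a www. prefix); false for bare email addresses.
def bare_autolink_in_rdoc?(text)
  text.match?(%r{\Ahttps?://\S+\z}) || text.match?(/\Awww\.\S+\z/)
end

bare_autolink_in_rdoc?("www.example.com")      # => true
bare_autolink_in_rdoc?("https://example.com")  # => true
bare_autolink_in_rdoc?("user@example.com")     # => false
```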

### RDoc-Specific Features (not in GFM)

- [Definition Lists](#definition-lists)
- [Footnotes](#footnotes)
- [Cross-references](#cross-references)
- [Anchor Links](#anchor-links)
- [Directives](#directives)
81 changes: 81 additions & 0 deletions test/rdoc/rdoc_markdown_test.rb
@@ -1411,4 +1411,85 @@ def parse(text)
@parser.parse text
end

def render(markdown_source)
@to_html.convert(parse(markdown_source))
end

def test_atx_heading_closing_hashes_stripped
html = render("## Heading ##\n")
assert_match(%r{<h2.*>.*Heading.*</h2>}, html)
assert_not_match(/##/, html.gsub(/<[^>]+>/, "").strip)
end

def test_fenced_code_4_backticks_not_supported
html = render("````\ncode\n````\n")
assert_not_match(%r{<pre>code\n</pre>}, html)
end

def test_tilde_is_strikethrough_not_fence
html = render("~~~\ncode\n~~~\n")
assert_not_match(%r{<pre>code\n</pre>}, html)

html = render("~~strike~~\n")
assert_match(%r{<del>strike</del>}, html)
end

def test_info_string_css_classes
assert_match(/class="ruby"/, render("```rb\ndef hello; end\n```\n"))
assert_match(/class="c"/, render("```c\nint main() {}\n```\n"))
assert_match(/class="bash"/, render("```bash\necho hello\n```\n"))
assert_match(/class="python"/, render("```python\nprint('hi')\n```\n"))
end

def test_lazy_continuation_in_blockquote
html = render("> Foo\nBar\n")
assert_match(%r{<blockquote>.*Foo.*Bar.*</blockquote>}m, html)
assert_match(%r{Foo Bar}, html)
end

def test_ordered_list_paren_delimiter_not_supported
html = render("1) first\n2) second\n")
assert_not_match(%r{<ol>}, html)
end

def test_style_block_not_supported
html = render("<style>body { color: red; }</style>\n")
assert_not_match(%r{<style>}, html)
end

def test_inline_html_tag_conversion
assert_match(%r{<strong>bold</strong>}, render("This has <b>bold</b> HTML.\n"))
assert_match(%r{<em>emphasized</em>}, render("This has <em>emphasized</em> HTML.\n"))

html = render("This has <strong>bold</strong> HTML.\n")
assert_match(/&lt;strong&gt;/, html)
end

def test_link_title_not_rendered
html = render('[text](https://example.com "My Title")' + "\n")
assert_match(%r{<a href="https://example.com">text</a>}, html)
assert_not_match(/My Title/, html)
end

def test_task_list_not_supported
html = render("- [ ] unchecked\n- [x] checked\n")
assert_not_match(%r{<input}, html)
end

def test_autolinks
assert_match(%r{<a href.*www\.example\.com}, render("Visit www.example.com for help.\n"))
assert_match(%r{<a href="https://example\.com"}, render("Visit https://example.com for help.\n"))
assert_not_match(%r{<a href="mailto:user@example\.com"}, render("Contact user@example.com for help.\n"))
end

def test_backslash_line_break_not_supported
html = render("Line one\\\nLine two\n")
assert_not_match(%r{<br>}, html)
end

def test_escape_tilde_not_supported
html = render("\\~not escaped\n")
assert_match(/\\~/, html)
end

end