XSS via HTML entities in javascript: URLs #672

@JorianWoltjer

Describe the bug

Numeric character references in link destinations (e.g. &#115; for s) are decoded during link handling, so java&#115;cript: becomes javascript: in the rendered href, bypassing any check performed on the raw Markdown string.
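To illustrate why a check on the raw string is not enough, here is a minimal standalone sketch. It does not use flexmark-java: `EntityBypassDemo`, `decodeNumericEntities`, and `looksLikeJavascriptUrl` are hypothetical names made up for this example, and the decoder only approximates what a Markdown parser does with numeric character references.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of flexmark-java.
final class EntityBypassDemo {
    // Matches decimal (&#115;) and hexadecimal (&#x73;) numeric character references.
    private static final Pattern NUMERIC_ENTITY =
            Pattern.compile("&#(?:[xX]([0-9a-fA-F]{1,6})|([0-9]{1,7}));");

    // Decodes numeric character references only; named entities are left untouched.
    static String decodeNumericEntities(String s) {
        Matcher m = NUMERIC_ENTITY.matcher(s);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(s, last, m.start());
            int cp = (m.group(1) != null)
                    ? Integer.parseInt(m.group(1), 16)
                    : Integer.parseInt(m.group(2), 10);
            // Invalid or NUL code points become U+FFFD.
            if (cp != 0 && Character.isValidCodePoint(cp)) {
                out.appendCodePoint(cp);
            } else {
                out.append('\uFFFD');
            }
            last = m.end();
        }
        out.append(s, last, s.length());
        return out.toString();
    }

    // Naive filter of the kind that is bypassed when applied to the raw Markdown.
    static boolean looksLikeJavascriptUrl(String url) {
        return url.trim().toLowerCase(Locale.ROOT).startsWith("javascript:");
    }

    public static void main(String[] args) {
        String raw = "java&#115;cript:alert(origin)";
        String decoded = decodeNumericEntities(raw);
        System.out.println("raw flagged:     " + looksLikeJavascriptUrl(raw));
        System.out.println("decoded flagged: " + looksLikeJavascriptUrl(decoded));
    }
}
```

The same filter gives different answers before and after decoding, so any scheme check has to run on the value that actually ends up in the href.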

Component

  • Parser / link handling (URL normalization)
  • HtmlRenderer

Minimal repro (flexmark-java)

import com.vladsch.flexmark.html.HtmlRenderer;
import com.vladsch.flexmark.parser.Parser;
import com.vladsch.flexmark.util.ast.Document;

public class Repro {
    public static void main(String[] args) {
        Parser parser = Parser.builder().build();
        HtmlRenderer renderer = HtmlRenderer.builder().build();
        // The destination uses &#115; in place of "s" to hide the javascript: scheme.
        Document doc = parser.parse("[lnk](java&#115;cript:alert(origin))");
        System.out.println(renderer.render(doc));
        // <p><a href="javascript:alert(origin)">lnk</a></p>
    }
}

To Reproduce

Input

[lnk](java&#115;cript:alert(origin))

Options: Default Parser + HtmlRenderer (no extensions).

Expected behavior

The rendered href should not contain a live javascript: URL after entity decoding and normalization. Either strip unsafe schemes from the decoded destination, or reject destinations whose decoded form reveals one.
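One possible shape for the "strip unsafe schemes" option is an allowlist check applied to the already-decoded destination just before it is written to the href. The sketch below is an assumption about how a caller might do this, not flexmark-java's API: `HrefSchemeCheck`, `isSafeHref`, `sanitizeHref`, and the allowlist contents are hypothetical. It deliberately over-approximates browser behavior by dropping all ASCII control characters and spaces before extracting the scheme. It needs Java 9+ for `Set.of`.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; not part of flexmark-java.
final class HrefSchemeCheck {
    // Assumed allowlist; adjust to the schemes your application needs.
    private static final Set<String> ALLOWED_SCHEMES = Set.of("http", "https", "mailto");

    // Expects the URL AFTER entity decoding, i.e. the value that would reach the href.
    static boolean isSafeHref(String decodedUrl) {
        // Browsers ignore some whitespace and control characters in URLs,
        // so remove them before looking for the scheme (conservative).
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < decodedUrl.length(); i++) {
            char c = decodedUrl.charAt(i);
            if (c > 0x20 && c != 0x7F) {
                sb.append(c);
            }
        }
        String url = sb.toString();

        int colon = url.indexOf(':');
        int delim = firstPathDelimiter(url);
        // No colon, or a '/', '?' or '#' before the first colon: relative reference.
        if (colon < 0 || (delim >= 0 && delim < colon)) {
            return true;
        }
        String scheme = url.substring(0, colon).toLowerCase(Locale.ROOT);
        return ALLOWED_SCHEMES.contains(scheme);
    }

    // Replaces an unsafe destination with an inert fragment link.
    static String sanitizeHref(String decodedUrl) {
        return isSafeHref(decodedUrl) ? decodedUrl : "#";
    }

    // Index of the first '/', '?' or '#', or -1 if none is present.
    private static int firstPathDelimiter(String url) {
        for (int i = 0; i < url.length(); i++) {
            char c = url.charAt(i);
            if (c == '/' || c == '?' || c == '#') {
                return i;
            }
        }
        return -1;
    }
}
```

An allowlist is used rather than a blocklist of javascript:, because other schemes (data:, vbscript:) can be abused in a similar way and a blocklist is easy to leave incomplete.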

Resulting Output (fuzzer)

Input: [lnk](java&#115;cript:alert(origin))

Output:

<p><a href="javascript:alert(origin)">lnk</a></p>
