‹› markdown.preprocessors

Preprocessors work on source text before it is broken down into its individual parts. This is an excellent place to clean up bad characters or to extract portions for later processing that the parser may otherwise choke on.

‹› markdown.preprocessors.build_preprocessors(md: Markdown, **kwargs: Any) -> util.Registry[Preprocessor]

Build and return the default set of preprocessors used by Markdown.

Return a Registry instance which contains the following collection of classes with their assigned names and priorities.

| Class Instance | Name | Priority |
| --- | --- | --- |
| NormalizeWhitespace | normalize_whitespace | 30 |
| HtmlBlockPreprocessor | html_block | 20 |
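
As a minimal sketch of what the returned registry looks like, the snippet below builds the defaults for a fresh `Markdown` instance and inspects them. It assumes Python-Markdown 3.x is installed and that `Registry` supports lookup by name and iteration in priority order (highest first); the variable names are illustrative only.

```python
import markdown
from markdown.preprocessors import (
    HtmlBlockPreprocessor,
    NormalizeWhitespace,
    build_preprocessors,
)

md = markdown.Markdown()

# Build a fresh registry holding the default preprocessors for this instance.
registry = build_preprocessors(md)

# Items can be looked up by their registered name.
normalizer = registry["normalize_whitespace"]
html_block = registry["html_block"]

# Iterating the registry yields items from highest to lowest priority.
run_order = [type(item).__name__ for item in registry]
print(run_order)
```

The same registry is normally reachable as `md.preprocessors`, so calling `build_preprocessors` directly is mostly useful for inspection or testing.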

‹› markdown.preprocessors.Preprocessor(md: Markdown | None = None)

Bases: Processor

Preprocessors are run after the text is broken into lines.

Each preprocessor implements a run method that takes the document as a list of lines, modifies it as necessary, and returns either the same list or a new one.

Preprocessors must extend Preprocessor.

‹› markdown.preprocessors.Preprocessor.run(lines: list[str]) -> list[str]

Each subclass of Preprocessor should override the run method, which takes the document as a list of strings split by newlines and returns the (possibly modified) list of lines.
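
The following is a hedged sketch of a custom preprocessor, not part of the library itself. `StripCommentLines`, `StripCommentExtension`, the `%%` marker, and the priority value `25` are all hypothetical choices made for illustration; the priority is picked so the preprocessor runs between the two defaults listed above.

```python
import markdown
from markdown.extensions import Extension
from markdown.preprocessors import Preprocessor


class StripCommentLines(Preprocessor):
    """Hypothetical preprocessor: drop every line that starts with '%%'."""

    def run(self, lines):
        # Return a new list; returning the same (mutated) list is also allowed.
        return [line for line in lines if not line.startswith("%%")]


class StripCommentExtension(Extension):
    """Hypothetical extension that registers the preprocessor."""

    def extendMarkdown(self, md):
        # Priority 25 places it after normalize_whitespace (30)
        # and before html_block (20).
        md.preprocessors.register(StripCommentLines(md), "strip_comment_lines", 25)


html = markdown.markdown(
    "Hello\n%% hidden note\nWorld",
    extensions=[StripCommentExtension()],
)
print(html)
```

Because the preprocessor sees the whole document as a list of lines, filtering or rewriting lines here keeps the later block and inline parsers unaware that the removed text ever existed.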

‹› markdown.preprocessors.NormalizeWhitespace(md: Markdown | None = None)

Bases: Preprocessor

Normalize whitespace for consistent parsing.
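
To give a rough idea of what normalization means in practice, the sketch below calls `run` directly on a few lines. The specific effects noted in the comments (tab expansion, carriage-return removal, blanking of whitespace-only lines) reflect the behaviour of recent Python-Markdown 3.x releases and are an assumption here, since the description above does not spell them out.

```python
import markdown
from markdown.preprocessors import NormalizeWhitespace

# A Markdown instance is needed because the preprocessor reads its settings
# (for example the tab length) from it.
md = markdown.Markdown()
normalizer = NormalizeWhitespace(md)

raw_lines = ["a\tb", "   ", "c\r"]
clean_lines = normalizer.run(raw_lines)

# Assumed effects: tabs become spaces, stray carriage returns disappear,
# and whitespace-only lines become empty.
print(clean_lines)
```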

‹› markdown.preprocessors.HtmlBlockPreprocessor(md: Markdown | None = None)

Bases: Preprocessor

Remove HTML blocks from the text and store them for later retrieval.

The raw HTML is stored in the htmlStash of the Markdown instance.
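
A small sketch of the stash in action follows. It runs the registered `html_block` preprocessor by hand and then looks at `md.htmlStash.rawHtmlBlocks`; that attribute name is taken from Python-Markdown 3.x's `HtmlStash` class and should be treated as an assumption if you are on a different version.

```python
import markdown

md = markdown.Markdown()

# Fetch the default HTML block preprocessor by its registered name.
html_block = md.preprocessors["html_block"]

source_lines = ["<div>raw</div>", "", "A paragraph."]
result_lines = html_block.run(source_lines)

# The raw HTML has been replaced by a placeholder in the returned lines...
print(result_lines)

# ...and the original markup now lives in the instance's stash.
stashed = list(md.htmlStash.rawHtmlBlocks)
print(stashed)
```

The placeholders are swapped back for the stored HTML by a later stage of the conversion, which is why the block parser never has to interpret the raw markup.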