markdown.htmlparser
This module imports a copy of html.parser.HTMLParser and modifies it heavily through monkey-patches.
A copy is imported, rather than the module itself, so that users can still import and use the unmodified library for their own needs.
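The copy-then-patch idea can be sketched with the standard importlib machinery. This is a minimal illustration, not necessarily the module's exact code: the name htmlparser_copy and the specific regex assigned to piclose are assumptions chosen for the example.

```python
import html.parser
import importlib.util
import re

# Build a second, independent module object from the same source file.
# It is not registered in sys.modules, so "import html.parser" elsewhere
# still returns the original module.
spec = importlib.util.find_spec("html.parser")
htmlparser_copy = importlib.util.module_from_spec(spec)
spec.loader.exec_module(htmlparser_copy)

# Monkey-patch only the copy. This particular regex is illustrative;
# the real module's patches may differ.
htmlparser_copy.piclose = re.compile(r"\?>")

# The copy and the original are distinct objects.
print(htmlparser_copy is html.parser)  # False
```

Because the classes in the copy look up module-level names in the copy's own namespace, the patch affects only parsers built from htmlparser_copy.HTMLParser.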
markdown.htmlparser.HTMLExtractor(md: Markdown, *args, **kwargs)
Bases: HTMLParser
Extract raw HTML from text.
The raw HTML is stored in the htmlStash of the Markdown instance passed to md, and the remaining text is stored in cleandoc as a list of strings.
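To illustrate the stash-and-placeholder pattern described here, the following is a deliberately simplified stand-in built on the standard library parser. TinyExtractor, its stash list, and the [html:N] placeholder format are all hypothetical; the real class stores raw HTML in the Markdown instance's htmlStash and handles many cases this sketch ignores (void elements such as br, stray end tags, comments, entities inside raw blocks).

```python
from html.parser import HTMLParser


class TinyExtractor(HTMLParser):
    """Hypothetical, much-simplified stand-in for HTMLExtractor."""

    def __init__(self):
        super().__init__()
        self.stash = []     # plays the role of the Markdown instance's htmlStash
        self.cleandoc = []  # remaining text, as a list of strings
        self._buffer = []   # raw HTML of the element currently being read
        self._depth = 0     # nesting depth of open elements

    def handle_starttag(self, tag, attrs):
        self._depth += 1
        self._buffer.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        self._buffer.append(f"</{tag}>")
        self._depth -= 1
        if self._depth == 0:
            # Outermost element closed: stash it and leave a placeholder.
            self.stash.append("".join(self._buffer))
            self.cleandoc.append(f"[html:{len(self.stash) - 1}]")
            self._buffer = []

    def handle_data(self, data):
        # Text inside an element is raw HTML; text outside stays in the document.
        (self._buffer if self._depth else self.cleandoc).append(data)


parser = TinyExtractor()
parser.feed('Intro\n\n<div class="note"><p>raw</p></div>\n\nOutro')
parser.close()
print("".join(parser.cleandoc))
```

After parsing, the stash holds the whole div element as one string, and the joined cleandoc contains the surrounding text with a placeholder where the element used to be.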
markdown.htmlparser.HTMLExtractor.line_offset: int
property
Returns the character index in self.rawdata of the start of the current line.
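As a rough sketch of what such a property computes, the standalone function below finds the index at which a given 1-based line begins. The function name and signature are assumptions for illustration; the real property reads the line number and raw data from the parser instance and may be implemented differently.

```python
def line_start_index(rawdata: str, lineno: int) -> int:
    """Return the index in rawdata where 1-based line `lineno` starts."""
    index = 0
    # Skip past (lineno - 1) newline characters.
    for _ in range(lineno - 1):
        newline = rawdata.find("\n", index)
        if newline == -1:
            break  # fewer lines than requested; stay on the last line found
        index = newline + 1
    return index


rawdata = "ab\ncd\nef"
print(line_start_index(rawdata, 1))  # 0
print(line_start_index(rawdata, 2))  # 3
print(line_start_index(rawdata, 3))  # 6
```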
markdown.htmlparser.HTMLExtractor.at_line_start() -> bool
Returns True if the current position is at the start of a line.
Allows for up to three blank spaces at the start of the line.
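The check can be sketched as a standalone function. The signature here is an assumption: line_offset is the index where the current line starts, and offset is the column of the current position within that line (the standard parser tracks a column offset of this kind). The real method reads these values from the parser instance.

```python
def at_line_start(rawdata: str, line_offset: int, offset: int) -> bool:
    """Return True if column `offset` is preceded only by up to 3 whitespace chars."""
    if offset == 0:
        return True
    if offset > 3:
        return False
    # Everything between the line start and the current position must be blank.
    return rawdata[line_offset:line_offset + offset].strip() == ""


print(at_line_start("   <div>", 0, 3))   # True: three leading spaces are allowed
print(at_line_start("    <div>", 0, 4))  # False: four spaces is too many
print(at_line_start("ab<div>", 0, 2))    # False: preceded by text
```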
markdown.htmlparser.HTMLExtractor.get_endtag_text(tag: str) -> str
Returns the text of the end tag.
If it fails to extract the actual text from the raw data, it builds a closing tag from tag.
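A minimal sketch of the extract-or-rebuild behavior, written as a standalone function. The name endtag_text, the start parameter (the index where the end tag begins), and the local regex are assumptions for illustration; the real method derives the start position from the parser's own state.

```python
import re

# Matches the character that terminates an end tag.
END_OF_TAG = re.compile(">")


def endtag_text(rawdata: str, start: int, tag: str) -> str:
    """Return the end tag exactly as written, or rebuild it from `tag`."""
    match = END_OF_TAG.search(rawdata, start)
    if match:
        # Preserve the original spelling, including case and inner whitespace.
        return rawdata[start:match.end()]
    # Fallback: no closing ">" found, so synthesize a normalized tag.
    return f"</{tag}>"


print(endtag_text("text</DIV  >more", 4, "div"))  # </DIV  >
print(endtag_text("</div", 0, "div"))             # </div>
```

Returning the original text where possible keeps the stashed HTML byte-for-byte identical to the input, which matters when the author used unusual casing or spacing.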