Writing Extensions for Python-Markdown¶

Python-Markdown includes an API for extension writers to plug their own custom functionality and syntax into the parser. An extension will patch into one or more stages of the parser:

Preprocessors alter the source before it is passed to the parser.
Block Processors work with blocks of text separated by blank lines.
Tree Processors modify the constructed ElementTree
Inline Processors are common tree processors for inline elements, such as *strong*.
Postprocessors munge of the output of the parser just before it is returned.

The parser loads text, applies the preprocessors, creates and builds an ElementTree object from the block processors and inline processors, renders the ElementTree object as Unicode text, and then then applies the postprocessors.

There are classes and helpers provided to ease writing your extension. Each part of the API is discussed in its respective section below. Additionally, you can walk through the Tutorial on Writing Extensions; look at some of the Available Extensions and their source code. As always, you may report bugs, ask for help, and discuss various other issues on the bug tracker.

Phases of processing¶

Preprocessors¶

Preprocessors munge the source text before it is passed to the Markdown parser. This is an excellent place to clean up bad characters or to extract portions for later processing that the parser may otherwise choke on.

Preprocessors inherit from markdown.preprocessors.Preprocessor and implement a run method, which takes a single parameter lines. This parameter is the entire source text stored as a list of Unicode strings, one per line. run should return its processed list of Unicode strings, one per line.

Example¶

This simple example removes any lines with ‘NO RENDER’ before processing:

from markdown.preprocessors import Preprocessor
import re

class NoRender(Preprocessor):
    """ Skip any line with words 'NO RENDER' in it. """
    def run(self, lines):
        new_lines = []
        for line in lines:
            m = re.search("NO RENDER", line)
            if not m:
                # any line without NO RENDER is passed through
                new_lines.append(line)
        return new_lines

Usages¶

Some preprocessors in the Markdown source tree include:

Class	Kind	Description
`NormalizeWhiteSpace`	built-in	Normalizes whitespace by expanding tabs, fixing `\r` line endings, etc.
`HtmlBlockPreprocessor`	built-in	Removes html blocks from the text and stores them for later processing
`ReferencePreprocessor`	built-in	Removes reference definitions from text and stores for later processing
`MetaPreprocessor`	extension	Strips and records meta data at top of documents
`FootnotesPreprocessor`	extension	Removes footnote blocks from the text and stores them for later processing

Block Processors¶

A block processor parses blocks of text and adds new elements to the ElementTree. Blocks of text, separated from other text by blank lines, may have a different syntax and produce a differently structured tree than other Markdown. Block processors excel at code formatting, equation layouts, and tables.

Block processors inherit from markdown.blockprocessors.BlockProcessor, are passed md.parser on initialization, and implement both the test and run methods:

test(self, parent, block) takes two parameters: parent is the parent ElementTree element and block is a single, multi-line, Unicode string of the current block. test, often a regular expression match, returns a true value if the block processor’s run method should be called to process starting at that block.
run(self, parent, blocks) has the same parent parameter as test; and blocks is the list of all remaining blocks in the document, starting with the block passed to test. run may return False (not None) to signal failure, meaning that it did not process the blocks after all. On success, run is expected to pop one or more blocks from the front of blocks and attach new nodes to parent.

Crafting block processors is more involved and flexible than the other processors, involving controlling recursive parsing of the block’s contents and managing state across invocations. For example, a blank line is allowed in indented code, so the second invocation of the inline code processor appends to the element tree generated by the previous call. Other block processors may insert new text into the blocks list, signal to future calls of itself, and more.

To make writing these complex beasts more tractable, three convenience functions have been provided by the BlockProcessor parent class:

lastChild(parent) returns the last child of the given element or None if it has no children.
detab(text) removes one level of indent (four spaces by default) from the front of each line of the given multi-line, text string, until a non-blank line is indented less.
looseDetab(text, level) removes multiple levels of indent from the front of each line of text but does not affect lines indented less.

Also, BlockProcessor provides the fields self.tab_length, the tab length (default 4), and self.parser, the current BlockParser instance.

BlockParser¶

BlockParser, not to be confused with BlockProcessor, is the class used by Markdown to cycle through all the registered block processors. You should never need to create your own instance; use self.parser instead.

The BlockParser instance provides a stack of strings for its current state, which your processor can push with self.parser.set(state), pop with self.parser.reset(), or check the the top state with self.parser.isstate(state). Be sure your code pops the states it pushes.

The BlockParser instance can also be called recursively, that is, to process blocks from within your block processor. There are three methods:

parseDocument(lines) parses a list of lines, each a single-line Unicode string, returning a complete ElementTree.
parseChunk(parent, text) parses a single, multi-line, possibly multi-block, Unicode string text and attaches the resulting tree to parent.
parseBlocks(parent, blocks) takes a list of blocks, each a multi-line Unicode string without blank lines, and attaches the resulting tree to parent.

For perspective, Markdown calls parseDocument which calls parseChunk which calls parseBlocks which calls your block processor, which, in turn, might call one of these routines.

Example¶

This example calls out important paragraphs by giving them a border. It looks for a fence line of exclamation points before and after and renders the fenced blocks into a new, styled div. If it does not find the ending fence line, it does nothing.

Our code, like most block processors, is longer than other examples:

def test_block_processor():
    class BoxBlockProcessor(BlockProcessor):
        RE_FENCE_START = r'^ *!{3,} *\n' # start line, e.g., `   !!!! `
        RE_FENCE_END = r'\n *!{3,}\s*$'  # last non-blank line, e.g, '!!!\n  \n\n'

        def test(self, parent, block):
            return re.match(self.RE_FENCE_START, block)

        def run(self, parent, blocks):
            original_block = blocks[0]
            blocks[0] = re.sub(self.RE_FENCE_START, '', blocks[0])

            # Find block with ending fence
            for block_num, block in enumerate(blocks):
                if re.search(self.RE_FENCE_END, block):
                    # remove fence
                    blocks[block_num] = re.sub(self.RE_FENCE_END, '', block)
                    # render fenced area inside a new div
                    e = etree.SubElement(parent, 'div')
                    e.set('style', 'display: inline-block; border: 1px solid red;')
                    self.parser.parseBlocks(e, blocks[0:block_num + 1])
                    # remove used blocks
                    for i in range(0, block_num + 1):
                        blocks.pop(0)
                    return True  # or could have had no return statement
            # No closing marker!  Restore and do nothing
            blocks[0] = original_block
            return False  # equivalent to our test() routine returning False

    class BoxExtension(Extension):
        def extendMarkdown(self, md):
            md.parser.blockprocessors.register(BoxBlockProcessor(md.parser), 'box', 175)

Start with this example input:

A regular paragraph of text.

!!!!!
First paragraph of wrapped text.

Second Paragraph of **wrapped** text.
!!!!!

Another regular paragraph of text.

The fenced text adds one node with two children to the tree:

div, with a style attribute. It renders as <div style="display: inline-block; border: 1px solid red;">...</div>
- p with text First paragraph of wrapped text.
- p with text Second Paragraph of **wrapped** text. The conversion to a <strong> tag will happen when running the inline processors, which will happen after all of the block processors have completed.

The example output might display as follows:

A regular paragraph of text.

First paragraph of wrapped text.

Second Paragraph of wrapped text.

Another regular paragraph of text.

Usages¶

Some block processors in the Markdown source tree include:

Class	Kind	Description
`HashHeaderProcessor`	built-in	Title hashes (`#`), which may split blocks
`HRProcessor`	built-in	Horizontal lines, e.g., `---`
`OListProcessor`	built-in	Ordered lists; complex and using `state`
`Admonition`	extension	Render each Admonition in a new `div`

Tree processors¶

Tree processors manipulate the tree created by block processors. They can even create an entirely new ElementTree object. This is an excellent place for creating summaries, adding collected references, or last minute adjustments.

A tree processor must inherit from markdown.treeprocessors.Treeprocessor (note the capitalization). A tree processor must implement a run method which takes a single argument root. In most cases root would be an xml.etree.ElementTree.Element instance; however, in rare cases it could be some other type of ElementTree object. The run method may return None, in which case the (possibly modified) original root object is used, or it may return an entirely new Element object, which will replace the existing root object and all of its children. It is generally preferred to modify root in place and return None, which avoids creating multiple copies of the entire document tree in memory.

For specifics on manipulating the ElementTree, see Working with the ElementTree below.

Example¶

A pseudo example:

from markdown.treeprocessors import Treeprocessor

class MyTreeprocessor(Treeprocessor):
    def run(self, root):
        root.text = 'modified content'
        # No return statement is same as `return None`

Usages¶

The core InlineProcessor class is a tree processor. It walks the tree, matches patterns, and splits and creates nodes on matches.

Additional tree processors in the Markdown source tree include:

Class	Kind	Description
`PrettifyTreeprocessor`	built-in	Add line breaks to the html document
`TocTreeprocessor`	extension	Builds a table of contents from the finished tree
`FootnoteTreeprocessor`	extension	Create footnote div at end of document
`FootnotePostTreeprocessor`	extension	Amend div created by `FootnoteTreeprocessor` with duplicates

Inline Processors¶

Inline processors, previously called inline patterns, are used to add formatting, such as **emphasis**, by replacing a matched pattern with a new element tree node. It is an excellent place for adding new syntax for inline tags. Inline processor code is often quite short.

Inline processors inherit from InlineProcessor, are initialized, and implement handleMatch:

__init__(self, pattern, md=None) is the inherited constructor. You do not need to implement your own.
- pattern is the regular expression string that must match the code block in order for the handleMatch method to be called.
- md, an optional parameter, is a pointer to the instance of markdown.Markdown and is available as self.md on the InlineProcessor instance.
handleMatch(self, m, data) must be implemented in all InlineProcessor subclasses.
- m is the regular expression match object found by the pattern passed to __init__.
- data is a single, multi-line, Unicode string containing the entire block of text around the pattern. A block is text set apart by blank lines.
- Returns either (None, None, None), indicating the provided match was rejected or (el, start, end), if the match was successfully processed. On success, el is the element being added to the tree, start and end are indexes in data that were “consumed” by the pattern. The “consumed” span will be replaced by a placeholder. The same inline processor may be called several times on the same block.

Inline Processors can define the property ANCESTOR_EXCLUDES which is either a list or tuple of undesirable ancestors. The processor will be skipped if it would cause the content to be a descendant of one of the listed tag names.

Convenience Classes¶

Convenience subclasses of InlineProcessor are provided for common operations:

SimpleTextInlineProcessor returns the text of group(1) of the match.
SubstituteTagInlineProcessor is initialized as SubstituteTagInlineProcessor(pattern, tag). It returns a new element tag whenever pattern is matched.
SimpleTagInlineProcessor is initialized as SimpleTagInlineProcessor(pattern, tag). It returns an element tag with a text field of group(2) of the match.

Example¶

This example changes --strike-- to <del>strike</del>.

from markdown.inlinepatterns import InlineProcessor
from markdown.extensions import Extension
import xml.etree.ElementTree as etree


class DelInlineProcessor(InlineProcessor):
    def handleMatch(self, m, data):
        el = etree.Element('del')
        el.text = m.group(1)
        return el, m.start(0), m.end(0)

class DelExtension(Extension):
    def extendMarkdown(self, md):
        DEL_PATTERN = r'--(.*?)--'  # like --del--
        md.inlinePatterns.register(DelInlineProcessor(DEL_PATTERN, md), 'del', 175)

Use this input example:

First line of the block.
This is --strike one--.
This is --strike two--.
End of the block.

The example output might display as follows:

First line of the block. This is ~~strike one~~. This is ~~strike two~~. End of the block.

On the first call to handleMatch
- m will be the match for --strike one--
- data will be the string: First line of the block.\nThis is --strike one--.\nThis is --strike two--.\nEnd of the block.
Because the match was successful, the region between the returned start and end are replaced with a placeholder token and the new element is added to the tree.
On the second call to handleMatch
- m will be the match for --strike two--
- data will be the string First line of the block.\nThis is klzzwxh:0000.\nThis is --strike two--.\nEnd of the block.

Note the placeholder token klzzwxh:0000. This allows the regular expression to be run against the entire block, not just the the text contained in an individual element. The placeholders will later be swapped back out for the actual elements by the parser.

Actually it would not be necessary to create the above inline processor. The fact is, that example is not very DRY (Don’t Repeat Yourself). A pattern for **strong** text would be almost identical, with the exception that it would create a strong element. Therefore, Markdown provides a number of generic InlineProcessor subclasses that can provide some common functionality. For example, strike could be implemented with an instance of the SimpleTagInlineProcessor class as demonstrated below. Feel free to use or extend any of the InlineProcessor subclasses found at markdown.inlinepatterns.

from markdown.inlinepatterns import SimpleTagInlineProcessor
from markdown.extensions import Extension

class DelExtension(Extension):
    def extendMarkdown(self, md):
        md.inlinePatterns.register(SimpleTagInlineProcessor(r'()--(.*?)--', 'del'), 'del', 175)

Usages¶

Here are some convenience functions and other examples:

Class	Kind	Description
`AsteriskProcessor`	built-in	Emphasis processor for handling strong and em matches inside asterisks
`AbbrInlineProcessor`	extension	Apply tag to abbreviation registered by preprocessor
`WikiLinksInlineProcessor`	extension	Link `[[article names]]` to wiki given in metadata
`FootnoteInlineProcessor`	extension	Replaces footnote in text with link to footnote div at bottom

Patterns¶

In version 3.0, a new, more flexible inline processor was added, markdown.inlinepatterns.InlineProcessor. The original inline patterns, which inherit from markdown.inlinepatterns.Pattern or one of its children are still supported, though users are encouraged to migrate.

Comparison with new `InlineProcessor`¶

The new InlineProcessor provides two major enhancements to Patterns:

Inline Processors no longer need to match the entire block, so regular expressions no longer need to start with r'^(.*?)' and end with r'(.*?)%'. This runs faster. The returned match object will only contain what is explicitly matched in the pattern, and extension pattern groups now start with m.group(1).
The handleMatch method now takes an additional input called data, which is the entire block under analysis, not just what is matched with the specified pattern. The method now returns the element and the indexes relative to data that the return element is replacing (usually m.start(0) and m.end(0)). If the boundaries are returned as None, it is assumed that the match did not take place, and nothing will be altered in data.

This allows handling of more complex constructs than regular expressions can handle, e.g., matching nested brackets, and explicit control of the span “consumed” by the processor.

Inline Patterns¶

Inline Patterns can implement inline HTML element syntax for Markdown such as *emphasis* or [links](http://example.com). Pattern objects should be instances of classes that inherit from markdown.inlinepatterns.Pattern or one of its children. Each pattern object uses a single regular expression and must have the following methods:

getCompiledRegExp():

Returns a compiled regular expression.
handleMatch(m):

Accepts a match object and returns an ElementTree element of a plain Unicode string.

Inline Patterns can define the property ANCESTOR_EXCLUDES with is either a list or tuple of undesirable ancestors. The pattern will be skipped if it would cause the content to be a descendant of one of the listed tag names.

Note that any regular expression returned by getCompiledRegExp must capture the whole block. Therefore, they should all start with r'^(.*?)' and end with r'(.*?)!'. When using the default getCompiledRegExp() method provided in the Pattern you can pass in a regular expression without that and getCompiledRegExp will wrap your expression for you and set the re.DOTALL and re.UNICODE flags. This means that the first group of your match will be m.group(2) as m.group(1) will match everything before the pattern.

For an example, consider this simplified emphasis pattern:

from markdown.inlinepatterns import Pattern
import xml.etree.ElementTree as etree

class EmphasisPattern(Pattern):
    def handleMatch(self, m):
        el = etree.Element('em')
        el.text = m.group(2)
        return el

As discussed in Integrating Your Code Into Markdown, an instance of this class will need to be provided to Markdown. That instance would be created like so:

# an oversimplified regex
MYPATTERN = r'\*([^*]+)\*'
# pass in pattern and create instance
emphasis = EmphasisPattern(MYPATTERN)

Postprocessors¶

Postprocessors munge the document after the ElementTree has been serialized into a string. Postprocessors should be used to work with the text just before output. Usually, they are used to add back sections that were extracted in a preprocessor, fix up outgoing encodings, or wrap the whole document.

Postprocessors inherit from markdown.postprocessors.Postprocessor and implement a run method which takes a single parameter text, the entire HTML document as a single Unicode string. run should return a single Unicode string ready for output. Note that preprocessors use a list of lines while postprocessors use a single multi-line string.

Example¶

Here is a simple example that changes the output to one big page showing the raw html.

from markdown.postprocessors import Postprocessor
import re

class ShowActualHtmlPostprocesor(Postprocessor):
    """ Wrap entire output in <pre> tags as a diagnostic. """
    def run(self, text):
        return '<pre>\n' + re.sub('<', '&lt;', text) + '</pre>\n'

Usages¶

Some postprocessors in the Markdown source tree include:

Class	Kind	Description
`raw_html`	built-in	Restore raw html from `htmlStash`, stored by `HTMLBlockPreprocessor`, and code highlighters
`amp_substitute`	built-in	Convert ampersand substitutes to `&`; used in links
`unescape`	built-in	Convert some escaped characters back from integers; used in links
`FootnotePostProcessor`	extension	Replace footnote placeholders with html entities; as set by other stages

Working with the ElementTree¶

As mentioned, the Markdown parser converts a source document to an ElementTree object before serializing that back to Unicode text. Markdown has provided some helpers to ease that manipulation within the context of the Markdown module.

First, import the ElementTree module:

import xml.etree.ElementTree as etree

Sometimes you may want text inserted into an element to be parsed by Inline Patterns. In such a situation, simply insert the text as you normally would and the text will be automatically run through the Inline Patterns. However, if you do not want some text to be parsed by Inline Patterns, then insert the text as an AtomicString.

from markdown.util import AtomicString
some_element.text = AtomicString(some_text)

Here’s a basic example which creates an HTML table (note that the contents of the second cell (td2) will be run through Inline Patterns latter):

table = etree.Element("table")
table.set("cellpadding", "2")                      # Set cellpadding to 2
tr = etree.SubElement(table, "tr")                 # Add child tr to table
td1 = etree.SubElement(tr, "td")                   # Add child td1 to tr
td1.text = markdown.util.AtomicString("Cell content") # Add plain text content
td2 = etree.SubElement(tr, "td")                   # Add second td to tr
td2.text = "*text* with **inline** formatting."    # Add markup text
table.tail = "Text after table"                    # Add text after table

You can also manipulate an existing tree. Consider the following example which adds a class attribute to <a> elements:

def set_link_class(self, element):
    for child in element:
        if child.tag == "a":
              child.set("class", "myclass") #set the class attribute
        set_link_class(child) # run recursively on children

For more information about working with ElementTree see the ElementTree Documentation.

Working with Raw HTML¶

Occasionally an extension may need to call out to a third party library which returns a pre-made string of raw HTML that needs to be inserted into the document unmodified. Raw strings can be stashed for later retrieval using an htmlStash instance, rather than converting them into ElementTree objects. A raw string (which may or may not be raw HTML) passed to self.md.htmlStash.store() will be saved to the stash and a placeholder string will be returned which should be inserted into the tree instead. After the tree is serialized, a postprocessor will replace the placeholder with the raw string. This prevents subsequent processing steps from modifying the HTML data. For example,

html = "<p>This is some <em>raw</em> HTML data</p>"
el = etree.Element("div")
el.text = self.md.htmlStash.store(html)

For the global htmlStash instance to be available from a processor, the markdown.Markdown instance must be passed to the processor from extendMarkdown and will be available as self.md.htmlStash.

Integrating Your Code Into Markdown¶

Once you have the various pieces of your extension built, you need to tell Markdown about them and ensure that they are run in the proper sequence. Markdown accepts an Extension instance for each extension. Therefore, you will need to define a class that extends markdown.extensions.Extension and overrides the extendMarkdown method. Within this class you will manage configuration options for your extension and attach the various processors and patterns to the Markdown instance.

It is important to note that the order of the various processors and patterns matters. For example, if we replace http://... links with <a> elements, and then try to deal with inline HTML, we will end up with a mess. Therefore, the various types of processors and patterns are stored within an instance of the markdown.Markdown class in a Registry. Your Extension class will need to manipulate those registries appropriately. You may register instances of your processors and patterns with an appropriate priority, deregister built-in instances, or replace a built-in instance with your own.

`extendMarkdown`¶

The extendMarkdown method of a markdown.extensions.Extension class accepts one argument:

md:

A pointer to the instance of the markdown.Markdown class. You should use this to access the Registries of processors and patterns. They are found under the following attributes:
- md.preprocessors
- md.inlinePatterns
- md.parser.blockprocessors
- md.treeprocessors
- md.postprocessors
Some other things you may want to access on the markdown.Markdown instance are:
- md.htmlStash
- md.output_formats
- md.set_output_format()
- md.output_format
- md.serializer
- md.registerExtension()
- md.tab_length
- md.block_level_elements
- md.isBlockLevel()

Warning

With access to the above items, theoretically you have the option to change anything through various monkey_patching techniques. However, you should be aware that the various undocumented parts of Markdown may change without notice and your monkey_patches may break with a new release. Therefore, what you really should be doing is inserting processors and patterns into the Markdown pipeline. Consider yourself warned!

A simple example:

from markdown.extensions import Extension

class MyExtension(Extension):
    def extendMarkdown(self, md):
        # Register instance of 'mypattern' with a priority of 175
        md.inlinePatterns.register(MyPattern(md), 'mypattern', 175)

registerExtension¶

Some extensions may need to have their state reset between multiple runs of the markdown.Markdown class. For example, consider the following use of the Footnotes extension:

md = markdown.Markdown(extensions=['footnotes'])
html1 = md.convert(text_with_footnote)
md.reset()
html2 = md.convert(text_without_footnote)

Without calling reset, the footnote definitions from the first document will be inserted into the second document as they are still stored within the class instance. Therefore the Extension class needs to define a reset method that will reset the state of the extension (i.e.: self.footnotes = {}). However, as many extensions do not have a need for reset, reset is only called on extensions that are registered.

To register an extension, call md.registerExtension from within your extendMarkdown method:

def extendMarkdown(self, md):
    md.registerExtension(self)
    # insert processors and patterns here

Then, each time reset is called on the markdown.Markdown instance, the reset method of each registered extension will be called as well. You should also note that reset will be called on each registered extension after it is initialized the first time. Keep that in mind when over-riding the extension’s reset method.

Configuration Settings¶

If an extension uses any parameters that the user may want to change, those parameters should be stored in self.config of your markdown.extensions.Extension class in the following format:

class MyExtension(markdown.extensions.Extension):
    def __init__(self, **kwargs):
        self.config = {
            'option1' : ['value1', 'description1'],
            'option2' : ['value2', 'description2']
        }
        super(MyExtension, self).__init__(**kwargs)

When implemented this way the configuration parameters can be overridden at run time (thus the call to super). For example:

markdown.Markdown(extensions=[MyExtension(option1='other value')])

Note that if a keyword is passed in that is not already defined in self.config, then a KeyError is raised.

The markdown.extensions.Extension class and its subclasses have the following methods available to assist in working with configuration settings:

getConfig(key [, default]):

Returns the stored value for the given key or default if the key does not exist. If not set, default returns an empty string.
getConfigs():

Returns a dict of all key/value pairs.
getConfigInfo():

Returns all configuration descriptions as a list of tuples.
setConfig(key, value):

Sets a configuration setting for key with the given value. If key is unknown, a KeyError is raised. If the previous value of key was a Boolean value, then value is converted to a Boolean value. If the previous value of key is None, then value is converted to a Boolean value except when it is None. No conversion takes place when the previous value of key is a string.
setConfigs(items):

Sets multiple configuration settings given a dict of key/value pairs.

Naming an Extension¶

As noted in the library reference an instance of an extension can be passed directly to markdown.Markdown. In fact, this is the preferred way to use third-party extensions.

For example:

import markdown
from path.to.module import MyExtension
md = markdown.Markdown(extensions=[MyExtension(option='value')])

However, Markdown also accepts “named” third party extensions for those occasions when it is impractical to import an extension directly (from the command line or from within templates). A “name” can either be a registered entry point or a string using Python’s dot notation.

Entry Point¶

Entry points are defined in a Python package’s setup.py script. The script must use setuptools to support entry points. Python-Markdown extensions must be assigned to the markdown.extensions group. An entry point definition might look like this:

from setuptools import setup

setup(
    # ...
    entry_points={
        'markdown.extensions': ['myextension = path.to.module:MyExtension']
    }
)

After a user installs your extension using the above script, they could then call the extension using the myextension string name like this:

markdown.markdown(text, extensions=['myextension'])

Note that if two or more entry points within the same group are assigned the same name, Python-Markdown will only ever use the first one found and ignore all others. Therefore, be sure to give your extension a unique name.

For more information on writing setup.py scripts, see the Python documentation on Packaging and Distributing Projects.

Dot Notation¶

If an extension does not have a registered entry point, Python’s dot notation may be used instead. The extension must be installed as a Python module on your PYTHONPATH. Generally, a class should be specified in the name. The class must be at the end of the name and be separated by a colon from the module.

Therefore, if you were to import the class like this:

from path.to.module import MyExtension

Then the extension can be loaded as follows:

markdown.markdown(text, extensions=['path.to.module:MyExtension'])

You do not need to do anything special to support this feature. As long as your extension class is able to be imported, a user can include it with the above syntax.

The above two methods are especially useful if you need to implement a large number of extensions with more than one residing in a module. However, if you do not want to require that your users include the class name in their string, you must define only one extension per module and that module must contain a module-level function called makeExtension that accepts **kwargs and returns an extension instance.

For example:

class MyExtension(markdown.extensions.Extension)
    # Define extension here...

def makeExtension(**kwargs):
    return MyExtension(**kwargs)

When markdown.Markdown is passed the “name” of your extension as a dot notation string that does not include a class (for example path.to.module), it will import the module and call the makeExtension function to initiate your extension.

Registries¶

The markdown.util.Registry class is a priority sorted registry which Markdown uses internally to determine the processing order of its various processors and patterns.

A Registry instance provides two public methods to alter the data of the registry: register and deregister. Use register to add items and deregister to remove items. See each method for specifics.

When registering an item, a “name” and a “priority” must be provided. All items are automatically sorted by the value of the “priority” parameter such that the item with the highest value will be processed first. The “name” is used to remove (deregister) and get items.

A Registry instance is like a list (which maintains order) when reading data. You may iterate over the items, get an item and get a count (length) of all items. You may also check that the registry contains an item.

When getting an item you may use either the index of the item or the string-based “name”. For example:

registry = Registry()
registry.register(SomeItem(), 'itemname', 20)
# Get the item by index
item = registry[0]
# Get the item by name
item = registry['itemname']

When checking that the registry contains an item, you may use either the string-based “name”, or a reference to the actual item. For example:

someitem = SomeItem()
registry.register(someitem, 'itemname', 20)
# Contains the name
assert 'itemname' in registry
# Contains the item instance
assert someitem in registry

markdown.util.Registry has the following methods:

`Registry.register(self, item, name, priority)`¶

Add an item to the registry with the given name and priority.

Arguments:

item: The item being registered.
name: A string used to reference the item.
priority: An integer or float used to sort against all items.

If an item is registered with a “name” which already exists, the existing item is replaced with the new item. Be careful as the old item is lost with no way to recover it. The new item will be sorted according to its priority and will not retain the position of the old item.

`Registry.deregister(self, name, strict=True)`¶

Remove an item from the registry.

Set strict=False to fail silently.

`Registry.get_index_for_name(self, name)`¶

Return the index of the given name.

Writing Extensions for Python-Markdown¶

Phases of processing¶

Preprocessors¶

Example¶

Usages¶

Block Processors¶

BlockParser¶

Example¶

Usages¶

Tree processors¶

Example¶

Usages¶

Inline Processors¶

Convenience Classes¶

Example¶

Usages¶

Patterns¶

Comparison with new `InlineProcessor`¶

Inline Patterns¶

Postprocessors¶

Example¶

Usages¶

Working with the ElementTree¶

Working with Raw HTML¶

Integrating Your Code Into Markdown¶

`extendMarkdown`¶

registerExtension¶

Configuration Settings¶

Naming an Extension¶

Entry Point¶

Dot Notation¶

Registries¶

`Registry.register(self, item, name, priority)`¶

`Registry.deregister(self, name, strict=True)`¶

`Registry.get_index_for_name(self, name)`¶

Table Of Contents

Previous topic

Next topic

This Page

Writing Extensions for Python-Markdown¶

Phases of processing¶

Preprocessors¶

Example¶

Usages¶

Block Processors¶

BlockParser¶

Example¶

Usages¶

Tree processors¶

Example¶

Usages¶

Inline Processors¶

Convenience Classes¶

Example¶

Usages¶

Patterns¶

Comparison with new InlineProcessor¶

Inline Patterns¶

Postprocessors¶

Example¶

Usages¶

Working with the ElementTree¶

Working with Raw HTML¶

Integrating Your Code Into Markdown¶

extendMarkdown¶

registerExtension¶

Configuration Settings¶

Naming an Extension¶

Entry Point¶

Dot Notation¶

Registries¶

Registry.register(self, item, name, priority)¶

Registry.deregister(self, name, strict=True)¶

Registry.get_index_for_name(self, name)¶

Comparison with new `InlineProcessor`¶

`extendMarkdown`¶

`Registry.register(self, item, name, priority)`¶

`Registry.deregister(self, name, strict=True)`¶

`Registry.get_index_for_name(self, name)`¶