This package contains the actual wikicode parser, split up into two main modules: the tokenizer and the builder. This module joins them together under one interface.
Combines a sequence of tokens into a tree of Wikicode objects.
To use, pass a list of Tokens to the build() method. The list will be exhausted as it is parsed and a Wikicode object will be returned.
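The interface described above can be sketched as follows. This is a hypothetical illustration, not the library's real code: the `Token` and `Builder` classes and the node handling here are assumptions, kept only detailed enough to show that the token list is consumed destructively.

```python
# Hypothetical sketch of the Builder interface; names and logic are
# illustrative, not the library's actual implementation.
class Token:
    def __init__(self, type, **attrs):
        self.type = type
        self.attrs = attrs

class Builder:
    def build(self, tokenlist):
        """Consume tokenlist destructively and return the built nodes."""
        nodes = []
        while tokenlist:
            token = tokenlist.pop(0)
            if token.type == "Text":
                nodes.append(token.attrs["text"])
        return nodes

tokens = [Token("Text", text="Hello, "), Token("Text", text="world")]
result = Builder().build(tokens)
assert result == ["Hello, ", "world"]
assert tokens == []  # the list was exhausted as it was parsed
```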
Handle a case where a parameter is at the head of the tokens.
default is the value to use if no parameter name is defined.
This module contains various “context” definitions, which are essentially flags set during the tokenization process, either on the current parse stack (local contexts) or affecting all stacks (global contexts). They represent the context the tokenizer is in, such as inside a template’s name definition, or inside a level-two heading. They are used to determine which tokens are valid at the current point and whether the current parsing route is invalid.
The tokenizer stores context as an integer, with these definitions bitwise OR’d to set them, AND’d to check if they’re set, and XOR’d to unset them. The advantage of this is that contexts can have sub-contexts (as FOO == 0b11 will cover BAR == 0b10 and BAZ == 0b01).
Local (stack-specific) contexts:
TEMPLATE
- TEMPLATE_NAME
- TEMPLATE_PARAM_KEY
- TEMPLATE_PARAM_VALUE
ARGUMENT
- ARGUMENT_NAME
- ARGUMENT_DEFAULT
WIKILINK
- WIKILINK_TITLE
- WIKILINK_TEXT
EXT_LINK
- EXT_LINK_URI
- EXT_LINK_TITLE
- EXT_LINK_BRACKETS
HEADING
- HEADING_LEVEL_1
- HEADING_LEVEL_2
- HEADING_LEVEL_3
- HEADING_LEVEL_4
- HEADING_LEVEL_5
- HEADING_LEVEL_6
TAG
- TAG_OPEN
- TAG_ATTR
- TAG_BODY
- TAG_CLOSE
STYLE
- STYLE_ITALICS
- STYLE_BOLD
- STYLE_PASS_AGAIN
- STYLE_SECOND_PASS
DL_TERM
SAFETY_CHECK
- HAS_TEXT
- FAIL_ON_TEXT
- FAIL_NEXT
- FAIL_ON_LBRACE
- FAIL_ON_RBRACE
- FAIL_ON_EQUALS
Global contexts:
Aggregate contexts:
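The bitwise scheme described above can be demonstrated with a self-contained sketch. The bit values below are illustrative; the real module assigns its own positions to each constant.

```python
# Illustrative bit values; the real contexts module assigns its own.
TEMPLATE_NAME        = 1 << 0
TEMPLATE_PARAM_KEY   = 1 << 1
TEMPLATE_PARAM_VALUE = 1 << 2
# An aggregate context covers all of its sub-contexts:
TEMPLATE = TEMPLATE_NAME | TEMPLATE_PARAM_KEY | TEMPLATE_PARAM_VALUE

context = 0
context |= TEMPLATE_PARAM_KEY   # OR to set: now inside a parameter key
assert context & TEMPLATE       # AND to check: the aggregate covers it
context ^= TEMPLATE_PARAM_KEY   # XOR to unset
assert not context & TEMPLATE
```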
Creates a list of tokens from a string of wikicode.
Write the body of a tag and the tokens that should surround it.
Fail the current tokenization route.
Discards the current stack/context/textbuffer and raises BadRoute.
Handle text in a free ext link, including trailing punctuation.
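A minimal sketch of the trailing-punctuation idea, under the assumption that characters such as `,.;:!?` at the end of a bare URL are not treated as part of the link. The real rules are more involved (for example, parentheses are typically handled by balance), so this function is purely illustrative.

```python
# Assumed punctuation set; the real handling is more nuanced.
def split_trailing_punct(url):
    """Split a bare URL into (link, trailing punctuation)."""
    end = len(url)
    while end > 0 and url[end - 1] in ",.;:!?":
        end -= 1
    return url[:end], url[end:]

link, trailing = split_trailing_punct("http://example.com.")
assert link == "http://example.com"
assert trailing == "."
```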
Handle the (possible) start of an implicitly closing single tag.
Handle the end of an implicitly closing single-only HTML tag.
Handle a template parameter’s value at the head of the string.
Parse a template or argument at the head of the wikicode string.
Pop the current stack/context/textbuffer, returning the stack.
If keep_context is True, then we will replace the underlying stack’s context with the current stack’s.
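The pop-with-context behavior can be sketched as below. The internal `[stack, context, textbuffer]` layout and the class name are assumptions made for illustration.

```python
# Hedged sketch of the stack discipline; internal layout is assumed.
class StackManager:
    def __init__(self):
        self._stacks = []  # each entry: [token stack, context, textbuffer]

    def push(self, context=0):
        self._stacks.append([[], context, []])

    def pop(self, keep_context=False):
        """Pop the current stack/context/textbuffer, returning the stack."""
        if keep_context:
            context = self._stacks[-1][1]
            stack = self._stacks.pop()[0]
            self._stacks[-1][1] = context  # overwrite the underlying context
            return stack
        return self._stacks.pop()[0]

mgr = StackManager()
mgr.push(context=0)
mgr.push(context=4)          # nested route with its own context
mgr.pop(keep_context=True)   # outer stack now carries the inner context
assert mgr._stacks[-1][1] == 4
```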
Read the value at a relative point in the wikicode.
The value is read from self._head plus the value of delta (which can be negative). If wrap is False, we will not allow attempts to read from the end of the string when self._head + delta is negative; self.START is returned instead, as it is for any read from before the start of the string. If strict is True, the route will be failed (with _fail_route()) if we try to read from past the end of the string; otherwise, self.END is returned.
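These semantics can be modeled with a small self-contained sketch. The `START`/`END` sentinels, `BadRoute`, and the method names follow the docstring's wording, not the library's actual code.

```python
# Sketch of the described read semantics; names are modeled on the
# docstring, not copied from the library.
class BadRoute(Exception):
    pass

START, END = object(), object()  # sentinels for out-of-range reads

class Reader:
    def __init__(self, text):
        self._text = text
        self._head = 0

    def _fail_route(self):
        raise BadRoute()

    def read(self, delta=0, wrap=False, strict=False):
        index = self._head + delta
        if index < 0:
            # Negative indices read from the end only when wrap is True.
            if wrap and -index <= len(self._text):
                return self._text[index]
            return START
        if index >= len(self._text):
            if strict:
                self._fail_route()
            return END
        return self._text[index]

reader = Reader("abc")
reader._head = 1
assert reader.read() == "b"
assert reader.read(5) is END
assert reader.read(-2) is START            # wrap is False by default
assert reader.read(-2, wrap=True) == "c"   # wraps to the end of the string
```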
Remove the URI scheme of a new external link from the textbuffer.
This module contains the token definitions that are used as an intermediate parsing data type: tokens are stored in a flat list, with each one identified by its type and optional attributes. The token list is generated in a syntactically valid form by the Tokenizer, and then converted into the :py:class:`~.Wikicode` tree by the Builder.
A token stores the semantic meaning of a unit of wikicode.
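As a rough illustration of the flat-list representation, consider the sketch below. These token classes and attribute names are assumptions for demonstration; the real definitions live in the parser's tokens module.

```python
# Illustrative token classes; not the library's actual definitions.
class Token:
    """A token: its type (the class) plus optional attributes."""
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

class TemplateOpen(Token): pass
class Text(Token): pass
class TemplateClose(Token): pass

# The wikicode "{{foo}}" as a flat, syntactically valid token list:
tokens = [TemplateOpen(), Text(text="foo"), TemplateClose()]
assert isinstance(tokens[0], TemplateOpen)
assert tokens[1].text == "foo"
```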