nodes Package

This package contains Wikicode “nodes”, each of which represents a single unit of wikitext, such as a Template, an HTML tag, a Heading, or plain text. The node “tree” is far from flat, as most node types can contain additional Wikicode objects within them, and with that, more nodes. For example, the name of a Template is a Wikicode object that can contain text or more templates.

class mwparserfromhell.nodes.Node[source]

Represents the base Node type, demonstrating the methods to override.

__str__() must be overridden. It should return a str representation of the node. If the node contains Wikicode objects inside of it, __children__() should be a generator that iterates over them. If the node is printable (shown when the page is rendered), __strip__() should return its printable version, stripping out any formatting marks. It does not have to return a string, but something that can be converted to a string with str(). Finally, __showtree__() can be overridden to build a nice tree representation of the node, if desired, for get_tree().

_base Module

class mwparserfromhell.nodes._base.Node[source]

Bases: StringMixIn

Represents the base Node type, demonstrating the methods to override.

__str__() must be overridden. It should return a str representation of the node. If the node contains Wikicode objects inside of it, __children__() should be a generator that iterates over them. If the node is printable (shown when the page is rendered), __strip__() should return its printable version, stripping out any formatting marks. It does not have to return a string, but something that can be converted to a string with str(). Finally, __showtree__() can be overridden to build a nice tree representation of the node, if desired, for get_tree().

argument Module

class mwparserfromhell.nodes.argument.Argument(name, default=None)[source]

Bases: Node

Represents a template argument substitution, like {{{foo}}}.

property default

The default value to substitute if none is passed.

This will be None if the argument wasn’t defined with one. The MediaWiki parser handles this by rendering the argument itself in the result, complete with braces. To have the argument render as nothing, set default to "" ({{{arg}}} vs. {{{arg|}}}).

property name

The name of the argument to substitute.

comment Module

class mwparserfromhell.nodes.comment.Comment(contents)[source]

Bases: Node

Represents a hidden HTML comment, like <!-- foobar -->.

property contents

The hidden text contained between <!-- and -->.

heading Module

class mwparserfromhell.nodes.heading.Heading(title, level)[source]

Bases: Node

Represents a section heading in wikicode, like == Foo ==.

property level

The heading level, as an integer between 1 and 6, inclusive.

property title

The title of the heading, as a Wikicode object.

html_entity Module

class mwparserfromhell.nodes.html_entity.HTMLEntity(value, named=None, hexadecimal=False, hex_char='x')[source]

Bases: Node

Represents an HTML entity, like &nbsp;, either named or unnamed.

property hex_char

If the value is hexadecimal, this is the letter denoting that.

For example, the hex_char of "&#x1234;" is "x", whereas the hex_char of "&#X1234;" is "X". Lowercase and uppercase x are the only values supported.

property hexadecimal

If unnamed, this is whether the value is hexadecimal or decimal.

property named

Whether the entity is a string name for a codepoint or an integer.

For example, &Sigma;, &#931;, and &#x3a3; refer to the same character, but only the first is “named”, while the others are integer representations of the codepoint.

normalize()[source]

Return the unicode character represented by the HTML entity.

property value

The string value of the HTML entity.

tag Module

class mwparserfromhell.nodes.tag.Tag(tag, contents=None, attrs=None, wiki_markup=None, self_closing=False, invalid=False, implicit=False, padding='', closing_tag=None, wiki_style_separator=None, closing_wiki_markup=None)[source]

Bases: Node

Represents an HTML-style tag in wikicode, like <ref>.

add(name, value=None, quotes='"', pad_first=' ', pad_before_eq='', pad_after_eq='')[source]

Add an attribute with the given name and value.

name and value can be anything parsable by utils.parse_anything(); value can be omitted if the attribute is valueless. If quotes is not None, it should be a string (either " or ') that value will be wrapped in (this is recommended). None is only legal if value contains no spacing.

pad_first, pad_before_eq, and pad_after_eq are whitespace used as padding before the name, before the equal sign (or after the name if no value), and after the equal sign (ignored if no value), respectively.

property attributes

The list of attributes affecting the tag.

Each attribute is an instance of Attribute.

property closing_tag

The closing tag, as a Wikicode object.

This will usually equal tag, unless there is additional spacing, comments, or the like.

property closing_wiki_markup

The wikified version of the closing tag to show instead of HTML.

If set to a value, this will be displayed instead of the close tag brackets. If self_closing is True, this is not displayed. If wiki_markup is set and this has not been set, this is set to the value of wiki_markup. If this has been set and wiki_markup is set to a False value, this is set to None.

property contents

The contents of the tag, as a Wikicode object.

get(name)[source]

Get the attribute with the given name.

The returned object is an Attribute instance. Raises ValueError if no attribute has this name. Since multiple attributes can have the same name, we’ll return the last match, since all but the last are ignored by the MediaWiki parser.

has(name)[source]

Return whether any attribute in the tag has the given name.

Note that a tag may have multiple attributes with the same name, but only the last one is read by the MediaWiki parser.

property implicit

Whether the tag is implicitly self-closing, with no ending slash.

This is only possible for specific “single” tags like <br> and <li>. See definitions.is_single(). This field only has an effect if self_closing is also True.

property invalid

Whether the tag starts with a forward slash after the opening bracket.

This makes the tag look like a lone close tag. It is technically invalid and is only parsable Wikicode when the tag itself is single-only, like <br> and <img>. See definitions.is_single_only().

property padding

Spacing to insert before the first closing >.

remove(name)[source]

Remove all attributes with the given name.

Raises ValueError if none were found.

property self_closing

Whether the tag is self-closing with no content (like <br/>).

property tag

The tag itself, as a Wikicode object.

property wiki_markup

The wikified version of a tag to show instead of HTML.

If set to a value, this will be displayed instead of the brackets. For example, set to '' to replace <i> or ---- to replace <hr>.

property wiki_style_separator

The separator between the padding and content in a wiki markup tag.

Essentially the wiki equivalent of the TagCloseOpen.

template Module

class mwparserfromhell.nodes.template.Template(name, params=None)[source]

Bases: Node

Represents a template in wikicode, like {{foo}}.

add(name, value, showkey=None, before=None, after=None, preserve_spacing=True)[source]

Add a parameter to the template with a given name and value.

name and value can be anything parsable by utils.parse_anything(); pipes and equal signs are automatically escaped from value when appropriate.

If name is already a parameter in the template, we’ll replace its value.

If showkey is given, this will determine whether or not to show the parameter’s name (e.g., {{foo|bar}}’s parameter has a name of "1" but it is hidden); otherwise, we’ll make a safe and intelligent guess.

If before is given (either a Parameter object or a name), then we will place the parameter immediately before this one. Otherwise, it will be added at the end. If before is a name and exists multiple times in the template, we will place it before the last occurrence. If before is not in the template, ValueError is raised. The argument is ignored if name is an existing parameter.

If after is given (either a Parameter object or a name), then we will place the parameter immediately after this one. If after is a name and exists multiple times in the template, we will place it after the last occurrence. If after is not in the template, ValueError is raised. The argument is ignored if name is an existing parameter or if a value is passed to before.

If preserve_spacing is True, we will try to preserve whitespace conventions around the parameter, whether it is new or we are updating an existing value. It is disabled for parameters with hidden keys, since MediaWiki doesn’t strip whitespace in this case.

get(name, default=<object object>)[source]

Get the parameter whose name is name.

The returned object is a Parameter instance. If no parameter has this name, default is returned if it was given; otherwise ValueError is raised. Since multiple parameters can have the same name, we’ll return the last match, since the last parameter is the only one read by the MediaWiki parser.

has(name, ignore_empty=False)[source]

Return True if any parameter in the template is named name.

If ignore_empty is True, a parameter named name whose value is empty does not count, so False is returned unless another parameter with that name has a non-empty value. Note that a template may have multiple parameters with the same name, but only the last one is read by the MediaWiki parser.

has_param(name, ignore_empty=False)[source]

Alias for has().

property name

The name of the template, as a Wikicode object.

property params

The list of parameters contained within the template.

remove(param, keep_field=False)[source]

Remove a parameter from the template, identified by param.

If param is a Parameter object, it will be matched exactly, otherwise it will be treated like the name argument to has() and get().

If keep_field is True, we will keep the parameter’s name, but blank its value. Otherwise, we will remove the parameter completely.

When removing a parameter with a hidden name, subsequent parameters with hidden names will be made visible. For example, removing bar from {{foo|bar|baz}} produces {{foo|2=baz}}, because in {{foo|baz}} the value baz would become parameter 1 instead of parameter 2.

If the parameter shows up multiple times in the template and param is not a Parameter object, we will remove all instances of it (and keep only one if keep_field is True - either the one with a hidden name, if it exists, or the first instance).

text Module

class mwparserfromhell.nodes.text.Text(value)[source]

Bases: Node

Represents ordinary, unformatted text with no special properties.

property value

The actual text itself.

Subpackages