nodes Package

This package contains Wikicode “nodes”, which represent a single unit of wikitext, such as a Template, an HTML tag, a Heading, or plain text. The node “tree” is far from flat, as most types can contain additional Wikicode types within them - and with that, more nodes. For example, the name of a Template is a Wikicode object that can contain text or more templates.

class mwparserfromhell.nodes.Node[source]

Represents the base Node type, demonstrating the methods to override.

__unicode__() must be overridden. It should return a unicode (or str in py3k) representation of the node. If the node contains Wikicode objects inside of it, __children__() should be a generator that iterates over them. If the node is printable (shown when the page is rendered), __strip__() should return its printable version, stripping out any formatting marks. The return value does not have to be a string, but it must be convertible to one with str(). Finally, __showtree__() can be overridden to build a nice tree representation of the node, if desired, for get_tree().

argument Module

class mwparserfromhell.nodes.argument.Argument(name, default=None)[source]

Bases: mwparserfromhell.nodes.Node

Represents a template argument substitution, like {{{foo}}}.

default[source]

The default value to substitute if none is passed.

This will be None if the argument wasn’t defined with one. The MediaWiki parser handles this by rendering the argument itself in the result, complete with braces. To have the argument render as nothing, set default to "" ({{{arg}}} vs. {{{arg|}}}).

name[source]

The name of the argument to substitute.

comment Module

class mwparserfromhell.nodes.comment.Comment(contents)[source]

Bases: mwparserfromhell.nodes.Node

Represents a hidden HTML comment, like <!-- foobar -->.

contents[source]

The hidden text contained between <!-- and -->.

heading Module

class mwparserfromhell.nodes.heading.Heading(title, level)[source]

Bases: mwparserfromhell.nodes.Node

Represents a section heading in wikicode, like == Foo ==.

level[source]

The heading level, as an integer between 1 and 6, inclusive.

title[source]

The title of the heading, as a Wikicode object.

html_entity Module

class mwparserfromhell.nodes.html_entity.HTMLEntity(value, named=None, hexadecimal=False, hex_char=u'x')[source]

Bases: mwparserfromhell.nodes.Node

Represents an HTML entity, like &nbsp;, either named or unnamed.

hex_char[source]

If the value is hexadecimal, this is the letter denoting that.

For example, the hex_char of "&#x1234;" is "x", whereas the hex_char of "&#X1234;" is "X". Lowercase and uppercase x are the only values supported.

hexadecimal[source]

If unnamed, this is whether the value is hexadecimal or decimal.

named[source]

Whether the entity is a string name for a codepoint or an integer.

For example, &Sigma;, &#931;, and &#x3a3; refer to the same character, but only the first is “named”, while the others are integer representations of the codepoint.

normalize()[source]

Return the unicode character represented by the HTML entity.

value[source]

The string value of the HTML entity.

tag Module

class mwparserfromhell.nodes.tag.Tag(tag, contents=None, attrs=None, wiki_markup=None, self_closing=False, invalid=False, implicit=False, padding=u'', closing_tag=None)[source]

Bases: mwparserfromhell.nodes.Node

Represents an HTML-style tag in wikicode, like <ref>.

add(name, value=None, quoted=True, pad_first=u' ', pad_before_eq=u'', pad_after_eq=u'')[source]

Add an attribute with the given name and value.

name and value can be anything parsable by utils.parse_anything(); value can be omitted if the attribute is valueless. quoted is a bool telling whether to wrap the value in double quotes (this is recommended). pad_first, pad_before_eq, and pad_after_eq are whitespace used as padding before the name, before the equal sign (or after the name if no value), and after the equal sign (ignored if no value), respectively.

attributes[source]

The list of attributes affecting the tag.

Each attribute is an instance of Attribute.

closing_tag[source]

The closing tag, as a Wikicode object.

This will usually equal tag, unless there is additional spacing, comments, or the like.

contents[source]

The contents of the tag, as a Wikicode object.

get(name)[source]

Get the attribute with the given name.

The returned object is an Attribute instance. Raises ValueError if no attribute has this name. Since multiple attributes can have the same name, we’ll return the last match, since all but the last are ignored by the MediaWiki parser.

has(name)[source]

Return whether any attribute in the tag has the given name.

Note that a tag may have multiple attributes with the same name, but only the last one is read by the MediaWiki parser.

implicit[source]

Whether the tag is implicitly self-closing, with no ending slash.

This is only possible for specific “single” tags like <br> and <li>. See definitions.is_single(). This field only has an effect if self_closing is also True.

invalid[source]

Whether the tag starts with a slash after the opening bracket.

This makes the tag look like a lone close tag. It is technically invalid and is only parsable Wikicode when the tag itself is single-only, like <br> and <img>. See definitions.is_single_only().

padding[source]

Spacing to insert before the first closing >.

remove(name)[source]

Remove all attributes with the given name.

self_closing[source]

Whether the tag is self-closing with no content (like <br/>).

tag[source]

The tag itself, as a Wikicode object.

wiki_markup[source]

The wikified version of a tag to show instead of HTML.

If set to a value, this will be displayed instead of the brackets. For example, set to '' to replace <i> or ---- to replace <hr>.

template Module

class mwparserfromhell.nodes.template.Template(name, params=None)[source]

Bases: mwparserfromhell.nodes.Node

Represents a template in wikicode, like {{foo}}.

add(name, value, showkey=None, before=None, preserve_spacing=True)[source]

Add a parameter to the template with a given name and value.

name and value can be anything parsable by utils.parse_anything(); pipes and equal signs are automatically escaped from value when appropriate.

If showkey is given, this will determine whether or not to show the parameter’s name (e.g., {{foo|bar}}’s parameter has a name of "1" but it is hidden); otherwise, we’ll make a safe and intelligent guess.

If name is already a parameter in the template, we’ll replace its value while keeping the same whitespace around it. We will also try to guess the dominant spacing convention when adding a new parameter using _get_spacing_conventions().

If before is given (either a Parameter object or a name), then we will place the parameter immediately before this one. Otherwise, it will be added at the end. If before is a name and exists multiple times in the template, we will place it before the last occurrence. If before is not in the template, ValueError is raised. The argument is ignored if the new parameter already exists.

If preserve_spacing is False, we will avoid preserving spacing conventions when changing the value of an existing parameter or when adding a new one.

get(name)[source]

Get the parameter whose name is name.

The returned object is a Parameter instance. Raises ValueError if no parameter has this name. Since multiple parameters can have the same name, we’ll return the last match, since the last parameter is the only one read by the MediaWiki parser.

has(name, ignore_empty=False)[source]

Return True if any parameter in the template is named name.

If ignore_empty is True, False is returned even when the template contains a parameter named name, provided that parameter’s value is empty. Note that a template may have multiple parameters with the same name, but only the last one is read by the MediaWiki parser.

has_param(name, ignore_empty=False)

Alias for has().

name[source]

The name of the template, as a Wikicode object.

params[source]

The list of parameters contained within the template.

remove(param, keep_field=False)[source]

Remove a parameter from the template, identified by param.

If param is a Parameter object, it will be matched exactly, otherwise it will be treated like the name argument to has() and get().

If keep_field is True, we will keep the parameter’s name, but blank its value. Otherwise, we will remove the parameter completely unless other parameters are dependent on it (e.g. removing bar from {{foo|bar|baz}} is unsafe because {{foo|baz}} is not what we expected, so {{foo||baz}} will be produced instead).

If the parameter shows up multiple times in the template and param is not a Parameter object, we will remove all instances of it (and keep only one if keep_field is True - the first instance if none have dependents, otherwise the one with dependents will be kept).

text Module

class mwparserfromhell.nodes.text.Text(value)[source]

Bases: mwparserfromhell.nodes.Node

Represents ordinary, unformatted text with no special properties.

value[source]

The actual text itself.