nodes Package¶
This package contains Wikicode “nodes”, each of which represents a single unit of wikitext, such as a Template, an HTML tag, a Heading, or plain text. The node “tree” is far from flat, as most types can contain additional Wikicode types within them - and with that, more nodes. For example, the name of a Template is a Wikicode object that can contain text or more templates.
- class mwparserfromhell.nodes.Node[source]¶
Represents the base Node type, demonstrating the methods to override.
__str__() must be overridden. It should return a str representation of the node. If the node contains Wikicode objects inside of it, __children__() should be a generator that iterates over them. If the node is printable (shown when the page is rendered), __strip__() should return its printable version, stripping out any formatting marks. It does not have to return a string, but something that can be converted to a string with str(). Finally, __showtree__() can be overridden to build a nice tree representation of the node, if desired, for get_tree().
_base
Module¶
- class mwparserfromhell.nodes._base.Node[source]¶
Bases:
StringMixIn
Represents the base Node type, demonstrating the methods to override.
__str__() must be overridden. It should return a str representation of the node. If the node contains Wikicode objects inside of it, __children__() should be a generator that iterates over them. If the node is printable (shown when the page is rendered), __strip__() should return its printable version, stripping out any formatting marks. It does not have to return a string, but something that can be converted to a string with str(). Finally, __showtree__() can be overridden to build a nice tree representation of the node, if desired, for get_tree().
argument
Module¶
- class mwparserfromhell.nodes.argument.Argument(name, default=None)[source]¶
Bases:
Node
Represents a template argument substitution, like {{{foo}}}.
- property default¶
The default value to substitute if none is passed.
This will be None if the argument wasn’t defined with one. The MediaWiki parser handles this by rendering the argument itself in the result, complete with braces. To have the argument render as nothing, set default to "" ({{{arg}}} vs. {{{arg|}}}).
- property name¶
The name of the argument to substitute.
comment
Module¶
external_link
Module¶
heading
Module¶
html_entity
Module¶
- class mwparserfromhell.nodes.html_entity.HTMLEntity(value, named=None, hexadecimal=False, hex_char='x')[source]¶
Bases:
Node
Represents an HTML entity, like &nbsp;, either named or unnamed.
- property hex_char¶
If the value is hexadecimal, this is the letter denoting that.
For example, the hex_char of "&#x1234;" is "x", whereas the hex_char of "&#X1234;" is "X". Lowercase and uppercase x are the only values supported.
- property hexadecimal¶
If unnamed, this is whether the value is hexadecimal or decimal.
- property named¶
Whether the entity is a string name for a codepoint or an integer.
For example, &Sigma;, &#931;, and &#x3a3; refer to the same character, but only the first is “named”, while the others are integer representations of the codepoint.
- property value¶
The string value of the HTML entity.
tag
Module¶
- class mwparserfromhell.nodes.tag.Tag(tag, contents=None, attrs=None, wiki_markup=None, self_closing=False, invalid=False, implicit=False, padding='', closing_tag=None, wiki_style_separator=None, closing_wiki_markup=None)[source]¶
Bases:
Node
Represents an HTML-style tag in wikicode, like <ref>.
- add(name, value=None, quotes='"', pad_first=' ', pad_before_eq='', pad_after_eq='')[source]¶
Add an attribute with the given name and value.
name and value can be anything parsable by utils.parse_anything(); value can be omitted if the attribute is valueless. If quotes is not None, it should be a string (either " or ') that value will be wrapped in (this is recommended). None is only legal if value contains no spacing.
pad_first, pad_before_eq, and pad_after_eq are whitespace used as padding before the name, before the equal sign (or after the name if no value), and after the equal sign (ignored if no value), respectively.
- property attributes¶
The list of attributes affecting the tag.
Each attribute is an instance of Attribute.
- property closing_tag¶
The closing tag, as a Wikicode object.
This will usually equal tag, unless there is additional spacing, comments, or the like.
- property closing_wiki_markup¶
The wikified version of the closing tag to show instead of HTML.
If set to a value, this will be displayed instead of the close tag brackets. If self_closing is True, this is not displayed. If wiki_markup is set and this has not been set, this is set to the value of wiki_markup. If this has been set and wiki_markup is set to a False value, this is set to None.
- get(name)[source]¶
Get the attribute with the given name.
The returned object is an Attribute instance. Raises ValueError if no attribute has this name. Because multiple attributes can have the same name, we’ll return the last match, since all but the last are ignored by the MediaWiki parser.
- has(name)[source]¶
Return whether any attribute in the tag has the given name.
Note that a tag may have multiple attributes with the same name, but only the last one is read by the MediaWiki parser.
- property implicit¶
Whether the tag is implicitly self-closing, with no ending slash.
This is only possible for specific “single” tags like <br> and <li>. See definitions.is_single(). This field only has an effect if self_closing is also True.
- property invalid¶
Whether the tag starts with a forward slash after the opening bracket.
This makes the tag look like a lone close tag. It is technically invalid and is only parsable Wikicode when the tag itself is single-only, like <br> and <img>. See definitions.is_single_only().
- property padding¶
Spacing to insert before the first closing >.
- remove(name)[source]¶
Remove all attributes with the given name.
Raises ValueError if none were found.
- property self_closing¶
Whether the tag is self-closing with no content (like <br/>).
- property wiki_markup¶
The wikified version of a tag to show instead of HTML.
If set to a value, this will be displayed instead of the brackets. For example, set to '' to replace <i> or ---- to replace <hr>.
- property wiki_style_separator¶
The separator between the padding and content in a wiki markup tag.
Essentially the wiki equivalent of the TagCloseOpen.
template
Module¶
- class mwparserfromhell.nodes.template.Template(name, params=None)[source]¶
Bases:
Node
Represents a template in wikicode, like {{foo}}.
- add(name, value, showkey=None, before=None, after=None, preserve_spacing=True)[source]¶
Add a parameter to the template with a given name and value.
name and value can be anything parsable by utils.parse_anything(); pipes and equal signs are automatically escaped from value when appropriate.
If name is already a parameter in the template, we’ll replace its value.
If showkey is given, this will determine whether or not to show the parameter’s name (e.g., {{foo|bar}}’s parameter has a name of "1" but it is hidden); otherwise, we’ll make a safe and intelligent guess.
If before is given (either a Parameter object or a name), then we will place the parameter immediately before this one. Otherwise, it will be added at the end. If before is a name and exists multiple times in the template, we will place it before the last occurrence. If before is not in the template, ValueError is raised. The argument is ignored if name is an existing parameter.
If after is given (either a Parameter object or a name), then we will place the parameter immediately after this one. If after is a name and exists multiple times in the template, we will place it after the last occurrence. If after is not in the template, ValueError is raised. The argument is ignored if name is an existing parameter or if a value is passed to before.
If preserve_spacing is True, we will try to preserve whitespace conventions around the parameter, whether it is new or we are updating an existing value. It is disabled for parameters with hidden keys, since MediaWiki doesn’t strip whitespace in this case.
- get(name, default=<object object>)[source]¶
Get the parameter whose name is name.
The returned object is a Parameter instance. Raises ValueError if no parameter has this name. If default is set, returns that instead. Because multiple parameters can have the same name, we’ll return the last match, since the last parameter is the only one read by the MediaWiki parser.
- has(name, ignore_empty=False)[source]¶
Return True if any parameter in the template is named name.
With ignore_empty, False will be returned if the only parameters named name have empty values. Note that a template may have multiple parameters with the same name, but only the last one is read by the MediaWiki parser.
- property params¶
The list of parameters contained within the template.
- remove(param, keep_field=False)[source]¶
Remove a parameter from the template, identified by param.
If param is a Parameter object, it will be matched exactly, otherwise it will be treated like the name argument to has() and get().
If keep_field is True, we will keep the parameter’s name, but blank its value. Otherwise, we will remove the parameter completely.
When removing a parameter with a hidden name, subsequent parameters with hidden names will be made visible. For example, removing bar from {{foo|bar|baz}} produces {{foo|2=baz}} because {{foo|baz}} is incorrect.
If the parameter shows up multiple times in the template and param is not a Parameter object, we will remove all instances of it (and keep only one if keep_field is True - either the one with a hidden name, if it exists, or the first instance).