Nodes

pytidyhtml5.Node(Document document=None)

A node in the deserialized document.

This includes text nodes, processing instructions and more, cf. NodeType.

Getters

Node.parent

Returns the parent node.

Returns

  • pytidyhtml5.Node – The parent node.

  • None – If self.type == NodeType.root or not bool(self).

Node.child

Returns the first child node.

Returns

  • pytidyhtml5.Node – The first child node.

  • None – If not bool(self) or if self has no children.

Node.next

Returns the next sibling node.

Returns

  • pytidyhtml5.Node – The next sibling node.

  • None – If not bool(self) or there are no further siblings.

Node.prev

Returns the previous sibling node.

Returns

  • pytidyhtml5.Node – The previous sibling node.

  • None – If not bool(self) or there are no prior siblings.
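
The navigation getters above are enough to visit a whole subtree. The following is a minimal sketch, assuming only the documented behavior that child and next yield None or a falsy node when nothing follows. FakeNode is a hypothetical stand-in so that the example runs without a parsed document, and walk is an illustrative helper, not part of pytidyhtml5.

```python
class FakeNode:
    """Hypothetical stand-in exposing .parent, .child, .next and .prev."""

    def __init__(self, name, children=()):
        self.name = name
        self.parent = self.child = self.next = self.prev = None
        previous = None
        for kid in children:
            kid.parent = self
            if previous is None:
                self.child = kid
            else:
                previous.next, kid.prev = kid, previous
            previous = kid


def walk(node, depth=0):
    """Yield (depth, node) pairs in document order (pre-order, depth first)."""
    while node:
        yield depth, node
        yield from walk(node.child, depth + 1)
        node = node.next


tree = FakeNode("html", [
    FakeNode("head", [FakeNode("title")]),
    FakeNode("body", [FakeNode("p"), FakeNode("p")]),
])
names = [(depth, node.name) for depth, node in walk(tree)]
print(names)
```

The loop tests the node's truthiness rather than comparing against None, so it stops on either of the two documented "nothing here" results.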

Node.attr_first

Returns the first attribute.

Returns

  • pytidyhtml5.Attr – The first attribute.

  • None – If not bool(self) or if self has no attributes.

Node.name

Returns the tag name of the current node.

Returns

  • str – Tag name of the current node.

  • None – If the node was falsy or does not have a name.

Node.position

Returns a tuple of the position in the input stream.

Returns

  • tuple – line and column

  • None – If the node was falsy.

Node.is_text

Returns whether the current node is a text node.

Same as self.type is NodeType.Text.

Returns

  • bool – Yes or no.

  • None – If the node was falsy.

Node.id

Returns the tag of the current node numerically.

Returns

  • pytidyhtml5.TagId – Id of the current tag.

  • int – The result was not understood. Is the loaded tidy.so newer than this wrapper?

  • None – If not bool(self).

Node.type

Returns the type of the current node.

Returns

  • pytidyhtml5.NodeType – Id of the current node type.

  • int – The result was not understood. Is the loaded tidy.so newer than this wrapper?

  • None – If not bool(self).
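
Node.id and Node.type can each produce three kinds of result: an enum member, a plain int, or None. The sketch below shows one way a caller might tell them apart. FakeNodeType and its values are hypothetical stand-ins for pytidyhtml5.NodeType, and enum_or_int only illustrates how such a fallback could work; it is not taken from the wrapper's source.

```python
from enum import IntEnum


class FakeNodeType(IntEnum):
    """Hypothetical stand-in for pytidyhtml5.NodeType; the values are illustrative."""
    Root = 0
    Text = 4


def enum_or_int(enum_cls, raw):
    """Return the matching enum member, or the raw int if the value is unknown."""
    try:
        return enum_cls(raw)
    except ValueError:
        return raw


def describe(value):
    """Caller-side handling of the three documented result kinds."""
    if value is None:
        return "falsy node"
    # Check the enum first: IntEnum members are also instances of int.
    if isinstance(value, FakeNodeType):
        return f"known: {value.name}"
    return f"unknown raw value: {value}"


known = describe(enum_or_int(FakeNodeType, 4))
unknown = describe(enum_or_int(FakeNodeType, 999))
missing = describe(None)
print(known, unknown, missing, sep="\n")
```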

Setters

Node.discard(self)

Removes the current node from the document.

If you hold a second reference to the current node, do not use it anymore. In the best case the program will only crash.

Returns

  • pytidyhtml5.Node – The next sibling. The result may be falsy if the discarded node was the last node.

  • None – If not bool(self).
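
Because discard returns the next sibling, a removal loop can continue from the return value and never touch the discarded reference again. The following is a sketch under that assumption. FakeNode is a hypothetical stand-in whose discard merely unlinks the node from a sibling list, and remove_children_named is an illustrative helper, not a pytidyhtml5 function.

```python
class FakeNode:
    """Hypothetical stand-in: a linked sibling list with a discard() method."""

    def __init__(self, name):
        self.name = name
        self.parent = self.child = self.next = self.prev = None

    def append(self, kid):
        """Example-only helper that adds `kid` as the last child."""
        kid.parent = self
        if self.child is None:
            self.child = kid
        else:
            last = self.child
            while last.next:
                last = last.next
            last.next, kid.prev = kid, last
        return kid

    def discard(self):
        """Unlink this node and return its next sibling (None if it was last)."""
        following = self.next
        if self.prev is not None:
            self.prev.next = following
        elif self.parent is not None:
            self.parent.child = following
        if following is not None:
            following.prev = self.prev
        self.parent = self.next = self.prev = None
        return following


def remove_children_named(parent, name):
    """Discard every direct child called `name` and return how many were removed."""
    removed = 0
    node = parent.child
    while node:
        if node.name == name:
            node = node.discard()  # continue from the returned sibling
            removed += 1
        else:
            node = node.next
    return removed


body = FakeNode("body")
for tag in ("p", "script", "p", "script"):
    body.append(FakeNode(tag))

removed = remove_children_named(body, "script")
remaining = []
node = body.child
while node:
    remaining.append(node.name)
    node = node.next
print(removed, remaining)
```

The stand-in returns None after the last sibling, whereas the documented method may return a falsy Node; the truthiness test in the loop covers both cases.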

Iterators

Node.iter_attrs(self)

Yield the attributes of the current node.

Essentially the same as

attr = self.get_attr_first()
while attr:
    yield attr
    attr = attr.get_next()

Expect undefined behavior if you alter the tree during an iteration.

Yields

pytidyhtml5.Attr – Attributes of the current node.

Node.iter_children(self)

Yield the children of the current node.

Essentially the same as

node = self.get_child()
while node:
    yield node
    node = node.get_next()

Expect undefined behavior if you alter the tree during an iteration.

Yields

pytidyhtml5.Node – Children of the current node.

Node.attr

A proxy for easier access of the node’s attributes.

E.g.

self.attr['id']                 # retrieve attribute value
self.attr['id'] = 'some_value'  # set attribute value
del self.attr['id']             # discard attribute

set(self.attr) == { AttrId.ALT, ... }  # Iterate over set ids

Miscellaneous

Node.__bool__()

A Node is truthy if it has an assigned TidyNode (i.e. if it was created by some Document method), and the document has not been released in the meantime.

Node.__eq__()

Compares the underlying node.

If you get the same node twice from a document, you will have two distinct pytidyhtml5.Node instances that compare equal.

Node.__ne__()

Simply not (self == other)
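
The difference between equality and identity, and the effect of releasing the document, can be illustrated with a small model. This is only a sketch of the semantics described above: FakeDocument, FakeNode, the handle attribute and the released flag are all hypothetical names, and the real wrapper may be implemented differently.

```python
class FakeDocument:
    """Hypothetical stand-in for a document that can be released."""

    def __init__(self):
        self.released = False


class FakeNode:
    """Hypothetical wrapper; `handle` plays the role of the underlying TidyNode."""

    def __init__(self, document=None, handle=None):
        self.document = document
        self.handle = handle

    def __bool__(self):
        # Truthy only with an assigned handle and a document that is still alive.
        return (
            self.handle is not None
            and self.document is not None
            and not self.document.released
        )

    def __eq__(self, other):
        if not isinstance(other, FakeNode):
            return NotImplemented
        return self.handle == other.handle

    def __ne__(self, other):
        result = self.__eq__(other)
        return result if result is NotImplemented else not result


doc = FakeDocument()
first = FakeNode(doc, handle=1)
second = FakeNode(doc, handle=1)

same_value = first == second   # both wrap the same underlying node
same_object = first is second  # but they are separate Python objects
truthy_before = bool(first)
doc.released = True
truthy_after = bool(first)
empty = bool(FakeNode())
print(same_value, same_object, truthy_before, truthy_after, empty)
```

In practice this means nodes should be compared with == rather than is, and a node kept around after its document is gone should be tested for truthiness before use.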

Getters as methods

All getters exist as methods, too:

pytidyhtml5.Node.get_parent(self)

Returns the parent node.

pytidyhtml5.Node.get_child(self)

Returns the first child node.

pytidyhtml5.Node.get_next(self)

Returns the next sibling node.

pytidyhtml5.Node.get_prev(self)

Returns the previous sibling node.

pytidyhtml5.Node.get_attr_first(self)

Returns the first attribute.

pytidyhtml5.Node.get_name(self)

Returns the tag name of the current node.

pytidyhtml5.Node.get_position(self)

Returns a tuple of the position in the input stream.

pytidyhtml5.Node.get_is_text(self)

Returns whether the current node is a text node.

pytidyhtml5.Node.get_tag_id(self)

Returns the tag of the current node numerically.

pytidyhtml5.Node.get_node_type(self)

Returns the type of the current node.