Main Wrapper Classes¶
- class soupy.Node(value)¶
The Node class is the main wrapper around BeautifulSoup elements like Tag. It implements many of the same properties and methods as BeautifulSoup for navigating through documents, like find, select, parents, etc.
- dump(**kwargs)¶
Extract derived values into a Scalar(dict)
The keyword names passed to this function become keys in the resulting dictionary.
The keyword values are functions that are called on this Node.
Notes
- The input functions are called on the Node, not the underlying BeautifulSoup element
- If the function returns a wrapper, it will be unwrapped
Example
>>> soup = Soupy("<b>hi</b>").find('b')
>>> data = soup.dump(name=Q.name, text=Q.text).val()
>>> data == {'text': 'hi', 'name': 'b'}
True
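The keyword-to-function mapping described for dump can be pictured in plain Python. This is only a sketch of the idea, not soupy's implementation: `toy_dump` and the `Tag` namedtuple are hypothetical stand-ins, and ordinary lambdas take the place of Q expressions.

```python
from collections import namedtuple

# Hypothetical stand-in for a parsed element; not a soupy or BeautifulSoup type.
Tag = namedtuple("Tag", ["name", "text"])


def toy_dump(obj, **extractors):
    """Call each keyword's function on obj and collect the results by keyword name."""
    return {key: func(obj) for key, func in extractors.items()}


data = toy_dump(
    Tag(name="b", text="hi"),
    name=lambda tag: tag.name,   # keyword name becomes the dict key
    text=lambda tag: tag.text,   # keyword value is the extractor function
)
print(data)
```

Each keyword contributes exactly one key, so the shape of the result is fixed by the call site rather than by the document being parsed.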
- val()¶
Return the value inside a wrapper.
Raises NullValueError if called on a Null object
- orelse(value)¶
Provide a fallback value for failed matches.
Examples
>>> Scalar(5).orelse(10).val()
5
>>> Null().orelse(10).val()
10
- nonnull()¶
Require that a node is not null
Null values will raise NullValueError, whereas nonnull values return self.
Useful for being strict about portions of queries.
Examples
node.find('a').nonnull().find('b').orelse(3)
This will raise an error if find('a') doesn’t match, but provides a fallback if find('b') doesn’t match.
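The interplay of nonnull() and orelse() can be modelled with two small classes. This is a toy model under stated assumptions, not soupy's code: `ToyValue`, `ToyNull`, `ToyNullError` and the `get` lookup are all hypothetical names invented for the illustration.

```python
class ToyNullError(Exception):
    """Hypothetical stand-in for NullValueError."""


class ToyValue:
    """A wrapper holding a real value."""

    def __init__(self, value):
        self._value = value

    def val(self):
        return self._value

    def orelse(self, fallback):
        return self  # a real value ignores the fallback

    def nonnull(self):
        return self  # nothing to complain about

    def get(self, key):
        # A lookup that may fail; failure yields ToyNull instead of raising.
        if isinstance(self._value, dict) and key in self._value:
            return ToyValue(self._value[key])
        return ToyNull()


class ToyNull:
    """A wrapper representing a failed match."""

    def val(self):
        raise ToyNullError("no value")

    def orelse(self, fallback):
        return ToyValue(fallback)  # recover with the fallback

    def nonnull(self):
        raise ToyNullError("required value is missing")

    def get(self, key):
        return ToyNull()  # failures propagate silently


doc = ToyValue({"a": {"c": 1}})

# 'a' exists, 'b' does not: the fallback is used.
soft = doc.get("a").nonnull().get("b").orelse(3).val()

# 'x' does not exist: nonnull() raises before orelse() is reached.
try:
    doc.get("x").nonnull().get("b").orelse(3)
    strict_failed = False
except ToyNullError:
    strict_failed = True
```

Placing nonnull() after a step marks that step as mandatory, while later steps in the same chain can still fall back quietly.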
- require(func, msg=u'Requirement violated')¶
Assert that self.apply(func) is True.
Parameters:
- func – func(wrapper)
- msg – str. The error message to display on failure.
Returns: If self.apply(func) is True, returns self. Otherwise, raises NullValueError.
- attrs¶
A Scalar of this Node’s attribute dictionary
Example
>>> Soupy("<a val=3></a>").find('a').attrs
Scalar({'val': '3'})
- children¶
A Collection of the child elements.
- contents¶
A Collection of the child elements.
- descendants¶
A Collection of all elements nested inside this Node.
- find(*args, **kwargs)¶
Find a single Node among this Node’s descendants.
Returns NullNode if nothing matches.
The inputs to this function follow the same semantics as BeautifulSoup. See http://bit.ly/bs4doc for more info.
Examples
- node.find('a') # look for a tags
- node.find('a', 'foo') # look for a tags with class=`foo`
- node.find(func) # find tag where func(tag) is True
- node.find(val=3) # look for tag like <a val=3>
- find_all(*args, **kwargs)¶
Like find(), but selects all matches (not just the first one).
Returns a Collection.
If no elements match, this returns a Collection with no items.
- find_next_sibling(*args, **kwargs)¶
Like find(), but searches through next_siblings
- find_next_siblings(*args, **kwargs)¶
Like find_all(), but searches through next_siblings
- find_parents(*args, **kwargs)¶
Like find_all(), but searches through parents
- find_previous_sibling(*args, **kwargs)¶
Like find(), but searches through previous_siblings
- find_previous_siblings(*args, **kwargs)¶
Like find_all(), but searches through previous_siblings
- name¶
A Scalar of this Node’s tag name.
Example
>>> node = Soupy('<p>hi there</p>').find('p')
>>> node
Node(<p>hi there</p>)
>>> node.name
Scalar('p')
- next_siblings¶
A Collection of all siblings after this node
- parents¶
A Collection of the parent elements.
- previous_siblings¶
A Collection of all siblings before this node
- select(selector)¶
Like find_all(), but takes a CSS selector string as input.
- class soupy.Collection(items)¶
Collections store lists of other wrappers.
They support most of the list methods (len, iter, getitem, etc).
- apply(func)¶
Call a function on a wrapper, and wrap the result if necessary.
Parameters: func – function(wrapper) -> val
Examples
>>> s = Scalar(5)
>>> s.apply(lambda val: isinstance(val, Scalar))
Scalar(True)
- map(func)¶
Call a function on a wrapper’s value, and wrap the result if necessary.
Parameters: func – function(val) -> val
Examples
>>> s = Scalar(3)
>>> s.map(Q * 2)
Scalar(6)
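The difference between apply and map is what the function receives: the wrapper itself, or the value inside it. The sketch below makes that contrast visible with a hypothetical `ToyScalar` class; it is an illustration of the documented behaviour, not soupy's implementation.

```python
class ToyScalar:
    """Hypothetical minimal wrapper, used only to contrast apply and map."""

    def __init__(self, value):
        self._value = value

    def val(self):
        return self._value

    def apply(self, func):
        # func receives the wrapper itself
        return _wrap(func(self))

    def map(self, func):
        # func receives the unwrapped value
        return _wrap(func(self._value))


def _wrap(result):
    """Wrap a raw result, leaving already-wrapped results untouched."""
    return result if isinstance(result, ToyScalar) else ToyScalar(result)


s = ToyScalar(3)
saw_wrapper = s.apply(lambda w: isinstance(w, ToyScalar)).val()
saw_value = s.map(lambda v: isinstance(v, ToyScalar)).val()
doubled = s.map(lambda v: v * 2).val()
```

Under this model, map suits plain functions that know nothing about wrappers, while apply suits functions that want to call wrapper methods.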
- all()¶
Scalar(True) if all items are truthy, or collection is empty.
- any()¶
Scalar(True) if any items are truthy. Scalar(False) if the collection is empty.
- dictzip(keys)¶
Turn this collection into a Scalar(dict), by zipping keys and items.
Parameters: keys – list or Collection of NavigableStrings. The keys of the dictionary.
Examples
>>> c = Collection([Scalar(1), Scalar(2)])
>>> c.dictzip(['a', 'b']).val() == {'a': 1, 'b': 2}
True
- dropwhile(func)¶
Return a new Collection with the first few items removed.
Parameters: func – function(Node) -> Node
Returns: A new Collection, discarding all items before the first item where bool(func(item)) == True
- dump(*args, **kwargs)¶
Build a list of dicts, by calling Node.dump() on each item.
Each keyword provides a function that extracts a value from a Node.
Examples
>>> c = Collection([Scalar(1), Scalar(2)])
>>> c.dump(x2=Q*2, m1=Q-1).val()
[{'x2': 2, 'm1': 0}, {'x2': 4, 'm1': 1}]
- each(func)¶
Call func on each element in the collection
Returns a new Collection.
- filter(func)¶
Return a new Collection with some items removed.
Parameters: func – function(Node) -> Node
Returns: A new Collection consisting of the items where bool(func(item)) == True
Examples
node.find_all('a').filter(Q['href'].startswith('http'))
- none()¶
Scalar(True) if no items are truthy, or collection is empty.
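The empty-collection rules stated for all(), any() and none() line up with Python's built-in truth functions. The sketch below uses plain lists rather than Collections, and writing none as `not any(...)` is an assumption made for illustration, not a statement about soupy's source.

```python
empty = []
mixed = [0, "", 3]  # two falsy items and one truthy item

# Empty input: all is vacuously true, any is false, none is true.
all_empty = all(empty)
any_empty = any(empty)
none_empty = not any(empty)

# Mixed input: not all truthy, at least one truthy, so none is false.
all_mixed = all(mixed)
any_mixed = any(mixed)
none_mixed = not any(mixed)
```

Note that all and none can both be true at once, but only for an empty input.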
- takewhile(func)¶
Return a new Collection with the last few items removed.
Parameters: func – function(Node) -> Node
Returns: A new Collection, discarding all items at and after the first item where bool(func(item)) == False
Examples
node.find_all('tr').takewhile(Q.find_all('td').count() > 3)
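The rule "discard everything at and after the first failing item" is the same one Python's `itertools.takewhile` follows, so plain lists are enough to show it. This sketch assumes that equivalence from the description above; the `rows` data is made up, with each inner list standing in for the cells of a table row.

```python
from itertools import takewhile

# Hypothetical rows; each inner list stands in for the cells of one table row.
rows = [
    [1, 2, 3, 4],
    [5, 6, 7, 8, 9],
    [1, 2],            # first row with 3 or fewer cells: cut happens here
    [1, 2, 3, 4, 5],   # would pass the test, but comes after the cut
]

kept = list(takewhile(lambda row: len(row) > 3, rows))
print(kept)
```

Unlike filter, the last row is dropped even though it satisfies the predicate, because takewhile stops at the first failure.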
- val()¶
Unwrap each item in the collection and return the results as a list.
- zip(*others)¶
Zip the items of this collection with one or more other sequences, and wrap the result.
Unlike Python’s zip, all sequences must be the same length.
Parameters: others – One or more iterables or Collections
Returns: A new collection.
Examples
>>> c1 = Collection([Scalar(1), Scalar(2)])
>>> c2 = Collection([Scalar(3), Scalar(4)])
>>> c1.zip(c2).val()
[(1, 3), (2, 4)]
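Python's built-in zip silently truncates to the shortest input, which is the behaviour the same-length requirement rules out. A minimal sketch of a length-checked zip over plain lists follows; `strict_zip` is a hypothetical helper, and raising ValueError on a mismatch is an assumption, since the text above does not say which exception soupy uses.

```python
def strict_zip(*seqs):
    """Zip sequences, refusing inputs of unequal length."""
    seqs = [list(seq) for seq in seqs]
    lengths = {len(seq) for seq in seqs}
    if len(lengths) > 1:
        # Assumed exception type; chosen for illustration only.
        raise ValueError("all sequences must be the same length")
    return list(zip(*seqs))


pairs = strict_zip([1, 2], [3, 4])

try:
    strict_zip([1, 2], [3])
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```

Failing loudly on a mismatch makes it harder to pair up the wrong cells when two selections return different numbers of matches.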
- class soupy.Scalar(value)¶
A wrapper around single values.
Scalars support boolean testing (<, ==, etc), and use the wrapped value in the comparison. They return the result as a Scalar(bool).
Calling a Scalar calls the wrapped value, and wraps the result.
Examples
>>> s = Scalar(3)
>>> s > 2
Scalar(True)
>>> s.val()
3
>>> s + 5
Scalar(8)
>>> s + s
Scalar(6)
>>> bool(Scalar(3))
True
>>> Scalar(lambda x: x+2)(5)
Scalar(7)
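Operator forwarding of this kind is usually built from Python's special methods. The sketch below shows one way the behaviour in the examples could be reproduced; `ToyScalar` is a hypothetical class covering only `>`, `+`, calling and truth testing, and is not soupy's implementation.

```python
class ToyScalar:
    """Hypothetical wrapper that forwards a few operations to its value."""

    def __init__(self, value):
        self._value = value

    def val(self):
        return self._value

    @staticmethod
    def _unwrap(other):
        # Accept both raw values and other ToyScalars as operands.
        return other._value if isinstance(other, ToyScalar) else other

    def __gt__(self, other):
        return ToyScalar(self._value > self._unwrap(other))

    def __add__(self, other):
        return ToyScalar(self._value + self._unwrap(other))

    def __call__(self, *args, **kwargs):
        # Calling the wrapper calls the wrapped value and wraps the result.
        return ToyScalar(self._value(*args, **kwargs))

    def __bool__(self):
        return bool(self._value)


s = ToyScalar(3)
greater = (s > 2).val()
plus_raw = (s + 5).val()
plus_wrapped = (s + s).val()
called = ToyScalar(lambda x: x + 2)(5).val()
```

Returning a wrapper from every operation is what lets comparisons and arithmetic be chained with further wrapper methods.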