matchpy.matching.syntactic module

This module contains various many-to-one matchers for syntactic patterns:

  • There DiscriminationNet class that is a many-to-one matcher for syntactic patterns.
  • The SequenceMatcher can be used to match patterns with a common surrounding operation with some fixed syntactic patterns.
  • The FlatTerm representation for an expression flattens the expression’s tree structure and allows faster preoder traversal.

Furthermore, the module contains some utility functions for working with flatterms.

class FlatTerm(expression: Union[matchpy.expressions.expressions.Expression, Sequence[Union[matchpy.expressions.expressions.Symbol, matchpy.expressions.expressions.Wildcard, Type[matchpy.expressions.expressions.Operation], Type[matchpy.expressions.expressions.Symbol], str]]])

Bases: typing.Sequence

A flattened representation of an Expression.

This is a subclass of list. This representation is similar to the prefix notation generated by Expression.preorder_iter(), but contains some additional elements.

Operation expressions are represented by the type() of the operation before the operands as well as OPERATION_END after the last operand of the operation:

>>> FlatTerm(f(a, b))
[f, a, b, )]

Variables are not included in the flatterm representation, only wildcards remain.

>>> FlatTerm(x_)
[_]

Consecutive wildcards are merged, as the DiscriminationNet cannot handle multiple consecutive sequence wildcards:

>>> FlatTerm(f(_, _))
[f, _[2], )]
>>> FlatTerm(f(_, __, __))
[f, _[3+], )]

Furthermore, every SymbolWildcard is replaced by its symbol_type:

>>> class SpecialSymbol(Symbol):
...     pass
>>> _s = Wildcard.symbol(SpecialSymbol)
>>> FlatTerm(_s)
[<class '__main__.SpecialSymbol'>]

Symbol wildcards are also not merged like other wildcards, because they can never be sequence wildcards:

>>> FlatTerm(f(_, _s))
[f, _, <class '__main__.SpecialSymbol'>, )]
__init__(expression: Union[matchpy.expressions.expressions.Expression, Sequence[Union[matchpy.expressions.expressions.Symbol, matchpy.expressions.expressions.Wildcard, Type[matchpy.expressions.expressions.Operation], Type[matchpy.expressions.expressions.Symbol], str]]]) → None

Initialize self. See help(type(self)) for accurate signature.

static _combined_wildcards_iter(flatterm: Iterator[Union[matchpy.expressions.expressions.Symbol, matchpy.expressions.expressions.Wildcard, Type[matchpy.expressions.expressions.Operation], Type[matchpy.expressions.expressions.Symbol], str]]) → Iterator[Union[matchpy.expressions.expressions.Symbol, matchpy.expressions.expressions.Wildcard, Type[matchpy.expressions.expressions.Operation], Type[matchpy.expressions.expressions.Symbol], str]]

Combine consecutive wildcards in a flatterm into a single one.

classmethod _flatterm_iter(expression: matchpy.expressions.expressions.Expression) → Iterator[Union[matchpy.expressions.expressions.Symbol, matchpy.expressions.expressions.Wildcard, Type[matchpy.expressions.expressions.Operation], Type[matchpy.expressions.expressions.Symbol], str]]

Generator that yields the atoms of the expressions in prefix notation with operation end markers.

classmethod empty() → matchpy.matching.syntactic.FlatTerm

An empty flatterm.

is_syntactic

True, iff the flatterm is syntactic.

classmethod merged(*flatterms) → matchpy.matching.syntactic.FlatTerm

Concatenate the given flatterms to a single flatterm.

Parameters:*flatterms – The flatterms which are concatenated.
Returns:The concatenated flatterms.
is_operation(term: Any) → bool

Return True iff the given term is a subclass of Operation.

is_symbol_wildcard(term: Any) → bool

Return True iff the given term is a subclass of Symbol.

class DiscriminationNet(*patterns)

Bases: typing.Generic

An automaton to distinguish which patterns match a given expression.

This is a DFA with an implicit fail state whenever a transition is not actually defined. For every pattern added, an automaton is created and then the product automaton with the existing one is used as the new automaton.

The matching assumes that patterns are linear, i.e. it will treat all variables as non-existent and only consider the wildcards.

__init__(*patterns) → None
Parameters:*patterns – Optional pattern to initially add to the discrimination net.
classmethod _generate_net(flatterm: matchpy.matching.syntactic.FlatTerm, final_label: T) → matchpy.matching.syntactic._State[T]

Generates a DFA matching the given pattern.

add(pattern: Union[matchpy.expressions.expressions.Pattern, matchpy.matching.syntactic.FlatTerm], final_label: T = None) → int

Add a pattern to the discrimination net.

Parameters:
  • pattern – The pattern which is added to the DiscriminationNet. If an expression is given, it will be converted to a FlatTerm for internal processing. You can also pass a FlatTerm directly.
  • final_label – A label that is returned if the pattern matches when using match(). This will default to the pattern itself.
Returns:

The index of the newly added pattern. This is used internally to later to get the pattern and its final label once a match is found.

as_graph() → None

Renders the discrimination net as graphviz digraph.

is_match(subject: Union[matchpy.expressions.expressions.Expression, matchpy.matching.syntactic.FlatTerm]) → bool

Check if the given subject matches any pattern in the net.

Parameters:subject – The subject that is matched. Must be constant.
Returns:True, if any pattern matches the subject.
match(subject: Union[matchpy.expressions.expressions.Expression, matchpy.matching.syntactic.FlatTerm]) → Iterator[Tuple[T, matchpy.expressions.substitution.Substitution]]

Match the given subject against all patterns in the net.

Parameters:subject – The subject that is matched. Must be constant.
Yields:A tuple (final label, substitution), where the first component is the final label associated with the pattern as given when using add() and the second one is the match substitution.
class SequenceMatcher(*patterns)

Bases: object

A matcher that matches many syntactic patterns in a surrounding sequence.

It can match patterns that have the form \(f(x^*, s_1, \dots, s_n, y^*)\) where

  • \(f\) is a non-commutative operation,
  • \(x^*, y^*\) are star sequence wildcards or variables (they can be the same of different), and
  • all the \(s_i\) are syntactic patterns.

After adding these patterns with add(), they can be matched simultaneously against a subject with match(). Note that all patterns matched by one sequence matcher must have the same outer operation \(f\).

operation

The outer operation that all patterns have in common. Is set automatically when adding the first pattern and is check for all following patterns.

__init__(*patterns) → None
Parameters:*patterns – Initial patterns to add to the sequence matcher.
add(pattern: matchpy.expressions.expressions.Pattern) → int

Add a pattern that will be recognized by the matcher.

Parameters:

pattern – The pattern to add.

Returns:

An internal index for the pattern.

Raises:
  • ValueError – If the pattern does not have the correct form.
  • TypeError – If the pattern is not a non-commutative operation.
as_graph() → None

Renders the underlying discrimination net as graphviz digraph.

classmethod can_match(pattern: matchpy.expressions.expressions.Pattern) → bool

Check if a pattern can be matched with a sequence matcher.

Parameters:pattern – The pattern to check.
Returns:True, iff the pattern can be matched with a sequence matcher.
match(subject: matchpy.expressions.expressions.Expression) → Iterator[Tuple[matchpy.expressions.expressions.Pattern, matchpy.expressions.substitution.Substitution]]

Match the given subject against all patterns in the sequence matcher.

Parameters:subject – The subject that is matched. Must be constant.
Yields:A tuple (pattern, substitution) for every matching pattern.
operation