matchpy.utils module

This module contains various utility functions.

fixed_integer_vector_iter(max_vector: Tuple[int, ...], vector_sum: int) → Iterator[Tuple[int, ...]]

Return an iterator over the integer vectors which

  • are componentwise less than or equal to max_vector, and
  • are non-negative, and where
  • the sum of their components is exactly vector_sum.

The iterator yields the vectors in lexicographical order.

Examples

List all vectors that are between (0, 0) and (2, 2) componentwise, where the sum of components is 2:

>>> vectors = list(fixed_integer_vector_iter([2, 2], 2))
>>> vectors
[(0, 2), (1, 1), (2, 0)]
>>> list(map(sum, vectors))
[2, 2, 2]
Parameters:
  • max_vector – Maximum vector for the iteration. Every yielded result will be less than or equal to this componentwise.
  • vector_sum – Every iterated vector will have a component sum equal to this value.
Yields:

All non-negative vectors that have the given sum and are not larger than the given maximum.

Raises:

ValueError – If vector_sum is negative.

weak_composition_iter(n: int, num_parts: int) → Iterator[Tuple[int, ...]]

Yield all weak compositions of integer n into num_parts parts.

Each composition is yielded as a tuple. The generated partitions are order-dependant and not unique when ignoring the order of the components. The partitions are yielded in lexicographical order.

Example

>>> compositions = list(weak_composition_iter(5, 2))
>>> compositions
[(0, 5), (1, 4), (2, 3), (3, 2), (4, 1), (5, 0)]

We can easily verify that all compositions are indeed valid:

>>> list(map(sum, compositions))
[5, 5, 5, 5, 5, 5]

The algorithm was adapted from an answer to this Stackoverflow question.

Parameters:
  • n – The integer to partition.
  • num_parts – The number of parts for the combination.
Yields:

All non-negative tuples that have the given sum and size.

Raises:

ValueError – If n or num_parts are negative.

commutative_sequence_variable_partition_iter(values: multiset.Multiset, variables: List[matchpy.utils.VariableWithCount]) → Iterator[Dict[str, multiset.Multiset]]

Yield all possible variable substitutions for given values and variables.

Note

The results are not yielded in any particular order because the algorithm uses dictionaries. Dictionaries until Python 3.6 do not keep track of the insertion order.

Example

For a subject like fc(a, a, a, b, b, c) and a pattern like f(x__, y___, y___) one can define the following input parameters for the partitioning:

>>> x = VariableWithCount(name='x', count=1, minimum=1, default=None)
>>> y = VariableWithCount(name='y', count=2, minimum=0, default=None)
>>> values = Multiset('aaabbc')

Then the solutions are found (and sorted to get a unique output):

>>> substitutions = commutative_sequence_variable_partition_iter(values, [x, y])
>>> as_strings = list(str(Substitution(substitution)) for substitution in substitutions)
>>> for substitution in sorted(as_strings):
...     print(substitution)
{x ↦ {a, a, a, b, b, c}, y ↦ {}}
{x ↦ {a, a, a, c}, y ↦ {b}}
{x ↦ {a, b, b, c}, y ↦ {a}}
{x ↦ {a, c}, y ↦ {a, b}}
Parameters:
  • values – The multiset of values which are partitioned and distributed among the variables.
  • variables – A list of the variables to distribute the values among. Each variable has a name, a count of how many times it occurs and a minimum number of values it needs.
Yields:

Each possible substitutions that is a valid partitioning of the values among the variables.

get_short_lambda_source(lambda_func: function) → Optional[str]

Return the source of a (short) lambda function. If it’s impossible to obtain, return None.

The source is returned without the lambda and signature parts:

>>> get_short_lambda_source(lambda x, y: x < y)
'x < y'

This should work well for most lambda definitions, however for multi-line or highly nested lambdas, the source extraction might not succeed.

Parameters:lambda_func – The lambda function.
Returns:The source of the lambda function without its signature.
solve_linear_diop(total: int, *coeffs) → Iterator[Tuple[int, ...]]

Yield non-negative integer solutions of a linear Diophantine equation of the format \(c_1 x_1 + \dots + c_n x_n = total\).

If there are at most two coefficients, base_solution_linear() is used to find the solutions. Otherwise, the solutions are found recursively, by reducing the number of variables in each recursion:

  1. Compute \(d := gcd(c_2, \dots , c_n)\)
  2. Solve \(c_1 x + d y = total\)
  3. Recursively solve \(c_2 x_2 + \dots + c_n x_n = y\) for each solution for \(y\)
  4. Combine these solutions to form a solution for the whole equation
Parameters:
  • total – The constant of the equation.
  • *coeffs – The coefficients \(c_i\) of the equation.
Yields:

The non-negative integer solutions of the equation as a tuple \((x_1, \dots, x_n)\).

generator_chain(initial_data: T, *factories) → Iterator[T]

Chain multiple generators together by passing results from one to the next.

This helper function allows to create a chain of generator where each generator is constructed by a factory that gets the data yielded by the previous generator. So each generator can generate new data dependant on the data yielded by the previous one. For each data item yielded by a generator, a new generator is constructed by the next factory.

Example

Lets say for every number from 0 to 4, we want to count up to that number. Then we can do something like this using list comprehensions:

>>> [i for n in range(1, 5) for i in range(1, n + 1)]
[1, 1, 2, 1, 2, 3, 1, 2, 3, 4]

You can use this function to achieve the same thing:

>>> list(generator_chain(5, lambda n: iter(range(1, n)), lambda i: iter(range(1, i + 1))))
[1, 1, 2, 1, 2, 3, 1, 2, 3, 4]

The advantage is, that this is independent of the number of dependant generators you have. Also, this function does not use recursion so it is safe to use even with large generator counts.

Parameters:
  • initial_data – The initial data that is passed to the first generator factory.
  • *factories – The generator factories. Each of them gets passed its predecessors data and has to return an iterable. The data from this iterable is passed to the next factory.
Yields:

Every data item yielded by the generators of the final factory.

class cached_property(getter, slot=None)

Bases: property

Property with caching.

An extension of the builtin property, that caches the value after the first access. This is useful in case the computation of the property value is expensive.

Use it just like a regular property decorator. Cached properties cannot have a setter.

Example

First, create a class with a cached property:

>>> class MyClass:
...     @cached_property
...     def my_property(self):
...         print('my_property called')
...         return 42
>>> instance = MyClass()

Then, access the property and get the computed value:

>>> instance.my_property
my_property called
42

Now the result is cached and the original method will not be called again:

>>> instance.my_property
42
__init__(getter, slot=None)

Use it as a decorator:

>>> class MyClass:
...     @cached_property
...     def my_property(self):
...         return 42

The slot argument specifies a class slot to use for caching the property. You should use the slot_cached_property decorator instead as that is more convenient.

Parameters:
  • getter – The getter method for the property.
  • slot – Optional slot to use for the cached value. Only relevant in classes that use slots. Use slot_cached_property instead.
Returns:

The wrapped property with caching.

slot_cached_property(slot)

Property with caching for classes with slots.

This is a wrapper around cached_property to be used with classes that have slots. It provides an extension of the builtin property, that caches the value in a slot after the first access. You need to specify which slot to use for the cached value.

Example

First, create a class with a cached property and a slot to hold the cached value:

>>> class MyClass:
...     __slots__ = ('_my_cached_property', )
...
...     @slot_cached_property('_my_cached_property')
...     def my_property(self):
...         print('my_property called')
...         return 42
...
>>> instance = MyClass()

Then, access the property and get the computed value:

>>> instance.my_property
my_property called
42

Now the result is cached and the original method will not be called again:

>>> instance.my_property
42
Parameters:slot – The name of the classes slot to use for the cached value.
Returns:The wrapped cached_property.
extended_euclid(a: int, b: int) → Tuple[int, int, int]

Extended Euclidean algorithm that computes the Bézout coefficients as well as \(gcd(a, b)\)

Returns x, y, d where x and y are a solution to \(ax + by = d\) and \(d = gcd(a, b)\). x and y are a minimal pair of Bézout’s coefficients.

See Extended Euclidean algorithm or Bézout’s identity for more information.

Example

Compute the Bézout coefficients and GCD of 42 and 12:

>>> a, b = 42, 12
>>> x, y, d = extended_euclid(a, b)
>>> x, y, d
(1, -3, 6)

Verify the results:

>>> import math
>>> d == math.gcd(a, b)
True
>>> a * x + b * y == d
True
Parameters:
  • a – The first integer.
  • b – The second integer.
Returns:

A tuple with the Bézout coefficients and the greatest common divider of the arguments.

base_solution_linear(a: int, b: int, c: int) → Iterator[Tuple[int, int]]

Yield solutions for a basic linear Diophantine equation of the form \(ax + by = c\).

First, the equation is normalized by dividing \(a, b, c\) by their gcd. Then, the extended Euclidean algorithm (extended_euclid()) is used to find a base solution \((x_0, y_0)\).

All non-negative solutions are generated by using that the general solution is \((x_0 + b t, y_0 - a t)\). Because the base solution is one of the minimal pairs of Bézout’s coefficients, for all non-negative solutions either \(t \geq 0\) or \(t \leq 0\) must hold. Also, all the non-negative solutions are consecutive with respect to \(t\).

Hence, by adding or subtracting \(a\) resp. \(b\) from the base solution, all non-negative solutions can be efficiently generated.

Parameters:
  • a – The first coefficient of the equation.
  • b – The second coefficient of the equation.
  • c – The constant of the equation.
Yields:

Each non-negative integer solution of the equation as a tuple (x, y).

Raises:

ValueError – If any of the coefficients is not a positive integer.