Counting anagrams¶

In [3]:
from itertools import permutations
import unittest
from math import factorial

In this notebook we address the problem of listing or counting anagrams of a word. Consider for example the word pot. By rearranging the letters we get six different strings: pot, opt, top, pto, otp and tpo. We will call all of these anagrams of pot, even though they are not real words. Similarly, the anagrams of boo are boo, obo and oob. Note that repeated letters reduce the number of anagrams: here we have only three instead of the usual six.

Below we define a function list_anagrams(word), which lists all the anagrams of word (using a rather inefficient algorithm, which will be slow if the word is long). We also define a function count_anagrams(word) which determines the number of anagrams without actually listing them. This is fast even for long words.

In [1]:
def list_anagrams(word):
    """
    Returns a list of (real or fake) anagrams of word.
    """
    word = word.lower()
    anagrams = [str.join('', w) for w in permutations(word)]
    anagrams = sorted(list(set(anagrams)))
    return anagrams

def count_anagrams(word):
    """
    Returns the number of anagrams of word.

    This function counts all anagrams; the anagrams do not have to be real words.
    """
    letter_counts = {}
    for c in word:
        if c in letter_counts:
            letter_counts[c] += 1
        else:
            letter_counts[c] = 1
    m = factorial(len(word))
    for key in letter_counts:
        m = m // factorial(letter_counts[key])
    return m
In [12]:
# This cell might take 10 seconds to run.
L = []
n = 0
%timeit -n1 L = list_anagrams('beeblebrox')
%timeit -n1 n = count_anagrams('beeblebrox')
print(f"n={n}, len(L)={len(L)}")
1.08 s ± 46.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
2.77 µs ± 767 ns per loop (mean ± std. dev. of 7 runs, 1 loop each)
n=100800, len(L)=100800
In [ ]: