Counting anagrams¶

In [1]:
from itertools import permutations
import math
import unittest

In this notebook we address the problem of listing or counting anagrams of a word. Consider for example the word pot. By rearranging the letters we get six different strings: pot, opt, top, pto, otp and tpo. We will call all of these anagrams of pot, even though they are not real words. Similarly, the anagrams of boo are boo, obo and oob. Note that repeated letters reduce the number of anagrams: here we have only three instead of the usual six.

Below we define a function list_anagrams(word), which lists all the anagrams of word (using a rather inefficient algorithm, which will be slow if the word is long). We also define a function count_anagrams(word) which determines the number of anagrams without actually listing them. This is fast even for long words.

In [2]:
def list_anagrams(word):
    """
    Returns a list of (real or fake) anagrams of word.
    """
    word = word.lower()
    anagrams = [str.join('', w) for w in permutations(word)]
    anagrams = sorted(list(set(anagrams)))
    return anagrams

def count_anagrams(word):
    """
    Returns the number of anagrams of word.

    This function counts all anagrams; the anagrams do not have to be real words.
    """
    letter_counts = {}
    for c in word:
        if c in letter_counts:
            letter_counts[c] += 1
        else:
            letter_counts[c] = 1
    m = math.factorial(len(word))
    for key in letter_counts:
        m = m // math.factorial(letter_counts[key])
    return m
In [10]:
# This cell might take 10 seconds to run.
%timeit -n1 list_anagrams('beeblebrox')
%timeit -n1 count_anagrams('beeblebrox')
L = list_anagrams('beeblebrox')
n = count_anagrams('beeblebrox')
print(f"n={n}, len(L)={len(L)}")
614 ms ± 12.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
1.63 µs ± 858 ns per loop (mean ± std. dev. of 7 runs, 1 loop each)
n=100800, len(L)=100800
In [13]:
class AnagramTest(unittest.TestCase):
    def test_0_count_anagrams(self):
        """len(list_anagrams(w)) = count_anagrams(w) for some sample words."""
        # For some sample words, check that the length of the list returned by 
        # list_anagrams() is the same as the number returned by count_anagrams().
        for w in ['boot','twelve','aaa']:
            with self.subTest(w=w):
                self.assertEqual(count_anagrams(w), len(list_anagrams(w)))

    def test_1_member(self):
        """w is a member of list_anagrams(w) for some sample words."""
        # For some sample words, check that the word itself is contained in the 
        # list returned by list_anagrams().
        for w in ['boot','twelve','aaa']:
            with self.subTest(w=w):
                self.assertTrue(w in list_anagrams(w))

    def test_2_single_letter(self):
        """list_anagrams('aaaa') = ['aaaa']"""
        # In general, if a word consists of a single letter repeated n times, 
        # then there is only one anagram, namely the word itself.  We check 
        # this in the case of the word aaaa.
        self.assertEqual(list_anagrams('aaaa'), ['aaaa'])

    def test_3_two_letters(self):
        """len(list_anagrams('aabbbb')) = math.comb(6,2)"""
        # In general, if a word consists of a single letter repeated n times
        # followed by a different letter repeated m times, then the number of
        # anagrams is the binomial coefficient (n+m choose n).  We check 
        # this in the case of the word aabbbb.
        self.assertEqual(len(list_anagrams('aabbbb')), math.comb(6,2))

    def test_4_distinct_letters(self):
        """len(list_anagrams('abcdef')) = 6!"""
        # In general, if a word consists of a n distinct letters with no 
        # repetitions, then the number of anagrams is n!.  We check 
        # this in the case of the word abcdef.
        self.assertEqual(len(list_anagrams('abcdef')), math.factorial(6))
        
In [14]:
unittest.main(argv=[''], verbosity=2, exit=False)
test_0_count_anagrams (__main__.AnagramTest.test_0_count_anagrams)
len(list_anagrams(w)) = count_anagrams(w) for some sample words. ... ok
test_1_member (__main__.AnagramTest.test_1_member)
w is a member of list_anagrams(w) for some sample words. ... ok
test_2_single_letter (__main__.AnagramTest.test_2_single_letter)
list_anagrams('aaaa') = ['aaaa'] ... ok
test_3_two_letters (__main__.AnagramTest.test_3_two_letters)
len(list_anagrams('aabbbb')) = math.comb(6,2) ... ok
test_4_distinct_letters (__main__.AnagramTest.test_4_distinct_letters)
len(list_anagrams('abcdef')) = 6! ... ok

----------------------------------------------------------------------
Ran 5 tests in 0.005s

OK
Out[14]:
<unittest.main.TestProgram at 0x280bc826110>
In [ ]: