MAS334 Combinatorics

1. Counting sets

We begin with a very simple point, which is easy to get slightly wrong (Google “off-by-one error”).

Definition 1.1.

In this course, we will rarely be using real numbers. We will therefore use interval notation to refer to intervals of integers:

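$$[n,m] = \{k \in \mathbb{Z} \mid n \le k \le m\} \qquad (n,m] = \{k \in \mathbb{Z} \mid n < k \le m\}$$
$$[n,m) = \{k \in \mathbb{Z} \mid n \le k < m\} \qquad (n,m) = \{k \in \mathbb{Z} \mid n < k < m\}.$$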

For example, we have

[3,7] ={3,4,5,6,7} (3,7] ={4,5,6,7}
[3,7) ={3,4,5,6} (3,7) ={4,5,6}.

The sizes of these sets are

|[n,m]| =m-n+1 |(n,m]| =m-n
|[n,m)| =m-n |(n,m)| =m-n-1.

(The first three of these are valid for $n \le m$, but the last is only valid for $n < m$.) For example, we have

|[3,7]| =5=7-3+1 |(3,7]| =4=7-3
|[3,7)| =4=7-3 |(3,7)| =3=7-3-1.

It is a common mistake to say that |[n,m]|=m-n or |(n,m)|=m-n, but the above examples show that this is not correct.
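The four size formulas can be checked directly by listing the integers in each interval. The following is a minimal sketch in Python; the helper name `interval` and its keyword arguments are illustrative choices, not notation from the course.

```python
def interval(n, m, left_closed=True, right_closed=True):
    """Return the integers in the interval from n to m as a list.

    left_closed / right_closed say whether the endpoints n and m
    are included, so interval(3, 7, False, True) models (3,7].
    """
    lo = n if left_closed else n + 1
    hi = m if right_closed else m - 1
    return list(range(lo, hi + 1))


n, m = 3, 7
print(len(interval(n, m)), m - n + 1)                # size of [n,m]
print(len(interval(n, m, False, True)), m - n)       # size of (n,m]
print(len(interval(n, m, True, False)), m - n)       # size of [n,m)
print(len(interval(n, m, False, False)), m - n - 1)  # size of (n,m)
```

Each printed pair should agree, which is a quick way to catch an off-by-one slip before relying on a formula.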

Remark 1.2.

Here is a related observation: if we have a fence consisting of n sections supported by fenceposts, then the number of posts is one more than the number of sections. Each section has a post at the right hand end, and there is one more post at the left hand end of the whole fence. (Google “fencepost error”.)

Definition 1.3.

A binary sequence of length $n$ is a sequence $a = (a_1, \dots, a_n)$ with $a_i \in \{0,1\}$. We write $B_n$ for the set of binary sequences of length $n$. We also write $B_{nk}$ for the subset of binary sequences of length $n$ in which there are $k$ ones.

Example 1.4.
  • The sequence $a = (0,1,0,1,1,0)$ is a binary sequence of length 6, so $a \in B_6$. We will typically use abbreviated notation and write $a = 010110$ instead of $a = (0,1,0,1,1,0)$. As there are 3 ones in $a$, we can also say that $a \in B_{63}$.

  • The full list of elements of $B_3$ is

    $$B_3 = \{000, 001, 010, 011, 100, 101, 110, 111\}.$$

    (We have written these in dictionary order, which is good practice. It is much easier to deal with this kind of construction if we list things in a systematic and consistent order.)

  • The full list of elements of $B_{53}$ is

    $$B_{53} = \{00111, 01011, 01101, 01110, 10011, 10101, 10110, 11001, 11010, 11100\}.$$

In the above example we saw that $|B_3| = 8 = 2^3$. Of course, this can be generalised.

Proposition 1.5.

$|B_n| = 2^n$.

Proof.

To choose an element $a = (a_1, \dots, a_n) \in B_n$ we have 2 choices for $a_1$, 2 choices for $a_2$ and so on, making $2 \times 2 \times \dots \times 2 = 2^n$ choices for the sequence as a whole. ∎
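The counting argument can be mirrored by generating the sequences, making one two-way choice per position. This is a small sketch using `itertools.product` from the Python standard library; the function names are hypothetical helpers introduced here for illustration.

```python
from itertools import product


def binary_sequences(n):
    """All binary sequences of length n, as strings, in dictionary order."""
    return ["".join(bits) for bits in product("01", repeat=n)]


def binary_sequences_with_ones(n, k):
    """The sequences of length n that contain exactly k ones."""
    return [s for s in binary_sequences(n) if s.count("1") == k]


for n in range(6):
    # One two-way choice per position gives 2**n sequences in total.
    print(n, len(binary_sequences(n)), 2 ** n)

print(binary_sequences(3))
print(binary_sequences_with_ones(5, 3))
```

The last two lines reproduce the lists in Example 1.4, in the same dictionary order.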

Definition 1.6.

Let $A$ be a finite set. We let $PA$ denote the set of all subsets of $A$. We also let $P_kA$ denote the set of all subsets of size $k$ in $A$.

Example 1.7.

Take $A = \{a,b,c\}$. Then

$$PA = \{\emptyset, \{a\}, \{b\}, \{c\}, \{a,b\}, \{a,c\}, \{b,c\}, \{a,b,c\}\}.$$

We also have

$$P_2A = \{\{a,b\}, \{a,c\}, \{b,c\}\}.$$

We might also use more abbreviated notation:

$$PA = \{\emptyset, a, b, c, ab, ac, bc, abc\}.$$

Proposition 1.8.

If $|A| = n$ then $|PA| = 2^n$.

Proof.

List the elements of $A$ as $x_1, \dots, x_n$. To choose a subset of $A$, we first choose whether to include $x_1$, then choose whether to include $x_2$ and so on. We have two choices for each $x_i$, and thus $2^n$ choices altogether.

Here is another way to say essentially the same thing. Given a binary sequence $a = (a_1, \dots, a_n)$, we define

$$U_a = \{x_i \mid a_i = 1\}.$$

For example, in the case $n = 6$, we have

$$U_{111000} = \{x_1, x_2, x_3\} \qquad U_{100011} = \{x_1, x_5, x_6\}.$$

(In the left hand example, we have ones in positions 1, 2 and 3, so the set is $\{x_1, x_2, x_3\}$. In the right hand example, we have ones in positions 1, 5 and 6, so the set is $\{x_1, x_5, x_6\}$.) This construction gives a one-to-one correspondence between subsets of $A$ and binary sequences, so $|PA| = |B_n| = 2^n$. ∎
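The correspondence between binary sequences and subsets is easy to try out. The sketch below assumes the elements of A are stored in a Python list in the order x1, ..., xn; the function name `subset_from_sequence` is an illustrative choice.

```python
from itertools import product


def subset_from_sequence(a, elements):
    """Return the subset U_a: keep elements[i] whenever entry i of a is '1'."""
    return frozenset(x for x, bit in zip(elements, a) if bit == "1")


elements = ["x1", "x2", "x3", "x4", "x5", "x6"]

print(sorted(subset_from_sequence("111000", elements)))
print(sorted(subset_from_sequence("100011", elements)))

# Distinct sequences give distinct subsets, so collecting the subsets
# in a set loses nothing: we get one subset per binary sequence.
all_subsets = {subset_from_sequence(a, elements) for a in product("01", repeat=6)}
print(len(all_subsets), 2 ** 6)
```

Using `frozenset` makes each subset hashable, so the collection of all subsets can itself be a set and duplicates would be detected automatically.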

Definition 1.9.

For a finite set $A$, we define $F_kA$ to be the set of sequences $a = (a_1, \dots, a_k)$ such that the entries $a_i$ are distinct elements of $A$.

Example 1.10.

If A={a,b,c} then

$$F_2A = \{ab, ac, ba, bc, ca, cb\}$$
$$F_3A = \{abc, acb, bac, bca, cab, cba\}.$$

Proposition 1.11.

If $|A| = n$ and $0 \le k \le n$ then

$$|F_kA| = \prod_{i=0}^{k-1} (n-i) = n(n-1)\cdots(n-k+1) = \frac{n!}{(n-k)!}.$$

In particular, we have $|F_nA| = n!$, but $F_kA$ is empty for $k > n$.

Proof.

Suppose we want to choose a sequence $a = (a_1, \dots, a_k) \in F_kA$. Then $a_1$ can be any element of $A$, so there are $n$ choices. Then $a_2$ can be any element of $A$ other than $a_1$, so there are $n-1$ choices. Then $a_3$ can be any element other than $a_1$ and $a_2$, so there are $n-2$ choices, and so on. At the last stage, $a_k$ can be any element of $A$ except for $a_1, \dots, a_{k-1}$, so there are $n-(k-1) = n-k+1$ choices. Thus, the overall number of choices is

$$|F_kA| = n(n-1)\cdots(n-k+1) = \prod_{i=0}^{k-1} (n-i).$$

Note also that

$$\begin{aligned}
n! &= n \times (n-1) \times (n-2) \times \dots \times 2 \times 1 \\
   &= n \times (n-1) \times \dots \times (n-k+1) \times (n-k) \times (n-k-1) \times \dots \times 2 \times 1 \\
(n-k)! &= (n-k) \times (n-k-1) \times \dots \times 2 \times 1 \\
n!/(n-k)! &= n \times (n-1) \times \dots \times (n-k+1) = |F_kA|.
\end{aligned}$$

In particular, we have $|F_nA| = n!/0! = n!$. On the other hand, if $k > n$, it is clear that we cannot have a list of $k$ distinct elements in $A$, because $A$ has only $n$ elements; so $F_kA = \emptyset$. ∎
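As a check on the formula, the sequences of distinct elements can be listed with `itertools.permutations` and compared with the product n(n-1)...(n-k+1). This is only a sketch; `distinct_sequences` and `falling_product` are names made up for this illustration.

```python
from itertools import permutations
from math import factorial


def distinct_sequences(A, k):
    """All length-k sequences of distinct elements of A, written as strings."""
    return ["".join(p) for p in permutations(sorted(A), k)]


def falling_product(n, k):
    """The product n(n-1)...(n-k+1), with k factors (1 when k = 0)."""
    result = 1
    for i in range(k):
        result *= n - i
    return result


A = {"a", "b", "c"}
print(distinct_sequences(A, 2))
print(distinct_sequences(A, 3))

n = len(A)
for k in range(n + 1):
    print(k, len(distinct_sequences(A, k)), falling_product(n, k),
          factorial(n) // factorial(n - k))

# For k > n there are no such sequences, and the product contains a zero factor.
print(distinct_sequences(A, 4), falling_product(n, 4))
```

Note that the product form still gives the right answer (zero) for k > n, whereas the quotient of factorials is only meaningful for k at most n.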

Definition 1.12.

For integers $n, k$ with $0 \le k \le n$ we define

$$\binom{n}{k} = \frac{n!}{k!(n-k)!}.$$

For $k < 0$ or $k > n$ we define $\binom{n}{k} = 0$. In this course, we will consider $\binom{n}{k}$ to be undefined for $n < 0$.

Corollary 1.13.

If $|A| = n$, then $|P_kA| = \binom{n}{k}$.

Proof.

If $k > n$ it is clear that $P_kA$ is empty so $|P_kA| = 0 = \binom{n}{k}$. Suppose instead that $0 \le k \le n$. Every list $a = (a_1, \dots, a_k) \in F_kA$ gives a subset $U_a = \{a_1, \dots, a_k\} \in P_kA$, and every subset of size $k$ arises in this way. However, we can reorder the list $a$ in $k!$ different ways, and they all give the same subset. Thus, we have

$$|P_kA| = \frac{|F_kA|}{k!} = \frac{n!}{k!(n-k)!} = \binom{n}{k}.$$

Corollary 1.14.

We also have $|B_{nk}| = \binom{n}{k}$.

Proof.

To specify an element of $B_{nk}$, we just need to specify the $k$ positions in $[1,n]$ where the ones appear. There are $\binom{n}{k}$ subsets of size $k$ in $[1,n]$, so $|B_{nk}| = \binom{n}{k}$. ∎
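The proof suggests a direct way to build the binary sequences with k ones: choose the positions of the ones first. The following sketch does this with `itertools.combinations`; the function name is a hypothetical helper, and `math.comb` requires Python 3.8 or later.

```python
from itertools import combinations
from math import comb


def sequences_from_positions(n, k):
    """Build each sequence of length n with k ones from the set of
    positions (numbered 1..n) where the ones appear."""
    result = []
    for positions in combinations(range(1, n + 1), k):
        chosen = set(positions)
        result.append("".join("1" if i in chosen else "0" for i in range(1, n + 1)))
    return sorted(result)  # sort into dictionary order


print(sequences_from_positions(5, 3))
print(len(sequences_from_positions(5, 3)), comb(5, 3))
```

Each choice of positions gives exactly one sequence, so the number of sequences equals the number of k-element subsets of the positions.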

Problem 1.15.

Suppose that 6 people compete in an Olympic pie-eating competition. In how many ways can the medals be awarded? If the BBC decides to interview three of the finalists, chosen at random, in how many ways can they do that? What if there were 100 finalists?

Solution.

Let $A$ be the set of competitors, so $|A| = 6$. For the first question, we need an ordered list of three distinct medal winners (gold, then silver, then bronze), so the number of possibilities is $|F_3A| = 6 \times 5 \times 4 = 120$. In more detail, there are 6 choices for who gets the gold. When we have awarded the gold, there are 5 choices left for who gets silver, then 4 choices for who gets bronze. Thus, the total number of ways in which the medals can be awarded is $6 \times 5 \times 4 = 120$.

For the second question, we need an unordered set of three interviewees, so the number of possibilities is $|P_3A| = \binom{6}{3} = 20$. In more detail, there are 120 possible choices for the list of people who get interviewed, in the order in which they get interviewed. But we do not care about the order; we only care about the set of interviewees. So we need to divide by the number of possible orders, which is $3! = 6$. Thus, the number of ways to choose a set of three interviewees is $120/6 = 20$.

If there were 100 finalists, then the number of ways of awarding the medals would be $100 \times 99 \times 98 = 970200 \approx 9.7 \times 10^5$, and the number of ways of choosing the interviewees would be $(100 \times 99 \times 98)/6 = 161700 \approx 1.6 \times 10^5$.
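These numbers can be reproduced with the standard library functions `math.perm` (ordered selections) and `math.comb` (unordered selections), both available from Python 3.8. The wrapper names below are illustrative only.

```python
from math import comb, perm


def medal_awards(competitors, medals=3):
    """Number of ordered lists of distinct medal winners."""
    return perm(competitors, medals)


def interview_panels(competitors, interviewees=3):
    """Number of unordered sets of interviewees."""
    return comb(competitors, interviewees)


for n in (6, 100):
    print(n, medal_awards(n), interview_panels(n))
```

The only difference between the two questions is whether order matters, which is exactly the difference between `perm` and `comb`.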

Problem 1.16.

In the National Lottery, six balls are drawn from a set of 59 balls. How many possible outcomes are there?

Solution.

We need to count the subsets of size 6 in a set of size 59; the answer is

$$\binom{59}{6} = \frac{59 \times 58 \times 57 \times 56 \times 55 \times 54}{6!} = 45057474 \approx 4.5 \times 10^7.$$

The other familiar place where we see binomial coefficients is in the binomial expansion formula:

$$(1+x)^n = \sum_{k=0}^{n} \binom{n}{k} x^k.$$

We next recall how this works.

Example 1.17.

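Consider the case $n = 4$. We have

$$\begin{aligned}
(1+x)^4 ={}& (1+x)(1+x)(1+x)(1+x) \\
={}& 1111 + 111x + 11x1 + 11xx + 1x11 + 1x1x + 1xx1 + 1xxx + {} \\
   & x111 + x11x + x1x1 + x1xx + xx11 + xx1x + xxx1 + xxxx \\
={}& 1111 + {} \\
   & 111x + 11x1 + 1x11 + x111 + {} \\
   & 11xx + 1x1x + 1xx1 + x11x + x1x1 + xx11 + {} \\
   & 1xxx + x1xx + xx1x + xxx1 + {} \\
   & xxxx \\
={}& 1 + 4x + 6x^2 + 4x^3 + x^4 = \sum_{k=0}^{4} \binom{4}{k} x^k.
\end{aligned}$$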

In the first step we have just expanded everything out in the obvious way, writing the terms in dictionary order. Each term is a product of four factors, each of which is either $1$ or $x$. To generate all the terms, we have to make 4 choices of whether to have a $1$ or an $x$, giving $2^4 = 16$ terms altogether. In the second step, we just regroup the terms according to how many $x$'s appear. There is one term with no $x$'s, 4 terms with one $x$, 6 terms with two $x$'s, 4 terms with three $x$'s and one term with four $x$'s. In general, to generate a term with $k$ $x$'s, we just need to choose $k$ slots from 4 in which the $x$'s appear, and put ones in the other slots. Thus, there are $\binom{4}{k}$ terms with $k$ $x$'s, and each of these contributes $x^k$ to the expansion. Thus, we have $(1+x)^4 = \sum_k \binom{4}{k} x^k$.

As an exercise in notation, we can write this slightly differently. Let $A$ be a subset of $\{1, \dots, n\}$. Let $t_A$ be the term in the expansion where we take $x$ from the factors corresponding to $i \in A$, and $1$ from the factors corresponding to $i \notin A$. The number of $x$'s is then equal to $|A|$, so the product is $x^{|A|}$. We get a term for every possible subset $A \subseteq \{1, \dots, n\}$, so we get

$$(1+x)^n = \sum_A t_A = \sum_A x^{|A|}.$$

The number of $x^k$'s in this sum is the number of subsets $A$ such that $|A| = k$, or in other words $\binom{n}{k}$. We therefore have $(1+x)^n = \sum_k \binom{n}{k} x^k$ as before.
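The expansion argument can be replayed mechanically: list every way of picking 1 or x from each of the n factors, then group the terms by the number of x's. The sketch below does this by brute force; `expand_one_plus_x` is a name invented for this illustration.

```python
from itertools import product
from math import comb


def expand_one_plus_x(n):
    """Coefficients [c_0, ..., c_n] of (1+x)^n, found by listing all
    2^n terms and counting how many contain exactly k copies of x."""
    coefficients = [0] * (n + 1)
    for term in product("1x", repeat=n):  # one choice per factor
        coefficients[term.count("x")] += 1
    return coefficients


print(expand_one_plus_x(4))
print([comb(4, k) for k in range(5)])
```

The two printed lists should coincide, since grouping terms by the number of x's is the same as grouping subsets by size.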

Proposition 1.18.

For any $n \ge 0$, we have $1 + 2 + \dots + n = \binom{n+1}{2}$.

Proof.

Put $S_n = 1 + 2 + \dots + (n-1) + n$. We can rewrite this with the terms in reverse order as $S_n = n + (n-1) + \dots + 2 + 1$. Adding these two equations together, we get

$$2S_n = (1+n) + (2+(n-1)) + \dots + ((n-1)+2) + (n+1).$$

The right hand side consists of $n$ terms, each of which is equal to $n+1$, so the total is $(n+1)n$. It follows that $S_n = (n+1)n/2 = \binom{n+1}{2}$ as claimed. This proof can be illustrated by a diagram of $(n+1)n$ dots divided by a diagonal line: there are $S_n$ red dots above the diagonal line and $S_n$ blue dots below it, showing that $2S_n = (n+1)n$.

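Alternatively, we can give a proof by induction. For $n = 0$ the claim is that $0 = \binom{1}{2}$, which is clear. For $n = 1$ the claim is that $1 = \binom{2}{2}$, which is also clear. For $n > 1$, we can assume as an induction hypothesis that

$$S_{n-1} = 1 + 2 + \dots + (n-1) = \binom{n}{2} = n(n-1)/2 = \tfrac{1}{2}n^2 - \tfrac{1}{2}n.$$

Adding $n$ to both sides, we get

$$S_n = (1 + 2 + \dots + (n-1)) + n = \tfrac{1}{2}n^2 - \tfrac{1}{2}n + n = \tfrac{1}{2}(n^2 + n) = \binom{n+1}{2},$$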

as required. ∎
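A quick numerical check of Proposition 1.18, including the edge case n = 0, can be written as follows. This is a sketch only; `triangular` is an illustrative name.

```python
from math import comb


def triangular(n):
    """The sum 1 + 2 + ... + n (an empty sum, equal to 0, when n = 0)."""
    return sum(range(1, n + 1))


for n in range(8):
    # Compare the direct sum, the closed form (n+1)n/2, and the binomial coefficient.
    print(n, triangular(n), (n + 1) * n // 2, comb(n + 1, 2))
```

The three columns after n should agree in every row. Note that `comb(1, 2)` returns 0, matching the convention that the binomial coefficient is 0 when k > n.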

Proposition 1.19.

For $n, k$ with $(n,k) \ne (0,0)$ we have $\binom{n}{k} = \binom{n-1}{k} + \binom{n-1}{k-1}$.

Note that we have explicitly excluded the case n=k=0. If k=0 and n>0 then the claim is that 1=1+0 which is true. If k>n then the claim is that 0=0+0 which is true. If k=n>0 then the claim is that 1=0+1 which is true. This just leaves the interesting case where 0<k<n. We will give two different proofs for this case.

Bijective proof.

The binomial coefficient $\binom{n}{k}$ is the number of subsets $A \subseteq [1,n]$ with $|A| = k$. To choose such a subset, we first decide whether we want $n$ to be an element of $A$. If we decide that $n$ should not be an element of $A$, then we just choose $A$ to be a subset of size $k$ in $[1,n-1]$, and there are $\binom{n-1}{k}$ possibilities for this. If we decide that we do want $n$ to be an element of $A$, then we need to choose a further $k-1$ elements from $[1,n-1]$ to make up the rest of $A$, and there are $\binom{n-1}{k-1}$ possibilities for this. Thus, we have $\binom{n-1}{k} + \binom{n-1}{k-1}$ possibilities for $A$, and this must agree with the number $\binom{n}{k}$ that we obtained more directly. ∎

Algebraic proof.

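Recall that $\binom{n}{k} = \frac{n!}{k!(n-k)!}$. On the top we have

$$n! = n \times (n-1) \times (n-2) \times \dots \times 2 \times 1 = n \times ((n-1)!).$$

We can also write the $n$ here as $(n-k)+k$, giving $n! = (n-k) \times (n-1)! + k \times (n-1)!$. By substituting this into the definition of $\binom{n}{k}$, we get

$$\binom{n}{k} = \frac{(n-k)(n-1)!}{k!(n-k)!} + \frac{k(n-1)!}{k!(n-k)!}.$$

In the first term, we can rewrite $(n-k)!$ as $(n-k)(n-k-1)!$, and in the second term, we can rewrite $k!$ as $k \times (k-1)!$. (These are valid because we are assuming that $0 < k < n$, so $k, n-k > 0$.) This gives

$$\begin{aligned}
\binom{n}{k} &= \frac{(n-k)(n-1)!}{k!\,(n-k)(n-k-1)!} + \frac{k(n-1)!}{k(k-1)!\,(n-k)!} \\
&= \frac{(n-1)!}{k!(n-1-k)!} + \frac{(n-1)!}{(k-1)!(n-k)!} = \binom{n-1}{k} + \binom{n-1}{k-1}.
\end{aligned}$$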
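The recurrence in Proposition 1.19 is the rule that builds Pascal's triangle row by row. The sketch below implements the binomial coefficient with the conventions of Definition 1.12 and checks the recurrence against it; `binom` and `next_pascal_row` are names chosen here for illustration.

```python
from math import factorial


def binom(n, k):
    """Binomial coefficient with the conventions of Definition 1.12:
    zero for k < 0 or k > n, and undefined (an error here) for n < 0."""
    if n < 0:
        raise ValueError("binom(n, k) is left undefined for n < 0")
    if k < 0 or k > n:
        return 0
    return factorial(n) // (factorial(k) * factorial(n - k))


def next_pascal_row(row):
    """Given the row of coefficients for n-1, produce the row for n by
    adding each entry to its left neighbour (missing neighbours count as 0)."""
    padded = [0] + row + [0]
    return [padded[i] + padded[i + 1] for i in range(len(padded) - 1)]


row = [1]  # the row for n = 0
for n in range(1, 7):
    row = next_pascal_row(row)
    print(n, row, row == [binom(n, k) for k in range(n + 1)])
```

Padding with zeros at both ends corresponds to the boundary cases k = 0 and k = n discussed before the proofs.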

Proposition 1.20.

For $0 \le k \le n$ we have $\binom{n}{k} = \binom{n}{n-k}$.

Bijective proof.

The binomial coefficient $\binom{n}{k}$ is the number of subsets $A \subseteq [1,n]$ of size $k$. To choose such a subset, we can just choose a subset $B \subseteq [1,n]$ of size $n-k$, and take $A = B^c$, the complement of $B$ in $[1,n]$. This gives a one-to-one correspondence between subsets of size $k$ and subsets of size $n-k$, so $\binom{n}{k} = \binom{n}{n-k}$. ∎

Algebraic proof.

$$\binom{n}{n-k} = \frac{n!}{(n-k)!(n-(n-k))!} = \frac{n!}{(n-k)!k!} = \binom{n}{k}.$$

Definition 1.21.

A subset $A \subseteq [1,n]$ is gappy if there are no adjacent elements. In more detail, the condition is that there should not exist $a \in [1,n-1]$ such that $a \in A$ and $a+1 \in A$. Similarly, we say that a binary sequence is gappy if it has no adjacent ones. We write $G_{nk}$ for the set of gappy subsets $A \subseteq [1,n]$ with $|A| = k$.

Example 1.22.

The set $A = \{1,5,7,11\} \subseteq [1,20]$ is gappy. The set $B = \{1,5,6,11\}$ is not gappy, because it contains the adjacent elements 5 and 6. The full list of elements of $G_{73}$ is

$$G_{73} = \{135, 136, 137, 146, 147, 157, 246, 247, 257, 357\},$$

so $|G_{73}| = 10$.

Proposition 1.23.

If $n \ge 2k-1$ then $|G_{nk}| = \binom{n-k+1}{k}$, but if $n < 2k-1$ then $G_{nk} = \emptyset$ so $|G_{nk}| = 0$.

Proof.

We have discussed before that subsets of $[1,n]$ correspond to binary sequences, and it is clear that gappy subsets correspond to gappy sequences, so we will work with binary sequences from now on. We will also assume that $k > 0$, leaving the trivial case $k = 0$ to the reader. Suppose that we have a gappy sequence $a \in G_{nk}$. The first one in $a$ might appear in the very first position, so it need not be preceded by a zero. However, there are $k-1$ more ones, and by the gappy condition, each of them must have a zero immediately before it. The ones and these adjacent zeros take $2k-1$ slots altogether, so we must have $n \ge 2k-1$. This shows that $G_{nk} = \emptyset$ if $n < 2k-1$; we will assume that $n \ge 2k-1$ from now on. If we delete these $k-1$ zeros, we get a binary sequence of length $n-k+1$ containing $k$ ones, or in other words, an element of $B_{n-k+1,k}$. On the other hand, if we are given an element of $B_{n-k+1,k}$, then we can get an element of $G_{nk}$ by inserting zeros to the left of all the ones, except for the first one. Thus, we have a one-to-one correspondence between $G_{nk}$ and $B_{n-k+1,k}$, showing that $|G_{nk}| = |B_{n-k+1,k}| = \binom{n-k+1}{k}$.

Here are some examples of elements of $G_{73}$, and the corresponding elements of $B_{53}$:

    $B_{53}$      $G_{73}$
    10101      1001001
    01110      0101010
    11001      1010001
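The two directions of the correspondence in the proof can be written as a pair of string operations. The following is a sketch under the assumption that binary sequences are stored as strings of '0' and '1'; the function names are hypothetical.

```python
from itertools import product
from math import comb


def insert_zeros(b):
    """Map a binary sequence to a gappy one by inserting a zero
    immediately before every one except the first."""
    out, seen_one = [], False
    for bit in b:
        if bit == "1":
            if seen_one:
                out.append("0")
            seen_one = True
        out.append(bit)
    return "".join(out)


def delete_zeros(g):
    """Inverse map on gappy sequences: delete the zero that sits
    immediately before every one except the first."""
    out, seen_one = [], False
    for bit in g:
        if bit == "1":
            if seen_one:
                out.pop()  # the previous entry is a zero, by gappiness
            seen_one = True
        out.append(bit)
    return "".join(out)


def gappy_sequences(n, k):
    """Brute-force list of length-n sequences with k ones and no two adjacent."""
    candidates = ("".join(bits) for bits in product("01", repeat=n))
    return [s for s in candidates if s.count("1") == k and "11" not in s]


for b in ("10101", "01110", "11001"):
    g = insert_zeros(b)
    print(b, g, delete_zeros(g) == b)

print(len(gappy_sequences(7, 3)), comb(7 - 3 + 1, 3))
```

The brute-force count at the end is independent of the bijection, so agreement with the binomial coefficient is a genuine check of the proposition in this case.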

Problem 1.24.

Suppose that a doctor’s surgery has a single row of 12 chairs, and there are 5 patients waiting. In how many ways can they be seated such that no two are next to each other?

Solution.

The number is $|G_{12,5}| = \binom{12-5+1}{5} = \binom{8}{5} = 56$.

Problem 1.25.

In a draw for the National Lottery (as in Problem 1.16), what is the probability that there is an adjacent pair of numbers?

Solution.

We are selecting an element of $P_6[1,59]$ at random, and we want to know the probability that it is not gappy. The total size of $P_6[1,59]$ is $\binom{59}{6} \approx 4.5 \times 10^7$. The number of gappy sets is

$$|G_{59,6}| = \binom{59-6+1}{6} = \binom{54}{6} = 25827165 \approx 2.6 \times 10^7.$$

Thus, the number of non-gappy sets is $\binom{59}{6} - \binom{54}{6} \approx 1.9 \times 10^7$, and the proportion of non-gappy sets is

$$\frac{\binom{59}{6} - \binom{54}{6}}{\binom{59}{6}} = \frac{19230309}{45057474} \approx 0.43.$$

Thus, approximately 43% of draws will have an adjacent pair of numbers.
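The calculation can be carried out exactly rather than with rounded intermediate values. The sketch below uses the gappy-set formula from Proposition 1.23, and also includes a brute-force version that is only practical for small cases; both function names are illustrative.

```python
from itertools import combinations
from math import comb


def adjacent_pair_probability(n, k):
    """Probability that a random k-subset of [1,n] contains two adjacent
    numbers, using the count of gappy subsets from Proposition 1.23."""
    total = comb(n, k)
    gappy = comb(n - k + 1, k)
    return (total - gappy) / total


def adjacent_pair_probability_brute_force(n, k):
    """The same probability by direct enumeration (small n only)."""
    draws = list(combinations(range(1, n + 1), k))
    with_pair = sum(
        1 for draw in draws
        if any(b - a == 1 for a, b in zip(draw, draw[1:]))
    )
    return with_pair / len(draws)


print(comb(59, 6), comb(54, 6))
print(adjacent_pair_probability(59, 6))
print(adjacent_pair_probability(10, 3), adjacent_pair_probability_brute_force(10, 3))
```

Enumerating all draws for the real lottery would mean looping over tens of millions of subsets, which is why the formula is used there and the enumeration is kept for a small sanity check.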