Broadcasting arrays
This notebook covers roughly the same ground as this video. There is also an interactive demonstration to help you understand some of the details.
import numpy as np
Remember that matrix multiplication is represented by a @ b in Python (and this is only meaningful if the number of columns of a is the same as the number of rows of b). We can also enter a * b, but that means something different: elementwise multiplication. If a and b have the same shape, then a * b will also have that shape, and each entry in a * b will just be the product of an entry in a with the corresponding entry in b.
a = np.array([1,2,3])
b = np.array([10,100,1000])
print(f'a = {a} b = {b} a * b = {a * b} a + b = {a + b}')
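To make the difference between a * b and a @ b concrete, here is a small sketch with two $2\times 2$ matrices. The names P and Q and their entries are chosen purely for illustration.

```python
import numpy as np

# Illustrative 2x2 matrices (not taken from the text above).
P = np.array([[1, 2],
              [3, 4]])
Q = np.array([[0, 1],
              [1, 0]])

elementwise = P * Q   # multiplies corresponding entries
matrix_prod = P @ Q   # rows of P combined with columns of Q

print(f'P * Q = \n{elementwise}\n')
print(f'P @ Q = \n{matrix_prod}\n')
```

Here P * Q keeps only the off-diagonal entries of P, whereas P @ Q swaps the two columns of P, so the two operations give different answers even though the shapes match.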
Now suppose we enter a * b in a case where $a$ and $b$ have different shapes. If the shapes are compatible, numpy will expand $a$ and/or $b$ to make them have the same shape, and then multiply them elementwise. For example, suppose that $a=\left[\begin{array}{cc}3 & 4\end{array}\right]$ and $b=\left[\begin{array}{c}10 \\ 100\end{array}\right]$. Then numpy will expand $a$ vertically to make the $2\times 2$ matrix $A=\left[\begin{array}{cc}3 & 4\\ 3&4\end{array}\right]$ (in which both rows are equal to the original $a$). It will also expand $b$ horizontally to make the $2\times 2$ matrix $B=\left[\begin{array}{cc}10 & 10\\ 100&100\end{array}\right]$ (in which both columns are equal to the original $b$). It will then multiply corresponding elements in $A$ and $B$ to make the matrix $a*b=A*B=\left[\begin{array}{cc}30 & 40\\ 300&400\end{array}\right]$. This kind of expansion is called broadcasting.
a = np.array([[3,4]]) # Row vector
b = np.array([[10],[100]]) # Column vector
print(f'a = {a}\n')
print(f'b = \n{b}\n')
print(f'a * b = \n{a * b}\n')
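The expansion described above can be made visible with np.broadcast_to, which builds the expanded matrices explicitly. This is only a sketch for checking the reasoning: when evaluating a * b, numpy does not need you to construct $A$ and $B$ yourself.

```python
import numpy as np

a = np.array([[3, 4]])        # shape (1, 2), row vector
b = np.array([[10], [100]])   # shape (2, 1), column vector

# Expand both operands explicitly to the common shape (2, 2).
A = np.broadcast_to(a, (2, 2))   # every row is a copy of a
B = np.broadcast_to(b, (2, 2))   # every column is a copy of b

product = a * b                  # numpy broadcasts implicitly here

print(f'A = \n{A}\n')
print(f'B = \n{B}\n')
print(f'A * B equals a * b: {np.array_equal(A * B, product)}')
```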
As another example, suppose we want to make a matrix $M$ with $M_{pq}=p+q\sqrt{-1}$ (elsewhere we will see examples where this is useful). We first make a column vector $u$ with $u_p=p$, and a row vector $v$ with $v_q=q\sqrt{-1}$. We then enter u + v. As $u$ and $v$ have different shapes, it is not possible to add them directly. Instead, numpy broadcasts $u$ to make a matrix $U_{pq}=u_p=p$, and it broadcasts $v$ to make a matrix $V$ with $V_{pq}=v_q=q\sqrt{-1}$. The matrix $M=U+V$ then has $M_{pq}=U_{pq}+V_{pq}=u_p+v_q=p+q\sqrt{-1}$ as required.
u = np.arange(3).reshape((3,1)) # Column vector with u[p,0] = p
v = np.arange(4).reshape((1,4)) * 1j # Row vector with v[0,q] = q * sqrt(-1)
M = u + v
print(f'M = \n{M}\n')
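As a sanity check on the formula $M_{pq}=p+q\sqrt{-1}$, the sketch below builds the same matrix a second time with an explicit double loop and compares the two. The sizes n_rows and n_cols and the name M_loop are assumptions made for this illustration.

```python
import numpy as np

n_rows, n_cols = 3, 4  # illustrative sizes, not fixed by the text

u = np.arange(n_rows).reshape((n_rows, 1))        # column vector, u_p = p
v = np.arange(n_cols).reshape((1, n_cols)) * 1j   # row vector, v_q = q * sqrt(-1)
M = u + v                                         # broadcast to shape (n_rows, n_cols)

# Build the same matrix entry by entry, without broadcasting.
M_loop = np.zeros((n_rows, n_cols), dtype=complex)
for p in range(n_rows):
    for q in range(n_cols):
        M_loop[p, q] = p + q * 1j

print(f'M has shape {M.shape}')
print(f'Broadcast and loop versions agree: {np.array_equal(M, M_loop)}')
```

The broadcast version says the same thing as the loop in a single line, which is the main practical attraction of the technique.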
As a third example, consider a $3\times 3$ matrix $C$ and a vector $d$ of length $3$. If we evaluate C * d then d will be broadcast to make a $3\times 3$ matrix $D$ in which every row is equal to $d$, so C * d will be the same as C * D. The upshot is that the three columns of $C$ get multiplied by $d_0$, $d_1$ and $d_2$ respectively.
C = np.arange(9).reshape((3,3))
d = np.array([1,11,111])
print(C * d)
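If we want to scale the rows of $C$ rather than the columns, one possible approach is to reshape $d$ into a column vector first, so that it is broadcast horizontally instead of vertically. The sketch below compares the two; the names by_columns and by_rows are introduced here for illustration.

```python
import numpy as np

C = np.arange(9).reshape((3, 3))
d = np.array([1, 11, 111])

by_columns = C * d                   # d acts as a row: column j is multiplied by d[j]
by_rows = C * d.reshape((3, 1))      # d as a column: row i is multiplied by d[i]

print(f'C * d = \n{by_columns}\n')
print(f'C * d.reshape((3,1)) = \n{by_rows}\n')
```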
Many of the applications of broadcasting are covered by the patterns discussed above, but there are also other possibilities. The full story can be found at numpy.org.
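As a brief summary of the general rule: numpy compares the two shapes from the last axis backwards, and two axis lengths are compatible if they are equal or if one of them is 1. The sketch below checks this for the shapes used earlier and shows a case that fails. It assumes a version of numpy recent enough to provide np.broadcast_shapes (1.20 or later).

```python
import numpy as np

# Shapes from the examples above: each pair is compatible.
# np.broadcast_shapes is assumed to be available (numpy 1.20 or later).
row_col = np.broadcast_shapes((1, 2), (2, 1))   # row vector with column vector
mat_vec = np.broadcast_shapes((3, 3), (3,))     # 3x3 matrix with length-3 vector
print(row_col, mat_vec)

# Lengths 3 and 4 differ and neither is 1, so broadcasting fails.
x = np.ones(3)
y = np.ones(4)
try:
    x * y
    failed = False
except ValueError as err:
    failed = True
    print(f'Broadcasting failed: {err}')
```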