
Broadcasting

Broadcasting is how NumPy lets you do math between arrays of different shapes. Once you understand it, you can replace many explicit loops with a single expression.

The simplest case — scalar + array

import numpy as np

a = np.array([1, 2, 3, 4])
print(a + 5)        # [6 7 8 9]

The scalar 5 is broadcast to the same shape as a.

Adding a 1D row to every row of a 2D array

import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])
row = np.array([10, 20, 30])

print(matrix + row)

Result:

[[11 22 33]
 [14 25 36]
 [17 28 39]]

NumPy stretched row from shape (3,) to (3, 3) — adding it to every row.
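
To see the stretched array explicitly, np.broadcast_to can build it. This is a minimal sketch that reuses matrix and row from above; the names stretched and same are only for illustration.

```python
import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])
row = np.array([10, 20, 30])

# Build a (3, 3) view of row, the shape broadcasting uses implicitly
stretched = np.broadcast_to(row, matrix.shape)
print(stretched.shape)      # (3, 3)
print(stretched)

# Adding the explicit version gives the same result as matrix + row
same = np.array_equal(matrix + stretched, matrix + row)
print(same)                 # True
```

In everyday code you would just write matrix + row; the explicit view is only useful for checking your mental model.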

Adding a column

To add a different value to each row (a column vector), reshape first:

import numpy as np

matrix = np.array([
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
])
col = np.array([100, 200, 300])

# Without reshaping, col has shape (3,) and lines up with the 3 columns,
# so col[j] is added to column j. Not what we want here.
print(matrix + col)

# Reshape to a column
col2 = col.reshape(-1, 1)        # shape (3, 1)
print(matrix + col2)

Output of matrix + col2:

[[101 102 103]
 [204 205 206]
 [307 308 309]]
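
There are several equivalent spellings for turning a 1D array into a column. A short sketch, with a, b, and c as illustrative names:

```python
import numpy as np

col = np.array([100, 200, 300])

# Three equivalent ways to turn shape (3,) into shape (3, 1)
a = col.reshape(-1, 1)
b = col[:, None]
c = col[:, np.newaxis]      # np.newaxis is an alias for None

print(a.shape, b.shape, c.shape)   # (3, 1) (3, 1) (3, 1)
all_equal = np.array_equal(a, b) and np.array_equal(b, c)
print(all_equal)                   # True
```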

The rules of broadcasting

NumPy compares shapes right-to-left. Two dimensions are compatible if:

  1. They are equal, OR
  2. One of them is 1.

If they're incompatible → ValueError.

shape A      shape B    Compatible?   Result shape
(3, 4)       (4,)       yes           (3, 4)
(3, 4)       (3, 1)     yes           (3, 4)
(3, 4)       (1, 4)     yes           (3, 4)
(3, 4)       (3, 4)     yes           (3, 4)
(3, 4)       (3, 2)     NO            error
(3, 1, 5)    (4, 5)     yes           (3, 4, 5)

Walk-through for (3, 4) + (4,):

A: (3, 4)
B:    (4,)        ← align right

   Last axis: 4 vs 4 → equal, compatible.
   First axis: A has 3, B has none → B's is treated as 1 and stretched to 3.
   Result shape: (3, 4)
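
The rules above can be sketched as a few lines of Python. The helper broadcast_shape below is hypothetical, written here only to illustrate the right-to-left comparison; it is not part of NumPy.

```python
import numpy as np

def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: apply the two rules by hand and return the result shape."""
    result = []
    # Walk both shapes from the right; a missing axis counts as size 1
    for i in range(1, max(len(shape_a), len(shape_b)) + 1):
        a = shape_a[-i] if i <= len(shape_a) else 1
        b = shape_b[-i] if i <= len(shape_b) else 1
        if a == b or b == 1:
            result.append(a)
        elif a == 1:
            result.append(b)
        else:
            raise ValueError(f"incompatible dimensions: {a} vs {b}")
    return tuple(reversed(result))

print(broadcast_shape((3, 4), (4,)))        # (3, 4)
print(broadcast_shape((3, 1, 5), (4, 5)))   # (3, 4, 5)

# NumPy's own answer, for comparison
numpy_shape = (np.empty((3, 1, 5)) + np.empty((4, 5))).shape
print(numpy_shape)                          # (3, 4, 5)
```

Calling broadcast_shape((3, 4), (3, 2)) raises ValueError, matching the incompatible row in the table.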

A practical example — feature scaling

For ML, you often need to subtract the mean of each column and divide by the std:

import numpy as np

# 5 samples, 3 features
data = np.array([
    [1.0, 2.0, 100.0],
    [2.0, 3.0, 200.0],
    [3.0, 4.0, 300.0],
    [4.0, 5.0, 400.0],
    [5.0, 6.0, 500.0],
])

# Per-column statistics — shape (3,)
mean = data.mean(axis=0)
std  = data.std(axis=0)

print("mean:", mean)
print("std :", std)

# Broadcasting: (5,3) - (3,) → (5,3)
scaled = (data - mean) / std
print("\nscaled:")
print(scaled)
print("scaled.mean(axis=0):", scaled.mean(axis=0).round(4))
print("scaled.std(axis=0) :", scaled.std(axis=0).round(4))

Each column now has mean ≈ 0 and std ≈ 1. That's a one-liner thanks to broadcasting.

Outer product — every combination

A column of shape (N, 1) times a row of shape (1, M) gives a 2D table of all N × M products:

import numpy as np

x = np.arange(1, 6)          # shape (5,)
y = np.arange(1, 4)          # shape (3,)

# Reshape to column × row
table = x[:, None] * y[None, :]
print(table)
print("shape:", table.shape)   # (5, 3)

x[:, None] is shape (5, 1); y[None, :] is (1, 3). Broadcasting expands both → (5, 3).
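
As a cross-check, NumPy also ships np.outer, which computes the same table for 1D inputs. A small sketch comparing the two (the names outer and short are illustrative):

```python
import numpy as np

x = np.arange(1, 6)          # shape (5,)
y = np.arange(1, 4)          # shape (3,)

table = x[:, None] * y[None, :]
outer = np.outer(x, y)

# y[None, :] is optional: a plain (3,) array already aligns with the last axis
short = x[:, None] * y

print(np.array_equal(table, outer))   # True
print(np.array_equal(table, short))   # True
```

The broadcasting form generalizes to other operators (for example x[:, None] + y), which np.outer does not.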

Coordinate grids

import numpy as np

x = np.arange(-2, 3)
y = np.arange(-2, 3)

# Build all (x, y) combinations
xx, yy = np.meshgrid(x, y)
print("xx:")
print(xx)
print("yy:")
print(yy)

# Compute z = x² + y² at every grid point
z = xx**2 + yy**2
print("\nz:")
print(z)

This is useful for plotting surfaces, building image filters, and evaluating mathematical functions over a grid.
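
The same z can be computed with plain broadcasting, without building the full xx and yy grids. A minimal sketch, assuming the default indexing="xy" of np.meshgrid; z_grid and z_bcast are illustrative names.

```python
import numpy as np

x = np.arange(-2, 3)
y = np.arange(-2, 3)

# Default indexing="xy": xx varies along columns, yy varies along rows
xx, yy = np.meshgrid(x, y)
z_grid = xx**2 + yy**2

# Same surface via broadcasting: x as a row (1, 5), y as a column (5, 1)
z_bcast = x[None, :]**2 + y[:, None]**2

print(z_bcast.shape)                      # (5, 5)
print(np.array_equal(z_grid, z_bcast))    # True
```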

When broadcasting fails — visualize the shapes

import numpy as np

a = np.zeros((3, 4))
b = np.zeros((3, 2))

try:
    a + b
except ValueError as e:
    print("Error:", e)

(3, 4) + (3, 2) — last dims 4 and 2 are not equal and neither is 1 → fails.

Fix it by aligning shapes explicitly (often with reshape or [:, None]).
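
A common version of this failure is a per-row vector that lines up against the wrong axis. The sketch below uses a hypothetical per_row array to show the error and the [:, None] fix.

```python
import numpy as np

a = np.zeros((3, 4))
per_row = np.array([1.0, 2.0, 3.0])     # one value per row, shape (3,)

try:
    a + per_row                          # aligns 4 against 3 → ValueError
except ValueError as e:
    print("Error:", e)

# Make the intent explicit: (3, 4) + (3, 1) → (3, 4)
fixed = a + per_row[:, None]
print(fixed.shape)                       # (3, 4)
print(fixed)
```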

More examples

Distance from each point to a center:

import numpy as np

points = np.array([
    [1, 2],
    [3, 4],
    [5, 6],
    [7, 8],
])
center = np.array([0, 0])

# (4, 2) - (2,) → (4, 2)
diffs = points - center
print("diffs:")
print(diffs)

# Euclidean distance per point
distances = np.sqrt((diffs ** 2).sum(axis=1))
print("distances:", distances)
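
The same idea extends to several centers at once by adding an axis to each operand. This is a sketch with a made-up centers array, not part of the example above.

```python
import numpy as np

points = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])   # shape (4, 2)
centers = np.array([[0, 0], [4, 4]])                   # hypothetical centers, shape (2, 2)

# (4, 1, 2) - (1, 2, 2) → (4, 2, 2): one difference vector per (point, center) pair
diffs = points[:, None, :] - centers[None, :, :]

# Sum the squared differences over the coordinate axis → shape (4, 2)
distances = np.sqrt((diffs ** 2).sum(axis=-1))

print(distances.shape)   # (4, 2)
print(distances)
```

Row i, column j of distances holds the distance from point i to center j.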

Multiplication table:

import numpy as np

n = 10
nums = np.arange(1, n + 1)
table = nums[:, None] * nums[None, :]
print(table)

Broadcasting and memory

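Broadcasting doesn't copy the smaller array. NumPy treats it as a larger one by giving the stretched axis a stride of 0, so the same memory is read repeatedly. Only the result of the operation is allocated at full size, which keeps broadcasting fast and memory-efficient.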
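
You can inspect this with np.broadcast_to, which returns the broadcast view without doing any arithmetic. A minimal sketch; the (1000, 3) target shape is arbitrary.

```python
import numpy as np

row = np.array([10, 20, 30])
view = np.broadcast_to(row, (1000, 3))    # looks like 1000 rows

print(view.shape)                          # (1000, 3)
print(view.strides)                        # first stride is 0: every "row" is the same memory
print(np.shares_memory(view, row))         # True
print(view.flags.writeable)                # False: the view is read-only
```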

Cheatsheet — common patterns

Goal                           Expression / shapes
Add scalar to array            arr + 5
Add row vector to every row    arr (M,N) + row (N,)
Add col vector to every col    arr (M,N) + col[:, None] (M,1)
Outer product                  a[:, None] * b[None, :]
Normalize each column          (arr - arr.mean(axis=0)) / arr.std(axis=0)
Normalize each row             (arr - arr.mean(axis=1, keepdims=True)) / arr.std(axis=1, keepdims=True)

keepdims=True is the trick — keeps the reduced dim as size 1 so it broadcasts back.
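
A short sketch of the row-normalization pattern, using a small made-up array, shows the shape difference that keepdims=True makes:

```python
import numpy as np

arr = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [10.0, 20.0, 30.0, 40.0],
])                                             # shape (2, 4)

print(arr.mean(axis=1).shape)                  # (2,)
print(arr.mean(axis=1, keepdims=True).shape)   # (2, 1)

row_mean = arr.mean(axis=1, keepdims=True)
row_std = arr.std(axis=1, keepdims=True)
normalized = (arr - row_mean) / row_std        # (2, 4) - (2, 1) → (2, 4)

print(normalized.mean(axis=1).round(4))
print(normalized.std(axis=1).round(4))
```

Without keepdims=True, arr - arr.mean(axis=1) would try to align shape (2,) against the last axis of size 4 and raise ValueError.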

Common pitfalls

  • Forgetting keepdims=True: arr.sum(axis=1) reduces shape from (M, N) to (M,), so arr - arr.sum(axis=1) typically fails when M ≠ N and silently lines up against the wrong axis when M = N. Use arr.sum(axis=1, keepdims=True) (shape (M, 1)) so it broadcasts back to (M, N).
  • Adding a row instead of a column: always check the shapes. (M, N) + (M,) typically raises ValueError when M ≠ N and silently adds along the wrong axis when M = N. Reshape to (M, 1).
  • Operator precedence in & / |: wrap each comparison in parentheses: (a > 1) & (a < 5), not a > 1 & a < 5.
  • Mixing dtypes: int + float → float. Sometimes surprising.
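
Two of these pitfalls are easy to reproduce. The sketch below uses small made-up arrays; square is deliberately (3, 3) so the missing keepdims=True does not raise.

```python
import numpy as np

square = np.array([
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
])

# Silent bug: (3, 3) - (3,) runs, but the row sums line up with columns
wrong = square - square.sum(axis=1)
right = square - square.sum(axis=1, keepdims=True)
print(np.array_equal(wrong, right))     # False

# Precedence: & binds tighter than comparisons, so parenthesize each comparison
a = np.array([0, 2, 4, 6])
mask = (a > 1) & (a < 5)
print(mask)                             # [False  True  True False]

try:
    a > 1 & a < 5                       # parsed as a > (1 & a) < 5
except ValueError as e:
    print("Error:", e)
```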

Practice

What does this print?

Expected: [[11 22 33] [14 25 36]]

import numpy as np
m = np.array([[1, 2, 3], [4, 5, 6]])
row = np.array([10, 20, 30])
print(m + row)

Add col to every column (not every row) of the matrix

Expected: [[101 102 103] [204 205 206] [307 308 309]]

import numpy as np
m = np.array([[1,2,3],[4,5,6],[7,8,9]])
col = np.array([100, 200, 300])
print(m + col)        # bug: this broadcasts col across ROWS — reshape to (3,1)

Quiz — Quick check

What you remember

Q1. Broadcasting (3, 4) + (4,) produces a result of shape…

  • (3, 4)
  • (3, 1)
  • (4, 4)
  • Error — shapes don't match

Why: NumPy aligns shapes right-to-left. (4,) becomes (1, 4), then stretches to (3, 4) — same as the first operand.

Q2. Why does (3, 4) + (3, 2) fail?

  • Last dims (4 and 2) are neither equal nor 1
  • First dims don't match
  • NumPy doesn't broadcast at all
  • You need np.broadcast

Why: Two dimensions can broadcast only if they're equal OR one is 1. Neither applies to 4 vs 2, so NumPy raises ValueError.

Q3. When normalizing columns of a (M, N) array with (arr - arr.mean(axis=0)) / arr.std(axis=0), which axis is correct?

  • axis=0 — collapses rows, gives per-column stats
  • axis=1
  • axis=-1
  • No axis needed

Why: axis=0 reduces along the first axis (rows), producing per-column statistics. Then broadcasting expands the result back to (M, N).

Common doubts

Does broadcasting actually copy memory?

No — it uses strides to pretend the smaller array is bigger. The data isn't duplicated. That's why broadcasting is both fast and memory-efficient.

Why do people use keepdims=True so often?

Because reductions collapse a dimension. After arr.mean(axis=1) for shape (M, N), you get (M,). Subtracting that from the original typically raises ValueError when M ≠ N and silently aligns against the wrong axis when M = N. keepdims=True keeps the dimension as size 1 ((M, 1)), which broadcasts cleanly back to (M, N).

When should I reach for np.meshgrid vs broadcasting x[:, None] * y[None, :]?

They produce the same values by different routes. np.meshgrid is more explicit and returns full xx and yy coordinate arrays, which plotting functions usually expect. The [:, None] approach is shorter, skips the intermediate grids, and produces just the result. Use whichever is clearer in context.

What's next

Aggregations — sum, mean, std, axis