Stacking & Splitting Arrays¶
How to combine arrays end-to-end and break them back apart.
np.concatenate() — the general workhorse¶
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(np.concatenate([a, b])) # [1 2 3 4 5 6]
For 2D, pick the axis:
import numpy as np
A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])
# Vertical (axis 0) — stack rows
print(np.concatenate([A, B], axis=0))
print()
# Horizontal (axis 1) — stack columns
print(np.concatenate([A, B], axis=1))
Convenience shortcuts¶
- np.vstack = vertical stack = concatenate(axis=0) (1D inputs are first treated as rows)
- np.hstack = horizontal stack = concatenate(axis=1) (for 1D inputs it joins along axis 0)
- np.dstack = depth stack = stack into the 3rd dimension
- np.stack = creates a NEW axis
import numpy as np
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print("vstack:")
print(np.vstack([a, b])) # treats 1D as rows
# shape (2, 3)
print("\nhstack:")
print(np.hstack([a, b])) # [1 2 3 4 5 6]
# shape (6,)
print("\nstack (new axis):")
print(np.stack([a, b])) # shape (2, 3) — like vstack here
print(np.stack([a, b], axis=1)) # shape (3, 2) — pairs them as columns
stack vs vstack — stack creates a NEW axis; vstack just joins along the existing first axis.
import numpy as np
A = np.ones((3, 4))
B = np.zeros((3, 4))
print("vstack:", np.vstack([A, B]).shape) # (6, 4) — concat along axis 0
print("stack :", np.stack([A, B]).shape) # (2, 3, 4) — NEW axis
print("dstack:", np.dstack([A, B]).shape) # (3, 4, 2) — stack along last axis
Adding a column to a 2D array¶
import numpy as np
A = np.array([
[1, 2],
[3, 4],
[5, 6],
])
new_col = np.array([10, 20, 30])
# Reshape new_col to a column vector
result = np.hstack([A, new_col[:, None]])
print(result)
Splitting — opposite of stacking¶
import numpy as np
a = np.arange(12)
print(np.split(a, 3)) # split into 3 equal parts
print(np.split(a, [3, 7])) # split at indices 3 and 7
# Same with vsplit / hsplit
matrix = np.arange(24).reshape(4, 6)
print("Top half:")
top, bottom = np.vsplit(matrix, 2)
print(top)
print("Bottom half:")
print(bottom)
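The column-wise counterpart, np.hsplit, works the same way. A small sketch that reuses the 4×6 matrix from above, first splitting it into equal halves and then at explicit column indices:

```python
import numpy as np

matrix = np.arange(24).reshape(4, 6)

# Split the 6 columns into 2 equal blocks of 3 columns each
left, right = np.hsplit(matrix, 2)
print(left.shape, right.shape)   # (4, 3) (4, 3)

# Split at column indices 1 and 4, giving blocks that are 1, 3 and 2 columns wide
parts = np.hsplit(matrix, [1, 4])
print([p.shape for p in parts])  # [(4, 1), (4, 3), (4, 2)]

# hstack puts the pieces back together
print(np.array_equal(np.hstack(parts), matrix))  # True
```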
np.array_split() — handles uneven splits¶
import numpy as np
a = np.arange(10)
# `split` would fail because 10 doesn't divide evenly into 3
# np.split(a, 3) # ValueError
# array_split happily makes uneven chunks
for chunk in np.array_split(a, 3):
    print(chunk)
Repeating arrays — np.repeat and np.tile¶
import numpy as np
a = np.array([1, 2, 3])
# Repeat each element 3 times
print(np.repeat(a, 3)) # [1 1 1 2 2 2 3 3 3]
# Tile the whole array 3 times
print(np.tile(a, 3)) # [1 2 3 1 2 3 1 2 3]
# 2D tile
print(np.tile(a, (2, 3))) # shape (2, 9): 2 rows, each holding a tiled 3 times
Flip and roll¶
import numpy as np
a = np.array([1, 2, 3, 4, 5])
print(np.flip(a)) # [5 4 3 2 1]
print(np.roll(a, 2)) # [4 5 1 2 3] — shift right by 2
print(np.roll(a, -1)) # [2 3 4 5 1] — shift left by 1
For 2D, flip an axis:
import numpy as np
m = np.array([
[1, 2, 3],
[4, 5, 6],
])
print(np.flip(m, axis=0)) # flip vertically
print()
print(np.flip(m, axis=1)) # flip horizontally
Insert and append¶
import numpy as np
a = np.array([1, 2, 3, 4, 5])
# Insert 99 at index 2
print(np.insert(a, 2, 99))
# Insert multiple
print(np.insert(a, 2, [99, 88, 77]))
# Append to the end
print(np.append(a, 999))
print(np.append(a, [100, 200]))
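Both functions also accept an axis argument for 2D arrays. The sketch below, using a small made-up matrix m, shows one behaviour that is easy to miss: without axis, np.append flattens its inputs before joining them.

```python
import numpy as np

m = np.array([[1, 2], [3, 4]])

# Without axis, np.append flattens both inputs first
flat = np.append(m, [5, 6])
print(flat)            # [1 2 3 4 5 6]

# With axis=0, the appended values need matching dimensions (here a 1x2 row)
with_row = np.append(m, [[5, 6]], axis=0)
print(with_row.shape)  # (3, 2)

# np.insert with axis=1: insert a column of zeros before column index 1
with_col = np.insert(m, 1, 0, axis=1)
print(with_col.shape)  # (2, 3)
```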
Delete¶
import numpy as np
a = np.array([10, 20, 30, 40, 50])
print(np.delete(a, 2)) # remove index 2 → [10 20 40 50]
print(np.delete(a, [0, 4])) # remove indices 0 and 4
# For 2D
m = np.arange(12).reshape(3, 4)
print(np.delete(m, 1, axis=0)) # remove row 1
print(np.delete(m, 1, axis=1)) # remove column 1
np.insert, np.append, and np.delete all return COPIES; the original array is left unchanged, because NumPy arrays are fixed-size.
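A quick way to convince yourself of the copy behaviour is to check the original after each call. The variable names here are only for illustration:

```python
import numpy as np

original = np.array([10, 20, 30, 40, 50])

shorter = np.delete(original, 2)
longer = np.append(original, 60)

print(original)  # [10 20 30 40 50]  (unchanged)
print(shorter)   # [10 20 40 50]
print(longer)    # [10 20 30 40 50 60]

# The results do not share memory with the input
print(np.shares_memory(original, shorter))  # False

# To "update" an array, rebind the name to the returned copy
data = original
data = np.insert(data, 0, 0)
print(data.shape)  # (6,)
```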
A real example — building a feature matrix¶
import numpy as np
rng = np.random.default_rng(0)
# 5 samples, each with 3 base features
X = rng.random((5, 3))
# Add a bias column (intercept) at the start
ones = np.ones((5, 1))
X_with_bias = np.hstack([ones, X])
print("Shape before:", X.shape)
print("Shape after :", X_with_bias.shape)
print(X_with_bias.round(2))
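Prepending a column of ones lets the intercept be learned as just another coefficient, which is the trick that lets you fit a linear regression with one matrix expression (see Linear Algebra).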
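To make the idea concrete, here is a minimal sketch of such a fit. The targets y and the coefficients in true_coef are invented for illustration, and np.linalg.lstsq is one possible solver, not necessarily the one the Linear Algebra page uses.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((5, 3))
X_with_bias = np.hstack([np.ones((5, 1)), X])

# Hypothetical "true" coefficients: [intercept, w1, w2, w3]
true_coef = np.array([2.0, 1.0, -1.0, 0.5])

# Made-up, noise-free targets generated from those coefficients
y = X_with_bias @ true_coef

# Least-squares solve; the first entry of coef plays the role of the intercept
coef, *_ = np.linalg.lstsq(X_with_bias, y, rcond=None)
print(coef.round(3))
```

Because the targets contain no noise, the recovered coefficients should match true_coef up to floating-point error.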
Cheatsheet¶
| Goal | Function |
|---|---|
| Combine along existing axis | np.concatenate([a, b], axis=...) |
| Stack as rows (1D arrays become rows) | np.vstack([a, b]) |
| Stack as cols / side-by-side | np.hstack([a, b]) |
| Stack creating a NEW axis | np.stack([a, b], axis=...) |
| Stack along the third axis (depth) | np.dstack([a, b]) |
| Split into N equal parts | np.split(a, N) |
| Split allowing uneven | np.array_split(a, N) |
| Split at indices | np.split(a, [i1, i2]) |
| Repeat each elem | np.repeat(a, n) |
| Tile the array | np.tile(a, n) |
| Reverse | np.flip(a, axis=...) |
| Cyclic shift | np.roll(a, n) |
| Insert | np.insert(a, idx, val) |
| Append | np.append(a, val) |
| Delete | np.delete(a, idx) |
Common pitfalls¶
- ❗ Shape mismatch: to vstack, columns must match; to hstack, rows must match.
- ❗ np.append is slow for repeated use: it creates a new array every time. For building up data, use a Python list and convert once at the end.
- ❗ stack vs concatenate: stack creates a new dim, concatenate doesn't. Easy to mix up.
- ❗ 1D vs 2D in hstack: hstack([1Darr, 1Darr]) concatenates as 1D. To stack as columns, reshape first.
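The shape-related pitfalls are easiest to see by triggering them. The sketch below uses small made-up arrays; the exact wording of the error message may differ between NumPy versions.

```python
import numpy as np

A = np.ones((3, 4))
B = np.ones((2, 4))  # same number of columns as A, different number of rows

# Columns match, so vstack works
print(np.vstack([A, B]).shape)  # (5, 4)

# Rows differ (3 vs 2), so hstack raises
hstack_error = None
try:
    np.hstack([A, B])
except ValueError as exc:
    hstack_error = exc
    print("hstack failed:", exc)

# 1D inputs: hstack stays 1D unless each array is reshaped to a column first
a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
print(np.hstack([a, b]).shape)                    # (6,)
print(np.hstack([a[:, None], b[:, None]]).shape)  # (3, 2)
```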
Practice¶
What does this print?
Expected: (6,)
Combine A and B into a (2, 3, 4) array by stacking on a new axis
Expected: (2, 3, 4)
Quiz — Quick check¶
What you remember
Q1. What's the difference between np.stack and np.concatenate?
- No difference
- np.stack creates a new axis; np.concatenate joins along an existing axis
- np.stack is faster
- np.concatenate is deprecated
Why: Stacking 2 arrays of shape (3, 4) with np.stack → (2, 3, 4). With np.concatenate(axis=0) → (6, 4). Different operations, easy to confuse.
Q2. Why use np.array_split instead of np.split?
- It's faster
- array_split handles uneven splits gracefully; split raises an error
- split only works on 2D
- Same function, different name
Why: np.split(arr_of_10, 3) raises because 10 doesn't divide by 3. np.array_split happily makes uneven chunks (e.g. sizes 4, 3, 3).
Q3. What does np.repeat([1, 2, 3], 2) produce?
- [1, 2, 3, 1, 2, 3]
- [1, 1, 2, 2, 3, 3]
- [2, 4, 6]
- [[1, 2, 3], [1, 2, 3]]
Why: repeat duplicates each element the given number of times. tile is the one that repeats the whole array: np.tile([1, 2, 3], 2) gives [1, 2, 3, 1, 2, 3].
Common doubts¶
Why is np.append(arr, val) slow when called in a loop?
Because NumPy arrays have a fixed size, every np.append call allocates a new buffer and copies the old data into it. For N appends you do O(N²) work in total. Build a Python list with .append() (amortized O(1) per call), then convert once with np.array(my_list) at the end.
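The two patterns side by side, as a rough sketch (the loop body i * i is just a stand-in for whatever values you are collecting). The preallocation variant at the end is an additional option not discussed above, which applies when the final length is known in advance.

```python
import numpy as np

n = 1000

# Slow pattern: each np.append allocates a new array and copies everything so far
slow = np.array([], dtype=np.int64)
for i in range(n):
    slow = np.append(slow, i * i)

# Preferred pattern: grow a Python list, convert once at the end
values = []
for i in range(n):
    values.append(i * i)
fast = np.array(values)

# Alternative when the length is known: preallocate and fill in place
pre = np.empty(n, dtype=np.int64)
for i in range(n):
    pre[i] = i * i

print(np.array_equal(slow, fast), np.array_equal(fast, pre))  # True True
```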
Difference between np.flip and np.roll?
np.flip reverses an axis — [1, 2, 3] → [3, 2, 1]. np.roll shifts cyclically — np.roll([1, 2, 3], 1) → [3, 1, 2] (last element wraps to front). Different operations, easy to mix up.
When should I use hstack vs np.column_stack?
For 2D arrays they're equivalent. For 1D arrays they're different: np.hstack([a, b]) concatenates as 1D [a..., b...]. np.column_stack([a, b]) treats each 1D array as a column, producing a 2D matrix. Use column_stack when you want columns.
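A short comparison, using the same a and b as earlier plus two made-up 2D arrays:

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

# 1D inputs: hstack joins end to end, column_stack makes each array a column
joined = np.hstack([a, b])
cols = np.column_stack([a, b])
print(joined.shape)  # (6,)
print(cols.shape)    # (3, 2)

# 2D inputs: the two functions give the same result
A = np.ones((3, 2))
B = np.zeros((3, 1))
print(np.array_equal(np.hstack([A, B]), np.column_stack([A, B])))  # True

# column_stack also accepts a mix of 2D and 1D inputs
mixed = np.column_stack([A, a])
print(mixed.shape)   # (3, 3)
```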