A vector is the bridge between data structures and the mathematics that powers graphics, physics engines and machine learning. At one level it is utterly familiar — an ordered list of numbers, which you already know how to store in an array. What makes vectors a topic rather than just "a list of numbers" is the small algebra that comes with them: you can add vectors, scale them, take their dot product, and form convex combinations, and each operation has a clean geometric meaning. This lesson covers both halves: vectors as a data structure (how a program represents one) and vector arithmetic (the four operations, each with a worked example and its geometric reading), finishing with why these ideas underpin so much of modern computing. The mathematics is presented in KaTeX throughout so the notation matches what you will meet in graphics and ML.
This lesson addresses the H446 1.4.2 Data Structures content on vectors:
(Phrasing here paraphrases the specification content; it is not a verbatim quote.)
A vector is a quantity with both magnitude (size) and direction, captured as an ordered list of numbers called its components. A vector with n components is an n-dimensional vector. The ordering matters: (3,4) and (4,3) are different vectors. Contrast this with a scalar, a single number carrying magnitude only:
| | Scalar | Vector |
|---|---|---|
| What it is | a single number | an ordered list of numbers |
| Carries | magnitude only | magnitude and direction |
| Examples | temperature 20, speed 5 | position (3,4), velocity (2,−1,5) |
We write a vector as $\mathbf{v} = (v_1, v_2, \dots, v_n)$. Geometrically a 2-D vector $(3, 4)$ is an arrow from the origin to the point $(3, 4)$: the magnitude is the arrow's length and the direction is where it points. This dual nature (a list of numbers that is also a geometric arrow) is the key to everything below: the arithmetic is done on the numbers, but its meaning is read off the arrows.
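The arrow's length can be computed from the components with Pythagoras' theorem. The following is a minimal sketch; the helper name `magnitude` is an assumption for illustration and is not part of the lesson's own code.

```python
import math

def magnitude(v):
    """Length of the arrow: square root of the sum of the squared components."""
    return math.sqrt(sum(c * c for c in v))

print(magnitude([3, 4]))   # 5.0 (the 3-4-5 right-angled triangle)
print(magnitude([4, 3]))   # 5.0 (same length, but a different direction)
```

The two calls show why ordering matters: (3,4) and (4,3) have the same magnitude yet are different vectors, because they point in different directions.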
It is worth being clear about why a computer scientist cares about vectors at all, rather than treating them as a stray piece of maths in a programming course. Almost any quantity that has more than one number attached to it is naturally a vector: a point on the screen is two numbers (its x and y), a point in a 3-D world is three, a colour is three (red, green, blue), and a data point in a machine-learning model might be hundreds. Treating these as single objects with their own arithmetic — rather than as loose collections of separate numbers — is exactly the kind of abstraction this whole module is about. Once you can add two positions, scale a velocity, or measure how similar two data points are with a single operation, you can write code that manipulates whole quantities at once instead of fiddling with their components individually. That is why vectors sit in a data-structures course: they are a structured way of holding related numbers, packaged with the operations that make sense for them.
The dimension of a vector is simply how many components it has, and the operations below work in any dimension — the formulae are written for the general n-dimensional case precisely so that the same rule covers a 2-D screen position, a 3-D world position and a 300-D word embedding without change. When you meet a worked example in two dimensions, remember it is only an easy-to-draw instance of a rule that scales to any number of components.
In a program a vector is almost always a list or array of its components, indexed from 0:
v = [3, 4] # the 2-D vector (3, 4)
w = [1, -2, 5] # the 3-D vector (1, -2, 5)
print(v[0]) # 3 — first component
print(w[2]) # 5 — third component
But that is only the commonest of three valid views, and the spec wants you to recognise all three:
| Representation | Idea | Best when |
|---|---|---|
| List / array | components stored contiguously, accessed by index | the usual case; dense vectors |
| Dictionary (map) | index → value, storing only non-zero components | sparse vectors (mostly zeros) |
| Function | a function v(i) returning the i-th component | the abstract/mathematical view |
The function view is the most illuminating conceptually: a vector (3,4) is the function v with v(0)=3 and v(1)=4. This reframes a vector as "a mapping from index to value", which is exactly the same abstraction as an array (an array is a function from index to element) — a neat link back to the arrays lesson. The dictionary view is the practical pay-off of the function view: if a 10,000-dimensional vector has only a handful of non-zero components (common in machine learning's "bag-of-words" feature vectors), storing it as a dictionary of just the non-zero {index:value} pairs is hugely more space-efficient than a list of 10,000 numbers — the same sparse-versus-dense trade-off seen with graph representations.
# A sparse vector as a dictionary: only non-zero components stored
sparse = {0: 3, 7: 5, 999: -2} # a 1000-D vector, 997 implicit zeros
def component(vec, i):
    return vec.get(i, 0)           # the "function" view: index -> value
The three representations are not just trivia: choosing the right one is a real design decision with the same flavour as the choices made throughout this course. A list gives O(1) access to any component and is compact when most components are non-zero (a dense vector such as a 3-D position or an RGB colour). A dictionary wins decisively when the vector is sparse: a feature vector representing which of 50,000 dictionary words appear in a short document is overwhelmingly zero, so storing it as a list of 50,000 numbers wastes memory on entries that are all 0, whereas a dictionary of the handful of words actually present is tiny. The function view, finally, is the abstraction that unifies the other two: a list and a dictionary are both mappings from index to value, so "look up component i" means the same thing for each. It is the cleanest way to think about a vector when you do not want to commit to a storage layout at all. Being able to name all three and justify when each is appropriate is exactly the kind of representation-trade-off reasoning the specification rewards, echoing the array-vs-list and matrix-vs-list choices made earlier.
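To make the link between the three views concrete, here is a small sketch that moves one vector between them. The helper names `to_sparse`, `to_dense` and `as_function` are hypothetical, chosen only for this illustration.

```python
dense = [0, 3, 0, 0, 5, 0]          # list view: every component is stored

def to_sparse(vec):
    """Keep only the non-zero components as {index: value} pairs."""
    return {i: c for i, c in enumerate(vec) if c != 0}

def to_dense(sparse, n):
    """Rebuild the full list; n is required because the dictionary
    does not record the dimension of the vector."""
    return [sparse.get(i, 0) for i in range(n)]

def as_function(sparse):
    """Function view: return a callable v(i) giving the i-th component."""
    return lambda i: sparse.get(i, 0)

sparse = to_sparse(dense)
v = as_function(sparse)

print(sparse)                                   # {1: 3, 4: 5}
print(v(4), v(2))                               # 5 0
print(to_dense(sparse, len(dense)) == dense)    # True
```

One design point worth noticing: the dictionary alone cannot tell a 6-D vector from a 6,000-D one, so a sparse representation usually needs the dimension stored alongside it.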
Four operations make up the examinable algebra. Throughout, let $\mathbf{u} = (u_1, \dots, u_n)$ and $\mathbf{v} = (v_1, \dots, v_n)$.
Two vectors of the same dimension are added component-wise:
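$$\mathbf{u} + \mathbf{v} = (u_1 + v_1,\ u_2 + v_2,\ \dots,\ u_n + v_n)$$
Worked example with $\mathbf{u} = (3, 4, 1)$ and $\mathbf{v} = (1, -2, 5)$:
$$\mathbf{u} + \mathbf{v} = (3 + 1,\ 4 + (-2),\ 1 + 5) = (4,\ 2,\ 6)$$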
def vector_add(u, v):
    return [u[i] + v[i] for i in range(len(u))]
print(vector_add([3, 4, 1], [1, -2, 5])) # [4, 2, 6]
Geometrically, addition is the tip-to-tail rule: place v's tail at u's tip, and u+v is the arrow from the start to the finish. If you walk 3 east and 4 north, then 1 east and 2 south, your total displacement is (4,2). This is why vectors model movement so naturally: in a game, if an object is at position p and moves by velocity v each frame, its new position is simply p+v — one vector addition advances the whole object, however many dimensions it lives in. The same operation combines forces in a physics engine: two forces pulling on an object add tip-to-tail into a single resultant force, which is why addition is the workhorse of any simulation. Note also that addition is commutative (u+v=v+u) and associative, just like ordinary number addition, because it is performed component-by-component — a reassurance that the familiar algebra rules carry over.
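The per-frame position update described above can be sketched as follows. This version adds a dimension check, which is an assumption on top of the lesson's own `vector_add`; the starting position and velocity are made-up values.

```python
def vector_add(u, v):
    """Component-wise sum; both vectors must have the same dimension."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    return [a + b for a, b in zip(u, v)]

position = [10, 20]
velocity = [2, -1]

for frame in range(3):                # advance the object by three frames
    position = vector_add(position, velocity)

print(position)                       # [16, 17]

# Commutativity: swapping the operands gives the same result
print(vector_add([3, 4], [1, -2]) == vector_add([1, -2], [3, 4]))   # True
```

The explicit length check matters because `zip` would otherwise silently ignore the extra components of the longer vector.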
Multiplying a vector by a scalar k scales every component:
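$$k\mathbf{v} = (k \cdot v_1,\ k \cdot v_2,\ \dots,\ k \cdot v_n)$$
Worked example with $k = 3$ and $\mathbf{v} = (2, -1, 4)$:
$$3\mathbf{v} = (3 \cdot 2,\ 3 \cdot (-1),\ 3 \cdot 4) = (6,\ -3,\ 12)$$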
def scalar_multiply(k, v):
    return [k * c for c in v]
print(scalar_multiply(3, [2, -1, 4])) # [6, -3, 12]
print(scalar_multiply(0.5, [2, -1, 4])) # [1.0, -0.5, 2.0]
Geometrically, scaling stretches or shrinks the arrow without rotating it: $k > 1$ lengthens it, $0 < k < 1$ shortens it, and $k < 0$ flips it to point the opposite way (so $-1 \cdot \mathbf{v}$ is $\mathbf{v}$ reversed). The direction is preserved (or exactly reversed); only the magnitude changes. Scalar multiplication combines with addition to give subtraction: $\mathbf{u} - \mathbf{v}$ is really $\mathbf{u} + (-1)\mathbf{v}$, and the resulting vector points from $\mathbf{v}$ to $\mathbf{u}$, which is the standard way to find the direction and distance from one point to another in a game or simulation. Scaling also underpins applying a time step in physics: multiplying a velocity vector by the small time interval $\Delta t$ gives the displacement for that step, so $\mathbf{p}_{\text{new}} = \mathbf{p} + (\Delta t)\mathbf{v}$ combines a scalar multiplication and an addition in a single, dimension-independent update. The two simplest operations, used together, already capture how moving objects are simulated.
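Subtraction and the time-step update can both be sketched in a few lines. The function names `vector_subtract` and `step`, and the sample positions, are assumptions made for this illustration.

```python
def vector_subtract(u, v):
    """u - v, i.e. u + (-1)v: the arrow pointing from v to u."""
    return [a - b for a, b in zip(u, v)]

def step(p, v, dt):
    """One simulation step: new position = p + dt * v."""
    return [pc + dt * vc for pc, vc in zip(p, v)]

player = [2, 3]
enemy = [7, 1]
print(vector_subtract(enemy, player))        # [5, -2]: from the player to the enemy

# Half a second of movement at velocity (4, -2)
print(step([0.0, 10.0], [4.0, -2.0], 0.5))   # [2.0, 9.0]
```

Neither function mentions the number of components, so the same code serves 2-D and 3-D simulations unchanged.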
The dot product (or scalar product) of two vectors yields a single scalar, formed by multiplying corresponding components and summing:
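$$\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2 + \dots + u_n v_n = \sum_{i=1}^{n} u_i v_i$$
Worked example with $\mathbf{u} = (3, 4)$ and $\mathbf{v} = (2, -1)$:
$$\mathbf{u} \cdot \mathbf{v} = (3 \cdot 2) + (4 \cdot (-1)) = 6 - 4 = 2$$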
def dot_product(u, v):
    return sum(u[i] * v[i] for i in range(len(u)))
print(dot_product([3, 4], [2, -1])) # 2
print(dot_product([1, 0], [0, 1])) # 0 -> perpendicular
The dot product measures how aligned two vectors are, and its sign is the examinable point:
| u⋅v | Geometric meaning |
|---|---|
| positive | the vectors point in broadly the same direction (angle <90°) |
| zero | the vectors are perpendicular (at right angles) |
| negative | the vectors point in broadly opposite directions (angle >90°) |
The link to the angle is exact: $\mathbf{u} \cdot \mathbf{v} = \lvert\mathbf{u}\rvert\,\lvert\mathbf{v}\rvert \cos\theta$, where $\theta$ is the angle between them. Since $\cos 90^\circ = 0$, a zero dot product between two non-zero vectors means they are perpendicular. This is the single most-tested fact about the dot product, and the basis of lighting calculations in graphics and similarity scoring in machine learning.
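Rearranging the angle formula gives the angle itself. The sketch below assumes both vectors are non-zero; `angle_degrees` is a hypothetical helper name, and the clamp is a defensive assumption against floating-point rounding.

```python
import math

def dot_product(u, v):
    return sum(a * b for a, b in zip(u, v))

def angle_degrees(u, v):
    """Angle between two non-zero vectors, from u.v = |u||v|cos(theta)."""
    lengths = math.sqrt(dot_product(u, u)) * math.sqrt(dot_product(v, v))
    cos_theta = dot_product(u, v) / lengths
    cos_theta = max(-1.0, min(1.0, cos_theta))   # keep rounding error inside [-1, 1]
    return math.degrees(math.acos(cos_theta))

print(round(angle_degrees([1, 0], [0, 1]), 1))    # 90.0 (dot product is zero)
print(round(angle_degrees([3, 4], [2, -1]), 1))   # 79.7 (dot product is +2, so under 90)
```

The second result ties back to the worked example: a small positive dot product corresponds to an angle just under 90°.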
It is worth pausing on why this one operation is so central, because at first glance "multiply matching components and add them up" looks arbitrary. The reason is that the dot product collapses two vectors into a single number that captures their agreement: each term uivi is positive when both vectors share the same sign in that component (they "agree" on that axis) and negative when they disagree, so the sum measures the overall tendency to point the same way. That is exactly the information a graphics engine needs when shading a surface: the brightness of a point depends on the angle between the surface's outward normal vector and the direction to the light, and the dot product of those two unit vectors gives cosθ directly — 1 when the surface faces the light head-on (brightest), 0 when it is edge-on (dark). The very same calculation, applied to high-dimensional feature vectors, tells a recommendation system how similar two users' tastes are. One small formula, computed in O(n) time, therefore underlies both realistic lighting and machine-learning similarity, which is why examiners treat the dot product as the most important of the four operations.
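Both applications reduce to the same few lines. This is a simplified sketch rather than how any particular engine or recommender is built: the function names are hypothetical, the vectors are assumed non-zero, and clamping negative brightness to 0 is an assumption about how light from behind a surface is handled.

```python
import math

def dot_product(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalise(v):
    """Scale a non-zero vector to length 1, keeping its direction."""
    length = math.sqrt(dot_product(v, v))
    return [c / length for c in v]

def brightness(normal, to_light):
    """cos(theta) between the two unit vectors, clamped at 0 (assumption)."""
    return max(0.0, dot_product(normalise(normal), normalise(to_light)))

def cosine_similarity(u, v):
    """The same cos(theta), applied to feature vectors of any dimension."""
    return dot_product(normalise(u), normalise(v))

print(brightness([0, 1], [0, 5]))     # 1.0: the surface faces the light head-on
print(brightness([0, 1], [3, 0]))     # 0.0: the light is edge-on
print(brightness([0, 1], [0, -2]))    # 0.0: the light is behind the surface

# Two users' ratings of the same three films (made-up numbers)
print(round(cosine_similarity([5, 4, 0], [4, 5, 0]), 3))   # 0.976: similar tastes
print(round(cosine_similarity([5, 0, 0], [0, 0, 5]), 3))   # 0.0: nothing in common
```

Normalising first means the result depends only on direction, so a distant light or a user who rates everything highly does not distort the score.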
Exam Tip: The dot product question almost always asks you to compute the value and state what it tells you. A result of zero means the vectors are perpendicular — say so explicitly. Positive = similar direction, negative = opposing.
A convex combination of u and v is a weighted average whose weights are non-negative and sum to 1:
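$$\mathbf{w} = \alpha\mathbf{u} + (1 - \alpha)\mathbf{v}, \qquad 0 \le \alpha \le 1$$
Worked example with $\mathbf{u} = (2, 6)$, $\mathbf{v} = (8, 2)$ and $\alpha = 0.25$:
$$\mathbf{w} = 0.25\,(2, 6) + 0.75\,(8, 2) = (0.5, 1.5) + (6, 1.5) = (6.5,\ 3.0)$$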
def convex_combination(u, v, alpha):
    return [alpha * u[i] + (1 - alpha) * v[i] for i in range(len(u))]
print(convex_combination([2, 6], [8, 2], 0.5)) # [5.0, 4.0] midpoint
print(convex_combination([2, 6], [8, 2], 0.25)) # [6.5, 3.0]
Geometrically, a convex combination is always a point on the straight line segment joining u and v; α slides the point along it:
| α | Result | Position |
|---|---|---|
| 1 | u | at u |
| 0.5 | midpoint | halfway |
| 0.25 | (6.5,3.0) | one-quarter of the way from v toward u |
| 0 | v | at v |
Because the weights are non-negative and sum to 1, the point can never escape the segment — that "stays between" property is what convex means, and it is exactly the operation used to interpolate (blend) between two positions, colours or animation key-frames in graphics.
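As an illustration of blending, the sketch below fades between two RGB colours treated as 3-D vectors. The range check on `alpha` and the choice of colours are assumptions added for this example.

```python
def convex_combination(u, v, alpha):
    """alpha * u + (1 - alpha) * v, with 0 <= alpha <= 1."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must lie between 0 and 1")
    return [alpha * a + (1 - alpha) * b for a, b in zip(u, v)]

red = [255, 0, 0]
blue = [0, 0, 255]

# Fade from blue (alpha = 0) to red (alpha = 1) in five equal steps
for i in range(5):
    alpha = i / 4
    print(alpha, convex_combination(red, blue, alpha))

print(convex_combination(red, blue, 0.5))   # [127.5, 0.0, 127.5]: an even mix
```

Rejecting values of `alpha` outside 0 to 1 is what keeps every blended colour on the segment between the two endpoints; without it the result would be an extrapolation rather than a convex combination.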