Two data structures appear in almost every program you will ever write: the string (a sequence of characters such as names, messages or file contents) and the array (an ordered collection of values, implemented as a list in Python). This lesson develops fluency with both, because the H446 exam routinely asks you to write, trace or correct algorithms that index, slice, search and transform them. For strings we cover indexing and slicing, concatenation, the common methods (upper, lower, find, split, replace, strip), and ASCII / ord / chr conversions, which are the gateway to ciphers and validation. For arrays we cover creation and access, traversal, linear search, the staple algorithms (max/min, sum/average, count) and two-dimensional arrays for grids and tables. Every idea is shown in Python with OCR-style pseudocode alongside.
A theme to hold throughout: indices in computing start at 0, and almost every bug in this topic is an off-by-one — reading one element too few or too many, or slicing one character short. The cure is the same every time: trace the first and last iterations by hand. A second theme is immutability — a Python string cannot be changed in place; any "edit" actually builds a new string — which explains a whole class of surprises and connects directly to how strings are stored.
Within H446 2.2.1 this lesson covers string and array handling. You should be able to:
- use the common string methods: upper/lower, find, replace, split and strip;
- use ord and chr, and reason about ASCII ranges.
These points paraphrase the specification; nothing is quoted verbatim.
A string is an ordered, 0-indexed sequence of characters. Indexing picks one character; slicing s[start:end] returns the substring from start up to but not including end.
word = "COMPUTER"
#       01234567   <- each character's index
print(word[0]) # 'C' (first character)
print(word[7]) # 'R' (last character)
print(word[-1]) # 'R' (negative indices count from the end)
print(len(word)) # 8
word = "COMPUTER"
print(word[0:4]) # 'COMP' (indices 0,1,2,3 — index 4 is EXCLUDED)
print(word[4:]) # 'UTER' (from 4 to the end)
print(word[:3]) # 'COM' (from start up to index 3)
print(word[::-1]) # 'RETUPMOC' (a reversed copy — step of -1)
The half-open rule (end is excluded) is the number-one slicing trap. A handy check: the length of s[a:b] is exactly b - a, so word[0:4] has 4 - 0 = 4 characters. The same idea in OCR pseudocode often uses a start position and a length:
# OCR-style pseudocode: substring(string, startPosition, numberOfCharacters)
firstFour = word.substring(0, 4) # "COMP" — start at 0, take 4 characters
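The start-and-length form can be mirrored in Python with a slice. The sketch below is illustrative only: `substring` is a hypothetical helper named to match the pseudocode, not a built-in Python function.

```python
def substring(s: str, start: int, length: int) -> str:
    # Hypothetical helper: start + length is the (excluded) end index of the slice.
    return s[start:start + length]

word = "COMPUTER"
print(substring(word, 0, 4))   # COMP
print(substring(word, 4, 4))   # UTER
print(len(word[2:6]))          # 4, because 6 - 2 = 4
```

Converting between the two forms is where off-by-one slips tend to creep in: the slice needs an end index, whereas the pseudocode needs a count.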
| Operation | Python | OCR-style | Result for the example |
|---|---|---|---|
| Length | len(s) | s.length | len("Hello") → 5 |
| Concatenation | s1 + s2 | s1 + s2 | "Hi" + " there" → "Hi there" |
| Substring / slice | s[a:b] | s.substring(a, n) | "Hello"[1:4] → "ell" |
| Character at index | s[i] | s[i] | "Hello"[0] → "H" |
| Find / search | s.find(sub) | s.find(sub) | "Hello".find("ll") → 2 |
| Upper case | s.upper() | s.upper | "hi".upper() → "HI" |
| Lower case | s.lower() | s.lower | "HI".lower() → "hi" |
| Strip whitespace | s.strip() | — | " hi ".strip() → "hi" |
| Replace | s.replace(a, b) | — | "cat".replace("c","b") → "bat" |
| Split | s.split(d) | — | "a,b,c".split(",") → ["a","b","c"] |
A note on find: it returns the index of the first match, or -1 if the substring is absent — that -1 sentinel is examinable. split is the everyday way to break a CSV line into fields, and join is its inverse:
record = "Alice,17,A"
fields = record.split(",") # ['Alice', '17', 'A']
print(fields[1]) # '17' (still a string!)
rebuilt = ",".join(fields) # 'Alice,17,A' (join is the inverse of split)
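As a small sketch of how the -1 sentinel might be handled in practice, the function below checks the result of find before using it. The name `describe_match` is made up for this illustration.

```python
def describe_match(text: str, sub: str) -> str:
    position = text.find(sub)          # index of the first match, or -1
    if position == -1:
        return "not found"
    return f"found at index {position}"

print(describe_match("Hello", "ll"))   # found at index 2
print(describe_match("Hello", "z"))    # not found

record = "Alice,17,A"
fields = record.split(",")
age = int(fields[1])                   # convert the '17' string before doing arithmetic
print(age + 1)                         # 18
```

Testing for -1 explicitly matters because -1 is also a valid index in Python (the last character), so using an unchecked result can silently give the wrong answer.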
In Python a string cannot be changed in place. Trying to assign to one character raises an error:
name = "cat"
# name[0] = "b" # TypeError: 'str' object does not support item assignment
name = "b" + name[1:] # instead, BUILD a new string -> "bat"
Every operation that appears to "edit" a string — replace, upper, slicing — actually returns a brand-new string and leaves the original untouched. A classic consequence: text.upper() on its own does nothing visible unless you capture the result.
text = "hello"
text.upper() # creates "HELLO" then THROWS IT AWAY — text is still "hello"
print(text) # hello
text = text.upper() # correct: reassign to keep the new string
print(text) # HELLO
This connects to how strings are stored: because the characters are fixed, the language can share and optimise them safely. Arrays/lists, by contrast, are mutable — you can change an element in place — which is the central difference between the two structures and a frequent exam discriminator.
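One way to see the mutable/immutable difference is to give the same value two names. This is a minimal sketch with made-up variable names, not a required exam technique.

```python
scores = [10, 20, 30]
alias = scores            # both names refer to the SAME list
alias[0] = 99             # an in-place change is visible through either name
print(scores)             # [99, 20, 30]

name = "cat"
other = name              # both names refer to the same string
other = "b" + other[1:]   # builds a NEW string; name is untouched
print(name)               # cat
print(other)              # bat
```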
Because strings are immutable, "adding" to one in a loop does not modify it — each += creates a whole new string and discards the old. The pattern below is perfectly readable and standard at A-Level, but it is worth understanding what is happening underneath:
def shout(text: str) -> str:
    result = ""                     # start with an empty string
    for char in text:
        result += char.upper()      # each += BUILDS A NEW string, reassigning result
    return result
print(shout("hello")) # HELLO
Each iteration takes the current result, makes a brand-new longer string, and rebinds result to it; the previous string is thrown away. For the short inputs in exam work this is completely fine and is the expected style. The deeper point — examinable as an efficiency observation — is that repeatedly concatenating in a long loop does more work than it appears to, because each step copies everything built so far. The usual remedy is to collect the pieces in a list (which is mutable, so appending is cheap) and join them once at the end:
def shout_efficient(text: str) -> str:
    pieces = []                       # a mutable list of fragments
    for char in text:
        pieces.append(char.upper())   # cheap append, no copying of the whole result
    return "".join(pieces)            # one final build of the string
This is a neat illustration of choosing the right structure for the job: the immutable string is ideal for a final, fixed value, but a mutable list is the better workspace while the value is still being assembled — the same mutability distinction, applied to performance.
Iterating a string visits each character. Two idioms — directly over the characters, or over the indices when you also need the position:
text = "Hello"
for char in text:               # simplest: gives each character
    print(char)
for i in range(len(text)):      # when you also need the index
    print(f"Index {i}: {text[i]}")
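The index-based idiom pays off when the positions themselves are the answer. The sketch below collects every index at which a character occurs; `positions_of` is a hypothetical name used only for this example.

```python
def positions_of(text: str, target: str) -> list:
    found = []
    for i in range(len(text)):      # loop over indices so the position is available
        if text[i] == target:
            found.append(i)
    return found

print(positions_of("Hello", "l"))   # [2, 3]
print(positions_of("Hello", "z"))   # []
```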
ord and chr
Every character is stored as a number. ord(c) gives a character's code; chr(n) turns a code back into a character. At this level you should know the ASCII ranges.
| Function | Does | Example |
|---|---|---|
| ord(c) | character → code | ord("A") → 65 |
| chr(n) | code → character | chr(65) → "A" |

| Characters | ASCII range |
|---|---|
| Digits 0–9 | 48–57 |
| Uppercase A–Z | 65–90 |
| Lowercase a–z | 97–122 |
Two facts drop out of these ranges and are widely useful. First, the gap from any uppercase letter to its lowercase form is 97 - 65 = 32, so you can change case with arithmetic. Second, because the letters are contiguous, you can do alphabet arithmetic — the basis of the Caesar cipher:
def caesar_encrypt(text: str, shift: int) -> str:
    result = ""
    for char in text:
        if char.isalpha():
            base = ord("A") if char.isupper() else ord("a")   # anchor of this case
            # subtract base -> 0..25, shift, wrap with % 26, add base back
            shifted = (ord(char) - base + shift) % 26
            result += chr(base + shifted)
        else:
            result += char                                    # leave spaces/punctuation
    return result
print(caesar_encrypt("Hello World", 3)) # Khoor Zruog
The % 26 is what makes Z wrap round to A; subtracting base first maps the letter to the range 0–25 so the modulo behaves. This single example exercises traversal, ord/chr, conditionals and building a new string — which is why ciphers are such common exam fodder. It links straight to the Encryption topic in the wider course.
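Decryption can reuse the same arithmetic with the shift negated, because Python's % operator returns a non-negative result for a positive modulus. The sketch below assumes the text contains only ASCII letters plus other characters to be left alone; `caesar_shift` and `caesar_decrypt` are illustrative names, not part of the lesson's own code.

```python
def caesar_shift(text: str, shift: int) -> str:
    result = ""
    for char in text:
        if char.isalpha():          # assumption: letters are plain ASCII A-Z / a-z
            base = ord("A") if char.isupper() else ord("a")
            result += chr(base + (ord(char) - base + shift) % 26)
        else:
            result += char          # spaces and punctuation pass through unchanged
    return result

def caesar_decrypt(text: str, shift: int) -> str:
    return caesar_shift(text, -shift)   # undo the shift by moving the other way

print(caesar_shift("Hello World", 3))     # Khoor Zruog
print(caesar_decrypt("Khoor Zruog", 3))   # Hello World
print(caesar_shift("xyz", 3))             # abc
```

In languages where % can return a negative value, a common workaround is to add 26 before taking the remainder.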
def count_vowels(text: str) -> int:
    vowels = "aeiouAEIOU"
    count = 0
    for char in text:           # a linear search/scan over the characters
        if char in vowels:
            count += 1
    return count
print(count_vowels("Hello World")) # 3
The string test methods are the backbone of input validation — checking a value's shape before trusting it (linking to the defensive programming of the previous lesson):
value = "Hello123"
print(value.isalpha()) # False — contains digits
print(value.isdigit()) # False — contains letters
print(value.isalnum()) # True — all letters/digits
print(value.startswith("He"))# True
print(value.endswith("23")) # True
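As a sketch of how these tests combine into a validation routine, the function below accepts an age typed as text. The 0 to 120 range and the name `is_valid_age` are assumptions made for the example, and the input is assumed to be plain ASCII.

```python
def is_valid_age(raw: str) -> bool:
    raw = raw.strip()               # ignore stray spaces around the input
    if not raw.isdigit():           # rejects "", "abc", "-5" and "3.5"
        return False
    return 0 <= int(raw) <= 120     # assumed range for this illustration

print(is_valid_age("17"))     # True
print(is_valid_age(" 42 "))   # True
print(is_valid_age("abc"))    # False
print(is_valid_age("200"))    # False
```

Checking the shape with isdigit before calling int means the conversion is only attempted on input that looks like a whole number.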
Input always arrives as a string, so you convert between types constantly. str, int and float convert between strings and numbers; list and join convert between a string and a list of its characters.
text = str(42) # "42" number -> string
n = int("42") # 42 string -> integer
x = float("3.14") # 3.14 string -> float
chars = list("Hello") # ['H','e','l','l','o'] string -> list of characters
joined = "".join(chars) # "Hello" list -> string
Remember that a digit character and an integer are different: "5" + "3" is "53" (concatenation), whereas int("5") + int("3") is 8.
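Conversion fails when the text is not a valid number: int raises a ValueError. One possible way to handle that is sketched below, with the hypothetical helper `to_int_or_none` returning None instead of crashing.

```python
def to_int_or_none(raw: str):
    try:
        return int(raw)         # succeeds for text such as "42"
    except ValueError:
        return None             # "4x", "" and "3.14" are not valid integers

print(to_int_or_none("42"))     # 42
print(to_int_or_none("4x"))     # None
print("5" + "3")                # 53 (string concatenation)
print(int("5") + int("3"))      # 8  (integer addition)
```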
An array is an ordered, 0-indexed collection. In Python these are lists, and — unlike strings — they are mutable.
numbers = [10, 20, 30, 40, 50]
print(numbers[0]) # 10 (first)
print(numbers[-1]) # 50 (last)
print(numbers[1:3]) # [20, 30] (slice — same half-open rule as strings)
numbers[2] = 35 # lists ARE mutable: now [10, 20, 35, 40, 50]
| Operation | Python | Does |
|---|---|---|
| Length | len(a) | number of elements |
| Append | a.append(x) | add x to the end |
| Insert | a.insert(i, x) | insert x at index i |
| Remove | a.remove(x) | delete first x |
| Pop | a.pop(i) | remove and return element i |
| Sort | a.sort() | sort in place |
| Index | a.index(x) | index of first x |
| Membership | x in a | is x present? |
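The operations in the table can be traced on one small list. The values here are arbitrary, chosen only so that each step is easy to follow.

```python
a = [30, 10, 20]
a.append(40)        # [30, 10, 20, 40]
a.insert(1, 15)     # [30, 15, 10, 20, 40]
a.remove(10)        # [30, 15, 20, 40]
last = a.pop(3)     # returns 40; a is now [30, 15, 20]
a.sort()            # [15, 20, 30]
print(a)            # [15, 20, 30]
print(last)         # 40
print(a.index(20))  # 1
print(25 in a)      # False
print(len(a))       # 3
```

Note that append, insert, remove and sort change the list in place and return nothing useful, whereas pop returns the element it removed.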
These five algorithms recur constantly and should be writable from memory in both notations.
def linear_search(arr: list, target) -> int:
    for i in range(len(arr)):
        if arr[i] == target:
            return i            # found: return its position
    return -1                   # not found: the -1 sentinel
numbers = [4, 7, 2, 9, 1, 5]
print(linear_search(numbers, 9)) # 3
print(linear_search(numbers, 6)) # -1
The pattern: assume the first element is the best so far, then update whenever you beat it. Starting the loop at index 1 (not 0) avoids comparing the first element with itself.
def find_max(arr: list):
    maximum = arr[0]                # best-so-far starts as the first element
    for i in range(1, len(arr)):
        if arr[i] > maximum:
            maximum = arr[i]
    return maximum
print(find_max([4, 7, 2, 9, 1, 5])) # 9
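The same best-so-far pattern gives the minimum by flipping the comparison, and tracking the index rather than the value also answers "where is it?". This is a sketch that assumes a non-empty list; `find_min_index` is an illustrative name.

```python
def find_min_index(arr: list) -> int:
    min_index = 0                       # best-so-far starts at the first position
    for i in range(1, len(arr)):
        if arr[i] < arr[min_index]:     # '<' instead of '>' finds the minimum
            min_index = i
    return min_index

numbers = [4, 7, 2, 9, 1, 5]
i = find_min_index(numbers)
print(i, numbers[i])                    # 4 1
```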
def average(arr: list) -> float:
    total = 0
    for value in arr:
        total += value
    return total / len(arr)     # beware: dividing by 0 if arr is empty
print(average([85, 92, 78, 90, 88])) # 86.6
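The empty-list case can be guarded before dividing. Returning None here is one possible design choice for this sketch (a program might equally report an error), and `safe_average` is a hypothetical name.

```python
def safe_average(arr: list):
    if len(arr) == 0:
        return None                 # nothing to average, so avoid dividing by zero
    total = 0
    for value in arr:
        total += value
    return total / len(arr)

print(safe_average([85, 92, 78, 90, 88]))   # 86.6
print(safe_average([]))                     # None
```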
def count_occurrences(arr: list, target) -> int:
    count = 0
    for item in arr:
        if item == target:
            count += 1
    return count
print(count_occurrences(["A","B","A","C","A"], "A")) # 3
Bubble sort repeatedly compares adjacent pairs and swaps them if out of order, so the largest value "bubbles" to the end each pass. The swapped flag stops early if a pass makes no swaps (already sorted).
def bubble_sort(arr: list):
    n = len(arr)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):      # the last i items are already in place
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]   # swap
                swapped = True
        if not swapped:
            break                       # no swaps this pass -> already sorted
data = [5, 1, 4, 2]
bubble_sort(data)
print(data) # [1, 2, 4, 5]
Tracing the first pass over [5, 1, 4, 2] makes the mechanism concrete:
| Compare | Before | Swap? | After |
|---|---|---|---|
| arr[0],arr[1] (5,1) | [5,1,4,2] | yes | [1,5,4,2] |
| arr[1],arr[2] (5,4) | [1,5,4,2] | yes | [1,4,5,2] |
| arr[2],arr[3] (5,2) | [1,4,5,2] | yes | [1,4,2,5] |
After one pass the largest value (5) has reached the end; later passes sort the rest. This connects to the Algorithms topic, where bubble sort's O(n²) cost is compared with merge and insertion sort.
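To check a hand trace, it can help to record the list after each pass. The sketch below is a variation on bubble sort written for that purpose; `bubble_sort_passes` is a hypothetical name, and it sorts a copy so the original list is left alone.

```python
def bubble_sort_passes(arr: list) -> list:
    data = arr[:]                   # work on a copy so the caller's list is unchanged
    snapshots = []                  # state of the list after each pass that swapped
    n = len(data)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if data[j] > data[j + 1]:
                data[j], data[j + 1] = data[j + 1], data[j]
                swapped = True
        if not swapped:
            break                   # a pass with no swaps is not recorded
        snapshots.append(data[:])
    return snapshots

for number, state in enumerate(bubble_sort_passes([5, 1, 4, 2]), start=1):
    print(f"After pass {number}: {state}")
# After pass 1: [1, 4, 2, 5]
# After pass 2: [1, 2, 4, 5]
```

The third pass makes no swaps, so the swapped flag ends the loop early, which is exactly the early exit described above.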
A 2-D array is an array of arrays — a grid of rows and columns, used for tables, game boards and matrices. Access is grid[row][column].
grid = [
    [1, 2, 3, 4],       # row 0
    [5, 6, 7, 8],       # row 1
    [9, 10, 11, 12],    # row 2
]
print(grid[0][0]) # 1 (row 0, column 0)
print(grid[1][2]) # 7 (row 1, column 2)
print(grid[2][3]) # 12 (row 2, column 3)
Traversal uses nested loops — the outer over rows, the inner over columns:
for row in range(len(grid)):
    for col in range(len(grid[row])):
        print(grid[row][col], end=" ")
    print()     # newline after each row
# OCR-style pseudocode for the same nested traversal
declare grid : array[3][4] of integer
for row = 0 to 2
    for col = 0 to 3
        print(grid[row][col])
    next col
next row
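The same nested traversal supports the staple algorithms on a grid. As a sketch, the functions below total each row and each column; the function names are illustrative, and `column_totals` assumes a non-empty grid in which every row has the same length.

```python
grid = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]

def row_totals(grid: list) -> list:
    totals = []
    for row in range(len(grid)):
        total = 0
        for col in range(len(grid[row])):
            total += grid[row][col]
        totals.append(total)            # one total per row
    return totals

def column_totals(grid: list) -> list:
    totals = [0] * len(grid[0])         # assumes every row has the same length
    for row in range(len(grid)):
        for col in range(len(grid[row])):
            totals[col] += grid[row][col]
    return totals

print(row_totals(grid))       # [10, 26, 42]
print(column_totals(grid))    # [15, 18, 21, 24]
```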