String and Array Manipulation

Two data structures appear in almost every program you will ever write: the string (a sequence of characters such as names, messages and file contents) and the array (an ordered collection of values, implemented as a list in Python). This lesson develops fluency with both, because the H446 exam routinely asks you to write, trace or correct algorithms that index, slice, search and transform them. For strings we cover indexing and slicing, concatenation, the common methods (upper, lower, find, split, replace, strip), and ASCII / ord / chr conversions, which are the gateway to ciphers and validation. For arrays we cover creation and access, traversal, linear search, the staple algorithms (max/min, sum/average, count) and two-dimensional arrays for grids and tables. Every idea is shown in Python with OCR-style pseudocode alongside.

A theme to hold throughout: indices in Python and OCR-style pseudocode start at 0, and a large share of the bugs in this topic are off-by-one errors, such as reading one element too few or too many, or slicing one character short. The cure is the same every time: trace the first and last iterations by hand. A second theme is immutability: a Python string cannot be changed in place, and any "edit" actually builds a new string. This explains a whole class of surprises and connects directly to how strings are stored.

Spec Mapping

Within H446 2.2.1 this lesson covers string and array handling. You should be able to:

perform core string operations: find the length, index a single character, take a substring (slice), concatenate, and use methods such as upper/lower, find, replace, split and strip;
convert between characters and their numeric codes using ord and chr, and reason about ASCII ranges;
understand that strings are immutable and what that implies for "editing" them;
create, index, slice and traverse one-dimensional arrays/lists, and apply the staple algorithms (linear search, max/min, sum/average, count);
create, index and traverse two-dimensional arrays with nested loops for grids, tables and matrices;
move comfortably between Python and OCR-style pseudocode for all of the above.

These points paraphrase the specification; nothing is quoted verbatim.

Strings: Indexing and Slicing

A string is an ordered, 0-indexed sequence of characters. Indexing picks one character; slicing s[start:end] returns the substring from start up to but not including end.

word = "COMPUTER"
#       01234567   <- each character's index
print(word[0])      # 'C'  (first character)
print(word[7])      # 'R'  (last character)
print(word[-1])     # 'R'  (negative indices count from the end)
print(len(word))    # 8

word = "COMPUTER"
print(word[0:4])    # 'COMP'  (indices 0,1,2,3 — index 4 is EXCLUDED)
print(word[4:])     # 'UTER'  (from 4 to the end)
print(word[:3])     # 'COM'   (from start up to index 3)
print(word[::-1])   # 'RETUPMOC'  (a reversed copy — step of -1)

The half-open rule (end is excluded) is the number-one slicing trap. A handy check: provided 0 ≤ a ≤ b ≤ len(s), the length of s[a:b] is exactly b - a, so word[0:4] has 4 - 0 = 4 characters. The same idea in OCR pseudocode often uses a start position and a length instead of an end index:

# OCR-style pseudocode: substring(string, startPosition, numberOfCharacters)
firstFour = word.substring(0, 4)   # "COMP" — start at 0, take 4 characters
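
The link between the two notations can be sketched in Python. The helper name substring below is an assumption made for illustration (it is not a Python built-in): it converts a start position and a length into the equivalent slice.

```python
def substring(s: str, start: int, length: int) -> str:
    # Hypothetical helper mirroring the OCR-style substring(start, length).
    # A Python slice needs an END index, which is start + length.
    return s[start:start + length]


word = "COMPUTER"
print(substring(word, 0, 4))   # COMP
print(substring(word, 4, 3))   # UTE
print(word[4:7])               # UTE  (the same slice written directly)
```

The habit to build is converting between "end index" and "number of characters" deliberately, since mixing the two is a common source of off-by-one errors.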

Common String Operations

Operation	Python	OCR-style	Result for the example
Length	`len(s)`	`s.length`	`len("Hello")` → `5`
Concatenation	`s1 + s2`	`s1 + s2`	`"Hi" + " there"` → `"Hi there"`
Substring / slice	`s[a:b]`	`s.substring(a, n)`	`"Hello"[1:4]` → `"ell"`
Character at index	`s[i]`	`s[i]`	`"Hello"[0]` → `"H"`
Find / search	`s.find(sub)`	`s.find(sub)`	`"Hello".find("ll")` → `2`
Upper case	`s.upper()`	`s.upper`	`"hi".upper()` → `"HI"`
Lower case	`s.lower()`	`s.lower`	`"HI".lower()` → `"hi"`
Strip whitespace	`s.strip()`	—	`" hi ".strip()` → `"hi"`
Replace	`s.replace(a, b)`	—	`"cat".replace("c","b")` → `"bat"`
Split	`s.split(d)`	—	`"a,b,c".split(",")` → `["a","b","c"]`

A note on find: it returns the index of the first match, or -1 if the substring is absent — that -1 sentinel is examinable. split is the everyday way to break a CSV line into fields, and join is its inverse:

record = "Alice,17,A"
fields = record.split(",")          # ['Alice', '17', 'A']
print(fields[1])                    # '17'  (still a string!)
rebuilt = ",".join(fields)          # 'Alice,17,A'  (join is the inverse of split)
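
A short sketch of handling the -1 sentinel from find may help here. The function name describe_position is an assumption chosen for illustration; the point is that the result is compared with -1 explicitly.

```python
def describe_position(text: str, sub: str) -> str:
    position = text.find(sub)          # index of the first match, or -1
    if position == -1:                 # test the sentinel explicitly
        return f"'{sub}' not found"
    return f"'{sub}' found at index {position}"


print(describe_position("Hello", "ll"))   # 'll' found at index 2
print(describe_position("Hello", "H"))    # 'H' found at index 0
print(describe_position("Hello", "z"))    # 'z' not found
```

Writing `if text.find(sub):` instead would be a bug: a match at index 0 counts as false, while the -1 sentinel counts as true.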

Strings are Immutable

In Python a string cannot be changed in place. Trying to assign to one character raises an error:

name = "cat"
# name[0] = "b"        # TypeError: 'str' object does not support item assignment
name = "b" + name[1:]   # instead, BUILD a new string -> "bat"

Every operation that appears to "edit" a string — replace, upper, slicing — actually returns a brand-new string and leaves the original untouched. A classic consequence: text.upper() on its own does nothing visible unless you capture the result.

text = "hello"
text.upper()            # creates "HELLO" then THROWS IT AWAY — text is still "hello"
print(text)             # hello
text = text.upper()     # correct: reassign to keep the new string
print(text)             # HELLO

This connects to how strings are stored: because the characters are fixed, the language can share and optimise them safely. Arrays/lists, by contrast, are mutable — you can change an element in place — which is the central difference between the two structures and a frequent exam discriminator.

Building a string up character by character

Because strings are immutable, "adding" to one in a loop does not modify it — each += creates a whole new string and discards the old. The pattern below is perfectly readable and standard at A-Level, but it is worth understanding what is happening underneath:

def shout(text: str) -> str:
    result = ""                       # start with an empty string
    for char in text:
        result += char.upper()        # each += BUILDS A NEW string, reassigning result
    return result

print(shout("hello"))                 # HELLO

Each iteration takes the current result, makes a brand-new longer string, and rebinds result to it; the previous string is thrown away. For the short inputs in exam work this is completely fine and is the expected style. The deeper point — examinable as an efficiency observation — is that repeatedly concatenating in a long loop does more work than it appears to, because each step copies everything built so far. The usual remedy is to collect the pieces in a list (which is mutable, so appending is cheap) and join them once at the end:

def shout_efficient(text: str) -> str:
    pieces = []                        # a mutable list of fragments
    for char in text:
        pieces.append(char.upper())    # cheap append, no copying of the whole result
    return "".join(pieces)             # one final build of the string

This is a neat illustration of choosing the right structure for the job: the immutable string is ideal for a final, fixed value, but a mutable list is the better workspace while the value is still being assembled — the same mutability distinction, applied to performance.

String Traversal

Iterating a string visits each character. Two idioms — directly over the characters, or over the indices when you also need the position:

text = "Hello"
for char in text:                 # simplest: gives each character
    print(char)

for i in range(len(text)):        # when you also need the index
    print(f"Index {i}: {text[i]}")
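
As an illustration of when the index form is needed, the sketch below collects every position at which a character occurs. The function name positions_of is an assumption made for this example.

```python
def positions_of(text: str, target: str) -> list:
    positions = []
    for i in range(len(text)):        # index form: we need i, not just the character
        if text[i] == target:
            positions.append(i)
    return positions


print(positions_of("Hello", "l"))     # [2, 3]
print(positions_of("Hello", "z"))     # []
```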

ASCII, `ord` and `chr`

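Every character is stored as a number. ord(c) gives a character's code; chr(n) turns a code back into a character. In Python these codes are Unicode code points, which coincide with ASCII for the letters, digits and punctuation used here. At this level you should know the ASCII ranges.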

Function	Does	Example
`ord(c)`	character → code	`ord("A")` → `65`
`chr(n)`	code → character	`chr(65)` → `"A"`

Characters	ASCII range
Digits `0`–`9`	48–57
Uppercase `A`–`Z`	65–90
Lowercase `a`–`z`	97–122

Two facts drop out of these ranges and are widely useful. First, the gap from any uppercase letter to its lowercase form is 97 - 65 = 32, so you can change case with arithmetic. Second, because the letters are contiguous, you can do alphabet arithmetic — the basis of the Caesar cipher:

def caesar_encrypt(text: str, shift: int) -> str:
    result = ""
    for char in text:
        if char.isalpha():
            base = ord("A") if char.isupper() else ord("a")   # anchor of this case
            # subtract base -> 0..25, shift, wrap with % 26, add base back
            shifted = (ord(char) - base + shift) % 26
            result += chr(base + shifted)
        else:
            result += char                                    # leave spaces/punctuation
    return result

print(caesar_encrypt("Hello World", 3))   # Khoor Zruog

The % 26 is what makes Z wrap round to A; subtracting base first maps the letter to the range 0–25 so the modulo behaves. This single example exercises traversal, ord/chr, conditionals and building a new string — which is why ciphers are such common exam fodder. It links straight to the Encryption topic in the wider course.
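
The first fact above, the fixed gap of 32 between the cases, can be sketched in the same style. The function name to_lower_ascii is an assumption for illustration; in real code the built-in lower method does this job.

```python
def to_lower_ascii(text: str) -> str:
    result = ""
    for char in text:
        if "A" <= char <= "Z":                 # codes 65..90 only
            result += chr(ord(char) + 32)      # 65 -> 97, ..., 90 -> 122
        else:
            result += char                     # digits, spaces, lowercase unchanged
    return result


print(to_lower_ascii("Hello World 42"))        # hello world 42
```

The range check matters: adding 32 to a character that is not an uppercase letter would turn it into something unrelated.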

String Searching and Validation

def count_vowels(text: str) -> int:
    vowels = "aeiouAEIOU"
    count = 0
    for char in text:                 # a linear search/scan over the characters
        if char in vowels:
            count += 1
    return count

print(count_vowels("Hello World"))    # 3

The string test methods are the backbone of input validation — checking a value's shape before trusting it (linking to the defensive programming of the previous lesson):

value = "Hello123"
print(value.isalpha())       # False — contains digits
print(value.isdigit())       # False — contains letters
print(value.isalnum())       # True  — all letters/digits
print(value.startswith("He"))  # True
print(value.endswith("23"))  # True
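
These tests can be combined into a single validation routine. The sketch below uses assumed rules invented purely for illustration (3 to 12 characters, letters and digits only, starting with a letter); a real system would define its own.

```python
def is_valid_username(name: str) -> bool:
    # Assumed rules for this example only:
    #   3 to 12 characters, letters/digits only, first character a letter.
    if len(name) < 3 or len(name) > 12:    # length check
        return False
    if not name.isalnum():                 # character-set check
        return False
    if not name[0].isalpha():              # format check on the first character
        return False
    return True


print(is_valid_username("Hello123"))   # True
print(is_valid_username("1abc"))       # False (starts with a digit)
print(is_valid_username("ab"))         # False (too short)
print(is_valid_username("bad name"))   # False (contains a space)
```

Checking the length first means name[0] is only reached when the string is known to be non-empty.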

Type Conversion Between Strings and Numbers

Keyboard input read with input() always arrives as a string, so converting between types is a constant task. str, int and float move between the worlds; list and join move between a string and its characters.

text = str(42)              # "42"   number -> string
n = int("42")              # 42     string -> integer
x = float("3.14")          # 3.14   string -> float

chars = list("Hello")       # ['H','e','l','l','o']   string -> list of characters
joined = "".join(chars)     # "Hello"                  list -> string

Remember that a digit character and an integer are different: "5" + "3" is "53" (concatenation), whereas int("5") + int("3") is 8.
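
Conversion can fail: int raises a ValueError when the text is not a whole number. A minimal sketch of guarding against that follows; the function name to_int_or_none and the choice of None as the failure value are assumptions for illustration.

```python
def to_int_or_none(raw: str):
    try:
        return int(raw.strip())     # strip stray spaces, then convert
    except ValueError:              # raised for text such as "abc" or "4.2"
        return None


print(to_int_or_none("42"))         # 42
print(to_int_or_none(" 17 "))       # 17
print(to_int_or_none("4.2"))        # None
print(to_int_or_none("abc"))        # None
```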

Arrays (Lists in Python)

An array is an ordered, 0-indexed collection. In Python these are lists, and — unlike strings — they are mutable.

numbers = [10, 20, 30, 40, 50]
print(numbers[0])     # 10   (first)
print(numbers[-1])    # 50   (last)
print(numbers[1:3])   # [20, 30]  (slice — same half-open rule as strings)
numbers[2] = 35       # lists ARE mutable: now [10, 20, 35, 40, 50]

Operation	Python	Does
Length	`len(a)`	number of elements
Append	`a.append(x)`	add `x` to the end
Insert	`a.insert(i, x)`	insert `x` at index `i`
Remove	`a.remove(x)`	delete first `x`
Pop	`a.pop(i)`	remove and return element `i`
Sort	`a.sort()`	sort in place
Index	`a.index(x)`	index of first `x`
Membership	`x in a`	is `x` present?
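
The operations in the table can be traced on one small list. The values below are made up for illustration; each comment shows the list after that line runs.

```python
scores = [30, 10, 20]
scores.append(40)          # [30, 10, 20, 40]
scores.insert(1, 15)       # [30, 15, 10, 20, 40]
scores.remove(10)          # [30, 15, 20, 40]
last = scores.pop(3)       # returns 40; scores is now [30, 15, 20]
scores.sort()              # [15, 20, 30]

print(scores)              # [15, 20, 30]
print(last)                # 40
print(scores.index(20))    # 1
print(25 in scores)        # False
```

Note that remove and index both raise a ValueError if the value is absent, so a membership test with in is a sensible check beforehand.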

Common Array Algorithms

These five algorithms recur constantly and should be writable from memory in both notations.

Linear search

def linear_search(arr: list, target) -> int:
    for i in range(len(arr)):
        if arr[i] == target:
            return i          # found — return its position
    return -1                 # not found — the -1 sentinel

numbers = [4, 7, 2, 9, 1, 5]
print(linear_search(numbers, 9))   # 3
print(linear_search(numbers, 6))   # -1

Maximum and minimum

The pattern: assume the first element is the best so far, then update whenever you beat it. Starting the loop at index 1 (not 0) avoids comparing the first element with itself.

def find_max(arr: list):
    maximum = arr[0]                  # best-so-far starts as the first element
    for i in range(1, len(arr)):
        if arr[i] > maximum:
            maximum = arr[i]
    return maximum

print(find_max([4, 7, 2, 9, 1, 5]))   # 9
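
The same best-so-far pattern gives the minimum, and tracking the index rather than the value tells you where it is as well. This is a sketch that assumes a non-empty list; the function name find_min_index is chosen for illustration.

```python
def find_min_index(arr: list) -> int:
    min_index = 0                          # best-so-far is the first position
    for i in range(1, len(arr)):
        if arr[i] < arr[min_index]:        # strictly smaller, so the first minimum wins ties
            min_index = i
    return min_index


numbers = [4, 7, 2, 9, 1, 5]
position = find_min_index(numbers)
print(position, numbers[position])         # 4 1
```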

Sum and average

def average(arr: list) -> float:
    total = 0
    for value in arr:
        total += value
    return total / len(arr)           # beware: raises ZeroDivisionError if arr is empty

print(average([85, 92, 78, 90, 88]))  # 86.6
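
One way to handle the empty-list case is to check the length before dividing. The sketch below returns None for an empty list; that choice, and the name safe_average, are assumptions for illustration (returning 0 or raising an error are equally defensible designs).

```python
def safe_average(arr: list):
    if len(arr) == 0:                 # guard: nothing to average
        return None
    total = 0
    for value in arr:
        total += value
    return total / len(arr)


print(safe_average([2, 4, 6]))        # 4.0
print(safe_average([]))               # None
```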

Counting occurrences

def count_occurrences(arr: list, target) -> int:
    count = 0
    for item in arr:
        if item == target:
            count += 1
    return count

print(count_occurrences(["A","B","A","C","A"], "A"))   # 3

Bubble sort (traced)

Bubble sort repeatedly compares adjacent pairs and swaps them if out of order, so the largest value "bubbles" to the end each pass. The swapped flag stops early if a pass makes no swaps (already sorted).

def bubble_sort(arr: list):
    n = len(arr)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):        # the last i items are already in place
            if arr[j] > arr[j + 1]:
                arr[j], arr[j + 1] = arr[j + 1], arr[j]   # swap
                swapped = True
        if not swapped:
            break                          # no swaps this pass -> already sorted

data = [5, 1, 4, 2]
bubble_sort(data)
print(data)                                # [1, 2, 4, 5]

Tracing the first pass over [5, 1, 4, 2] makes the mechanism concrete:

Compare	Before	Swap?	After
`arr[0],arr[1]` (5,1)	[5,1,4,2]	yes	[1,5,4,2]
`arr[1],arr[2]` (5,4)	[1,5,4,2]	yes	[1,4,5,2]
`arr[2],arr[3]` (5,2)	[1,4,5,2]	yes	[1,4,2,5]

After one pass the largest value (5) has reached the end; later passes sort the rest. This connects to the Algorithms topic, where bubble sort's O(n²) cost is compared with merge and insertion sort.
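
To see the later passes as well, the sketch below records the list after every pass that made a swap. It works on a copy so the original is untouched; the function name bubble_sort_passes is an assumption for illustration.

```python
def bubble_sort_passes(arr: list) -> list:
    items = arr[:]                     # work on a copy of the input
    snapshots = []                     # state of the list after each swapping pass
    n = len(items)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:
            break                      # a pass with no swaps: already sorted
        snapshots.append(items[:])
    return snapshots


for state in bubble_sort_passes([5, 1, 4, 2]):
    print(state)
# [1, 4, 2, 5]
# [1, 2, 4, 5]
```

The first printed line matches the end of the traced pass above; the third pass makes no swaps, so the early exit fires and nothing more is recorded.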

2-D Arrays

A 2-D array is an array of arrays — a grid of rows and columns, used for tables, game boards and matrices. Access is grid[row][column].

grid = [
    [1,  2,  3,  4],     # row 0
    [5,  6,  7,  8],     # row 1
    [9, 10, 11, 12],     # row 2
]
print(grid[0][0])        # 1   (row 0, column 0)
print(grid[1][2])        # 7   (row 1, column 2)
print(grid[2][3])        # 12  (row 2, column 3)

Traversal uses nested loops — the outer over rows, the inner over columns:

for row in range(len(grid)):
    for col in range(len(grid[row])):
        print(grid[row][col], end=" ")
    print()              # newline after each row

# OCR-style pseudocode for the same nested traversal
array grid[3,4]
for row = 0 to 2
    for col = 0 to 3
        print(grid[row,col])
    next col
next row
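
The nested-loop pattern extends naturally to totals. The sketch below computes row and column totals for the same grid values; it assumes a rectangular grid (every row the same length), and the function names are chosen for illustration.

```python
def row_totals(grid: list) -> list:
    totals = []
    for row in range(len(grid)):
        total = 0
        for col in range(len(grid[row])):
            total += grid[row][col]
        totals.append(total)
    return totals


def column_totals(grid: list) -> list:
    totals = []
    for col in range(len(grid[0])):       # assumes at least one row
        total = 0
        for row in range(len(grid)):      # loops swapped: fix the column, walk the rows
            total += grid[row][col]
        totals.append(total)
    return totals


grid = [
    [1,  2,  3,  4],
    [5,  6,  7,  8],
    [9, 10, 11, 12],
]
print(row_totals(grid))       # [10, 26, 42]
print(column_totals(grid))    # [15, 18, 21, 24]
```

Swapping which loop is outermost is the only difference between the two functions; the access is still grid[row][col] in both.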
