NumPy Fundamentals

NumPy (Numerical Python) is the foundational library for numerical computing in Python. It provides a powerful N-dimensional array object, functions that support broadcasting, and tools for integrating C/C++ and Fortran code. Many data science libraries in Python, including Pandas, Matplotlib, and Scikit-Learn, are built on top of NumPy, and others such as TensorFlow interoperate closely with NumPy arrays.

Why NumPy?

Python lists are flexible but slow for numerical computation. The table below compares them with NumPy arrays:

Feature	Python List	NumPy Array
Speed	Slow (interpreted loops)	Fast (compiled C code)
Memory	High overhead per element	Compact, contiguous memory
Operations	Element-by-element loops required	Vectorised operations
Broadcasting	Not supported	Automatic shape matching
Type	Mixed types allowed	Homogeneous type

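Because the loop runs in compiled code, a NumPy operation on a million elements is often 10 to 100 times faster than the equivalent Python loop. The exact ratio depends on the operation and the hardware.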
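The speed gap can be checked with a rough timing sketch. The array size and the helper names `python_sum_of_squares` and `numpy_sum_of_squares` are illustrative assumptions, and the measured ratio will vary with hardware and NumPy version.

```python
import timeit

import numpy as np

n = 1_000_000
values = list(range(n))
arr = np.arange(n, dtype=np.int64)  # explicit int64 so the sum cannot overflow


def python_sum_of_squares(xs):
    # Plain Python loop: one interpreted iteration per element
    total = 0
    for x in xs:
        total += x * x
    return total


def numpy_sum_of_squares(a):
    # Vectorised: the multiplication and the sum both run in compiled code
    return int(np.sum(a * a))


loop_time = timeit.timeit(lambda: python_sum_of_squares(values), number=3)
vec_time = timeit.timeit(lambda: numpy_sum_of_squares(arr), number=3)
print(f"loop: {loop_time:.3f}s  vectorised: {vec_time:.3f}s")
```

Both functions return the same value, so only the timings differ.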

Creating Arrays

From Python Lists

import numpy as np

# 1D array
a = np.array([1, 2, 3, 4, 5])
print(a)         # [1 2 3 4 5]
print(type(a))   # <class 'numpy.ndarray'>

# 2D array (matrix)
b = np.array([[1, 2, 3],
              [4, 5, 6]])
print(b)
# [[1 2 3]
#  [4 5 6]]

Using Built-in Functions

# Array of zeros
np.zeros((3, 4))         # 3x4 matrix of 0.0

# Array of ones
np.ones((2, 3))          # 2x3 matrix of 1.0

# Array filled with a value
np.full((2, 2), 7)       # 2x2 matrix of 7

# Identity matrix
np.eye(3)                # 3x3 identity matrix

# Evenly spaced values (specify step; stop is excluded)
np.arange(0, 10, 2)      # [0, 2, 4, 6, 8]

# Evenly spaced values (specify count; stop is included)
np.linspace(0, 1, 5)     # [0.0, 0.25, 0.5, 0.75, 1.0]

# Random values
np.random.rand(3, 3)     # 3x3 matrix of uniform random floats in [0, 1)
np.random.randn(3, 3)    # 3x3 matrix of samples from the standard normal distribution
np.random.randint(0, 10, size=(2, 3))  # 2x3 matrix of random ints in [0, 10)
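The same three kinds of random array can also be drawn from a `Generator` object, which current NumPy documentation recommends for new code. The following is a minimal sketch; the seed value 42 is an arbitrary choice made here for reproducibility.

```python
import numpy as np

# A seeded generator gives the same numbers on every run
rng = np.random.default_rng(seed=42)

uniform = rng.random((3, 3))              # uniform floats in [0, 1)
normal = rng.standard_normal((3, 3))      # standard normal samples
ints = rng.integers(0, 10, size=(2, 3))   # integers in [0, 10)

print(uniform.shape, normal.shape, ints.shape)
```

Note that the generator methods take the shape as a single tuple, unlike `np.random.rand(3, 3)`.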

Array Attributes

a = np.array([[1, 2, 3], [4, 5, 6]])

a.shape     # (2, 3) — 2 rows, 3 columns
a.ndim      # 2 — number of dimensions
a.size      # 6 — total number of elements
a.dtype     # int64 on most platforms (int32 on Windows before NumPy 2.0)
a.itemsize  # 8 bytes per element for int64
a.nbytes    # 48 total bytes (size * itemsize)
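The size attributes are related: `nbytes` is always `size * itemsize`. The sketch below fixes the dtype explicitly so the numbers do not depend on the platform default. The variable name `small` is illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]], dtype=np.int64)

print(a.size, a.itemsize, a.nbytes)   # 6 8 48
print(a.nbytes == a.size * a.itemsize)  # True

# The same values stored as 8-bit integers need one byte each
small = a.astype(np.int8)
print(small.itemsize, small.nbytes)   # 1 6
```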

Data Types

NumPy supports many data types:

Type	Description
`np.int32`	32-bit integer
`np.int64`	64-bit integer
`np.float32`	32-bit floating point
`np.float64`	64-bit floating point (default for floats)
`np.bool_`	Boolean
`np.complex128`	Complex number (two 64-bit floats)
`np.str_`	Unicode string

# Specify dtype at creation
a = np.array([1, 2, 3], dtype=np.float32)
print(a.dtype)  # float32

# Cast to a different type
b = a.astype(np.int64)
print(b.dtype)  # int64
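Two details of dtype handling are easy to miss: casting floats to integers with `astype` truncates toward zero rather than rounding, and a list that mixes integers and floats is stored as floats. A short sketch, with illustrative variable names:

```python
import numpy as np

f = np.array([1.9, -1.9, 2.5])

# astype drops the fractional part (truncation toward zero)
truncated = f.astype(np.int64)
print(truncated)  # [ 1 -1  2]

# Round first if nearest-integer behaviour is wanted.
# np.rint rounds halves to the nearest even value, so 2.5 becomes 2.
rounded = np.rint(f).astype(np.int64)
print(rounded)    # [ 2 -2  2]

# Mixed int/float input is upcast to a single float dtype
mixed = np.array([1, 2.5])
print(mixed.dtype)  # float64
```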

Indexing and Slicing

1D Arrays

a = np.array([10, 20, 30, 40, 50])

a[0]      # 10 — first element
a[-1]     # 50 — last element
a[1:4]    # [20, 30, 40] — slice
a[::2]    # [10, 30, 50] — every other element

2D Arrays

b = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

b[0, 0]    # 1 — row 0, col 0
b[1, 2]    # 6 — row 1, col 2
b[0, :]    # [1, 2, 3] — entire first row
b[:, 1]    # [2, 5, 8] — entire second column
b[0:2, 1:3]  # [[2, 3], [5, 6]] — sub-matrix
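Basic slices return views, not copies: writing through a slice changes the original array. The sketch below shows this and how `.copy()` avoids it. The names `view` and `independent` are illustrative.

```python
import numpy as np

b = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

view = b[0:2, 1:3]                 # shares memory with b
print(np.shares_memory(view, b))   # True

view[0, 0] = 99                    # writes through to b[0, 1]
print(b[0, 1])                     # 99

independent = b[0:2, 1:3].copy()   # separate buffer
independent[0, 0] = -1
print(b[0, 1])                     # 99, unchanged by the copy
```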

Boolean Indexing

a = np.array([10, 20, 30, 40, 50])

mask = a > 25
print(mask)     # [False False  True  True  True]
print(a[mask])  # [30 40 50]

# Or directly
print(a[a > 25])  # [30 40 50]
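Masks can be combined with `&`, `|`, and `~` (each comparison needs its own parentheses), and a mask can also be used on the left-hand side of an assignment. A minimal sketch, with illustrative thresholds:

```python
import numpy as np

a = np.array([10, 20, 30, 40, 50])

# Combine conditions element-wise; Python's `and`/`or` do not work here
in_range = (a > 15) & (a < 45)
print(a[in_range])    # [20 30 40]
print(a[~in_range])   # [10 50]

# Assign through a mask: cap every value above 25
capped = a.copy()
capped[capped > 25] = 25
print(capped)         # [10 20 25 25 25]
```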

Fancy Indexing

a = np.array([10, 20, 30, 40, 50])
indices = [0, 2, 4]
print(a[indices])  # [10 30 50]
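Fancy indexing also works on 2D arrays, and unlike basic slicing it returns a copy. The sketch below illustrates both points; the index lists are arbitrary examples.

```python
import numpy as np

b = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])

# A list of row indices selects whole rows
print(b[[0, 2]])
# [[1 2 3]
#  [7 8 9]]

# Paired row and column lists select individual elements:
# (0, 2), (1, 1), (2, 0)
anti_diagonal = b[[0, 1, 2], [2, 1, 0]]
print(anti_diagonal)  # [3 5 7]

# The result is a copy, so modifying it leaves the source untouched
a = np.array([10, 20, 30, 40, 50])
picked = a[[0, 2, 4]]
picked[0] = -1
print(a[0])  # 10
```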

Vectorised Operations

NumPy operations are applied element-wise without writing loops:

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])
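A minimal sketch of element-wise arithmetic on two arrays like these follows. The particular operations shown are illustrative choices, and each one produces a new array of the same shape.

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

print(a + b)    # [11 22 33 44]
print(b - a)    # [ 9 18 27 36]
print(a * b)    # [ 10  40  90 160]
print(b / a)    # [10. 10. 10. 10.]  (true division gives floats)
print(a ** 2)   # [ 1  4  9 16]

# A scalar is applied to every element
print(a * 2)    # [2 4 6 8]

# Universal functions and comparisons are element-wise too
roots = np.sqrt(a)
print(a > 2)    # [False False  True  True]
```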
