NumPy

Author

Ken Pu

1 Construction of NDArrays

2 Creating arrays

NDArray (n-dimensional array) is a high-performance data structure provided by the NumPy library.

At first glance, one might think that NDArrays are redundant as they can be equally represented as nested Python lists.

The main advantages of NumPy NDArrays are:

  1. NDArrays are often orders of magnitude more efficient (in both memory and speed) than native Python lists for numerical computation.

  2. The NumPy library provides a large collection of functions that support complex (and efficient) processing of data stored in NDArray format.

From this point on, we will simply refer to NDArrays as arrays.
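As a rough check of the memory claim above, the following sketch compares the bytes held by an array with an estimate for the equivalent Python list. The length n and the use of sys.getsizeof as an estimate are assumptions made for illustration; the exact list figure depends on the Python version.

```python
import sys
import numpy as np

n = 1000  # illustrative length, not from the text
py_list = list(range(n))
np_array = np.arange(n, dtype=np.int64)  # dtype fixed so the result is platform independent

# Bytes in the array's data buffer: 8 bytes per int64 element.
array_bytes = np_array.nbytes

# Rough estimate for the list: the list object plus every int object it refers to.
list_bytes = sys.getsizeof(py_list) + sum(sys.getsizeof(v) for v in py_list)

print("array bytes:", array_bytes)
print("list bytes (estimate):", list_bytes)
```

The estimate ignores details such as CPython's sharing of small integers, so it should be read as an indication of scale rather than a measurement.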

2.1 Importing the numpy module

We will typically use the shorthand alias np for the numpy module.

import numpy as np

2.2 The np.array(...) constructor converts Python iterables into arrays.

x_array = np.array([1, 2, 3])

x_array
array([1, 2, 3])
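The constructor is not limited to lists. As a small sketch, the same array can be built from a tuple or a range object; the variable names here are illustrative only.

```python
import numpy as np

from_list = np.array([1, 2, 3])      # a list, as above
from_tuple = np.array((1, 2, 3))     # a tuple
from_range = np.array(range(1, 4))   # a range object

# All three hold the same values.
print(np.array_equal(from_list, from_tuple))
print(np.array_equal(from_list, from_range))
```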

3 Datatype of numpy arrays

  • Unlike Python lists, arrays are homogeneous, which means that all the elements must have the same data type.

  • The data type of an array can be accessed using the array.dtype property.

x_array.dtype
dtype('int64')
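Because arrays are homogeneous, mixing integers and floats in the input yields a single common data type. The sketch below illustrates this and also shows the optional dtype argument of np.array, which is not used elsewhere in these notes.

```python
import numpy as np

# Integers mixed with a float: every element is stored as a float.
mixed = np.array([1, 2.5, 3])
print(mixed.dtype)

# The data type can also be requested explicitly.
as_float32 = np.array([1, 2, 3], dtype=np.float32)
print(as_float32.dtype)
```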

3.1 Shape of a numpy array

Arrays can have multiple axes, each having multiple entries. In this example, we only have a single axis, which has three elements.

  • The lengths of the individual axes are called the shape of the array. It can be accessed as a tuple of integers via the array.shape property.

  • The total number of elements in the array is given by its size property.

x_array.shape
(3,)
x_array.size
3
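The relationship between shape and size can be checked directly: the size is the product of the axis lengths. The three-axis array below is an illustrative assumption, and ndim (the number of axes) is a NumPy property not otherwise discussed here.

```python
import numpy as np

a = np.zeros((2, 3, 4))   # three axes with lengths 2, 3 and 4

print(a.shape)
print(a.ndim)             # number of axes
print(a.size)             # total number of elements

# The size equals the product of the entries of the shape.
product_of_shape = int(np.prod(a.shape))
print(product_of_shape == a.size)
```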

4 Multidimensional arrays

An array can have an arbitrary number of axes, each with its own length.

For example, consider the following array.

    190   170   130
    105   122   111
  • It has two axes. The rows are axis 0, and the columns are axis 1.

  • The shape is (2, 3), which means that axis 0 has two entries (the rows) and axis 1 has three entries (the columns).

  • Each element is uniquely identified by a tuple of integers, known as the coordinates. The value of 170 is specified by the coordinate (0, 1) because the value exists in the first row (row=0) and second column (column=1).

4.1 Constructing multidimensional arrays

We can construct arrays with multiple axes using np.array(...) by providing a nested Python list.

x_array = np.array([
    [190, 170, 130],
    [105, 122, 111]
])

x_array
array([[190, 170, 130],
       [105, 122, 111]])

4.2 Shape of multidimensional array

The shape now has two entries, one per axis:

x_array.shape
(2, 3)
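To connect the shape with the coordinates described earlier, the following sketch enumerates every coordinate of the array together with its value. It relies on np.ndindex, a NumPy helper that is not used elsewhere in these notes.

```python
import numpy as np

x_array = np.array([
    [190, 170, 130],
    [105, 122, 111]
])

# np.ndindex yields every coordinate tuple for the given shape, in row-major order.
coords = list(np.ndindex(x_array.shape))

for coord in coords:
    print(coord, "->", x_array[coord])
```

There are as many coordinates as there are elements, so len(coords) equals x_array.size.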

4.3 Indexing entries in a multidimensional array

We can index values, subarrays and more using array indexing.

Note: a more detailed discussion of the indexing syntax of NDArrays follows in a later section.

print("A value =",        x_array[0, 2])
print("The last value =", x_array[-1, -1])
print("A row =",          x_array[0])
print("A column =",       x_array[:, -1])
print("A subarray =\n",   x_array[:, 1:])
A value = 130
The last value = 111
A row = [190 170 130]
A column = [130 111]
A subarray =
 [[170 130]
 [122 111]]

5 More array constructors

NumPy has a vast collection of functions.

Here we only illustrate the flexibility and breadth of NumPy by showing several ways to construct an array:

  • zeros
  • ones
  • full array of constant value
  • integer sequences
  • uniform interval
  • random arrays

5.1 Making zeros

np.zeros(shape=(3, 3))
array([[0., 0., 0.],
       [0., 0., 0.],
       [0., 0., 0.]])

5.2 Making ones

np.ones(shape=(3, 3))
array([[1., 1., 1.],
       [1., 1., 1.],
       [1., 1., 1.]])

5.3 Making constants

np.full(shape=(3, 3), fill_value=123)
array([[123, 123, 123],
       [123, 123, 123],
       [123, 123, 123]])
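The outputs above hint at a difference in data types: np.zeros and np.ones produce floats by default, while np.full takes its type from fill_value. A minimal sketch, using the optional dtype argument that the examples above do not pass:

```python
import numpy as np

zeros_default = np.zeros(shape=(2, 2))                 # floats unless told otherwise
zeros_int = np.zeros(shape=(2, 2), dtype=int)          # request integers explicitly
full_int = np.full(shape=(2, 2), fill_value=123)       # integer fill value
full_float = np.full(shape=(2, 2), fill_value=123.0)   # float fill value

for name, arr in [("zeros_default", zeros_default), ("zeros_int", zeros_int),
                  ("full_int", full_int), ("full_float", full_float)]:
    print(name, arr.dtype)
```

The exact integer width (for example int64) can depend on the platform and NumPy version.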

5.4 Making a sequence

np.arange(start=0, stop=10, step=2)
array([0, 2, 4, 6, 8])

5.5 Making evenly spaced intervals

np.linspace(start=0, stop=1, num=10)
array([0.        , 0.11111111, 0.22222222, 0.33333333, 0.44444444,
       0.55555556, 0.66666667, 0.77777778, 0.88888889, 1.        ])
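The two constructors treat the stop value differently: np.arange excludes it, whereas np.linspace includes it by default. The sketch below contrasts them; the endpoint argument of np.linspace is not used in the example above and is shown here only for comparison.

```python
import numpy as np

# arange: stop is excluded, so 10 never appears.
seq = np.arange(start=0, stop=10, step=2)
print(seq)

# linspace: stop is included by default.
closed = np.linspace(start=0, stop=1, num=5)
print(closed)

# endpoint=False leaves the stop value out.
half_open = np.linspace(start=0, stop=1, num=5, endpoint=False)
print(half_open)
```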

5.6 Making uniformly random numbers

np.random.uniform(low=0, high=1, size=(3,3))
array([[0.20047814, 0.88189994, 0.98612809],
       [0.60495931, 0.29549187, 0.90444808],
       [0.850353  , 0.17078463, 0.22032825]])

5.7 Making Gaussian random numbers

np.random.normal(loc=0, scale=1, size=(3,3))
array([[-0.63704409,  1.31100406,  0.02671199],
       [-0.70561716, -0.89945186,  0.87917878],
       [ 0.79822736, -0.43878865, -1.75284735]])
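The random arrays above differ on every run. When repeatable results are needed, one option is a seeded generator. This is a sketch using np.random.default_rng, an alternative interface to the np.random functions shown above; the seed value 42 is arbitrary.

```python
import numpy as np

# Two generators created with the same seed produce the same numbers.
rng_a = np.random.default_rng(seed=42)
rng_b = np.random.default_rng(seed=42)

u_a = rng_a.uniform(low=0, high=1, size=(3, 3))
u_b = rng_b.uniform(low=0, high=1, size=(3, 3))
print(np.array_equal(u_a, u_b))

# The generator offers the same distributions, for example the Gaussian.
g = rng_a.normal(loc=0, scale=1, size=(3, 3))
print(g.shape)
```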

6 In-depth indexing of numpy arrays

6.1 Indexing single element in vectors

x1 = np.array([5, 0, 3, 3, 7, 9])
x1
array([5, 0, 3, 3, 7, 9])
x1[0]
5
x1[4]
7

6.2 Index from the end of the axis

x1[-1]
9
x1[-2]
7

6.3 Indexing single element in multidimensional array

x2 = np.array([[3, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])

x2
array([[3, 5, 2, 4],
       [7, 6, 8, 8],
       [1, 6, 7, 7]])
x2[0,0]
3
x2[2,0]
1

6.4 Negative indexes work too

x2[2, -1]
7
x2[-2, -1]
8
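A negative index counts back from the end of its axis, so it can always be rewritten as a non-negative one by adding the axis length. A small sketch of that equivalence, with illustrative variable names:

```python
import numpy as np

x2 = np.array([[3, 5, 2, 4],
               [7, 6, 8, 8],
               [1, 6, 7, 7]])

n_rows, n_cols = x2.shape

# Index -k on an axis of length n refers to position n - k.
negative = x2[-2, -1]
positive = x2[n_rows - 2, n_cols - 1]
print(negative == positive)
```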

6.5 Subarrays of vectors

x = np.arange(10)
x
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
x[4:7]
array([4, 5, 6])

6.6 Default start or end of ranges

x[:5]
array([0, 1, 2, 3, 4])
x[5:]
array([5, 6, 7, 8, 9])

6.7 Range with steps

x[::2]
array([0, 2, 4, 6, 8])
x[1::2]
array([1, 3, 5, 7, 9])
x[::-1]
array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0])
x[5::-2]
array([5, 3, 1])
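The start:stop:step notation can be made explicit with a Python slice object, which helps when reading a case such as x[5::-2]. The following sketch resolves the omitted stop and lists the positions that the slice visits.

```python
import numpy as np

x = np.arange(10)

# A slice object is the explicit form of the start:stop:step syntax.
s = slice(5, None, -2)   # same as writing 5::-2

# slice.indices(n) fills in the defaults for an axis of length n.
start, stop, step = s.indices(len(x))
visited = list(range(start, stop, step))

print(visited)
print(x[s])
```

Since x holds the values 0 to 9 in order, the visited positions and the selected values coincide.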

6.8 Subarrays of multidimensional arrays

x2 = np.array([[12, 5, 2, 4],
               [ 7, 6, 8, 8],
               [ 1, 6, 7, 7]])

x2
array([[12,  5,  2,  4],
       [ 7,  6,  8,  8],
       [ 1,  6,  7,  7]])
x2[:2, :3]
array([[12,  5,  2],
       [ 7,  6,  8]])
x2[:3, ::2]
array([[12,  2],
       [ 7,  8],
       [ 1,  7]])
x2[::-1, ::-1]
array([[ 7,  7,  6,  1],
       [ 8,  8,  6,  7],
       [ 4,  2,  5, 12]])
x2[:, 0]
array([12,  7,  1])
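Note that x2[:, 0] returns a one-dimensional array: indexing an axis with a single integer removes that axis. As a sketch, using a slice of length one instead keeps the result two-dimensional.

```python
import numpy as np

x2 = np.array([[12, 5, 2, 4],
               [ 7, 6, 8, 8],
               [ 1, 6, 7, 7]])

column_1d = x2[:, 0]     # integer index: the column axis is dropped
column_2d = x2[:, 0:1]   # slice of length one: the column axis is kept

print(column_1d.shape)
print(column_2d.shape)
```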

7 Some array transformations

7.1 Reshaping

np.arange(9).reshape(3, 3)
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
x = np.array([1,2,3])
x
array([1, 2, 3])
x.reshape((1, 3))
array([[1, 2, 3]])
x.reshape((3, 1))
array([[1],
       [2],
       [3]])
x[np.newaxis, :]
array([[1, 2, 3]])
x[:, np.newaxis]
array([[1],
       [2],
       [3]])
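The examples above show that np.newaxis and reshape can produce the same result. The sketch below checks this, and also uses -1 as a placeholder dimension, a reshape feature not shown above that asks NumPy to infer one axis length from the total size.

```python
import numpy as np

x = np.array([1, 2, 3])

# np.newaxis and reshape give the same row and column arrays.
print(np.array_equal(x[np.newaxis, :], x.reshape((1, 3))))
print(np.array_equal(x[:, np.newaxis], x.reshape((3, 1))))

# -1 lets NumPy work out the remaining axis length.
column = x.reshape(-1, 1)
grid = np.arange(12).reshape(3, -1)
print(column.shape)
print(grid.shape)
```

A reshape must preserve the total number of elements, so the inferred length always divides the size exactly.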

7.2 Array concatenation and splitting

x1 = np.array([1, 2, 3])
y1 = np.array([3, 2, 1])
np.concatenate([x1, y1])
array([1, 2, 3, 3, 2, 1])
x2 = np.array([
    [1, 2, 3],
    [-1, -2, -3]
])

y2 = np.array([
    [3, 2, 1],
    [-3, -2, -1]
])
np.concatenate([x2, y2], axis=0)
array([[ 1,  2,  3],
       [-1, -2, -3],
       [ 3,  2,  1],
       [-3, -2, -1]])
np.concatenate([x2, y2], axis=1)
array([[ 1,  2,  3,  3,  2,  1],
       [-1, -2, -3, -3, -2, -1]])
np.vstack([x1, y2])
array([[ 1,  2,  3],
       [ 3,  2,  1],
       [-3, -2, -1]])
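For two-dimensional inputs, np.vstack behaves like concatenation along axis 0. The sketch below checks this and does the same for np.hstack and axis 1; np.hstack is a NumPy function not used above. It also shows how np.vstack treats the one-dimensional x1 as a single row.

```python
import numpy as np

x1 = np.array([1, 2, 3])
x2 = np.array([[1, 2, 3], [-1, -2, -3]])
y2 = np.array([[3, 2, 1], [-3, -2, -1]])

stacked_rows = np.vstack([x2, y2])
stacked_cols = np.hstack([x2, y2])
print(np.array_equal(stacked_rows, np.concatenate([x2, y2], axis=0)))
print(np.array_equal(stacked_cols, np.concatenate([x2, y2], axis=1)))

# vstack promotes the 1-D x1 to one row; concatenate needs that step done by hand.
as_row = x1[np.newaxis, :]
mixed = np.vstack([x1, y2])
print(np.array_equal(mixed, np.concatenate([as_row, y2], axis=0)))
```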

8 Splitting arrays

x = [1, 2, 3, 99, 99, 3, 2, 1]
x
[1, 2, 3, 99, 99, 3, 2, 1]
x1, x2, x3 = np.split(x, [3, 5])
print('x1 =', x1)
print('x2 =', x2)
print('x3 =', x3)
x1 = [1 2 3]
x2 = [99 99]
x3 = [3 2 1]
grid = np.arange(16).reshape((4, 4))
grid
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
upper, lower = np.vsplit(grid, [2])
print('upper =\n', upper)
print('lower =\n', lower)
upper =
 [[0 1 2 3]
 [4 5 6 7]]
lower =
 [[ 8  9 10 11]
 [12 13 14 15]]
left, right = np.hsplit(grid, [2])
print("left =\n", left)
print("right =\n", right)
left =
 [[ 0  1]
 [ 4  5]
 [ 8  9]
 [12 13]]
right =
 [[ 2  3]
 [ 6  7]
 [10 11]
 [14 15]]
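Splitting is the inverse of concatenation: joining the pieces again restores the original array. The sketch below checks this round trip and also passes an integer instead of a list of split points, which asks np.split for that many equal-sized pieces.

```python
import numpy as np

x = np.array([1, 2, 3, 99, 99, 3, 2, 1])

# Split at positions 3 and 5, then join the pieces again.
parts = np.split(x, [3, 5])
restored = np.concatenate(parts)
print(np.array_equal(restored, x))

# An integer argument requests equal-sized sections.
halves = np.split(x, 2)
print(halves[0], halves[1])
```

With an integer argument, the axis length must be divisible by the number of sections.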

9 Replication

Sometimes it is useful (or even necessary) to transform an array by replicating some or all of its elements.

This type of transformation is supported by np.repeat and np.tile.

Repeat allows us to replicate slices of an array along a given axis.

Tile allows us to replicate the entire array into a grid of replicas.

x = np.arange(12).reshape(3,4)
x
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
np.repeat(x, 2, axis=0)
array([[ 0,  1,  2,  3],
       [ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [ 8,  9, 10, 11]])
np.repeat(x, 2, axis=1)
array([[ 0,  0,  1,  1,  2,  2,  3,  3],
       [ 4,  4,  5,  5,  6,  6,  7,  7],
       [ 8,  8,  9,  9, 10, 10, 11, 11]])
np.tile(x, (1,2))
array([[ 0,  1,  2,  3,  0,  1,  2,  3],
       [ 4,  5,  6,  7,  4,  5,  6,  7],
       [ 8,  9, 10, 11,  8,  9, 10, 11]])
np.tile(x, (2,2))
array([[ 0,  1,  2,  3,  0,  1,  2,  3],
       [ 4,  5,  6,  7,  4,  5,  6,  7],
       [ 8,  9, 10, 11,  8,  9, 10, 11],
       [ 0,  1,  2,  3,  0,  1,  2,  3],
       [ 4,  5,  6,  7,  4,  5,  6,  7],
       [ 8,  9, 10, 11,  8,  9, 10, 11]])
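The difference between the two functions is easiest to see on a one-dimensional array. In this sketch, the vector v is an illustrative assumption: np.repeat duplicates each element in place, while np.tile repeats the whole array end to end.

```python
import numpy as np

v = np.array([1, 2, 3])

repeated = np.repeat(v, 2)   # each element appears twice in a row
tiled = np.tile(v, 2)        # the whole vector appears twice

print(repeated)
print(tiled)

# Shapes for the 3 x 4 array used above.
x = np.arange(12).reshape(3, 4)
print(np.repeat(x, 2, axis=0).shape)
print(np.tile(x, (2, 2)).shape)
```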