What is Numpy
Numpy is a Python library that wraps C and Fortran code, which makes it very fast. It is built around N-dimensional arrays called ndarrays, which are commonly used to represent matrices.
I used it a lot during my studies at GA Tech. Recently I had to use it again for some data analysis, and I realized that I had forgotten many of the basic operations.
I had to dive into some old school materials in order to get back on track.
Numpy basic recipes
Below is a very quick and simple cookbook, that I have prepared for myself obviously :-) But I also hope it can be helpful to anybody else.
I know, I know.. there is no figure whatsoever. Graphic representation is always the best. You can check lesson 4 of this course on udacity.com, The power of Numpy. It's free :-)
Assume we have imported the Numpy library as follows: import numpy as np
and that nd1 is an ndarray that represents a matrix of dimension #row x #col.
Assume also that all the following operations stay within the bounds of the matrix dimensions.
nd1[0,0] is element at row 0, column 0
nd1[-1, 1] is element at last row, column 1
nd1[-2, 2] is element at the row preceding the last row, column 2
nd1[0:3, 1:7] is the region in the matrix from row 0 to the row before 3 (thus row 2) and from column 1 to the column before 7 (thus column 6)
a colon (:) with nothing after means all the way to the end
nd1[1:, 2], is the region from row 1 to the last, and column 2
nd1[:, 1:4] is the region from row 0 to the last (all the rows) and column 1 to column 3.
nd1[:, 1:8:2], select all elements in row 0 to the last and columns 1, 3, 5, 7 (start at column 1 and increment by 2 each time, stopping before the upper bound 8)
nd1[0,:] = 1, assign value 1 to entire row 0
nd1[:, 1] = [1,2,5,9], assign the list [1,2,5,9] to column 1, assuming the number of rows for nd1 is 4. Very important to pay attention to the dimension.
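To try the indexing recipes above, here is a small sketch. The 4x8 matrix built with np.arange is my own choice for illustration (each value tells you where it sits), not something the recipes require.

```python
import numpy as np

# Hypothetical 4x8 matrix holding 0..31, so nd1[r, c] == r * 8 + c.
nd1 = np.arange(32).reshape(4, 8)

first = nd1[0, 0]          # row 0, column 0
last_row = nd1[-1, 1]      # last row, column 1
region = nd1[0:3, 1:7]     # rows 0..2, columns 1..6
odd_cols = nd1[:, 1:8:2]   # all rows, columns 1, 3, 5, 7

# Assignments are done on a copy so the selections above stay unchanged.
m = nd1.copy()
m[0, :] = 1                # the scalar is broadcast across the whole row 0
m[:, 1] = [1, 2, 5, 9]     # four values for four rows; lengths must match

print(region.shape)        # (3, 6)
print(odd_cols[0])         # [1 3 5 7]
print(m[:, 1])             # [1 2 5 9]
```

Assigning a list of the wrong length to a column (say three values for four rows) raises a ValueError, which is why the dimensions matter.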
np.array([1,2,5,0]), convert the list [1,2,5,0] to 1D array
np.array([(2,3), (7,10)]), convert a list of tuples (or a list of lists) to an array (a 2D array in this example: [ [2,3], [7,10] ])
a = np.array([(2,3), (7,10)]), a.sum() is the sum of all elements, which is 22; a.sum(axis=0) is the column sums [9, 13]; a.sum(axis=1) is the row sums [5, 17]
similarly we can use a.min(), a.max(), a.mean() same way as the example above
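The axis argument is easy to mix up, so here is a quick check on the same 2x2 array. The way I remember it: axis=0 collapses the rows (one result per column), axis=1 collapses the columns (one result per row).

```python
import numpy as np

a = np.array([(2, 3), (7, 10)])

total = a.sum()              # 2 + 3 + 7 + 10 = 22
col_sums = a.sum(axis=0)     # [2 + 7, 3 + 10] = [9, 13]
row_sums = a.sum(axis=1)     # [2 + 3, 7 + 10] = [5, 17]

# min, max and mean take the same axis argument.
col_mins = a.min(axis=0)     # [2, 3]
row_maxs = a.max(axis=1)     # [3, 10]
row_means = a.mean(axis=1)   # [2.5, 8.5]

print(total, col_sums, row_sums)
```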
np.empty(5), create a 1D array of 5 elements without initializing them (the values are whatever happened to be in memory, not zeros)
np.empty((2,3)), create an uninitialized 2x3 array (2D array)
np.ones((2,4)), create a 2x4 array of ones (by default the datatype is float64, i.e. double)
np.ones((2,2), dtype=np.int_), create a 2x2 array of ones (datatype integer)
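A short sketch of the creation functions above, mainly to show the shapes and datatypes you get back. The variable names are just for illustration.

```python
import numpy as np

e = np.empty((2, 3))                   # contents are arbitrary until you fill them
o = np.ones((2, 4))                    # default datatype is float64
oi = np.ones((2, 2), dtype=np.int_)    # integer ones

e[:] = 0.0                             # fill the array before reading from it

print(e.shape, o.dtype, oi.dtype)
```

Since np.empty does not set the values, only use it when you are going to overwrite every element anyway.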
a is in interval [x, y) means number x<= a < y
np.random.random((3,4)), create a 3x4 array of random values uniformly sampled from [0.0, 1.0)
np.random.rand(3,4), same as line above (similar to matlab syntax)
np.random.normal(size=(2,3)), create a 2x3 array of samples from a Gaussian (normal) distribution with mean 0 and standard deviation 1
np.random.normal(10, 5, size=(2,3)), same as above but mean=10, standard deviation=5
np.random.randint(5), a single random integer in [0, 5)
np.random.randint(0, 5), a single random integer in [0, 5)
np.random.randint(0,5, size=5), 5 random integers in [0, 5) as a 1D array
np.random.randint(0, 5, size=(2,3)), 2x3 array of random integers between [0, 5)
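The random recipes above can be checked like this. The call to np.random.seed is my addition, only there so that repeated runs give the same numbers; the recipes work without it.

```python
import numpy as np

np.random.seed(0)  # assumption: fixed seed, only to make the run repeatable

u = np.random.random((3, 4))               # shape passed as one tuple
u2 = np.random.rand(3, 4)                  # same thing, dimensions as separate arguments
g = np.random.normal(10, 5, size=(2, 3))   # mean 10, standard deviation 5
single = np.random.randint(5)              # one integer in [0, 5)
r = np.random.randint(0, 5, size=(2, 3))   # 2x3 integers in [0, 5)

print(u.shape, u2.shape, g.shape, r.shape)
print(r.min() >= 0, r.max() < 5)           # the upper bound 5 is never drawn
```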
Assuming we have the following a=np.random.random((5,4)) (the shape is passed as a single tuple)
a.shape, returns the shape of the Numpy array as a tuple, here (5, 4), so a.shape[0]==5 and a.shape[1]==4
a.size, is the total number of elements in the Numpy array, here 20
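A tiny sketch for shape and size on a 5x4 array:

```python
import numpy as np

a = np.random.random((5, 4))

rows, cols = a.shape     # the shape is a tuple, so it can be unpacked
print(rows, cols)        # 5 4
print(a.size)            # 20, which is rows * cols
print(a.ndim)            # 2, the number of dimensions
```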
Assuming we have the following a=np.array([1,5,4,7,9,11,3])
a[ [0,2,3] ], returns the array [1,4,7], where [0,2,3] is a list of indices passed to the 1D array
a[ a < 8], returns the array [1,5,4,7,3] (Numpy masking)
a[ a < 8] = 0, replaces all the numbers less than 8 by 0 in the original array, in place; a is then [0,0,0,0,9,11,0]
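Here is the index list and masking example spelled out step by step. The intermediate name mask is mine, to make the boolean array visible.

```python
import numpy as np

a = np.array([1, 5, 4, 7, 9, 11, 3])

picked = a[[0, 2, 3]]    # elements at indices 0, 2, 3 -> [1, 4, 7]

mask = a < 8             # boolean array, True where the element is below 8
small = a[mask]          # a new array with the selected elements: [1, 5, 4, 7, 3]

a[mask] = 0              # modifies a in place
print(picked)            # [1 4 7]
print(small)             # [1 5 4 7 3]
print(a)                 # [ 0  0  0  0  9 11  0]
```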
Further reference
Assuming array A and array B are 2D arrays and are the same shape. Mathematical operations in Numpy arrays are element wise. Thus considering A + B, it will result in a new array where element at position x,y in array A is added to element at x,y in array B. It's the same for A-B, A/B and A*B.
To do proper matrix multiplication, we need to use np.matmul (or the equivalent @ operator).
This is not an exhaustive list, but these are the basics I have used since my school days and still use whenever I am doing some data analysis in Python. For further Numpy functions you can refer to the Numpy reference page.
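To see the difference between element-wise multiplication and matrix multiplication, here is a small sketch with two 2x2 arrays that I picked for illustration:

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

added = A + B                # element-wise: [[6, 8], [10, 12]]
elementwise = A * B          # element-wise: [[5, 12], [21, 32]]
product = np.matmul(A, B)    # matrix product: [[19, 22], [43, 50]]
same = A @ B                 # the @ operator gives the same matrix product

print(elementwise)
print(product)
```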