Many statistical procedures such as least squares involve solving large systems of equations. We saw a numerical algorithm to accomplish this task by splitting the computation into two subroutines. As we discussed, we form an augmented matrix $[A\ b]$. We then:
i) reduce the system into upper triangular form
ii) back substitute the values of the variables to find a solution.
Let's define a matrix A to represent the left side of the system of equations:
\begin{align} x_1 + 2x_2 + x_3 &= 5\\ 3x_2 + 2x_2 + 4x_3 &= 17 \\ 4x_1 + 4x_2 + 3x_3 &= 26 \end{align}import numpy as np
A = np.array([[1,2,1],[3,2,4],[4,4,3]])
b = np.array([5,17,26])
Can you combine the system into an augmented matrix using a numpy function? It should look like this:
system = np.array([[1,2,1,5],[3,2,4,17],[4,4,3,26]])
print system
check the dimensions:
system.shape
# colon in the first position means include all rows
# we're only selecting the last one
b = system[:,system.shape[1]-1]
b
# include all rows, all columns except for the last one
a = system[:,:system.shape[1]-1]
a
Here was the code I gave you based on the algorithm we derived in class. Did you find the bug?
def solver_w_typo(system):
"""
Write a solver that takes as input a system of equations via an augmented matrix
and returns the system in upper triangular form.
"""
# define a and b from the augmented matrix
b = system[:,system.shape[1]-1]
a = system[:,:system.shape[1]-1]
m = np.zeros([a.shape[0],a.shape[1]])
for j in range(a.shape[1]-1):
print "j", j
for i in range(j+1,a.shape[0]):
print "i", i
# a[i,j] is the component of the variable we want to eliminate
# a[j,j] is the component we are multiplying
# the multiplier is defined by the ratio of the two
print "a[i,j]", a[i,j], "\na[j,j]", a[j,j]
m[i,j] = a[i,j]/a[j,j]
print "m[i,j]", m[i,j]
for k in range(j, a.shape[0]):
print "k", k
print "a[i,k]", a[i,k]
a[i,k] = a[i,k] - m[i,j]*a[j,k]
print "new a[i,k]", a[i,k]
b[i] = b[i] - m[i,j]*b[j]
U = np.zeros([system.shape[0],system.shape[1]])
U[:,:system.shape[1]-1] = a
U[:,system.shape[1]-1] = b
return U
The index in the third for loop should start at j not at j+1. Here it is correct:
def solver(system):
"""
Write a solver that takes as input a system of equations via an augmented matrix
and returns the system in upper triangular form.
-----
Input
* system : Augmented matrix of the form [A b]
Output
* U : A matrix in upper triangular form
* a : RHS of system upper triangular matrix
* b : LHS of system
"""
# define a and b from the augmented matrix
b = system[:,system.shape[1]-1]
a = system[:,:system.shape[1]-1]
m = np.zeros([a.shape[0],a.shape[1]])
# iterate over columns
for j in range(a.shape[1]-1):
# iterate over rows
for i in range(j+1,a.shape[0]):
# a[i,j] is the component of the variable we want to eliminate
# a[j,j] is the component we are multiplying
# the multiplier is defined by the ratio of the two
m[i,j] = a[i,j]/a[j,j]
# iterate over components of the equation
for k in range(j, a.shape[0]):
a[i,k] = a[i,k] - m[i,j]*a[j,k]
# update the right side of the augmented system
b[i] = b[i] - m[i,j]*b[j]
# store the results in a new upper triangular matrix U
U = np.zeros([system.shape[0],system.shape[1]])
U[:,:system.shape[1]-1] = a
U[:,system.shape[1]-1] = b
return U,a,b
Let's test it out:
U,a,b = solver(system)
print "the system in upper triangular form:\n", U
print "the RHS of the system:\n", b
We now need to plug in our values of x iteratively, starting with $x_n$. This algorithm is known as backsubstitution:
def backsubstitute(U,y):
x = np.zeros(U.shape[0])
for ind in range(U.shape[0]):
i = U.shape[0] - ind - 1
x[i] = y[i]
for j in range(i+1,U.shape[0]):
x[i] = x[i] - U[i,j]*x[j]
x[i] = x[i]/U[i,i]
return x
Let's try it out on a larger example:
aug_mat1 = np.array([[1,2,1,-1,5],[3,2,4,4,16],[4,4,3,4,22],[2,0,1,5,15]])
print aug_mat1
U,a,b = solver(aug_mat1)
print "the system in upper triangular form:\n", U
Calling backsubstitute on our upper triangular matrix and RHS of the system returns the solution:
backsubstitute(U,b)
Try your own system of equations by entering large augmented matrices. Time your code and plot the running time as a function of input size for various size matrices [A b].
How do we know whether or not a system will have a solution? We also saw a function called the determinant:
The summand on the left side refers to all permutations of rows (or equivalently columns) in the matrix. What is the order of the number of permutations? That is, how many permutations are there of rows (or columns) for an n x n matrix?
Can you write a function that takes as input a list (or a numpy array) and returns all permutations?
Recall that we defined the number of inversions as the number of pairs out of order in the list. That is, $ \forall i < j$, count all pairs such that $a[i] > a[j]$.
The final term is a product over all components of the permutation. That is, we multiply n components of A where the components are selected by first taking the row via the permutation and the column via the original index.