class: center, middle, inverse, title-slide .title[ # Matrix algebra ] .author[ ###
MACS 33000
University of Chicago ]

---

# Misc updates

* Grading up
* Check GitHub for updates
* Academic integrity session @ 11:30

---

# Learning objectives

* Linear vs matrix algebra!!
* Define vector and matrix
* Visualize vectors in multiple dimensions
* Demonstrate applicability of linear algebra to text analysis and cosine similarity
* Perform basic algebraic operations on vectors and matrices

---

# Linear algebra

* Data stored in **matrices**
* Matrix algebra: the study of matrices in finite dimensions
* Higher dimensional spaces
  * Flood of big data, stored in many dimensions
* Linear algebra
  * Algebra of matrices
  * Geometry of high dimensional space
  * Calculus (multivariable) in many dimensions
* Very important for regression/machine learning/deep learning

---

# Points and vectors

* A point in `\(\Re^1\)`
  * `\(1\)`
  * `\(\pi\)`
  * `\(e\)`

--

* An ordered pair in `\(\Re^2 = \Re \times \Re\)`
  * `\((1,2)\)`
  * `\((0,0)\)`
  * `\((\pi, e)\)`

--

* An ordered triple in `\(\Re^3 = \Re \times \Re \times \Re\)`
  * `\((3.1, 4.5, 6.1132)\)`

--

* An ordered `\(n\)`-tuple in `\(\Re^n = \Re \times \Re \times \ldots \times \Re\)`
  * `\((a_{1}, a_{2}, \ldots, a_{n})\)`

---

# One dimensional example

<img src="05-matrix-algebra_files/figure-html/one-d-1.png" width="864" style="display: block; margin: auto;" />

---

# Two dimensional example

<img src="05-matrix-algebra_files/figure-html/two-d-1.png" width="864" style="display: block; margin: auto;" />

---

# Two dimensional example

<img src="05-matrix-algebra_files/figure-html/two-d2-1.png" width="864" style="display: block; margin: auto;" />

---

# Two dimensional example

<img src="05-matrix-algebra_files/figure-html/two-d3-1.png" width="864" style="display: block; margin: auto;" />

---

# Three dimensional example

* (Latitude, Longitude, Elevation)
  * `\((1,2,3)\)`
  * `\((0,1,2)\)`

--

## `\(N\)`-dimensional example

* Individual campaign donation records

`$$\mathbf{x} = (1000, 0, 10, 50, 15, 4, 0, 0, 0, \ldots, 2400000000)$$`

* Proportion of the vote for Trump in each county

`$$\mathbf{y} = (0.8, 0.5, 0.6, \ldots, 0.2)$$`

* Run an experiment, assess feeling thermometer ratings of an elected official

`$$\mathbf{t} = (0, 100, 50, 70, 80, \ldots, 100)$$`

---

# Vectors (*source: Gill*)

* Serial listing of numbers where the order matters
* Can add and subtract at will, either by adding / subtracting a constant (*scalar*) or adding / subtracting a second vector
* If adding / subtracting vectors, they need to be *conformable* -- otherwise, *nonconformable*
* Multiplying vectors (not fun): either the dot `\(\cdot\)` or cross `\(\times\)` product (note: these are different!)
  * dot product, AKA inner product: produces a scalar (we'll get to this later)
  * cross product: produces a vector
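---

# Vector operations in R (sketch)

A minimal base-R sketch of the operations above -- addition, scalar multiplication, and the dot product (the vectors here are made up for illustration):

```r
# Two conformable vectors (same length)
u <- c(1, 2, 3)
v <- c(2, 3, 1)

u + v       # elementwise addition: 3 5 4
2 * u       # scalar multiplication: 2 4 6
sum(u * v)  # dot (inner) product: 1*2 + 2*3 + 3*1 = 11
```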
---

# Vector/scalar addition/multiplication

$$ `\begin{aligned} \mathbf{u} & = [1, 2, 3, 4, 5] \\ \mathbf{v} & = [1, 1, 1, 1, 1] \\ k & = 2 \end{aligned}` $$

--

$$ `\begin{aligned} \mathbf{u} + \mathbf{v} & = [1 + 1, 2 + 1, 3+ 1, 4 + 1, 5+ 1] = [2, 3, 4, 5, 6] \\ k \mathbf{u} & = [2 \times 1, 2 \times 2, 2 \times 3, 2 \times 4, 2 \times 5] = [2, 4, 6, 8, 10] \\ k \mathbf{v} & = [2 \times 1,2 \times 1,2 \times 1,2 \times 1,2 \times 1] = [2, 2, 2, 2, 2] \end{aligned}` $$

---

# Linear combinations

* Linear combinations of vectors `\(\mathbf{a}\)` and `\(\mathbf{b}\)`

`$$\mathbf{a} + \mathbf{b}$$`

`$$2\mathbf{a} - 3\mathbf{b}$$`

* Generic form

`$$\alpha \mathbf{a} + \beta\mathbf{b} + \gamma\mathbf{c} + \delta\mathbf{d} + \ldots$$`

---

# Linear combinations: (in)dependence

* **Linear independence**: no vector in the set can be reached from the others through transformation (scaling and adding)
* Another way to think about it: you would have to go *beyond the space (span)* of the set to include an additional vector
* Formally, a set of vectors is **linearly dependent** if some combination of them, each multiplied by a scalar, gives the zero vector with at least one scalar non-zero (e.g. `\(\alpha_1\mathbf{v_1} + \alpha_2\mathbf{v_2} + ... + \alpha_n\mathbf{v_n}=\mathbf{0}\)` where at least one `\(\alpha_i\)` is non-zero); if only the all-zero scalars work, the set is **linearly independent**
* Overall, we may be in a situation where the number of dimensions helps us understand the size of the solution set

---

# Linear (in)dependence

`$$\mathbf{a} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 2 \\ 2 \end{bmatrix}, \quad \mathbf{c} = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$$`

Are any of these linearly dependent?

--

These are (together) linearly dependent.

--

`$$\mathbf{a} - 2\mathbf{b} + \mathbf{c} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$`

---

# Detecting linear dependence

* `\(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\)` are linearly dependent if and only if there exist scalars `\(\alpha_1, \alpha_2, \ldots, \alpha_k\)` *not all zero* such that

`$$\alpha_1 \mathbf{v}_1 + \alpha_2 \mathbf{v}_2 + \ldots + \alpha_k \mathbf{v}_k = \mathbf{0}$$`

---

# Example: determining (in)dependence

`$$\mathbf{a} = \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 4 \\ 1 \\ 3 \end{bmatrix}, \quad \mathbf{c} = \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}$$`

If we're not sure what to do, we set this up as a system to solve. If we're multiplying by scalars, it must be the case, for example, that `\(2\alpha + 4\beta + 1\gamma = 0\)`, and so on. That is, we're trying to solve `\(\alpha \mathbf{a} + \beta\mathbf{b} + \gamma\mathbf{c} = \mathbf{0}\)`.

--

* Express as a **system of equations** (note: naming each line so you can see the transformations more easily on the next slide)

`$$\begin{aligned}2\alpha &+ 4\beta &+ \gamma &= 0 \text{ (}\mathbf{h}\text{) }\\ \alpha &+ \beta &+ \gamma &= 0 \text{ (}\mathbf{j}\text{) }\\ 2\alpha &+ 3\beta &+ 2\gamma &= 0\text{ (}\mathbf{k}\text{) } \end{aligned}$$`

---

## Aside: solving systems of equations

* Need at least x equations for x unknowns (e.g. 2 unknowns, 2 equations)
* Multiply/divide each equation and add equations together
* Different approaches, but often try to eliminate or 'kick out' variables to get down to solving for one variable
* Once you have one variable, continue!
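---

# Solving a small system in R (sketch)

As a quick illustration of the aside above (the numbers here are made up, not from our example), base R's `solve()` handles a square system of equations directly:

```r
# Two equations, two unknowns:
#   2x + 1y = 5
#   1x - 1y = 1
A <- matrix(c(2,  1,
              1, -1), nrow = 2, byrow = TRUE)
b <- c(5, 1)

solve(A, b)  # x = 2, y = 1
```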
---

# Solve the system of equations

Let's work through an example to evaluate whether our vectors are linearly independent or dependent.

.pull-left[

* **Step 1: write out equations**

`$$\begin{aligned} 2\alpha + 4\beta &+ \gamma &= 0 \text{ (}\mathbf{h}\text{) }\\ \alpha + \beta &+ \gamma &= 0 \text{ (}\mathbf{j}\text{) }\\ 2\alpha + 3\beta &+ 2\gamma &= 0\text{ (}\mathbf{k}\text{) } \end{aligned}$$`

* **Step 2: Add/subtract**

$$ `\begin{aligned} 2\alpha + 4\beta + \gamma &= 0 \\ \alpha + \beta + \gamma &= 0 \\ -\beta + \gamma &= 0 \end{aligned}` $$

]

.pull-right[

* **Step 1:** We re-write our vectors from the previous slide as equations, setting each row equal to zero. Note that we've named them to make following along a little easier.

* **Step 2:** We leave our first equation `\(\mathbf{h}\)` and second equation `\(\mathbf{j}\)` as-is. We then subtract `\(\mathbf{h}\)` from `\(\mathbf{k}\)` and replace the third row with the result. We can see from the third row that `\(\beta = \gamma\)`.

]

---

# Solve the system of equations

Let's work through an example to evaluate whether our vectors are linearly independent or dependent.

.pull-left[

* **Step 2: Add/subtract**

$$ `\begin{aligned} 2\alpha + 4\beta + \gamma &= 0 \\ \alpha + \beta + \gamma &= 0 \\ -\beta + \gamma &= 0 \end{aligned}` $$

* **Step 3: ... Solve ...**

$$ `\begin{aligned} 2\alpha + 5\gamma &= 0 \\ \alpha + 2\gamma &= 0 \\ -\beta + \gamma &= 0 \end{aligned}` $$

]

.pull-right[

* **Step 3**: Now, we continue to work, trying to solve for any of our three variables (three variables, three equations). We can sub in `\(\gamma\)` for `\(\beta\)` and simplify the first and middle equations.

]

---

# Solve the system of equations

Let's work through an example to evaluate whether our vectors are linearly independent or dependent.

.pull-left[

* **Step 3: ... Solve ...**

$$ `\begin{aligned} 2\alpha + 5\gamma &= 0 \\ \alpha + 2\gamma &= 0 \\ -\beta + \gamma &= 0 \end{aligned}` $$

* **Step 4: ... continue ...**

$$ `\begin{aligned} \gamma &= 0 \\ \alpha + 2\gamma &= 0 \\ -\beta + \gamma &= 0 \end{aligned}` $$

]

.pull-right[

* **Step 4**: We can then subtract 2 times our middle row (`\(\mathbf{j}\)`) from the top row (replacing `\(\mathbf{h}\)`) to get `\(\gamma = 0\)`. As `\(\gamma = \beta = 0\)`, we can substitute those into our middle equation.

* **Conclusion**: This means that `\(\alpha = 0\)` as well. Therefore, these vectors are linearly independent, as the only possible solution that satisfies our requirements is for all scalars to be zero (the trivial solution).

]

---

# Example

Now, let's try this together -- work with a partner. Are these vectors linearly dependent or independent?

`$$\mathbf{a} = \begin{bmatrix} 4 \\ 3 \\ 3 \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}, \quad \mathbf{c} = \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}$$`

---

# Example

* **Step 1: write out**

$$ `\begin{aligned} 4\alpha + 2\beta + 1\gamma &= 0 \\ 3\alpha + 1\beta + 1\gamma &= 0 \\ 3\alpha - 1\beta + 2\gamma &= 0 \end{aligned}` $$

--

* **Step 2: manipulate** (replace the second row with row two minus row three)

$$ `\begin{aligned} 4\alpha + 2\beta + 1\gamma &= 0 \\ 2\beta - 1\gamma &= 0 \\ 3\alpha - 1\beta + 2\gamma &= 0 \end{aligned}` $$

--

* **Step 3: Solve!!** (substitute `\(\gamma = 2\beta\)` into the first row)

$$ `\begin{aligned} 4\alpha + 4\beta &= 0 \\ 2\beta &=\gamma \\ \end{aligned}` $$

--

`$$\gamma = 2, \quad \beta = 1, \quad \alpha = -1$$`

*Is this our only solution?*
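---

# Checking (in)dependence in R (sketch)

A quick numerical check of the two examples above: the rank of the matrix whose columns are the vectors equals the number of vectors exactly when the vectors are linearly independent.

```r
# Columns are the vectors from the two examples
M1 <- cbind(c(2, 1, 2), c(4, 1, 3), c(1, 1, 2))    # first example
M2 <- cbind(c(4, 3, 3), c(2, 1, -1), c(1, 1, 2))   # partner example

qr(M1)$rank  # 3 -> independent (only the trivial solution)
qr(M2)$rank  # 2 -> dependent (non-trivial solutions exist)

# Verify the combination we found: -1*a + 1*b + 2*c = 0
-1 * c(4, 3, 3) + 1 * c(2, 1, -1) + 2 * c(1, 1, 2)
```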
---

class: center, middle, inverse

# Vectors

---

# Inner product: vector operations

Multiplying two vectors with the inner product produces a scalar (single number) output.

$$ `\begin{aligned} \mathbf{u} \cdot \mathbf{v} &= u_{1} v_{1} + u_{2}v_{2} + \ldots + u_{n} v_{n} \\ & = \sum_{i=1}^{N} u_{i} v_{i} \end{aligned}` $$

* AKA the **dot product**
* Results in a scalar value

---

# Inner product

* `\(\mathbf{u} = (1, 2, 3)\)` and `\(\mathbf{v} = (2, 3, 1)\)`

`$$\begin{aligned} \mathbf{u} \cdot \mathbf{v} & = 1 \times 2 + 2 \times 3 + 3 \times 1 \\ & = 2+ 6 + 3 \\ & = 11 \end{aligned}$$`

---

# Calculating vector length

The length of a vector, the vector norm, can be thought of as the distance of the vector from the origin.

<img src="05-matrix-algebra_files/figure-html/pythagorean-theorem-1.png" width="864" style="display: block; margin: auto;" />

---

# Vector norm

$$ `\begin{aligned} \| \mathbf{v}\| & = (\mathbf{v} \cdot \mathbf{v} )^{1/2} \\ & = (v_{1}^2 + v_{2}^{2} + v_{3}^{2} + \ldots + v_{n}^{2} )^{1/2} \end{aligned}` $$

--

* Vector norm of a three-dimensional vector `\(\mathbf{x} = (1,1,1)\)`:

`$$\begin{aligned}\| \mathbf{x}\| & = (\mathbf{x} \cdot \mathbf{x} )^{1/2} \\ & = (x_{1}^2 + x_{2}^{2} + x_{3}^{2})^{1/2} \\ & = (1 + 1 + 1)^{1/2} \\ &= \sqrt{3}\end{aligned}$$`

---

# Text analysis

```
        a abandoned abc ability able about above abroad absorbed absorbing abstract
       43         0   0       0    0    10     0      0        0         0        1
```

--

`$$(43,0,0,0,0,10,\dots)$$`

---

# Text analysis

$$ `\begin{aligned} \text{Doc1} & = (1, 1, 3, \ldots, 5) \\ \text{Doc2} & = (2, 0, 0, \ldots, 1) \\ \textbf{Doc1}, \textbf{Doc2} & \in \Re^{M} \end{aligned}` $$

---

# Inner product

$$ `\begin{aligned} \textbf{Doc1} \cdot \textbf{Doc2} & = (1, 1, 3, \ldots, 5) \cdot (2, 0, 0, \ldots, 1) \\ & = 1 \times 2 + 1 \times 0 + 3 \times 0 + \ldots + 5 \times 1 \\ & = 7 \end{aligned}` $$

---

# Length

$$ `\begin{aligned} \| \textbf{Doc1} \| & \equiv \sqrt{ \textbf{Doc1} \cdot \textbf{Doc1} } \\ & = \sqrt{(1, 1, 3, \ldots , 5) (1, 1, 3, \ldots, 5)' } \\ & = \sqrt{1^{2} +1^{2} + 3^{2} + 5^{2} } \\ & = 6 \end{aligned}` $$

---

# Cosine similarity

$$ `\begin{aligned} \cos (\theta) & \equiv \left(\frac{\textbf{Doc1} \cdot \textbf{Doc2}}{\| \textbf{Doc1}\| \|\textbf{Doc2} \|} \right) \\ & = \frac{7} { 6 \times 2.24} \\ & = 0.52 \end{aligned}` $$

---

# Measuring similarity

* Desirable properties for a useful similarity measure:
  * The **maximum** should be the document with itself
  * The **minimum** should be documents which have no words in common (orthogonal to one another)
  * Increasing when more of the same words are used
  * Normalize for document length
* We can look at the similarity and also calculate the angle between the documents (`\(\theta\)`)

---

# Similarity: Using the inner product

`$$(2,1) \cdot (1,4) = 6$$`

<img src="05-matrix-algebra_files/figure-html/inner-product-1.png" width="864" style="display: block; margin: auto;" />

---

# Length dependence

`$$(4,2) \cdot (1,4) = 12$$`

<img src="05-matrix-algebra_files/figure-html/inner-product-not-same-1.png" width="864" style="display: block; margin: auto;" />

---

# Cosine similarity

Allows us to look at the angle between them instead

<img src="05-matrix-algebra_files/figure-html/cosine-sim-1.png" width="864" style="display: block; margin: auto;" />

---

# Cosine similarity

$$ `\begin{aligned} (4,2) \cdot (1,4) &= 12 \\ \mathbf{a} \cdot \mathbf{b} &= \|\mathbf{a} \| \times \|\mathbf{b} \| \times \cos(\theta) \\ \frac{\mathbf{a} \cdot \mathbf{b}}{\|\mathbf{a} \| \times \|\mathbf{b} \|} &= \cos(\theta) \end{aligned}` $$

---

# Cosine similarity

$$ `\begin{aligned} \cos (\theta) & \equiv \left(\frac{\textbf{Doc1} \cdot \textbf{Doc2}}{\| \textbf{Doc1}\| \|\textbf{Doc2} \|} \right) \\ & = \frac{(2, 1) \cdot (1, 4)} {\| (2,1)\| \| (1,4) \|} \\ & = \frac{6} {(\sqrt{2^2 + 1^2}) (\sqrt{1^2 + 4^2})} \\ & = \frac{6} {(\sqrt{5}) (\sqrt{17})} \\ & \approx 0.65 \end{aligned}` $$

---

# Cosine similarity

$$ `\begin{aligned} \cos (\theta) & \equiv \left(\frac{\textbf{Doc1} \cdot \textbf{Doc2}}{\| \textbf{Doc1}\| \|\textbf{Doc2} \|} \right) \\ & = \frac{(4, 2) \cdot (1, 4)} {\| (4,2)\| \| (1,4) \|} \\ & = \frac{12} {(\sqrt{4^2 + 2^2}) (\sqrt{1^2 + 4^2})} \\ & = \frac{12} {(\sqrt{20}) (\sqrt{17})} \\ & \approx 0.65 \end{aligned}` $$

---

# Cosine similarity

$$ `\begin{aligned} \cos (\theta) & \equiv \left(\frac{\textbf{Doc3} \cdot \textbf{Doc2}}{\| \textbf{Doc3}\| \|\textbf{Doc2} \|} \right) \\ & = \frac{(1,2) \cdot (1, 4)} {\| (1,2)\| \| (1,4) \|} \\ & = \frac{9} {(\sqrt{1^2 + 2^2}) (\sqrt{1^2 + 4^2})} \\ & = \frac{9} {(\sqrt{5}) (\sqrt{17})} \\ & \approx 0.976 \end{aligned}` $$

--

Interpretation: closer to 1, more similar

---

# Exploring similarity

.pull-left[

<img src="05-matrix-algebra_files/figure-html/cosine-simd1-1.png" width="864" style="display: block; margin: auto;" />

]

.pull-right[

<img src="05-matrix-algebra_files/figure-html/cosine-simb-1.png" width="864" style="display: block; margin: auto;" />

]

We can see how the similarity of 0.65 has a larger `\(\theta\)` and the similarity of 0.976 has a smaller `\(\theta\)`. We can also use the inverse of cosine (arccos) to find the respective angles: approximately 0.86 radians (approx 50 degrees) vs 0.22 radians (approx 13 degrees).
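---

# Cosine similarity in R (sketch)

The same computations in a short base-R sketch (`cosine_sim()` is a helper defined here, not a packaged function):

```r
# Cosine similarity: inner product divided by the product of the norms
cosine_sim <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

cosine_sim(c(2, 1), c(1, 4))  # ~0.65
cosine_sim(c(4, 2), c(1, 4))  # ~0.65 (same direction as (2, 1), so same angle)
cosine_sim(c(1, 2), c(1, 4))  # ~0.976

acos(cosine_sim(c(2, 1), c(1, 4)))  # the angle itself, ~0.86 radians
```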
---

class: center, middle, inverse

# The Matrix, Part I

<img src="https://media.tenor.com/ang0VzOwbdAAAAAC/the-matrix-reloaded-matrix.gif" style="display: block; margin: auto;" />

---

# Matrices

* **Rectangular** arrangement (array) of numbers defined by two **axes**
  1. Rows
  1. Columns

`$$\mathbf{A} = \begin{bmatrix} a_{11} & a_{12} & \ldots & a_{1n} \\ a_{21} & a_{22} & \ldots & a_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ a_{m1} & a_{m2} & \ldots & a_{mn} \\ \end{bmatrix}$$`

---

# Example matrices: Fun facts

* Dimensionality matters
* The diagonal from upper left to lower right matters
  * Sometimes the opposite diagonal is referred to by a name, but this varies
* Matrices are referred to by their overall properties (elements) and/or the behavior of the diagonal
* Special names for *special* matrices.

---

# Special matrices

* **Diagonal matrix** (numbers only on the diagonal; zeros elsewhere)
* **Identity matrix** (only 1s on the diagonal; zeros elsewhere -- a special case of diagonal -- called `\(I\)`)
* **Square** (n rows = n columns)
* **Triangular matrix** (half of the matrix has zeros -- see below for the lower/upper distinction)
  * **lower triangular** (numbers in the lower half (diagonal down) and zeros above the diagonal)
  * **upper triangular** (numbers in the upper half (diagonal up) and zeros below the diagonal)
* **Zero matrix** (all zeros!)
* **Matrix of ones** (only ones; sometimes given the letter `\(J\)`)
* **Transpose** / transposition (a mathematical operation on a matrix; superscript `\(T\)`)
* **Inverse** (also a mathematical operation; superscript `\(-1\)`)
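---

# Special matrices in R (sketch)

A few of these in base R, as a quick sketch (the example values are made up):

```r
diag(c(1, 5, 9))               # diagonal matrix with 1, 5, 9 on the diagonal
diag(3)                        # 3 x 3 identity matrix I
matrix(1, nrow = 2, ncol = 3)  # 2 x 3 matrix of ones (J)
matrix(0, nrow = 2, ncol = 2)  # 2 x 2 zero matrix

A <- matrix(c(2, 0, 1, 3), nrow = 2)  # a square matrix
t(A)      # transpose
solve(A)  # inverse (square, non-singular matrices only)
```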
---

# Example matrices

Consider the following:

`$$\mathbf{X} = \begin{bmatrix}1 & 2 & 3 & 4 \\ 2 & 1 & 4 & 3\\ \end{bmatrix}$$`

`$$\mathbf{Y} = \left[ \begin{array}{rr} 1 & 2 \\3 & 2 \\1 & 4\end{array} \right]$$`

`$$\mathbf{Z} = \left[ \begin{array}{rrr}1 & 2 & 3 \\5 & 1 & 4 \\6 & 1 & 7\end{array} \right]$$`

---

# Matrix operations: addition

* `\(\mathbf{X}\)` and `\(\mathbf{Y}\)` are `\(m \times n\)` matrices

`$$\begin{aligned} \mathbf{X} + \mathbf{Y} & = \begin{pmatrix} x_{11} & x_{12} & \ldots & x_{1n} \\ x_{21} & x_{22} & \ldots & x_{2n} \\\vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \ldots & x_{mn} \\ \end{pmatrix} + \begin{pmatrix} y_{11} & y_{12} & \ldots & y_{1n} \\ y_{21} & y_{22} & \ldots & y_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ y_{m1} & y_{m2} & \ldots & y_{mn} \\ \end{pmatrix} \\ & = \begin{pmatrix} x_{11} + y_{11} & x_{12} + y_{12} & \ldots & x_{1n} + y_{1n} \\ x_{21} + y_{21} & x_{22} + y_{22} & \ldots & x_{2n} + y_{2n} \\ \vdots & \vdots & \ddots & \vdots\\ x_{m1} + y_{m1} & x_{m2} + y_{m2} & \ldots & x_{mn} + y_{mn} \\ \end{pmatrix} \end{aligned}$$`

---

# Scalar multiplication

You can just multiply the constant (scalar) through the matrix.

* `\(\mathbf{X}\)` is an `\(m \times n\)` matrix and `\(k \in \Re\)`

`$$k \mathbf{X} = \begin{pmatrix} k x_{11} & k x_{12} & \ldots & k x_{1n} \\ k x_{21} & k x_{22} & \ldots & k x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ k x_{m1} & k x_{m2} & \ldots & k x_{mn} \\ \end{pmatrix}$$`

---

# Matrix transposition

Transposing a matrix means *flipping* it in a sense -- swapping rows and columns across the diagonal.

`$$\mathbf{X} = \begin{pmatrix} x_{11} & x_{12} & \ldots & x_{1n} \\ x_{21} & x_{22} & \ldots & x_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & \ldots & x_{mn} \\ \end{pmatrix}$$`

`$$\mathbf{X}' = \begin{pmatrix} x_{11} & x_{21} & \ldots & x_{m1} \\ x_{12} & x_{22} & \ldots & x_{m2} \\ \vdots & \vdots & \ddots & \vdots \\ x_{1n} & x_{2n} & \ldots & x_{mn}\end{pmatrix}$$`

---

# Matrix multiplication

* Step 1: have two (or more) matrices to multiply
* Step 2: check dimensions (rows vs columns)
* Step 3: **ORDER MATTERS** -- the number of COLUMNS of the FIRST matrix MUST MATCH the number of ROWS of the second matrix (YES, I KNOW!!)
* Step n: the final output will have the number of ROWS of the FIRST and the number of COLUMNS of the second matrix.

(steps continued on the next slides -- just setting up the logic here!)

<span style = "color: blue;"> `$$\mathbf{X} = \begin{pmatrix} 1 \\ 10\end{pmatrix}$$` </span>

<span style = "color: red;"> `$$\quad \mathbf{Y} = \begin{pmatrix} 5 & 2 \\ 3 & 4 \\ \end{pmatrix}$$` </span>

Note our two matrices above. In what order do we multiply them? What dimensions will our final matrix have?

--

Well, `\(X\)` is `\(2\times 1\)` and `\(Y \text{ is } 2 \times 2\)`. Since the number of columns of `\(X\)` does not match the number of rows of `\(Y\)`, we can't multiply them in that order. BUT! We **CAN** multiply `\(Y\)` and `\(X\)`! This is because we will have `\(2 \times 2 \text{ and } 2 \times 1\)`, producing a final matrix of `\(2 \times 1\)`.
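---

# Checking dimensions in R (sketch)

The same conformability check in R -- `%*%` is matrix multiplication, and it errors when the dimensions don't line up:

```r
X <- matrix(c(1, 10), nrow = 2)       # 2 x 1
Y <- matrix(c(5, 3, 2, 4), nrow = 2)  # 2 x 2 (R fills column by column)

dim(X)        # 2 1
dim(Y)        # 2 2
# X %*% Y     # error: non-conformable arguments
dim(Y %*% X)  # 2 1 -- rows of the first, columns of the second
```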
---

# Matrix multiplication: Color-coded example

To multiply, you go ACROSS the ROW of the first matrix and multiply pairwise DOWN THE COLUMN of the second matrix. You then add these products together. For example, if you have matrix A times matrix B, the first entry is `\(a_{1,1}*b_{1,1} + ... + a_{1,n}*b_{n,1}\)`. This sum goes into the first cell of the resulting matrix. This sounds confusing (because it is), so let's see an example.

--

<span style = "color: blue;"> `$$\mathbf{X} = \begin{pmatrix} 1 \\ 10\end{pmatrix}$$` </span>

<span style = "color: red;"> `$$\quad \mathbf{Y} = \begin{pmatrix} 5 & 2 \\ 3 & 4 \\ \end{pmatrix}$$` </span>

--

Here, we have a very simple example -- we'll take the first row of `\(Y\)` and multiply pairwise down the (single) column of `\(X\)`, which in this case is all of `\(X\)`. So, <span style = "color: red;"> `\(5*\)` </span> <span style = "color: blue;"> `\(1\)` </span> + <span style = "color: red;"> `\(2*\)` </span> <span style = "color: blue;"> `\(10\)` </span>. This will give us <span style = "color: purple;"> `\(5 + 20 = 25\)` </span> as the first cell in our resulting matrix.

--

Similarly, we'll multiply the remaining row of our matrix `\(Y\)` by the column in `\(X\)` to produce the following: <span style = "color: red;"> `\(3*\)` </span> <span style = "color: blue;"> `\(1\)` </span> + <span style = "color: red;"> `\(4*\)` </span> <span style = "color: blue;"> `\(10\)` </span>. This will give us <span style = "color: purple;"> `\(3 + 40 = 43\)` </span> as the second cell in our resulting matrix.

---

# Matrix multiplication: Color-coded example

To multiply, you go ACROSS the ROW of the first matrix and multiply pairwise DOWN THE COLUMN of the second matrix. You then add these products together. For example, if you have matrix A times matrix B, the first entry is `\(a_{1,1}*b_{1,1} + ... + a_{1,n}*b_{n,1}\)`. This sum goes into the first cell of the resulting matrix. This sounds confusing (because it is), so let's see an example.

--

<span style = "color: blue;"> `$$\mathbf{X} = \begin{pmatrix} 1 \\ 10\end{pmatrix}$$` </span>

<span style = "color: red;"> `$$\quad \mathbf{Y} = \begin{pmatrix} 5 & 2 \\ 3 & 4 \\ \end{pmatrix}$$` </span>

**This produces**

<span style = "color: purple;"> `$$\mathbf{Z} = \begin{pmatrix} 25 \\ 43 \\ \end{pmatrix}$$` </span>

---

# Matrix multiplication

Try this example and check with a neighbor: `\(\mathbf{XY}\)`

`$$\mathbf{X} = \begin{pmatrix} 1 & 1 \\ 1& 1 \\ \end{pmatrix} , \quad \mathbf{Y} = \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ \end{pmatrix}$$`

--

`$$\begin{aligned} \mathbf{A} & = \mathbf{X} \mathbf{Y} \\& = \begin{pmatrix}1 & 1 \\ 1 & 1 \\\end{pmatrix} \begin{pmatrix}1 & 2 \\3 & 4 \\\end{pmatrix} \\&= \begin{pmatrix}1 * 1 + 1 * 3 & 1 * 2 + 1 * 4 \\1 * 1 + 1 * 3 & 1 * 2 + 1 * 4\\\end{pmatrix} \\&= \begin{pmatrix}4 & 6 \\4 & 6\end{pmatrix}\end{aligned}$$`

---

# Matrix multiplication: Identity Matrix

Try this example and check with a neighbor: `\(\mathbf{IY}\)` (let's see why `\(I\)` is called the **identity matrix**)

`$$\mathbf{I} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ \end{pmatrix} , \quad \mathbf{Y} = \begin{pmatrix} 1 & 2 \\ 3 & 4 \\ \end{pmatrix}$$`

--

`$$\begin{aligned} \mathbf{A} & = \mathbf{I} \mathbf{Y} \\& = \begin{pmatrix}1 & 0 \\ 0 & 1 \\\end{pmatrix} \begin{pmatrix}1 & 2 \\3 & 4 \\\end{pmatrix} \\&= \begin{pmatrix}1 \times 1 + 0 \times 3 & 1 \times 2 + 0 \times 4 \\ 0 \times 1 + 1 \times 3 & 0 \times 2 + 1 \times 4\\\end{pmatrix} \\&= \begin{pmatrix}1 & 2 \\3 & 4\end{pmatrix}\end{aligned}$$`

IT'S LIKE MAGIC!

(*actually it's just the matrix version of multiplying by 1*)
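---

# Matrix multiplication in R (sketch)

Verifying these products in R (`%*%` is the matrix-multiplication operator; the objects are re-created here for the sketch):

```r
Y <- matrix(c(5, 3, 2, 4), nrow = 2)  # (5 2; 3 4)
X <- matrix(c(1, 10), nrow = 2)       # (1; 10)
Y %*% X                               # (25; 43)

ones <- matrix(1, nrow = 2, ncol = 2)  # the matrix of 1s above
B <- matrix(c(1, 3, 2, 4), nrow = 2)   # (1 2; 3 4)
ones %*% B                             # (4 6; 4 6)

diag(2) %*% B  # multiplying by the 2 x 2 identity returns B unchanged
```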
---

# Let's try this out on our own:

Suppose we have a `\(3 \times 4\)` matrix **A** and a `\(4 \times 1\)` matrix **v**:

`$$\begin{aligned} \mathbf{A} & = \begin{pmatrix} 2 & 3 & 4 & 5 \\ 1 & 5 & 1 & 2\\ 3 & 5 & 3 & 4 \\\end{pmatrix} \\ \mathbf{v} & = \begin{pmatrix}3 \\ 3 \\ 4 \\10 \\\end{pmatrix} \end{aligned}$$`

--

* What is `\(\boldsymbol{A} \boldsymbol{v}\)`?
* What is `\(\boldsymbol{A}^{'} \boldsymbol{v}\)`?
* What is `\(\boldsymbol{A} \boldsymbol{A}^{'}\)`?
* What is `\(\boldsymbol{A}^{'} \boldsymbol{A}\)`?

---

## Answers:

* What is `\(\boldsymbol{A} \boldsymbol{v}\)`?

`$$\begin{aligned} \mathbf{A}\mathbf{v} & = \begin{pmatrix} 2 & 3 & 4 & 5 \\ 1 & 5 & 1 & 2\\ 3 & 5 & 3 & 4 \\\end{pmatrix} \begin{pmatrix}3 \\ 3 \\ 4 \\10 \\\end{pmatrix} = \begin{pmatrix}2*3 + 3*3+4*4+5*10 \\ 1*3+5*3+1*4+2*10 \\ 3*3+5*3+3*4+4*10 \end{pmatrix} \\ &= \begin{pmatrix} 81 \\ 42 \\ 76 \\ \end{pmatrix}\end{aligned}$$`

* What is `\(\boldsymbol{A}^{'} \boldsymbol{v}\)`? **Not possible: `\(\boldsymbol{A}^{'}\)` has dimensions `\(4\times 3\)` and `\(\boldsymbol{v}\)` has dimensions `\(4 \times 1\)`**

---

## Answers:

* What is `\(\boldsymbol{A} \boldsymbol{A}^{'}\)`?

`\(\mathbf{A}\mathbf{A}^{'} =\)`

`$$\begin{aligned} & = \begin{pmatrix} 2 & 3 & 4 & 5 \\ 1 & 5 & 1 & 2\\ 3 & 5 & 3 & 4 \end{pmatrix} \begin{pmatrix} 2 & 1 & 3 \\ 3 & 5 & 5 \\ 4 & 1 & 3 \\ 5 & 2 & 4 \\ \end{pmatrix} \\ &= \begin{pmatrix} 2^2 + 3^2 + 4^2 + 5^2 & 2* 1 + 3* 5 + 4*1 + 5*2 & 2*3 + 3* 5 + 4*3 + 5*4 \\ 1*2 + 5*3 + 1* 4 + 2* 5 & 1^2 + 5^2 + 1^2 + 2^2 & 1*3 + 5*5 + 1*3 + 2*4 \\ 3*2 + 5*3 + 3* 4 + 4*5 & 3*1 + 5*5 + 3*1 + 4*2 & 3^2 + 5^ 2 + 3^ 2 + 4^2 \\ \end{pmatrix} \\ & = \begin{pmatrix} 54 & 31 & 53 \\ 31 & 31 & 39 \\ 53 & 39 & 59 \end{pmatrix} \end{aligned}$$`

---

## Answers:

* What is `\(\boldsymbol{A}^{'}\boldsymbol{A}\)`?

`$$\begin{aligned} & = \begin{pmatrix} 2 & 1 & 3 \\ 3 & 5 & 5 \\ 4 & 1 & 3 \\ 5 & 2 & 4 \\ \end{pmatrix} \begin{pmatrix} 2 & 3 & 4 & 5 \\ 1 & 5 & 1 & 2\\ 3 & 5 & 3 & 4 \end{pmatrix} = \begin{pmatrix} 14 & 26 & 18 & 24 \\ 26 & 59 & 32 & 45\\ 18 & 32 & 26 & 34 \\ 24 & 45 & 34 & 45 \end{pmatrix} \\ \end{aligned}$$`
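---

# Checking the answers in R (sketch)

A quick check of the answers above in R (`t()` gives the transpose):

```r
A <- matrix(c(2, 3, 4, 5,
              1, 5, 1, 2,
              3, 5, 3, 4), nrow = 3, byrow = TRUE)
v <- c(3, 3, 4, 10)

A %*% v       # (81; 42; 76)
# t(A) %*% v  # error: non-conformable (4 x 3 times 4 x 1)
A %*% t(A)    # 3 x 3: (54 31 53; 31 31 39; 53 39 59)
t(A) %*% A    # 4 x 4
```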
---

class: center

# Uses of matrices: how you might see them in your life

---

# Neural networks

<img src="https://upload.wikimedia.org/wikipedia/commons/9/99/Neural_network_example.svg" style="display: block; margin: auto;" />

---

# Uses for neural networks

* Self-driving cars
* Voice activated assistants
* Automatic machine translation
* Image recognition
* Detection of diseases

---

# Social networks

<img src="https://macs40700.netlify.app/slides/13-visualize-text-network/index_files/figure-html/unnamed-chunk-67-1.png" style="display: block; margin: auto;" />

---

# Uses for social networks

* Connections
* Groupings (e.g. cliques)
* Density
* 'Spread' (ideas, disease, etc.)

---

# Algebraic properties: Recap!

* Matrices must be **conformable** (right dimensions)
* Matrices have weird rules:
  * Associative property: `\((\mathbf{XY})\mathbf{Z} = \mathbf{X}(\mathbf{YZ})\)`
  * Additive distributive property: `\((\mathbf{X} + \mathbf{Y})\mathbf{Z} = \mathbf{XZ} + \mathbf{YZ}\)`
  * Zero property: `\(\mathbf{X0} = \mathbf{0}\)`
  * Identity matrix: `\(\mathbf{IX} = \mathbf{X}\)`
  * Order matters: `\(\mathbf{XY} \neq \mathbf{YX}\)`
    * Different from scalar multiplication: `\(xy = yx\)`
* Order of a matrix: sometimes you'll see subscripts referring to a matrix's order -- this is particularly true for square matrices. E.g.:
  * `\(\mathbf{I_2}\)` is an identity matrix (and therefore a square matrix) with two rows and two columns
  * `\(\mathbf{J}_{2\times3}\)` is a matrix (likely full of ones) with two rows and three columns
* Operations on a matrix: transposing means flipping / reflecting across the diagonal

---

# Example matrices: Fun facts

* Dimensionality matters
* The diagonal from upper left to lower right matters
  * Sometimes the opposite diagonal is referred to by a name, but this varies
* Matrices are referred to by their overall properties (elements) and/or the behavior of the diagonal
* Special names for *special* matrices:
  * **Diagonal matrix** (numbers only on the diagonal; zeros elsewhere)
  * **Identity matrix** (only 1s on the diagonal; zeros elsewhere -- a special case of diagonal -- called `\(I\)`)
  * **Square** (n rows = n columns)
  * **Triangular matrix** (half of the matrix has zeros -- see below for the lower/upper distinction)
    * **lower triangular** (numbers in the lower half (diagonal down) and zeros above the diagonal)
    * **upper triangular** (numbers in the upper half (diagonal up) and zeros below the diagonal)
  * **Zero matrix** (all zeros!)
  * **Matrix of ones** (only ones; sometimes given the letter `\(J\)`)
  * **Transpose** / transposition (a mathematical operation on a matrix; superscript `\(T\)`)
  * **Inverse** (also a mathematical operation; superscript `\(-1\)`)

<!--

# Linear algebra roots

* Tensor
  * Scalars (0D tensors)
  * Vectors (1D tensors)
  * Matrices (2D tensors)
  * 3D tensors and higher-dimensional tensors

# Tensor operations

* Generalizations of matrix operations
* Tensor addition
* Tensor multiplication
* **If you can do it with a matrix, you can do it with a tensor**

# Linear algebra notation

`$$\mathbf{Y} = \text{activation}(\mathbf{W} \cdot \mathbf{X} + \mathbf{B})$$`

* `\(\mathbf{X}\)`
* `\(\mathbf{Y}\)`
* `\(\mathbf{W}, \mathbf{B}\)`
* `\(\text{activation}()\)`
* Rectified Linear Units (RELU)

`$$R(z) = \max(0, z)$$`

* Sigmoid function (aka logistic regression)

`$$S(z) = \frac{1}{1 + e^{-z}}$$`

-->