Karan Jayachandra

Essence of Linear Algebra

When I first encountered linear algebra in school, I failed to realize the profound impact it has on the world around us. At the time, all I could see was a tool for solving $n$ equations in $n$ unknowns. The same thing happened when I studied linear algebra again, at a higher level, during my undergraduate studies. While studying for my master’s degree, I saw the gaping hole in my knowledge of basic mathematics. Things my fellow students found intuitive required me to spend time writing out several equations. It was at this point that I realized I needed a refresher course, and that is when I came across Gilbert Strang’s lectures for MIT 18.06. They were a godsend. Without this course I do not think I would have my degree or a job. I even wrote an email to this effect to Prof. Strang, who was kind enough to respond with warm greetings. I find his book on the topic to be the definitive source of information on linear algebra. No article can hope to capture the effect of attending his lectures, and I highly recommend that you do so if you have the time. However, if you need a quick recap of the concepts, here I summarize my understanding of the course in the way that made sense to me. It is mostly a reference for my future self, for when I might forget a thing or two. Here we go!

Vectors

The fundamental unit of linear algebra is a vector. A vector is a collection of numbers that belong together. The most familiar example is a point in the Cartesian coordinate system, where the numbers represent the components along the $x$, $y$ and $z$ axes.

$$ \mathbf{v} = \begin{bmatrix} x \\ y \\ z \end{bmatrix} $$

But it doesn’t have to be this literal. A vector can contain any information. A more realistic example could be a vector containing the ID of a car dealership, the number of sedans sold at the location, the total number of salespeople and the number of years the location has been active. Any set of related information can be put into a vector. Even words can be used, by creating a numerical representation of them. This is what LLMs do. A vector is not limited in its number of dimensions. The examples above have 3 and 4 dimensions or, in mathematical terms, live in $\mathbb{R}^3$ and $\mathbb{R}^4$. This can be extended to any number of dimensions, $\mathbb{R}^n$. Vectors can be added together and scaled. A linear combination generalizes this: it combines vectors in different amounts. Notice that vectors are usually denoted in bold in mathematical notation.

$$ \mathbf{y} = c_1 \mathbf{x}_1 + c_2 \mathbf{x}_2 + c_3 \mathbf{x}_3 $$
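To make the formula concrete, here is a minimal sketch in plain Python. The helper name `linear_combination` and the sample vectors and coefficients are my own illustrative choices, not something from the course.

```python
def linear_combination(coefficients, vectors):
    """Return c_1*x_1 + c_2*x_2 + ... for equal-length vectors given as lists."""
    if len(coefficients) != len(vectors):
        raise ValueError("need exactly one coefficient per vector")
    dimension = len(vectors[0])
    result = [0] * dimension
    for c, x in zip(coefficients, vectors):
        if len(x) != dimension:
            raise ValueError("all vectors must have the same dimension")
        # Scale the vector by its coefficient and add it to the running sum.
        for i in range(dimension):
            result[i] += c * x[i]
    return result


# Illustrative values (assumptions, chosen only for this example).
x1 = [1, 0, 0]
x2 = [0, 1, 0]
x3 = [1, 1, 1]
y = linear_combination([2, -1, 3], [x1, x2, x3])
print(y)  # [5, 2, 3]
```

Scaling and adding are the only two operations involved, which is why they are the two things a vector must support.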

Matrices

A matrix is a useful way of representing a transformation. A matrix is a set of vectors, as described in the earlier section, stacked together. Another way of thinking of a matrix is as a transformation of the reference axes. In 3 dimensions, the unit vectors along the $x$, $y$ and $z$ axes are $\begin{bmatrix} 1 & 0 & 0 \end{bmatrix}$, $\begin{bmatrix} 0 & 1 & 0 \end{bmatrix}$ and $\begin{bmatrix} 0 & 0 & 1 \end{bmatrix}$. If we stack these on top of each other, we arrive at the identity matrix, or the default axes. We can transform these axes to point in any other directions by stacking those direction vectors into a matrix and then multiplying the matrix by a vector. Notice that each entry of the result is just the dot product of the vector with one of the axes that were just defined. The dot product is a measure of how much one vector points in the direction of another, which in our case is one of the new axes.

$$ \mathbf{A}\mathbf{v} = \begin{bmatrix} 1 & 2 & 3 \\ 3 & 5 & 8 \\ 11 & 3 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2y + 3z \\ 3x + 5y + 8z \\ 11x + 3y + z \end{bmatrix} $$
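The product above can be checked with a short sketch in plain Python. The helper names `dot` and `mat_vec` and the sample vector `[1, 2, 3]` are assumptions made for illustration. The matrix is the one from the equation.

```python
def dot(u, v):
    """Dot product of two equal-length vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


def mat_vec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector: one dot product per row."""
    return [dot(row, vector) for row in matrix]


identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A = [[1, 2, 3], [3, 5, 8], [11, 3, 1]]  # the matrix from the equation above
v = [1, 2, 3]  # illustrative vector, i.e. x=1, y=2, z=3

print(mat_vec(identity, v))  # [1, 2, 3], the default axes leave v unchanged
print(mat_vec(A, v))         # [14, 37, 20]

# The same result, seen as a combination of the columns of A
# in the amounts x, y and z.
columns = [[row[j] for row in A] for j in range(3)]
combined = [sum(v[j] * columns[j][i] for j in range(3)) for i in range(3)]
print(combined)              # [14, 37, 20]
```

Both views give the same numbers: each row of the matrix dotted with the vector, or the columns of the matrix added up in the amounts given by the vector.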

This operation is another way of representing the linear combination discussed in the previous section: the result is the columns of the matrix combined in the amounts $x$, $y$ and $z$. I like to think of this operation as refocusing any and all vectors to a different section of space. This can be generalized to include a change in the origin as well, using the affine transformation, written as:

$$ \mathbf{y} = \mathbf{A} \mathbf{x} + \mathbf{b} $$

where $\mathbf{A}$ is the change in perspective and $\mathbf{b}$ is the change in point of reference.
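As a rough sketch of the affine transformation in plain Python: the function name `affine`, the 2-dimensional rotation matrix and the offset below are all illustrative assumptions, picked only because they are easy to verify by hand.

```python
def affine(A, x, b):
    """Compute y = A x + b for a matrix A (list of rows) and vectors x, b."""
    return [sum(a * xi for a, xi in zip(row, x)) + bi for row, bi in zip(A, b)]


# Illustrative values: a 90-degree counterclockwise rotation and a shift.
A = [[0, -1],
     [1, 0]]
b = [2, 3]

x = [1, 0]
y = affine(A, x, b)
print(y)                     # [2, 4]: rotated to [0, 1], then shifted by b

# The old origin lands exactly on b, which is the new point of reference.
print(affine(A, [0, 0], b))  # [2, 3]
```

Setting $\mathbf{b}$ to the zero vector recovers the plain matrix product, so the linear transformation is the special case where the origin stays put.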

Matrix transformations

Matrix Rank

This section is to be continued.