#
1.5

Systems of Equations

In this section, we’ll discuss how to solve **systems of linear equations**: perhaps 2, 3, 100, or even 1000 simultaneous equations, and correspondingly many unknowns.

Simple circuit problems can have tens to hundreds of equations and unknowns. It is not uncommon for complex circuits to be in the thousands of equations or beyond.

We’re not going to discuss **nonlinear equations** because their complexity is hard to reason about. For example, consider the equation:

This nonlinear equation has just 1 equation and 1 unknown, but has infinitely many solutions: $x = n\pi$ for all integers $n$ . Additionally, as discussed in Linear & Nonlinear, we can find a solution for a nonlinear system by making a series of locally linear approximations using the Newton-Raphson Method.

## A Linear Equation

To start, we need to define what we mean by a linear equation. A linear equation is one that can be written as:

This equation has $n$ unknown variables $x_i$ . Each term has some known constant coefficient $r_i$ , a number which may be zero, in which case we don’t usually write the $x_i$ term at all. And this equation has a single known constant term $c$ which the equation sums up to, which might be 0, or some other number.

These are a few *unrelated* linear equations:

They are unrelated because they don’t fit into the same space of variables. They do not form a system.

Sometimes, an equation needs to be algebraically manipulated to fit our desired standard form, but is still a linear equation. For example, from the slope-intercept form of a line:

If an equation can’t be manipulated to fit this form, it is nonlinear. (It may be possible to construct a linear approximation, however: see the Algebraic Approximations and Linear & Nonlinear sections.)

Here are some examples of nonlinear equations:

These sorts of equations will not be addressed here, but are still solvable with multiple numerical iterations using the same techniques shown here as a foundation.

## A System of Equations

A system of equations simply means that we have multiple equations, all of which must be **satisfied at the same time**, and multiple unknowns, which are shared between the equations.

An equation with unknowns is a search problem: we are *searching for the value of the unknowns that will make the equation be true*. The equation is true when the left side equals the right side.

For most values of the unknowns, the equation will be false: $y + 1 = 3$ is a false statement for infinitely many possible choices of $y$ . It is only a true statement at one particular choice. In this case, this equation is a true statement only when $y=2$ .

When we jump to having multiple equations and multiple unknowns, we have to think about not just whether our one equation is a true statement, but whether *all* of the equations in our system are true *at the same time* and for the same values of the unknowns.

Consider this system of two linear equations in two unknowns:

Let’s search for a solution by trying different values for the unknowns. We have to choose specific values for all of the unknowns in order to evaluate the left-hand sides of each equation, so we are searching over all possible values of all unknowns.

If we choose $x=0, y=0$
, this will make both equations false. Therefore, $x=0, y=0$
is *not* a solution to this system.

If we choose $x=0, y=2$
, this will make the first equation true. But it will make the second equation false. All of the equations must be true for the solution to be valid. Therefore, $x=0, y=2$
is *not* a solution to this system.

If we choose $x=2, y=1$ , this will make both equations true. Therefore, $x=2, y=1$ is a solution to this system.

In fact, it is a **unique solution point**: it is the *only* solution to this system. Why? Because if, for example, we were to increase $x$
by a little bit, then one equation would require $y$
to increase a little bit to remain valid, and the other equation would require it to decrease. The variable $y$
can’t both increase and decrease at the same time.

## A Geometric Interpretation

If we only have two unknowns, it’s easy to map these to a two-dimensional x-y plane.

Continuing with the same two-equation example above, we could convert both equations to slope-intercept form:

And now we can use the CircuitLab software to plot these two lines:

Exercise Click the “circuit,” click “Simulate,” then “Run DC Sweep.” You’ll see the two lines plotted, with an intersection at $x=2, y=1$ . This shows how we can quickly use the DC Sweep mode of a circuit simulator as a simple but flexible and powerful graphing tool.

## Zero, One, or Infinitely Many Solutions

A solution is a specified configuration of values for *all* of the unknowns such that *all* of the equations are simultaneously satisfied.

There are only three possible outcomes for any system of equations. Either:

- No solution exists.
- One unique solution exists.
- Infinitely many solutions exist.

Using our two-equation, two-unknown example from earlier, we can consider these three cases. (If you haven’t plotted it, do so now!)

If our two equations produce two perfectly parallel lines (but not the same line), there is **no intersection**. This means that there is no solution. No point in the x-y plane lies on both lines, so no x-y pair satisfies both equations at once.

If our two equations produce the exact same line overlapping with itself, then **all points are intersecting**. There are infinitely many solutions. All points on that line are possible solutions of the system of equations.

If the lines are not parallel, then they are guaranteed to have a **single intersection point**. This point is our one unique solution to the system. This is the normal case, and usually the one we’re interested in, but it’s important to see what can go wrong and what’s necessary for our solution to exist at all.

You can quickly modify the plotted equations above to illustrate the two other cases for yourself. For example, see this one for two equations with no solution:

Exercise Click to plot the no solutions case.

## Geometric Interpretations in Higher Dimensions

In two dimensions, solving a system of linear equations is equivalent to finding an intersection of the lines described by each equation. Each equation $A x + B y = C$ describes some line in 2D space.

In three dimensions, solving a linear system is equivalent to finding an intersection of the planes described by each equation. Each equation $A x + B y + C z = D$ describes some plane in 3D space.

In 3D, note that the **intersection of two planes** could be either:

- A line (the normal case),
- A plane (if both planes are identical), or
- No points (if both planes are parallel but not identical).

That’s why we can’t just have an intersection between two planes (two equations) to uniquely solve a 3D linear system problem.

If we have a third equation (i.e. a third plane), we can take the intersection of that plane with the line that resulted from the intersection of the first two planes. In 3D, the **intersection of a plane and a line** can either be:

- A point (the normal case),
- The line (if the line happens to be part of the plane), or
- No points (if the line is parallel to the plane but not touching).

When we successfully solve a system of 3 equations and 3 unknowns, we’re in the case where the first two equations intersected to produce a line, and that line then intersected with the third equation’s plane to produce a single solution point.

In higher dimensions, solving a linear system is equivalent to finding an intersection of the hyperplanes described by each equation.

Every equation in our system is like a **hyperplane in N-1 dimensions** where N is the number of unknowns in the system. Why N-1? Because, in general, if you specify N-1 of the unknowns, then there’s only 1 specific numerical value the remaining unknown can take for that equation to be satisfied:

- In one dimension ($A x = D$ ), nothing needs to be specified, and there’s only one $x$ coordinate that satisfies the equation.
- In two dimensions ($A x + B y = D$ ), if you specify an $x$ coordinate, there’s only one $y$ coordinate value that falls on any given line.
- In three dimensions ($A x + B y + C z = D$ ), if you specify an $x \ \text{and} \ y$ coordinates, then there’s only one $z$ coordinate value that falls on any given plane.
- And so on.

Every time we include another equation, we’re applying a new constraint, and, *so long as a unique solution exists*, we’re **reducing the dimensionality of the solution space by 1**. Ideally, after we’ve applied all our equations one at a time, we’re left with just a single unique solution point.

While most people do not have good intuition for geometry above 3D (or good drawing skills above 2D!), the geometric interpretation allows us to quickly see the possible cases of solving any linear system.

## One Equation & One Unknown

Consider an equation with just one unknown and one equation:

This has one solution: $x=5$
. But can we modify this equation to have *no* solution? Actually, yes:

There is no value of $x$ for which this equation is true. Therefore, this system (of 1 equation and 1 unknown) has no solution.

But what if we change the right hand side, too?

In this case, $x$
can take on any value, whether $x=5$
or $x=-\pi$
or $x=6 \times 10^{23}$
, and still the equation remains true. There are *infinitely many* possible solutions.

It may appear that we’ve oversimplified, but in fact, there’s nothing special about this example. These three outcomes are the only three possibilities with systems of any size, but they are illustrated here with the system of minimum size $N=1$ equations and unknowns.

## One Equation & Two Unknowns

When we have **fewer equations than unknowns**, a system definitely does not have a single solution, but it might have *either* zero or infinitely many solutions.

We’re familiar with an equation like:

This is the equation of a line, with 1 equation and 2 unknowns. It has infinitely many solution points: all points on that line are solutions to the equation. That is the definition of a line.

With 2 unknowns and 1 equation constraining them, this infinite solution space happens to have $2 - 1 = 1$ degrees of freedom. For example, we could write the line as a vector equation:

where the real number $m$ projects the slope vector along the one degree of freedom allowed by the equation. (In a problem with 3 unknowns and 1 equation, we would have a vector equation with two different vectors and two independent degrees of freedom.)

How can a 1 equation, 2 unknown system have zero solutions? Well, a simple way is:

There are no values of $(x,y)$ that would satisfy this equation.

We don’t normally think about it, but when we evaluate a line such as $-x + y = 3$ at a single point, such as $x=10$ , we’re really adding a second equation to the system:

To evaluate at $x=10$ , we are actually first adding that second equation to our system, and then solving a system with 2 equations and 2 unknowns. This happens so fast that we don’t consider it consciously or write it down, but that simple second equation is vital to producing a single solution point. Geometrically, we are intersecting the original line with a new vertical line defined as $x=10$ , and then finding the unique intersection point.

## Two Equations & One Unknown

When a system has **more equations than unknowns**, the system is usually overdetermined and inconsistent, meaning no solution exists. For example:

There is no value of $x$ that satisfies both of these equations, so no solution exists.

However, what if our two equations were:

In this special case, the value $x=2$
happens to satisfy both equations. What’s happened here? We’ve found two equations that are **not linearly independent**. The second equation doesn’t actually give us any new information compared to the first. Any value of $x$
that satisfies the first equation will also satisfy the second.

## Linear Independence

We’re not just interested in having N equations for N unknowns. We need all of those N equations to provide new information – **new constraints** – or else we would have the same solution space with just N-1 equations.

Our simple example of not-independent equations was very simple:

It’s clear that one equation is just double the other.

But **linear independence is not a pairwise property**. That means we can’t figure it out by only looking at one pair of equations at a time. We have to look at the whole collection of equations.

Here’s an example of three equations:

Any two of these equations taken alone are linearly independent, but all three together are not. We can add the second and third equations together and get the first equation.

If a set of equations are not linearly independent, we can remove one (or possibly more) of them and still have the same space of solutions.

There is no simple rule for inspecting a set of equations and quickly determining if they are or are not linearly independent. If you see quickly see any obvious issues, such as where one equation is a just a nonzero multiple of another, or where one equation is a linear combination of two other equations, you can quickly determine that the set is *not* linearly independent. However, in general, you may have to combine all previous N-1 equations to determine that the Nth equation is or is not linearly independent. In practice, determining linear independence is done by solving the system using Gaussian eliminiation or LU decomposition, as will be described below.

## Linear Independence and the Right Hand Side

Note that the linear independence of a set of equations only has to do with the coefficients of the unknowns on the left-hand-side of the equation. It does not depend on the values of the constants on the right-hand-side.

In the case where the system is *not* linearly independent, then the right-hand-side constants may help determine whether there are zero solutions (if the duplicated equations are inconsistent with each other) or infinitely many solutions (if the duplicated equations are consistent with each other).

For example, the non-independent system:

happens to be *consistent* with respect to the right-hand-side constants, so it could be reduced to a system of 2 equations in 3 unknowns which represent a line (in 3D space) passing through the $(x,y,z)$
points $(-5,3,0) \ \text{and}\ (-5,0,2)$
, and infinitely many other points on the same line.

However, if we change just a single right-hand-side constant from 6 to 0:

Now, while the left-hand-sides of the second and third equations sum to match the first, the right-hand-side constants do not. This guarantees that this system has no solutions.

While the outcome of no solutions versus infinitely many solutions may seem quite different, this is analogous to our $0 x = 10$
versus $0 x = 0$
in the 1 equation and 1 unknown base case described earlier. In either case, it is the *structure of the left-hand-side alone* that guarantees that we’ve lost our ability to produce a single solution point. That’s why, at least from the perspective of solving linear systems of equations, **linear independence is a property of the left-hand-side coefficients only**.

## N Linearly-Independent Equations & N Unknowns

A system having N linearly independent equations and N unknowns, where N is the same for both quantities, is a prerequisite for having a solvable linear system of equations. Why?

- If there are more unknowns than linearly independent equations, our system is
**underdetermined**: we have unconstrained degrees of freedom, and will have infinitely many solutions rather than a single point. - If there are more linearly independent equations than unknowns, we have a problem: it’s actually impossible for this case to happen, because those equations
*won’t actually be linearly independent*.

Only N linearly independent equations in N unknowns can produce a single solution point, which is what we’re usually looking for.

## Matrix Representation

Let’s consider a system with 3 equations and 3 unknowns:

This system has 3 linearly independent equations, and has a single valid solution: $f=-5, g=-3, h=4$ . You should quickly substitute these values in to each equation and confirm that it is valid.

We could also write this directly as a matrix equation $\mathbf{A} \mathbf{x} = \mathbf{B}$ , where $\mathbf{A}$ is a 3x3 matrix of left-hand-side coefficients, $\mathbf{x}$ is a 3x1 vector of unknowns, and $\mathbf{B}$ is a 3x1 vector of right-hand-side constants:

By performing the matrix multiplication on the left, we recover one equation from each row of $\mathbf{A}$ .

You should become comfortable going back and forth between the matrix representation and the equation representation. They are truly identical. However, we often work in the matrix representation because it’s more compact and it’s also the way we solve these systems with computational tools. **Learning to approach and solve problems both in the equation representation and the matrix representation is valuable, and we will flip back and forth in the rest of this book.**

This particular problem is a simple matrix equation, and can easily be solved by any matrix solver. For example, click here to solve this matrix equation on Wolfram Alpha.

Solving a linear system like this is so common that in many mathematical software packages (including MATLAB, Octave, Sage) one simply uses the backslash (“\”) operator to designate solving an equation: $\mathbf{x} = \mathbf{A} \backslash \mathbf{B}$ .

## Augmented Matrices

To make it even more compact to represent systems of equations, it’s common to omit the vector of unknowns, and to simply join the $\mathbf{B}$ vector on to the right-hand-side of $\mathbf{A}$ , with a vertical bar separating them:

This is called an **augmented matrix**, and it’s just another identical way of writing the system of equations from above. There’s nothing special about it except as a convenient way of writing these equations, and it happens to keep the left-hand-side coefficients and right-hand-side constants of each row together nicely.

## Ordering of Rows and Columns

When we consider writing a system of equations as a matrix, we have two considerations:

- In what order should we place the columns?
- In what order should we place the rows?

The ordering of the *columns* maps directly to the **ordering of the unknowns** $(f,g,h)$
in the example above. However, we could have chosen a different column ordering, so long as we’re careful to keep track as we work.

The ordering of the *rows* maps directly to the **ordering of the equations**. But a system of equations has the same meaning regardless of what order we write the equations. If we reorder the rows of $\mathbf{A}$
, we must also reorder the corresponding rows of $\mathbf{B}$
. However, the augmented matrix form $[\mathbf{A}|\mathbf{B}]$
does this for us “automatically” by keeping the entire row together. We can write this augmented matrix in any row order and the system will have the same solution.

Are there any especially convenient orderings that make a system of equations easier to solve? Yes. Let’s look at a few.

## Diagonal Matrices

Here’s a system of 5 equations in 5 unknowns which takes **practically zero effort to solve**:

It’s obvious that the equations don’t interact with each other at all, and we can solve each by simply dividing the right-hand-side constant by the left-hand-side coefficient:

As an augmented matrix, this system of equations looks like:

The 5x5 coefficient matrix has a special shape: it’s a **diagonal matrix**, meaning it only has nonzero entries on the main diagonal, and is strictly zero everywhere else.

Remember that for any system of equations we are free to rearrange the ordering of the rows freely, and additionally, we are free to rerrange the ordering of the columns so long as we keep track of the corresponding ordering of the unknown variables. That’s why the easy-to-solve diagonal structure has more to do with the structure of the equations themselves, rather than the order in which we may first see them written.

## Triangular Matrices

Here’s a system of 4 equations in 4 unknowns which was selected to be easily solvable by hand:

Make sure you can solve this system quickly by hand before you continue. You should find $w=8, x=-2, y=0, z=3$ .

This system is easy to solve because we can work one row at a time. Looking only at the first equation, we have one equation and one unknown, so we’ll immediately find a value for $w = \frac {16} {2} = 8$ .

Next, looking at the second row, we can substitute in the first value we found $w=8$ , and then the second row then becomes $8 - 3 x = 14$ , or $-3 x = 6$ , so $x = -2$ .

This process continues on and on for each subsequent equation because **each new equation contains only variables we’ve already solved, plus one new unknown** which is now easily solved by doing a bit of subtraction and division.

Here’s the same system written as an augmented matrix:

This shape of coefficient matrix is called a **lower triangular matrix**, meaning that nonzero values are present only on the main diagonal and any cells *below* the main diagonal. The area *above* the main diagonal is filled with strictly zero values.

A lower triangular matrix problem is easy to solve through **forward substitution**:

- Start at the first row, which contains a single nonzero variable term, and solve that equation directly with division.
- For all subsequent rows, substitute in the already-known values of the N-1 variables corresponding to the previous N-1 rows, again leaving a simple algebra problem to find the value of the Nth variable.

Similarly, we can mirror the situation. For example, consider this 3 equation and 3 unknown system:

We can again easily solve by hand to find $x=1, y=2, z=3$
. But in this case, our intuition is to *solve from the bottom up*: first find $z = \frac {6} {2}$
, then substitute that into the equation above that to find $y$
, and so on.

This system written as an augmented matrix:

The shape of this coefficient matrix is **upper triangular**, meaning that there are nonzero values present only on the main diagonal and any cells above the main diagonal. All the cells *below* the main diagonal are zero.

An upper triangular matrix problem is easy to solve through **back substitution**:

- Start at the last row, solve directly.
- Move up to the proceeding equation and substitute in the already-known values. Repeat until finished.

In general, most systems of equations we’re interested in solving will *not* happen to be one of the three easy to solve cases (diagonal, lower triangular, upper triangular). However, we’ll now show that it’s possible to solve *any* matrix problem by doing a series of simple operations that *convert* it into a triangular matrix problem without changing the solution of the system of equations.

## Row Operations

Equations have some nice properties that we take for granted:

These basic rules say that if we have an equation, we can always add a constant to both sides and it’s still true, or we can multiply by a constant and it’s still true.

If we have two equations $a=b \ \text{and} \ c=d$ then we can combine them:

Suppose we have a linearly independent system of 2 equations with 2 unknowns $x, y$ :

We can, for example, add the two equations together to create a new equation:

Or, after factoring the terms:

This last line is a new equation that is a **linear combination** of the original two equations. It is *not* linearly independent of the original two. However, importantly, if we pick either one (and only one) of the original equations, alongside our new equation, those two form a new system of two equations that is a linearly independent (assuming the original equations were linearly independent).

That is to say that we can modify our original system of equations in this way and get a new system of equations which is equivalent to (i.e. has the same solution to) the original system.

Because this is usually done with the matrix representation as discussed above, and each equation is represented by a row in our matrix, these modifications are called “row operations.” For example, to illustrate the same addition and overwrite to form a new system:

Both the original & modified augmented matrices have the same solution. We can apply further row operations to continue to manipulate the system. The goal of these row operations will be to turn our original system into a triangular system which we know is easy to solve.

Here are the three types of allowed row operations:

### Row Operation #1: Multiply a Row by a Constant

An equation can always be scaled by multiplying both sides by a constant factor (as long as that factor is not zero). This does not change the solution space of that equation.

We have to store the new, scaled equation in place of the original. We might write:

to indicate that we’re going to double row #3 and then store that new equation back in the same row.

### Row Operation #2: Swap Rows

A system of equations has the same meaning regardless of what order we list the equations in. Therefore, we can always swap two rows of our augmented matrix and have a new system with the same meaning. We might write:

to indicate that we’re going to swap row #2 with row #3.

(All reorderings of rows are possible through a series of pairwise row swaps.)

### Row Operation #3: Add a Row to Another Row

We can always add any nonzero multiple of one row to another row. This produces a linear combination of the two parent rows.

In order to maintain all the information that is previously present in the original system, we have to replace one of the two parent rows with our new linear combination.

For example, we might write:

to indicate that we’re going to take the fourth row, subtract three times the first row, and then store the result again in the fourth row.

All three of these row operations produce a new system of equations which has the same solution as the original system. Now, we’ll see how to use these operations to solve a system of equations.

## Gaussian Elimination

Solving a linear system using Gaussian elimination is a two-step process:

- First, we’ll use row operations to change a matrix into an upper-triangular one.
- Second, we’ll solve the upper-triangular system (which is easy to do, as we’ve shown earlier).

Let’s consider the following system of equations:

For each row of the matrix, we’ll look at a **pivot value** – the value that happens to be on the main diagonal – and use it to **eliminate** any nonzero matrix coefficients in the equations below it. For the first row, the pivot is marked here in red, and the two nonzero values to be eliminated are marked in green:

To remove the green values we’ll subtract the correct multiple (the multiple which cancels out the corresponding term) of the pivot row from each of the rows below it:

(No operation was necessary on the 3rd row, because its corresponding cell in the pivot column was already 0.)

Note that after these two row operations, we’ve made it so that the entire column below the pivot cell is all zeros.

Now, we move to the 2nd row. The pivot is the red cell on the main diagonal, and the green cells *below* it are ones we’ll work on eliminating:

Again, we need two row operations to eliminate the 8 and the -1 values:

There are lots of negative signs flying around, so be careful!

Finally, we’ll operate on the 3rd row:

With just one remaining row, only one row operation is necessary to clear that cell.

There are some unpleasant fractions flying around, but the result of this operation is that the coefficient matrix is now upper triangular!

Now that it’s upper triangular, we just solve with back-substitution: start from the last row, $\frac {3} {22} z = -\frac{30} {22}$ and solve to find $\boxed{z=-10}$ .

Next, we move up a row and look at the 3rd row, while substituting in the value of the one unknown $z$ we’ve already solved:

And now the 2nd row, substituting in the two unknowns $y, z$ we now know:

And finally the 1st row, substituting in all the other terms:

And that’s it! We’ve solved for all our unknowns. Most of the work was involved in **using each pivot to eliminate all the nonzero values in the column below it**. The resulting upper triangular matrix was easy to solve bottom-up.

Gaussian elimination, as shown here, is a very mechanical way to use row operations to transform an arbitrary matrix problem into a triangular one by eliminating nonzero terms below the main diagonal, one at a time.

## Gaussian Elimination with Pivoting

At each step in the Gaussian elimination process, we looked at row $p$ and our pivot value $A_{pp}$ , and then used it to cancel out any nonzero terms $A_{np} \ \text{for} \ n>p$ . The cancellation worked because we multiplied row $p$ by a ratio of the terms, $\frac{A_{np}}{A_{pp}}$ so that:

in order to zero out that cell. However, **this process breaks if we have to divide by zero**, i.e. if $A_{pp} = 0$
.

As an example, let’s consider the same four equations, but let’s swap row 1 and row 3 before we begin:

There is no way we can use the red 0 pivot cell to eliminate the green nonzero values in the column below it. There is no multiple of 0 that will additively cancel out nonzero values.

**A pivot value must be nonzero.**

If you encounter a case where the pivot value is zero, you must **swap rows** so that the pivot is not zero before you proceed. If this happens at an intermediate step, that’s OK: you can pause and swap row $p$
with some later row $n>p$
, and then resume. (Don’t swap with an earlier row $n \lt p$
because that will have nonzero values to the left of the pivot cell!)

## LU Decomposition

Circuit simulators like CircuitLab (and other numerical software) usually approach Gaussian elimination with a small twist. Instead of only working on an NxN original coefficient matrix $\mathbf{A}$ to create the NxN upper triangular matrix (which we’ll call $\mathbf{U}$ for upper), we’ll simultaneously create a NxN lower triangular matrix ($\mathbf{L}$ for lower), such that $\mathbf{A} = \mathbf{L} \mathbf{U}$ , the matrix product of the lower and upper triangular matrices.

The process of solving $\mathbf{A} = \mathbf{L} \mathbf{U}$
is called **LU Decomposition** or **LU Factorization**.

Notice that the Gaussian elimination process to make the matrix be upper triangular did not depend on the right-hand-side constant values; while they were modified by the row operations, they did not determine the modifications of the coefficient rows, and were referenced only in the backsubstitution step. If we wanted to use the same left-hand-side $\mathbf{A}$ but re-solve our equations for a different set of constants $\mathbf{B}$ on the right, we would have to redo much of our work.

However, if we already have a decomposition $\mathbf{A} = \mathbf{L} \mathbf{U}$ , then we can solve the system $\mathbf{A} \mathbf{x} = \mathbf{B}$ for a new right-hand-side $\mathbf{B}$ using this two-step process:

- Solve $\mathbf{L} \mathbf{y} = \mathbf{B}$
to find a vector $\mathbf{y}$
using
**forward substitution**since $\mathbf{L}$ is lower triangular. - Solve $\mathbf{U} \mathbf{x} = \mathbf{y}$
to find $\mathbf{x}$
using
**back substitution**since $\mathbf{U}$ is upper triangular.

These two solves are very, very fast because they’re each working on a triangular matrix.

The process of creating the $\mathbf{L}, \mathbf{U}$ matrices is very similar to Gaussian elimination, but uses $\mathbf{L}$ to keep track of the multiplications used for each pivot elimination. We won’t get into more detail here, but the LU decomposition for our 4x4 example matrix above is:

Where do these $\mathbf{L}, \mathbf{U}$ matrices come from?

- Notice that the $\mathbf{U}$
shown here is
*identical*to the result of our Gaussian elimination process above. - If you look closely, you may notice that the cell values in the $\mathbf{L}$
matrix come from the
**coefficients on each of the row operations**we performed to turn $\mathbf{A}$ into $\mathbf{U}$ .

Solving $\mathbf{L} \mathbf{y} = \mathbf{B}$ finds the same transformed right-hand side constants that Gaussian elimination produces on the augmented matrix. Then, solving $\mathbf{U} \mathbf{x} = \mathbf{y}$ just solves the same upper triangular system resulting from Gaussian elimination. The first step may look unfamiliar, but it’s really just doing the same work that Gaussian elimination would do on the right-hand side.

In practice, many problems (including many circuit problems) involve re-solving the same coefficient matrix $\mathbf{A}$ with different constant values $\mathbf{B}$ repeatedly, so performing the relatively slow LU decomposition just once and then performing the fast triangular solves as many times as needed can be a tremendous speedup.

## LU Factorization with Pivoting

Just as Gaussian elimination requires swapping rows to ensure nonzero pivot values, LU decomposition actually may require reordering the rows too. In practical implementations, LU decomposition actually transforms $\mathbf{A}$ by solving $\mathbf{P} \mathbf{A} = \mathbf{L} \mathbf{U}$ , where $\mathbf{P}$ is a permutation matrix (exactly one 1 in every row and every column; zero otherwise) which simply reorders the rows of A so that a decomposition exists.

From the LU decomposition with row permutation, we can solve:

Notice that we apply the row permutation to both the coefficients $\mathbf{P} \mathbf{A}$ and to the right-hand-side constants $\mathbf{P} \mathbf{B}$ ; this preserves their integrity as equations, just like the augmented matrix form does.

As $\mathbf{P} \mathbf{B}$ is just a reordering of the vector $\mathbf{B}$ , we can then continue with the forward and back substitution solves as before:

- First solve $\mathbf{L} \mathbf{y} = \mathbf{P} \mathbf{B}$ to find $\mathbf{y}$ .
- Then solve $\mathbf{U} \mathbf{x} = \mathbf{y}$ .

The algorithms for actually finding $\mathbf{P}, \mathbf{L}, \mathbf{U}$ from a given $\mathbf{A}$ are somewhat advanced and we won’t elaborate in this book, but the implementation details are important to producing a good result, especially on matrices of large size, high sparsity, and wide dynamic range, as are often present in electronics problems. High quality implementations of LU decomposition like the one built into the CircuitLab software provide high performance and excellent numerical stability over a wide space of possible input matrices.

## Strategies for Solving Systems of Equations by Hand

When we’re asked to solve a system of equations by hand, we’ll still use the same tools, but we can make our task slightly easier by taking a few shortcuts as we see them.

For systems of N linear equations with N unknowns, write the equations with all unknown terms on the left-hand side and all constants on the right-hand side, and then follow these three rules:

- If there are any
**equations with just one unknown**(such as $5 x = 3$ ), first solve for that unknown and then substitute its value into all remaining equations, and collect with the constant term on the right-hand side. You can now remove that equation from the system. This reduces the remaining number of equations by 1 and reduces the remaining number of unknowns by 1. - If there are any
**equations with two unknowns and zero constant**(such as $x - 2 y = 0$ ), then these two variables are simply a multiple of each other ($x = 2 y$ ). Choose one variable to keep, and replace all instances of the removed variable with the correct multiple of the retained variable (such as replacing $x$ with $2 y$ ) in all remaining equations. You can now set aside the original equation (such as $x - 2 y = 0$ ) and deal with it at the very end of your solution process (for example, after you know the value of $y$ ). This reduces the remaining number of equations by 1 and reduces the remaining number of unknowns by 1. - Use the row operation of adding a multiple of one row to another row in order to
**cancel out a term of one equation**, reducing the number of unknowns in that particular equation by one. Aim toward modifying an equation to look more like rule #1 or #2. (Once there is only one unknown left in that equation, then rule #1 applies.)

Repeat, doing one operation from this list until the system is completely solved. Some tips:

- You may work on the rows in whatever order will make the least work.
- Look for overlap.
- Look for places you can cancel terms without adding any new terms.
- You may work with the equations or with the augmented matrix form; both are equivalent, but one is more verbose and the other might make it too easy to make mistakes when working by hand.

This strategy is essentially Gaussian elimination, but we’ve *intermixed the reduction and the substitution steps* to do whichever is more convenient first, because if we’re showing our work by writing each step on paper it’s often convenient to reduce the dimensionality of our problem as early as possible.

Here’s an example of how we might solve a system by hand. Let’s start with the same system we solved using Gaussian elimination earlier:

First, we notice that we can use the 4th row to cancel out 3rd-column $y$ terms in both the 1st and 2nd equations without introducing any new nonzero terms:

The first and second row now have just one nonzero term each, so we immediately have two of our results:

We can substitute these values into the remaining two equations to produce our remaining system:

We now have a simple second equation with $\boxed{y=6}$ , which we can then plug into the first equation and find $\boxed{z=-10}$ . We now have solved all four unknowns. You can and should quickly check that these values satisfy the original four equations, and they match the Gaussian elimination solution we found previously.

Let’s try another approach. Here’s the original system of equations again:

Now we’ll notice that the 2nd row has two terms and zero constant value: it says $2 w - y = 0$ , or $y = 2 w$ . We’re going to choose to keep $w$ and remove $y$ from our set of equations by substituting $2 w$ every time we see a $y$ , and then we can remove the 2nd equation entirely from our system:

Notice that the right-hand side values are unchanged. While we haven’t actually determined the numerical value for $y$ yet, we’ve now quickly reduced the dimensionality of our problem from 4 to 3. And in doing so, the new third equation is trivially solvable for $\boxed{w=3}$ .

Now that we know $w$ , we also can remember to solve our simpliciation, $y = 2w \ \text{for} \ \boxed{y = 6}$ .

We substitute $w=3$ into the remaining two equations and remove the third equation to find:

From here, it’s clear from the 1st row that $\boxed{x=2}$ , and then when substituting that value into the second row, we find $\boxed{z=10}$ .

This is the same example system of equations we solved mechanically in the “Gaussian Elimination” section above, but by applying these solve-by-hand strategies, we have to do a lot less work (significantly fewer total operations), avoid ugly fractions, and get the answer faster!

When solving by hand, we have lots of opportunities to make arithmetic mistakes (for example, adding rather than subtracting when computing the new right-hand side after substitution), as well as record-keeping mistakes (for example, not keeping track of which column maps to which variable as we go). Fortunately, it’s relatively easy to check your solution after you compute it: just plug it into each of the equations and make sure the left side equals the right. Get in the habit of quickly checking your work after you solve systems of equations.

## What’s Next

In the next section, Steady State & Transient, we’ll consider how to separate the analysis of a system’s normal operating conditions from events that may change those conditions temporarily or permanently.

**Ultimate Electronics: Practical Circuit Design and Analysis.**CircuitLab, Inc., 2019, ultimateelectronicsbook.com. Accessed . (Copyright © 2019 CircuitLab, Inc.)