Tuesday, December 27, 2011

Uncertainty approximations

In the field of experimental science, high emphasis is placed on measuring physical variables. These measured values can then be substituted into an equation to derive useful results. A famous example of an equation requiring such measurements is Newton's Law of Gravitation. This is what it looks like:

$\large F=G\, \frac{m_1 m_2}{r^2}$

In this case, the useful result is the gravitational force F. To derive its value, measurements of the masses and the distance are necessary.

Every measurable variable has a precise real value; that is, its true value is a number with an infinite decimal expansion. The problem lies in the fact that measurable variables cannot be measured perfectly, because the instruments we use to measure them have a certain error in what they measure. As a matter of fact, the human race has yet to build a perfect instrument that measures the value of a variable to its complete extent. This type of error, inherent in common instruments, is known as uncertainty.

Uncertainty is a quantity, expressed in the units of the variable one is trying to quantify, that indicates how widely distributed the value of the variable could be about a certain central value. I'll give you an example of how one can quantify a variable while taking its uncertainty into account. Assume you are measuring some distance, and your view of the meter rule (measuring device) looks like this:

The measurement looks like it's 18.50 cm. But you simply cannot be sure of this value: it could be anywhere between 18.45 cm and 18.55 cm and would still look like 18.50 cm from the view above.

The value could also possibly be 18.43 cm or 18.58 cm, but then it wouldn't look like it does above; it would look more like 18.40 cm or 18.60 cm respectively.

So a convenient way to express the distance measured is this:

$\large (18.50 \pm 0.05) \textup{cm}$

If this distance was used in the formula for Newton's Law of Gravitation, it would lead to different possible values of the force F. This would create uncertainty in F.

Likewise, when measured variables (with their uncertainties) are substituted into an equation, they pass their uncertainty on to the theoretical (calculated) variable; that is, they create uncertainty, or possible deviations, in it. This phenomenon is known as uncertainty propagation.
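To make the idea concrete, here is a minimal sketch in Python that propagates a measured interval through the gravitation formula by simply evaluating it at the extremes. The masses and the distance below are my own illustrative values, not real data:

```python
# Minimal sketch: propagate a measured interval through a formula by
# evaluating the formula at the extremes of the interval.
# The masses and the distance below are illustrative, not real data.
G = 6.674e-11            # gravitational constant, SI units
m1, m2 = 5.0, 3.0        # assumed masses in kg
r, dr = 0.1850, 0.0005   # 18.50 cm with 0.05 cm uncertainty, in metres

def force(r):
    return G * m1 * m2 / r**2

F_mid = force(r)
F_max = force(r - dr)    # a smaller distance gives a larger force
F_min = force(r + dr)

# F now has its own spread about the central value
print(F_min, F_mid, F_max)
```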

I'm going to illustrate the idea of uncertainty propagation to you for a certain type of formula. Assume such a formula for a theoretical variable Z exists:

$\large Z=\frac{A^n B^m}{C^l}\; \; \; \, n,m,l\in \mathbb{R}^+$

A, B and C are measured variables.

A, B and C have some uncertainties to them as expressed below:

\large \begin{align*} &\textbf{A}: A \pm \Delta A \\\\ &\textbf{B}: B\pm \Delta B \\\\ &\textbf{C}: C\pm \Delta C \end{align*}

As a result of the uncertainties in A, B and C, Z develops an uncertainty too (since A, B and C determine the value of Z). Thus, Z can be expressed as shown:

$\large \textbf{Z}: Z \pm \Delta Z$

The theory, at least as far as the A level syllabus is concerned, states that the fractional uncertainty in Z can be approximated by the fractional uncertainties in A, B and C and their powers in the following way, given certain conditions:

\large \begin{align*} &\frac{\Delta Z}{Z} \approx n \frac{\Delta A}{A}+m \frac{\Delta B}{B}+ l \frac{\Delta C}{C} \\\\ &\textup{For}\:\: \Delta A\ll A,\: \Delta B\ll B,\: \Delta C\ll C \end{align*}

My aim is to derive this approximation. I assume you have a strong knowledge of the binomial theorem for all real powers and that you're open-minded about approximations, because they might look cheap or desperate in some ways.
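For a feel of how good that first-order binomial truncation is, here is a tiny Python check with an arbitrary small x of my own choosing:

```python
# Tiny check (with an arbitrary small x) of the first-order binomial
# truncation (1 + x)^n ≈ 1 + n·x used repeatedly in the derivation.
x, n = 0.01, 2.5

exact = (1 + x)**n
approx = 1 + n * x

# The error is of order x², so the two agree closely for small x
print(exact, approx)
```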

To start off, Z could have a maximum value of Z + Delta Z_p (where Delta Z_p is positive), due to the combination of uncertainties in the three measured variables.

\large \begin{align*} &Z=\frac{A^n B^m}{C^l} \\\\ &Z+\Delta Z_p=\frac{(A+\Delta A)^n (B+\Delta B)^m}{(C-\Delta C)^l} \\\\ &Z+\Delta Z_p=\frac{A^n(1+\frac{\Delta A}{A})^n\: B^m(1+\frac{\Delta B}{B})^m}{C^l(1-\frac{\Delta C}{C})^l} \end{align*}

\large \begin{align*} &\Delta Z_p=\frac{A^n(1+\frac{\Delta A}{A})^n\: B^m(1+\frac{\Delta B}{B})^m}{C^l(1-\frac{\Delta C}{C})^l}-Z \\\\ &\Delta Z_p=\frac{A^n(1+\frac{\Delta A}{A})^n\: B^m(1+\frac{\Delta B}{B})^m}{C^l(1-\frac{\Delta C}{C})^l}-\frac{A^n B^m}{C^l} \\\\ &\Delta Z_p=\frac{A^n B^m}{C^l}\left [ \frac{(1+\frac{\Delta A}{A})^n\: (1+\frac{\Delta B}{B})^m}{(1-\frac{\Delta C}{C})^l}-1 \right ] \end{align*}

\large \begin{align*} &\frac{\Delta Z_p}{Z}=\frac{\frac{A^n B^m}{C^l}}{\frac{A^n B^m}{C^l}}\left [ \frac{(1+\frac{\Delta A}{A})^n\: (1+\frac{\Delta B}{B})^m}{(1-\frac{\Delta C}{C})^l}-1 \right ] \\\\ &\frac{\Delta Z_p}{Z}=\left [ \frac{(1+\frac{\Delta A}{A})^n\: (1+\frac{\Delta B}{B})^m}{(1-\frac{\Delta C}{C})^l}-1 \right ] \\\\ \end{align*}

According to the binomial theorem:

\large \begin{align*} &\Rightarrow (1+\frac{\Delta A}{A})^n \approx (1+n\frac{\Delta A}{A}) \\\\&\Rightarrow (1+\frac{\Delta B}{B})^m \approx (1+m\frac{\Delta B}{B}) \\\\ &\Rightarrow (1-\frac{\Delta C}{C})^l\approx (1-l\frac{\Delta C}{C}) \\\\ &\therefore \frac{\Delta Z_p}{Z}\approx \left [ \frac{(1+n\frac{\Delta A}{A})\: (1+m\frac{\Delta B}{B})}{(1-l\frac{\Delta C}{C})}-1 \right ] \end{align*}

\large \begin{align*} &\frac{\Delta Z_p}{Z}\approx \left [ \frac{(1+n\frac{\Delta A}{A})\: (1+m\frac{\Delta B}{B})-(1-l\frac{\Delta C}{C})}{(1-l\frac{\Delta C}{C})} \right ] \\\\ &\frac{\Delta Z_p}{Z}\approx \left [ \frac{1+n\frac{\Delta A}{A}+m\frac{\Delta B}{B}+mn\frac{\Delta A}{A}\frac{\Delta B}{B}-1+l\frac{\Delta C}{C}}{(1-l\frac{\Delta C}{C})} \right ] \end{align*}

Another approximation:

\large \begin{align*} &\Rightarrow mn\frac{\Delta A}{A}\cdot \frac{\Delta B}{B} \approx 0 \\\\ &\textup{Since both}\, \frac{\Delta A}{A}\, \textup{and} \, \frac{\Delta B}{B} \, \textup{are small} \\\\ &\therefore\frac{\Delta Z_p}{Z}\approx \left [ \frac{n\frac{\Delta A}{A}+m\frac{\Delta B}{B}+l\frac{\Delta C}{C}}{(1-l\frac{\Delta C}{C})} \right ] \end{align*}

Another approximation:

\large \begin{align*} &\Rightarrow (1-l\frac{\Delta C}{C})\approx 1 \\\\ &\textup{Since} \,\, \frac{\Delta C}{C} \,\, \textup{is small} \\\\ &\therefore \frac{\Delta Z_p}{Z} \approx n\frac{\Delta A}{A}+m\frac{\Delta B}{B}+l\frac{\Delta C}{C} \end{align*}

This shows how the positive fractional deviation or positive fractional uncertainty in Z can be computed.

Z could also have a minimum value due to the distribution of the values of the measured variables.

\large \begin{align*} &Z=\frac{A^n B^m}{C^l} \\\\ &Z-\Delta Z_n=\frac{(A-\Delta A)^n (B-\Delta B)^m}{(C+\Delta C)^l} \\\\ &Z-\Delta Z_n=\frac{A^n(1-\frac{\Delta A}{A})^n\: B^m(1-\frac{\Delta B}{B})^m}{C^l(1+\frac{\Delta C}{C})^l} \end{align*}

\large \begin{align*} &-\Delta Z_n=\frac{A^n(1-\frac{\Delta A}{A})^n\: B^m(1-\frac{\Delta B}{B})^m}{C^l(1+\frac{\Delta C}{C})^l}-Z \\\\ &-\Delta Z_n=\frac{A^n(1-\frac{\Delta A}{A})^n\: B^m(1-\frac{\Delta B}{B})^m}{C^l(1+\frac{\Delta C}{C})^l}-\frac{A^n B^m}{C^l} \\\\ &-\Delta Z_n=\frac{A^n B^m}{C^l}\left [ \frac{(1-\frac{\Delta A}{A})^n\: (1-\frac{\Delta B}{B})^m}{(1+\frac{\Delta C}{C})^l}-1 \right ] \\\\ &-\frac{\Delta Z_n}{Z}=\left [ \frac{(1-\frac{\Delta A}{A})^n\: (1-\frac{\Delta B}{B})^m}{(1+\frac{\Delta C}{C})^l}-1 \right ] \end{align*}

According to the binomial theorem:

\large \begin{align*} &\Rightarrow (1-\frac{\Delta A}{A})^n \approx (1-n\frac{\Delta A}{A}) \\\\&\Rightarrow (1-\frac{\Delta B}{B})^m \approx (1-m\frac{\Delta B}{B}) \\\\ &\Rightarrow (1+\frac{\Delta C}{C})^l\approx (1+l\frac{\Delta C}{C}) \\\\ &\therefore -\frac{\Delta Z_n}{Z}\approx \left [ \frac{(1-n\frac{\Delta A}{A})\: (1-m\frac{\Delta B}{B})}{(1+l\frac{\Delta C}{C})}-1 \right ] \end{align*}

\large \begin{align*} &-\frac{\Delta Z_n}{Z}\approx \left [ \frac{(1-n\frac{\Delta A}{A})\: (1-m\frac{\Delta B}{B})-(1+l\frac{\Delta C}{C})}{(1+l\frac{\Delta C}{C})} \right ] \\\\ &-\frac{\Delta Z_n}{Z}\approx \left [ \frac{1-n\frac{\Delta A}{A}-m\frac{\Delta B}{B}+mn\frac{\Delta A}{A}\frac{\Delta B}{B}-1-l\frac{\Delta C}{C}}{(1+l\frac{\Delta C}{C})} \right ] \end{align*}

Another approximation:

\large \begin{align*} &\Rightarrow mn\frac{\Delta A}{A}\cdot \frac{\Delta B}{B} \approx 0 \\\\ &\textup{Since both}\, \frac{\Delta A}{A}\, \textup{and} \, \frac{\Delta B}{B} \, \textup{are small} \\\\ &\therefore-\frac{\Delta Z_n}{Z}\approx \left [ \frac{-n\frac{\Delta A}{A}-m\frac{\Delta B}{B}-l\frac{\Delta C}{C}}{(1+l\frac{\Delta C}{C})} \right ] \end{align*}

Another approximation:

\large \begin{align*} &\Rightarrow (1+l\frac{\Delta C}{C})\approx 1 \\\\ &\textup{Since} \,\, \frac{\Delta C}{C} \,\, \textup{is small} \\\\ &\therefore \left | -\frac{\Delta Z_n}{Z} \right |\approx n\frac{\Delta A}{A}+m\frac{\Delta B}{B}+l\frac{\Delta C}{C} \end{align*}

\large \begin{align*} &\left |\frac{-\Delta Z_n}{Z} \right |\approx \frac{\Delta Z_p}{Z} \\\\ &\therefore \left | -\Delta Z_n \right | \approx \Delta Z_p \end{align*}

$\large \therefore\frac{\Delta Z}{Z} \approx n \frac{\Delta A}{A}+m \frac{\Delta B}{B}+ l \frac{\Delta C}{C}$

This shows how the negative fractional deviation, or negative fractional uncertainty, in Z can be computed. It also shows that the positive and negative fractional uncertainties in Z are approximately the same, which implies that the positive and negative uncertainties themselves are the same: knowing one immediately gives you the other. Lastly, it implies that the central values of A, B and C give the central value of Z under the stated conditions, since the deviations of Z on both sides are equal.

$\large \textup{QED}$
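As a sanity check of the approximation just derived, one can compare it numerically with the exact extreme values of Z. The numbers below are my own illustrative choices with small fractional uncertainties:

```python
# Sanity check of dZ/Z ≈ n·dA/A + m·dB/B + l·dC/C for Z = A^n B^m / C^l.
# All values below are illustrative choices with small uncertainties.
A, dA = 2.0, 0.01
B, dB = 5.0, 0.02
C, dC = 4.0, 0.01
n, m, l = 2.0, 1.5, 0.5

Z = A**n * B**m / C**l

# Exact extreme values, as set up in the derivation
Z_max = (A + dA)**n * (B + dB)**m / (C - dC)**l
Z_min = (A - dA)**n * (B - dB)**m / (C + dC)**l
dZp = Z_max - Z          # positive deviation
dZn = Z - Z_min          # negative deviation

approx = n*dA/A + m*dB/B + l*dC/C

# dZp/Z, dZn/Z and the approximation should nearly coincide
print(dZp / Z, dZn / Z, approx)
```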

I was wondering whether one could extend this type of approximation to a more general expression for Z, given similar conditions on all of the measured variables:

$\large Z=\frac{A_1^{a_1}A_2^{a_2}A_3^{a_3}...A_n^{a_n}}{B_1^{b_1}B_2^{b_2}B_3^{b_3}...B_m^{b_m}}$

Which can be conveniently defined in the following way:

$\large Z=\frac{\prod_{j=1}^{n} A_j^{a_j}}{\prod_{i=1}^{m} B_i^{b_i}}$

In other words, can the fractional uncertainty in this Z take the following form, as predicted by the derivation above?

\large \begin{align*} &\frac{\Delta Z}{Z}\approx a_1 \frac{\Delta A_1}{A_1}+...+ a_n \frac{\Delta A_n}{A_n}+b_1\frac{\Delta B_1}{B_1}+...+ b_m \frac{\Delta B_m}{B_m} \\\\ &=\sum_{j=1}^{n} a_j\frac{\Delta A_j}{A_j}+\sum_{i=1}^{m} b_i\frac{\Delta B_i}{B_i} \end{align*}

Read on to find out. I'll be using the same method of analysis as I did in the derivation earlier on.

\large \begin{align*} &Z=\frac{A_1^{a_1}A_2^{a_2}A_3^{a_3}...A_n^{a_n}}{B_1^{b_1}B_2^{b_2}B_3^{b_3}...B_m^{b_m}} \\\\ &=\frac{\prod_{j=1}^{n} A_j^{a_j}}{\prod_{i=1}^{m} B_i^{b_i}} \\\\ &Z+\Delta Z_p=\frac{\prod_{j=1}^{n} (A_j+\Delta A_j)^{a_j}}{\prod_{i=1}^{m} (B_i-\Delta B_i)^{b_i}} \\\\ &Z+\Delta Z_p=\frac{\prod_{j=1}^{n} A_j(1+\frac{\Delta A_j}{A_j})^{a_j}}{\prod_{i=1}^{m} B_i(1-\frac{\Delta B_i}{B_i})^{b_i}} \end{align*}

\large \begin{align*} &Z+\Delta Z_p=\frac{\prod_{j=1}^{n} A_j}{\prod_{i=1}^{m} B_i}\cdot \frac{\prod_{j=1}^{n} (1+\frac{\Delta A_j}{A_j})^{a_j}}{\prod_{i=1}^{m} (1-\frac{\Delta B_i}{B_i})^{b_i}} \\\\ &\Delta Z_p=\frac{\prod_{j=1}^{n} A_j}{\prod_{i=1}^{m} B_i}\cdot \frac{\prod_{j=1}^{n} (1+\frac{\Delta A_j}{A_j})^{a_j}}{\prod_{i=1}^{m} (1-\frac{\Delta B_i}{B_i})^{b_i}}-\frac{\prod_{j=1}^{n} A_j}{\prod_{i=1}^{m} B_i} \\\\ &=\frac{\prod_{j=1}^{n} A_j}{\prod_{i=1}^{m} B_i} \left [\frac{\prod_{j=1}^{n} (1+\frac{\Delta A_j}{A_j})^{a_j}}{\prod_{i=1}^{m} (1-\frac{\Delta B_i}{B_i})^{b_i}}-1 \right ] \\\\ &\frac{\Delta Z_p}{Z}=\frac{\prod_{j=1}^{n} (1+\frac{\Delta A_j}{A_j})^{a_j}}{\prod_{i=1}^{m} (1-\frac{\Delta B_i}{B_i})^{b_i}}-1 \end{align*}

\large \begin{align*} &\frac{\Delta Z_p}{Z}=\frac{\prod_{j=1}^{n} (1+\frac{\Delta A_j}{A_j})^{a_j}}{\prod_{i=1}^{m} (1-\frac{\Delta B_i}{B_i})^{b_i}}-1 \\\\ &=\frac{\prod_{j=1}^{n} (1+\frac{\Delta A_j}{A_j})^{a_j}-\prod_{i=1}^{m} (1-\frac{\Delta B_i}{B_i})^{b_i}}{\prod_{i=1}^{m} (1-\frac{\Delta B_i}{B_i})^{b_i}} \end{align*}

According to the binomial theorem:

\large \begin{align*} &\Rightarrow (1+\frac{\Delta A_j}{A_j})^{a_j}\approx(1+a_j\frac{\Delta A_j}{A_j}) \\\\ &\Rightarrow (1-\frac{\Delta B_i}{B_i})^{b_i}\approx(1-b_i\frac{\Delta B_i}{B_i}) \\\\ &\therefore \frac{\Delta Z_p}{Z}\approx \frac{\prod_{j=1}^{n} (1+a_j\frac{\Delta A_j}{A_j})-\prod_{i=1}^{m} (1-b_i\frac{\Delta B_i}{B_i})}{\prod_{i=1}^{m} (1-b_i\frac{\Delta B_i}{B_i})} \end{align*}

Another approximation:

\large \begin{align*} &\Rightarrow (1-b_i\frac{\Delta B_i}{B_i}) \approx 1 \\\\ &\Rightarrow {\prod_{i=1}^{m} (1-b_i\frac{\Delta B_i}{B_i})} \approx 1 \\\\ &\textup{Since} \, \, \frac{\Delta B_i}{B_i}\, \, \textup{is small} \\\\ &\therefore \frac{\Delta Z_p}{Z}\approx \prod_{j=1}^{n} (1+a_j\frac{\Delta A_j}{A_j})-\prod_{i=1}^{m} (1-b_i\frac{\Delta B_i}{B_i}) \end{align*}

Another approximation. This one can be worked out algebraically, but the full expansion is very lengthy, so I will omit it here. The idea is that when each product is expanded, every term beyond the first-order ones contains a product of two or more small fractional uncertainties, and is therefore negligible:

\large \begin{align*} &\Rightarrow \prod_{j=1}^{n} (1+a_j\frac{\Delta A_j}{A_j}) \approx 1+\sum_{j=1}^{n} a_j\frac{\Delta A_j}{A_j} \\\\ &\Rightarrow \prod_{i=1}^{m} (1-b_i\frac{\Delta B_i}{B_i}) \approx 1- \sum_{i=1}^{m} b_i\frac{\Delta B_i}{B_i} \\\\ &\textup{Since} \, \, \frac{\Delta A_j}{A_j}\, \, \textup{and}\, \, \frac{\Delta B_i}{B_i}\, \, \textup{are small} \\\\ &\therefore \frac{\Delta Z_p}{Z}\approx 1+\sum_{j=1}^{n} a_j\frac{\Delta A_j}{A_j}-1+\sum_{i=1}^{m} b_i\frac{\Delta B_i}{B_i} \\\\ &\frac{\Delta Z_p}{Z}\approx \sum_{j=1}^{n} a_j\frac{\Delta A_j}{A_j}+\sum_{i=1}^{m} b_i\frac{\Delta B_i}{B_i} \end{align*}

Wow! An exact match for the prediction!

But of course, I only calculated the positive fractional uncertainty. You can try working out the negative fractional uncertainty. The modulus of the negative value should be exactly the same as the positive one.
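Here is a sketch in Python of this general case, with made-up variables, powers and uncertainties, comparing the exact maximum deviation against the summed approximation:

```python
# Sketch of the general case Z = Π A_j^{a_j} / Π B_i^{b_i}.
# All variables, powers and uncertainties below are made up.
from math import prod

A  = [2.0, 3.0, 1.5]       # numerator variables A_j
dA = [0.01, 0.01, 0.005]   # their uncertainties
a  = [2.0, 1.0, 0.5]       # powers a_j
B  = [4.0, 5.0]            # denominator variables B_i
dB = [0.02, 0.01]          # their uncertainties
b  = [1.0, 2.0]            # powers b_i

Z = prod(Aj**aj for Aj, aj in zip(A, a)) / prod(Bi**bi for Bi, bi in zip(B, b))

# Exact maximum: numerator variables at their highest,
# denominator variables at their lowest
Z_max = prod((Aj + dAj)**aj for Aj, dAj, aj in zip(A, dA, a)) \
      / prod((Bi - dBi)**bi for Bi, dBi, bi in zip(B, dB, b))

approx = sum(aj*dAj/Aj for Aj, dAj, aj in zip(A, dA, a)) \
       + sum(bi*dBi/Bi for Bi, dBi, bi in zip(B, dB, b))

# The exact fractional deviation and the summed approximation should agree
print((Z_max - Z) / Z, approx)
```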

Picture sources:
http://iruler.net/

LaTeX source:
http://www.codecogs.com/latex/eqneditor.php

Thursday, December 22, 2011

This year, I went through a number of life-changing experiences, some good and some bad. One bad one was picking up the habit of smoking.

I wasn't a social smoker. In fact, I smoked when I was alone and feeling depressed. This started back in August/September, and I soon developed a hardy addiction.

Smoking damaged my health and my soul. My habit made me a slave to tobacco. Because of the damage delivered directly to my lungs, I could no longer run long distances. Without this ability to run, I felt incomplete. Worst of all, I became someone who I never thought I could become. This caused me great sadness.

I realized then that I had to quit for the sake of my health, my family and my self-esteem. With sheer will and the support of my family, I managed to quit by going cold turkey. It's now been 3 months since I quit, and it feels amazing!

From my experience with smoking, I realized that overcoming an addiction is similar to snapping a rubber band:

When I initially tried to quit by going cold turkey, I was burdened by withdrawal symptoms. I had persistent headaches and felt nauseous at times. I also had trouble breathing. But the worst part of withdrawal was the cravings, which made it very difficult for me to focus on important tasks, like working. This situation is similar to the pain felt in one's fingers when trying to stretch and snap a rubber band.

Sometimes when the cravings got really bad, I gave in and smoked again. At these periods of relapse, I smoked so excessively that I fell sick. This situation is very similar to the painful smack you get when you let go of a stretched rubber band.

In the end, I had to endure 3.5 weeks of nonstop cravings to get over my addiction. It was a very painful experience. But after those few weeks, things quickly became very smooth and positive. This situation is similar to continually stretching a rubber band till it finally snaps. I'm happy to say my rubber band finally broke!

What made my addiction so difficult to overcome was my inability to remember the purpose of quitting. This perceived lack of purpose caused many relapses. What I realized was that when you feel down and hopeless, you must remind yourself why you are quitting. You must remember the loved ones at stake and envision the noble individual you want to become.

Picture source: http://www.kokev.in/2011/10/31/struggling-with-focus/

Tuesday, December 20, 2011

Saving curved distance

Today I was walking towards a friend's place, and I was really exhausted because the walk there was quite long. I happened to reach a curved road just before his place. The plot of the road looked something like this:

Sorry for all the extra notations. I will explain.

I happened to be at position A, and needed to get to position B, which was his house. As I said earlier, I was really tired and wanted to walk the least distance to get to his place. I was also feeling creative. There happened to be pedestrian crossings at AC and BD too.

I could possibly walk from A to B. Alternatively, I could cross from A to C, walk to D, and then cross from D to B. I was thinking, which path would be shorter, AB or ACDB? If you assume the paths are perfectly circular, you can come up with some interesting results!

Here's the mathy part of it.

\large \begin{align*} &(1) \\\\ &s_{AB}=\frac{\pi}{180}\cdot \theta \cdot r_2 \\\\ &s_{ACDB}=\frac{\pi}{180}\cdot \theta \cdot r_1+ 2(r_2-r_1) \\\\ &=\frac{\pi}{180}\cdot \theta \cdot r_1+ 2(\Delta r) \\\\ &\textup{Where}\: \, \theta \:\, \textup{is in degrees} \end{align*}

\large \begin{align*} &(2) \\\\ &s_{ACDB}-s_{AB} \\\\ &=\frac{\pi}{180}\cdot \theta \cdot r_1+ 2(\Delta r)-\frac{\pi}{180}\cdot \theta \cdot r_2 \\\\ &=2(\Delta r)-\frac{\pi}{180} \cdot \theta \cdot (\Delta r) \\\\ &=\Delta r\: \left (2-\frac{\pi}{180} \cdot \theta \right ) \end{align*}

To find the condition for which ACDB is shorter than AB:

\large \begin{align*} &(3) \\\\ &s_{ACDB}-s_{AB}< 0 \\\\ &\therefore \Delta r\: \left (2-\frac{\pi}{180} \cdot \theta \right )< 0 \\\\ &\frac{\pi}{180} \cdot \theta-2> 0\: \: \: \: \textup{Since}\: \Delta r>0 \\\\ &\theta> \frac{360}{\pi} \\\\ &\theta> 115^{\circ}\, \textup{approximately} \end{align*}

What this result shows is that if my friend's place was located at an angle of rotation of more than about 115 degrees from A, it would be more economical (distance-wise) to travel via ACDB. If you are faced with a similar problem, check whether the angle of rotation covers roughly a third of a full circle. If it does, a path similar to ACDB should be taken.

This idea also holds for any value of r1 and r2, as shown by the proof above. But of course, if r1 and r2 are small, then the distance saved would be quite small too. To some, this distance saved might be so small that it wouldn't matter which path is taken. But for larger scale problems, where r1 and r2 are huge, taking the path ACDB might just save you a lot of distance, provided that the angle of rotation is greater than 115 degrees and that the paths are perfectly or reasonably circular.
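Under the stated assumption of perfectly circular paths, the two routes can be compared with a short Python script; the radii and angles below are made-up values:

```python
# Compare the two paths on concentric circular arcs.
# The radii and angles are illustrative.
import math

def s_AB(theta_deg, r2):
    # stay on the outer arc the whole way round
    return math.pi / 180 * theta_deg * r2

def s_ACDB(theta_deg, r1, r2):
    # cross to the inner arc, walk it, then cross back
    return math.pi / 180 * theta_deg * r1 + 2 * (r2 - r1)

theta_crit = 360 / math.pi   # threshold angle, about 114.6 degrees
print(theta_crit)

r1, r2 = 30.0, 50.0
print(s_ACDB(100, r1, r2) > s_AB(100, r2))   # below the threshold: ACDB is longer
print(s_ACDB(130, r1, r2) < s_AB(130, r2))   # above the threshold: ACDB is shorter
```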

As to what really happened, I took the longer path and, being the nerd I am, drained even more of my energy poring over this minute but interesting problem.

LaTeX codes source: http://www.codecogs.com/latex/eqneditor.php

None of the above

Sin
God's satiric grin

Escape
The ignoble path

Abuse
Modern life

Happiness
Hope and none of the above.

Sunday, December 18, 2011

Derivation of the cross product formula

This post is a continuation of the earlier post on the dot product. In this post, I will be deriving the vector product or cross product formula from basic principles, and I will also be elaborating on the logical basis of the derivation.

The cross product formula is a very famous one. It is commonly used to find a vector that is orthogonal to two other vectors, which are fed into the formula, and also to find the areas of triangles or parallelograms formed by those two vectors when they are joined tail to tail.

This is what it looks like:

\large \begin{align*} &\textup{For}\, \, \, \mathbf{a}=\begin{pmatrix} a_1 \\ a_2\\ a_3 \end{pmatrix} \, \textup{and}\; \; \mathbf{b}=\begin{pmatrix} b_1 \\ b_2\\ b_3 \end{pmatrix} \\\\ &\mathbf{a}\times \mathbf{b}=\left \| \mathbf{a} \right \|\cdot\left \| \mathbf{b} \right \|\sin\theta\: \: \mathbf{\hat{n}}=\begin{pmatrix} a_2b_3-a_3b_2\\ a_3b_1-a_1b_3 \\a_1b_2-a_2b_1 \end{pmatrix} \end{align*}

Where a and b are nonzero, non-parallel vectors, and n (cap) is a unit vector orthogonal to both a and b.

The direction of a x b is given by the left-hand rule (Note: it can also be given by the right-hand rule, which is the more standard convention, but I'll be explaining a possible reason later on as to why the left-hand rule is preferred here).

To start off the derivation, assume two vectors a and b exist, and so does another vector n, that is orthogonal to both a and b:

$\large \mathbf{a}=\begin{pmatrix} a_1 \\ a_2\\ a_3 \end{pmatrix} \; \; \mathbf{b}=\begin{pmatrix} b_1 \\ b_2\\ b_3 \end{pmatrix} \; \mathbf{n}=\begin{pmatrix} n_1 \\ n_2\\ n_3 \end{pmatrix}$

Our aim is to find the components of n, in terms of the components of a and b. Since n is orthogonal to both a and b, these conditions must apply:

\large \begin{align*} &\mathbf{a}\cdot\mathbf{n}=0 \\ &\mathbf{b}\cdot\mathbf{n}=0 \\ \\&(1)\: a_1n_1+a_2n_2+a_3n_3=0 \\ &(2)\: \, b_1n_1+b_2n_2+b_3n_3=0 \end{align*}

Another independent equation in the components of n is necessary to solve this system of three unknowns n1, n2 and n3. Notice that we haven't specified the magnitude of n. For simplicity, let the magnitude of n be 1, i.e. unity.

\large \begin{align*} &\left \| \mathbf{n} \right \|=1 \\\\&\sqrt{n_1^2+n_2^2+n_3^2}=1\\ \\\therefore \; &(3)\, \, \, n_1^2+n_2^2+n_3^2=1 \end{align*}

\large \begin{align*} &(1)\: a_1n_1+a_2n_2+a_3n_3=0\\ &(2)\: \, b_1n_1+b_2n_2+b_3n_3=0\\ &(3)\, \, \, n_1^2+n_2^2+n_3^2=1 \end{align*}

This now gives us 3 simultaneous equations to solve for the 3 unknowns n1, n2 and n3. Solving this sort of system is not easy, which is partly why the field of linear algebra was developed: to make our lives easier. But I'm not going to use the machinery of linear algebra; I'm going to work from basic algebraic principles. This is going to be long!

\large \begin{align*} &(1)\\ \\&a_1n_1+a_2n_2+a_3n_3=0 \\\\&a_1n_1=-a_2n_2-a_3n_3 \\\\&n_1=-\frac{a_2}{a_1}n_2-\frac{a_3}{a_1}n_3 \end{align*}

\large \begin{align*} &(2)\\ \\&b_1n_1+b_2n_2+b_3n_3=0\\ \\&b_1\left (-\frac{a_2}{a_1}n_2-\frac{a_3}{a_1}n_3 \right )+b_2n_2+b_3n_3=0 \\\\ &n_2\left (\frac{a_2b_1}{a_1}-b_2 \right )=n_3\left ( b_3-\frac{b_1a_3}{a_1} \right ) \end{align*}

\large \begin{align*} &n_3=n_2\left ( \frac{\frac{b_1a_2}{a_1}-b_2}{b_3-\frac{b_1a_3}{a_1}} \right ) \\\\&=n_2\left (\frac{b_1a_2-b_2a_1}{b_3a_1-a_3b_1} \right ) \end{align*}

\large \begin{align*} &\textup{Back to} \: \: (1)\\\\ &n_1=-\frac{a_2}{a_1}n_2-\frac{a_3}{a_1}n_3 \\\\&=-\frac{a_2}{a_1}n_2-\frac{a_3}{a_1}{n_2}\left ( \frac{b_1a_2-b_2a_1}{b_3a_1-a_3b_1} \right ) \\\\&=-\frac{n_2}{a_1}\left [a_2+a_3\left (\frac{b_1a_2-b_2a_1}{b_3a_1-a_3b_1} \right ) \right ] \\\\&=-\frac{n_2}{a_1}\left [\frac{a_2(b_3a_1-a_3b_1)+a_3(b_1a_2-b_2a_1)}{b_3a_1-a_3b_1}\right ] \end{align*}

\large \begin{align*} &=-\frac{n_2}{a_1}\left ( \frac{a_1a_2b_3-b_1a_2a_3+b_1a_2a_3-a_1b_2a_3}{b_3a_1-a_3b_1} \right )\\\\ &=-\frac{n_2}{a_1}\left ( \frac{a_1a_2b_3-a_1b_2a_3}{b_3a_1-a_3b_1} \right ) \end{align*}

\large \begin{align*} &=-\frac{n_2}{a_1}\left ( \frac{a_1a_2b_3-a_1b_2a_3}{b_3a_1-a_3b_1} \right ) \\\\ &=-n_2\left ( \frac{a_2b_3-b_2a_3}{b_3a_1-a_3b_1} \right )\\\\ &n_1=n_2\left ( \frac{b_2a_3-a_2b_3}{b_3a_1-a_3b_1} \right ) \end{align*}

\large \begin{align*} &(3)\\\\ &n_1^2+n_2^2+n_3^2=1 \\\\ &\left[n_2\left ( \frac{b_2a_3-a_2b_3}{b_3a_1-a_3b_1} \right )\right]^2+n_2^2+\left[n_2\left (\frac{b_1a_2-b_2a_1}{b_3a_1-a_3b_1} \right )\right]^2=1 \\\\ &n_2^2\left[\left ( \frac{b_2a_3-a_2b_3}{b_3a_1-a_3b_1} \right )^2+1+\left (\frac{b_1a_2-b_2a_1}{b_3a_1-a_3b_1} \right )^2\right] =1 \end{align*}

\large \begin{align*} &n_2^2\left[\frac{(b_2a_3-a_2b_3)^2+(b_3a_1-a_3b_1)^2+(b_1a_2-b_2a_1)^2}{(b_3a_1-a_3b_1)^2}\right] =1 \\\\ &\textup{Let}\: \: \left[ (b_2a_3-a_2b_3)^2+(b_3a_1-a_3b_1)^2+(b_1a_2-b_2a_1)^2\right]\: \: \textup{be} \: \: Z \\\\& n_2^2\left[\frac{Z}{(b_3a_1-a_3b_1)^2}\right] =1 \end{align*}

\large \begin{align*} &n_2^2=\left[\frac{(b_3a_1-a_3b_1)^2}{Z}\right]\\\\ &n_2= \pm \frac{b_3a_1-a_3b_1}{\sqrt{Z}}\\\\ &\textup{From} \: \: (2)\\\\ &n_3=n_2\left (\frac{b_1a_2-b_2a_1}{b_3a_1-a_3b_1} \right ) \\\\ &=\pm \frac{b_3a_1-a_3b_1}{\sqrt{Z}}\cdot\left (\frac{b_1a_2-b_2a_1}{b_3a_1-a_3b_1} \right )=\pm \frac{b_1a_2-b_2a_1}{\sqrt{Z}} \end{align*}

\large \begin{align*} &\textup{From "Back to (1)"} \\\\ &n_1=n_2\left ( \frac{b_2a_3-a_2b_3}{b_3a_1-a_3b_1} \right ) \\\\ &=\pm \frac{b_3a_1-a_3b_1}{\sqrt{Z}}\cdot\left ( \frac{b_2a_3-a_2b_3}{b_3a_1-a_3b_1} \right )=\pm\frac{b_2a_3-a_2b_3}{\sqrt{Z}} \end{align*}

Now we've found n1, n2 and n3 in terms of the components of a and b.

\large \begin{align*} &\therefore\mathbf{\hat{n}}=\begin{pmatrix} n_1 \\ n_2\\ n_3 \end{pmatrix} \\\\&=\pm \begin{pmatrix} \frac{b_2a_3-a_2b_3}{\sqrt{Z}}\\ \frac{b_3a_1-a_3b_1}{\sqrt{Z}} \\ \frac{b_1a_2-b_2a_1}{\sqrt{Z}} \end{pmatrix} \\\\ &=\frac{\pm 1}{\sqrt{Z}}\begin{pmatrix} b_2a_3-a_2b_3\\ b_3a_1-a_3b_1 \\ b_1a_2-b_2a_1 \end{pmatrix} \end{align*}

Although we have succeeded in expressing n (cap) in terms of the components of a and b, we have arrived at a couple of problems. Firstly, we are stuck with an ugly and long square root Z term. This is a problem because we want our formula to be neat and easy to compute for future use. Elegance is an important part of Mathematics, but this formula is unfortunately not very elegant because of the square root Z term. Secondly, we are stuck with plus or minus signs. This arises because the vector n (cap) orthogonal to a and b can be pointing in one direction, or a direction that is anti-parallel to it. Which sign, and thus direction, do we choose? I will be solving both of these problems one at a time, to finally arrive at the definition of the cross product, as intended.

Firstly, we need to get rid of the square root Z term by understanding what it is made up of.

By multiplying the n (cap) vector by the square root of Z, we get rid of the ugliness of Z on the right-hand side. The vector orthogonal to a and b no longer has a magnitude of 1; it now has a magnitude of the square root of Z. To make things elegant, we have to find a way to break Z down into smaller components.

$\large \sqrt{Z}\cdot\mathbf{\hat{n}}=\pm\begin{pmatrix} b_2a_3-a_2b_3\\ b_3a_1-a_3b_1 \\ b_1a_2-b_2a_1 \end{pmatrix}$

As said earlier,

\large \begin{align*} &Z=(b_2a_3-a_2b_3)^2+(b_3a_1-a_3b_1)^2+(b_1a_2-b_2a_1)^2\\\\ &=(b_2a_3)^2-2(b_2a_3)(a_2b_3)+(a_2b_3)^2\\\\&+(b_3a_1)^2-2(b_3a_1)(a_3b_1)+(a_3b_1)^2\\\\&+(b_1a_2)^2-2(b_1a_2)(a_1b_2)+(a_1b_2)^2 \end{align*}

\large \begin{align*} &Z=(b_2a_3)^2+(a_2b_3)^2+(b_3a_1)^2+(a_3b_1)^2+(b_1a_2)^2+(a_1b_2)^2 \\\\ &-2(a_2b_2a_3b_3+a_1b_1a_3b_3+a_1b_1a_2b_2) \end{align*}

Notice that Z comprises some common components like a2b2a3b3 and a1b1a3b3. These terms are also common in the dot product formula. Introducing information from the dot product formula might help us in understanding the structure of Z.

\large \begin{align*} &\left \| \textbf{a} \right \|\left \| \textbf{b} \right \|\cdot\cos \theta=a_1b_1+a_2b_2+a_3b_3\\\\ &(a_1^2+a_2^2+a_3^2)(b_1^2+b_2^2+b_3^2)\cdot \cos ^2\theta=(a_1b_1+a_2b_2+a_3b_3)^2 \end{align*}
_______________________________________________________________

\large \begin{align*} &\textup{RHS}:\\\\ &(a_1b_1+a_2b_2+a_3b_3)^2\\\\ &=a_1^2b_1^2+a_1b_1a_2b_2+a_1b_1a_3b_3\\\\&+a_2^2b_2^2+a_1b_1a_2b_2+a_2b_2a_3b_3\\\\&+a_3^2b_3^2+a_1b_1a_3b_3+a_2b_2a_3b_3\\\\ &=a_1^2b_1^2+a_2^2b_2^2+a_3^2b_3^2+2(a_1b_1a_2b_2+a_1b_1a_3b_3+a_2b_2a_3b_3) \end{align*}
_______________________________________________________________

\large \begin{align*} &\therefore(a_1^2+a_2^2+a_3^2)(b_1^2+b_2^2+b_3^2)\cdot \cos ^2\theta \\\\ &=a_1^2b_1^2+a_2^2b_2^2+a_3^2b_3^2+2(a_1b_1a_2b_2+a_1b_1a_3b_3+a_2b_2a_3b_3) \\\\\\\\ &\Rightarrow -(a_1^2+a_2^2+a_3^2)(b_1^2+b_2^2+b_3^2)\cdot \cos ^2\theta \\\\&=-a_1^2b_1^2-a_2^2b_2^2-a_3^2b_3^2-2(a_1b_1a_2b_2+a_1b_1a_3b_3+a_2b_2a_3b_3) \end{align*}

\large \begin{align*} &\Rightarrow -(a_1^2+a_2^2+a_3^2)(b_1^2+b_2^2+b_3^2)\cdot \cos ^2\theta+a_1^2b_1^2+a_2^2b_2^2+a_3^2b_3^2\\\\ &+(b_2a_3)^2+(b_3a_1)^2+(b_1a_2)^2+(a_2b_3)^2+(a_3b_1)^2+(a_1b_2)^2 \\\\&=-a_1^2b_1^2-a_2^2b_2^2-a_3^2b_3^2-2(a_1b_1a_2b_2+a_1b_1a_3b_3+a_2b_2a_3b_3) \\\\ &+a_1^2b_1^2+a_2^2b_2^2+a_3^2b_3^2+(b_2a_3)^2+(b_3a_1)^2+(b_1a_2)^2\\\\&+(a_2b_3)^2+(a_3b_1)^2+(a_1b_2)^2 \end{align*}

\large \begin{align*} &\therefore a_1^2b_1^2+a_2^2b_2^2+a_3^2b_3^2+(b_2a_3)^2+(b_3a_1)^2+(b_1a_2)^2\\\\ &+(a_2b_3)^2+(a_3b_1)^2+(a_1b_2)^2\\\\&-(a_1^2+a_2^2+a_3^2)(b_1^2+b_2^2+b_3^2)\cdot \cos ^2\theta\\\\ &=(b_2a_3)^2+(b_3a_1)^2+(b_1a_2)^2+(a_2b_3)^2+(a_3b_1)^2+(a_1b_2)^2\\\\ &-2(a_1b_1a_2b_2+a_1b_1a_3b_3+a_2b_2a_3b_3)=Z \end{align*}

\large \begin{align*} &Z=a_1^2b_1^2+a_2^2b_2^2+a_3^2b_3^2+(b_2a_3)^2+(b_3a_1)^2+(b_1a_2)^2\\\\ &+(a_2b_3)^2+(a_3b_1)^2+(a_1b_2)^2\\\\&-(a_1^2+a_2^2+a_3^2)(b_1^2+b_2^2+b_3^2)\cdot \cos ^2\theta\\\\ &=a_1^2b_1^2+a_2^2b_2^2+a_3^2b_3^2+(b_2a_3)^2+(b_3a_1)^2+(b_1a_2)^2\\\\ &+(a_2b_3)^2+(a_3b_1)^2+(a_1b_2)^2\\\\ &-[a_1^2b_1^2+a_2^2b_2^2+a_3^2b_3^2+(b_2a_3)^2+(b_3a_1)^2+(b_1a_2)^2\\\\ &+(a_2b_3)^2+(a_3b_1)^2+(a_1b_2)^2]\cdot \cos ^2\theta \end{align*}

\large \begin{align*} &\textup{Since} \left \|\textbf{a} \right \|^2\left \| \textbf{b} \right \|^2\\\\ &=a_1^2b_1^2+a_2^2b_2^2+a_3^2b_3^2+(b_2a_3)^2+(b_3a_1)^2+(b_1a_2)^2\\\\ &+(a_2b_3)^2+(a_3b_1)^2+(a_1b_2)^2 \\\\\\ &\therefore Z=\left \|\textbf{a} \right \|^2\left \| \textbf{b} \right \|^2-\left \|\textbf{a} \right \|^2\left \| \textbf{b} \right \|^2\cdot \cos^2\theta \\\\ &=\left \|\textbf{a} \right \|^2\left \| \textbf{b} \right \|^2(1-\cos^2 \theta) \end{align*}

\large \begin{align*} &\therefore \sqrt{Z}=\pm\left \|\textbf{a} \right \|\left \| \textbf{b} \right \|\sqrt{(1-\cos^2 \theta)} =\pm\left \|\textbf{a} \right \|\left \| \textbf{b} \right \|\sqrt{\sin^2 \theta}\\\\ &=\pm\left \|\textbf{a} \right \|\left \| \textbf{b} \right \|\sin \theta \\\\ &=\left \|\textbf{a} \right \|\left \| \textbf{b} \right \|\sin \theta\\\\ &\textup{Since}\, \, \theta \in [0^{\circ},180^{\circ}],\, \, \textup{so}\, \, \sin \theta \geq 0 \end{align*}

\large \begin{align*} &\therefore\sqrt{Z}\cdot\mathbf{\hat{n}}\\&= \left \|\textbf{a} \right \|\left \| \textbf{b} \right \|\sin \theta\cdot\mathbf{\hat{n}} = \pm\begin{pmatrix} b_2a_3-a_2b_3\\ b_3a_1-a_3b_1 \\ b_1a_2-b_2a_1 \end{pmatrix} \\\\ \end{align*}

Now we have figured out the composition of square root Z! The next problem is understanding which sign to give the orthogonal vector (cross product vector). If no particular sign is chosen, these two cross product vectors n1 and n2 are possible:

Since the left-hand rule convention is adopted (as defined at the start and also shown above), the vector we are looking for is n1. Now we have to figure out which sign, positive or negative, produces n1.

To do that, we firstly have to assume once again that two different vectors a and b exist. We need to apply a mathematical trick here. Assume these two vectors are free vectors. That means that we can move them around without affecting them or their components. We can place them at a common point such that both diverge outwards. Now, we transform the co-ordinate system that governs them, from i,j and k to a more convenient co-ordinate system of i',j' and k', such that a and b now become a' and b' in the new co-ordinate system and now lie in the plane i'j'. The two vectors should look like this:

Such that,

$\large \textbf{a}'=\begin{pmatrix}a'_1 \\ a'_2 \\ 0 \end{pmatrix} \: \: \: \textbf{b}'=\begin{pmatrix}b'_1 \\ b'_2 \\ 0 \end{pmatrix}$

Using this information, I will test out the possible cross products with the positive sign and the negative sign, to determine which sign gives the cross product vector that follows the left-hand rule (i.e., gives a positive k' component).

\large \begin{align*} &\left ( \left \| \mathbf{a'} \right \|\left \| \mathbf{b'} \right \| \sin\theta\: \mathbf{\hat{n}} \right )_+ = \begin{pmatrix} 0 \\ 0 \\ b_1'a_2'-b_2'a_1' \end{pmatrix} \\ &\left ( \left \| \mathbf{a'} \right \|\left \| \mathbf{b'} \right \| \sin\theta\: \mathbf{\hat{n}} \right )_- = \begin{pmatrix} 0 \\ 0 \\ b_2'a_1'-b_1'a_2' \end{pmatrix} \end{align*}

Note: The positive and negative subscripts on the brackets indicate which sign was used in the computation.

Since both a' and b' point along the positive i' axis, while a' has a negative j' component and b' has a positive j' component:

\large \begin{align*} &a'_1\, ,\, b'_1>0 \\ &a'_2<0\, ,\, b'_2>0 \\ \therefore\; &b'_1a'_2-b'_2a'_1<0 \\ &b'_2a'_1-b'_1a'_2>0 \end{align*}

\large \begin{align*} \left ( \left \| \mathbf{a'} \right \|\left \| \mathbf{b'} \right \| \sin\theta\: \mathbf{\hat{n}} \right )_+ &= \begin{pmatrix} 0 \\ 0 \\ b_1'a_2'-b_2'a_1' \end{pmatrix} \quad (\textup{negative } k' \textup{ component}) \\ \left ( \left \| \mathbf{a'} \right \|\left \| \mathbf{b'} \right \| \sin\theta\: \mathbf{\hat{n}} \right )_- &= \begin{pmatrix} 0 \\ 0 \\ b_2'a_1'-b_1'a_2' \end{pmatrix} \quad (\textup{positive } k' \textup{ component}) \end{align*}

Thus, it is the negative sign in the cross product formula that produces an orthogonal vector that obeys the left-hand rule, as proven above for arbitrary vectors a and b.
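The sign test can also be replayed with concrete numbers. The sketch below (my own illustration, not from the post) picks a planar pair with a₂' < 0 and b₂' > 0, as in the figure, and confirms that only the negative-sign formula yields a positive k' component:

```python
# Planar vectors matching the figure: both point along positive i',
# a' dips below the i' axis (a2' < 0), b' rises above it (b2' > 0).
a1, a2 = 2.0, -1.0   # a' = (a1', a2', 0)
b1, b2 = 3.0, 1.5    # b' = (b1', b2', 0)

k_plus = b1 * a2 - b2 * a1    # k' component using the positive sign
k_minus = b2 * a1 - b1 * a2   # k' component using the negative sign

assert k_plus < 0    # the positive sign violates the convention
assert k_minus > 0   # the negative sign gives the positive k' component
```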

Summing everything up:
\large \begin{align*} \textbf{a}\times\textbf{b} &\overset{\mathrm{def}}{=} \left \| \textbf{a} \right \|\left \| \textbf{b} \right \|\sin\theta\cdot\mathbf{\hat{n}} = \begin{pmatrix} -(b_2a_3-a_2b_3) \\ -(b_3a_1-a_3b_1) \\ -(b_1a_2-b_2a_1) \end{pmatrix} \\ &= \begin{pmatrix} a_2b_3-a_3b_2 \\ a_3b_1-a_1b_3 \\ a_1b_2-a_2b_1 \end{pmatrix} \\ &\qquad \textup{QED} \end{align*}
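As a final check (again a sketch of my own, assuming NumPy; not part of the original derivation), the component formula just derived should give a vector orthogonal to both a and b, with magnitude ||a||||b||sinθ:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, -1.0, 2.0])

# Component formula from the final step of the derivation
cross = np.array([a[1]*b[2] - a[2]*b[1],
                  a[2]*b[0] - a[0]*b[2],
                  a[0]*b[1] - a[1]*b[0]])

# Orthogonal to both original vectors
assert abs(np.dot(cross, a)) < 1e-9
assert abs(np.dot(cross, b)) < 1e-9

# Magnitude equals ||a|| ||b|| sin(theta)
na, nb = np.linalg.norm(a), np.linalg.norm(b)
theta = np.arccos(np.dot(a, b) / (na * nb))
assert abs(np.linalg.norm(cross) - na * nb * np.sin(theta)) < 1e-9
```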

That was really long!

Now for something softer... Why do you think the left-hand rule is sometimes preferred over the right-hand rule?

Well, it could possibly be that the person who coined the left-hand rule was actually right-handed, and realized that it would be more efficient to choose the left-hand rule, since he could then look at his left hand to analyze vectors and write with his right hand at the same time.

Picture source:
http://www.diracdelta.co.uk/science/source/f/l/flemings%20left%20hand%20rule/source.html

LaTeX codes:
http://www.codecogs.com/latex/eqneditor.php

Friday, December 16, 2011

Derivation of the dot product formula

There is a very famous formula in the field of vectors known as the dot product formula or the scalar product formula. It is used widely to find the angle between two vectors, or to find the length of projection of one vector onto another. This is what the formula looks like:

\large \begin{align*} \vec{a}=\begin{pmatrix} a_{1} \\ a_{2} \\ a_{3} \end{pmatrix} \quad \vec{b}=\begin{pmatrix} b_{1} \\ b_{2} \\ b_{3} \end{pmatrix} \\ \vec{a}\cdot\vec{b}=a_1b_1+a_2b_2+a_3b_3=\left \| \vec{a} \right \|\cdot\left \| \vec{b} \right \|\cos\theta \end{align*}

My aim is to derive this formula from basic geometric principles. I will also give you the rationale behind these derivations: that is, the initial ideas that could have occurred to the pioneers of vectors while they were inventing these formulas. Here we go.

The rationale for this formula goes as follows. Two vectors a and b exist in the manner shown below. How can I find the angle between these two vectors? (Note: this refers to the angle between the vectors such that they are diverging outwards together or converging inwards together.) Is there some function of the components of the vectors that would allow me to deduce this angle? Now that this logical basis for the derivation is set up, here is how these questions can be answered and how the formula can be derived:

(Note: The actual pioneers may not have done this. I am just providing a logical path for people to follow and to ultimately understand where these formulas came from.)

$\large \vec{a}= \begin{pmatrix} a_{1} \\ a_{2} \\ a_{3} \end{pmatrix} \; \; \vec{b}=\begin{pmatrix} b_{1} \\ b_{2} \\ b_{3} \end{pmatrix} \: \Rightarrow\: \vec{a}+\vec{b}=\begin{pmatrix} a_{1}+b_{1} \\ a_{2}+b_{2} \\ a_{3}+b_{3} \end{pmatrix}$

This is what the vectors would look like visually:

We are interested in finding theta, or some relation to theta. Using the cosine rule, one can deduce the following:

\large \begin{align*} \left \| \vec{a}+\vec{b} \right \|^{2} &= \left \| \vec{a} \right \|^{2}+\left \| \vec{b} \right \|^{2}-2\left \| \vec{a} \right \|\cdot\left \| \vec{b} \right \|\cdot\cos(\pi-\theta) \\ \cos(\pi-\theta) &= \frac{\left \| \vec{a} \right \|^{2}+\left \| \vec{b} \right \|^{2}-\left \| \vec{a}+\vec{b} \right \|^{2}}{2\left \| \vec{a} \right \|\cdot\left \| \vec{b} \right \|} \\ &\textup{Since } \cos(\pi-\theta)=-\cos\theta \end{align*}

\large \begin{align*} \therefore \cos\theta &= \frac{\left \| \vec{a}+\vec{b} \right \|^{2}-\left \| \vec{a} \right \|^{2}-\left \| \vec{b} \right \|^{2}}{2\left \| \vec{a} \right \|\cdot\left \| \vec{b} \right \|} \\ &= \frac{(a_{1}+b_{1})^{2}+(a_{2}+b_{2})^{2}+(a_{3}+b_{3})^{2}}{2\left \| \vec{a} \right \|\cdot\left \| \vec{b} \right \|} - \frac{a_{1}^{2}+b_{1}^{2}+a_{2}^{2}+b_{2}^{2}+a_{3}^{2}+b_{3}^{2}}{2\left \| \vec{a} \right \|\cdot\left \| \vec{b} \right \|} \end{align*}

\large \begin{align*} &= \frac{(a_1^2+b_1^2+2a_1b_1)+(a_2^2+b_2^2+2a_2b_2)+(a_3^2+b_3^2+2a_3b_3)}{2\left \| \vec{a} \right \|\cdot\left \| \vec{b} \right \|} \\ &\quad - \frac{a_{1}^{2}+b_{1}^{2}+a_{2}^{2}+b_{2}^{2}+a_{3}^{2}+b_{3}^{2}}{2\left \| \vec{a} \right \|\cdot\left \| \vec{b} \right \|} \end{align*}

\large \begin{align*} &= \frac{2a_1b_1+2a_2b_2+2a_3b_3}{2\left \| \vec{a} \right \|\cdot\left \| \vec{b} \right \|} \\ \cos\theta &= \frac{a_1b_1+a_2b_2+a_3b_3}{\left \| \vec{a} \right \|\cdot\left \| \vec{b} \right \|} \end{align*}

We found the function that determines the angle between the two vectors! Through further manipulation, we arrive at the dot product formula and will also understand how it is defined.

\large \begin{align*} \therefore a_1b_1+a_2b_2+a_3b_3 &= \left \| \vec{a} \right \|\cdot\left \| \vec{b} \right \|\cos\theta \\ \vec{a}\cdot\vec{b} &\overset{\mathrm{def}}{=} a_1b_1+a_2b_2+a_3b_3 = \left \| \vec{a} \right \|\cdot\left \| \vec{b} \right \|\cos\theta \\ &\qquad \textup{QED} \end{align*}
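For the skeptical reader, the derivation can be sanity-checked numerically. This small sketch (my own, assuming NumPy; not part of the original post) computes cosθ both through the cosine-rule expression and through the component formula, and confirms that the two routes agree:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, -1.0, 2.0])
na, nb = np.linalg.norm(a), np.linalg.norm(b)

# Route 1: the cosine-rule expression from the derivation
cos_rule = (np.linalg.norm(a + b)**2 - na**2 - nb**2) / (2 * na * nb)

# Route 2: the component (dot product) formula
cos_dot = (a[0]*b[0] + a[1]*b[1] + a[2]*b[2]) / (na * nb)

assert abs(cos_rule - cos_dot) < 1e-9
```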

Voila! When I worked this out a couple of months back, I was blown away. I never knew that this formula, which as a student I had to memorize, could actually be derived so beautifully from basic principles.

The beauty of this formula, and of vectors in general, is that information about the vectors, such as the angle between two of them, can be found simply by combining a certain set of numbers in a specific manner. That's all. No geometric construction or complex reasoning is required, and nothing novel either: it's just a simple set of rules.

LaTeX code source: http://www.codecogs.com/latex/eqneditor.php

Tuesday, December 13, 2011

The nonexistence of mathematical graphs

In many of our Mathematics textbooks or notes, we see these objects called "graphs". They are basically geometric figures that exist in metric space (with certain axes) and are well-defined, and what governs their behavior is a mathematical equation. Two examples of such graphs with their corresponding mathematical equations are shown below:
As you can see, these graphs are perfectly visible. In fact, nothing seems to be a problem here. Everything's smooth...

But things are not as simple as they seem. In fact, there is a deep flaw in this and every such depiction of graphs. I will explain to you why all graphical representations of mathematical equations are wrong.

Firstly, I want you to take out a piece of paper (best if it's recycled; save the Earth!), and draw an x and y axis, something like in the picture above. Then, using only conventional wisdom, I want you to draw points that represent the co-ordinates (1,1), (2,1) and (1,2). Just to let you know, a theoretical point in 2 dimensions is a geometric representation of an ordered pair of real numbers. Using only conventional wisdom, one should come up with something like this:

Now, let's zoom in on the point (1,1):
This point looks like a shaded circle with its center at (1,1). It has some radius epsilon (the symbol ε), where epsilon is greater than zero. This has to be true: if epsilon were equal to zero, we would not be able to see the point, and if epsilon were negative, things wouldn't make sense. Now assume epsilon equals 0.1. The circular point would then encompass all real numbers between 0.9 and 1.1 on each axis. This means that the point does not represent a single ordered pair of real numbers, i.e. a single co-ordinate, but encompasses an infinite number of them. Thus, we have arrived at a contradiction. We started by saying that the point we drew represents a single ordered pair of real numbers, a single co-ordinate in 2 dimensions, but we have now discovered that it does not; it encompasses a whole range of co-ordinates.

Because of this contradiction, the only logical conclusion we can make is that the point we have drawn is wrong. So what now? From the ideas above, we can tell that epsilon's existence entails the point encompassing more than one co-ordinate. Thus, the only logical conclusion is that epsilon cannot exist, for only then would the point represent a single co-ordinate. Since nonexistence here corresponds to the number zero, epsilon must equal zero for the point to accurately represent a single co-ordinate geometrically.

But now another problem arises. By having a zero radius, a point is no longer a point. In other words, a point no longer exists visually; it is invisible. This may seem trivial, but this reality has great implications. Since continuous graphs like the one right above consist of a succession of these invisible points, the direct result is that any graph governed by a specific mathematical equation must also be invisible in metric space. This doesn't make any sense! But it's true.

In other words, all graphical representations of mathematical equations are theoretically incorrect, because one should not even be able to see a graphical representation of a mathematical equation in the first place.

QED

This is really bizarre. One question that arises here is: if graphs are in reality invisible, then what do the visible graphs (that we see in documents) represent?

Even though graphs in reality are supposed to be invisible, the co-ordinates that comprise them do have a definite orientation and position in metric space. We cannot deny that; it is how metric space is defined. Thus, even though graphs cannot exist visually, they do have a sort of virtual existence, with a virtual shape and orientation. The visible graphs that we see in our daily lives are actually approximations of the true "invisible" graphs. They are not perfectly correct or precise (hence their visibility), but they give us an idea of how the true co-ordinates of a graph are distributed in metric space. The picture they provide is not perfect, but it is sufficient for us, because the margin of error is far too small to be perceptible or to hinder us.

Another question that could arise is: why do we even need these visible graphs in the first place? Why do we need to commit a "mathematical crime" by bringing something mathematically unlawful into existence?

From the point of view of a mathematical purist, we actually do not need these graphs. Mathematics at its purest manifests as algebra. Algebraic problems can be solved by algebraic means, and these algebraic means characterize the purest means of Mathematics. I'll give you a simple example and solution of such an algebraic problem:

\large \begin{align*} (1)\quad y &= 5x \\ (2)\quad y &= 3x+2 \\ \therefore 5x &= 3x+2 \\ 2x &= 2 \\ \therefore x &= 1 \end{align*}
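Incidentally, this is exactly the kind of algebra a computer can do for us. As an illustrative sketch (my own, using NumPy's linear solver; not something from the post), the same pair of equations can be rewritten as a matrix system and solved mechanically:

```python
import numpy as np

# Solve y = 5x and y = 3x + 2 as the linear system
#   -5x + y = 0
#   -3x + y = 2
A = np.array([[-5.0, 1.0],
              [-3.0, 1.0]])
c = np.array([0.0, 2.0])

x, y = np.linalg.solve(A, c)
assert abs(x - 1.0) < 1e-9
assert abs(y - 5.0) < 1e-9
```

The solver returns x = 1 and y = 5, matching the algebra above.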

This problem was solved easily by algebraic means. Theoretically, all algebraic problems can be solved by algebraic means. There is actually no need for figures like graphs whatsoever. However, algebraic means to solve algebraic problems may not always be practical. Sometimes, solving an equation by algebra may be close to impossible. An example is shown below:

\large \begin{align*} (1)\quad y &= x^{3}+2 \\ (2)\quad y &= \sin x^{2} \\ \sin x^{2} &= x^{3}+2 \quad \Rightarrow \quad x = \: ? \end{align*}
In such scenarios, graphs become very useful. They tell us the number of roots of an equation, the signs of the roots (positive or negative) and their approximate values. Once again, the graphs may not be perfect, but the error inherent in them is small enough for us human beings to neglect. The graphs corresponding to the equations above are shown below:

Using the graphs, we can see that the equation has only one root, which is negative and lies between -1 and -1.1, at approximately -1.04.
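A graphical estimate like this can then be refined numerically. Below is a minimal bisection sketch (plain Python with the standard `math` module; my own illustration, not from the post) that brackets the sign change seen on the graph and homes in on the root:

```python
import math

# f(x) = 0 exactly where sin(x^2) = x^3 + 2
def f(x):
    return x**3 + 2 - math.sin(x**2)

# The graph suggests a single sign change on [-1.5, -1.0]
lo, hi = -1.5, -1.0
assert f(lo) < 0 < f(hi)

for _ in range(60):
    mid = (lo + hi) / 2
    if f(mid) < 0:
        lo = mid
    else:
        hi = mid

root = (lo + hi) / 2
```

Each iteration halves the bracket, so a few dozen iterations pin the root down to machine precision.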

This is one reason why we use such visible graphs: for the purpose of practicality.

Another possible reason why we like to use these graphs is because of their structural beauty. When you see a graph like the one below, you can't help but be struck by its beauty.

Perhaps we like visible graphs because they remind us of the structural beauty of the world that we live and thrive in. In attempting to map the physical beauty of our world onto the inanimate, axiomatic mathematical world, we somehow forgo all of our mathematical purism. Humanity is pervasive, isn't it?

Picture source: http://www.mathsrevision.net/gcse/pages.php?page=24
Math codes (LaTeX) source: http://www.codecogs.com/latex/eqneditor.php

Friday, December 2, 2011

Liberation

As I went through the finishing moments of exams, I couldn't help but feel a growing sense of freedom, of wild liberation. All around me in the cold confined space, I saw the same thing. Some were more excited, others less. Shouts, cheers, it's over! Liberty! Freedom! It felt like being released from a tiny dark cell.

Wild liberation, that thing was contagious. Running around in bouts of madness, watching my friends engulfed in their own frenzies, I couldn't help but feel the expanse within me contracting. It was fleeting; impermanent for some reason.

Then it all made sense. I was never forced into a cell in the first place. I created it. The pain wasn't real. It was all a product of my being. There was no actual period of confinement. There was no torture. It was all life. Life went on then. Life goes on now. I was never trapped, never liberated.

Now it occurs to me. I have always been living. There were no discontinuities. All that I've been through, those experiences were life. All those I met, those people were life. Life never trapped me. Life was never my confinement.

Life is my liberation.