Susskind's Quantum Mechanics

Gary Hu

Notes from watching Leonard Susskind's Theoretical Minimum series, scribed at a level that is, ideally, easy to understand. Another goal of this blog is succinctness: these notes come from 18.5 hours of lectures, which were in turn made into a ~400-page book. I will leave out some details and avoid being overly wordy; the reader should treat these notes as a skeleton of the actual course, rather than a replacement.

Please reach out to me if you would like to discuss anything here or have any questions!


Lecture 1 - Systems and Experiments

Classical mechanics is intuitive because we can visualize the objects being studied. While it becomes mathematically abstract with concepts like Lagrangians and Poisson brackets, its basic principles are motivated by the observable world. In classical mechanics, we consider 1) a system of objects, and 2) the state of this system. Its deterministic nature means that knowing the system's state at any given time allows us to predict its future behavior.

Beyond classical mechanics, abstraction increases. Our brains operate in three dimensions, making it impossible to visualize higher or lower dimensions accurately. No one can visualize 4 dimensions, and when we visualize lower dimensions (say 2d), we are seeing a 2d plane embedded in 3 dimensions. Good physicists excel not in visualizing higher dimensions but in using abstract mathematics to describe them. Thus, understanding quantum mechanics requires relying on abstract math rather than trying to visualize it as an extension of classical mechanics.

In classical mechanics, the key question is: what is the state of a closed system? The space of states encompasses all possible states of a system. For example, a die has the state space $\{1,2,3,4,5,6\}$, and a coin has $\{\text{heads}, \text{tails}\}$. For a point particle on a line, the phase space includes all possible positions and momenta, forming a continuously infinite, two-dimensional plane that we can graph.

Consider the die again. If we look at two subsets:

  1. odd numbers: $\{1,3,5\}$
  2. numbers $\leq 3$: $\{1,2,3\}$

We see that "AND" is their intersection, $\{1,3\}$, and "OR" is their union, $\{1,2,3,5\}$. These concepts come from set theory, but in quantum mechanics, the space of states is a vector space, not a set.

Rather than diving into vector spaces, let's focus on experiments.

Experiments

Let's examine a coin in classical mechanics (a cbit) and quantum mechanics (a qubit). The quantum analog is more intriguing. Each qubit has two states: heads or tails, represented by $\sigma=+1$ for heads ($\uparrow$) and $\sigma=-1$ for tails ($\downarrow$).

We also have an apparatus with a "THIS SIDE UP" sign. When placed upright, the apparatus detects the qubit's state and displays $\sigma$ on the screen. Instead of just measuring, think of this as preparing the system: if we measure $\sigma=+1$, reset the apparatus, and quickly repeat the experiment with the same qubit, the screen consistently shows $+1$. Thus, the apparatus not only measures but also prepares the qubit's state.

Let's flip the apparatus so that "THIS SIDE UP" is on the bottom, orienting it downwards. Now, redoing the experiment with the same qubit gives $\sigma=-1$. Repeating the experiment consistently yields $\sigma=-1$, but flipping the apparatus back shows $\sigma=+1$ again. This suggests the qubit has directionality, indicating it behaves like a vector. Alternatively, the detector's orientation could influence the screen to display the component of that vector along the detector's direction.

What if we rotate the apparatus $90^\circ$ so "THIS SIDE UP" points left? Using multiple $\uparrow$ qubits one after another, we expect the screen to display $0$, as the horizontal component of $\uparrow$ is $0$. In classical physics, this would happen. However, in quantum mechanics, the screen shows either $+1$ or $-1$, averaging to $0$ over many trials, so the probability of getting $+1$ and $-1$ is the same.

If we generalize by changing the angle to $\theta$, the component of the $\uparrow$ vector along this tilt is $\cos\theta$. Classically, we'd expect the output to be $\cos\theta$. In the quantum world, the output is still $+1$ or $-1$, but the average value over many trials will be $\cos\theta$.
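As a sanity check, here is a small Monte Carlo sketch (my own, not from the lectures). It assumes the outcome probabilities $P(+1)=\cos^2(\theta/2)$ and $P(-1)=\sin^2(\theta/2)$, which are derived later in Lecture 3; each measurement returns $\pm 1$, yet the mean tends to $\cos\theta$.

```python
import numpy as np

rng = np.random.default_rng(0)

def average_tilted(theta, trials=200_000):
    # P(+1) = cos^2(theta/2) for a spin prepared up (derived in Lecture 3)
    p_up = np.cos(theta / 2) ** 2
    outcomes = np.where(rng.random(trials) < p_up, 1, -1)  # each trial is +/-1
    return outcomes.mean()

theta = np.pi / 3
avg = average_tilted(theta)    # approaches cos(pi/3) = 0.5 as trials grow
```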

Some Mathematics

Let's take a detour to introduce more mathematics: vector spaces. For now, it suffices to define a vector space. Normally, we use $\vec{v}$ or $\mathbf{v}$ to denote vectors; in quantum mechanics, we use bra-ket notation, where $\langle v|$ is a bra and $|v\rangle$ is a ket. We can now define a vector space as follows:

A vector space $V$ is a collection of vectors $|v\rangle$, satisfying the following:

  1. Addition: $|a\rangle + |b\rangle = |c\rangle$, where $|a\rangle, |b\rangle, |c\rangle \in V$.

  2. Multiplication: $z|a\rangle = |a'\rangle$ for $z \in \mathbb{C}$ and $|a\rangle, |a'\rangle \in V$.

There are a couple more axioms that are generally included (e.g. there is a zero vector $|0\rangle$ such that $|0\rangle + |a\rangle = |a\rangle$ for all $|a\rangle \in V$), but Professor Susskind ignores them for now.

Let's take a look at a few examples:

  1. The real numbers are a real vector space, because we require $z \in \mathbb{R}$.
  2. The complex numbers are a complex vector space.
  3. Functions form a vector space.
  4. The collection of all column vectors $\begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{pmatrix}$ with $\alpha_1, \alpha_2, \ldots, \alpha_n \in \mathbb{C}$ is a vector space.

Now, let's introduce the idea of a dual vector space. Given a vector space $V$, we construct a dual vector space $V^\vee$. The basic idea is that for every vector $|v\rangle \in V$, there is a vector $\langle v| \in V^\vee$.

We can formulate this more rigorously as follows:

Given a vector space $V$ containing vectors $|v\rangle$, the dual vector space $V^\vee$ contains vectors $\langle v|$, such that there is a 1-to-1 correspondence:

  1. $|a\rangle \leftrightarrow \langle a|$

  2. $|a\rangle + |b\rangle \leftrightarrow \langle a| + \langle b|$

  3. $z|a\rangle \leftrightarrow \langle a|z^*$, where $z^*$ is the complex conjugate of $z$.

Now, let's define the inner product, the analogue of the dot product.

Given $\langle a|$ and $|b\rangle$, the inner product is an operation that gives a number $\langle a|b\rangle$, which satisfies $\langle a|b\rangle = \langle b|a\rangle^*$.

Let's look at an example.

Define $\langle b|a\rangle = (\beta_1^*, \beta_2^*)\begin{pmatrix}\alpha_1 \\ \alpha_2\end{pmatrix} = \beta_1^*\alpha_1 + \beta_2^*\alpha_2$. Then $\langle a|b\rangle = (\alpha_1^*, \alpha_2^*)\begin{pmatrix}\beta_1 \\ \beta_2\end{pmatrix} = \alpha_1^*\beta_1 + \alpha_2^*\beta_2$. Now we can see that $\langle a|b\rangle = \langle b|a\rangle^*$.

What if we set $b=a$? Then $\langle a|a\rangle = \langle a|a\rangle^*$, so the inner product of any vector with itself is always real (and, by an additional positivity axiom, nonnegative). Its square root is generally known as the length of the vector.
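A minimal numpy sketch (my own, not from the lectures) of the inner product just defined: `np.vdot` conjugates its first argument, matching $\langle b|a\rangle = \sum_i \beta_i^*\alpha_i$.

```python
import numpy as np

a = np.array([1 + 2j, 3 - 1j])             # components alpha_i of |a>
b = np.array([0.5j, 2 + 1j])               # components beta_i of |b>

def braket(u, v):
    # <u|v>: np.vdot complex-conjugates its first argument
    return np.vdot(u, v)

ab = braket(a, b)                          # <a|b>
ba = braket(b, a)                          # <b|a>
aa = braket(a, a)                          # <a|a>: real and nonnegative
```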

We say two vectors are orthogonal if $\langle b|a\rangle = 0$. Define the dimension of a vector space as the maximum number of nonzero mutually orthogonal vectors. For example, in the vector space of 2d column vectors, $\begin{pmatrix}1\\0\end{pmatrix}$ and $\begin{pmatrix}0\\1\end{pmatrix}$ are orthogonal, and since no third nonzero vector is orthogonal to both, this vector space has dimension $2$.

Lecture 2 - Quantum States

Logic

Let's rename $\uparrow$ to spin up, and $\downarrow$ to spin down. When we measure a spin up qubit in an upwards oriented apparatus, we are measuring $\sigma_z = \pm 1$, the $z$-component of the spin. And if we turn the apparatus on its side, we get $\sigma_x = \pm 1$. And so on.

Now we generalize. Suppose we have prepared a spin with the apparatus oriented along the $\hat{n}$ axis. Then, we rotate the apparatus in 3d space so that its orientation vector points in the $\hat{m}$ direction. It turns out that the average value of the experiment is $\langle \sigma_m \rangle = \hat{n}\cdot\hat{m} = \cos\theta$, the component of $\hat{n}$ along $\hat{m}$.

Classically, a proposition about the state of a system corresponds to the subset of states for which the proposition is true; a proposition is either TRUE or FALSE. One type of proposition is a NOT statement, which is exactly what it sounds like: the subset of states that do not satisfy the proposition. Two other types we've seen already are AND and OR statements, and they behave differently in quantum mechanics. In English, there are two kinds of OR: the inclusive one (the union) and the exclusive one (in one subset, but not the other). Generally when we speak, we mean the exclusive one, but in logic, OR means the inclusive one (the union).

Take two propositions:

  • $A$: $\sigma_z = +1$
  • $B$: $\sigma_x = +1$

Let's design an experiment to check $A$ OR $B$ on $\uparrow$. First, measure the spin along the $z$-axis: orient the apparatus upward and run the experiment. Since the spin was prepared up, we get $\sigma_z = +1$, so $A$ is true and we are done: $A$ OR $B$ is TRUE, because OR only requires that one of the propositions be true.

Let's redo the experiment, but flip the order of AA and BB.

  • We will get $\sigma_x = +1$ with probability $\frac{1}{2}$. Then $B$ OR $A$ is TRUE.
  • We will get $\sigma_x = -1$ with probability $\frac{1}{2}$, so we then need to test $A$. Since measuring $B$ prepared the spin to point sideways, when we test $A$, there is a $\frac{1}{2}$ chance it lands $+1$ and a $\frac{1}{2}$ chance it lands $-1$. So the chance that $B$ OR $A$ is true is $\frac{1}{2}+\frac{1}{4}=\frac{3}{4}$, so order does matter.
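The two orderings can be sketched with a toy simulation (hypothetical, not from the book): a measurement along an axis orthogonal to the current spin direction is a fair coin flip, and each measurement re-prepares the spin along the measured axis.

```python
import numpy as np

rng = np.random.default_rng(1)

def or_probability(order, trials=100_000):
    """Fraction of trials where at least one proposition comes out TRUE."""
    hits = 0
    for _ in range(trials):
        state = "+z"                       # spin prepared up
        for axis in order:                 # "z" tests A, "x" tests B
            if state == "+" + axis:
                result = 1                 # same axis: outcome is certain
            elif state == "-" + axis:
                result = -1
            else:                          # orthogonal axis: fair coin flip
                result = 1 if rng.random() < 0.5 else -1
            state = ("+" if result == 1 else "-") + axis   # re-prepared
            if result == 1:                # one TRUE suffices for OR
                hits += 1
                break
    return hits / trials

p_a_then_b = or_probability(["z", "x"])    # A first: TRUE with certainty
p_b_then_a = or_probability(["x", "z"])    # B first: TRUE ~3/4 of the time
```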

The space of states of a quantum system is a linear vector space, so given any two states, we can add them to create a third state. You cannot do this with sets: given two elements of a set, you can't generally combine them to make a third element.

The Connection

Finally, let's state the connection between a vector space and the space of states of a quantum system. In our previous experiments, we saw that $\sigma_z, \sigma_x, \sigma_y = \pm 1$. Let's name the corresponding states $|U\rangle$ (up), $|D\rangle$ (down), $|L\rangle$ (left), $|R\rangle$ (right), $|I\rangle$ (in), and $|O\rangle$ (out). Suppose we had a ket vector $|A\rangle = \alpha_U|U\rangle + \alpha_D|D\rangle$. The probability of getting up would be $P_U = \alpha_U^*\alpha_U$, and the probability of getting down is $P_D = \alpha_D^*\alpha_D$.

One basic postulate that we will accept: orthogonality means that two states are sufficiently different that a single experiment can tell them apart. So $|U\rangle$ and $|D\rangle$ are orthogonal to each other. We will show that they can serve as a basis for our vector space (of dimension $2$).

Since $P_U + P_D = 1$, we have $\alpha_U^*\alpha_U + \alpha_D^*\alpha_D = 1 \leftrightarrow \langle A|A\rangle = 1$. Using similar ideas, we can guess that $|R\rangle = \frac{1}{\sqrt{2}}|U\rangle + \frac{1}{\sqrt{2}}|D\rangle$ and $|L\rangle = \frac{1}{\sqrt{2}}|U\rangle - \frac{1}{\sqrt{2}}|D\rangle$. We can now derive the identities:

$$\begin{align*} \dfrac{|R\rangle + |L\rangle}{\sqrt{2}} &= |U\rangle \\ \dfrac{|R\rangle - |L\rangle}{\sqrt{2}} &= |D\rangle \end{align*}$$

Furthermore, we can show that $|I\rangle = \frac{1}{\sqrt{2}}|U\rangle + \frac{i}{\sqrt{2}}|D\rangle$ and $|O\rangle = \frac{1}{\sqrt{2}}|U\rangle - \frac{i}{\sqrt{2}}|D\rangle$.

We can rewrite everything as column vectors to conclude the following:

$$\begin{align*} |U\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \qquad & \qquad |D\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \\ |R\rangle = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix} \qquad & \qquad |L\rangle = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix} \\ |I\rangle = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{i}{\sqrt{2}} \end{pmatrix} \qquad & \qquad |O\rangle = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{i}{\sqrt{2}} \end{pmatrix} \end{align*}$$
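These six column vectors are easy to check numerically. A small sketch (my own) verifying, for example, that $|U\rangle$ and $|D\rangle$ are orthogonal and that a right-spin qubit gives up or down with probability $\frac{1}{2}$ each:

```python
import numpy as np

s = 1 / np.sqrt(2)
U = np.array([1, 0], dtype=complex);  D = np.array([0, 1], dtype=complex)
R = np.array([s, s], dtype=complex);  L = np.array([s, -s], dtype=complex)
I = np.array([s, 1j * s]);            O = np.array([s, -1j * s])

def prob(bra, ket):
    # probability of finding `bra` when measuring a system in state `ket`
    return abs(np.vdot(bra, ket)) ** 2
```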

Lecture 3 - Principles of Quantum Mechanics

More Linear Algebra

Let $|i\rangle$ be basis vectors (mutually orthonormal vectors). Then we can write any element of our vector space as a linear combination of the basis vectors, i.e. $|A\rangle = \sum_i \alpha_i|i\rangle$ for $\alpha_i \in \mathbb{C}$.

We can now do inner products: $\langle j|A\rangle = \sum_i \alpha_i\langle j|i\rangle = \sum_i \alpha_i\delta_{ji} = \alpha_j$, where $\delta_{ji}$ is the Kronecker delta:

$$\delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{otherwise} \end{cases}$$

So $|A\rangle = \sum_i |i\rangle\langle i|A\rangle$, a useful identity for simplifying things. The exact same thing is true for bra vectors: $\langle A| = \sum_i \langle A|i\rangle\langle i|$.

Here's another reason why we want to use linear algebra: observables (things we measure) are linear operators.

A linear operator $M$, acting on $|A\rangle$, gives a unique ket vector $|B\rangle$, satisfying:

  1. $M[z|A\rangle] = zM|A\rangle$.
  2. $M[|A\rangle + |B\rangle] = M|A\rangle + M|B\rangle$.

Using this definition, we can show that

$$\langle i|M|A\rangle = \langle i|B\rangle = \beta_i$$

Using our useful identity, we can rewrite this as

$$\sum_j \langle i|M|j\rangle\langle j|A\rangle = \sum_j \langle i|M|j\rangle\,\alpha_j = \beta_i.$$

We define the matrix elements as $M_{ij} := \langle i|M|j\rangle$, which are important because a linear operator is completely characterized by its matrix elements. So we can rewrite $\sum_j \langle i|M|j\rangle\langle j|A\rangle = \beta_i$ in two other ways: $\sum_j M_{ij}\alpha_j = \beta_i$, and

$$\begin{pmatrix} M_{11} & M_{12} & \cdots & M_{1N} \\ M_{21} & M_{22} & \cdots & M_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ M_{N1} & M_{N2} & \cdots & M_{NN} \end{pmatrix} \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_N \end{pmatrix} = \begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_N \end{pmatrix}$$

Since we have matrices, it will be useful to define eigenvectors.

The eigenvectors of $M$ are vectors $|\lambda_i\rangle$ such that $M|\lambda_i\rangle - \lambda_i|\lambda_i\rangle = 0$, and each eigenvector $|\lambda_i\rangle$ has eigenvalue $\lambda_i$.

These are the vectors whose direction is unchanged by the operator: the only thing that happens to the vector is that it is scaled, and this scaling factor is called the eigenvalue.

We defined linear operators on ket vectors; they can also be defined on bra vectors through

$$\langle B|M = (\beta_1^*\ \beta_2^*\ \cdots\ \beta_N^*)\begin{pmatrix} M_{11} & M_{12} & \cdots & M_{1N} \\ M_{21} & M_{22} & \cdots & M_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ M_{N1} & M_{N2} & \cdots & M_{NN} \end{pmatrix}$$

One may think that $M|A\rangle = |B\rangle \leftrightarrow \langle A|M = \langle B|$, but this is wrong. The correct statement is $M|A\rangle = |B\rangle \leftrightarrow \langle A|M^\dagger = \langle B|$, where $M^\dagger = [M^T]^*$ is the Hermitian conjugate: first transpose the matrix, then complex conjugate all of its entries, i.e. $(M^\dagger)_{ij} = M_{ji}^*$.

Since quantum mechanical measurements are always real, quantum mechanical observables are represented by Hermitian operators, which satisfy $M = M^\dagger$. One nice property of Hermitian operators is that their eigenvalues must be real: for a Hermitian operator $L$, $L|\lambda\rangle = \lambda|\lambda\rangle \implies \langle\lambda|L|\lambda\rangle = \lambda\langle\lambda|\lambda\rangle$, while $\langle\lambda|L = \langle\lambda|L^\dagger = \lambda^*\langle\lambda| \implies \langle\lambda|L|\lambda\rangle = \lambda^*\langle\lambda|\lambda\rangle$, so $\lambda = \lambda^*$, i.e. $\lambda$ is real. Professor Susskind calls the following statement the fundamental theorem because of how important it is:

The eigenvectors of a Hermitian operator form an orthonormal basis.

Take $L|\lambda_1\rangle = \lambda_1|\lambda_1\rangle \leftrightarrow \langle\lambda_1|L = \lambda_1\langle\lambda_1|$ and $L|\lambda_2\rangle = \lambda_2|\lambda_2\rangle$, with $\lambda_1 \neq \lambda_2$. Taking inner products gives $\langle\lambda_1|L|\lambda_2\rangle = \lambda_1\langle\lambda_1|\lambda_2\rangle$ and $\langle\lambda_1|L|\lambda_2\rangle = \lambda_2\langle\lambda_1|\lambda_2\rangle$. Subtracting gives $(\lambda_1 - \lambda_2)\langle\lambda_1|\lambda_2\rangle = 0$, so the two eigenvectors are orthogonal.

Now suppose $\lambda_1 = \lambda_2 = \lambda$ and let $|A\rangle = \alpha|\lambda_1\rangle + \beta|\lambda_2\rangle$. Then $L|A\rangle = \alpha L|\lambda_1\rangle + \beta L|\lambda_2\rangle = \alpha\lambda|\lambda_1\rangle + \beta\lambda|\lambda_2\rangle = \lambda(\alpha|\lambda_1\rangle + \beta|\lambda_2\rangle) = \lambda|A\rangle$, so any linear combination of the two eigenvectors is again an eigenvector with the same eigenvalue. Within this degenerate subspace we can use Gram-Schmidt to pick orthonormal eigenvectors. It remains to show that if our space is $N$-dimensional, there are $N$ orthonormal eigenvectors, which is a simple linear algebra exercise.
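A numerical illustration of the fundamental theorem (a sketch of mine, using numpy's `eigh` routine for Hermitian matrices): the eigenvalues come out real and the eigenvectors form an orthonormal basis.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = A + A.conj().T                         # Hermitian by construction

eigvals, eigvecs = np.linalg.eigh(H)       # columns of eigvecs are |lambda_i>
gram = eigvecs.conj().T @ eigvecs          # matrix of <lambda_i|lambda_j>
```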

Principles

Professor Susskind states that there are 4 basic principles of quantum mechanics:

  1. The observable or measurable quantities of quantum mechanics are represented by linear operators $L$.
  2. If the system is in an eigenstate $|\lambda_i\rangle$, the result of a measurement is guaranteed to be $\lambda_i$.
  3. Unambiguously distinguishable states are represented by orthogonal vectors.
  4. If $|A\rangle$ is the state vector of a system, and the observable $L$ is measured, the probability to observe value $\lambda_i$ is $P(\lambda_i) = \langle A|\lambda_i\rangle\langle\lambda_i|A\rangle = |\langle A|\lambda_i\rangle|^2$.

In particular, these conditions combine to imply that $L$ must be Hermitian.

Explicit Pauli Matrices

Let's write down the spin operators as $2\times 2$ matrices. Starting with $\sigma_z$: principle 2 tells us $\sigma_z|U\rangle = |U\rangle$ and $\sigma_z|D\rangle = -|D\rangle$, and principle 3 tells us $\langle U|D\rangle = 0$. Writing these as matrix equations, we get

$$\begin{align*} \begin{pmatrix} (\sigma_z)_{11} & (\sigma_z)_{12} \\ (\sigma_z)_{21} & (\sigma_z)_{22} \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix} &= \begin{pmatrix} 1 \\ 0 \end{pmatrix} \\ \begin{pmatrix} (\sigma_z)_{11} & (\sigma_z)_{12} \\ (\sigma_z)_{21} & (\sigma_z)_{22} \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix} &= -\begin{pmatrix} 0 \\ 1 \end{pmatrix} \end{align*}$$

and the unique matrix satisfying this is

$$\boxed{\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}}$$

For $\sigma_x$, we have $\sigma_x|R\rangle = |R\rangle$ and $\sigma_x|L\rangle = -|L\rangle$. Recall that $|R\rangle = \frac{1}{\sqrt{2}}|U\rangle + \frac{1}{\sqrt{2}}|D\rangle$ and $|L\rangle = \frac{1}{\sqrt{2}}|U\rangle - \frac{1}{\sqrt{2}}|D\rangle$, so

$$\begin{align*} \begin{pmatrix} (\sigma_x)_{11} & (\sigma_x)_{12} \\ (\sigma_x)_{21} & (\sigma_x)_{22} \end{pmatrix}\begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix} &= \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} \end{pmatrix} \\ \begin{pmatrix} (\sigma_x)_{11} & (\sigma_x)_{12} \\ (\sigma_x)_{21} & (\sigma_x)_{22} \end{pmatrix}\begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix} &= -\begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{1}{\sqrt{2}} \end{pmatrix} \end{align*}$$

which gives

$$\boxed{\sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}}$$

For our last direction, $\sigma_y$, we have $|I\rangle = \frac{1}{\sqrt{2}}|U\rangle + \frac{i}{\sqrt{2}}|D\rangle$ and $|O\rangle = \frac{1}{\sqrt{2}}|U\rangle - \frac{i}{\sqrt{2}}|D\rangle$, so $|I\rangle = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ \frac{i}{\sqrt{2}} \end{pmatrix}$ and $|O\rangle = \begin{pmatrix} \frac{1}{\sqrt{2}} \\ -\frac{i}{\sqrt{2}} \end{pmatrix}$. Requiring $\sigma_y|I\rangle = |I\rangle$ and $\sigma_y|O\rangle = -|O\rangle$ gives

$$\boxed{\sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}}$$

These three boxed matrices are collectively called the Pauli matrices.
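A quick check (a sketch of mine) of the boxed matrices against the eigenvector equations used to derive them, plus the Hermiticity required of observables:

```python
import numpy as np

sz = np.array([[1, 0], [0, -1]], dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])

s = 1 / np.sqrt(2)
R = np.array([s, s], dtype=complex);  L = np.array([s, -s], dtype=complex)
I = np.array([s, 1j * s]);            O = np.array([s, -1j * s])
```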

Implications

Although $\vec{\sigma}$ is not a 3-vector (its components are matrices), it behaves a lot like one. Let $\sigma_n = \vec{\sigma}\cdot\hat{n}$. Then

$$\sigma_n = \begin{pmatrix} n_z & n_x - in_y \\ n_x + in_y & -n_z \end{pmatrix}$$

which gives us a way to calculate what happens when we orient the apparatus along $\hat{n}$, painting a complete picture of spin measurements in 3d space.

If we let $\hat{n}$ sit in the $x$-$z$ plane, so that $\hat{n} = (\sin\theta, 0, \cos\theta)$, then

$$\sigma_n = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}.$$

The eigenvectors are $|\lambda_1\rangle = \begin{pmatrix} \cos\frac{\theta}{2} \\ \sin\frac{\theta}{2} \end{pmatrix}$ and $|\lambda_2\rangle = \begin{pmatrix} -\sin\frac{\theta}{2} \\ \cos\frac{\theta}{2} \end{pmatrix}$, with eigenvalues $+1$ and $-1$, respectively. Suppose our apparatus initially points along the $z$-axis, and we rotate it so it lies along the $\hat{n}$ axis. Given a spin in the $|U\rangle$ state, what is the probability of observing $\sigma_n = \pm 1$? Principle 4 gives us

$$\begin{align*} P(+1) &= |\langle U|\lambda_1\rangle|^2 = \cos^2\frac{\theta}{2} \\ P(-1) &= |\langle U|\lambda_2\rangle|^2 = \sin^2\frac{\theta}{2} \end{align*}$$

The expected value is thus

$$\langle\sigma_n\rangle = \sum_i \lambda_i P(\lambda_i) = \cos^2\frac{\theta}{2} - \sin^2\frac{\theta}{2} = \cos\theta.$$
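Numerically (a sketch, with $\theta = 0.7$ chosen arbitrarily), we can confirm the eigenvectors, the probabilities, and $\langle\sigma_n\rangle = \cos\theta$:

```python
import numpy as np

theta = 0.7                                # arbitrary tilt angle
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sn = np.cos(theta) * sz + np.sin(theta) * sx   # sigma_n in the x-z plane

U = np.array([1, 0], dtype=complex)
lam1 = np.array([np.cos(theta / 2), np.sin(theta / 2)])
lam2 = np.array([-np.sin(theta / 2), np.cos(theta / 2)])

p_plus = abs(np.vdot(lam1, U)) ** 2        # P(+1) = cos^2(theta/2)
p_minus = abs(np.vdot(lam2, U)) ** 2       # P(-1) = sin^2(theta/2)
expectation = np.vdot(U, sn @ U).real      # should equal cos(theta)
```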

There is one more theorem for this section:

(The Spin-Polarization Principle)

Any state of a single spin is an eigenvector of some component of the spin.

This means that given a state $|A\rangle$, there exists some direction $\hat{n}$ such that $\vec{\sigma}\cdot\hat{n}|A\rangle = |A\rangle$. Two implications: 1) for any spin state, we can orient the apparatus in some direction so that it registers $+1$, and 2) there is no state where the expected values of all three components of the spin are $0$. In fact, $\langle\sigma_x\rangle^2 + \langle\sigma_y\rangle^2 + \langle\sigma_z\rangle^2 = 1$.

Lecture 4 - Time and Change

Unitarity

In classical mechanics, Professor Susskind introduced what he called the minus first law: information from the past is never lost. The quantum analog is the conservation of distinctions. Let the state of the system be $|\Psi(t)\rangle$. Time evolution is given by a linear operator $U(t)$ satisfying $|\Psi(t)\rangle = U(t)|\Psi(0)\rangle$, where $U$ is called the time-development operator for the system. This reflects the basic dynamical assumption of quantum mechanics: if you know the state at one time, the quantum equations of motion tell you what it will be later.

The main difference between classical and quantum determinism is as follows: classical determinism allows us to predict the results of an experiment, while quantum determinism allows us to compute the probabilities of the outcomes of later experiments.

Suppose $|\Psi(0)\rangle$ and $|\Phi(0)\rangle$ are orthogonal. Then $\langle\Psi(0)|\Phi(0)\rangle = 0$, and the conservation of distinctions implies $\langle\Psi(t)|\Phi(t)\rangle = 0$. Rewriting $\langle\Psi(t)| = \langle\Psi(0)|U^\dagger(t)$, together with $|\Phi(t)\rangle = U(t)|\Phi(0)\rangle$ from earlier, implies $\langle\Psi(0)|U^\dagger(t)U(t)|\Phi(0)\rangle = 0$. Considering an orthonormal basis $|i\rangle$ containing $|\Psi(0)\rangle$ and $|\Phi(0)\rangle$ as basis vectors, we get $\langle i|U^\dagger(t)U(t)|j\rangle = \delta_{ij}$, so $U^\dagger U = I$ (an operator that satisfies this condition is called unitary).
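A sketch of unitary time development (my own construction, with $\hbar = 1$): build $U(t) = e^{-iHt}$ from the eigendecomposition of a Hermitian $H$ and check that $U^\dagger U = I$, so orthogonal states stay orthogonal.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = A + A.conj().T                         # Hermitian by construction

def U(t):
    # exp(-iHt) via the eigendecomposition of H
    w, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(-1j * w * t)) @ V.conj().T

Ut = U(1.3)
identity = Ut.conj().T @ Ut                # should be the 3x3 identity

psi = np.array([1, 0, 0], dtype=complex)   # two orthogonal initial states
phi = np.array([0, 1, 0], dtype=complex)
```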

Now we can introduce our fifth principle of quantum mechanics:

  5. The evolution of state vectors with time is unitary.

The Hamiltonian

Let $U(\epsilon) = I - i\epsilon H$ for a small time $\epsilon$. Then unitarity gives $(I + i\epsilon H^\dagger)(I - i\epsilon H) = I \implies H^\dagger = H$ (to first order in $\epsilon$). This operator $H$ is called the quantum Hamiltonian, an observable whose eigenvalues measure the energy of a quantum system. Using $|\Psi(t)\rangle = U(t)|\Psi(0)\rangle$ and taking $t = \epsilon$, we get $|\Psi(\epsilon)\rangle = |\Psi(0)\rangle - i\epsilon H|\Psi(0)\rangle$. Rearranging and taking $\epsilon \to 0$ gives

$$\dfrac{\partial|\Psi\rangle}{\partial t} = -iH|\Psi\rangle,$$

the time-dependent Schrödinger equation. So the reason we care about the quantum Hamiltonian is that it tells us how the state of an undisturbed system evolves with time.

Planck's (reduced) constant is $\hbar = 1.0545\times 10^{-34}\ \mathrm{kg\cdot m^2/s}$, which we need to insert into the time-dependent Schrödinger equation to make the dimensions actually make sense:

$$\hbar\dfrac{\partial|\Psi\rangle}{\partial t} = -iH|\Psi\rangle.$$

It is worth noting that Planck originally came up with the constant $h = 2\pi\hbar$; later physicists rescaled it to remove the need to write $2\pi$ in a ton of places.

Averaging

Previously, we looked at averages. Here's a nice trick to compute them:

$$\langle L\rangle = \langle A|L|A\rangle$$

where $|A\rangle$ is the normalized state of a quantum system. To prove this, write $|A\rangle = \sum_i \alpha_i|\lambda_i\rangle$, so $L|A\rangle = \sum_i \alpha_i L|\lambda_i\rangle$. Since $L|\lambda_i\rangle = \lambda_i|\lambda_i\rangle$, we get $L|A\rangle = \sum_i \alpha_i\lambda_i|\lambda_i\rangle$. Lastly, take the inner product with $\langle A|$ to get $\langle A|L|A\rangle = \sum_i (\alpha_i^*\alpha_i)\lambda_i$, and the result follows.

We can use averaging to show that we can always scale a state-vector by a phase factor $e^{i\theta}$ and nothing will change: take $|A\rangle = \sum_i \alpha_i|\lambda_i\rangle$ and let $|B\rangle = e^{i\theta}|A\rangle$. They have the same magnitude because $\langle B|B\rangle = \langle A|e^{-i\theta}e^{i\theta}|A\rangle = \langle A|A\rangle$. Similarly, a measurement yields $\lambda_j$ with probability $\alpha_j^* e^{-i\theta}e^{i\theta}\alpha_j = \alpha_j^*\alpha_j$ for both $|A\rangle$ and $|B\rangle$. Finally, we have $\langle L\rangle = \langle B|L|B\rangle = \langle A|e^{-i\theta}Le^{i\theta}|A\rangle = \langle A|L|A\rangle$.

Comparison To Classical Mechanics

The expected value changes in time because

$$\begin{align*} \dfrac{d}{dt}\langle\Psi(t)|L|\Psi(t)\rangle &= \langle\dot{\Psi}(t)|L|\Psi(t)\rangle + \langle\Psi(t)|L|\dot{\Psi}(t)\rangle \\ &= \dfrac{i}{\hbar}\langle\Psi(t)|HL|\Psi(t)\rangle - \dfrac{i}{\hbar}\langle\Psi(t)|LH|\Psi(t)\rangle \\ &= \dfrac{i}{\hbar}\langle\Psi(t)|HL - LH|\Psi(t)\rangle \\ &= \dfrac{i}{\hbar}\langle[H,L]\rangle \qquad\text{or}\qquad -\dfrac{i}{\hbar}\langle[L,H]\rangle \end{align*}$$

where the second line follows from the Schrödinger equation. The term $[L,M] = LM - ML$ is called the commutator and is usually not $0$. One important fact is that the commutator is antisymmetric: $[L,M] = -[M,L]$. Sometimes, we rewrite this whole thing more succinctly as

$$\dfrac{d\langle L\rangle}{dt} = -\dfrac{i}{\hbar}\langle[L,H]\rangle.$$

We may notice that the commutator $[L,H]$ is awfully similar to the Poisson bracket: $[L,H] \leftrightarrow i\hbar\{L,H\}$. If we plug this into the previous equation, we get

$$\dfrac{d\langle L\rangle}{dt} = \langle\{L,H\}\rangle.$$

The major difference in quantum physics is that $FG$ and $GF$ differ for two linear operators $F, G$, whereas in classical mechanics quantities always commute.
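The commutator formula above can be sketched with a finite-difference check (my own construction, with $\hbar = 1$ and random Hermitian $H$ and $L$): the numerical derivative of $\langle L\rangle$ at $t = 0$ should match $-i\langle[L,H]\rangle$.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
H = A + A.conj().T                          # random Hermitian Hamiltonian
B = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
Lop = B + B.conj().T                        # random Hermitian observable

w, V = np.linalg.eigh(H)
psi0 = np.array([1, 0, 0], dtype=complex)

def expect(t):
    # <Psi(t)|L|Psi(t)> with |Psi(t)> evolved in the energy eigenbasis
    psi = V @ (np.exp(-1j * w * t) * (V.conj().T @ psi0))
    return np.vdot(psi, Lop @ psi).real

dt = 1e-6
numeric = (expect(dt) - expect(-dt)) / (2 * dt)   # d<L>/dt at t = 0
exact = (-1j * np.vdot(psi0, (Lop @ H - H @ Lop) @ psi0)).real  # -i<[L,H]>
```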

What does it mean for an observable $Q$ to be conserved? It means $[Q,H] = 0$, which implies $[Q^n,H] = 0$. $H$ is the definition of energy in quantum mechanics, and obviously $[H,H] = 0$, which is an example of conservation of energy.

Spin has an energy depending on its orientation when placed in a magnetic field: let $H \sim \vec{\sigma}\cdot\vec{B} = \sigma_x B_x + \sigma_y B_y + \sigma_z B_z$. For a simple example, take the magnetic field along the $z$-axis. Then $H = \frac{\hbar\omega}{2}\sigma_z$ for some constant $\omega$. The average values evolve as

$$\begin{align*} \dfrac{d\langle\sigma_x\rangle}{dt} &= -\dfrac{i}{\hbar}\langle[\sigma_x,H]\rangle = -\dfrac{i\omega}{2}\langle[\sigma_x,\sigma_z]\rangle \\ \dfrac{d\langle\sigma_y\rangle}{dt} &= -\dfrac{i}{\hbar}\langle[\sigma_y,H]\rangle = -\dfrac{i\omega}{2}\langle[\sigma_y,\sigma_z]\rangle \\ \dfrac{d\langle\sigma_z\rangle}{dt} &= -\dfrac{i}{\hbar}\langle[\sigma_z,H]\rangle = -\dfrac{i\omega}{2}\langle[\sigma_z,\sigma_z]\rangle \end{align*}$$

The Pauli matrices verify that $[\sigma_x,\sigma_y] = 2i\sigma_z$, $[\sigma_y,\sigma_z] = 2i\sigma_x$, and $[\sigma_z,\sigma_x] = 2i\sigma_y$. Thus,

$$\begin{align*} \dfrac{d\langle\sigma_x\rangle}{dt} &= -\omega\langle\sigma_y\rangle \\ \dfrac{d\langle\sigma_y\rangle}{dt} &= \omega\langle\sigma_x\rangle \\ \dfrac{d\langle\sigma_z\rangle}{dt} &= 0. \end{align*}$$

These are exactly the same equations as for a classical rotor in a magnetic field! In classical mechanics, it is the $x$ and $y$ components of the angular momentum that precess; in quantum mechanics, it's their expected values.
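A sketch of the precession (my own, with $\hbar = 1$ and $H = \frac{\omega}{2}\sigma_z$): starting from $|R\rangle$, the expected values trace out $\langle\sigma_x\rangle = \cos\omega t$, $\langle\sigma_y\rangle = \sin\omega t$, $\langle\sigma_z\rangle = 0$, which solves the three equations above.

```python
import numpy as np

omega = 2.0
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

psi0 = np.array([1, 1], dtype=complex) / np.sqrt(2)   # |R>: spin along +x

def expect(op, t):
    # H = (omega/2) sz is diagonal, so evolution just attaches phases
    phases = np.exp(-1j * (omega / 2) * np.array([1, -1]) * t)
    psi = phases * psi0
    return np.vdot(psi, op @ psi).real

t = 0.6
ex, ey, ez = expect(sx, t), expect(sy, t), expect(sz, t)
```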

Solving Schrödinger

The iconic Schrödinger equation is

$$i\hbar\dfrac{\partial\Psi(x,t)}{\partial t} = -\dfrac{\hbar^2}{2m}\dfrac{\partial^2\Psi(x,t)}{\partial x^2} + U(x)\Psi(x,t).$$

Earlier, we saw the time-dependent Schrödinger equation:

$$\hbar\dfrac{\partial|\Psi\rangle}{\partial t} = -iH|\Psi\rangle.$$

There is also the time-independent Schrödinger equation:

$$H|E_j\rangle = E_j|E_j\rangle.$$

Because $H$ is energy, the $E_j$ are the energy eigenvalues and the $|E_j\rangle$ the energy eigenvectors. Suppose we know these. We can now solve the time-dependent analog by plugging in $|\Psi(t)\rangle = \sum_j \alpha_j(t)|E_j\rangle$ to get

$$\begin{align*} &\sum_j \dot{\alpha}_j(t)|E_j\rangle = -\dfrac{i}{\hbar}H\sum_j \alpha_j(t)|E_j\rangle \\ \implies &\sum_j \dot{\alpha}_j(t)|E_j\rangle = -\dfrac{i}{\hbar}\sum_j E_j\alpha_j(t)|E_j\rangle \\ \implies &\sum_j \left\{\dot{\alpha}_j(t) + \dfrac{i}{\hbar}E_j\alpha_j(t)\right\}|E_j\rangle = 0. \end{align*}$$

When a sum of basis vectors is $0$, every coefficient must be $0$, so

\dfrac{\,d \alpha_j(t)}{\,dt} =-\dfrac{i}{\hbar} E_j\alpha_j(t) \implies \alpha_j(t)=\alpha_j(0)e^{-\frac{i}{\hbar}E_jt}.

This is our first example of the connection between energy and frequency.

At t=0, we have \alpha_j(0)=\langle E_j | \Psi(0)\rangle, so the solution of the time-dependent Schrödinger equation can be written as

\begin{align*} |\Psi(t) \rangle &= \sum_j \langle E_j | \Psi(0)\rangle e^{-\frac{i}{\hbar}E_jt}|E_j \rangle \\ &= \sum_j |E_j \rangle \langle E_j | \Psi(0)\rangle e^{-\frac{i}{\hbar}E_j t}. \end{align*}

Using this, we can now predict probabilities: the probability for outcome \lambda is P_\lambda(t) = |\langle \lambda|\Psi(t)\rangle|^2, where |\Psi(t)\rangle is computed from the Schrödinger equation as above.
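The recipe above, expanding in the energy eigenbasis and attaching the phases e^{-iE_j t/ħ}, can be sketched numerically (assuming ħ = 1 and a random Hermitian Hamiltonian):

```python
import numpy as np

hbar = 1.0
rng = np.random.default_rng(0)
A = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
H = (A + A.conj().T) / 2                     # a random Hermitian Hamiltonian

E, V = np.linalg.eigh(H)                     # energy eigenvalues / eigenvectors

def psi(t, psi0):
    # |Psi(t)> = sum_j |E_j> <E_j|Psi(0)> exp(-i E_j t / hbar)
    alpha0 = V.conj().T @ psi0               # alpha_j(0) = <E_j|Psi(0)>
    return V @ (alpha0 * np.exp(-1j * E * t / hbar))

psi0 = rng.normal(size=4) + 1j * rng.normal(size=4)
psi0 /= np.linalg.norm(psi0)

# total probability is conserved, and the Schrodinger equation holds
norm_t = np.linalg.norm(psi(2.7, psi0))
dt = 1e-6
lhs = (psi(1.0 + dt, psi0) - psi(1.0 - dt, psi0)) / (2 * dt)   # d|Psi>/dt
rhs = -1j / hbar * H @ psi(1.0, psi0)
err = np.linalg.norm(lhs - rhs)
```

The finite-difference check confirms that the eigenbasis expansion really solves the time-dependent equation.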

Wave Function Collapse

Let the state-vector be \sum_j \alpha_j |\lambda_j\rangle before the measurement of L. The apparatus measures \lambda_j with probability |\alpha_j|^2, and then leaves the system in a single eigenstate of L, namely |\lambda_j\rangle. To make sense of this, we need to consider the apparatus as part of a single quantum system; viewed from outside, interaction with the external world makes the state-vector appear to reduce to a single eigenstate. This is known as wave function collapse, which we'll talk about later.

Lecture 5 - Uncertainty and Time Dependence

Simultaneous Eigenvectors

Consider a two-spin system. If we measure both spins, the system is in a state that is simultaneously an eigenvector of L and M - a simultaneous eigenvector. Assume that there is a basis of state-vectors |\lambda, \mu\rangle that are simultaneous eigenvectors, or

Lλ,μ=λλ,μMλ,μ=μλ,μ. \begin{align*} L|\lambda, \mu \rangle &= \lambda | \lambda, \mu \rangle \\ M|\lambda, \mu \rangle &= \mu | \lambda, \mu \rangle. \end{align*}

Using some algebra we can get

[L,M]λ,μ=0. [L, M] | \lambda, \mu \rangle =0.

So the condition for two observables to be simultaneously measurable is that they commute.
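A quick numerical illustration of this condition (a sketch; the eigenvalue lists are arbitrary): two observables built on a shared orthonormal eigenbasis commute, and each basis vector is a simultaneous eigenvector:

```python
import numpy as np

rng = np.random.default_rng(1)
# build a shared orthonormal eigenbasis Q, then two observables diagonal in it
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))
L = Q @ np.diag([1.0, 2.0, 3.0]) @ Q.conj().T
M = Q @ np.diag([5.0, -1.0, 0.5]) @ Q.conj().T

comm_norm = np.linalg.norm(L @ M - M @ L)   # [L, M] = 0

# each column of Q is a simultaneous eigenvector |lambda, mu>
v = Q[:, 0]
resid_L = np.linalg.norm(L @ v - 1.0 * v)
resid_M = np.linalg.norm(M @ v - 5.0 * v)
```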

Wave Functions

Let |a,b,c,...\rangle be orthonormal basis vectors labeled by the eigenvalues of a set of commuting observables A,B,C,.... We can rewrite any state-vector as

Ψ=a,b,c,...ψ(a,b,c,...)a,b,c,.... |\Psi \rangle = \sum_{a,b,c,...} \psi(a,b,c,...)|a,b,c,...\rangle.

Then the wave function is

ψ(a,b,c,...)=a,b,c,...Ψ. \psi(a,b,c,...)=\langle a,b,c,...|\Psi \rangle.

The probability for the commuting observables to have values a,b,c,...a,b,c,... is

P(a,b,c,...)=ψ(a,b,c,...)ψ(a,b,c,...) P(a,b,c,...)=\psi^*(a,b,c,...)\psi(a,b,c,...)

and we know that the total probability sums to 1:

a,b,c,...ψ(a,b,c,...)ψ(a,b,c,...)=1. \sum_{a,b,c,...}\psi^*(a,b,c,...)\psi(a,b,c,...)=1.

We use Ψ\Psi for state-vectors and ψ\psi for wave functions.

Uncertainty

Another reason to care about the Pauli matrices is that every 2×22\times 2 Hermitian operator has the Pauli matrices and the identity matrix as a basis. Can we simultaneously measure any pair of spin components? When two observables do not commute, in general, it is impossible to precisely know everything about both. For example, since [σx,σy]=2iσz,[σy,σz]=2iσx[\sigma_x, \sigma_y]=2i\sigma_z, [\sigma_y, \sigma_z]=2i\sigma_x, and [σz,σx]=2iσy[\sigma_z, \sigma_x]=2i\sigma_y, we cannot simultaneously measure two spin components.

In general, we cannot simultaneously measure two observables with perfect precision unless they commute; there must be some uncertainty. To quantify it, define the shifted operator

A=AA. \overline{A} = A-\langle A \rangle.

The eigenvalues of \overline{A} are \overline{a}=a-\langle A \rangle, and the square of the uncertainty (the standard deviation) is (\Delta A)^2=\sum_a \overline{a}^2 P(a) = \sum_a (a-\langle A \rangle)^2 P(a) = \langle \Psi | \overline{A}^2 | \Psi \rangle.

To bound uncertainties, we often need inequalities, most notably the triangle inequality, |\vec{X}| + |\vec{Y}| \ge |\vec{X} + \vec{Y}|, and the Cauchy-Schwarz inequality, which is sometimes written in the form |\vec{X}|^2 |\vec{Y}|^2 \ge |\vec{X} \cdot \vec{Y}|^2 but we will use the (equivalent) form 2|X||Y| \ge |\langle X|Y \rangle + \langle Y|X \rangle|.

Let |\Psi\rangle be a ket and A, B be observables. Define |X\rangle = A|\Psi\rangle, |Y\rangle = iB|\Psi \rangle, so that \langle X|Y \rangle + \langle Y|X \rangle = i\langle [A,B] \rangle. Plugging this into Cauchy-Schwarz, and then replacing A and B by the shifted operators \overline{A}, \overline{B} (which have the same commutator), gives

2A2B2Ψ[A,B]Ψ    ΔAΔB12Ψ[A,B]Ψ \begin{align*} &2\sqrt{\langle A^2 \rangle \langle B^2 \rangle} \ge | \langle \Psi|[A,B]|\Psi \rangle| \\ \implies & \Delta A \Delta B \ge \dfrac{1}{2} |\langle \Psi | [A,B] |\Psi \rangle | \end{align*}
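We can spot-check the resulting bound ΔAΔB ≥ ½|⟨[A,B]⟩| numerically with σ_x and σ_y on a random state (a sketch; the random seed is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)

psi = rng.normal(size=2) + 1j * rng.normal(size=2)
psi /= np.linalg.norm(psi)

def expect(op):
    return (psi.conj() @ op @ psi).real

def delta(op):
    # uncertainty: sqrt(<A^2> - <A>^2)
    return np.sqrt(expect(op @ op) - expect(op) ** 2)

comm = sx @ sy - sy @ sx                 # [sigma_x, sigma_y] = 2i sigma_z
bound = 0.5 * abs(psi.conj() @ comm @ psi)
lhs = delta(sx) * delta(sy)              # lhs >= bound
```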

Lecture 6 - Combining Systems: Entanglement

Tensor Products

Now, let there be two systems: system AA with space of states SAS_A and system BB with space of states SBS_B. The combined system is then SASBS_A\otimes S_B, with basis ab|ab\rangle.

Suppose Charlie has two coins, one with \sigma = +1 and one with \sigma = -1, and gives one each to Alice and Bob. Then, Alice and Bob travel very far away from each other without looking at their coins. When Alice finally looks at her coin, she immediately knows what Bob's coin is. So \langle \sigma_A \rangle = \langle \sigma_B \rangle =0 and \langle \sigma_A \sigma_B \rangle = -1. The quantity \langle \sigma_A \sigma_B \rangle - \langle \sigma_A \rangle \langle \sigma_B \rangle is the statistical correlation, and since it is nonzero, Alice's and Bob's observations are correlated.

Let's take the quantum version, with spins rather than coins. We can write any state in the combined system as Ψ=a,bψ(a,b)ab|\Psi \rangle = \sum_{a,b} \psi(a,b)|ab \rangle. Let the components of Alice's spin be σx,σy,σz\sigma_x, \sigma_y, \sigma_z with her ket vectors notated A}|A\}, and Bob's spin be τx,τy,τz\tau_x, \tau_y, \tau_z with his ket vectors notated as usual. If Alice prepares her spin in state αuU}+αdD}\alpha_u |U\} + \alpha_d |D\} and Bob prepares his in state βuU+βdD\beta_u|U\rangle + \beta_d |D\rangle, the combined product state is then

\{ \alpha_u |U\} + \alpha_d |D\} \}\otimes \{\beta_u|U\rangle + \beta_d |D\rangle\} = \alpha_u \beta_u |UU\rangle + \alpha_u\beta_d |UD \rangle + \alpha_d \beta_u |DU \rangle + \alpha_d \beta_d |DD\rangle.

Note that the tensor product is a vector space for studying combined systems; a product state is a state-vector of the product space. Most state-vectors in the product space are not product states.

Tensor products of matrices work the way we'd expect: the operator A\otimes B acts on the product space by (A\otimes B)|ab\rangle = (A|a\rangle)\otimes(B|b\rangle).
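Concretely, for matrices the tensor product is the Kronecker product, and it acts factor-by-factor (a minimal sketch using numpy's `kron`):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(2, 2))
B = rng.normal(size=(2, 2))
a = rng.normal(size=2)
b = rng.normal(size=2)

# (A tensor B) acting on |a> tensor |b> equals (A|a>) tensor (B|b>)
lhs = np.kron(A, B) @ np.kron(a, b)
rhs = np.kron(A @ a, B @ b)
mismatch = np.linalg.norm(lhs - rhs)    # ~0
```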

Entanglement

The most general vector is

\psi_{UU}|UU\rangle + \psi_{UD}|UD\rangle + \psi_{DU}|DU\rangle + \psi_{DD}|DD\rangle

with normalization condition \psi_{UU}^*\psi_{UU}+\psi_{UD}^*\psi_{UD}+\psi_{DU}^*\psi_{DU}+\psi_{DD}^*\psi_{DD}=1. Counting parameters: four complex numbers give 8 real parameters, and normalization plus the irrelevant overall phase remove 2, leaving 6 real parameters. This space of states is much richer than the individual spaces combined; the difference is entanglement.

Two examples of maximally entangled states are the singlet state sing=12(UDDU)|\text{sing}\rangle = \frac{1}{\sqrt{2}}(|UD \rangle - |DU \rangle) and the triplet states 12(UD+DU),12(UU+DD)\frac{1}{\sqrt{2}}(|UD\rangle + |DU \rangle), \frac{1}{\sqrt{2}}(|UU\rangle + |DD \rangle) and 12(UUDD)\frac{1}{\sqrt{2}}(|UU\rangle - |DD \rangle).

Two facts about a maximally entangled state: 1) the entangled state is a complete description of the combined system - nothing more can be known about it; and 2) nothing is known about the individual subsystems.

Recall the spin-polarization principle: it holds for all product states, but does not hold for |\text{sing}\rangle. In fact, we can show that \langle \sigma_x \rangle = \langle \sigma_y \rangle = \langle \sigma_z \rangle =0 because

σz=singσzsing=singσz12(UDDU) \begin{align*} \langle \sigma_z \rangle &= \langle \text{sing} | \sigma_z | \text{sing} \rangle \\ &= \langle \text{sing} |\sigma_z \dfrac{1}{\sqrt{2}} (|UD \rangle - |DU \rangle )\\ \end{align*}

so

\begin{align*} &\langle \text{sing}|\sigma_z | \text{sing} \rangle = \langle \text{sing}|\dfrac{1}{\sqrt{2}}(|UD\rangle + |DU \rangle) \\ \implies & \langle \sigma_z \rangle = \dfrac{1}{2} \left( \langle UD| - \langle DU |\right)\left(|UD \rangle + |DU\rangle\right) = 0. \end{align*}

and similarly we can show

\begin{align*} \langle \sigma_x \rangle &= \dfrac{1}{2} \left( \langle UD| - \langle DU |\right)\left(|DD \rangle - |UU\rangle\right) = 0 \\ \langle \sigma_y \rangle &= \dfrac{1}{2} \left( \langle UD| - \langle DU |\right)\left(i|DD \rangle + i|UU\rangle\right) = 0. \end{align*}

So the expectation values all vanish even though the state is known exactly - the information is hidden in the correlations.

Charlie prepares a pair of spins in the state |\text{sing}\rangle and gives one each to Alice and Bob, who measure and multiply their results, thereby measuring the observable \tau_z\sigma_z. We get

\begin{align*} \tau_z \sigma_z \dfrac{1}{\sqrt{2}}(|UD\rangle - |DU \rangle) &= \tau_z \dfrac{1}{\sqrt{2}} (|UD\rangle + |DU\rangle)\\ & = \dfrac{1}{\sqrt{2}} (-|UD\rangle + |DU \rangle) \end{align*}

so \tau_z\sigma_z |\text{sing}\rangle = -|\text{sing}\rangle, or |\text{sing}\rangle is an eigenvector of \tau_z \sigma_z with eigenvalue -1. We can check that this is also true when we replace z with x or y.

So we need an apparatus that measures the single observable \vec{\sigma}\cdot \vec{\tau}, instead of measuring one component at a time. How? Sometimes the Hamiltonian of neighboring spins is proportional to \vec{\sigma}\cdot \vec{\tau}, so we just need to measure the energy of the atomic pair.

Why are they called singlets and triplets? The singlet is the unique eigenvector of \vec{\sigma}\cdot\vec{\tau} with its eigenvalue (-3), while the three triplets share a different, degenerate eigenvalue (+1).
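A quick numerical check of this classification (a sketch; the Kronecker ordering puts Alice's spin first): the singlet has σ⃗·τ⃗ eigenvalue −3 and the triplets +1:

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)

# sigma acts on Alice's (first) spin, tau on Bob's (second)
sig_dot_tau = sum(np.kron(s, s) for s in (sx, sy, sz))

U, D = np.array([1.0, 0.0]), np.array([0.0, 1.0])
sing = (np.kron(U, D) - np.kron(D, U)) / np.sqrt(2)
trip = (np.kron(U, D) + np.kron(D, U)) / np.sqrt(2)

sing_resid = np.linalg.norm(sig_dot_tau @ sing - (-3) * sing)   # eigenvalue -3
trip_resid = np.linalg.norm(sig_dot_tau @ trip - 1 * trip)      # eigenvalue +1
```

Since each of σ_xτ_x, σ_yτ_y, σ_zτ_z gives −1 on the singlet, the sum gives −3, consistent with the eigenvalue computed above.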

Lecture 7 - More On Entanglement

Outer Products and Density Matrices

We can also form the outer product |\psi \rangle \langle \phi |, which is a linear operator that acts on a ket by |\psi \rangle \langle \phi | \, |A\rangle = |\psi \rangle \langle \phi |A \rangle and acts on a bra by \langle B| \, |\psi \rangle \langle \phi | = \langle B|\psi \rangle \langle \phi |. The case |\psi \rangle \langle \psi | (for normalized |\psi\rangle) is called a projection operator, which is Hermitian and satisfies the following: |\psi\rangle is an eigenvector of its projection operator with eigenvalue 1; any vector orthogonal to |\psi \rangle is an eigenvector with eigenvalue 0; (|\psi \rangle \langle \psi |)^2 = |\psi \rangle \langle \psi |; the trace is 1; \sum_i |i \rangle \langle i | = I for any orthonormal basis \{|i\rangle\}; and lastly, \langle L \rangle = \text{Tr}\, |\psi \rangle \langle \psi |L.

The reason we care is the density matrix: if the system is in state |\psi\rangle or |\phi\rangle with probability \frac{1}{2} each, we define \rho = \frac{1}{2} |\psi \rangle \langle \psi | + \frac{1}{2} |\phi \rangle \langle \phi |, which is useful because \langle L \rangle = \text{Tr}\, \rho L. We can extend this to n states easily.
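A minimal sketch of ⟨L⟩ = Tr ρL (the choice of σ_z as the observable and the random seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(4)

def random_state(n=2):
    v = rng.normal(size=n) + 1j * rng.normal(size=n)
    return v / np.linalg.norm(v)

psi, phi = random_state(), random_state()
# the system is in |psi> or |phi> with probability 1/2 each
rho = 0.5 * np.outer(psi, psi.conj()) + 0.5 * np.outer(phi, phi.conj())

L = np.array([[1, 0], [0, -1]], dtype=complex)   # the observable, here sigma_z
via_trace = np.trace(rho @ L).real
via_average = 0.5 * (psi.conj() @ L @ psi).real + 0.5 * (phi.conj() @ L @ phi).real
```

The trace formula reproduces the probability-weighted average of the two pure-state expectation values.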

Suppose Alice knows a wave function Ψ(a,b)\Psi(a,b), but wishes to extract as much knowledge about aa without caring about bb. Then

L=ab,abΨ(ab)Lab,abΨ(ab)=a,b,aΨ(ab)La,aΨ(ab) \begin{align*} \langle L \rangle &= \sum_{ab, a'b'} \Psi^*(a'b')L_{a'b', ab}\Psi(ab)\\ &= \sum_{a,b,a'}\Psi^*(a'b)L_{a',a}\Psi(ab) \end{align*}

and we can package that information into the matrix \rho_{aa'}=\sum_b \Psi^*(a'b)\Psi(ab), Alice's density matrix. This removes Bob's information without removing any of Alice's: taking L_{a'b', ab}=L_{a'a}\delta_{b'b} to "filter out" Bob's information gives

L=ΨLΨ=a,b,a,bψ(a,b)Lab,abψ(a,b)=a,b,aψ(a,b)La,aψ(a,b)=aaρaaLa,a=TrρL. \begin{align*} \langle L \rangle &= \langle \Psi |L|\Psi \rangle \\ &= \sum_{a,b,a',b'}\psi^*(a',b')L_{a'b', ab}\psi(a,b) \\ &= \sum_{a',b,a}\psi^*(a',b) L_{a',a}\psi(a,b) \\ &=\sum_{a'a}\rho_{a'a}L_{a,a'}\\ &= \text{Tr}\rho L. \end{align*}

To know Alice's density matrix, we must first know the entire wave function. After that, we can disregard the rest and still compute everything about Alice using just her density matrix.

Here are some properties about density matrices: they are Hermitian; Tr(ρ)=1\text{Tr}(\rho)=1; the eigenvalues lie between 00 and 11; for a pure state: ρ2=ρ,Tr(ρ2)=1\rho^2=\rho,\text{Tr}\left(\rho^2\right)=1; for a mixed or entangled state, ρ2ρ,Tr(ρ2)<1\rho^2\neq \rho, \text{Tr}\left(\rho^2\right)<1.

For a quick example: for the state-vector |\Psi \rangle =\frac{1}{\sqrt{2}} \left(|UD\rangle + |DU\rangle \right), Alice's density matrix is \begin{pmatrix} \frac{1}{2} & 0 \\ 0 & \frac{1}{2} \end{pmatrix}.
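We can check this by tracing out Bob numerically (a sketch; Alice's index is taken first in the Kronecker ordering):

```python
import numpy as np

U, D = np.array([1.0, 0.0]), np.array([0.0, 1.0])
# |Psi> = (|UD> + |DU>)/sqrt(2)
Psi = (np.kron(U, D) + np.kron(D, U)) / np.sqrt(2)

psi_ab = Psi.reshape(2, 2)            # psi(a, b): row = Alice, column = Bob
rho_A = psi_ab @ psi_ab.conj().T      # rho_{aa'} = sum_b psi(a,b) psi*(a',b)

evals = np.linalg.eigvalsh(rho_A)     # both 1/2: maximally entangled
purity = np.trace(rho_A @ rho_A)      # Tr(rho^2) = 1/2 < 1, a mixed state
```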

Tests for Entanglement

  • The correlation test: there is no entanglement if and only if \langle AB \rangle- \langle A \rangle \langle B \rangle =0 for every observable A of Alice's subsystem and B of Bob's.
  • The density matrix test: if the composite Alice-Bob system is in a product state, then Alice's density matrix has exactly one eigenvalue equal to 1, and the others are 0. Additionally, the eigenvector with the nonzero eigenvalue is nothing but the wave function of Alice's half of the system.

One issue with quantum mechanics is that in an experiment, the apparatus does not "know" the spin state until Alice looks at it. But once she does, the wave function collapses. If we bring in Bob to consider Alice, the apparatus, and the spin as one system, once he looks at this system, their wave function collapses. And so on...

Quantum mechanics does not violate locality, which states that it is impossible to send a signal faster than the speed of light. Let Alice's density matrix be \rho_{aa'}=\sum_b \psi^*(a'b)\psi(ab) and U_{bb'} be the unitary matrix for whatever happens to the entire system when Bob does his experiment. We get \psi_{\text{final}}(ab)=\sum_{b'}U_{bb'}\psi(ab') \implies \psi_{\text{final}}^*(a'b)=\sum_{b''}\psi^*(a'b'')U_{b''b}^{\dagger}, but when we plug this in, we still get the original \rho_{aa'}=\sum_b \psi^*(a'b)\psi(ab) because U is unitary.

Bell's Theorem

Consider a video game that tries to fool you into thinking there is a quantum spin in a magnetic field inside the computer, and we get to experiment to test this. The computer stores two complex numbers \alpha_U and \alpha_D, normalized so that |\alpha_U|^2+|\alpha_D|^2=1. At the start of the game, the computer initializes these values, then solves the Schrödinger equation to update the \alpha's as if they were the components of the spin's state-vector. We are allowed to manipulate the angle of the apparatus.

What if we do this on two computers? As long as they are connected through a cable and can send messages to each other instantaneously, the simulation works. But disconnecting the cable destroys it.

This is essentially Bell's theorem: classical computers need to be connected with an instantaneous cable to simulate entanglement. This is not a problem about quantum mechanics, but rather a problem with simulating quantum mechanics inside classical computers.

Lecture 8 - Particles and Waves

Interlude: Mathematics

When we consider functions as vectors, we form a Hilbert space. But we need to modify things in three ways: 1) integrals replace sums, so \langle \Psi | \Phi \rangle = \int_{-\infty}^\infty \psi^*(x)\phi(x)\,dx; 2) probability densities replace probabilities, so P(a,b)=\int_a^b \psi^*(x)\psi(x)\,dx; and 3) Dirac delta functions replace Kronecker deltas: \delta(x-x') is defined by the property that for any function F(x),

δ(xx)F(x)dx=F(x). \int_{-\infty}^\infty \delta(x-x')F(x')\,dx' = F(x).

For large values of n, the delta function is approximated by \frac{n}{\sqrt{\pi}}e^{-(nx)^2}.
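A quick numerical check of this Gaussian approximation (the choices n = 50 and F(x) = cos x are arbitrary):

```python
import numpy as np

n = 50
x = np.linspace(-1, 1, 400001)
dx = x[1] - x[0]
# Gaussian approximation to the Dirac delta for large n
delta_n = n / np.sqrt(np.pi) * np.exp(-((n * x) ** 2))

# integral of delta_n(x) F(x) dx should approach F(0) = 1 as n grows
integral = np.sum(delta_n * np.cos(x)) * dx
```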

In quantum mechanics, the limits of integration span the entire axis, and our wave functions must go to 0 at \pm\infty to be normalizable, so we can rewrite integration by parts as:

FdGdxdx=dFdxGdx. \int_{-\infty}^\infty F\dfrac{\,dG}{\,dx} \,dx = -\int_{-\infty}^\infty \dfrac{\,dF}{\,dx}G\,dx.

Consider \textbf{X} and \textbf{D}, the multiply-by-x operator and the differentiation operator. We can see that \textbf{X} is Hermitian and \textbf{D} is anti-Hermitian (\textbf{D}^\dagger = -\textbf{D}). We can make a Hermitian operator out of \textbf{D} by taking -i\hbar \textbf{D}, which satisfies

iDψ(x)=idψ(x)dx. -i\hbar \textbf{D}\psi(x)=-i\hbar \dfrac{\,d\psi(x)}{\,dx}.

Eigenstuff for Position and Momentum

For \textbf{X}, every real number x_0 is an eigenvalue of \textbf{X}, and the corresponding eigenvectors are functions that are infinitely concentrated at x=x_0. Additionally, we have \langle x_0 | \Psi \rangle = \int_{-\infty}^\infty \delta(x-x_0)\psi(x)\,dx = \psi(x_0), so \langle x| \Psi\rangle = \psi(x), and we call \psi(x) the wave function in the position representation.

Define the momentum operator \textbf{P}=-i\hbar\textbf{D}. Its eigenvectors are \psi_p(x)=\frac{1}{\sqrt{2\pi}}e^{\frac{ipx}{\hbar}} with eigenvalue p. Now, note that \langle x|p \rangle = \langle p|x \rangle^*. We can now see that the wavelength of e^{\frac{ipx}{\hbar}} is given by \lambda = \frac{2\pi \hbar}{p}, which is one reason we call it a wave function. Let's call \tilde{\psi}(p)=\langle p|\Psi \rangle the wave function in the momentum representation.
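We can verify numerically that e^{ipx/ħ} is an eigenfunction of P = −iħD (a sketch with ħ = 1 and an arbitrary p, using a central difference for the derivative):

```python
import numpy as np

# Sketch with hbar = 1; the value of p and the grid are arbitrary choices.
hbar, p = 1.0, 2.5
x = np.linspace(-5, 5, 100001)
dx = x[1] - x[0]
psi = np.exp(1j * p * x / hbar) / np.sqrt(2 * np.pi)

# apply P = -i hbar d/dx with a central difference (interior points only)
P_psi = -1j * hbar * (psi[2:] - psi[:-2]) / (2 * dx)
resid = np.max(np.abs(P_psi - p * psi[1:-1]))   # ~0: psi is an eigenfunction
```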

The relationship between the two representations is given by the Fourier transforms

\begin{align*} \tilde{\psi}(p)&=\dfrac{1}{\sqrt{2\pi}} \int \,dx\, e^{-\frac{ipx}{\hbar}}\psi(x) \\ \psi(x)&=\dfrac{1}{\sqrt{2\pi}} \int \,dp\, e^{\frac{ipx}{\hbar}}\tilde{\psi}(p) \end{align*}

Heisenberg's Uncertainty Principle

We have [\textbf{X}, \textbf{P}]=i\hbar, the quantum counterpart of the classical Poisson bracket \{x,p\}=1. Specializing the general uncertainty principle from earlier to this case gives the famous Heisenberg Uncertainty Principle:

ΔXΔP2\Delta \textbf{X} \Delta \textbf{P} \ge \frac{\hbar}{2}
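As a numerical check (a sketch with ħ = 1 and an arbitrary width σ), a Gaussian wave packet saturates the bound, giving ΔXΔP = ħ/2 exactly:

```python
import numpy as np

# Sketch with hbar = 1; the grid and the width sigma are arbitrary choices.
hbar, sigma = 1.0, 0.8
x = np.linspace(-10, 10, 200001)
dx = x[1] - x[0]
psi = (np.pi * sigma**2) ** (-0.25) * np.exp(-x**2 / (2 * sigma**2))

# <X> = 0 by symmetry, so (Delta X)^2 = <X^2>
dX = np.sqrt(np.sum(psi**2 * x**2) * dx)
# (Delta P)^2 = <P^2> = hbar^2 * integral |psi'(x)|^2 dx (integration by parts)
dP = hbar * np.sqrt(np.sum(np.gradient(psi, dx) ** 2) * dx)
product = dX * dP   # equals hbar/2 for a Gaussian
```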

Lecture 9 - Particle Dynamics

Kinematics

How do particles move in quantum mechanics? Plugging \textbf{H}=c\textbf{P} into the Schrödinger equation gives

ψ(x,t)t=cψ(x,t)x \dfrac{\partial \psi(x,t)}{\partial t}=-c\dfrac{\partial \psi(x,t)}{\partial x}

so any \psi(x-ct) is a solution. Since we want \int_{-\infty}^\infty \psi^*(x)\psi(x)\,dx =1, \psi must look like a wave packet. Altogether: this particle can only move to the right, its energy can be either positive or negative, and it can only exist in a state moving at this one particular velocity. The classical description is that momentum is conserved and the position moves with fixed velocity c; the quantum analog is that the whole probability distribution, and hence the expected position, moves with velocity c.

For an ordinary nonrelativistic free particle the Hamiltonian is H=\frac{1}{2}mv^2 = \frac{p^2}{2m}, and messing around with some algebra gives the traditional Schrödinger equation:

iψt=22m2ψx2. i\hbar \dfrac{\partial \psi}{\partial t}=-\dfrac{\hbar^2}{2m}\dfrac{\partial^2 \psi}{\partial x^2}.

To solve the time-dependent Schrödinger, we need to solve the time-independent version first. The function ψ(x)=eipx\psi(x)=e^{\frac{ipx}{\hbar}} solves the independent version, and we can use this to solve the time-dependent version:

\psi(x,t)=\int \tilde{\psi}(p) \exp\left( \dfrac{i\left(px-\frac{p^2t}{2m}\right)}{\hbar} \right) \,dp.

Let's look at the quantum analog of v=pmv=\frac{p}{m}. We have

v=\dfrac{\,d}{\,dt}\int \psi^*(x,t)\,x\,\psi(x,t)\,dx.

From lecture 4, we have

v=i2m[P2,X]=i2mP[P,X]+[P,X]P v=\dfrac{i}{2m\hbar}\langle [\textbf{P}^2, \textbf{X}] \rangle = \dfrac{i}{2m\hbar}\langle \textbf{P}[\textbf{P},\textbf{X}]+[\textbf{P}, \textbf{X}]\textbf{P} \rangle

and using [P,X]=i[\textbf{P}, \textbf{X}]=-i\hbar gives P=mv\langle P\rangle = mv.

As we saw, we can take a classical system, replace the classical phase space with a linear vector space, replace xx with X\textbf{X} and pp with P\textbf{P}, and use the Hamiltonian to solve the time-dependent equation (how the wave function changes with time) or the time-independent equation (to find the eigenvectors and eigenvalues of the Hamiltonian). This process is known as quantization.

Forces

Classically,

F(x)=md2xdt2=Vx. F(x)=m\dfrac{\,d^2 x}{\,dt^2}=-\dfrac{\partial V}{\partial x}.

Quantization tells us to make H=P22m+V(x)\textbf{H}=\frac{\textbf{P}^2}{2m}+\textbf{V}(x) and modify Schrödinger to

i\hbar \dfrac{\partial \psi}{\partial t} = -\dfrac{\hbar^2}{2m}\dfrac{\partial^2 \psi}{\partial x^2}+V(x)\psi.

We can show that [\textbf{X}, \textbf{V}(x)]=0 and

ddtP=i2m[P2,P]+i[V,P]. \dfrac{\,d}{\,dt} \langle \textbf{P}\rangle = \dfrac{i}{2m\hbar}\langle [\textbf{P}^2, \textbf{P}]\rangle + \dfrac{i}{\hbar} \langle [\textbf{V}, \textbf{P}]\rangle.

Using [\textbf{V}(x), \textbf{P}]=i\hbar \frac{\,dV(x)}{\,dx} we get \frac{\,d}{\,dt}\langle \textbf{P}\rangle = -\langle \frac{\,dV}{\,dx}\rangle.

This shows that the classical equations are only approximations, good when we can replace the average of \frac{\,dV}{\,dx} with \frac{\,dV}{\,dx} evaluated at the average position. We can do this when V(x) varies slowly compared to the size of the wave packets.

For an example of a "bad potential", consider a bunch of large, closely packed spikes of size \delta x with \delta x<\Delta x. The Heisenberg Uncertainty Principle tells us \Delta x \sim \frac{\hbar}{m\Delta v}, which shows that large masses and smooth potentials behave more classically, while particles with low mass moving through an abrupt potential behave like a quantum mechanical system. Equality in Heisenberg's Uncertainty Principle holds for the Gaussian wave packets, which we'll discuss later.

Path Integrals

Classically, the action is A=\int_{t_1}^{t_2} L(x,\dot{x})\,dt = \int_{t_1}^{t_2}\left( \frac{m\dot{x}^2}{2} - V(x) \right)\,dt. The quantum analog question is: given that a particle starts at (x_1,t_1), what is the amplitude C_{1,2} that it shows up at (x_2,t_2)? We have

\begin{align*} C_{1,2}&=\langle x_2 | e^{-\frac{iH(t_2-t_1)}{\hbar}}|x_1\rangle\\ &= \int \,dx \langle x_2 |e^{-\frac{iHt}{2\hbar}}|x \rangle \langle x|e^{-\frac{iHt}{2\hbar}}|x_1 \rangle \\ &= \int_{\text{paths}}e^{\frac{iA}{\hbar}} \end{align*}

which is the extremely powerful path integral formulation by Feynman.

Lecture 10 - Harmonic Oscillator

Classical vs Quantum

Classically, L=\frac{1}{2}m\dot{y}^2-\frac{1}{2}ky^2 = \frac{1}{2}\dot{x}^2-\frac{1}{2}\omega^2 x^2 where x=\sqrt{m}y, \omega = \sqrt{\frac{k}{m}}. The Euler-Lagrange equation \frac{\partial L}{\partial x}=\frac{\,d}{\,dt}\frac{\partial L}{\partial \dot{x}} gives x=A\cos(\omega t)+B\sin(\omega t).

We have p=\frac{\partial L}{\partial \dot{x}}=\dot{x}, so H=p\dot{x}-L=\frac{1}{2}\dot{x}^2 + \frac{1}{2}\omega^2x^2 =\frac{1}{2}p^2 + \frac{1}{2}\omega^2 x^2. For the quantum mechanical version, acting with \mathbf{H} on \psi(x) gives -\frac{\hbar^2}{2} \frac{\partial^2 \psi(x)}{\partial x^2}+\frac{1}{2}\omega^2 x^2 \psi(x). Plugging this into the time-dependent Schrödinger equation gives

i\hbar\dfrac{\partial \psi}{\partial t} = -\dfrac{\hbar^2}{2}\dfrac{\partial^2 \psi}{\partial x^2}+\dfrac{1}{2}\omega^2 x^2 \psi.

Energy Levels

We can also calculate the energy levels by solving the time-independent Schrödinger:

222ψE(x)x2+12ω2x2ψE(x)=EψE(x). -\dfrac{\hbar^2}{2} \dfrac{\partial^2 \psi_E(x)}{\partial x^2}+\dfrac{1}{2} \omega^2 x^2 \psi_E(x)=E\psi_E(x).

But this equation also has nonsensical solutions, so we impose a condition: physical solutions of the Schrödinger equation must be normalizable.

The lowest energy level is the ground state \psi_0(x). To identify it, there is a theorem (left unproved): the ground-state wave function for any potential has no zeros, and it is the only energy eigenstate with no nodes. Plugging the guess \psi_0(x)=e^{-\frac{\omega}{2\hbar}x^2} into the Schrödinger equation reduces it to

2ωeω2x2=Eeω2x2. \dfrac{\hbar}{2} \omega e^{-\frac{\omega}{2\hbar}x^2}=Ee^{-\frac{\omega}{2\hbar}x^2}.

We can write E0=ω2E_0=\frac{\omega \hbar}{2} and ψ0(x)=eω2x2\psi_0(x)=e^{-\frac{\omega}{2\hbar}x^2}.

Creation and Annihilation

Take \mathbf{H}=\frac{1}{2}(\mathbf{P}^2 + \omega^2 \mathbf{X}^2)= \frac{1}{2} (\mathbf{P}+i\omega \mathbf{X})(\mathbf{P}-i\omega \mathbf{X})+\frac{\omega \hbar}{2}. Consider the lowering operator \mathbf{a}^- = \frac{i}{\sqrt{2\omega \hbar}}(\mathbf{P}-i\omega \mathbf{X}) and the raising operator \mathbf{a}^+ = \frac{-i}{\sqrt{2\omega \hbar}}(\mathbf{P}+i\omega \mathbf{X}). We can now rewrite \mathbf{H}=\omega \hbar (\mathbf{a}^+ \mathbf{a}^- + \frac{1}{2}). Additionally, we have [\mathbf{a}^-, \mathbf{a}^+]=1, and defining the number operator \mathbf{N}=\mathbf{a}^+ \mathbf{a}^- gives \mathbf{H}=\omega \hbar(\mathbf{N}+\frac{1}{2}), as well as the relations [\mathbf{a}^-, \mathbf{N}]=\mathbf{a}^- and [\mathbf{a}^+, \mathbf{N}]=-\mathbf{a}^+. Since \mathbf{a}^+|n\rangle \propto | n+1 \rangle and \mathbf{a}^-|n\rangle \propto |n-1 \rangle, we can show that E_n=\omega\hbar (n+\frac{1}{2}).
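The spectrum E_n = ωħ(n + ½) can be checked numerically (a sketch with ħ = ω = 1; the grid extent and spacing are arbitrary choices) by diagonalizing the Hamiltonian discretized on a grid:

```python
import numpy as np

# Sketch with hbar = omega = 1; grid extent and spacing are arbitrary choices.
hbar, omega = 1.0, 1.0
N = 800
x = np.linspace(-10, 10, N)
dx = x[1] - x[0]

# finite-difference second derivative, so H = -(hbar^2/2) d^2/dx^2 + (omega^2/2) x^2
D2 = (np.diag(-2.0 * np.ones(N))
      + np.diag(np.ones(N - 1), 1)
      + np.diag(np.ones(N - 1), -1)) / dx**2
H = -(hbar**2 / 2) * D2 + np.diag((omega**2 / 2) * x**2)

E = np.linalg.eigvalsh(H)[:4]   # should approximate hbar*omega*(n + 1/2)
```

The lowest eigenvalues come out close to 0.5, 1.5, 2.5, 3.5, matching the ladder-operator result.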

Wave Functions, Again

We have \frac{i}{\sqrt{2\omega \hbar}}(\mathbf{P}-i\omega \mathbf{X})\psi_0(x)=0 \implies \frac{\,d\psi_0}{\,dx}=-\frac{\omega x}{\hbar}\psi_0(x), which has solution e^{-\frac{\omega}{2\hbar}x^2}. Applying the raising operator gives \psi_1(x) \propto x \psi_0(x), and we can repeat this as many times as we like to get \psi_n(x); the polynomials generated in this sequence are called the Hermite polynomials.

These eigenfunctions also show quantum tunneling: they approach zero asymptotically but never reach 00, so there's a chance the particle is "outside the bowl" that defines its potential energy function.

Gary Hu
© 2025