Notes from watching Leonard Susskind's Theoretical Minimum series. These are scribed notes of what I've learned, ideally at a level that's easy to understand. Another goal of this blog is to be succinct - these notes came from 18.5 hours of lectures, which were then turned into a ~400 page book. To be more succinct, I will leave out some details and avoid being overly wordy; the reader should think of these notes as a skeleton of the actual course, rather than a replacement.
Please reach out to me if you would like to discuss anything here or have any questions!
Classical mechanics is intuitive because we can visualize the objects being studied. While it becomes mathematically abstract with concepts like Lagrangians and Poisson brackets, its basic principles are motivated by the observable world. In classical mechanics, we consider 1) a system of objects, and 2) the state of this system. Its deterministic nature means that knowing the system's state at any given time allows us to predict its future behavior.
Beyond classical mechanics, abstraction increases. Our brains operate in three dimensions, making it impossible to visualize higher or lower dimensions accurately. No one can visualize 4 dimensions, and when we visualize lower dimensions (say 2d), we are seeing a 2d plane embedded in 3 dimensions. Good physicists excel not in visualizing higher dimensions but in using abstract mathematics to describe them. Thus, understanding quantum mechanics requires relying on abstract math rather than trying to visualize it as an extension of classical mechanics.
In classical mechanics, the key question is: what is the state of a closed system? The space of states encompasses all possible states of a system. For example, a die has the state space {1, 2, 3, 4, 5, 6}, and a coin has {heads, tails}. For a point particle on a line, the phase space includes all possible positions and momenta, forming a continuously infinite, two-dimensional plane that we can graph.
Consider the die again. If we look at two subsets:
odd numbers: {1, 3, 5}
numbers ≤ 3: {1, 2, 3}
We see that "AND" is their intersection, {1, 3}, and "OR" is their union, {1, 2, 3, 5}. These concepts come from set theory, but in quantum mechanics, the space of states is a vector space, not a set.
Rather than diving into vector spaces, let's focus on experiments.
Experiments
Let's examine a coin in classical mechanics (cbit) and quantum mechanics (qubit). The quantum analog is more intriguing. Each qubit has two states: heads or tails, represented by σ=+1 for heads (↑) and σ=−1 for tails (↓).
We also have an apparatus with a "THIS SIDE UP" sign. When placed upright, the apparatus detects the qubit's state and displays σ on the screen. Instead of just measuring, think of this as preparing the system. If we measure σ=+1, reset the apparatus, and quickly repeat the experiment with the same qubit, the screen consistently shows +1. Thus, the apparatus not only measures but also prepares the qubit's state.
Let's flip the apparatus so that "THIS SIDE UP" is on the bottom, orienting it downwards. Now, redoing the experiment with the same qubit gives σ=−1. Repeating the experiment consistently yields σ=−1, but flipping the apparatus back shows σ=+1 again. This suggests the qubit has directionality, indicating it behaves like a vector. Alternatively, the detector's orientation could influence the screen to display the vector component along its direction.
What if we rotate the apparatus 90∘ so "THIS SIDE UP" points left? Using multiple ↑ qubits one after another, we expect the screen to display 0, as the horizontal component of ↑ is 0. In classical physics, this would happen. However, in quantum mechanics, the screen shows either +1 or −1, averaging to 0 over many trials, so the probability of getting +1 and −1 is the same.
If we generalize by changing the angle to θ, the component of the ↑ vector along the tilted axis is cosθ. Classically, we'd expect the output to be cosθ. In the quantum world, the output is still +1 or −1, but the average value over many trials will be cosθ.
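To make the statistics concrete, here is a minimal numpy simulation (my own illustration, not from the lectures) of the tilted apparatus. It assumes the Born-rule probabilities cos²(θ/2) and sin²(θ/2), which are derived later in these notes; each trial clicks ±1, and only the average reproduces cosθ.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = np.pi / 3  # tilt angle of the apparatus

# Probability of reading +1 on an up-spin measured along a tilted axis
# (derived later in these notes): cos^2(theta/2).
p_plus = np.cos(theta / 2) ** 2

# Each individual trial is +1 or -1; never a fractional reading.
outcomes = rng.choice([+1, -1], size=100_000, p=[p_plus, 1 - p_plus])
print("average over trials:", outcomes.mean())  # ~0.5
print("cos(theta):         ", np.cos(theta))    # 0.5
```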
Some Mathematics
Let's take a detour to introduce more mathematics - let's discuss vector spaces. For now, it suffices to define a vector space. Normally, we use v or v⃗ to denote vectors; in quantum mechanics, we use bra-ket notation, where ⟨v∣ is a bra, and ∣v⟩ is a ket. We can now define a vector space as follows:
A vector space V is a collection of vectors ∣v⟩, satisfying the following:
Addition: ∣a⟩+∣b⟩=∣c⟩, where ∣a⟩,∣b⟩,∣c⟩∈V.
Multiplication: z∣a⟩=∣a′⟩ for z∈C and ∣a⟩,∣a′⟩∈V.
There are a couple more axioms that are generally included (e.g. there is a zero vector ∣0⟩ such that ∣0⟩+∣a⟩=∣a⟩ for all ∣a⟩∈V), but Professor Susskind ignores them for now.
Let's take a look at a few examples:
The real numbers form a real vector space (here we must restrict the scalars to z∈R).
The complex numbers are a complex vector space.
Functions form a vector space.
The collection of all column vectors (α1, α2, ..., αn) with α1, α2, ..., αn∈C is a vector space.
Now, let's introduce the idea of a dual vector space. Given a vector space V, we construct a dual vector space V∨. The basic idea is for every vector ∣v⟩∈V, there is a vector ⟨v∣∈V∨.
We can formulate this more rigorously as follows:
Given a vector space V containing vectors ∣v⟩, the dual vector spaceV∨ contains vectors ⟨v∣, such that there is a 1-to-1 correspondence between:
∣a⟩↔⟨a∣
∣a⟩+∣b⟩↔⟨a∣+⟨b∣.
z∣a⟩↔⟨a∣z∗, where z∗ is the complex conjugate of z.
Now, let's define the inner product, the analogue of the dot product.
Given ⟨a∣ and ∣b⟩, the inner product is an operation that gives ⟨a∣b⟩, which satisfies ⟨a∣b⟩=⟨b∣a⟩∗.
Let's look at an example.
Define ⟨b∣a⟩=β1∗α1+β2∗α2, where ∣a⟩ has components α1, α2 and ∣b⟩ has components β1, β2. Then ⟨a∣b⟩=α1∗β1+α2∗β2. Now we can see ⟨a∣b⟩=⟨b∣a⟩∗.
What if we set b=a? Then ⟨a∣a⟩=⟨a∣a⟩∗, so the inner product of any vector with itself is always real (and, from the formula above, non-negative). Its square root is generally known as the length of the vector.
We say two vectors are orthogonal if ⟨b∣a⟩=0. Define the dimension of a vector space as the maximum number of nonzero mutually orthogonal vectors. For example, in the vector space of 2d column vectors, (1, 0) and (0, 1) are orthogonal, and since no third nonzero vector is orthogonal to both, this vector space has dimension 2.
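Since we'll lean on these definitions constantly, here is a quick numpy sanity check (the vectors are my own arbitrary examples). Note that np.vdot conjugates its first argument, which is exactly the bra:

```python
import numpy as np

a = np.array([1 + 2j, 3 - 1j])  # |a>
b = np.array([0.5j, 2 + 0j])    # |b>

# np.vdot conjugates its first argument, so vdot(b, a) = <b|a>.
print(np.isclose(np.vdot(a, b), np.conj(np.vdot(b, a))))  # <a|b> = <b|a>* -> True

# <a|a> is real and non-negative; its square root is the length.
print(np.vdot(a, a))  # (15+0j)

# Orthogonality and dimension: the standard 2d basis vectors.
e1, e2 = np.array([1, 0]), np.array([0, 1])
print(np.vdot(e1, e2))  # 0 -> orthogonal
```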
Lecture 2 - Quantum States
Logic
Let's rename ↑ to spin up, and ↓ to spin down. When we measure a spin up qubit in an upwards-oriented apparatus, we are measuring σz=±1 - the z-component of the spin. And if we turn the apparatus on its side, we get σx=±1. And so on...
Now we generalize. Suppose we have previously prepared an experiment where everything is oriented along the n̂ axis. Then, we rotate the apparatus in 3d space so that the orientation vector points in the m̂ direction. It turns out that the average value of the experiment is ⟨σm⟩=n̂⋅m̂=cosθ, the component of n̂ along m̂.
Classically, a proposition about the state of a system corresponds to the subset of states for which it is true. One type of proposition is a NOT statement, which is exactly what it sounds like - the subset of states that do not satisfy the proposition. A proposition can be either TRUE or FALSE. Two other types of propositions we've seen already are AND and OR statements - these behave differently in quantum mechanics. In English, there are two types of OR - the inclusive one (the union) and the exclusive one (in one subset, but not the other). Generally when we speak, we mean the exclusive one, but in logic, we mean the inclusive OR (the union).
Take two propositions:
A:σz=+1
B:σx=+1
Let's design an experiment to check A OR B on ↑. First, test A: orient the apparatus upward and measure the spin along the z-axis. Since the spin was prepared in ↑, we get σz=+1, so A is true and we are done. A OR B is TRUE because the OR only requires that one of them be true.
Let's redo the experiment, but flip the order of A and B.
We will get σx=+1 with probability 1/2. Then B OR A is TRUE.
We will get σx=−1 with probability 1/2, so we still need to test A. But measuring B left the spin pointing sideways, so when we now test A, we get +1 with probability 1/2 and −1 with probability 1/2. So the chance B OR A is true is 1/2 + 1/4 = 3/4, and order does matter.
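Here is a small Monte Carlo sketch of this order dependence (my own illustration). It models each measurement as a Born-rule collapse onto the measured axis, a rule stated formally in later lectures:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 100_000

up, down = np.array([1, 0]), np.array([0, 1])
right = (up + down) / np.sqrt(2)  # sigma_x = +1 state
left = (up - down) / np.sqrt(2)   # sigma_x = -1 state

def measure(state, plus, minus):
    """Born rule: collapse `state` onto one of two orthogonal outcomes."""
    p_plus = abs(np.vdot(plus, state)) ** 2
    return (+1, plus) if rng.random() < p_plus else (-1, minus)

# Testing A (sigma_z) first on an up spin always succeeds: P = 1.
# Testing B (sigma_x) first, then A only if B failed:
true_count = 0
for _ in range(N):
    state = up
    b, state = measure(state, right, left)  # proposition B
    if b == +1:
        true_count += 1
        continue
    a, state = measure(state, up, down)     # proposition A
    true_count += (a == +1)

print("P(B OR A) ~", true_count / N)  # ~0.75, not 1
```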
The space of states of a quantum system is a linear vector space, so given any two states, we can add them to create a third state. You cannot do this with sets - given two elements of a set, you can't always combine them to make a third element.
The Connection
Finally, let's state the connection between a vector space and the space of states of a quantum system. In our previous experiments, we saw that σz, σx, or σy=±1. Let's name the corresponding states ∣U⟩ (up), ∣D⟩ (down), ∣L⟩ (left), ∣R⟩ (right), ∣I⟩ (in), and ∣O⟩ (out). Suppose we had a ket vector ∣A⟩=αU∣U⟩+αD∣D⟩. The probability of measuring up would be PU=αU∗αU, and the probability of measuring down is PD=αD∗αD.
One basic postulate that we will accept: orthogonality means that two states are sufficiently different that with a single experiment you can tell the difference. So ∣U⟩ and ∣D⟩ are orthogonal to each other. We will show that they can serve as the basis for our vector space (of dimension 2).
Since PU+PD=1, we have αU∗αU+αD∗αD=1 ⟺ ⟨A∣A⟩=1. Using similar ideas, we can guess that ∣R⟩=(1/√2)∣U⟩+(1/√2)∣D⟩ and ∣L⟩=(1/√2)∣U⟩−(1/√2)∣D⟩. We can now derive the identities:
∣U⟩=(∣R⟩+∣L⟩)/√2, ∣D⟩=(∣R⟩−∣L⟩)/√2.
Furthermore, we can show that ∣I⟩=(1/√2)∣U⟩+(i/√2)∣D⟩ and ∣O⟩=(1/√2)∣U⟩−(i/√2)∣D⟩.
We can rewrite everything as column vectors to conclude the following:
Let ∣i⟩ be basis vectors (mutually orthonormal vectors). Then we can write any element in our vector space as a linear combination of the basis vectors, ie. ∣A⟩=∑iαi∣i⟩ for αi∈C.
We can now do inner products: ⟨j∣A⟩=∑i αi⟨j∣i⟩=∑i αi δij=αj, where δij is the Kronecker delta:
δij = 1 if i=j, and 0 otherwise.
So ∣A⟩=∑i ∣i⟩⟨i∣A⟩, which is a useful identity used to simplify things. The exact same thing is true for bra vectors: ⟨A∣=∑i ⟨A∣i⟩⟨i∣.
Here's another reason why we want to use linear algebra: observables (the things we measure) are linear operators.
A linear operator M, acting on a ket ∣A⟩, gives a unique ket vector ∣B⟩=M∣A⟩, satisfying:
M[z∣A⟩]=zM∣A⟩.
M[∣A⟩+∣B⟩]=M∣A⟩+M∣B⟩.
Using this definition, we can show that
⟨i∣M∣A⟩=⟨i∣B⟩=βi
Using our useful identity, we can rewrite this as
∑j ⟨i∣M∣j⟩⟨j∣A⟩ = ∑j ⟨i∣M∣j⟩αj = βi.
We define the matrix elements as Mij:=⟨i∣M∣j⟩, which are important because a linear operator is completely characterized by its matrix elements. So we can rewrite the above as ∑j Mij αj = βi, which in matrix notation is just the matrix (Mij) multiplying the column vector (αj) to give the column vector (βi).
Since we have matrices, it will be useful to define eigenvectors.
The eigenvectors of M are vectors ∣λi⟩ such that M∣λi⟩−λi∣λi⟩=0, and each eigenvector ∣λi⟩ has eigenvalue λi.
These are the vectors whose direction is unchanged by the operator; the only thing that happens to the vector is that it's scaled, and this scaling factor is called the eigenvalue.
We defined linear operators on ket vectors; they can also act on bra vectors, written ⟨A∣M.
One may think that M∣A⟩=∣B⟩ ⟺ ⟨A∣M=⟨B∣, but this is wrong. The correct statement is M∣A⟩=∣B⟩ ⟺ ⟨A∣M†=⟨B∣, where M†=[Mᵀ]∗ (i.e. change mij to mji∗: first transpose the matrix, then complex conjugate all of its entries) is the Hermitian conjugate.
Since quantum mechanical measurements are always real, quantum mechanical observables are represented by Hermitian operators, which satisfy the property M=M†. One nice property of Hermitian operators is that their eigenvalues must be real: for a Hermitian operator L, L∣λ⟩=λ∣λ⟩ ⟹ ⟨λ∣L∣λ⟩=λ⟨λ∣λ⟩, while ⟨λ∣L=⟨λ∣L†=λ∗⟨λ∣ ⟹ ⟨λ∣L∣λ⟩=λ∗⟨λ∣λ⟩, so λ=λ∗, i.e. λ is real. Professor Susskind calls the next statement the fundamental theorem because of how important it is; we can state it precisely as follows:
The eigenvectors of a Hermitian operator form an orthonormal basis.
Take L∣λ1⟩=λ1∣λ1⟩ (equivalently, ⟨λ1∣L=λ1⟨λ1∣) and L∣λ2⟩=λ2∣λ2⟩. Taking inner products gives ⟨λ1∣L∣λ2⟩=λ1⟨λ1∣λ2⟩ and ⟨λ1∣L∣λ2⟩=λ2⟨λ1∣λ2⟩. Subtracting gives (λ1−λ2)⟨λ1∣λ2⟩=0, so if λ1≠λ2, the two eigenvectors are orthogonal.
Now suppose λ1=λ2=λ (a degenerate eigenvalue) and let ∣A⟩=α∣λ1⟩+β∣λ2⟩. Then L∣A⟩=αL∣λ1⟩+βL∣λ2⟩=λ(α∣λ1⟩+β∣λ2⟩)=λ∣A⟩, so any linear combination of the two is again an eigenvector with the same eigenvalue. Within such a degenerate subspace we can use Gram-Schmidt to pick mutually orthonormal eigenvectors. It remains to show that if our space is N-dimensional, there are N orthonormal eigenvectors, which is a simple linear algebra exercise.
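numpy's eigh routine is organized around exactly this theorem; a quick check on a random Hermitian matrix (my own example):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
L = (A + A.conj().T) / 2  # a random Hermitian operator

eigvals, V = np.linalg.eigh(L)  # eigh assumes Hermitian input
print(eigvals)  # returned as real floats: the eigenvalues are real

# The columns of V are the eigenvectors and form an orthonormal basis.
print(np.allclose(V.conj().T @ V, np.eye(4)))             # V^dagger V = I
print(np.allclose(V @ np.diag(eigvals) @ V.conj().T, L))  # L rebuilt from them
```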
Principles
Professor Susskind states that there are 4 basic principles of quantum mechanics:
The observable or measurable quantities of quantum mechanics are represented by linear operators L.
If the system is in eigenstate ∣λi⟩, the result of a measurement is guaranteed to be λi.
Unambiguously distinguishable states are represented by orthogonal vectors.
If ∣A⟩ is the state vector of a system, and the observable L is measured, the probability to observe value λi is P(λi)=⟨A∣λi⟩⟨λi∣A⟩=∣⟨A∣λi⟩∣2.
In particular, these conditions combine to imply that L must be Hermitian.
Explicit Pauli Matrices
Let's write down the spin operators as 2×2 matrices. Starting with σz, principle 2 tells us σz∣U⟩=∣U⟩ and σz∣D⟩=−∣D⟩, and principle 3 tells us ⟨U∣D⟩=0. Writing these as matrix equations and solving gives σz = [[1, 0], [0, −1]]. Doing the same with ∣R⟩ and ∣L⟩ gives σx = [[0, 1], [1, 0]].
For our last direction, σy, we use ∣I⟩=(1/√2)∣U⟩+(i/√2)∣D⟩ and ∣O⟩=(1/√2)∣U⟩−(i/√2)∣D⟩, i.e. the column vectors ∣I⟩=(1/√2, i/√2) and ∣O⟩=(1/√2, −i/√2), so
σy = [[0, −i], [i, 0]].
These three boxed matrices are collectively called the Pauli matrices.
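Writing the three matrices in numpy and checking them against principles 2 and 3 (a sketch with my own variable names):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

print(np.allclose(sz @ up, up))       # sigma_z |U> = +|U>
print(np.allclose(sz @ down, -down))  # sigma_z |D> = -|D>
print(np.vdot(up, down))              # <U|D> = 0

# Each Pauli matrix is Hermitian with eigenvalues -1 and +1.
for s in (sx, sy, sz):
    assert np.allclose(s, s.conj().T)
    print(np.linalg.eigvalsh(s))      # [-1.  1.]
```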
Implications
Although σ is not a 3-vector (its components are matrices, not numbers), it behaves a lot like one. Define σn=σ⋅n̂=σxnx+σyny+σznz. Then
σn = [[nz, nx−i ny], [nx+i ny, −nz]],
which gives us a way to calculate what happens when we orient the apparatus along n^, painting a complete picture of spin measurements in 3d space.
If we let n^ sit in the x−z plane, then
σn = [[cosθ, sinθ], [sinθ, −cosθ]].
The eigenvectors are ∣λ1⟩=(cos(θ/2), sin(θ/2)) and ∣λ2⟩=(−sin(θ/2), cos(θ/2)), with eigenvalues +1 and −1, respectively. Suppose our apparatus initially points along the z-axis, and we rotate it so it lies along the n̂ axis. Giving it a spin in the ∣U⟩ state, what's the probability of observing σn=±1? Principle 4 gives us
P(+1)=∣⟨U∣λ1⟩∣²=cos²(θ/2), P(−1)=∣⟨U∣λ2⟩∣²=sin²(θ/2).
The expected value is thus
⟨L⟩=∑i λi P(λi)=cos²(θ/2)−sin²(θ/2)=cosθ.
There is one more theorem for this section:
(The Spin-Polarization Principle)
Any state of a single spin is an eigenvector of some component of the spin.
This means given a state ∣A⟩, there exists some direction n̂ such that (σ⋅n̂)∣A⟩=∣A⟩. Two implications: 1) for any spin state, we can orient the apparatus in some direction so that it registers +1, and 2) there is no state where the expected values of all three components of the spin are 0. In fact, ⟨σx⟩²+⟨σy⟩²+⟨σz⟩²=1.
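A numeric check of that last identity on a random normalized spin state (my own sketch):

```python
import numpy as np

rng = np.random.default_rng(3)
psi = rng.standard_normal(2) + 1j * rng.standard_normal(2)
psi /= np.linalg.norm(psi)  # normalize the state

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

expectations = [np.vdot(psi, s @ psi).real for s in (sx, sy, sz)]
print(sum(e ** 2 for e in expectations))  # 1.0 for every pure spin state
```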
Lecture 4 - Time and Change
Unitarity
In classical mechanics, Professor Susskind introduced what he called the minus first law: information from the past is never lost. The quantum analog is the conservation of distinctions. Let the state of the system be ∣Ψ(t)⟩. The state at time t is obtained from the state at time 0 by a linear operator U(t): ∣Ψ(t)⟩=U(t)∣Ψ(0)⟩, where U is called the time-development operator for the system. This is the basic dynamical assumption of quantum mechanics: if you know the state at one time, then the quantum equations of motion tell you what it will be later.
The main difference between classical and quantum determinism is as follows: classical determinism allows us to predict the results of an experiment, whereas quantum determinism only allows us to compute the probabilities of the outcomes of later experiments.
Suppose ∣Ψ(0)⟩ and ∣Φ(0)⟩ are orthogonal. Then ⟨Ψ(0)∣Φ(0)⟩=0, and the conservation of distinctions implies ⟨Ψ(t)∣Φ(t)⟩=0. Rewriting ⟨Ψ(t)∣=⟨Ψ(0)∣U†(t), and using ∣Φ(t)⟩=U(t)∣Φ(0)⟩ from earlier, implies ⟨Ψ(0)∣U†(t)U(t)∣Φ(0)⟩=0. Consider an orthonormal basis ∣i⟩ that includes ∣Φ(0)⟩ and ∣Ψ(0)⟩ as basis vectors. We get ⟨i∣U†(t)U(t)∣j⟩=δij, so U†U=I (an operator that satisfies this condition is called unitary).
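For any Hermitian H, the matrix exponential U(t)=e^(−iHt) is unitary, and orthogonal states stay orthogonal; a quick scipy check (the Hamiltonian here is an arbitrary example of mine, with ℏ=1):

```python
import numpy as np
from scipy.linalg import expm

H = np.array([[1.0, 0.5], [0.5, -1.0]])  # any Hermitian Hamiltonian
U = expm(-1j * H * 2.7)                  # time-development operator at t = 2.7

print(np.allclose(U.conj().T @ U, np.eye(2)))  # U^dagger U = I -> True

# Conservation of distinctions: orthogonal states remain orthogonal.
psi0 = np.array([1, 0], dtype=complex)
phi0 = np.array([0, 1], dtype=complex)
print(np.vdot(U @ psi0, U @ phi0))  # ~0
```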
Now we can introduce our fifth principle of quantum mechanics:
The evolution of state vectors with time is unitary.
The Hamiltonian
Let U(ϵ)=I−iϵH for a small time ϵ. Unitarity requires (I+iϵH†)(I−iϵH)=I, which to first order in ϵ gives H†=H. This operator H is called the quantum Hamiltonian, an observable whose eigenvalues measure the energy of a quantum system. Using ∣Ψ(t)⟩=U(t)∣Ψ(0)⟩ and taking t=ϵ, we get ∣Ψ(ϵ)⟩=∣Ψ(0)⟩−iϵH∣Ψ(0)⟩. Rearranging and taking ϵ→0 gives
∂∣Ψ⟩/∂t = −iH∣Ψ⟩,
the time-dependent Schrödinger equation. So the reason we care about the quantum Hamiltonian is that it tells us how the state of an undisturbed system evolves with time.
Planck's constant is ℏ=1.0545×10⁻³⁴ kg⋅m²/s, which we need to insert into the time-dependent Schrödinger equation to make the dimensions actually make sense:
ℏ ∂∣Ψ⟩/∂t = −iH∣Ψ⟩.
It is worth noting that Planck originally came up with the constant h=2πℏ, but later physicists changed it to remove the need to write 2π in a ton of places.
Averaging
Last lecture, we looked at averages. Here's a nice trick to compute them:
⟨L⟩=⟨A∣L∣A⟩
where ∣A⟩ is the normalized state of a quantum system. To prove this, rewrite ∣A⟩=∑i αi∣λi⟩ ⟹ L∣A⟩=∑i αi L∣λi⟩. Since L∣λi⟩=λi∣λi⟩, we get L∣A⟩=∑i αi λi∣λi⟩. Lastly, take the inner product with ⟨A∣ to get ⟨A∣L∣A⟩=∑i (αi∗αi)λi, and the result follows.
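Checking the trick numerically against a brute-force average over the eigenvalue distribution (a sketch; the observable and state are arbitrary examples of mine):

```python
import numpy as np

rng = np.random.default_rng(4)
M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
L = (M + M.conj().T) / 2  # an arbitrary observable

A = rng.standard_normal(3) + 1j * rng.standard_normal(3)
A /= np.linalg.norm(A)    # normalized state |A>

direct = np.vdot(A, L @ A).real  # <A|L|A>

# Weighted sum over eigenvalues, with P(lambda_i) = |<lambda_i|A>|^2.
lam, vecs = np.linalg.eigh(L)
probs = np.abs(vecs.conj().T @ A) ** 2
print(direct, (lam * probs).sum())  # the same number
```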
We can use averaging to show that we can always scale a state-vector by a phase factor e^(iθ) and nothing will change: take ∣A⟩=∑i αi∣λi⟩. If we let ∣B⟩=e^(iθ)∣A⟩, we can show that they have the same magnitude because ⟨B∣B⟩=⟨A∣e^(−iθ)e^(iθ)∣A⟩=⟨A∣A⟩. Similarly, the outcome will be λj with probability αj∗e^(−iθ)e^(iθ)αj=αj∗αj for both ∣A⟩ and ∣B⟩. Finally, we have ⟨L⟩=⟨B∣L∣B⟩=⟨A∣e^(−iθ)Le^(iθ)∣A⟩=⟨A∣L∣A⟩.
Now let's compute how the average of an observable changes with time:
d⟨L⟩/dt = d/dt ⟨Ψ(t)∣L∣Ψ(t)⟩ = ⟨Ψ̇(t)∣L∣Ψ(t)⟩ + ⟨Ψ(t)∣L∣Ψ̇(t)⟩ = (i/ℏ)⟨Ψ∣(HL−LH)∣Ψ⟩,
where the second line follows from Schrödinger. The term [L,M]=LM−ML is called the commutator and is usually not 0. One important fact is that the commutator bracket is skew symmetric: [L,M]=−[M,L]. Sometimes, we rewrite this whole thing more succinctly as
dL/dt = −(i/ℏ)[L,H].
We may notice that the commutator [L,H] is awfully similar to the Poisson bracket: it corresponds to iℏ{L,H}. If we plug this into the previous equation, we get
dL/dt = {L,H}.
The major difference in quantum physics is that FG and GF differ in general for two linear operators F,G, whereas the corresponding classical quantities always commute.
What does it mean when an observable (call it Q) is conserved? It means [Q,H]=0 (which implies [Qⁿ,H]=0). H is the definition of energy in quantum mechanics, and obviously [H,H]=0, which is an example of conservation of energy.
Spin has an energy depending on its orientation when placed in a magnetic field: let H∼σ⋅B=σxBx+σyBy+σzBz. For a simple example, take the magnetic field along the z-axis. Then H=(ℏω/2)σz for some constant ω.
The Pauli matrices verify that [σx,σy]=2iσz, [σy,σz]=2iσx, and [σz,σx]=2iσy. Thus, the average values evolve as
d⟨σx⟩/dt = −ω⟨σy⟩, d⟨σy⟩/dt = ω⟨σx⟩, d⟨σz⟩/dt = 0.
These are exactly the equations of a classical rotor in a magnetic field! In classical mechanics, it's the x and y components of the angular momentum that precess; in quantum mechanics, it's their expected values.
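We can watch the precession numerically by evolving a spin that starts along +x under H=(ℏω/2)σz (my own sketch, with ℏ=1):

```python
import numpy as np
from scipy.linalg import expm

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

omega = 2.0
H = 0.5 * omega * sz                                 # hbar = 1
psi0 = np.array([1, 1], dtype=complex) / np.sqrt(2)  # spin along +x

for t in np.linspace(0.0, np.pi / omega, 5):
    psi = expm(-1j * H * t) @ psi0
    ex, ey, ez = (np.vdot(psi, s @ psi).real for s in (sx, sy, sz))
    # expect <sx> = cos(omega t), <sy> = sin(omega t), <sz> = 0
    print(f"t={t:.2f}  <sx>={ex:+.3f}  <sy>={ey:+.3f}  <sz>={ez:+.3f}")
```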
Solving Schrödinger
The iconic Schrödinger equation is
iℏ ∂Ψ(x)/∂t = −(ℏ²/2m) ∂²Ψ(x)/∂x² + U(x)Ψ(x).
Earlier, we saw the time-dependent Schrödinger equation:
ℏ ∂∣Ψ⟩/∂t = −iH∣Ψ⟩.
There is also the time-independent Schrödinger equation:
H∣Ej⟩=Ej∣Ej⟩.
Because H is energy, the Ej are the energy eigenvalues and the ∣Ej⟩ are the energy eigenvectors. Suppose we know these. We can now solve the time-dependent analog by plugging in ∣Ψ(t)⟩=∑j αj(t)∣Ej⟩ to get αj(t)=αj(0)e^(−iEjt/ℏ), so
∣Ψ(t)⟩ = ∑j αj(0) e^(−iEjt/ℏ) ∣Ej⟩.
Using this, we can now predict probabilities: the probability for outcome λ is Pλ(t)=∣⟨λ∣Ψ(t)⟩∣2, and we can calculate the ket using Schrödinger.
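The recipe (expand in the energy eigenbasis, attach the phases e^(−iEjt/ℏ)) translates directly into code; a sketch with an arbitrary Hamiltonian of mine and ℏ=1:

```python
import numpy as np

rng = np.random.default_rng(5)
M = rng.standard_normal((4, 4))
H = (M + M.T) / 2  # an arbitrary Hermitian Hamiltonian

E, V = np.linalg.eigh(H)  # solves the time-independent Schrodinger equation

psi0 = rng.standard_normal(4) + 0j
psi0 /= np.linalg.norm(psi0)

def evolve(t):
    alpha0 = V.conj().T @ psi0                 # alpha_j(0) = <E_j|Psi(0)>
    return V @ (np.exp(-1j * E * t) * alpha0)  # attach phases, re-expand

psi_t = evolve(3.0)
print(np.linalg.norm(psi_t))  # 1.0 -- the evolution is unitary
# P_lambda(t) = |<lambda|Psi(t)>|^2 then follows for any observable.
```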
Wave Function Collapse
Let the state-vector be ∑j αj∣λj⟩ before the measurement of L. The apparatus measures λj with probability ∣αj∣², and then leaves the system in a single eigenstate of L, namely ∣λj⟩. This is because we need to consider the apparatus as part of our single quantum system; otherwise, the state vector might reduce to a single eigenstate due to interaction with the external world. This is known as wave function collapse, which we'll talk about later.
Lecture 5 - Uncertainty and Time Dependence
Simultaneous Eigenvectors
Consider a two-spin system. If we measure both spins, the system is in a state that is simultaneously an eigenvector of L and M - a simultaneous eigenvector. Assume that there is a basis of state vectors ∣λ,μ⟩ that are simultaneous eigenvectors, i.e.
L∣λ,μ⟩=λ∣λ,μ⟩, M∣λ,μ⟩=μ∣λ,μ⟩.
Using some algebra we can get
[L,M]∣λ,μ⟩=0.
So the condition for two observables to be simultaneously measurable is that they commute.
Wave Functions
Let ∣a,b,c,...⟩ be orthonormal basis vectors labeled by the eigenvalues of commuting observables A,B,C,.... We can rewrite any state vector as
∣Ψ⟩ = ∑_{a,b,c,...} ψ(a,b,c,...) ∣a,b,c,...⟩.
Then the wave function is
ψ(a,b,c,...)=⟨a,b,c,...∣Ψ⟩.
The probability for the commuting observables to have values a,b,c,... is
P(a,b,c,...)=ψ∗(a,b,c,...)ψ(a,b,c,...)
and we know that the total probability sums to 1:
∑_{a,b,c,...} ψ∗(a,b,c,...) ψ(a,b,c,...) = 1.
We use Ψ for state-vectors and ψ for wave functions.
Uncertainty
Another reason to care about the Pauli matrices is that every 2×2 Hermitian operator has the Pauli matrices and the identity matrix as a basis. Can we simultaneously measure any pair of spin components? When two observables do not commute, in general, it is impossible to precisely know everything about both. For example, since [σx,σy]=2iσz,[σy,σz]=2iσx, and [σz,σx]=2iσy, we cannot simultaneously measure two spin components.
In general, we cannot simultaneously measure two observables with perfect precision unless they commute. We must have some uncertainty, by which we mean the standard deviation. Define the deviation operator
Ā = A − ⟨A⟩.
The eigenvalues of Ā are ā = a − ⟨A⟩, and the square of the uncertainty is (ΔA)² = ∑_a ā² P(a) = ∑_a (a−⟨A⟩)² P(a) = ⟨Ψ∣Ā²∣Ψ⟩.
To bound uncertainties, we often need inequalities, most notably the triangle inequality (in the form ∣X∣∣Y∣≥∣X⋅Y∣) and the Cauchy-Schwarz inequality, which is sometimes written as ∣X∣²∣Y∣²≥∣⟨X∣Y⟩∣², but we will use the (equivalent) form 2∣X∣∣Y∣≥∣⟨X∣Y⟩+⟨Y∣X⟩∣.
Let ∣Ψ⟩ be a ket and A,B be observables. Define ∣X⟩=A∣Ψ⟩,∣Y⟩=iB∣Ψ⟩. Plugging this into Cauchy-Schwarz gives
2√⟨A²⟩√⟨B²⟩ ≥ ∣⟨Ψ∣[A,B]∣Ψ⟩∣ ⟹ ΔA ΔB ≥ (1/2)∣⟨Ψ∣[A,B]∣Ψ⟩∣
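A numeric spot check of this inequality with A=σx, B=σy on a random state (my own sketch, using the deviation operators from above):

```python
import numpy as np

rng = np.random.default_rng(6)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])

psi = rng.standard_normal(2) + 1j * rng.standard_normal(2)
psi /= np.linalg.norm(psi)

def uncertainty(op):
    mean = np.vdot(psi, op @ psi).real
    dev = op - mean * np.eye(2)  # deviation operator, e.g. A - <A>
    return np.sqrt(np.vdot(psi, dev @ dev @ psi).real)

comm = sx @ sy - sy @ sx  # [sigma_x, sigma_y] = 2i sigma_z
lhs = uncertainty(sx) * uncertainty(sy)
rhs = 0.5 * abs(np.vdot(psi, comm @ psi))
print(lhs, ">=", rhs, ":", bool(lhs >= rhs - 1e-12))
```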
Lecture 6 - Combining Systems: Entanglement
Tensor Products
Now, let there be two systems: system A with space of states SA and system B with space of states SB. The combined system is then SA⊗SB, with basis ∣ab⟩.
Suppose Charlie has two coins, one with σ=+1 and one with σ=−1, and gives one each to Alice and Bob. Then, Alice and Bob travel very far away from each other without looking at their coins. When Alice finally looks at her coin, she will immediately know what Bob's coin is. So ⟨σA⟩=⟨σB⟩=0 and ⟨σAσB⟩=−1. ⟨σAσB⟩−⟨σA⟩⟨σB⟩ is the statistical correlation, and since it is nonzero, Alice and Bob's observations are correlated.
Let's take the quantum version, with spins rather than coins. We can write any state in the combined system as ∣Ψ⟩=∑_{a,b} ψ(a,b)∣ab⟩. Let the components of Alice's spin be σx,σy,σz with her ket vectors notated ∣A}, and Bob's spin be τx,τy,τz with his ket vectors notated as usual. If Alice prepares her spin in state αU∣U}+αD∣D} and Bob prepares his in state βU∣U⟩+βD∣D⟩, the combined product state is then
αUβU∣UU⟩ + αUβD∣UD⟩ + αDβU∣DU⟩ + αDβD∣DD⟩.
Note that the tensor product is a vector space for studying combined systems; a product state is a state vector of the product space. Most state-vectors in the product space are not product states.
Tensor products work in matrices the same way we'd expect them to work.
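Concretely, the matrix form of the tensor product is the Kronecker product, np.kron; a sketch building Alice's operator σz⊗I, Bob's I⊗τz, and a product state:

```python
import numpy as np

sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

sigma_z = np.kron(sz, I2)  # acts on Alice's (first) factor
tau_z = np.kron(I2, sz)    # acts on Bob's (second) factor

UD = np.kron(up, down)     # the product state |UD>
print(sigma_z @ UD)        # +|UD>: Alice's spin is up
print(tau_z @ UD)          # -|UD>: Bob's spin is down
```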
Entanglement
The most general vector is
ψUU∣UU⟩+ψUD∣UD⟩+ψDU∣DU⟩+ψDD∣DD⟩
with normalization condition ψUU∗ψUU+ψUD∗ψUD+ψDU∗ψDU+ψDD∗ψDD=1; together with the physical irrelevance of the overall phase, this leaves 6 real parameters. This space of states is much more complicated than just the individual ones combined; this is due to entanglement.
Two examples of maximally entangled states are the singlet state ∣sing⟩=(1/√2)(∣UD⟩−∣DU⟩) and the triplet states (1/√2)(∣UD⟩+∣DU⟩), (1/√2)(∣UU⟩+∣DD⟩), and (1/√2)(∣UU⟩−∣DD⟩).
Inside a maximally entangled state: 1) the entangled state is a complete description of the combined system - nothing more can be known about it, and 2) nothing is known about the individual subsystems.
Recall the spin-polarization principle: it holds for all product states, but does not hold for ∣sing⟩. In fact, we can show that ⟨σx⟩=⟨σy⟩=⟨σz⟩=0: for example, σz∣sing⟩=(1/√2)(∣UD⟩+∣DU⟩), which is orthogonal to ∣sing⟩, so ⟨σz⟩=0, and similarly for the other components. Moreover, σz flips the sign of the ∣DU⟩ term while τz flips the sign of the ∣UD⟩ term, so τzσz∣sing⟩=−∣sing⟩, i.e. ∣sing⟩ is an eigenvector of τzσz with eigenvalue −1. We can check that this is also true when we replace z with x or y.
So we need an apparatus that measures σ⋅τ directly, instead of measuring one component at a time. How? Sometimes the Hamiltonian of neighboring spins is proportional to σ⋅τ, so we just need to measure the energy of the atomic pair.
Why are they called singlets and triplets? The singlet is an eigenvector of σ⋅τ with a unique eigenvalue (−3), while the three triplets are all eigenvectors sharing a single degenerate eigenvalue (+1).
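Verifying the eigenvalue claims numerically with np.kron (my own sketch):

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

up, down = np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)
UD, DU = np.kron(up, down), np.kron(down, up)
UU, DD = np.kron(up, up), np.kron(down, down)

sing = (UD - DU) / np.sqrt(2)
triplets = [(UD + DU) / np.sqrt(2), (UU + DD) / np.sqrt(2), (UU - DD) / np.sqrt(2)]

# tau_z sigma_z |sing> = -|sing>
tz_sz = np.kron(sz, I2) @ np.kron(I2, sz)
print(np.allclose(tz_sz @ sing, -sing))  # True

# sigma . tau = sum over components of (sigma_i tensor tau_i)
sigma_dot_tau = sum(np.kron(s, s) for s in (sx, sy, sz))
print(np.vdot(sing, sigma_dot_tau @ sing).real)                # -3 (singlet)
print([np.vdot(t, sigma_dot_tau @ t).real for t in triplets])  # [1, 1, 1]
```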
Lecture 7 - More On Entanglement
Outer Products and Density Matrices
We can also form the outer product ∣ψ⟩⟨ϕ∣, which is a linear operator that acts on a ket by (∣ψ⟩⟨ϕ∣)∣A⟩=∣ψ⟩⟨ϕ∣A⟩ and acts on a bra by ⟨B∣(∣ψ⟩⟨ϕ∣)=⟨B∣ψ⟩⟨ϕ∣. The case ∣ψ⟩⟨ψ∣ is called a projection operator, which is Hermitian and satisfies the following: ∣ψ⟩ is an eigenvector of its projection operator with eigenvalue 1; any vector orthogonal to ∣ψ⟩ is an eigenvector with eigenvalue 0; (∣ψ⟩⟨ψ∣)²=∣ψ⟩⟨ψ∣; the trace is 1; ∑i ∣i⟩⟨i∣=I for any orthonormal basis ∣i⟩; and lastly, ⟨L⟩=Tr(∣ψ⟩⟨ψ∣L).
The reason why we care is the density matrix: for a system that is in state ∣ψ⟩ or ∣ϕ⟩ with probability 1/2 each, define ρ=(1/2)∣ψ⟩⟨ψ∣+(1/2)∣ϕ⟩⟨ϕ∣, which is useful because ⟨L⟩=Tr(ρL). We can extend this to n states easily.
Suppose Alice knows a wave function Ψ(a,b), but wishes to extract as much knowledge about a as she can without caring about b. For an observable L, the expectation is
⟨L⟩ = ∑_{ab,a′b′} Ψ∗(a′b′) L_{a′b′,ab} Ψ(ab),
and we can package Alice's information into a matrix ρ_{aa′}=∑_b Ψ∗(a′b)Ψ(ab). This shows that we can remove Bob's information without removing any of Alice's information: taking L_{a′b′,ab}=L_{a′a}δ_{b′b} to "filter out" Bob's information gives
⟨L⟩ = ∑_{a,a′} L_{a′a} ρ_{aa′} = Tr(ρL).
To know Alice's density matrix, we must first know the entire wave function. After that, we can disregard the rest and still compute everything about Alice using just her density matrix.
Here are some properties of density matrices: they are Hermitian; Tr(ρ)=1; the eigenvalues lie between 0 and 1; for a pure state, ρ²=ρ and Tr(ρ²)=1; for a mixed or entangled state, ρ²≠ρ and Tr(ρ²)<1.
For a quick example: for the state vector ∣Ψ⟩=(1/√2)(∣UD⟩+∣DU⟩), Alice's density matrix is ρ = [[1/2, 0], [0, 1/2]].
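A sketch computing Alice's density matrix by tracing out Bob, then running the purity test from the list above:

```python
import numpy as np

up, down = np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)
psi = (np.kron(up, down) + np.kron(down, up)) / np.sqrt(2)  # entangled state

# Reshape the 4-vector into psi(a, b); then rho_{aa'} = sum_b psi(a,b) psi*(a',b).
psi_ab = psi.reshape(2, 2)
rho_alice = psi_ab @ psi_ab.conj().T

print(rho_alice.real)                        # [[0.5, 0. ], [0. , 0.5]]
print(np.trace(rho_alice).real)              # 1.0
print(np.trace(rho_alice @ rho_alice).real)  # 0.5 < 1 -> mixed, i.e. entangled
```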
Tests for Entanglement
The correlation test: there is no entanglement if and only if ⟨AB⟩−⟨A⟩⟨B⟩=0 for every observable A of Alice and B of Bob.
The density matrix test: if the composite Alice-Bob system is in a product state, then Alice's density matrix has exactly one eigenvalue equal to 1, and the others are 0. Additionally, the eigenvector with the nonzero eigenvalue is nothing but the wave function of Alice's half of the system.
One issue with quantum mechanics is that in an experiment, the apparatus does not "know" the spin state until Alice looks at it. But once she does, the wave function collapses. If we bring in Bob to consider Alice, the apparatus, and the spin as one system, once he looks at this system, their wave function collapses. And so on...
Quantum mechanics does not violate locality, which states that it is impossible to send a signal faster than the speed of light. Let Alice's density matrix be ρ_{aa′}=∑_b ψ∗(a′b)ψ(ab) and let U_{bb′} be the unitary matrix for whatever happens to Bob's half when he does his experiment. We get ψ_final(ab)=∑_{b′} U_{bb′}ψ(ab′) and ψ_final∗(a′b)=∑_{b″} ψ∗(a′b″)U†_{b″b}, but when we plug these in, we still get the original ρ_{aa′}=∑_b ψ∗(a′b)ψ(ab), because U is unitary and the U and U† factors contract to a Kronecker delta.
Bell's Theorem
Consider a video game that tries to fool you into thinking there is a quantum spin in a magnetic field inside the computer, and we get to experiment to test this. The computer stores two normalized complex numbers αU and αD. At the start of the game, the computer initializes these values, then solves the Schrödinger equation to update the α's like they were the components of the spin's state-vector. We are allowed to manipulate the angle of the apparatus.
What if we do this on two computers? As long as they are connected through a cable and the computers can send messages instantaneously, we are good. But disconnecting the cable destroys the simulation.
This is essentially Bell's theorem: classical computers need to be connected with an instantaneous cable to simulate entanglement. This is not a problem about quantum mechanics, but rather a problem with simulating quantum mechanics inside classical computers.
Lecture 8 - Particles and Waves
Interlude: Mathematics
When we consider functions as vectors, we form a Hilbert space. But we need to modify things in three ways: 1) integrals replace sums, so ⟨Ψ∣Φ⟩=∫_{−∞}^{∞} ψ∗(x)ϕ(x)dx; 2) probability densities replace probabilities, so P(a,b)=∫_a^b ψ∗(x)ψ(x)dx; and 3) Dirac delta functions replace Kronecker deltas: δ(x−x′) is defined by the property that for any function F(x),
∫−∞∞δ(x−x′)F(x′)dx′=F(x).
The delta function is not a genuine function; for large values of n, it is approximated better and better by δ(x) ≈ (n/√π) e^(−(nx)²).
In quantum mechanics, the limits of integration span the entire axis, and our wave functions must go to 0 at ±∞ to be normalizable, so we can rewrite integration by parts as:
∫_{−∞}^{∞} F (dG/dx) dx = −∫_{−∞}^{∞} (dF/dx) G dx.
Consider X and D, the multiply-by-x operator and the differentiation operator (Dψ = dψ/dx). We can show that X is Hermitian and D is anti-Hermitian (D†=−D). We can build a Hermitian operator from D by taking −iℏD, which satisfies
−iℏDψ(x) = −iℏ dψ(x)/dx.
Eigenstuff for Position and Momentum
For X, every real number x0 is an eigenvalue, and the corresponding eigenvectors are functions infinitely concentrated at x=x0, i.e. δ(x−x0). Additionally, we have ⟨x0∣Ψ⟩=∫_{−∞}^{∞} δ(x−x0)ψ(x)dx=ψ(x0), so ⟨x∣Ψ⟩=ψ(x), and we call ψ(x) the wave function in the position representation.
Define the momentum operator P=−iℏD. The eigenvectors are ψp(x)=(1/√(2π)) e^(ipx/ℏ) with eigenvalue p. Note that ⟨x∣p⟩=⟨p∣x⟩∗. We can now see that the wavelength of e^(ipx/ℏ) is λ=2πℏ/p, which is one reason we call it a wave function. Let's call ψ̃(p)=⟨p∣Ψ⟩ the wave function in the momentum representation.
The relationship between the two representations is given by the Fourier transforms
ψ(x) = (1/√(2π)) ∫ ψ̃(p) e^(ipx/ℏ) dp, ψ̃(p) = (1/√(2π)) ∫ ψ(x) e^(−ipx/ℏ) dx.
We have [X,P]=iℏ (which ⟹{x,p}=1 in the classical case). Specializing the general uncertainty principle from earlier to this case gives the famous Heisenberg's Uncertainty Principle:
ΔX ΔP ≥ ℏ/2
Lecture 9 - Particle Dynamics
Kinematics
How do particles move in quantum mechanics? Plugging H=cP into the Schrödinger equation gives
∂ψ(x,t)/∂t = −c ∂ψ(x,t)/∂x,
so any ψ(x−ct) is a solution. Since we want ∫_{−∞}^{∞} ψ∗(x)ψ(x)dx=1, ψ must look like a wave packet. All together: this particle can only move to the right, its energy can be either positive or negative, and it can only exist in states moving at this particular velocity. The classical description is that momentum is conserved and the position moves with fixed velocity c; the quantum analog is that the whole probability distribution and the expected value move with velocity c.
If we instead use the nonrelativistic Hamiltonian H=(1/2)mv²=p²/2m, messing around with some algebra gives us the traditional Schrödinger equation for an ordinary nonrelativistic free particle:
iℏ ∂ψ/∂t = −(ℏ²/2m) ∂²ψ/∂x².
To solve the time-dependent Schrödinger equation, we need to solve the time-independent version first. The function ψ(x)=e^(ipx/ℏ) solves the time-independent version, and we can use this to solve the time-dependent version:
ψ(x,t) = ∫ ψ̃(p) exp[(i/ℏ)(px − (p²/2m)t)] dp.
Let's look at the quantum analog of v=p/m. We have
v = (d/dt) ∫ ψ∗(x,t) x ψ(x,t) dx.
From lecture 4, we have
v = (i/2mℏ)⟨[P²,X]⟩ = (i/2mℏ)⟨P[P,X]+[P,X]P⟩
and using [P,X]=−iℏ gives ⟨P⟩=mv.
As we saw, we can take a classical system, replace the classical phase space with a linear vector space, replace x with X and p with P, and use the Hamiltonian to solve the time-dependent equation (how the wave function changes with time) or the time-independent equation (to find the eigenvectors and eigenvalues of the Hamiltonian). This process is known as quantization.
Forces
Classically,
F(x) = m d²x/dt² = −∂V/∂x.
Quantization tells us to take H=P²/2m+V(x) and modify the Schrödinger equation to
iℏ ∂ψ/∂t = −(ℏ²/2m) ∂²ψ/∂x² + V(x)ψ.
We can show that [X,V(x)]=0 and
d⟨P⟩/dt = (i/2mℏ)⟨[P²,P]⟩ + (i/ℏ)⟨[V,P]⟩.
Using [P²,P]=0 and [V(x),P]=iℏ dV/dx, we get d⟨P⟩/dt = −⟨dV/dx⟩.
This shows that the classical equations are only approximations, good when we can replace the average of dV/dx with dV/dx evaluated at the average x. We can do this when V(x) varies slowly compared to the size of the wave packets.
For an example of a "bad" potential, consider a bunch of large, closely packed spikes of size δx with δx<Δx. The Heisenberg Uncertainty Principle tells us Δx∼ℏ/(mΔv), which shows that large masses and smooth potentials behave more classically, while particles with low mass moving through an abrupt potential behave like a quantum mechanical system. When the equality in Heisenberg's Uncertainty Principle holds, we get the Gaussian wave packets, which we'll discuss later.
Path Integrals
Classically, the action is A=∫_{t1}^{t2} L(x,ẋ)dt=∫_{t1}^{t2} ((1/2)mẋ²−V(x))dt. The quantum analog question is: given that a particle starts at (x1,t1), what is the amplitude C_{1,2} that it shows up at (x2,t2)? The answer is
C_{1,2} = ∑_{paths} e^(iA/ℏ),
summed over all paths connecting the two points, which is the extremely powerful path integral formulation by Feynman.
Lecture 10 - Harmonic Oscillator
Classical vs Quantum
Classically, L=(1/2)mẏ²−(1/2)ky²=(1/2)ẋ²−(1/2)ω²x², where x=√m y and ω=√(k/m). The Euler-Lagrange equation ∂L/∂x=(d/dt)∂L/∂ẋ gives x=A cos(ωt)+B sin(ωt).
We have p=∂L/∂ẋ=ẋ, so H=pẋ−L=(1/2)ẋ²+(1/2)ω²x²=(1/2)p²+(1/2)ω²x². For the quantum mechanical version, we have Hψ(x)=−(ℏ²/2)∂²ψ(x)/∂x²+(1/2)ω²x²ψ(x). Plugging this into the time-dependent Schrödinger equation gives
i ∂ψ/∂t = −(ℏ/2) ∂²ψ/∂x² + (1/2ℏ) ω²x²ψ.
Energy Levels
We can also calculate the energy levels by solving the time-independent Schrödinger:
−(ℏ²/2) ∂²ψE(x)/∂x² + (1/2)ω²x²ψE(x) = E ψE(x).
But there are a bunch of solutions that make no sense physically, so we need to impose conditions such as: physical solutions of the Schrödinger equation must be normalizable.
The lowest energy level is the ground state ψ0(x). To identify it, there is a theorem (left unproved): the ground-state wave function for any potential has no zeros, and it's the only energy eigenstate that has no nodes. With a huge amount of algebra, plugging in the guess ψ0(x)=e^(−ωx²/2ℏ) reduces the Schrödinger equation to
(ℏω/2) e^(−ωx²/2ℏ) = E e^(−ωx²/2ℏ).
So E0=ℏω/2 and ψ0(x)=e^(−ωx²/2ℏ).
Creation and Annihilation
Take H=(1/2)(P²+ω²X²)=(1/2)(P+iωX)(P−iωX)+ℏω/2. Consider the lowering operator a⁻=(i/√(2ωℏ))(P−iωX) and the raising operator a⁺=(−i/√(2ωℏ))(P+iωX). We can now rewrite H=ℏω(a⁺a⁻+1/2). Additionally, we have [a⁻,a⁺]=1, and defining N=a⁺a⁻ gives H=ℏω(N+1/2), as well as the relations [a⁻,N]=a⁻ and [a⁺,N]=−a⁺. Since a⁺∣n⟩∝∣n+1⟩ and a⁻∣n⟩∝∣n−1⟩, we can show that En=ℏω(n+1/2).
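The ladder algebra is easy to see in a truncated matrix representation, where a⁻ has √n just above the diagonal (a standard representation, not from the lectures; truncating at a finite cutoff spoils the very last entry, as the code notes):

```python
import numpy as np

cutoff = 8
a_minus = np.diag(np.sqrt(np.arange(1, cutoff)), k=1)  # a-|n> = sqrt(n)|n-1>
a_plus = a_minus.T                                     # a+|n> = sqrt(n+1)|n+1>

N = a_plus @ a_minus          # the number operator
H = N + 0.5 * np.eye(cutoff)  # H = hbar*omega*(N + 1/2), with hbar = omega = 1

print(np.diag(H))             # [0.5 1.5 2.5 ...] = E_n = n + 1/2

# [a-, a+] = 1 except in the last slot (a truncation artifact).
comm = a_minus @ a_plus - a_plus @ a_minus
print(np.diag(comm))          # [1, 1, ..., 1, 1 - cutoff]
```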
Wave Functions, Again
We have (i/√(2ωℏ))(P−iωX)ψ0(x)=0 ⟹ dψ0/dx=−(ω/ℏ)xψ0(x), which has solution e^(−ωx²/2ℏ). Applying the raising operator gives ψ1(x)=2iωxψ0(x), and we can do this as many times as we want to get ψn(x); the polynomials appearing in this sequence are called the Hermite polynomials.
These eigenfunctions also exhibit quantum tunneling: they approach zero asymptotically but never reach it, so there is a nonzero probability of finding the particle "outside the bowl" defined by its potential energy function.