Quantum mechanics is not that hard
Quantum mechanics (QM) has this reputation, both with the general public and with physicists, of being a hard topic which no one understands. I disagree with this view and in this post, I will attempt to paint a clear picture of what exactly makes quantum mechanics so different from its classical sibling and why it arises (using experimental considerations and mathematical intuition).
Let me note now that this won’t be an exhaustive review of all things quantum. First of all because that would be more than a century of information, and this is a blog post, not a review paper. Second of all because, while I have worked extensively with quantum mechanics during my doctorate, I only explored a small corner of “frontier physics”. There are a lot of aspects of quantum mechanics of which I know very little (for example quantum optics). Instead, this post’s ambition is simply to dispel misconceptions about the field and maybe show that its reputation as a mysterious topic is a bit overblown.
Are you saying quantum mechanics isn’t actually hard and mysterious?
Well, no. The thing is that QM is rather unintuitive, it directly contradicts our human experience, and that is what makes it a challenging discipline. Instead of relying on our senses and intuition, physicists need to trust the mathematics of it, which have been at this point thoroughly probed. And yes, it is mysterious, some parts of it at least. What I disagree with is the idea that “no one understands it”, as if all the discoveries and progress built on top of it were just haphazard and random. That is untrue. However, there are parts of the field which remain a mystery (in the business, we call this an “active research area”), such as the question of collapse of the wavefunction. Some other parts are also more philosophical in nature and have excited the collective imagination of the public. For example, the possible interpretations of QM (Copenhagen, many-worlds, etc.) on which physicists don’t generally agree are a large contributor to the field’s reputation. But let me clarify this immediately: these have generally no bearing on what QM tells you, but instead are a mental picture for why things happen. And physics is generally not in the business of answering why questions. So feel free to choose whichever interpretation you desire, it generally won’t matter that much. 1
Now let’s dive into the topic! Before I can really speak about the quantum aspect of quantum mechanics, I need to give you a reasonable mathematical introduction to the mechanics part, so let’s start with that.
Classical mechanics
Classical mechanics is the field of physics interested in motion. The question of “how things move” has fascinated mankind for millennia, from the celestial dance of stars in the sky to the inevitable fall of apples from their tree. The classical part of classical mechanics refers to the absence of either relativity (a topic for another post) or quantum effects. Classical systems are usually well described by equations like Newton’s law of dynamics $F = ma$ (where $m$ is the mass, $a$ the acceleration and $F$ the combined forces applied on the system). Building on this work, Lagrange and Hamilton introduced novel descriptions of a system’s dynamics rooted in geometry and based on the energy of the system, rather than on forces. Assume $q$ to be the positional coordinates of the system ($\dot{q}$ is then the velocity and $\ddot{q}$ is the acceleration), then the Lagrangian $L(q, \dot{q})$ is an energy of the system2 built from the kinetic energy $T$ and potential energy $V$: $L = T - V$. 3 What is special about this combination is that it encodes all information about the dynamics inside the Euler-Lagrange (EL) equations:

$$\frac{d}{dt}\frac{\partial L}{\partial \dot{q}} = \frac{\partial L}{\partial q}$$
There is a reason why I’m taking you through this theoretical path. Classical physics has naturally gone through multiple versions over the centuries: those are the result of different philosophies, thought currents, and mathematical tools spread throughout various countries which all contributed to it. These contributions are what makes mechanics such a rich field: it can handle ballistics, elastic deformations of solids, flows of fluids and gases, etc. But they also make it somewhat disparate. The construction pioneered by Lagrange and later Hamilton sets a stronger mathematical foundation which is still used throughout all of physics, and the unified picture they brought will be the stepping stone towards quantum mechanics.4
But before we can get there, we need to introduce one more concept: phase space. So far, the Lagrangian is merely a function of both the positions $q$ and the velocities $\dot{q}$. However, an alternate formulation can be built by introducing the generalized momenta $p = \frac{\partial L}{\partial \dot{q}}$. If you remember from your high school physics (and if you don’t, it’s also okay), momentum is usually defined as $p = mv$ (in our notation, you are probably more familiar with $p = m\dot{q}$). Conceptually, momentum is really the quantity that is affected by the forces applied to a system in Newton’s law $F = \dot{p}$, and the same happens in the EL equations $\dot{p} = \frac{\partial L}{\partial q}$. But since most examples usually have a constant mass, momentum and velocity are often used interchangeably in the equation. Momentum is also very often used to rewrite the kinetic energy as $T = \frac{p^2}{2m}$. Anticipating a little bit, the reason momentum is so important is that light (and thus all electromagnetic radiation) carries momentum, even though it has no mass! So naively its kinetic energy would be zero, which is clearly not the case. We will see later how this is resolved in quantum mechanics.
Armed with generalized momenta and positions, we can now define the Hamiltonian $H(q, p)$. It is basically equivalent to the Lagrangian: an energy function of coordinates which encodes all dynamics of the system. The main difference is that it depends on positions and momenta instead of positions and velocities, and the trick to go from one to the other is a Legendre transform: $H = p\dot{q} - L$. 5 Unlike the Lagrangian, the Hamiltonian is physically meaningful as it generally corresponds to the total energy of the system (this is true for most systems, but not all). The equations of motion in this formalism are Hamilton’s equations:

$$\dot{q} = \frac{\partial H}{\partial p}, \qquad \dot{p} = -\frac{\partial H}{\partial q}$$
Great, we now have a third way to describe classical mechanics, how useful! I know it sounds overly complex and redundant, but the Hamiltonian formalism gifts us more than just a handy way to compute equations of motion. It also introduces the idea of phase space, the space of all possible states of a system. Usually, when physicists say “state” of the system, it’s not always clear what they mean. Is it the parameters (like the mass of our ballistic object)? Is it a snapshot of the object at a given time? Is it the initial conditions (where was it let go? With what speed?)? Well, we can now actually make this precise. The state of the system is a point in phase space, defined by specific values of the positions and momenta $(q, p)$. For example, in our ballistic example, a state is defined by the four numbers $(x, y, p_x, p_y)$. Hamilton’s equations simply describe how this point moves around in phase space as time goes on. This powerful idea gives geometric meaning to motion: phase space is a special type of space called a symplectic manifold, and Hamiltonian dynamics are flows on this manifold. For more on this, I recommend this excellent essay: the general idea of a symplectic structure is to formally define a space where each point has coordinates for both position and momentum, and volumes in this space are preserved under Hamiltonian flow.
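To make Hamilton’s equations a bit more concrete, here is a minimal numerical sketch (all function names are mine, for illustration only) that follows a phase-space point of a 1D harmonic oscillator along the Hamiltonian flow. I use a semi-implicit (“symplectic”) Euler step, a standard choice for Hamiltonian systems because, unlike plain Euler, it preserves phase-space volume:

```python
import math

# Hamiltonian flow in phase space for the 1D harmonic oscillator,
# H(q, p) = p^2/(2m) + k*q^2/2.
# Hamilton's equations: dq/dt = dH/dp = p/m, dp/dt = -dH/dq = -k*q.

def hamiltonian(q, p, m=1.0, k=1.0):
    """Total energy of the oscillator at the phase-space point (q, p)."""
    return p**2 / (2 * m) + k * q**2 / 2

def flow(q, p, dt=1e-3, steps=10_000, m=1.0, k=1.0):
    """Advance (q, p) along the Hamiltonian flow with symplectic Euler."""
    for _ in range(steps):
        p -= k * q * dt     # dp/dt = -dH/dq
        q += (p / m) * dt   # dq/dt = +dH/dp (using the updated p)
    return q, p

q0, p0 = 1.0, 0.0
q1, p1 = flow(q0, p0)
# The point moves around phase space, but its energy is (nearly) conserved:
print(hamiltonian(q0, p0), hamiltonian(q1, p1))
```

The state is literally just the pair `(q, p)`, and the dynamics is nothing but this pair flowing through phase space, exactly as described above.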
Enter quantum mechanics
I hope you still have appetite after this lengthy introduction to classical mechanics, because now we can finally enter the quantum world! For now, we know that classical systems can be described by positions and momenta, the possible values of which define the phase space of the system. The evolution of the system from one state to another is given by Hamilton’s equations and represented as a flow in phase space.
At the end of the 19th century, Lord Kelvin famously declared that “there is nothing new to be discovered in physics now”, as classical physics had finally explained all known phenomena. However, some experiments in the early 20th century quickly proved him wrong: blackbody radiation, 7 the photoelectric effect, 8 etc. I will not go over the experiments in detail here (beyond the short summaries in footnotes), I just don’t think I would do them justice. Moreover, I find quantum mechanics to be less confusing when approached mathematically rather than experimentally (I know it might just be me, sorry!).
The main step between classical physics and quantum mechanics, mathematically speaking, is the promotion of phase space to a complex vector space called a Hilbert space. Instead of describing the state of a system as a point in phase space defined by positions and momenta, quantum mechanics describes the state of a system as a vector in this Hilbert space, called the wavefunction. 9
What about positions and momenta? Instead of being the fundamental degrees of freedom of the system, they are now operators (or matrices if you prefer), which act on the state and represent observable measurements (also called observables). That’s pretty much the gist of it! A lot of the strangeness of quantum mechanics comes from this fundamental change of perspective. Instead of thinking about where a particle is and how fast it’s going, you think about a vector in an abstract space, and use operators to extract information about positions and momenta.
Now let’s see how from this new perspective, we can recover the most common weird phenomena associated with quantum mechanics.
You can never know both the position and momentum of a particle exactly.
This is one of the things that tend to confuse people about quantum mechanics. If I throw a tennis ball and record it with my camera, I can show you frame by frame exactly where it is (and how fast it’s going). So how come quantum mechanics says I can’t know both position and momentum exactly?
In linear algebra, there is this concept of eigenvectors and eigenvalues. An eigenvector of an operator $A$ is a vector $v$ such that when you apply the operator to it, you just get back the same vector multiplied by a number $\lambda$ (the eigenvalue): $Av = \lambda v$. In quantum mechanics, measuring an observable associated to an operator will always yield one of its eigenvalues, and the state will collapse to the corresponding eigenvector. So measuring a state $|\psi\rangle$ with an operator $A$ (whose eigenvalues are $a_i$, with eigenvectors $|a_i\rangle$) will yield the value $a_i$ and transform $|\psi\rangle$ to $|a_i\rangle$ with probability $|\langle a_i | \psi \rangle|^2$. This is the mathematical basis of measurement in quantum mechanics.
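Here is a toy sketch of this measurement rule for a single qubit (the helper names are mine, not from any library). I measure the Pauli-X observable on the state $|0\rangle$: the eigenvectors of X are $|+\rangle$ and $|-\rangle$ with eigenvalues $\pm 1$, and the Born rule gives each outcome probability $|\langle a_i|\psi\rangle|^2$:

```python
import math

def inner(u, v):
    """Complex inner product <u|v> (conjugate the first argument)."""
    return sum(a.conjugate() * b for a, b in zip(u, v))

state = [1 + 0j, 0j]                       # |0>
s = 1 / math.sqrt(2)
eigenvectors = {+1: [s, s], -1: [s, -s]}   # |+> and |-> for Pauli-X

# Born rule: probability of each eigenvalue is |<eigvec|state>|^2
probs = {val: abs(inner(vec, state)) ** 2 for val, vec in eigenvectors.items()}
print(probs)  # each outcome has probability 1/2
```

After the measurement, the state would collapse to whichever of `[s, s]` or `[s, -s]` was observed.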
Another nice fact in linear algebra is that two operators $A$ and $B$ can either commute or not commute. Commuting means that the order in which you apply them does not matter: $AB = BA$. Non-commuting means the opposite: $AB \neq BA$. Commuting operators share a common set of eigenvectors, while non-commuting operators do not. Unsurprisingly, the position and momentum operators along the same dimension do not commute. This non-commutativity leads to the famous Heisenberg uncertainty principle, which states that there is a fundamental limit to how precisely you can know both position and momentum simultaneously. This is why in quantum mechanics, we are usually very interested in the commutator of two operators, defined as $[A, B] = AB - BA$. What you don’t know yet is that this structure is very related to our classical phase space.
I mentioned before that phase space is a symplectic manifold, which means it has a special structure defined by the Poisson bracket. The Poisson bracket of two functions $f$ and $g$ in phase space is defined as

$$\{f, g\} = \frac{\partial f}{\partial q}\frac{\partial g}{\partial p} - \frac{\partial f}{\partial p}\frac{\partial g}{\partial q}$$
Why is this important? The Poisson bracket is the reason why phase space points evolve via the equations of motion instead of just doing whatever they want. You can think of it as the one thing that connects the geometry of that space to actual physical behavior. But when we moved to quantum mechanics, replacing the phase space with a Hilbert space, we lost that structure. Luckily another one took its place: the commutator of operators. Whereas before, $\{q_i, p_j\} = \delta_{ij}$ (by definition since these are canonical coordinates), now $[\hat{q}_i, \hat{p}_j] = i\hbar\,\delta_{ij}$. As you can see, the commutator follows the same structure as the Poisson bracket: it is zero when the position and momentum directions don’t match, and equal to the same constant otherwise. The new constant is just a consequence of the scale of quantum mechanics, called the reduced Planck constant $\hbar \approx 1.05 \times 10^{-34}\,\mathrm{J\,s}$. This tiny number is the reason why quantum effects are generally not noticeable at macroscopic scales: the commutator is effectively zero for large systems, and so classical mechanics is a very good approximation of reality at our scale.
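A quick numerical check that operator order matters. I use the Pauli matrices X and Z as stand-ins for non-commuting observables (the position/momentum commutator itself requires an infinite-dimensional space, so it can’t be demonstrated with finite matrices):

```python
# Commutator of two 2x2 matrices, written with plain Python lists.

def matmul(A, B):
    """Multiply two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(A, B):
    """[A, B] = AB - BA."""
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(2)] for i in range(2)]

X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

print(commutator(X, Z))  # non-zero: X and Z do not commute
print(commutator(X, X))  # any operator commutes with itself
```

A non-zero commutator is exactly the algebraic signature of an uncertainty relation between the two observables.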
So there you have it: the position and momentum operators (along the same dimension) do not commute, which means that measuring one affects the other. This is the mathematical origin of the uncertainty principle.
Particles can be in two places at once!
This one is closely related to the previous point. The issue here is that “being in two places at once” implies that position is a definite, well-defined quantity. But I just argued that measuring position is actually an operation which affects the state itself. So if you measure position, you get back one of the eigenvalues of the position operator (which correspond to definite positions in space), and the state collapses to the corresponding eigenvector (which represents the particle being at that position). But before you measure it, the state can be anything, and that includes a superposition (fancy way of saying a linear combination) of multiple eigenvectors. So when you apply the position operator, you can get any of the eigenvalues $x$, with probability given by $|\langle x | \psi \rangle|^2$ (where $|x\rangle$ is the eigenvector corresponding to position $x$). This is the mathematical origin of the idea that particles can be in multiple places at once: before measurement, the state can be a superposition of multiple position eigenvectors. And the same thing holds for momentum, energy, and really any observable you can think of.
Particles can go through walls!
That’s always a cool one. If I throw a tennis ball at a wall, it will bounce back. That’s because any state in phase space where the tennis ball actually goes through the wall is inaccessible: the potential energy of such a state is higher than the total energy of the system. If you think of energy as a “currency of movement”, the system can’t afford entry. But in quantum mechanics, things are, as usual, different. And that’s because the system is not at a specific point in space anymore, it’s a state which can be spread over many, many eigenvectors of position. In particular, the state may have some support (non-zero coefficient) in a region past the wall. So if I measure position, I can find the system there with a small probability. Of course, we haven’t broken energy conservation here! Wherever the potential energy barrier is higher than the total energy of the system, we are still unlikely to find the particle. The reason for this is that the wavefunction feels the barrier, but instead of being “forbidden entry”, it just has an exponentially decaying probability of being there. If the barrier is thin enough, that exponential decay may not be strong enough to completely prevent access to the other side, and so there is a small but non-zero probability of finding the particle on the other side of the barrier. This is the mathematical origin of quantum tunneling.
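To put rough numbers on this exponential suppression, here is a sketch (an order-of-magnitude WKB-style estimate, not an exact formula) for an electron tunneling through a rectangular barrier of height $V_0$ and width $L$: inside the barrier the wavefunction decays like $e^{-\kappa x}$ with $\kappa = \sqrt{2m(V_0 - E)}/\hbar$, so the transmission probability scales roughly as $e^{-2\kappa L}$:

```python
import math

HBAR = 1.054_571_817e-34   # reduced Planck constant, J*s
M_E = 9.109_383_7e-31      # electron mass, kg
EV = 1.602_176_634e-19     # 1 electronvolt in joules

def transmission_estimate(E_eV, V0_eV, width_m, mass=M_E):
    """Order-of-magnitude tunneling probability through a square barrier."""
    kappa = math.sqrt(2 * mass * (V0_eV - E_eV) * EV) / HBAR
    return math.exp(-2 * kappa * width_m)

# An electron with 1 eV of energy hitting a 5 eV barrier:
thin = transmission_estimate(1.0, 5.0, 0.1e-9)   # 0.1 nm barrier
thick = transmission_estimate(1.0, 5.0, 1.0e-9)  # 1 nm barrier
print(thin, thick)  # the thicker barrier suppresses tunneling dramatically
```

Going from a 0.1 nm to a 1 nm barrier drops the probability by many orders of magnitude, which is why tunneling is an atomic-scale phenomenon and tennis balls stay on our side of the wall.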
Particles can have spin!
Spin is an interesting concept because it has no classical equivalent. The most similar thing would be angular momentum: when you spin a spinning top 10, it stays upright without falling as long as it has enough rotation left. That’s angular momentum! But do not get fooled by the name “spin”: particles can also have angular momentum if they are rotating, and that is different from spin. Spin is an intrinsic property of particles, like mass or charge.
Mathematically, spin comes from the fact that quantum states live in a Hilbert space which carries a representation of the rotation group. In simpler terms, the vectors in Hilbert space (the states) can be rotated, and these rotations correspond to physical symmetries of the system. When you rotate your experimental setup, the state vectors should also “rotate” in some sense inside the Hilbert space. These rotations are described by special operators called spin operators. The eigenvalues of these operators correspond to the possible spin values of the particle. There are two types of spin: integer spin (0, 1, 2, …) and half-integer spin (1/2, 3/2, …). Particles with integer spin are called bosons, while particles with half-integer spin are called fermions. Electrons, protons, and neutrons are all fermions with spin 1/2, while photons (particles of light) are bosons with spin 1.
Spin plays a very important role in physics: multiple bosons can occupy the same quantum state (which is what happens in lasers with photons), while fermions obey the Pauli exclusion principle, which states that no two fermions can occupy the same quantum state. This is the reason why matter is stable and has structure: electrons in atoms fill up different energy levels because they cannot all be in the lowest one. This fact alone explains a large portion of the periodic table. If electrons were bosons, all electrons would collapse to the lowest energy state, and atoms as we know them would not exist. The connection between spin and statistics (bosons vs fermions) is a deep result in quantum field theory called the spin-statistics theorem and is beyond the scope of this post.
Quantum systems are both particles and waves!
Well that’s not a misconception, it’s just true. It’s even in the name of the state vector: the wavefunction. One of the main wave-like behaviors of quantum particles is interference. You probably own a pair of noise-cancelling headphones; what they do is the classical wave version of destructive interference: they record the ambient noise (a wave) and play back an inverted copy of it. The two waves cancel each other, and you get peace and quiet. Well, since quantum particles are also waves, they can interfere with each other, causing regions of high and low probability. Even better, they can interfere with themselves! The most famous example of this is the double-slit experiment, where particles (like electrons) are fired at a screen with two slits. Instead of just going through one slit or the other, the wavefunction goes through both slits and interferes with itself on the other side, creating a pattern of high and low probability on a detector screen. This wave-like behavior is a fundamental aspect of quantum mechanics, and it arises naturally from the mathematical structure of the wavefunction.
From the linear algebra description, this can also be seen quite naturally. The wavefunction can be expressed as a linear combination of basis vectors (eigenvectors of some operator). When you have multiple paths (like in the double-slit experiment), the wavefunction can be expressed as a sum of contributions from each path. The interference arises from the fact that when you compute probabilities, you take the modulus squared of the wavefunction, which includes cross-terms between the different paths. These cross-terms can lead to constructive or destructive interference, depending on the relative phases of the contributions.
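The cross-term story above fits in a few lines of code. Here is a sketch of two equal-amplitude paths whose relative phase we vary (the function name is mine, and the result is an unnormalized relative intensity at one point of the screen, not a full diffraction pattern):

```python
import cmath
import math

def two_path_probability(phase):
    """Relative intensity with two equal-amplitude paths differing by `phase`."""
    a1 = 1 / math.sqrt(2)                      # amplitude via slit 1
    a2 = cmath.exp(1j * phase) / math.sqrt(2)  # amplitude via slit 2
    # Modulus squared of the SUM of amplitudes: includes the cross-term
    # 2*Re(a1* conj(a2)), which classical probabilities would lack.
    return abs(a1 + a2) ** 2

print(two_path_probability(0.0))      # constructive interference
print(two_path_probability(math.pi))  # destructive interference
# Classically (no cross-term) we would always get |a1|^2 + |a2|^2 = 1.
```

Sweeping the phase between 0 and $\pi$ traces out exactly the bright and dark fringes of the double-slit pattern.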
The second wave-like part of the description is that quantum particles obey the Schrödinger equation, which is a wave equation. The time-dependent Schrödinger equation is given by

$$i\hbar \frac{\partial}{\partial t} |\psi\rangle = \hat{H} |\psi\rangle$$

where $\hat{H}$ is the Hamiltonian operator of the system. This is reminiscent of both the classical wave equation11 and the Hamiltonian formalism we saw earlier. In fact, the Schrödinger equation can be seen as the quantum analogue of Hamilton’s equations, where the wavefunction evolves in time according to the Hamiltonian operator.
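As a toy illustration of Schrödinger evolution, here is a minimal sketch for a two-level system with a diagonal Hamiltonian $H = \mathrm{diag}(E_0, E_1)$, with $\hbar$ set to 1 for simplicity (the function name is mine). For a diagonal Hamiltonian the solution is simply a phase $e^{-iE_n t}$ on each component:

```python
import cmath
import math

def evolve(psi0, energies, t):
    """Schrodinger evolution for a diagonal Hamiltonian (hbar = 1)."""
    return [cmath.exp(-1j * E * t) * c for E, c in zip(energies, psi0)]

s = 1 / math.sqrt(2)
psi0 = [s, s]                            # equal superposition of the two levels
psi_t = evolve(psi0, [0.0, 1.0], t=math.pi)

norm = sum(abs(c) ** 2 for c in psi_t)
print(norm)   # evolution is unitary: the norm stays 1
print(psi_t)  # but the relative phase between the components has changed
```

The probabilities of each energy level never change (they are eigenstates of $\hat{H}$), yet the relative phase does, and that phase is exactly what drives interference effects.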
Particles can be entangled!
I want to close with this one because entanglement is, in my opinion, the very weirdest part of quantum mechanics, and it trips up even physicists sometimes. Entanglement happens when you have multiple particles, and the state of the whole system cannot be expressed as a simple product of the states of each individual particle. In a sense, I like to think about it in the same way I do in probability theory: two random variables are independent if knowing the value of one does not give you any information about the other. If they are not independent, they are correlated. In quantum mechanics, the equivalent of this would be a statement on measurement probabilities: $P(a, b) \neq P(a)\,P(b)$. But entanglement is even stronger: it is a statement on the state itself, not just on measurement outcomes. Two particles are entangled if their joint state cannot be written as a product of individual states: $|\psi\rangle_{12} \neq |\phi\rangle_1 \otimes |\chi\rangle_2$. For example, if you have two electrons, a non-entangled state may just be any of

$$|{\uparrow}\rangle_1 \otimes |{\uparrow}\rangle_2, \qquad |{\uparrow}\rangle_1 \otimes |{\downarrow}\rangle_2, \qquad |{\downarrow}\rangle_1 \otimes |{\downarrow}\rangle_2$$

However the state

$$\frac{1}{\sqrt{2}}\left(|{\uparrow}\rangle_1 \otimes |{\downarrow}\rangle_2 - |{\downarrow}\rangle_1 \otimes |{\uparrow}\rangle_2\right)$$

cannot be written as a product of individual electron states, so it is entangled! And from this example we see what is so special about entanglement: in such a state, measuring the left spin immediately gives us information on the right spin!
When I led you to quantum mechanics, I mentioned that the state of a system is a vector in a Hilbert space. When you have multiple particles, the Hilbert space of the combined system is the tensor product of the individual Hilbert spaces. But because this combined space is also linear, it contains not only product states, but also linear combinations of product states, which are entangled states. This is the mathematical origin of entanglement: the structure of the combined Hilbert space allows for states that cannot be decomposed into individual particle states.
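For two qubits, there is a neat concrete test of this (a sketch of the 2×2 case of the Schmidt decomposition; the function name is mine): write the four amplitudes $c_{ij}$ of $|\psi\rangle = \sum_{ij} c_{ij}\,|i\rangle \otimes |j\rangle$ as a 2×2 matrix. The state is a product state exactly when that matrix has rank 1, i.e. when its determinant vanishes:

```python
import math

def is_product_state(c, tol=1e-12):
    """True if the 2x2 amplitude matrix c describes a product (non-entangled) state."""
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return abs(det) < tol

s = 1 / math.sqrt(2)
product = [[s, s], [0, 0]]    # |0> tensor |+> : a product state
bell = [[s, 0], [0, s]]       # (|00> + |11>)/sqrt(2) : entangled

print(is_product_state(product))  # True
print(is_product_state(bell))     # False
```

This makes the “linear combinations of product states” point tangible: `bell` is a perfectly valid vector in the combined space, yet no factorization into two single-qubit states exists for it.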
Conclusion
There you have it! This post was a rather quick introduction to quantum mechanics. I chose to focus on the mathematical structure of the theory, highlighting its roots in linear algebra, because I find that mathematics is often the best way to see through the fog of confusion. Quantum mechanics is one of the fields of physics we experience the least in our daily lives, yet so much of the world would be different without it. Beyond our technology and modern creature comforts (computers, smartphones, our electrical grid, nuclear energy, etc.), the very existence of our universe, our sun, our planet, and life itself often depends on quantum effects. Yet at its root, it is not that complicated. A lot of its strangeness stems from its mathematical structure, which is both so similar to and yet so different from that of classical mechanics (the one we are most familiar with). I hope this post has helped you see through some of the mystery surrounding quantum mechanics, and maybe even piqued your interest to learn more about it!
Further reading
- Introduction to Quantum Mechanics by David J. Griffiths: a classic reference textbook for learning quantum mechanics from the ground up.
- The Feynman Lectures on Physics, Volume III by Richard P. Feynman: Feynman is known for being a wonderful educator and for being able to give intuition to complex topics. This volume focuses on quantum mechanics and provides deep insights into the subject.
- David Tong’s lecture notes on quantum mechanics: no list of physics resources would ever be complete without mentioning David Tong’s lecture notes. I learnt so much from them as a student; they are freely available online, and there is a set for almost every topic of theoretical physics you can think of! I highly recommend checking them out.
Appendix: a primer on linear algebra
If you are not familiar with linear algebra, you might have some difficulties reading this post. The reason for this is that quantum mechanics is fundamentally built on the mathematical structure of vector spaces, which is the core topic of linear algebra. Here is a quick primer on the main concepts you need to know to understand the post:
In mathematics, we often work with sets or spaces of objects. What makes each of them special tends to be the structure we can define on them, which tells us how to combine or manipulate these objects. Vector spaces (also called linear spaces) are among the most common types of spaces in mathematics. The special structure of these spaces is the existence of the linear combination: given two elements $u$ and $v$ of a vector space (also called vectors), and two scalars (plain numbers, real or complex) $a$ and $b$, the linear combination $au + bv$ is also an element of the vector space. We say the vector space is closed under linear combinations. Of course, if you apply this operation multiple times, you can keep combining more and more vectors together, and the result will still be in the vector space.
Linear algebra is the study of such vector spaces, and there are many important concepts associated with them, which I will now summarize in no particular order:
- Basis: A basis of a vector space is a set of vectors such that any vector in the space can be expressed as a unique linear combination of these basis vectors. The number of vectors in the basis is called the dimension of the vector space. For example, a basis of $n$ elements $\{e_1, \dots, e_n\}$ means any vector $v$ can be written as $v = c_1 e_1 + \dots + c_n e_n$ for some unique scalar coefficients $c_i$.
- Inner product: An inner product is a way to define a notion of “angle” and “length” in a vector space. It is a function that takes two vectors $u$ and $v$ and returns a scalar $\langle u, v \rangle$. The inner product must satisfy certain properties, such as linearity, symmetry, and positive-definiteness. The length (or norm) of a vector is defined as $\|v\| = \sqrt{\langle v, v \rangle}$.
- Orthogonality: Two vectors $u$ and $v$ are said to be orthogonal if their inner product is zero: $\langle u, v \rangle = 0$. An orthonormal basis is a basis where all vectors are orthogonal to each other and have unit length.
- Linear transformation: A linear transformation is a function that maps vectors from one vector space to another (or to itself) while preserving the structure of linear combinations. For example, if $T$ is a linear transformation, then $T(au + bv) = aT(u) + bT(v)$ for any vectors $u, v$ and scalars $a, b$.
By now, you probably noticed that most of this vocabulary sounds very similar to geometry. That’s because geometrical figures (like points, lines, planes, etc.) live inside vector spaces! So most of the early work on vector spaces was done in that context. Another similarity you might see is with matrices, row and column vectors. If you’ve worked in the machine learning field, you probably encountered these concepts regularly. Once again, this is because matrices and vectors are concrete representations of linear transformations and vectors in vector spaces. In fact, any linear transformation can be represented as a matrix, and applying the transformation to a vector corresponds to multiplying the matrix by the vector.
This leads me to the last batch of concepts which are a bit easier to understand if you think of vectors as column vectors and linear transformations as matrices:
- Eigenvalues and eigenvectors: An eigenvector of a linear transformation (or matrix) $T$ is a non-zero vector $v$ such that when you apply the transformation to it, you just get back the same vector multiplied by a scalar $\lambda$ (the eigenvalue): $Tv = \lambda v$. Eigenvalues and eigenvectors are important because they reveal intrinsic properties of the transformation.
- Diagonalization: A matrix $A$ is said to be diagonalizable if, by transforming it appropriately (using a change of basis $P$), it can be represented as a diagonal matrix $D = P^{-1} A P$ (a matrix where all non-diagonal elements are zero). Diagonalization is useful because diagonal matrices are much easier to work with, especially when computing powers or exponentials of matrices. Importantly, the eigenvalues of the matrix $A$ are the diagonal elements of the diagonalized matrix $D$.
- Commutativity: Two matrices (or linear transformations) $A$ and $B$ are said to commute if the order in which you apply them does not matter: $AB = BA$. Commuting matrices share the same eigenvectors (there is a basis which diagonalizes them at the same time!), while non-commuting matrices generally do not. This property is crucial in quantum mechanics, where observables are represented by operators (matrices) that may or may not commute.
- Tensor product: The tensor product is a way to combine two vector spaces $V$ and $W$ into a new vector space $V \otimes W$. The vectors in this new space are formed by taking all possible linear combinations of pairs of vectors from $V$ and $W$. If $v \in V$ and $w \in W$, then their tensor product $v \otimes w$ is an element of $V \otimes W$. The dimension of the tensor product space is the product of the dimensions of the original spaces: if $\dim V = n$ and $\dim W = m$, then $\dim(V \otimes W) = nm$. The tensor product is particularly important in quantum mechanics for describing systems with multiple particles, where the combined state space is the tensor product of the individual particle state spaces. Note that a sum like $v_1 \otimes w_1 + v_2 \otimes w_2$ is also an element of $V \otimes W$, but that doesn’t mean it’s equal to a product $v \otimes w$ itself (that’s the key point about entanglement!).
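In coordinates, the tensor product of two vectors is just the list of all pairwise products of their components (often called the Kronecker product). A tiny sketch in plain Python:

```python
def tensor(v, w):
    """Kronecker (tensor) product of two coordinate vectors."""
    return [a * b for a in v for b in w]

v = [1, 2]      # element of a 2-dimensional space V
w = [3, 4, 5]   # element of a 3-dimensional space W
vw = tensor(v, w)

print(vw)       # [3, 4, 5, 6, 8, 10]
print(len(vw))  # 2 * 3 = 6, the product of the dimensions
```

A generic length-6 list is a linear combination of such products but need not be one itself, which is the coordinate version of the entanglement remark above.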
Finally, a Hilbert space is a special type of vector space that also has an inner product (and some other technical conditions like completeness). Hilbert spaces can be infinite-dimensional, which makes them suitable for describing quantum systems with continuous degrees of freedom (like position and momentum). The inner product in a Hilbert space allows us to define notions of orthogonality, length, and angles between vectors, which are essential for understanding the geometry of quantum states and the behavior of quantum systems.
Footnotes
-
Don’t misunderstand my point, philosophy goes hand-in-hand with physics. Some of the greatest advances are the result of a different epistemology or a new way to view some concepts, which is first a philosophical question (I would invite you to read what Kant wrote in Critique of Pure Reason and compare it with Einstein’s work in relativity some 150-odd years later.) ↩
-
Note that the Lagrangian is an energy by virtue of its units, but it does not mean it has as much physical meaning as the kinetic or potential energies, whose effects are more directly seen on the physics of the system. ↩
-
If you’re familiar with Newtonian dynamics, you might know that not all forces can be associated to a potential energy (for example, air friction). These can be included in generalized Euler-Lagrange equations with generalized forces $Q$ on the right-hand side. The reason I do not do so is because these functions are usually emergent and describe a transfer of energy from the system to the environment, but do not exist at the fundamental microscopic level! For example, friction is just the effective combined action of the air molecules when considered as a whole. ↩
-
A large part of modern physics is based on the idea of symmetry and invariance. The Lagrangian and Hamiltonian formalisms make these concepts more explicit, which is one of the reasons they are so widely used in theoretical physics. I suggest you read this post if you want to know more about this fascinating topic. ↩
-
If this definition of the Hamiltonian seems a bit confusing, don’t worry. It’s mostly a mathematical technique to make sure the Hamiltonian only depends on positions and momenta. You can check yourself that, this way, $\frac{\partial H}{\partial \dot{q}} = p - \frac{\partial L}{\partial \dot{q}} = 0$, and so the Hamiltonian is independent of velocities! ↩
-
Chaotic systems are systems whose trajectories are very sensitive to the initial conditions. For example, a pendulum will have very similar trajectories if you let it go at 45° or 46°: it is not chaotic. However the system shown in the illustration box would have very different trajectories if you move the starting point even a little bit. ↩
-
Blackbody radiation is the phenomenon by which items emit and absorb light (electromagnetic radiation) depending on their temperature. A perfect blackbody is an ideal object which does this perfectly, with no loss or reflection. The energy emitted by such an object was predicted by classical physics to grow indefinitely as the wavelength of light decreased, eventually becoming infinite. This obviously does not happen in reality, and the discrepancy between theory and experiment was called the “ultraviolet catastrophe”. ↩
-
The photoelectric effect is the phenomenon where light shining on a metal surface can eject electrons from that surface. In classical physics, you’d expect that increasing the energy of the light would gradually eject electrons with more and more energy. However, experiments showed that below a certain frequency of light, no electrons were ejected at all, regardless of the light’s intensity. The explanation for this phenomenon required the introduction of the notion of quantization of light as photons. ↩
-
Strictly speaking, the wavefunction $\psi(x) = \langle x | \psi \rangle$ is the projection of the state vector $|\psi\rangle$ onto the basis of eigenvectors of the position operator $\hat{x}$. However, physicists often use both terms interchangeably when speaking informally. ↩
-
If like me you are not a native English speaker and had never heard of a “spinning top” before, here is a Wikipedia link. It’s just a toy that spins on itself. ↩
-
The classical wave equation (in 1D) is given by , where is the wave function (like displacement in a string), and is the speed of the wave. The Schrödinger equation has a similar structure, but with a first-order time derivative and complex coefficients, reflecting the quantum nature of the system. If you don’t see the similarity right away, it’s because the derivative is hidden inside the Hamiltonian operator. usually contains a kinetic energy term proportional to momentum squared, and due to some mathematical requirements, the momentum operator is represented as a spatial derivative in the position basis. ↩