Hamiltonian optics

Last updated October 24, 2024

Hamiltonian optics^[1] and Lagrangian optics^[2] are two formulations of geometrical optics which share much of the mathematical formalism with Hamiltonian mechanics and Lagrangian mechanics.

Hamilton's principle
Lagrangian optics
Fermat's principle
The Euler-Lagrange equations
Optical momentum
Hamilton's equations
Applications
Refraction and reflection
Rays and wavefronts
Phase space
Conservation of etendue
Imaging and nonimaging optics
Generalizations
General ray parametrization
Generalized coordinates
See also
References

Hamilton's principle

In physics, Hamilton's principle states that the evolution of a system $\left(q_{1}{\left(\sigma \right)},\dots ,q_{N}{\left(\sigma \right)}\right)$ described by $N$ generalized coordinates between two specified states at two specified parameters σ_A and σ_B is a stationary point (a point where the variation is zero) of the action functional, or $\delta S=\delta \int _{\sigma _{A}}^{\sigma _{B}}L\left(q_{1},\cdots ,q_{N},{\dot {q}}_{1},\cdots ,{\dot {q}}_{N},\sigma \right)\,d\sigma =0$ where ${\dot {q}}_{k}=dq_{k}/d\sigma$ and $L$ is the Lagrangian. Condition $\delta S=0$ is valid if and only if the Euler-Lagrange equations are satisfied, i.e., ${\frac {\partial L}{\partial q_{k}}}-{\frac {d}{d\sigma }}{\frac {\partial L}{\partial {\dot {q}}_{k}}}=0$ with $k=1,\dots ,N$ .

The momentum is defined as $p_{k}={\frac {\partial L}{\partial {\dot {q}}_{k}}}$ and the Euler–Lagrange equations can then be rewritten as ${\dot {p}}_{k}={\frac {\partial L}{\partial q_{k}}}$ where ${\dot {p}}_{k}=dp_{k}/d\sigma$ .

A different approach to solving this problem consists in defining a Hamiltonian (taking a Legendre transform of the Lagrangian) as $H=\sum _{k}{{\dot {q}}_{k}}p_{k}-L$ for which a new set of differential equations can be derived by looking at how the total differential of the Lagrangian depends on parameter σ, positions $q_{i}$ and their derivatives ${\dot {q}}_{i}$ relative to σ. This derivation is the same as in Hamiltonian mechanics, only with time t now replaced by a general parameter σ. These differential equations are Hamilton's equations ${\frac {\partial H}{\partial q_{k}}}=-{\dot {p}}_{k}\,,\quad {\frac {\partial H}{\partial p_{k}}}={\dot {q}}_{k}\,,\quad {\frac {\partial H}{\partial \sigma }}=-{\partial L \over \partial \sigma }\,.$ with $k=1,\dots ,N$ . Hamilton's equations are first-order differential equations, while the Euler-Lagrange equations are second-order.

Lagrangian optics

The general results presented above for Hamilton's principle can be applied to optics.^[3]^[4] In three-dimensional Euclidean space, the generalized coordinates are the coordinates of that space.

Fermat's principle

Fermat's principle states that the optical length of the path followed by light between two fixed points, A and B, is a stationary point. It may be a maximum, a minimum, constant or an inflection point. In general, as light travels, it moves in a medium of variable refractive index which is a scalar field of position in space, that is, $n=n\left(x_{1},x_{2},x_{3}\right)$ in 3D Euclidean space. Assuming now that light travels along the x₃ axis, the path of a light ray may be parametrized as $s=\left(x_{1}\left(x_{3}\right),x_{2}\left(x_{3}\right),x_{3}\right)$ starting at a point $\mathbf {A} =\left(x_{1}\left(x_{3A}\right),x_{2}\left(x_{3A}\right),x_{3A}\right)$ and ending at a point $\mathbf {B} =\left(x_{1}\left(x_{3B}\right),x_{2}\left(x_{3B}\right),x_{3B}\right)$ . In this case, when compared to Hamilton's principle above, coordinates $x_{1}$ and $x_{2}$ take the role of the generalized coordinates $q_{k}$ while $x_{3}$ takes the role of parameter $\sigma$ , that is, σ=x₃ and N=2.

In the context of calculus of variations this can be written as^[2] $\delta S=\delta \int _{\mathbf {A} }^{\mathbf {B} }n\,ds=\delta \int _{x_{3A}}^{x_{3B}}n{\frac {ds}{dx_{3}}}\,dx_{3}=\delta \int _{x_{3A}}^{x_{3B}}L\left(x_{1},x_{2},{\dot {x}}_{1},{\dot {x}}_{2},x_{3}\right)\,dx_{3}=0$ where $ds$ is an infinitesimal displacement along the ray given by ${\textstyle ds={\sqrt {dx_{1}^{2}+dx_{2}^{2}+dx_{3}^{2}}}}$ and $L=n{\frac {ds}{dx_{3}}}=n\left(x_{1},x_{2},x_{3}\right){\sqrt {1+{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}}}$ is the optical Lagrangian and ${\dot {x}}_{k}=dx_{k}/dx_{3}$ .

The optical path length (OPL) is defined as $S=\int _{\mathbf {A} }^{\mathbf {B} }n\,ds=\int _{\mathbf {A} }^{\mathbf {B} }L\,dx_{3}$ where n is the local refractive index as a function of position along the path between points A and B.
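The definition above can be checked numerically. The following is a minimal sketch, not part of the cited references, that approximates $S=\int n\,ds$ along a straight segment with the midpoint rule. The function name `optical_path_length` and the linear index profile are assumptions made for illustration. Note that the sketch evaluates S for a prescribed path; it does not determine whether that path is an actual light ray.

```python
import math

def optical_path_length(n, a, b, steps=1000):
    """Approximate S = integral of n ds along the straight segment a -> b.

    n     -- callable n(x1, x2, x3) returning the refractive index (assumed)
    a, b  -- end points as (x1, x2, x3) tuples
    steps -- number of midpoint-rule sub-intervals
    """
    ds = math.dist(a, b) / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) / steps  # midpoint of the i-th sub-interval
        point = tuple(ai + t * (bi - ai) for ai, bi in zip(a, b))
        total += n(*point) * ds
    return total

# Hypothetical medium: the index grows linearly with x3.
def linear_index(x1, x2, x3):
    return 1.5 + 0.1 * x3

opl = optical_path_length(linear_index, (0.0, 0.0, 0.0), (0.0, 0.0, 2.0))
```

For this profile the integral can also be done by hand, $\int _{0}^{2}(1.5+0.1\,x_{3})\,dx_{3}=3.2$, which the numerical value should reproduce.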

The Euler-Lagrange equations

The general results presented above for Hamilton's principle can be applied to optics using the Lagrangian defined in Fermat's principle. The Euler-Lagrange equations with parameter σ =x₃ and N=2 applied to Fermat's principle result in ${\frac {\partial L}{\partial x_{k}}}-{\frac {d}{dx_{3}}}{\frac {\partial L}{\partial {\dot {x}}_{k}}}=0$ with $k = 1, 2$ and where L is the optical Lagrangian and ${\dot {x}}_{k}=dx_{k}/dx_{3}$ .
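As a worked illustration, the two derivatives of the optical Lagrangian ${\textstyle L=n{\sqrt {1+{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}}}}$ can be written out explicitly: ${\frac {\partial L}{\partial x_{k}}}={\frac {\partial n}{\partial x_{k}}}{\sqrt {1+{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}}}\,,\quad {\frac {\partial L}{\partial {\dot {x}}_{k}}}={\frac {n{\dot {x}}_{k}}{\sqrt {1+{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}}}}$ so that the Euler-Lagrange equations become ${\frac {d}{dx_{3}}}\left({\frac {n{\dot {x}}_{k}}{\sqrt {1+{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}}}}\right)={\frac {\partial n}{\partial x_{k}}}{\sqrt {1+{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}}}$ with $k = 1, 2$ . In a homogeneous medium the right-hand side vanishes, the slopes ${\dot {x}}_{k}$ are constant, and the rays are straight lines.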

Optical momentum

The optical momentum is defined as $p_{k}={\frac {\partial L}{\partial {\dot {x}}_{k}}}$ and from the definition of the optical Lagrangian ${\textstyle L=n{\sqrt {1+{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}}}}$ this expression can be rewritten as $p_{k}=n{\frac {{\dot {x}}_{k}}{\sqrt {{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}+{\dot {x}}_{3}^{2}}}}=n{\frac {dx_{k}}{\sqrt {dx_{1}^{2}+dx_{2}^{2}+dx_{3}^{2}}}}=n{\frac {dx_{k}}{ds}}$

or in vector form $\mathbf {p} =n{\frac {\mathbf {ds} }{ds}}=\left(p_{1},p_{2},p_{3}\right)=\left(n\cos \alpha _{1},n\cos \alpha _{2},n\cos \alpha _{3}\right)=n\mathbf {\hat {e}}$ where $\mathbf {\hat {e}}$ is a unit vector and angles α₁, α₂ and α₃ are the angles p makes to axis x₁, x₂ and x₃ respectively, as shown in figure "optical momentum". Therefore, the optical momentum is a vector of norm $\|\mathbf {p} \|={\sqrt {p_{1}^{2}+p_{2}^{2}+p_{3}^{2}}}=n$ where n is the refractive index at which p is calculated. Vector p points in the direction of propagation of light. If light is propagating in a gradient index optic the path of the light ray is curved and vector p is tangent to the light ray.
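The relation between the slopes of a ray and its optical momentum can be sketched in a few lines. The helper name `optical_momentum` and the sample values are assumptions for illustration; the formula itself is $p_{k}=n\,dx_{k}/ds$ with $ds/dx_{3}={\sqrt {1+{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}}}$ .

```python
import math

def optical_momentum(n, x1_dot, x2_dot):
    """Return (p1, p2, p3) for slopes x1_dot = dx1/dx3 and x2_dot = dx2/dx3."""
    norm = math.sqrt(1.0 + x1_dot**2 + x2_dot**2)  # equals ds/dx3
    return (n * x1_dot / norm, n * x2_dot / norm, n / norm)

# Hypothetical ray in a medium of index 1.5.
p = optical_momentum(1.5, 0.3, -0.2)
p_norm = math.sqrt(sum(c * c for c in p))  # should equal the index n
```

Whatever slopes are chosen, the norm of the resulting vector equals the refractive index passed in, in agreement with $\|\mathbf {p} \|=n$ .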

The expression for the optical path length can also be written as a function of the optical momentum. Taking into account that ${\dot {x}}_{3}=dx_{3}/dx_{3}=1$ , the expression for the optical Lagrangian can be rewritten as ${\begin{aligned}L&=n{\sqrt {{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}+{\dot {x}}_{3}^{2}}}={\dot {x}}_{1}{\frac {n{\dot {x}}_{1}}{\sqrt {{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}+{\dot {x}}_{3}^{2}}}}+{\dot {x}}_{2}{\frac {n{\dot {x}}_{2}}{\sqrt {{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}+{\dot {x}}_{3}^{2}}}}+{\frac {n{\dot {x}}_{3}}{\sqrt {{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}+{\dot {x}}_{3}^{2}}}}\\[1ex]&={\dot {x}}_{1}p_{1}+{\dot {x}}_{2}p_{2}+{\dot {x}}_{3}p_{3}={\dot {x}}_{1}p_{1}+{\dot {x}}_{2}p_{2}+p_{3}\end{aligned}}$ and the expression for the optical path length is $S=\int L\,dx_{3}=\int \mathbf {p} \cdot d\mathbf {s}$

Hamilton's equations

As in Hamiltonian mechanics, the optical Hamiltonian is defined by the expression given above, with $N = 2$ corresponding to the functions $x_{1}{\left(x_{3}\right)}$ and $x_{2}{\left(x_{3}\right)}$ to be determined: $H={\dot {x}}_{1}p_{1}+{\dot {x}}_{2}p_{2}-L$

Comparing this expression with $L={\dot {x}}_{1}p_{1}+{\dot {x}}_{2}p_{2}+p_{3}$ for the Lagrangian results in $H=-p_{3}=-{\sqrt {n^{2}-p_{1}^{2}-p_{2}^{2}}}$

And the corresponding Hamilton's equations with parameter σ =x₃ and k=1,2 applied to optics are^[5]^[6] ${\frac {\partial H}{\partial x_{k}}}=-{\dot {p}}_{k}\,,\quad {\frac {\partial H}{\partial p_{k}}}={\dot {x}}_{k}$ with ${\dot {x}}_{k}=dx_{k}/dx_{3}$ and ${\dot {p}}_{k}=dp_{k}/dx_{3}$ .
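With $H=-{\sqrt {n^{2}-p_{1}^{2}-p_{2}^{2}}}$ the equations above read ${\dot {x}}_{k}=p_{k}/{\sqrt {n^{2}-p_{1}^{2}-p_{2}^{2}}}$ and ${\dot {p}}_{k}=n\,(\partial n/\partial x_{k})/{\sqrt {n^{2}-p_{1}^{2}-p_{2}^{2}}}$ . The sketch below integrates them in the two-dimensional case (x₂=0, p₂=0) with a classical fourth-order Runge-Kutta step. The parabolic index profile, the constants `N0` and `G`, and all function names are assumptions chosen for illustration, not a method taken from the cited references.

```python
import math

N0, G = 1.5, 0.05  # hypothetical profile n(x1) = N0 - G * x1**2

def index(x1):
    return N0 - G * x1 * x1

def index_gradient(x1):
    return -2.0 * G * x1

def derivatives(x1, p1):
    """Right-hand sides of Hamilton's equations for H = -sqrt(n^2 - p1^2)."""
    n = index(x1)
    p3 = math.sqrt(n * n - p1 * p1)  # p3 = -H
    return p1 / p3, n * index_gradient(x1) / p3

def trace(x1, p1, x3_end, steps=2000):
    """Integrate the ray from x3 = 0 to x3_end with classical RK4."""
    h = x3_end / steps
    for _ in range(steps):
        k1x, k1p = derivatives(x1, p1)
        k2x, k2p = derivatives(x1 + 0.5 * h * k1x, p1 + 0.5 * h * k1p)
        k3x, k3p = derivatives(x1 + 0.5 * h * k2x, p1 + 0.5 * h * k2p)
        k4x, k4p = derivatives(x1 + h * k3x, p1 + h * k3p)
        x1 += h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        p1 += h * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
    return x1, p1

x1_start, p1_start = 0.5, 0.2
x1_end, p1_end = trace(x1_start, p1_start, x3_end=10.0)
p3_start = math.sqrt(index(x1_start) ** 2 - p1_start ** 2)
p3_end = math.sqrt(index(x1_end) ** 2 - p1_end ** 2)
```

Because this index profile does not depend on x₃, the Hamiltonian $H=-p_{3}$ should be constant along the ray, so comparing `p3_start` with `p3_end` gives a simple consistency check on the integration.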

Applications

In this section it is assumed that light travels along the x₃ axis. As in Hamilton's principle above, coordinates $x_{1}$ and $x_{2}$ take the role of the generalized coordinates $q_{k}$ while $x_{3}$ takes the role of parameter $\sigma$ , that is, σ=x₃ and N=2.

Refraction and reflection

If plane x₁x₂ separates two media of refractive index n_A below and n_B above it, the refractive index is given by a step function $n(x_{3})={\begin{cases}n_{A}&{\text{if }}x_{3}<0\\n_{B}&{\text{if }}x_{3}>0\\\end{cases}}$ and from Hamilton's equations ${\frac {\partial H}{\partial x_{k}}}=-{\frac {\partial }{\partial x_{k}}}{\sqrt {n(x_{3})^{2}-p_{1}^{2}-p_{2}^{2}}}=0$ and therefore ${\dot {p}}_{k}=0$ or $p_{k}={\text{Constant}}$ for $k = 1, 2$ .

An incoming light ray has momentum p_A before refraction (below plane x₁x₂) and momentum p_B after refraction (above plane x₁x₂). The light ray makes an angle θ_A with axis x₃ (the normal to the refractive surface) before refraction and an angle θ_B with axis x₃ after refraction. Since the p₁ and p₂ components of the momentum are constant, only p₃ changes from p_3A to p_3B.

Figure "refraction" shows the geometry of this refraction from which $d=\|\mathbf {p} _{A}\|\sin \theta _{A}=\|\mathbf {p} _{B}\|\sin \theta _{B}$ . Since $\|\mathbf {p} _{A}\|=n_{A}$ and $\|\mathbf {p} _{B}\|=n_{B}$ , this last expression can be written as $n_{A}\sin \theta _{A}=n_{B}\sin \theta _{B}$ which is Snell's law of refraction.
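A minimal sketch of this result, assuming angles measured from the x₃ axis in radians: the transverse momentum component $n_{A}\sin \theta _{A}$ is carried unchanged across the interface and then converted back to an angle in the second medium. The function name `refracted_angle` is an assumption for illustration.

```python
import math

def refracted_angle(n_a, n_b, theta_a):
    """Return theta_b (radians) from n_a sin(theta_a) = n_b sin(theta_b).

    Returns None when no refracted ray exists (total internal reflection).
    """
    p_transverse = n_a * math.sin(theta_a)  # conserved component of p
    if abs(p_transverse) > n_b:
        return None
    return math.asin(p_transverse / n_b)

# Hypothetical interface: index 1.0 below, 1.5 above, 30 degree incidence.
theta_b = refracted_angle(1.0, 1.5, math.radians(30.0))
```

Returning `None` when the conserved component exceeds n_B is a design choice of this sketch; it flags the case in which no refracted ray exists.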

In figure "refraction", the normal to the refractive surface points in the direction of axis x₃, and also of vector $\mathbf {v} =\mathbf {p} _{A}-\mathbf {p} _{B}$ . A unit normal $\mathbf {n} =\mathbf {v} /\|\mathbf {v} \|$ to the refractive surface can then be obtained from the momenta of the incoming and outgoing rays by $\mathbf {n} ={\frac {\mathbf {p} _{A}-\mathbf {p} _{B}}{\|\mathbf {p} _{A}-\mathbf {p} _{B}\|}}={\frac {n_{A}\mathbf {i} -n_{B}\mathbf {r} }{\|n_{A}\mathbf {i} -n_{B}\mathbf {r} \|}}$ where i and r are unit vectors in the directions of the incident and refracted rays. Also, the outgoing ray (in the direction of $\mathbf {p} _{B}$ ) is contained in the plane defined by the incoming ray (in the direction of $\mathbf {p} _{A}$ ) and the normal $\mathbf {n}$ to the surface.

A similar argument can be used for reflection in deriving the law of specular reflection, only now with n_A=n_B, resulting in θ_A=θ_B. Also, if i and r are unit vectors in the directions of the incident and reflected rays respectively, the corresponding normal to the surface is given by the same expression as for refraction, only with n_A=n_B: $\mathbf {n} ={\frac {\mathbf {i} -\mathbf {r} }{\|\mathbf {i} -\mathbf {r} \|}}$

In vector form, if i is a unit vector pointing in the direction of the incident ray and n is the unit normal to the surface, the direction r of the refracted ray is given by:^[3] $\mathbf {r} ={\frac {n_{A}}{n_{B}}}\mathbf {i} +\left(-\left(\mathbf {i} \cdot \mathbf {n} \right){\frac {n_{A}}{n_{B}}}+{\sqrt {\Delta }}\right)\mathbf {n}$ with $\Delta =1-\left({\frac {n_{A}}{n_{B}}}\right)^{2}\left(1-\left(\mathbf {i} \cdot \mathbf {n} \right)^{2}\right)$

If i⋅n<0 then −n should be used in the calculations. When $\Delta <0$ , light undergoes total internal reflection and the direction r of the outgoing ray is given by the expression for reflection: $\mathbf {r} =\mathbf {i} -2\left(\mathbf {i} \cdot \mathbf {n} \right)\mathbf {n}$
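The vector expressions for refraction and total internal reflection can be combined in one routine. The sketch below is an illustration under stated assumptions: vectors are plain tuples, i and n are already normalized, and the names `dot` and `refract` are hypothetical helpers.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def refract(i, n, n_a, n_b):
    """Direction of the outgoing ray for a unit incident direction i.

    i, n      -- unit 3-vectors (incident direction, surface normal)
    n_a, n_b  -- refractive indices before and after the surface
    Falls back to specular reflection when Delta < 0.
    """
    if dot(i, n) < 0:  # use -n so that the normal points along the ray
        n = tuple(-c for c in n)
    cos_i = dot(i, n)
    ratio = n_a / n_b
    delta = 1.0 - ratio**2 * (1.0 - cos_i**2)
    if delta < 0:  # total internal reflection
        return tuple(ic - 2.0 * cos_i * nc for ic, nc in zip(i, n))
    factor = -cos_i * ratio + math.sqrt(delta)
    return tuple(ratio * ic + factor * nc for ic, nc in zip(i, n))

normal = (0.0, 0.0, 1.0)
theta = math.radians(30.0)
incident = (math.sin(theta), 0.0, math.cos(theta))
refracted = refract(incident, normal, 1.0, 1.5)  # lower to higher index

steep = (math.sin(math.radians(60.0)), 0.0, math.cos(math.radians(60.0)))
bounced = refract(steep, normal, 1.5, 1.0)  # Delta < 0 for these values
```

In the first call the component of $n\,\mathbf {r}$ parallel to the surface matches that of $n\,\mathbf {i}$ , as Snell's law requires. In the second call only the component along the normal changes sign.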

Rays and wavefronts

From the definition of optical path length ${\textstyle S=\int L\,dx_{3}}$ ${\frac {\partial S}{\partial x_{k}}}=\int {\frac {\partial L}{\partial x_{k}}}\,dx_{3}=\int {\frac {dp_{k}}{dx_{3}}}\,dx_{3}=p_{k}$

with k=1,2, where the Euler-Lagrange equations $\partial L/\partial x_{k}=dp_{k}/dx_{3}$ with k=1,2 were used. Also, from the last of Hamilton's equations $\partial H/\partial x_{3}=-\partial L/\partial x_{3}$ and from $H=-p_{3}$ above, ${\frac {\partial S}{\partial x_{3}}}=\int {\frac {\partial L}{\partial x_{3}}}\,dx_{3}=\int {\frac {dp_{3}}{dx_{3}}}\,dx_{3}=p_{3}$ . Combining the equations for the components of momentum p results in $\mathbf {p} =\nabla S$

Since p is a vector tangent to the light rays, surfaces S=Constant must be perpendicular to those light rays. These surfaces are called wavefronts. Figure "rays and wavefronts" illustrates this relationship. Also shown is optical momentum p, tangent to a light ray and perpendicular to the wavefront.

The vector field $\mathbf {p} =\nabla S$ is a conservative vector field. The gradient theorem can then be applied to the optical path length (as given above), resulting in $S=\int _{\mathbf {A} }^{\mathbf {B} }\mathbf {p} \cdot d\mathbf {s} =\int _{\mathbf {A} }^{\mathbf {B} }\nabla S\cdot d\mathbf {s} =S(\mathbf {B} )-S(\mathbf {A} )$ and the integral of $\mathbf {p} \cdot d\mathbf {s}$ calculated along a curve C between points A and B is a function of only its end points A and B and not of the shape of the curve between them. In particular, if the curve is closed, it starts and ends at the same point, or A=B, so that $S=\oint \nabla S\cdot d\mathbf {s} =0$
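The path independence can be illustrated numerically. The sketch below assumes a point source at the origin of a homogeneous two-dimensional medium, for which S is the index times the distance to the source, and evaluates the integral of $\mathbf {p} \cdot d\mathbf {s}$ along two different polylines between the same end points. The index value, the sample points and the function names are all assumptions for illustration.

```python
import math

N = 1.5  # homogeneous medium (assumed)

def eikonal(x, y):
    """S for a point source at the origin: index times distance."""
    return N * math.hypot(x, y)

def momentum(x, y):
    """p = grad S, a vector of norm N pointing away from the source."""
    r = math.hypot(x, y)
    return (N * x / r, N * y / r)

def line_integral(points, steps=2000):
    """Midpoint-rule estimate of the integral of p . ds along a polyline."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
        for k in range(steps):
            xm, ym = x0 + (k + 0.5) * dx, y0 + (k + 0.5) * dy
            px, py = momentum(xm, ym)
            total += px * dx + py * dy
    return total

A, B = (1.0, 0.0), (2.0, 3.0)
direct = line_integral([A, B])
detour = line_integral([A, (4.0, 1.0), (0.5, 2.5), B])
expected = eikonal(*B) - eikonal(*A)
```

Up to discretization error, `direct` and `detour` should both agree with `expected`, even though neither polyline is itself a light ray.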

This result may be applied to a closed path ABCDA as in figure "optical path length" $S=\int _{\mathbf {A} }^{\mathbf {B} }\mathbf {p} \cdot d\mathbf {s} +\int _{\mathbf {B} }^{\mathbf {C} }\mathbf {p} \cdot d\mathbf {s} +\int _{\mathbf {C} }^{\mathbf {D} }\mathbf {p} \cdot d\mathbf {s} +\int _{\mathbf {D} }^{\mathbf {A} }\mathbf {p} \cdot d\mathbf {s} =0$

For curve segment AB the optical momentum p is perpendicular to a displacement ds along curve AB, or $\mathbf {p} \cdot d\mathbf {s} =0$ . The same is true for segment CD. For segment BC the optical momentum p has the same direction as displacement ds and $\mathbf {p} \cdot d\mathbf {s} =n\,ds$ . For segment DA the optical momentum p has the opposite direction to displacement ds and $\mathbf {p} \cdot d\mathbf {s} =-n\,ds$ . However, inverting the direction of the integration so that the integral is taken from A to D, ds inverts direction and $\mathbf {p} \cdot d\mathbf {s} =n\,ds$ . From these considerations $\int _{\mathbf {B} }^{\mathbf {C} }n\,ds=\int _{\mathbf {A} }^{\mathbf {D} }n\,ds$ or $S_{\mathbf {BC} }=S_{\mathbf {AD} }$ and the optical path length S_BC between points B and C along the ray connecting them is the same as the optical path length S_AD between points A and D along the ray connecting them. The optical path length is constant between wavefronts.

Phase space

Figure "2D phase space" shows at the top some light rays in a two-dimensional space. Here x₂=0 and p₂=0 so light travels on the plane x₁x₃ in directions of increasing x₃ values. In this case $p_{1}^{2}+p_{3}^{2}=n^{2}$ and the direction of a light ray is completely specified by the p₁ component of momentum $\mathbf {p} =(p_{1},p_{3})$ since p₂=0. If p₁ is given, p₃ may be calculated (given the value of the refractive index n) and therefore p₁ suffices to determine the direction of the light ray. The refractive index of the medium the ray is traveling in is determined by $\|\mathbf {p} \|=n$ .

For example, ray r_C crosses axis x₁ at coordinate x_B with an optical momentum p_C, which has its tip on a circle of radius n centered at position x_B. Coordinate x_B and the horizontal coordinate p_1C of momentum p_C completely define ray r_C as it crosses axis x₁. This ray may then be defined by a point r_C=(x_B,p_1C) in space x₁p₁ as shown at the bottom of the figure. Space x₁p₁ is called phase space and different light rays may be represented by different points in this space.

Similarly, ray r_D shown at the top is represented by a point r_D in phase space at the bottom. All rays crossing axis x₁ at coordinate x_B contained between rays r_C and r_D are represented by a vertical line connecting points r_C and r_D in phase space. Accordingly, all rays crossing axis x₁ at coordinate x_A contained between rays r_A and r_B are represented by a vertical line connecting points r_A and r_B in phase space. In general, all rays crossing axis x₁ between x_L and x_R are represented by a volume R in phase space. The rays at the boundary ∂R of volume R are called edge rays. For example, at position x_A of axis x₁, rays r_A and r_B are the edge rays since all other rays are contained between these two. (A ray parallel to x₁ would not lie between these two rays, since its momentum does not lie between theirs.)

In three-dimensional geometry the optical momentum is given by $\mathbf {p} =(p_{1},p_{2},p_{3})$ with $p_{1}^{2}+p_{2}^{2}+p_{3}^{2}=n^{2}$ . If p₁ and p₂ are given, p₃ may be calculated (given the value of the refractive index n) and therefore p₁ and p₂ suffice to determine the direction of the light ray. A ray traveling along axis x₃ is then defined by a point (x₁,x₂) in plane x₁x₂ and a direction (p₁,p₂). It may then be defined by a point in four-dimensional phase space x₁x₂p₁p₂.

Conservation of etendue

Figure "volume variation" shows a volume V bounded by a surface A. Over time, if the boundary A moves, the volume of V may vary. In particular, an infinitesimal area dA with outward pointing unit normal n moves with a velocity v.

This leads to a volume variation $dV=dA(\mathbf {v} \cdot \mathbf {n} )dt$ . Making use of Gauss's theorem, the variation in time of the total volume V moving in space is ${\frac {dV}{dt}}=\int _{A}\mathbf {v} \cdot \mathbf {n} \,dA=\int _{V}\nabla \cdot \mathbf {v} \,dV$

The rightmost term is a volume integral over the volume V and the middle term is the surface integral over the boundary A of the volume V. Also, v is the velocity with which the points in V are moving.

In optics coordinate $x_{3}$ takes the role of time. In phase space a light ray is identified by a point $(x_{1},x_{2},p_{1},p_{2})$ which moves with a "velocity" $\mathbf {v} =({\dot {x}}_{1},{\dot {x}}_{2},{\dot {p}}_{1},{\dot {p}}_{2})$ where the dot represents a derivative relative to $x_{3}$ . A set of light rays spreading over $dx_{1}$ in coordinate $x_{1}$ , $dx_{2}$ in coordinate $x_{2}$ , $dp_{1}$ in coordinate $p_{1}$ and $dp_{2}$ in coordinate $p_{2}$ occupies a volume $dV=dx_{1}dx_{2}dp_{1}dp_{2}$ in phase space. In general, a large set of rays occupies a large volume $V$ in phase space to which Gauss's theorem may be applied ${\frac {dV}{dx_{3}}}=\int _{V}\nabla \cdot \mathbf {v} \,dV$ and using Hamilton's equations $\nabla \cdot \mathbf {v} ={\frac {\partial {\dot {x}}_{1}}{\partial x_{1}}}+{\frac {\partial {\dot {x}}_{2}}{\partial x_{2}}}+{\frac {\partial {\dot {p}}_{1}}{\partial p_{1}}}+{\frac {\partial {\dot {p}}_{2}}{\partial p_{2}}}={\frac {\partial }{\partial x_{1}}}{\frac {\partial H}{\partial p_{1}}}+{\frac {\partial }{\partial x_{2}}}{\frac {\partial H}{\partial p_{2}}}-{\frac {\partial }{\partial p_{1}}}{\frac {\partial H}{\partial x_{1}}}-{\frac {\partial }{\partial p_{2}}}{\frac {\partial H}{\partial x_{2}}}=0$ or $dV/dx_{3}=0$ and $dV=dx_{1}dx_{2}dp_{1}dp_{2}={\text{Constant}}$ which means that the phase space volume is conserved as light travels along an optical system.

The volume occupied by a set of rays in phase space is called etendue, which is conserved as light rays progress in the optical system along direction x₃. This corresponds to Liouville's theorem, which also applies to Hamiltonian mechanics.
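A small numerical illustration of this conservation, under stated assumptions: the two-dimensional case (x₂=0, p₂=0) in a uniform medium, where Hamilton's equations give a constant p₁ and ${\dot {x}}_{1}=p_{1}/{\sqrt {n^{2}-p_{1}^{2}}}$ . The boundary of a rectangular region in phase space is sampled, each boundary ray is propagated a distance along x₃, and the enclosed area is computed before and after with the shoelace formula. All names and numerical values are hypothetical.

```python
import math

def propagate(x1, p1, n, distance):
    """Transfer a ray over `distance` along x3 in a uniform medium of index n."""
    p3 = math.sqrt(n * n - p1 * p1)
    return x1 + distance * p1 / p3, p1  # p1 is constant when n is uniform

def polygon_area(vertices):
    """Shoelace formula for a simple polygon given as (x1, p1) pairs."""
    total = 0.0
    for (xa, pa), (xb, pb) in zip(vertices, vertices[1:] + vertices[:1]):
        total += xa * pb - xb * pa
    return abs(total) / 2.0

def rectangle_boundary(x_left, x_right, p_low, p_high, samples=200):
    """Counter-clockwise boundary of a rectangle in phase space."""
    ps = [p_low + (p_high - p_low) * k / samples for k in range(samples + 1)]
    bottom = [(x_left, p_low), (x_right, p_low)]
    right = [(x_right, p) for p in ps[1:]]
    left = [(x_left, p) for p in reversed(ps[1:])]
    return bottom + right + left

boundary_in = rectangle_boundary(-1.0, 1.0, -0.5, 0.5)
boundary_out = [propagate(x1, p1, 1.0, 5.0) for x1, p1 in boundary_in]
etendue_in = polygon_area(boundary_in)
etendue_out = polygon_area(boundary_out)
```

The region is sheared by the propagation, so its shape changes, but the two areas should coincide. Only the edge rays are traced here; the interior rays stay inside the transformed boundary.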

However, the meaning of Liouville’s theorem in mechanics is rather different from the theorem of conservation of étendue. Liouville’s theorem is essentially statistical in nature, and it refers to the evolution in time of an ensemble of mechanical systems of identical properties but with different initial conditions. Each system is represented by a single point in phase space, and the theorem states that the average density of points in phase space is constant in time. An example would be the molecules of a perfect classical gas in equilibrium in a container. Each point in phase space, which in this example has 2N dimensions, where N is the number of molecules, represents one of an ensemble of identical containers, an ensemble large enough to permit taking a statistical average of the density of representative points. Liouville’s theorem states that if all the containers remain in equilibrium, the average density of points remains constant.^[3]

Imaging and nonimaging optics

Figure "conservation of etendue" shows on the left a diagrammatic two-dimensional optical system in which x₂=0 and p₂=0 so light travels on the plane x₁x₃ in directions of increasing x₃ values.

Light rays crossing the input aperture of the optic at point x₁=x_I are contained between edge rays r_A and r_B represented by a vertical line between points r_A and r_B at the phase space of the input aperture (right, bottom corner of the figure). All rays crossing the input aperture are represented in phase space by a region R_I.

Also, light rays crossing the output aperture of the optic at point x₁=x_O are contained between edge rays r_A and r_B represented by a vertical line between points r_A and r_B at the phase space of the output aperture (right, top corner of the figure). All rays crossing the output aperture are represented in phase space by a region R_O.

Conservation of etendue in the optical system means that the volume (or area in this two-dimensional case) in phase space occupied by R_I at the input aperture must be the same as the volume in phase space occupied by R_O at the output aperture.

In imaging optics, all light rays crossing the input aperture at x₁=x_I are redirected by the optic towards the output aperture at x₁=x_O where x_O=m x_I. This ensures that an image of the input is formed at the output with a magnification m. In phase space, this means that vertical lines in the phase space at the input are transformed into vertical lines at the output. That would be the case of vertical line r_Ar_B in R_I transformed to vertical line r_Ar_B in R_O.

In nonimaging optics, the goal is not to form an image but simply to transfer all light from the input aperture to the output aperture. This is accomplished by transforming the edge rays ∂R_I of R_I to edge rays ∂R_O of R_O. This is known as the edge ray principle.

Generalizations

Above it was assumed that light travels along the x₃ axis, so that, as in Hamilton's principle, coordinates $x_{1}$ and $x_{2}$ take the role of the generalized coordinates $q_{k}$ while $x_{3}$ takes the role of parameter $\sigma$ , that is, σ=x₃ and N=2. However, different parametrizations of the light rays are possible, as well as the use of generalized coordinates.

General ray parametrization

A more general situation can be considered in which the path of a light ray is parametrized as $s=\left(x_{1}{\left(\sigma \right)},x_{2}{\left(\sigma \right)},x_{3}{\left(\sigma \right)}\right)$ in which σ is a general parameter. In this case, when compared to Hamilton's principle above, coordinates $x_{1}$ , $x_{2}$ and $x_{3}$ take the role of the generalized coordinates $q_{k}$ with N=3. Applying Hamilton's principle to optics in this case leads to ${\begin{aligned}\delta S&=\delta \int _{\mathbf {A} }^{\mathbf {B} }n\,ds=\delta \int _{\sigma _{A}}^{\sigma _{B}}n{\frac {ds}{d\sigma }}\,d\sigma \\&=\delta \int _{\sigma _{A}}^{\sigma _{B}}L\left(x_{1},x_{2},x_{3},{\dot {x}}_{1},{\dot {x}}_{2},{\dot {x}}_{3},\sigma \right)\,d\sigma =0\end{aligned}}$ where now $L=nds/d\sigma$ and ${\dot {x}}_{k}=dx_{k}/d\sigma$ and for which the Euler-Lagrange equations applied to this form of Fermat's principle result in ${\frac {\partial L}{\partial x_{k}}}-{\frac {d}{d\sigma }}{\frac {\partial L}{\partial {\dot {x}}_{k}}}=0$ with k=1,2,3 and where L is the optical Lagrangian. Also in this case the optical momentum is defined as $p_{k}={\frac {\partial L}{\partial {\dot {x}}_{k}}}$ and the Hamiltonian P is defined by the expression given above for N=3 corresponding to functions $x_{1}{\left(\sigma \right)}$ , $x_{2}{\left(\sigma \right)}$ and $x_{3}{\left(\sigma \right)}$ to be determined $P={\dot {x}}_{1}p_{1}+{\dot {x}}_{2}p_{2}+{\dot {x}}_{3}p_{3}-L$

The corresponding Hamilton's equations with k=1,2,3 applied to optics are ${\frac {\partial P}{\partial x_{k}}}=-{\dot {p}}_{k}\,,\quad {\frac {\partial P}{\partial p_{k}}}={\dot {x}}_{k}$ with ${\dot {x}}_{k}=dx_{k}/d\sigma$ and ${\dot {p}}_{k}=dp_{k}/d\sigma$ .

The optical Lagrangian is given by $L=n{\frac {ds}{d\sigma }}=n\left(x_{1},x_{2},x_{3}\right){\sqrt {{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}+{\dot {x}}_{3}^{2}}}=L\left(x_{1},x_{2},x_{3},{\dot {x}}_{1},{\dot {x}}_{2},{\dot {x}}_{3}\right)$ and does not explicitly depend on parameter σ. For that reason not all solutions of the Euler-Lagrange equations will be possible light rays, since their derivation assumed an explicit dependence of L on σ which does not happen in optics.

The optical momentum components can be obtained from $p_{k}=n{\frac {{\dot {x}}_{k}}{\sqrt {{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}+{\dot {x}}_{3}^{2}}}}=n{\frac {dx_{k}}{\sqrt {dx_{1}^{2}+dx_{2}^{2}+dx_{3}^{2}}}}=n{\frac {dx_{k}}{ds}}$ where ${\dot {x}}_{k}=dx_{k}/d\sigma$ . The expression for the Lagrangian can be rewritten as ${\begin{aligned}L&=n{\sqrt {{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}+{\dot {x}}_{3}^{2}}}={\dot {x}}_{1}{\frac {n{\dot {x}}_{1}}{\sqrt {{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}+{\dot {x}}_{3}^{2}}}}+{\dot {x}}_{2}{\frac {n{\dot {x}}_{2}}{\sqrt {{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}+{\dot {x}}_{3}^{2}}}}+{\dot {x}}_{3}{\frac {n{\dot {x}}_{3}}{\sqrt {{\dot {x}}_{1}^{2}+{\dot {x}}_{2}^{2}+{\dot {x}}_{3}^{2}}}}\\&={\dot {x}}_{1}p_{1}+{\dot {x}}_{2}p_{2}+{\dot {x}}_{3}p_{3}\end{aligned}}$

Comparing this expression for L with that for the Hamiltonian P it can be concluded that $P=0$

From the expressions for the components $p_{k}$ of the optical momentum it follows that $p_{1}^{2}+p_{2}^{2}+p_{3}^{2}-n^{2}\left(x_{1},x_{2},x_{3}\right)=0$

The optical Hamiltonian is chosen as $P=p_{1}^{2}+p_{2}^{2}+p_{3}^{2}-n^{2}\left(x_{1},x_{2},x_{3}\right)=0$

although other choices could be made.^[3]^[4] Hamilton's equations with k = 1, 2, 3 defined above, together with $P=0$ , define the possible light rays.
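As a worked illustration of this choice, inserting $P=p_{1}^{2}+p_{2}^{2}+p_{3}^{2}-n^{2}$ into Hamilton's equations gives ${\frac {dx_{k}}{d\sigma }}={\frac {\partial P}{\partial p_{k}}}=2p_{k}\,,\quad {\frac {dp_{k}}{d\sigma }}=-{\frac {\partial P}{\partial x_{k}}}=2n{\frac {\partial n}{\partial x_{k}}}$ with k = 1, 2, 3. Eliminating the momentum between the two sets of equations yields ${\frac {d^{2}x_{k}}{d\sigma ^{2}}}=4n{\frac {\partial n}{\partial x_{k}}}=2{\frac {\partial \left(n^{2}\right)}{\partial x_{k}}}$ . For this particular Hamiltonian the parameter σ is related to the arc length by $ds=2n\,d\sigma$ , since $\|d\mathbf {x} /d\sigma \|=2\|\mathbf {p} \|=2n$ . In a homogeneous medium the right-hand sides for the momentum vanish, p is constant, and the ray is a straight line.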

Generalized coordinates

As in Hamiltonian mechanics, it is also possible to write the equations of Hamiltonian optics in terms of generalized coordinates $\left(q_{1}\left(\sigma \right),q_{2}\left(\sigma \right),q_{3}\left(\sigma \right)\right)$ , generalized momenta $\left(u_{1}\left(\sigma \right),u_{2}\left(\sigma \right),u_{3}\left(\sigma \right)\right)$ and Hamiltonian P as^[3]^[4]

${\begin{aligned}{\frac {dq_{1}}{d\sigma }}&={\frac {\partial P}{\partial u_{1}}}\quad \quad {\frac {du_{1}}{d\sigma }}=-{\frac {\partial P}{\partial q_{1}}}\\{\frac {dq_{2}}{d\sigma }}&={\frac {\partial P}{\partial u_{2}}}\quad \quad {\frac {du_{2}}{d\sigma }}=-{\frac {\partial P}{\partial q_{2}}}\\{\frac {dq_{3}}{d\sigma }}&={\frac {\partial P}{\partial u_{3}}}\quad \quad {\frac {du_{3}}{d\sigma }}=-{\frac {\partial P}{\partial q_{3}}}\\P&=\mathbf {p} \cdot \mathbf {p} -n^{2}=0\end{aligned}}$ where the optical momentum is given by ${\begin{aligned}\mathbf {p} &=u_{1}\nabla q_{1}+u_{2}\nabla q_{2}+u_{3}\nabla q_{3}\\&=u_{1}\|\nabla q_{1}\|{\frac {\nabla q_{1}}{\|\nabla q_{1}\|}}+u_{2}\|\nabla q_{2}\|{\frac {\nabla q_{2}}{\|\nabla q_{2}\|}}+u_{3}\|\nabla q_{3}\|{\frac {\nabla q_{3}}{\|\nabla q_{3}\|}}\\&=u_{1}a_{1}\mathbf {\hat {e}} _{1}+u_{2}a_{2}\mathbf {\hat {e}} _{2}+u_{3}a_{3}\mathbf {\hat {e}} _{3}\end{aligned}}$ where $a_{k}=\|\nabla q_{k}\|$ and $\mathbf {\hat {e}} _{1}$ , $\mathbf {\hat {e}} _{2}$ and $\mathbf {\hat {e}} _{3}$ are the unit vectors $\mathbf {\hat {e}} _{k}=\nabla q_{k}/\|\nabla q_{k}\|$ . A particular case is obtained when these vectors form an orthonormal basis, that is, they are all perpendicular to each other. In that case, $u_{k}a_{k}/n$ is the cosine of the angle the optical momentum $\mathbf {p}$ makes with unit vector $\mathbf {\hat {e}} _{k}$ .

Related Research Articles

In physics, specifically in electromagnetism, the Lorentz force law is the combination of electric and magnetic force on a point charge due to electromagnetic fields. The Lorentz force, on the other hand, is a physical effect that occurs in the vicinity of electrically neutral, current-carrying conductors causing moving electrical charges to experience a magnetic force.

In vector calculus and differential geometry the generalized Stokes theorem, also called the Stokes–Cartan theorem, is a statement about the integration of differential forms on manifolds, which both simplifies and generalizes several theorems from vector calculus. In particular, the fundamental theorem of calculus is the special case where the manifold is a line segment, Green’s theorem and Stokes' theorem are the cases of a surface in $or and the divergence theorem is the case of a volume in Hence, the theorem is sometimes referred to as the fundamental theorem of multivariate calculus .$

<span class="mw-page-title-main">Navier–Stokes equations</span> Equations describing the motion of viscous fluid substances

The Navier–Stokes equations are partial differential equations which describe the motion of viscous fluid substances. They were named after French engineer and physicist Claude-Louis Navier and the Irish physicist and mathematician George Gabriel Stokes. They were developed over several decades of progressively building the theories, from 1822 (Navier) to 1842–1850 (Stokes).

<span class="mw-page-title-main">Fokker–Planck equation</span> Partial differential equation

In statistical mechanics and information theory, the Fokker–Planck equation is a partial differential equation that describes the time evolution of the probability density function of the velocity of a particle under the influence of drag forces and random forces, as in Brownian motion. The equation can be generalized to other observables as well. The Fokker-Planck equation has multiple applications in information theory, graph theory, data science, finance, economics etc.

The calculus of variations is a field of mathematical analysis that uses variations, which are small changes in functions and functionals, to find maxima and minima of functionals: mappings from a set of functions to the real numbers. Functionals are often expressed as definite integrals involving functions and their derivatives. Functions that maximize or minimize functionals may be found using the Euler–Lagrange equation of the calculus of variations.

In continuum mechanics, the infinitesimal strain theory is a mathematical approach to the description of the deformation of a solid body in which the displacements of the material particles are assumed to be much smaller than any relevant dimension of the body; so that its geometry and the constitutive properties of the material at each point of space can be assumed to be unchanged by the deformation.


References

[1] H. A. Buchdahl, An Introduction to Hamiltonian Optics, Dover Publications, 1993, ISBN 978-0486675978.
[2] Vasudevan Lakshminarayanan et al., Lagrangian Optics, Springer Netherlands, 2011, ISBN 978-0792375821.
[3] Julio Chaves, Introduction to Nonimaging Optics, Second Edition, CRC Press, 2015, ISBN 978-1482206739.
[4] Roland Winston et al., Nonimaging Optics, Academic Press, 2004, ISBN 978-0127597515.
[5] Dietrich Marcuse, Light Transmission Optics, Van Nostrand Reinhold Company, New York, 1972, ISBN 978-0894643057.
[6] Rudolf Karl Luneburg, Mathematical Theory of Optics, University of California Press, Berkeley, CA, 1964, p. 90.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.
