\chapter{Eigenvector continuation}
\label{chap:Eigenvector_continuation}


\section{Introduction to EC}

First introduced by \cite{Frame:2017fah}, eigenvector continuation (EC) is a powerful method for tackling computationally expensive quantum mechanical problems, while being cheap and simple to implement. Similar in spirit to Rayleigh-Schr\"odinger perturbation theory (PT), the method aims to approximate solutions of a Hamiltonian of interest given exact solutions of a set of neighbouring Hamiltonians differing by small deviations, although EC has been shown to greatly outperform PT in terms of convergence~\citep{Frame:2017fah,Sarkar:2020mad}. Recent work~\citep{Bonilla:2022rph,Melendez:2022kid} has shown that EC, as a particular reduced-basis method (RBM), falls within a larger class of model-order reduction (MOR) techniques.

Specifically, for a Hamiltonian with parametric dependence $H(c)$, EC enables robust extrapolations to a given target point $c_*$ from ``training data'' away from that point by exploiting information contained in eigenvectors. The essence of the system is ``learned'' via the construction of a highly effective (nonorthogonal) basis, leading to a variational calculation of the states of interest, or equivalently, a projection of the target Hamiltonian onto a small subspace for rapid diagonalization. The latter approach boils down to constructing projected Hamiltonian and norm matrices (denoted $H_\text{EC}$ and $N_\text{EC}$, respectively) and solving a generalized eigenvalue problem of the form $H_\text{EC}\ket{\psi} = \lambda N_\text{EC}\ket{\psi}$.

As described above, EC works by obtaining eigenstates of a Hamiltonian $H(c)$ with a parametric dependence on a parameter $c$ for several values of that parameter.\footnote{For simplicity, we assume here that there is only one scalar parameter and note that the extension to multiple parameters is straightforward~\citep{Konig:2019adq}.}
The set of parameters $\{c_i\}$ used for this step is referred to as the ``training points,'' and the corresponding set of ``training vectors'' $\{\ket{\psi(c_i)}\}$ is used to construct an effective basis within which the problem is subsequently solved for one or more target values of the parameter $c$.
For typical applications of EC, this procedure reduces the dimension of the problem from a large Hilbert space to the small subspace spanned by the training vectors, thereby vastly reducing the computational cost of each target evaluation. The projection onto this small subspace involves constructing the following Hamiltonian and norm matrices:
%
\begin{align}
\label{eq:EC-H}
\big(H_{\text{EC}}\big)_{ij} &= \braket{\psi(c_i)|H(c_*)|\psi(c_j)} \,, \\
\label{eq:EC-N}
\big(N_{\text{EC}}\big)_{ij} &= \braket{\psi(c_i) | \psi(c_j)} \,.
\end{align}
%
Next, EC involves solving the generalized eigenvalue problem,
%
\begin{equation}
H_{\text{EC}} \ket{\psi(c_*)_{\text{EC}}}
= E_{\text{EC}} \, N_{\text{EC}} \ket{\psi(c_*)_{\text{EC}}} \,,
\label{eq:EC-GEVP}
\end{equation}
%
where we denote the target point as $c_*$.
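
As a concrete illustration, the following minimal sketch (in Python with NumPy/SciPy, assuming an affine parameter dependence $H(c) = H_0 + c\,H_1$; the matrices \texttt{H0} and \texttt{H1}, the training points \texttt{cs}, and the target \texttt{c\_star} are placeholders) builds $H_\text{EC}$ and $N_\text{EC}$ from Eqs.~\eqref{eq:EC-H} and \eqref{eq:EC-N} and solves Eq.~\eqref{eq:EC-GEVP}:
\begin{verbatim}
import numpy as np
from scipy.linalg import eigh

def train_vectors(H0, H1, cs):
    """Columns are ground-state eigenvectors of H(c) = H0 + c*H1
    at the training points cs (excited states work the same way)."""
    return np.column_stack(
        [np.linalg.eigh(H0 + c * H1)[1][:, 0] for c in cs])

def ec_solve(H0, H1, cs, c_star):
    X = train_vectors(H0, H1, cs)     # training vectors
    H_star = H0 + c_star * H1         # target Hamiltonian
    H_ec = X.conj().T @ H_star @ X    # projected Hamiltonian matrix
    N_ec = X.conj().T @ X             # norm matrix
    E, a = eigh(H_ec, N_ec)           # generalized eigenvalue problem
    return E[0], X @ a[:, 0]          # EC energy and (unnormalized) vector
\end{verbatim}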

It can be shown that Eq.~\eqref{eq:EC-GEVP} is equivalent to the variational principle by phrasing the problem as an optimization in which the Rayleigh quotient,
\begin{equation}
E_{\text{EC}} = \frac{\bra{\psi(c_*)_{\text{EC}}} H_{\text{EC}} \ket{\psi(c_*)_{\text{EC}}}}{\bra{\psi(c_*)_{\text{EC}}} N_{\text{EC}} \ket{\psi(c_*)_{\text{EC}}}} \, ,
\end{equation}
is minimized with respect to $\ket{\psi(c_*)_{\text{EC}}}$~\citep{ghojogh2023eigenvalue}.

For numerical stability, however, it is safer to avoid solving the GEVP directly by orthonormalizing the basis $\{\ket{\psi(c_i)}\}$ beforehand. For a near-singular $N_{\text{EC}}$, Eq.~\eqref{eq:EC-GEVP} is an ill-posed problem, as is the case when too many training points are sampled from too narrow a region, leading to a near-redundant basis. Orthonormalization can be carried out cheaply using an algorithm such as Gram--Schmidt, dropping any redundant basis vectors in the process. With respect to this new basis, $N_{\text{EC}} = 1$, reducing the GEVP to the ordinary eigenvalue problem,
\begin{equation}
H_{\text{EC}} \ket{\psi(c_*)_{\text{EC}}}
= E_{\text{EC}} \ket{\psi(c_*)_{\text{EC}}} \, .
\end{equation}
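
A stabilized variant can be sketched along the same lines (a rank-revealing pivoted QR decomposition plays the role of Gram--Schmidt here; \texttt{train\_vectors} is the placeholder helper from the previous sketch, and the drop tolerance is an assumption to be tuned):
\begin{verbatim}
from scipy.linalg import qr

def orthonormal_basis(X, tol=1e-10):
    """Orthonormalize the columns of X, dropping near-redundant
    directions flagged by small diagonal entries of R."""
    Q, R, _ = qr(X, mode='economic', pivoting=True)
    keep = np.abs(np.diag(R)) > tol * np.abs(R[0, 0])
    return Q[:, keep]

def ec_solve_stable(H0, H1, cs, c_star):
    Q = orthonormal_basis(train_vectors(H0, H1, cs))
    H_ec = Q.conj().T @ (H0 + c_star * H1) @ Q
    E, a = np.linalg.eigh(H_ec)     # ordinary problem: N_EC = 1
    return E[0], Q @ a[:, 0]
\end{verbatim}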

Note that this construction is variational, so it applies in particular to extremal eigenvalues: the lowest eigenvalue of the projected problem is an upper bound on the exact ground-state energy of $H(c_*)$, and by the Hylleraas--Undheim (MacDonald) theorem the higher eigenvalues bound the corresponding excited-state energies from above.

\section{A cheap error estimation for EC}

\subsection{Introduction and proof}

Here, a cheap uncertainty estimate is derived as an alternative to bootstrapping, which requires re-sampling the training points and solving the generalized eigenvalue problem each time. The proposed formula is similar to the one shown in~\cite{sarkar2021selflearning}.

Let $E^\text{EC}$ and $\ket{\psi^\text{EC}}$ be an EC-extrapolated eigenvalue and eigenvector, respectively, for the target value of the parameter $c=c_*$. Now consider the expansion of $\ket{\psi^\text{EC}}$ in $\{\ket{\psi_i}\}$, the exact energy eigenbasis of the target Hamiltonian $H(c_*)$, henceforth simply called $H_*$:
\begin{align}
\ket{\psi^\text{EC}} &= \sum_i {a_i\ket{\psi_i}} \,, \\
\left(H_*-E^\text{EC}\right)\ket{\psi^\text{EC}} &= \sum_i {a_i\left(E_i-E^\text{EC}\right)\ket{\psi_i}} \,, \\
\norm{\left(H_*-E^\text{EC}\right)\ket{\psi^\text{EC}}} &= \sqrt{\sum_i {\abssym{a_i}^2 \, \abssym{E_i-E^\text{EC}}^2}} \,.
\label{eq:sd}
\end{align}

If the coefficients $\abssym{a_i}$, viewed as a Gaussian-like distribution over the exact energies $E_i$, have a sharp peak, the extrapolation is good; a spread-out distribution signals a poor one. The right-hand side of Eq.~\eqref{eq:sd} gives the standard deviation of that distribution. Therefore,
\begin{equation}
\sigma = \norm{(H_*-E^\text{EC})\ket{\psi^\text{EC}}} \,.
\label{eq:formula}
\end{equation}

Eq.~\eqref{eq:formula} is cheap to evaluate, as it involves only one matrix--vector multiplication followed by a vector--vector inner product. One does, of course, need access to the extrapolated eigenvector $\ket{\psi^\text{EC}}$ and the full Hamiltonian $H_*$. We claim that the quantity $\sigma$ is a good uncertainty quantifier for EC (or for any eigenvector approximation algorithm).
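
Computed directly, Eq.~\eqref{eq:formula} amounts to a few lines of code (a minimal sketch in the same placeholder conventions as above, assuming \texttt{psi\_ec} is stored in the full Hilbert-space basis):
\begin{verbatim}
def ec_sigma(H_star, E_ec, psi_ec):
    """Residual norm sigma = ||(H_* - E_EC)|psi_EC>|| for a
    normalized extrapolated eigenvector psi_ec."""
    psi_ec = psi_ec / np.linalg.norm(psi_ec)
    return np.linalg.norm(H_star @ psi_ec - E_ec * psi_ec)
\end{verbatim}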

\subsection{Example from the FVEC paper}

Note how in Fig.~\ref{fig:FVEC_error} the exact spectrum is properly enclosed by the error bands, in contrast to the ``bootstrap'' method, which fails to do so. However, the large widths of the bands may not give the FVEC method the credit it deserves.

\begin{figure}
\centering
\includegraphics[width=\textwidth]{Chapter-3/FVEC_error.png}
\caption{Extrapolation error for the two-body system in the FVEC paper.}
\label{fig:FVEC_error}
\end{figure}

\subsection{Example of a bad EC model}
\label{sec:AC_error}

In Fig.~\ref{fig:AC_error}, the topmost band veers off course at smaller volumes due to an avoided crossing, because the model does not include enough training states. Unfortunately, the error bands fail to account for that error. They do, however, describe the other energy level involved in the avoided crossing (not shown here).

\begin{figure}
\centering
\includegraphics[width=\textwidth]{Chapter-3/AC_error.png}
\caption{Extrapolation error for a two-body system with an avoided crossing corresponding to a narrow resonance.}
\label{fig:AC_error}
\end{figure}

\subsection{Alternative interpretation: Upper bound on the error}

Assume that the extrapolated eigenvector is normalized,
\begin{equation}
\braket{\psi^\text{EC} | \psi^\text{EC}}=\sum_i {\abssym{a_i}^2}=1 \,.
\end{equation}

Define $E'$ to be the \emph{closest} exact eigenvalue to the extrapolated value $E^\text{EC}$, that is,
\begin{equation}
\abssym{E'-E^\text{EC}} \leq \abssym{E_i-E^\text{EC}}
\end{equation}
for all $i$.
Typically this is the eigenvalue of interest, but it could also be a rogue eigenvalue that comes close and misguides the extrapolation in bad EC models, such as the one shown in Fig.~\ref{fig:AC_error}. Now, continuing from Eq.~\eqref{eq:sd},
\begin{align}
\norm{\left(H_*-E^\text{EC}\right)\ket{\psi^\text{EC}}} &= \sqrt{\sum_i {\abssym{a_i}^2 \, \abssym{E_i-E^\text{EC}}^2}} \\
&\geq \sqrt{\sum_i {\abssym{a_i}^2 \, \abssym{E'-E^\text{EC}}^2}} \\
&= \abssym{E'-E^\text{EC}} \sqrt{\sum_i {\abssym{a_i}^2}} \\
&= \abssym{E'-E^\text{EC}} \, .
\end{align}

That is, $\sigma$ is also an upper bound on the extrapolation error for a good EC model. If the model is bad, we can at least guarantee that the error bars enclose \emph{at least one} exact eigenvalue.
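
This bound is easy to verify numerically. In the following sketch (with a random symmetric matrix standing in for $H_*$ and an arbitrary normalized trial vector standing in for $\ket{\psi^\text{EC}}$; nothing here is specific to EC), $\sigma$ always dominates the distance from $E^\text{EC}$ to the nearest exact eigenvalue:
\begin{verbatim}
rng = np.random.default_rng(0)
A = rng.standard_normal((50, 50))
H_star = (A + A.T) / 2             # stand-in "target Hamiltonian"
psi = rng.standard_normal(50)
psi /= np.linalg.norm(psi)         # arbitrary normalized trial vector
E_ec = psi @ H_star @ psi          # its Rayleigh quotient
sigma = np.linalg.norm(H_star @ psi - E_ec * psi)
E_exact = np.linalg.eigvalsh(H_star)
assert np.min(np.abs(E_exact - E_ec)) <= sigma
\end{verbatim}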