% thesis/Chapter-3/Chapter-3.tex

\chapter{Eigenvector continuation}
\label{chap:Eigenvector_continuation}
\section{Introduction to EC}
First introduced by \cite{Frame:2017fah}, eigenvector continuation (EC) is a powerful method to tackle computationally expensive quantum mechanical problems, despite being cheap and simple to implement. Similar in spirit to Rayleigh--Schr\"odinger perturbation theory (PT, also known simply as perturbation theory), the method aims to approximate solutions to a Hamiltonian of interest, given exact solutions to a set of neighbouring Hamiltonians differing by small deviations, although EC has been shown to greatly outperform PT in terms of convergence~\citep{Frame:2017fah,Sarkar:2020mad}. Recent work~\citep{Bonilla:2022rph,Melendez:2022kid} has shown that EC as a particular reduced-basis method (RBM) falls within a larger class of model-order reduction (MOR) techniques.
Specifically, for a Hamiltonian with parametric dependence $H(c)$, EC enables robust extrapolations to a given target point $c_*$ from ``training data'' away from that point by exploiting information contained in eigenvectors. It can be said that the essence of the system is ``learned'' via the construction of a highly effective (nonorthogonal) basis, leading to a variational calculation of the states of interest, or equivalently, projection of the target Hamiltonian onto a small subspace for rapid diagonalization. The latter approach can be boiled down to constructing projected Hamiltonian and norm matrices (denoted as $H_\text{EC}$ and $N_\text{EC}$, respectively) and solving the generalized eigenvalue problem which has the form $H_\text{EC}\ket{\psi} = \lambda N_\text{EC}\ket{\psi}$.
As stated above, EC works by obtaining eigenstates of a Hamiltonian $H(c)$ with a parametric dependence on a parameter $c$ for several values of that parameter.\footnote{For simplicity, we assume here that there is only one scalar parameter and note that the extension to multiple parameters is straightforward~\citep{Konig:2019adq}.}
The set of parameters $\{c_i\}$ used for this step is referred to as ``training points,'' and the corresponding set of ``training vectors'' $\{\ket{\psi(c_i)}\}$ are used to construct an effective basis within which the problem is subsequently solved for one or more target values of the parameter $c$.
For typical applications of EC, this procedure reduces the dimension of the problem from a large Hilbert space to the small subspace spanned by the training vectors, thereby leading to a vast reduction of the computational cost for each target evaluation. The projection onto this small subspace involves constructing the following Hamiltonian and norm matrices:
%
\begin{align}
\label{eq:EC-H}
\big(H_{\text{EC}}\big)_{ij} &= \braket{\psi(c_i)|H(c_*)|\psi(c_j)} \,, \\
\label{eq:EC-N}
\big(N_{\text{EC}}\big)_{ij} &= \braket{\psi(c_i) | \psi(c_j)} \,.
\end{align}
%
Next, EC involves solving the generalized eigenvalue problem,
%
\begin{equation}
H_{\text{EC}} \ket{\psi(c_*)_{\text{EC}}}
= E_{\text{EC}} \, N_{\text{EC}} \ket{\psi(c_*)_{\text{EC}}} \,,
\label{eq:EC-GEVP}
\end{equation}
%
where we denote the target point as $c_*$.
It can be shown that Eq.~\eqref{eq:EC-GEVP} is equivalent to the variational principle, by phrasing the problem as an optimization problem where the Rayleigh quotient,
\begin{equation}
E_{\text{EC}} = \frac{\bra{\psi(c_*)_{\text{EC}}} H_{\text{EC}} \ket{\psi(c_*)_{\text{EC}}}}{\bra{\psi(c_*)_{\text{EC}}} N_{\text{EC}} \ket{\psi(c_*)_{\text{EC}}}} \, ,
\end{equation}
is minimized with respect to $\ket{\psi(c_*)_{\text{EC}}}$~\citep{ghojogh2023eigenvalue}.
However, considering numerical stability, it is safer to avoid solving the GEVP by orthonormalizing the basis $\{\ket{\psi(c_i)}\}$ beforehand. This is due to the fact that for near-singular $N_{\text{EC}}$, Eq.~\eqref{eq:EC-GEVP} is an ill-posed problem, as is the case when too many training points are sampled from too narrow a region, leading to a near-redundant basis. Orthonormalization can be cheaply carried out using an algorithm such as Gram--Schmidt while dropping any redundant basis vectors in the process. Now, with respect to this new basis, $N_{\text{EC}}=1$, thereby reducing the GEVP to the ordinary eigenvalue problem,
\begin{equation}
H_{\text{EC}} \ket{\psi(c_*)_{\text{EC}}}
= E_{\text{EC}} \ket{\psi(c_*)_{\text{EC}}} \, .
\end{equation}
\ny{Mention extremal eigenvalues.}
\section{A cheap error estimation for EC}
\subsection{Introduction and proof}
Here a cheap uncertainty estimation is derived as an alternative to bootstrapping, which is expensive because it involves re-sampling the training points and solving the generalized eigenvalue problem for each sample. The proposed formula is similar to the one shown in~\cite{sarkar2021selflearning}.
Let $E^\text{EC}$ and $\ket{\psi^\text{EC}}$ be an EC extrapolated eigenvalue and an eigenvector respectively for the target value of the parameter $c=c_*$.
Now consider the expansion of $\ket{\psi^\text{EC}}$ in $\{\ket{\psi_i}\}$, the exact energy eigenbasis of the target Hamiltonian $H(c_*)$, henceforth simply called $H_*$.
\begin{align}
\ket{\psi^\text{EC}} &= \sum_i {a_i\ket{\psi_i}} \\
\left(H_*-E^\text{EC}\right)\ket{\psi^\text{EC}} &= \sum_i {a_i\left(E_i-E^\text{EC}\right)\ket{\psi_i}} \\
\norm{\left(H_*-E^\text{EC}\right)\ket{\psi^\text{EC}}} &= \sqrt{\sum_i {\abssym{a_i}^2 \, \abssym{E_i-E^\text{EC}}^2}}
\label{eq:sd}
\end{align}
If the coefficients $\abssym{a_i}$ were distributed in a Gaussian-like fashion against $E_i$,
the distribution would have a sharp peak for a good extrapolation and would be spread out
otherwise. The right-hand side of Eq.~\eqref{eq:sd} gives the standard deviation of that distribution.
Therefore,
\begin{equation}
\sigma = \norm{(H_*-E^\text{EC})\ket{\psi^\text{EC}}}
\label{eq:formula}
\end{equation}
Eq.~\eqref{eq:formula} is a cheap calculation as it only involves one matrix-vector multiplication,
followed by a vector-vector inner product. Obviously, one would need access to the extrapolated eigenvector $\ket{\psi^\text{EC}}$ and the full Hamiltonian $H_*$. I
claim that the quantity $\sigma$ is a good uncertainty quantifier for EC (or any eigenvector approximation algorithm).
\subsection{Example from FVEC paper}
Note how in Fig.~\ref{fig:FVEC_error}, the exact spectrum is properly
enclosed by the error bands, as opposed to the ``bootstrap'' method, which
failed to do so. However, the large widths of the bands may not give the FVEC
method the credit it deserves.
\begin{figure}
\centering
\includegraphics[width=\textwidth]{Chapter-3/FVEC_error.png}
\caption{Extrapolation error for the two-body system in FVEC paper.}
\label{fig:FVEC_error}
\end{figure}
\subsection{Example of a bad EC model}
\label{sec:AC_error}
In Fig.~\ref{fig:AC_error}, the topmost band veers off course at smaller
volumes due to an avoided crossing, as the model does not include
enough training states.
Unfortunately, the error bands fail to account for that error. However,
these error bands describe the other energy level involved in the
avoided crossing (not shown here).
\begin{figure}
\centering
\includegraphics[width=\textwidth]{Chapter-3/AC_error.png}
\caption{Extrapolation error for a two-body system with an avoided
crossing corresponding to a narrow resonance.}
\label{fig:AC_error}
\end{figure}
\subsection{Alternative interpretation: Upper bounds of error}
Let the extrapolated eigenvector be normalized,
\begin{equation}
\braket{\psi^\text{EC} | \psi^\text{EC}}=\sum_i {\abssym{a_i}^2}=1 \, .
\end{equation}
Define $E'$ to be the \emph{closest} exact eigenvalue to the extrapolated value $E^\text{EC}$. That is,
\begin{equation}
\abssym{E'-E^\text{EC}} \leq \abssym{E_i-E^\text{EC}}
\end{equation}
for all $i$.
Typically this is the eigenvalue of interest, but it could also be a
rogue eigenvalue that comes close and misguides the extrapolation
in bad EC models, such as the one shown in Fig.~\ref{fig:AC_error}.
Now, continuing from Eq.~\eqref{eq:sd},
\begin{align}
\norm{\left(H_*-E^\text{EC}\right)\ket{\psi^\text{EC}}} &= \sqrt{\sum_i {\abssym{a_i}^2 \, \abssym{E_i-E^\text{EC}}^2}} \\
&\geq \sqrt{\sum_i {\abssym{a_i}^2 \, \abssym{E'-E^\text{EC}}^2}} \\
&= \abssym{E'-E^\text{EC}} \sqrt{\sum_i {\abssym{a_i}^2}} \\
&= \abssym{E'-E^\text{EC}}
\end{align}
That is, $\sigma$ is also an upper bound for the extrapolation error for a good EC model.
If not, we can at least guarantee that the error bands would enclose \emph{at least one}
exact eigenvalue.