The trek goes on

0028

Understanding the Bernstein–Vazirani algorithm

Posted at 23:13 on 9 October 2022, and revised at 15:14 on 10 October 2022

After writing 0027, which discussed the Deutsch–Jozsa algorithm, I saw a closely related Bernstein-Vazirani algorithm in Carette et al’s QPL 2021 paper. (By the way, this paper appears to be the most recent and complete development in graphical approaches to quantum algorithms. In particular, the paper treats Grover’s algorithm in full, with a graphical representation of iterations and probabilistic reasoning — I’ll have to catch up.) The Bernstein–Vazirani problem gives us an oracle corresponding to the function $(\cdot s) : \mathbf 2^n \to \mathbf 2$, which is the dot-product function $(\cdot)$ partially applied to $s : \mathbf 2^n$: given an input $x = x_1 x_2 \cdots x_n : \mathbf 2^n$ where each $x_i : \mathbf 2$,

$$ (\cdot s)(x) \defeq x \cdot s \defeq (x_1 \mathop\wedge s_1) \oplus (x_2 \mathop\wedge s_2) \oplus \cdots \oplus (x_n \mathop\wedge s_n) $$

The secret $s$ is unknown to us, and we are required to recover $s$. Following the approach taken in 0027, I tried to lift $(\cdot s)$ to a linear map $\overline{(\cdot s)}$ and evaluate it on the X-basis, but this time it was rather difficult, and eventually I had to fall back on the more conventional understanding of the oracle construction, which is about encoding function outputs as phases. It’s still easier to talk about this encoding purely in linear algebra though.

First, let us recap how the X-basis elements are expressed relative to the Z-basis, for example,

\begin{equation} |\texttt{-+-}\rangle = |\mathtt{000}\rangle - |\mathtt{001}\rangle + |\mathtt{010}\rangle - |\mathtt{011}\rangle - |\mathtt{100}\rangle + |\mathtt{101}\rangle - |\mathtt{110}\rangle + |\mathtt{111}\rangle \label{eq:after-expansion} \end{equation}

This is expanded from

\begin{equation} |\texttt{-+-}\rangle = (|\mathtt 0\rangle - |\mathtt 1\rangle) \otimes (|\mathtt 0\rangle + |\mathtt 1\rangle) \otimes (|\mathtt 0\rangle - |\mathtt 1\rangle) \label{eq:before-expansion} \end{equation}

using the linearity of tensor product, so, for example, the term $-|\mathtt{001}\rangle$ in Equation \ref{eq:after-expansion} comes from $|\mathtt 0\rangle \otimes |\mathtt 0\rangle \otimes -|\mathtt 1\rangle$ (drawing a summand from each of the three terms in Equation \ref{eq:before-expansion}), and $|\mathtt{111}\rangle$ from $-|\mathtt 1\rangle \otimes |\mathtt 1\rangle \otimes -|\mathtt 1\rangle$. Think of $\texttt{-+-}$ as selecting the first and third bits to participate in the decision of the sign of a term; to actually decide the sign of a term $|x_1 x_2 x_3\rangle$ in $|\texttt{-+-}\rangle$, we count the number of $\mathtt 1$ among the selected bits $x_1$ and $x_3$: if it’s even, that means we multiply an even number of $-1$, and the term is positive; otherwise, the term is negative. As a formula:

$$ |\texttt{-+-}\rangle = \sum_{x_1, x_2, x_3 : \mathbf 2} (-1)^{x_1 \oplus x_3} |x_1 x_2 x_3\rangle $$

Note the correspondence between $\texttt{-+-}$ and the exponent expression $x_1 \oplus x_3$, where $x_1$ and $x_3$ are turned ‘on’ (participating in the decision of the sign) by $\texttt-$, and $x_2$ is turned ‘off’ (not appearing in the expression) by $\texttt+$. This turning on and off can be expressed more generally as a dot product

$$ x_1 \oplus x_3 = (x_1 \mathop\wedge \mathtt 1) \oplus (x_2 \mathop\wedge \mathtt 0) \oplus (x_3 \mathop\wedge \mathtt 1) = x \cdot \mathtt{101} $$

If we (somewhat informally) use $H$ to denote the conversion(s) between $\mathtt 0$/$\mathtt 1$ and $\texttt +$/$\texttt -$ and generalise, we can write

\begin{equation} |H(s)\rangle = \sum_{x : \mathbf 2^n} (-1)^{x \cdot s} |x\rangle \label{eq:X-basis-encoding} \end{equation}

where all the outputs of $(\cdot s)$ are encoded in the coefficients, which are called ‘phases’ in quantum theory. (Now we know where the function $(\cdot s)$ in the Bernstein–Vazirani problem comes from, and see that the problem is actually no less artificial —but perhaps more delicately designed— than Deutsch’s problem.)

The Bernstein–Vazirani algorithm can be understood as (i) preparing a superposition state in Equation \ref{eq:X-basis-encoding}, (ii) measuring this state with respect to the X-basis, and (iii) applying $H^{-1}$ to convert the resulting string of $\texttt +$ and $\texttt -$ back to a string of $\mathtt 0$ and $\mathtt 1$, which is $s$. Since such a state is necessarily one of the X-basis elements, which can be perfectly distinguished from the rest of the X-basis, the measurement will return that element with certainty. To a beginner, it may be surprising that we should prepare such an intricate superposition state, which seems difficult to handle, but in this case the measurement with respect to the X-basis acts as a powerful pattern-matching operation that recognises such superposition states and decodes them back to a string, so there is no difficulty.

It turns out that, in general, we can encode a function $f : \mathbf 2^n \to \mathbf 2$ in phases quite easily using $\bar f^{\mathrm T}$. The lifting of $f$ to $\bar f$ can be defined by

$$ \bar f \defeq \sum_{x : \mathbf 2^n} | f(x) \rangle \langle x | $$

whose transpose is

$$ \bar f^{\mathrm T} = \sum_{x : \mathbf 2^n} | x \rangle \langle f(x) | $$

Now, if we apply $\bar f^{\mathrm T}$ to $|\mathtt 0\rangle$, we can select all the summands where $f(x) = \mathtt 0$; we can also apply $\bar f^{\mathrm T}$ to $|\mathtt 1\rangle$ to select all the summands where $f(x) = \mathtt 1$, and put a minus sign in the front. And the sum of the two applications —which is the same as just one application to $|\texttt-\rangle$— is the state we want!

\begin{equation} \bar f^{\mathrm T}|\texttt-\rangle = \bar f^{\mathrm T}|\mathtt 0\rangle - \bar f^{\mathrm T}|\mathtt 1\rangle = \left(\sum_{x : \mathbf 2^n \mathrel\wedge f(x) = 0} |x\rangle\right) - \left(\sum_{x : \mathbf 2^n \mathrel\wedge f(x) = 1} |x\rangle\right) \\ = \sum_{x : \mathbf 2^n} (-1)^{f(x)}|x\rangle \label{eq:general-encoding} \end{equation}

So actually what we need to do in the quantum part of the algorithm is exactly the same as Deutsch–Jozsa; the rest of the development is the same as in 0027.

The Deutsch–Jozsa algorithm can also be explained —probably more concisely— using the phase encoding: if $f$ is constant, then all the terms in Equation \ref{eq:general-encoding} will concentrate in either $\bar f^{\mathrm T}|\mathtt 0\rangle$ or $\bar f^{\mathrm T}|\mathtt 1\rangle$, and the resulting state will be $|\texttt{++}\rangle$ (ignoring the sign); if $f$ is balanced, then the terms will distribute evenly in $\bar f^{\mathrm T}|\mathtt 0\rangle$ and $\bar f^{\mathrm T}|\mathtt 1\rangle$, and the resulting state will be orthogonal to $|\texttt{++}\rangle$. Well, I suppose that the phase-encoding explanation is favoured traditionally for a reason. What’s still different here is that we can talk about how and why the algorithm works purely in linear algebra, and separately from its quantum implementation.