\section{Theoretical Background}
The following subchapters supply the reader with the theoretical foundations of digital signal processing needed to understand the subsequent implementation of \ac{ANR} on a low-power signal processor.\\ \\
The chapter begins with a description of signals, the problem of signal interference and the basics of digital signal processing in general, covering fundamental topics such as signal representation, transfer functions and filters.\\
Filters are used in various functional designs, therefore a short explanation of the concepts of Finite Impulse Response and Infinite Impulse Response filters is indispensable.\\
Afterwards, an introduction to adaptive noise reduction follows, including a short overview of the most important milestones in its history, the general concept of \ac{ANR}, its design possibilities and its optimization possibilities with regard to error calculation.\\
With this knowledge covered, a realistic signal flow diagram of an implanted \ac{CI} system with the corresponding transfer functions is designed, which is essential for implementing \ac{ANR} on a low-power digital signal processor.\\
At the end of chapter two, high-level Python simulations serve as a practical demonstration of the previously presented theoretical background.\\ \\
Throughout this thesis, sampled signals are denoted in lowercase with square brackets (e.g. $x[n]$) to distinguish them from time-continuous signals (e.g. $x(t)$). Vectors are notated in lowercase bold font, whereas matrices are notated in uppercase bold font. Scalars are notated in normal lowercase font.\\
\subsection{Signals and signal interference}
A signal is a physical quantity (e.g. pressure, voltage) changing its value over time. Whereas in nature a signal is always analog, meaning continuous in both time and amplitude, a digital signal is represented in discrete form, sampled at specific time intervals and quantized to finite amplitude levels.\\ \\
The term ``signal interference'' describes the overlapping of unwanted signals or noise with the desired signal, degrading the overall quality and intelligibility of the processed information. A simple example of signal interference is shown in Figure \ref{fig:fig_interference} - the noisy signal (top) consists of several signals of different frequencies, representing both the desired signal and unwanted noise. The cleaned signal (bottom) shows the desired signal after the unwanted frequencies have been removed by a filter.\\ \\
\begin{figure}[H]
\centering
\includegraphics[width=0.8\linewidth]{Bilder/fig_interference.jpg}
\caption{Noisy signal containing different frequencies and cleaned signal. \cite{source_dsp_ch1}}
\label{fig:fig_interference}
\end{figure}
\noindent In cochlear implant systems, speech signals must be reconstructed with high spectral precision to ensure intelligibility for the user. As signal interference can cause considerable degradation of the quality of the final audio signal, the objective of this thesis is the improvement of implant technology with regard to adaptive noise reduction.
\subsection{Fundamentals of digital signal processing}
Digital signal processing describes the manipulation of digital signals on a \ac{DSP} through mathematical operations. Analog signals have to be digitized before they can be handled by a \ac{DSP}.
\subsubsection{Signal conversion and representation}
\begin{figure}[H]
\centering
\includegraphics[width=0.8\linewidth]{Bilder/fig_dsp.jpg}
\caption{Block diagram of processing an analog input signal to an analog output signal with digital signal processing in between. \cite{source_dsp_ch1}}
\label{fig:fig_dsp}
\end{figure}
Before digital signal processing can be applied to an analog signal like the human voice, several steps are required. The analog signal, continuous in both time and amplitude, is passed through an initial filter, which limits the frequency bandwidth. An analog-digital converter then samples and quantizes the signal into a digital form, now discrete in time and amplitude. This digital signal can then be processed, before (possibly) being converted back to an analog signal (refer to Figure \ref{fig:fig_dsp}). The sampling rate defines how many samples per second are taken from the analog signal - a higher sampling rate delivers a more accurate digital representation of the signal but also uses more resources. According to the Nyquist–Shannon sampling theorem, the sampling rate must be at least twice the highest frequency component present in the signal to avoid aliasing (refer to Figure \ref{fig:fig_nyquist}). Aliasing describes the phenomenon that high-frequency parts of a signal are misinterpreted if the sampling rate is too low: the digitized signal then contains low frequencies which do not occur in the original signal.
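To make the sampling theorem tangible, the following minimal Python sketch samples an assumed 7 kHz tone at two illustrative rates - one above and one below the Nyquist rate. All values are assumptions chosen for demonstration, not parameters of the later implementation.

\begin{verbatim}
import numpy as np

f_signal = 7000.0      # tone frequency in Hz (assumed)
t_end = 0.002          # observation window of 2 ms

def sample_tone(fs):
    """Sample the tone at rate fs and return (t, x)."""
    t = np.arange(0.0, t_end, 1.0 / fs)
    return t, np.sin(2.0 * np.pi * f_signal * t)

# 48 kHz satisfies the Nyquist criterion (48 kHz > 2 * 7 kHz).
t_ok, x_ok = sample_tone(48000.0)

# 10 kHz violates it: the 7 kHz tone aliases to |10 - 7| kHz = 3 kHz,
# a low frequency that does not occur in the original signal.
t_bad, x_bad = sample_tone(10000.0)
\end{verbatim}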
\begin{figure}[H]
\centering
\includegraphics[width=0.8\linewidth]{Bilder/fig_nyquist.jpg}
\caption{Adequate (top) and inadequate (bottom) sampling frequency of a signal. \cite{source_dsp_ch1}}
\label{fig:fig_nyquist}
\end{figure}
\noindent The discrete digital signal can be viewed as a finite sequence of samples, with each amplitude being a discrete value, like a 16- or 32-bit integer. A signal vector of length N, containing N samples, is therefore notated as
\begin{equation}
\label{equation1}
\textbf{x} = [x[n-N+1],x[n-N+2],...,x[n-1],x[n]]
\end{equation}
where $x[n]$ is the current sample and $x[n-1]$ is the preceding sample.
\subsubsection{Time domain vs. frequency domain}
A signal (either analog or digital) can be displayed and analyzed in two ways: in the time domain and in the frequency domain. The time domain shows the amplitude of the signal over time - like the sine waves in Figure \ref{fig:fig_interference}. If a Fast Fourier Transform (FFT) is applied to the signal in the time domain, we receive the same signal in the frequency domain, now showing the spectral power present in the signal (refer to Figure \ref{fig:fig_fft}).\\ \\
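As a brief illustration, the following Python sketch transforms a sampled signal from the time domain into the frequency domain using NumPy's FFT routines; the sampling rate and the two signal components are assumptions made for demonstration.

\begin{verbatim}
import numpy as np

fs = 16000                                       # assumed sampling rate in Hz
n = np.arange(1024)                              # 1024 samples
x = (np.sin(2 * np.pi * 440 * n / fs)            # assumed 440 Hz component
     + 0.5 * np.sin(2 * np.pi * 3000 * n / fs))  # assumed 3 kHz component

spectrum = np.fft.rfft(x)                    # one-sided FFT of the real signal
freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)  # frequency axis in Hz
power = np.abs(spectrum) ** 2                # spectral power per frequency bin

peak = freqs[np.argmax(power)]               # strongest component, near 440 Hz
\end{verbatim}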
\begin{figure}[H]
\centering
\includegraphics[width=0.8\linewidth]{Bilder/fig_fft.jpg}
\caption{Sampled digital signal in the time domain and in the frequency domain. \cite{source_dsp_ch1}}
\label{fig:fig_fft}
\end{figure}
\subsubsection{Transfer functions and filters}
When we discuss signals in a mathematical way, we need to explain the term ``transfer function''. A transfer function is a mathematical representation of an abstract system that describes how an input signal is transformed into an output signal. This could mean a simple amplification or a phase shift applied to an input signal.
\begin{figure}[H]
\centering
\includegraphics[width=0.8\linewidth]{Bilder/fig_transfer.jpg}
\caption{Simple representation of a transfer function taking a noisy input signal and delivering a clean output signal. \cite{source_dsp_ch1}}
\label{fig:fig_transfer}
\end{figure}
\noindent In digital signal processing, especially in the design of a noise reduction algorithm, transfer functions are essential for modeling and analyzing filters, amplifiers, and the pathway of the signal itself. By understanding a system’s transfer function, one can predict how sound signals are altered and therefore how filter parameters can be adapted to deliver the desired output signal.\\ \\
During the description of transfer functions, the term ``filter'' was used but not yet defined. A filter can be understood as a component in signal processing, designed to modify or extract specific parts of a signal by selectively allowing certain frequency ranges to pass while attenuating others. Filters can be static, meaning they always extract the same portion of a signal, or adaptive, meaning they change their filtering behavior over time according to their environment. Examples of static filters include low-pass, high-pass, band-pass and band-stop filters, each tailored to isolate or remove particular frequency content (refer to Figure \ref{fig:fig_lowpass}).
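The following Python sketch approximates such a static filter with SciPy, modeled on the second order Butterworth low-pass from Figure \ref{fig:fig_lowpass}; the 16 kHz sampling rate is an assumption made for illustration.

\begin{verbatim}
import numpy as np
from scipy.signal import butter, lfilter

fs = 16000                                      # assumed sampling rate in Hz
b, a = butter(N=2, Wn=3400, btype="low", fs=fs)

# By definition, a Butterworth filter attenuates a tone at the cutoff
# frequency f_c to 1/sqrt(2), i.e. roughly 70 % of its amplitude.
n = np.arange(1600)
x = np.sin(2 * np.pi * 3400 * n / fs)           # tone exactly at f_c
y = lfilter(b, a, x)                            # settles to ~0.707 amplitude
\end{verbatim}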
\begin{figure}[H]
\centering
\includegraphics[width=0.8\linewidth]{Bilder/fig_lowpass.jpg}
\caption{Behavior of a second order Butterworth low-pass-filter. At the highlighted frequency $f_c$ of 3400 Hz, the amplitude of the incoming signal is reduced to 70\%. \cite{source_dsp_ch2}}
\label{fig:fig_lowpass}
\end{figure}
\subsection{Filter designs}
Before we continue with the introduction to the actual topic of this thesis, adaptive noise reduction, two essential filter designs need further explanation - the Finite Impulse Response and the Infinite Impulse Response filter.
\subsubsection{Finite Impulse Response filters}
A \ac{FIR} filter, commonly referred to as a ``Feedforward Filter'', is defined by the property that it uses only input values and no feedback from output samples to determine its filtering behavior - therefore, if the input signal is reduced to zero, the response of a \ac{FIR} filter reaches zero after a finite number of samples.\\ \\
Equation \ref{equation_fir} specifies the input-output relationship of a \ac{FIR} filter - $x[n]$ is the input sample, $y[n]$ is the output sample, $b_0$ to $b_M$ are the filter coefficients and $M$ is the filter order:
\begin{equation}
\label{equation_fir}
y[n] = \sum_{k=0}^{M} b_kx[n-k] = b_0x[n] + b_1x[n-1] + ... + b_Mx[n-M]
\end{equation}
Figure \ref{fig:fig_fir} visualizes a simple \ac{FIR} filter with three coefficients - the current sample is multiplied by the coefficient $b_0$, whereas the delayed samples are multiplied by the coefficients $b_1$ and $b_2$ before being summed. The operator $z^{-1}$ represents a delay of one sample.
As there are three coefficients present in the filter, three samples are needed before the filter response is complete.
\begin{figure}[H]
\centering
\includegraphics[width=0.8\linewidth]{Bilder/fig_fir.jpg}
\caption{\ac{FIR} filter example with three feedforward operators.}
\label{fig:fig_fir}
\end{figure}
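A minimal Python sketch of this three-coefficient structure is given below; the moving-average coefficients are illustrative assumptions. Feeding in a unit impulse confirms that the response is finite and ends after three samples.

\begin{verbatim}
import numpy as np

def fir_filter(x, b):
    """Direct-form FIR filter: y[n] = sum_k b[k] * x[n-k]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        for k in range(len(b)):
            if n - k >= 0:          # samples before n = 0 are taken as zero
                y[n] += b[k] * x[n - k]
    return y

b = [1/3, 1/3, 1/3]                 # b0, b1, b2: a three-sample average
impulse = np.zeros(8)
impulse[0] = 1.0
h = fir_filter(impulse, b)          # [1/3, 1/3, 1/3, 0, 0, ...] - finite
\end{verbatim}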
\subsubsection{Infinite Impulse Response filters}
An \ac{IIR} filter, commonly referred to as a ``Feedback Filter'', can be seen as an extension of the \ac{FIR} filter. In contrast to its counterpart, it also uses past output samples in addition to current input samples to determine its filtering behavior - therefore, the response of an \ac{IIR} filter theoretically continues indefinitely, even if the input signal is reduced to zero.\\ \\
Equation \ref{equation_iir} specifies the input-output relationship of an \ac{IIR} filter. In addition to Equation \ref{equation_fir}, a second term is now included, where $a_1$ to $a_N$ are the feedback coefficients with their own feedback order $N$:
\begin{equation}
\label{equation_iir}
y[n] = \sum_{k=0}^{M} b_kx[n-k] - \sum_{k=1}^{N} a_ky[n-k] = b_0x[n] + ... + b_Mx[n-M] - a_1y[n-1] - ... - a_Ny[n-N]
\end{equation}
Figure \ref{fig:fig_iir} visualizes a simple \ac{IIR} filter with two feedforward coefficients and two feedback coefficients. The current input sample is multiplied by $b_0$ and passes through the adder, while the delayed input sample is multiplied by $b_1$. The output of the adder is fed back through the delay elements, weighted by the two feedback coefficients and subtracted at the adder again. Due to this feedback path, the response of this exemplary \ac{IIR} filter decays over time but never terminates completely.
\begin{figure}[H]
\centering
\includegraphics[width=0.8\linewidth]{Bilder/fig_iir.jpg}
\caption{\ac{IIR} filter example with two feedforward operators and two feedback operators.}
\label{fig:fig_iir}
\end{figure}
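The corresponding Python sketch below implements this structure with two feedforward and two feedback coefficients; the coefficient values are illustrative assumptions chosen such that the feedback loop remains stable.

\begin{verbatim}
import numpy as np

def iir_filter(x, b, a):
    """y[n] = sum_k b[k]*x[n-k] - sum_k a[k]*y[n-1-k], with a = [a1, ..., aN]."""
    y = np.zeros(len(x))
    for n in range(len(x)):
        for k in range(len(b)):         # feedforward part
            if n - k >= 0:
                y[n] += b[k] * x[n - k]
        for k in range(len(a)):         # feedback part
            if n - k - 1 >= 0:
                y[n] -= a[k] * y[n - k - 1]
    return y

b = [0.5, 0.5]                  # feedforward coefficients b0, b1 (assumed)
a = [-0.4, 0.1]                 # feedback coefficients a1, a2 (assumed, stable)
impulse = np.zeros(16)
impulse[0] = 1.0
h = iir_filter(impulse, b, a)   # decays over time but never exactly terminates
\end{verbatim}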
\subsubsection{\ac{FIR} vs. \ac{IIR} filters}
Because there is no feedback, a \ac{FIR} filter offers unconditional stability, meaning that the filter response always converges, no matter how the coefficients are set. The disadvantage of the \ac{FIR} design is its comparatively gradual frequency response: a considerably higher number of coefficients is needed to achieve a sharp frequency response compared to its Infinite Impulse Response counterpart.\\ \\
The recursive nature of an \ac{IIR} filter, by contrast, allows achieving a sharp frequency response with significantly fewer coefficients than an equivalent \ac{FIR} filter, but it also opens up the possibility that the filter response diverges, depending on the chosen coefficients.\\ \\
A higher number of coefficients also implies that the filter needs more time to complete its signal response, as the group delay is increased.
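The following SciPy sketch illustrates this trade-off under assumed design parameters (3400 Hz cutoff, 16 kHz sampling rate): the second order \ac{IIR} design gets by with six coefficients, while a \ac{FIR} design with a comparably sharp response requires considerably more taps and thus a larger group delay.

\begin{verbatim}
from scipy.signal import butter, firwin

fs = 16000                                               # assumed sampling rate
b_iir, a_iir = butter(N=2, Wn=3400, btype="low", fs=fs)  # 3 + 3 coefficients
b_fir = firwin(numtaps=51, cutoff=3400, fs=fs)           # 51 coefficients

# The FIR variant is unconditionally stable but introduces a group delay
# of (numtaps - 1) / 2 = 25 samples; the IIR variant is cheaper, but badly
# chosen coefficients can make its response diverge.
\end{verbatim}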
\subsection{Introduction to Adaptive Noise Reduction}
\subsubsection{History}
The necessity for the use of electric filters first arose at the beginning of the 20th century with the development of the young fields of tele- and radio-communication. At this time, engineers used static filters, like low- or high-pass filters, to improve transmission quality - these fundamental techniques allowed limiting the frequency spectrum by cutting out certain frequencies like high-pitched noises or humming. From this time on, the development of new filter designs accelerated, for example with the LC-filter developed shortly afterwards by Otto Zobel, an American scientist working at the telecommunication company AT\&T. Until then, the filters in use were static, meaning they did not change their behavior over time.\\ \\
In the 1930s, the first real concept of active noise cancellation was proposed by the German physician Paul Lueg. Lueg patented the idea of two speakers emitting antiphase signals which cancel each other out. Though his patent was granted in 1936, at the time there was no technical possibility to detect and process audio signals in a way that would make his noise cancellation actually work in a technical environment.\\ \\
Twenty years after Lueg's patent, Lawrence Fogel patented a practical concept of noise cancellation intended for noise suppression in aviation - this time, the technical circumstances of the 1950s enabled the development of an aviation headset, lowering the overall noise experienced by pilots in the cockpit of a helicopter or an airplane by emitting a 180 degree phase-shifted copy of the recorded background noise of the cockpit into the pilots' headset (see Figure \ref{fig:fig_patent}).
\begin{figure}[H]
\centering
\includegraphics[width=0.7\linewidth]{Bilder/fig_patent.jpg}
\caption{Reconstruction of Lawrence Fogel's patent in 1960. \cite{source_patent}}
\label{fig:fig_patent}
\end{figure}
\noindent In contrast to the static filters from the beginning of the century, the active noise cancellation of Lueg and Fogel was far more advanced than just reducing a signal by a specific frequency portion, yet this technique still has its limitations, as it is designed to work only within a certain environment.\\ \\
With the rapid advancement of digital signal processing technologies, noise cancellation techniques evolved from static, hardware-based filters and physical soundwave cancellation towards more sophisticated approaches. In the 1970s, the concept of digital adaptive filtering arose, allowing digital filters to adjust their parameters in real-time based on the characteristics of the incoming signal and noise. This marked a significant leap forward, as it enabled systems to deal with dynamic and unpredictable noise environments - the concept of adaptive noise reduction was born.
\subsubsection{The concept of adaptive filtering}
Adaptive noise reduction describes an advanced filtering method based on an error metric and represents a significant advancement over the earlier methods by allowing the filter parameters to continuously adapt to the changing acoustic environment in real-time. This adaptability makes \ac{ANR} particularly suitable for hearing devices, where environmental noise characteristics vary constantly.\\ \\
Static filters, like the low- and high-pass filters described in the previous chapter, feature coefficients that remain constant over time. They are designed for known, predictable noise conditions (e.g., removing a steady 50 Hz hum). While these filters are efficient and easy to implement, they fail when noise characteristics change dynamically.\\ \\
Although active noise cancellation and adaptive noise reduction share obvious similarities, they differ fundamentally in their application and signal structure. While active noise cancellation aims to physically cancel noise in the acoustic domain - typically before, or at the time, the signal reaches the ear - \ac{ANR} operates within the signal processing chain, attempting to remove the noise component from the digital signal. In cochlear implant systems, the latter is more practical because the acoustic waveform is converted into electrical stimulation signals; thus, signal-domain filtering is the only feasible approach.
\begin{figure}[H]
\centering
\includegraphics[width=0.8\linewidth]{Bilder/fig_anr.jpg}
\caption{The basic idea of an adaptive filter design for noise reduction.}
\label{fig:fig_anr}
\end{figure}
\noindent Figure \ref{fig:fig_anr} shows the basic concept of an adaptive filter design, represented through a feedback filter application. The primary sensor (top) aims to receive the desired signal and outputs the corrupted signal $d[n]$, which consists of the recorded desired signal $s[n]$ and the recorded corruption noise signal $n[n]$, whereas the secondary sensor aims to receive (ideally) only the noise signal and outputs the recorded reference noise signal $x[n]$, which then feeds the adaptive filter. We assume at this point that the corruption noise signal is uncorrelated with the desired signal, and therefore separable from it. In addition, we assume that the corruption noise signal is correlated with the reference noise signal, as it originates from the same source but takes a different signal path.\\ \\ The adaptive filter removes a certain, noise-related frequency part of the input signal and re-evaluates the output through its feedback design. The filter parameters are then adjusted and applied to the next sample to minimize the observed error $e[n]$, which also represents the approximated desired signal $š[n]$. In reality, a signal contamination of the two sensors has to be expected, which will be illustrated in a more realistic signal flow diagram of an implanted \ac{CI} system in chapter 2.6.
\subsubsection{Fully adaptive vs. hybrid filter design}
The basic \ac{ANR} concept illustrated in Figure \ref{fig:fig_anr} can be understood as a fully adaptive variant. A fully adaptive filter design works with a fixed number of coefficients, each of which is updated after every processed sample. Even though this approach features the best noise reduction performance, it also requires a relatively high amount of computing power, as every coefficient has to be re-calculated after every evaluation step.\\ \\
To reduce the required computing power, a hybrid static/adaptive filter design can be considered instead (refer to Figure \ref{fig:fig_anr_hybrid}). In this approach, the initial fully adaptive filter is split into a fixed and an adaptive part - the static filter removes a certain known, or estimated, frequency portion of the noise signal, whereas the adaptive part only has to adapt to the remaining, unforeseeable noise parts. This approach reduces the number of coefficients that have to be adapted, therefore lowering the required computing power.
\begin{figure}[H]
\centering
\includegraphics[width=0.8\linewidth]{Bilder/fig_anr_hybrid.jpg}
\caption{Hybrid adaptive filter design for noise reduction with a static part and an adaptive part.}
\label{fig:fig_anr_hybrid}
\end{figure}
\noindent Different approaches to the hybrid static/adaptive filter design will be evaluated and compared with regard to their required computing power in a later chapter of this thesis.
\subsection{Adaptive optimization strategies}
In the description of the concept of adaptive filtering above, the adaptation of filter coefficients based on an error metric was mentioned but not further explained. The following subchapters cover the most important aspects of filter optimization with regard to adaptive noise reduction.
\subsubsection{Filter optimization and error metrics}
Adaptive filters rely on an error metric to autonomously evaluate their performance in real-time, continuously adjusting their coefficients to minimize the received error signal $e[n]$, which is defined as:
\begin{equation}
\label{equation_error}
e[n] = d[n] - y[n] = š[n]
\end{equation}
The error signal $e[n]$, already illustrated in Figures \ref{fig:fig_anr} and \ref{fig:fig_anr_hybrid}, is calculated as the difference between the corrupted signal $d[n]$ and the output signal of the filter $y[n]$.
As we will see in the following chapters, a real-world application of an adaptive filter system poses several challenges which have to be taken into consideration when designing the filter. These challenges include:
\begin{itemize}
\item The error signal $e[n]$ is not a perfect representation of the desired signal $s[n]$ present in the corrupted signal $d[n]$, as the adaptive filter can only approximate the noise signal based on its current coefficients, which, in general, do not represent the optimal solution at that given time.
\item Although the corruption noise signal $n[n]$ and the reference noise signal $x[n]$ are correlated, they are not identical, as they take different signal paths from the noise source to their respective sensors. This discrepancy can lead to imperfect noise reduction, as the adaptive filter has to estimate the relationship between these two signals.
\item The desired signal $s[n]$ is not directly available, as it only occurs combined with the corruption noise signal $n[n]$ in the form of $d[n]$, while no reference of it is available. Therefore, the error signal $e[n]$, respectively $š[n]$, of the adaptive filter serves as an approximation of the desired signal and is used as an indirect measure of the filter's performance, guiding the adaptation process by its own stepwise minimization.
\item The reference noise signal $x[n]$ fed into the adaptive filter could also be contaminated with parts of the desired signal. If this circumstance occurs and is not handled properly, it can lead to the undesired removal of parts of the desired signal from the output signal $š[n]$.
\end{itemize}
The goal of the adaptive filter is therefore to minimize this error signal over time, thereby improving the quality of the output signal by removing its noise component.\\
The minimization of the error signal $e[n]$ can be achieved by applying different error metrics and algorithms used to evaluate the performance of an adaptive filter, including:
\begin{itemize}
\item \ac{MSE}: This metric calculates the averaged square of the error between the expected value and the observed value over a predefined period. It is sensitive to large errors and is commonly used in adaptive filtering applications.
\item \ac{LMS}: The \ac{LMS} algorithm minimizes the mean squared error by adjusting the filter coefficients iteratively, based on the error signal, using the gradient descent method. It is computationally efficient and widely used in real-time applications.
\item \ac{NLMS}: An extension of the \ac{LMS} algorithm that normalizes the step size based on the input signal power, improving convergence speed.
\item \ac{RLS}: This algorithm aims to minimize the weighted sum of squared errors, providing faster convergence than the \ac{LMS} algorithm but at the cost of higher computational effort.
\end{itemize}
As computational efficiency is a key requirement for the implementation of real-time \ac{ANR} on a low-power \ac{DSP}, the Least Mean Squares algorithm is chosen for the minimization of the error signal and will therefore be further explained in the following subchapter.
\subsubsection{The Wiener filter and the concept of gradient descent}
Before the Least Mean Squares algorithm can be explained in detail, the Wiener filter and the concept of gradient descent have to be introduced. \\ \\
\begin{figure}[H]
\centering
\includegraphics[width=0.7\linewidth]{Bilder/fig_wien.jpg}
\caption{Simple implementation of a Wiener filter.}
\label{fig:fig_wien}
\end{figure}
\noindent The Wiener filter, the basis of many adaptive filter designs, is a statistical filter used to minimize the Mean Squared Error between a desired signal and the output of a linear filter. The output $y[n]$ of the Wiener filter is the sum of the weighted input samples, where the weights are represented by the filter coefficients:
\begin{equation}
\label{equation_wien}
y[n] = w_0x[n] + w_1x[n-1] + ... + w_Mx[n-M] = \sum_{k=0}^{M} w_kx[n-k]
\end{equation}
The Wiener filter aims to adjust its coefficients to generate a filter output which resembles the corruption noise signal $n[n]$ contained in the corrupted signal $d[n]$ as closely as possible. After the filter output is subtracted from the corrupted signal, we receive the error signal $e[n]$, which represents the cleaned signal $š[n]$ after the corruption noise component has been removed. For better understanding, a simple Wiener filter with one coefficient shall be illustrated in the following mathematical approach, before the generalization to a filter with multiple coefficients is made.
\begin{equation}
\label{equation_wien_error}
e[n] = d[n] - y[n] = d[n] - wx[n]
\end{equation}
If we square the error signal and calculate the expected value, we receive the Mean Squared Error $J$, mentioned in the previous chapter, which is the metric the Wiener filter aims to minimize by adjusting its coefficient $w$:
\begin{equation}
\label{equation_j}
J = E(e[n]^2) = E(d^2[n])-2wE(d[n]x[n])+w^2E(x^2[n]) = MSE
\end{equation}
The terms contained in Equation \ref{equation_j} can be further defined as:
\begin{itemize}
\item $\sigma^2$ = $E(d^2[n])$: The expected value of the squared corrupted signal - a constant term independent of the filter coefficient $w$.
\item \textbf{P} = $E(d[n]x[n])$: The cross-correlation between the corrupted signal and the reference noise signal - a measure of how similar these two signals are.
\item \textbf{R} = $E(x^2[n])$: The auto-correlation (or serial correlation) of the reference noise signal - a measure of the similarity of a signal with its delayed copy and therefore of the signal's spectral power.
\end{itemize}
Equation \ref{equation_j} can therefore be further simplified and written as:
\begin{equation}
\label{equation_j_simple}
J = \sigma^2 - 2w\textbf{P} + w^2\textbf{R}
\end{equation}
As $\sigma^2$, \textbf{P} and \textbf{R} are constant, $J$ is a quadratic function of the filter coefficient $w$, offering a calculable minimum (refer to Figure \ref{fig:fig_mse}). To find this minimum, the derivative of $J$ with respect to $w$ can be calculated and set to zero:
\begin{equation}
\label{equation_j_gradient}
\frac{dJ}{dw} = -2\textbf{P} + 2w\textbf{R} = 0
\end{equation}
Solving Equation \ref{equation_j_gradient} for $w$ delivers the equation for the optimal coefficient of the Wiener filter:
\begin{equation}
\label{equation_w_optimal}
w_{opt} = \textbf{P}\textbf{R}^{-1}
\end{equation}
\begin{figure}[H]
\centering
\includegraphics[width=0.7\linewidth]{Bilder/fig_w_opt.jpg}
\caption{Minimum of the Mean Squared Error $J$ located at the optimal coefficient $w^*$. \cite{source_dsp_ch9}}
\label{fig:fig_mse}
\end{figure}
\noindent If the Wiener filter now consists not of one coefficient but of several, Equation \ref{equation_wien} can be written in matrix form as
\begin{equation}
\label{equation_wien_matrix}
y[n] = \sum_{k=0}^{M} w_kx[n-k] = \textbf{w}^T\textbf{x}
\end{equation}
where \textbf{x} is the input signal vector and \textbf{w} the filter coefficient vector.
\begin{gather}
\label{equation_input_vector}
\textbf{x} = [x[n],x[n-1],...,x[n-M]]^T \\
\label{equation_coefficient_vector}
\textbf{w} = [w_0,w_1,...,w_M]^T
\end{gather}
Equation \ref{equation_j} can therefore also be rewritten in matrix form as:
\begin{equation}
\label{equation_j_matrix}
J = \sigma^2 - 2\textbf{w}^T\textbf{P} + \textbf{w}^T\textbf{R}\textbf{w}
\end{equation}
After setting the derivative of Equation \ref{equation_j_matrix} to zero and solving for $\textbf{w}$, we receive the optimal filter coefficient vector:
\begin{equation}
\label{equation_w_optimal_matrix}
\textbf{w}_{opt} = \textbf{R}^{-1}\textbf{P}
\end{equation}
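As an illustration, the following Python sketch estimates \textbf{R} and \textbf{P} from synthetic signals and solves Equation \ref{equation_w_optimal_matrix} directly; the noise path and all parameters are assumptions made for demonstration.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
N = 10000                             # number of samples
M = 4                                 # filter order (M + 1 coefficients)
x = rng.standard_normal(N)            # reference noise x[n]
d = np.convolve(x, [0.6, 0.3, 0.1], mode="full")[:N]  # assumed noise path

# Delay-line matrix: row n holds [x[n], x[n-1], ..., x[n-M]].
X = np.column_stack([np.roll(x, k) for k in range(M + 1)])
X[:M, :] = 0.0                        # discard wrapped-around samples

R = X.T @ X / N                       # estimate of the auto-correlation matrix
P = X.T @ d / N                       # estimate of the cross-correlation vector
w_opt = np.linalg.solve(R, P)         # solves R w = P, near [0.6, 0.3, 0.1, 0, 0]
\end{verbatim}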
\noindent For a large filter, the numerical solution of Equation \ref{equation_w_optimal_matrix} can be computationally expensive, as it involves the inversion of a potentially large matrix. Therefore, to find the optimal set of coefficients $\textbf{w}$, the concept of gradient descent, as described by Widrow \& Stearns in 1985, can be applied. The gradient descent algorithm aims to minimize the MSE iteratively, sample by sample, by adjusting the filter coefficients $\textbf{w}$ in small steps in the direction of the steepest descent. The update rule for the coefficients using gradient descent can be expressed as
\begin{equation}
\label{equation_gradient}
w[n+1] = w[n] - \mu \frac{dJ}{dw}
\end{equation}
where $\mu$ is the constant step size determining the rate of convergence. Figure \ref{fig:fig_w_opt} visualizes the concept of stepwise minimization of the MSE using gradient descent. Once the derivative of $J$ with respect to $\textbf{w}$ reaches zero, the optimal coefficients $\textbf{w}_{opt}$ are found and the coefficients are no longer updated.
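A minimal Python sketch of this update rule for the one-coefficient case, with assumed statistics \textbf{P} and \textbf{R}, could look as follows.

\begin{verbatim}
P, R = 0.8, 2.0     # assumed cross-correlation and auto-correlation
mu = 0.05           # step size; must satisfy 0 < mu < 1/R to converge here
w = 0.0             # initial coefficient

for _ in range(100):
    gradient = -2.0 * P + 2.0 * w * R   # dJ/dw as derived above
    w -= mu * gradient                  # step towards the minimum of J

# w converges to w_opt = P / R = 0.4
\end{verbatim}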
\begin{figure}[H]
\centering
\includegraphics[width=0.9\linewidth]{Bilder/fig_gradient.jpg}
\caption{Visualization of the steepest descent algorithm used on the Mean Squared Error. \cite{source_dsp_ch9}}
\label{fig:fig_w_opt}
\end{figure}
\subsubsection{The Least Mean Squares algorithm}
The approach of the steepest descent algorithm given in the subchapter above still involves the calculation of the derivative of the MSE, $\frac{dJ}{dw}$, which is computationally expensive, as it requires knowledge of the statistical properties of the input signals (cross-correlation \textbf{P} and auto-correlation \textbf{R}). Therefore, in energy-critical real-time applications, like the implementation of \ac{ANR} on a low-power \ac{DSP}, a sample-based approximation in the form of the \ac{LMS} algorithm is used instead. The \ac{LMS} algorithm approximates the gradient of the MSE by using instantaneous estimates of the cross-correlation and auto-correlation. To achieve this, we remove the statistical expectation from the MSE $J$ and take the derivative to obtain a samplewise approximation of $\frac{dJ}{dw[n]}$:
\begin{gather}
\label{equation_j_lms}
J = e[n]^2 = (d[n]-w[n]x[n])^2 \\
\label{equation_j_lms_final}
\frac{dJ}{dw[n]} = 2(d[n]-w[n]x[n])\frac{d(d[n]-w[n]x[n])}{dw[n]} = -2e[n]x[n]
\end{gather}
The result of Equation \ref{equation_j_lms_final} can now be inserted into Equation \ref{equation_gradient} to receive the \ac{LMS} update rule for the filter coefficients:
\begin{equation}
\label{equation_lms}
w[n+1] = w[n] + 2\mu e[n]x[n]
\end{equation}
The \ac{LMS} algorithm therefore updates the filter coefficients $w[n]$ after every sample by adding a correction term, calculated from the error signal $e[n]$ and the reference noise signal $x[n]$, scaled by the constant step size $\mu$. By iteratively applying the \ac{LMS} algorithm, the filter coefficients converge towards the optimal values that minimize the mean squared error between the desired signal and the filter output. When a predefined acceptable error level is reached, the adaptation process can be stopped to save computing power.\\ \\
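To tie the pieces together, the following Python sketch applies the \ac{LMS} update rule of Equation \ref{equation_lms} in vector form to the noise cancellation structure of Figure \ref{fig:fig_anr}; the signals, the noise path and all parameters are synthetic assumptions for illustration only.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(1)
N, M, mu = 20000, 8, 0.01                 # samples, filter length, step size
k = np.arange(N)

s = np.sin(2 * np.pi * 0.01 * k)          # desired signal s[n] (assumed)
x = rng.standard_normal(N)                # reference noise x[n]
noise_path = [0.7, 0.2, -0.1]             # assumed path from x[n] to n[n]
d = s + np.convolve(x, noise_path, mode="full")[:N]   # corrupted signal d[n]

w = np.zeros(M)                           # adaptive filter coefficients
e = np.zeros(N)                           # error = estimate of s[n]
for n in range(M, N):
    x_vec = x[n - M + 1:n + 1][::-1]      # [x[n], x[n-1], ..., x[n-M+1]]
    y = w @ x_vec                         # filter output y[n]
    e[n] = d[n] - y                       # e[n] = d[n] - y[n], approximates s[n]
    w += 2 * mu * e[n] * x_vec            # w[n+1] = w[n] + 2*mu*e[n]*x[n]
\end{verbatim}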
\subsection{Signal flow diagram of an implanted cochlear implant system}
Equipped with the necessary theoretical background on signal processing, adaptive noise reduction and the \ac{LMS} algorithm, we can now design a realistic signal flow diagram with the relevant transfer functions of an implanted cochlear implant system, which will serve as the basis for the implementation of \ac{ANR} on a low-power digital signal processor.
\begin{figure}[H]
\centering
\includegraphics[width=\linewidth]{Bilder/fig_anr_implant.jpg}
\caption{Realistic signal flow diagram of an implanted \ac{CI} system.}
\label{fig:fig_anr_implant}
\end{figure}
\noindent Figure \ref{fig:fig_anr_hybrid} showed the basic concept of an \ac{ANR} implementation, without a detailed description of how the corrupted signal $d[n]$ and the reference noise signal $x[n]$ are formed. Figure \ref{fig:fig_anr_implant} now shows a more complete and realistic signal flow diagram of an implanted cochlear implant system, with two signal sensors followed by an adaptive noise reduction circuit. The primary sensor receives the desired and noise signals over their respective transfer functions and outputs the corrupted signal $d[n]$, which consists of the recorded desired signal $s[n]$ and the recorded corruption noise signal $n[n]$, whereas the noise sensor aims to receive (ideally) only the noise signal $v[n]$ over its transfer function and outputs the reference noise signal $x[n]$, which then feeds the adaptive filter.\\ \\
Additionally, the relevant transfer functions of the overall system are now illustrated in Figure \ref{fig:fig_anr_implant}. The transfer functions $C_n$, $D_n$ and $E_n$ describe the paths from the signal sources to the cochlear implant system. As the sources, the relative location of the user to the sources and the medium between them can vary, these transfer functions are time-variant and unknown. After the signals have reached the implant system, we assume that the remaining path of the signals depends mainly on the sensitivity curve of the respective sensor and can therefore be seen as time-invariant and known. These known transfer functions, titled $A$ and $B$, allow us to apply a hybrid static/adaptive filter design for the \ac{ANR} implementation, as described in chapter 2.5.2. The corrupted signal $d[n]$ can be mathematically described as:\\ \\
\begin{equation}
\label{equation_dn}
d[n] = s[n] + n[n] = t[n] * (C_nA) + v[n] * (D_nA)
\end{equation}
where $t[n]$ and $v[n]$ are the target and noise signals at their respective sources, $s[n]$ is the recorded desired signal and $n[n]$ is the recorded corruption noise after passing the transfer functions.\\ \\
The recorded noise reference signal $x[n]$ can be mathematically described as:
\begin{equation}
\label{equation_xn}
x[n] = v[n] * (E_nB)
\end{equation}
where $v[n]$ is the noise signal at its source.\\ \\
Another possible signal interaction would be the leakage of the desired signal into the secondary sensor, leading to the partial removal of the desired signal from the output signal. This case is not illustrated in Figure \ref{fig:fig_anr_implant}, as it will not be evaluated further in this thesis, but shall be mentioned for the sake of completeness.\\ \\
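As a minimal illustration of Equations \ref{equation_dn} and \ref{equation_xn}, the following Python sketch forms $d[n]$ and $x[n]$ by convolving assumed source signals with assumed impulse responses for the transfer function products $C_nA$, $D_nA$ and $E_nB$; such a synthetic setup can serve as a test bench for the simulations in the following chapter.

\begin{verbatim}
import numpy as np

rng = np.random.default_rng(2)
N = 8000
t = np.sin(2 * np.pi * 0.02 * np.arange(N))   # target signal t[n] at its source
v = rng.standard_normal(N)                    # noise signal v[n] at its source

h_CA = [1.0, 0.2]          # assumed impulse response of C_n * A
h_DA = [0.5, 0.3, 0.1]     # assumed impulse response of D_n * A
h_EB = [0.9, 0.1]          # assumed impulse response of E_n * B

d = (np.convolve(t, h_CA, mode="full")[:N]    # s[n] = t[n] * (C_n A)
     + np.convolve(v, h_DA, mode="full")[:N]) # n[n] = v[n] * (D_n A)
x = np.convolve(v, h_EB, mode="full")[:N]     # x[n] = v[n] * (E_n B)
\end{verbatim}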
At this point, the theoretical background and the fundamentals of adaptive noise reduction have been introduced and explained as necessary for the understanding of the following chapters of this thesis. The next chapter focuses on practical high-level simulations of different filter concepts and \ac{LMS} algorithm variations to evaluate their performance with regard to noise reduction quality, before the actual implementation on a low-power digital signal processor is conducted.