Patrick Hangl
2026-05-05 09:53:04 +02:00
parent 5e5331f099
commit 0a5244ec3f
9 changed files with 312 additions and 244 deletions
+18 -18
@@ -56,17 +56,17 @@ The mean \ac{SNR}-Gain of the different noise signals, also shown in Figure \ref
\noindent Equation \ref{equation_computing_final} can now be used to calculate the number of cycles required to compute one sample of the filter output, using a filter length of 45 taps and an update of the filter coefficients for every sample. The required cycles are calculated as follows:
\begin{equation}
\label{equation_computing_calculation_full_update}
\text{C}_{\text{total}} = 45 + (6*45+8)*1 + 34 = 357 \text{ cycles}
\end{equation}
As already mentioned in the previous chapters, the sampling rate of the audio data provided to the \ac{PCM} interface is 20 kHz. The preferred clock frequency of the \ac{DSP} is chosen as 16 MHz, which means that the \ac{DSP} core has a cycle budget of
\begin{equation}
\label{equation_cycle_budget}
\text{C}_{\text{budget}} = \frac{16 \text{ MHz}}{20 \text{ kHz}} = 800 \text{ cycles}
\end{equation}
\noindent for one sample. With these two values, the load of the \ac{DSP} core can be calculated as follows:
\begin{equation}
\label{equation_load_calculation_full_update}
\text{Load}_{\text{DSP}} = \frac{\text{C}_{\text{total}}}{\text{C}_{\text{budget}}} = \frac{357 \text{ cycles}}{800 \text{ cycles}} = 44.6 \%
\end{equation}
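\noindent The calculation above can be reproduced with a short script. The following Python sketch is only an illustration: the function names and the split into a filter term, an update term and a fixed overhead are assumptions read off the numbers in Equation \ref{equation_computing_calculation_full_update}, and are not taken from the \ac{DSP} firmware.

```python
def cycles_per_sample(taps: int, update_rate: float, overhead: int = 34) -> float:
    """Estimated cycles for one output sample.

    taps        -- filter length (45 in the text)
    update_rate -- fraction of samples with a coefficient update (0.0 to 1.0)
    overhead    -- fixed per-sample cycles (34 in the text)
    """
    filter_cycles = taps            # filter term from the text (45 for 45 taps)
    update_cycles = 6 * taps + 8    # cost of one full coefficient update
    return filter_cycles + update_cycles * update_rate + overhead


def dsp_load(cycles: float, clock_hz: float = 16e6, sample_rate_hz: float = 20e3) -> float:
    """Fraction of the per-sample cycle budget that is used."""
    budget = clock_hz / sample_rate_hz  # 800 cycles at 16 MHz and 20 kHz
    return cycles / budget


full = cycles_per_sample(taps=45, update_rate=1.0)
print(full, round(100 * dsp_load(full), 1))  # 357.0 44.6
```

Calling \texttt{cycles\_per\_sample} with a smaller \texttt{update\_rate} gives the corresponding estimate for a reduced update rate.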
\noindent The results calculated in Equations \ref{equation_computing_calculation_full_update} to \ref{equation_load_calculation_full_update} can be summarized as follows:\\ \\
With the optimal filter length of 45 taps and a coefficient update for every sample, the \ac{ANR} algorithm achieves an \ac{SNR}-Gain of about 11.54 dB, averaged over different signal/noise combinations. Under these circumstances, the computational load of the \ac{DSP} core amounts to about 45\%. The core can therefore be halted for about 55\% of the time between two consecutive samples, which reduces the overall power consumption.\\ \\
@@ -85,11 +85,11 @@ As already mentioned, the reduction of the update rate is initially evaluated fo
The maximum offset between the two graphs is found at an update rate of 0.39, meaning that the filter coefficients are updated in only roughly 2 out of 5 samples. Updating Equations \ref{equation_computing_calculation_full_update} and \ref{equation_load_calculation_full_update} accordingly yields:
\begin{equation}
\label{equation_computing_calculation_reduced_update_1}
\text{C}_{\text{total}} = 45 + (6*45+8)*0.39 + 34 = 188 \text{ cycles}
\end{equation}
\begin{equation}
\label{equation_load_calculation_reduced_update_1}
\text{Load}_{\text{DSP}} = \frac{\text{C}_{\text{total}}}{\text{C}_{\text{budget}}} = \frac{188 \text{ cycles}}{800 \text{ cycles}} = 23.5 \%
\end{equation}
These results lead to the conclusion that the most cost-effective way to reduce the load of the \ac{DSP} is to reduce the update rate of the filter coefficients to 0.39. For the benchmark signal/noise combination, this nearly halves the processor load from 44.6\% to 23.5\%, while reducing the \ac{SNR}-Gain by only roughly 31\% from 9.47 dB to 6.40 dB. In the next step, the same analysis is applied to all introduced noise signals to assess the general validity of this observation.
\subsubsection{Reduced-update implementation for multiple noise signals}
@@ -109,29 +109,29 @@ Now the same evaluation as in the previous subchapter is conducted for the five
\noindent Figure \ref{fig:fig_gain_update_rate.png} shows the performance gain for the five different scenarios. The maximum of the mean performance gain over all scenarios has now shifted to an update rate of 0.32. Figure \ref{fig:fig_load_update_rate.png} shows the load of the \ac{DSP} core for the different update rates, which is the same for all scenarios, as it depends only on the update rate itself.
\begin{equation}
\label{equation_computing_calculation_reduced_update_2}
\text{C}_{\text{total}} = 45 + (6*45+8)*0.32 + 34 = 168 \text{ cycles}
\end{equation}
\begin{equation}
\label{equation_load_calculation_reduced_update_2}
\text{Load}_{\text{DSP}} = \frac{\text{C}_{\text{total}}}{\text{C}_{\text{budget}}} = \frac{168 \text{ cycles}}{800 \text{ cycles}} = 20.8 \%
\end{equation}
Equations \ref{equation_computing_calculation_reduced_update_2} and \ref{equation_load_calculation_reduced_update_2} confirm that for an update rate of 0.32, a reduction of the \ac{DSP} load to 20.8\% can be achieved, corresponding to a performance gain of 24.9\%. This means that, across all considered scenarios, an update rate of 0.32 represents the best cost-benefit ratio between load reduction and noise reduction. Averaged over all scenarios, the \ac{SNR}-Gain is reduced by 24.5\% from 11.54 dB to 8.72 dB, while the load of the \ac{DSP} core is reduced by about 53.4\% from 44.6\% to 20.8\%.
\subsubsection{Computational load for reduced-update implementation}
The most straightforward implementation of a reduced update rate uses a counter and a modulo operation, which checks whether the filter coefficients have to be updated for the current sample. The code must therefore be extended by two blocks, which add the following computational load:
\begin{gather}
\label{equation_update_1}
\text{C}_{\text{increment\_counter}} = 5 \text{ cycles} \\
\label{equation_update_2}
\text{C}_{\text{check\_counter}} = 23 (24) \text{ cycles}
\end{gather}
Incrementing the counter and checking via a modulo operation whether an update is due adds up to 29 cycles to the cycle count for one sample (28 cycles when the coefficients are updated and 29 when they are not). Equations \ref{equation_computing_calculation_reduced_update_3} and \ref{equation_load_calculation_reduced_update_3} show the new calculation of the required cycles and the load of the \ac{DSP} core for an update rate of 0.32:
\begin{equation}
\label{equation_computing_calculation_reduced_update_3}
\text{C}_{\text{total}} = 45 + (6*45+8)*0.32 + 63 = 197 \text{ cycles}
\end{equation}
\begin{equation}
\label{equation_load_calculation_reduced_update_3}
\text{Load}_{\text{DSP}} = \frac{\text{C}_{\text{total}}}{\text{C}_{\text{budget}}} = \frac{197 \text{ cycles}}{800 \text{ cycles}} = 24.6 \%
\end{equation}
The updated equations show that the computational load for an update rate of 0.32 increases substantially from 20.8\% to 24.6\% through the use of a counter and a modulo operation, as the latter is computationally quite expensive. A better alternative would be a bitwise check, but this would restrict the possible update rates to powers of 2.
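\noindent The difference between the two checks can be illustrated with a small Python sketch. It is not the code running on the \ac{DSP}; the function names and the parameter \texttt{period} (number of samples between two updates) are hypothetical and only serve to show why the bitwise variant is limited to powers of 2.

```python
def update_due_modulo(counter: int, period: int) -> bool:
    """Update on every `period`-th sample; works for any positive period."""
    return counter % period == 0


def update_due_bitmask(counter: int, period: int) -> bool:
    """Same decision with a bitwise AND; only valid if period is a power of two."""
    if period <= 0 or (period & (period - 1)) != 0:
        raise ValueError("period must be a power of two")
    # For a power of two, the low bits of the counter are zero exactly
    # on every `period`-th sample, so no division is needed.
    return (counter & (period - 1)) == 0


# Both checks agree for a power-of-two period, here 4 (update rate 0.25).
updates = [n for n in range(16) if update_due_bitmask(n, 4)]
print(updates)  # [0, 4, 8, 12]
```

With the bitmask, only update rates such as 0.5, 0.25 or 0.125 can be realized, whereas the modulo variant accepts any integer period.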
\subsection{Evaluation of an error driven implementation}
@@ -150,11 +150,11 @@ The chosen approach for this thessis the use a fixed error threshold. This means
\noindent Our benchmark track is evaluated for error thresholds ranging from 0 to 0.5. The results, shown in Figure \ref{fig:fig_snr_error_threshold.png}, indicate that for small thresholds, especially those below 0.1, a highly beneficial behavior can be expected: the \ac{SNR}-Gain is only slightly reduced, while the load of the \ac{DSP} core drops significantly. The maximum offset between the two graphs is found at an error threshold of 0.02. At this point, the coefficient adaptation is conducted in only about 81400 of 200000 samples, which corresponds to an update rate of about 41\%. Updating Equations \ref{equation_computing_calculation_full_update} and \ref{equation_load_calculation_full_update} accordingly yields:
\begin{equation}
\label{equation_computing_calculation_error threshold_1}
\text{C}_{\text{total}} = 45 + (6*45+8)*0.41 + 34 = 193 \text{ cycles}
\end{equation}
\begin{equation}
\label{equation_load_calculation_error threshold_1}
\text{Load}_{\text{DSP}} = \frac{\text{C}_{\text{total}}}{\text{C}_{\text{budget}}} = \frac{193 \text{ cycles}}{800 \text{ cycles}} = 24.1 \%
\end{equation}
The performance difference compared to the reduced update rate is already clear for the benchmark case: with a similar \ac{DSP} load of 24.1\% (again, nearly half the load of the full-update implementation), the \ac{SNR}-Gain is reduced by only 8.9\% from 9.47 dB to 8.63 dB. The same analysis is now applied to all introduced noise signals to assess the general validity of this observation.
\subsubsection{Error threshold implementation for multiple noise signals}
@@ -175,28 +175,28 @@ Again, the same evaluation as for the benchmark case is conducted for the five i
A mean error threshold of 0.07 results in a mean of 38244 updates out of 200000 samples, which corresponds to an update rate of 19.1\%. The \ac{DSP} load is no longer identical for all scenarios, but still quite similar. Figure \ref{fig:fig_load_error_threshold.png} shows that the absolute load of the \ac{DSP} core for an error threshold of 0.07 amounts to only 16.6\%.
\begin{equation}
\label{equation_computing_calculation_error_threshold_2}
\text{C}_{\text{total}} = 45 + (6*45+8)*0.191 + 34 = 132 \text{ cycles}
\end{equation}
\begin{equation}
\label{equation_load_calculation_error_threshold_2}
\text{Load}_{\text{DSP}} = \frac{\text{C}_{\text{total}}}{\text{C}_{\text{budget}}} = \frac{132 \text{ cycles}}{800 \text{ cycles}} = 16.6 \%
\end{equation}
Equations \ref{equation_computing_calculation_error_threshold_2} and \ref{equation_load_calculation_error_threshold_2} confirm that for an error threshold of 0.07, a reduction of the \ac{DSP} load to 16.6\% can be achieved, corresponding to a performance gain of 48.4\%. This means that, across all considered scenarios, an error threshold of 0.07 represents the best cost-benefit ratio between load reduction and noise reduction. Averaged over all scenarios, the \ac{SNR}-Gain is reduced by 11.7\% from 11.54 dB to 10.19 dB, while the load of the \ac{DSP} core is reduced by about 62.8\% from 44.6\% to 16.6\%.
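\noindent The relative reductions quoted above follow from a simple ratio. The short Python sketch below recomputes them from the rounded values given in the text; the helper name \texttt{relative\_reduction} is chosen here for illustration only.

```python
def relative_reduction(before: float, after: float) -> float:
    """Relative reduction in percent, measured against the value before."""
    return 100.0 * (before - after) / before


snr_drop = relative_reduction(11.54, 10.19)  # mean SNR-Gain in dB
load_drop = relative_reduction(44.6, 16.6)   # DSP load in percent
print(round(snr_drop, 1), round(load_drop, 1))  # 11.7 62.8
```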
\subsubsection{Computational load for error threshold implementation}
In contrast to the fixed update implementation, the implementation of a fixed error threshold does not require computationally expensive operations: the threshold is implemented as a 32-bit integer, which is simply checked for every sample by a single if-clause.
\begin{gather}
\label{equation_update_3}
\text{C}_{\text{check\_threshold}} = 10 \text{ cycles}
\end{gather}
The check of the 32-bit threshold adds 10 cycles to the cycle count for one sample.
Equation \ref{equation_computing_calculation_error_threshold_3} and \ref{equation_load_calculation_error_threshold_3} show the new calculation of the needed cycles and the load of the \ac{DSP} core for an error threshold of 0.07:
\begin{equation}
\label{equation_computing_calculation_error_threshold_3}
\text{C}_{\text{total}} = 45 + (6*45+8)*0.191 + 44 = 142 \text{ cycles}
\end{equation}
\begin{equation}
\label{equation_load_calculation_error_threshold_3}
\text{Load}_{\text{DSP}} = \frac{\text{C}_{\text{total}}}{\text{C}_{\text{budget}}} = \frac{142 \text{ cycles}}{800 \text{ cycles}} = 17.8 \%
\end{equation}
In contrast to the fixed update implementation, the computational load for an error threshold of 0.07 shows only a minimal increase from 16.6\% to 17.8\%, owing to the computationally cheap if-clause. This is a clear advantage over the fixed update implementation.
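\noindent The threshold check itself can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the comparison against the absolute error value, the function names and the example error samples are all hypothetical, and the actual coefficient adaptation is left out.

```python
from typing import Sequence


def should_update(error: int, threshold: int) -> bool:
    """Single comparison: adapt only if the error magnitude exceeds the threshold.

    Using the absolute value is an assumption of this sketch.
    """
    return abs(error) > threshold


def effective_update_rate(errors: Sequence[int], threshold: int) -> float:
    """Fraction of samples for which the coefficients would be adapted."""
    updates = sum(1 for e in errors if should_update(e, threshold))
    return updates / len(errors)


errors = [120, -30, 5, -400, 80, 0, 260, -15]  # made-up integer error samples
print(effective_update_rate(errors, threshold=100))  # 0.375
```

Applied to the counts reported in the text, 38244 updates in 200000 samples give an effective update rate of about 0.191.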
\subsection{Summary of the performance evaluation}