Updated abbreviations in the body text
@@ -1,16 +1,16 @@
\section{Hardware setup and low-level simulations}
This section forms the main part of this thesis. The first subsection describes the hardware on which the ANR algorithm is implemented. The second subsection describes the basic implementation of the ANR algorithm on this hardware and gives the reader a basic understanding of its efficiency, which serves as a baseline for the subsequent optimizations.\\
In the third subsection, this initial implementation is further optimized to improve real-time performance on the DSP. The last subsection focuses on the final optimizations of the ANR algorithm itself, especially with respect to the capabilities of a hybrid ANR approach.
\subsection{Description of the low-power DSP}
The DSP used for the implementation is a 32-bit fixed-point processor primarily designed for audio signal-processing applications in low-power embedded systems. It is developed using a retargetable processor design methodology and is typically programmed in C. Its highly efficient C compiler produces optimized assembly code that is comparable in performance and quality to hand-written assembly.\\ \\
The processor has a load/store architecture, meaning that all operands must first be moved from memory into registers before any operation can be performed. The execution units (Arithmetic Logic Units (ALUs) and multiplier) then perform their operations on the data and write the results back into the registers. Finally, the results must be explicitly moved back to memory.\\ \\
The DSP includes a three-stage pipeline consisting of fetch, decode, and execute stages, allowing for overlapping instruction execution and improved throughput.
The architecture is optimized for high cycle efficiency when executing computationally intensive signal-processing workloads. It features a dual Harvard load/store architecture and two separate ALUs, which enable the execution of two multiply-accumulate (MAC) operations, two memory operations (load/store), and two pointer updates in a single processor cycle.\\ \\
The DSP includes a set of registers, including
This section forms the main part of this thesis. The first subsection describes the hardware on which the \ac{ANR} algorithm is implemented. The second subsection describes the basic implementation of the \ac{ANR} algorithm on this hardware and gives the reader a basic understanding of its efficiency, which serves as a baseline for the subsequent optimizations.\\
In the third subsection, this initial implementation is further optimized to improve real-time performance on the \ac{DSP}. The last subsection focuses on the final optimizations of the \ac{ANR} algorithm itself, especially with respect to the capabilities of a hybrid \ac{ANR} approach.
\subsection{Description of the low-power \ac{DSP}}
The \ac{DSP} used for the implementation is a 32-bit fixed-point processor primarily designed for audio signal-processing applications in low-power embedded systems. It is developed using a retargetable processor design methodology and is typically programmed in C. Its highly efficient C compiler produces optimized assembly code that is comparable in performance and quality to hand-written assembly.\\ \\
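To illustrate what 32-bit fixed-point arithmetic means in practice, the following minimal C sketch multiplies two fractional values. It assumes the common Q31 format (one sign bit, 31 fractional bits), which is not stated above; the function name \texttt{q31\_mul} is hypothetical and not part of any vendor library.

```c
#include <stdint.h>

/* Hypothetical helper: multiply two Q31 fractional values.
 * Assumption: Q31 format, i.e. the integer value v represents v / 2^31. */
int32_t q31_mul(int32_t a, int32_t b)
{
    /* The full product needs 64 bits and is in Q62 format. */
    int64_t product = (int64_t)a * (int64_t)b;

    /* Shift back to Q31. Right-shifting a negative value is
     * implementation-defined in C; an arithmetic shift is assumed here. */
    int64_t scaled = product >> 31;

    /* Saturate the single overflow case (-1.0 * -1.0 = +1.0). */
    if (scaled > INT32_MAX) {
        return INT32_MAX;
    }
    return (int32_t)scaled;
}
```

For example, multiplying 0.5 by 0.5 (both \texttt{0x40000000}) yields 0.25 (\texttt{0x20000000}).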
The processor has a load/store architecture, meaning that all operands must first be moved from memory into registers before any operation can be performed. The execution units (\ac{ALU} and multiplier) then perform their operations on the data and write the results back into the registers. Finally, the results must be explicitly moved back to memory.\\ \\
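The three phases of the load/store principle can be made explicit in a short C sketch. The function name \texttt{add\_in\_memory} and the local variable names are hypothetical; the local variables only stand in for registers, and the actual register allocation is decided by the compiler.

```c
#include <stdint.h>

/* Hypothetical helper: adds two values that reside in memory.
 * The local variables stand in for processor registers. */
void add_in_memory(const int32_t *a, const int32_t *b, int32_t *result)
{
    int32_t reg_a = *a;               /* load: memory -> register       */
    int32_t reg_b = *b;               /* load: memory -> register       */
    int32_t reg_sum = reg_a + reg_b;  /* operate: ALU works on registers */
    *result = reg_sum;                /* store: register -> memory      */
}
```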
The \ac{DSP} includes a three-stage pipeline consisting of fetch, decode, and execute stages, allowing for overlapping instruction execution and improved throughput.
The architecture is optimized for high cycle efficiency when executing computationally intensive signal-processing workloads. It features a dual Harvard load/store architecture and two separate \ac{ALU}s, which enable the execution of two \ac{MAC} operations, two memory operations (load/store), and two pointer updates in a single processor cycle.\\ \\
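The kind of loop that benefits from two parallel \ac{MAC} units can be sketched in plain C as a dot product with two independent accumulators. This is only an illustration under assumptions: the function name \texttt{dot\_product\_dual\_mac} is hypothetical, and whether the compiler actually maps the two statements in the loop body onto a single cycle depends on the toolchain and on the placement of the two arrays in separate memory banks.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: dot product with two independent accumulators.
 * Each loop iteration contains two multiply-accumulate operations that
 * do not depend on each other, so they could be issued in parallel. */
int64_t dot_product_dual_mac(const int32_t *x, const int32_t *h, size_t n)
{
    int64_t acc0 = 0;
    int64_t acc1 = 0;
    size_t i;

    for (i = 0; i + 1 < n; i += 2) {
        acc0 += (int64_t)x[i]     * h[i];      /* first MAC  */
        acc1 += (int64_t)x[i + 1] * h[i + 1];  /* second MAC */
    }

    /* Handle the remaining element if n is odd. */
    if (i < n) {
        acc0 += (int64_t)x[i] * h[i];
    }

    return acc0 + acc1;
}
```

Using two accumulators removes the data dependency between consecutive products; with a single accumulator, each addition would have to wait for the previous one.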
The \ac{DSP} includes a set of registers, including
Advanced addressing modes — such as cyclic and bit-reversed addressing — facilitate efficient implementation of common DSP algorithms. Additional architectural features include hardware-supported zero-overhead looping, nested loop structures, interrupt handling, power-management mechanisms, and on-chip debugging capabilities such as JTAG, breakpoints, and watchpoints. Overall, the architecture is designed to support both control-flow operations and high-throughput signal-processing tasks within low-power embedded environments.
\subsection{Implementation of the ANR algorithm on the DSP}
Advanced addressing modes — such as cyclic and bit-reversed addressing — facilitate efficient implementation of common \ac{DSP} algorithms. Additional architectural features include hardware-supported zero-overhead looping, nested loop structures, interrupt handling, power-management mechanisms, and on-chip debugging capabilities such as JTAG, breakpoints, and watchpoints. Overall, the architecture is designed to support both control-flow operations and high-throughput signal-processing tasks within low-power embedded environments.
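What cyclic addressing provides in hardware can be emulated in software with an explicit modulo operation on the buffer index. The following C sketch shows such a circular delay line, as commonly used for filter input samples. The type \texttt{delay\_line\_t}, the function names, and the buffer length of four are hypothetical choices for illustration; with hardware cyclic addressing, the wrap-around would happen as part of the pointer update instead of as separate instructions.

```c
#include <stdint.h>

#define DELAY_LINE_LEN 4u  /* hypothetical buffer length */

typedef struct {
    int32_t  samples[DELAY_LINE_LEN];
    uint32_t write_idx;  /* position of the next write */
} delay_line_t;

/* Store a new sample and advance the write index with wrap-around. */
void delay_line_push(delay_line_t *dl, int32_t sample)
{
    dl->samples[dl->write_idx] = sample;
    dl->write_idx = (dl->write_idx + 1u) % DELAY_LINE_LEN;
}

/* Read the sample that was pushed k steps ago (k = 0 is the newest). */
int32_t delay_line_get(const delay_line_t *dl, uint32_t k)
{
    uint32_t idx = (dl->write_idx + DELAY_LINE_LEN - 1u
                    - (k % DELAY_LINE_LEN)) % DELAY_LINE_LEN;
    return dl->samples[idx];
}
```

After pushing the values 1 to 5 into an empty delay line of length four, the oldest value has been overwritten, so \texttt{delay\_line\_get} returns 5 for $k = 0$ and 2 for $k = 3$.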
\subsection{Implementation of the \ac{ANR} algorithm on the \ac{DSP}}
\subsection{First optimization approach: algorithm implementation}
\subsection{Second optimization approach: hybrid ANR algorithm}
\subsection{Second optimization approach: hybrid \ac{ANR} algorithm}