Efficient very large-scale integration architecture design of proportionate-type least mean square adaptive filters

ABSTRACT


INTRODUCTION
In the digital world there is a need for a higher level of intelligence and accuracy. Digital circuits are the basic building blocks of any smart system, and signal processing plays a vital role in determining the performance of these circuits [1], [2]. Filters are among the most common circuits in signal processing, and because noise is always present, filtering is of considerable importance [3].
Noise can enter a circuit by many means and degrade its performance. To make the circuit less sensitive to noise, an efficient filter needs to be designed. The basic principle of a filter is to remove any unwanted signal and pass only the desired signal [4] that is required for circuit operation. The response of the filter to different kinds of noise is considered when evaluating a filter design. There are two types of filters: finite impulse response (FIR) filters and infinite impulse response (IIR) filters. As the names suggest, the impulse response of an FIR filter is finite and becomes zero after some time, while the impulse response of an IIR filter is infinite [4], [5].
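The FIR/IIR distinction can be illustrated with a minimal sketch. The coefficient values and the first-order recursive structure below are arbitrary assumptions chosen for illustration and are not taken from the present work.

```python
def fir_filter(x, b):
    """FIR filter: y[n] = sum_k b[k] * x[n - k] (no feedback)."""
    return [
        sum(b[k] * x[n - k] for k in range(len(b)) if n - k >= 0)
        for n in range(len(x))
    ]


def iir_first_order(x, a):
    """First-order IIR filter: y[n] = x[n] + a * y[n - 1] (feedback)."""
    y, prev = [], 0.0
    for xn in x:
        prev = xn + a * prev
        y.append(prev)
    return y


# Unit impulse followed by zeros.
impulse = [1.0] + [0.0] * 9

# The FIR response equals the coefficients and is exactly zero afterwards.
fir_response = fir_filter(impulse, [0.5, 0.3, 0.2])

# The IIR response decays geometrically (0.5 ** n) but never reaches zero
# in exact arithmetic.
iir_response = iir_first_order(impulse, 0.5)
```

With three FIR coefficients the response vanishes from the fourth sample onward, whereas the feedback term keeps the IIR response nonzero at every sample.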
A filter should adapt to changes in its operating environment; a filter that can do so is called an adaptive filter. Adaptive filters are realized with either IIR or FIR filters whose coefficients can be updated in order to obtain the desired signal. Hence, by varying the coefficients of an FIR filter according to the change in operating conditions, the filter can be made to adapt to those conditions [6].
The key component of an adaptive filter is the algorithm that iteratively updates the filter coefficients in response to changes in environmental conditions. Least mean square (LMS) is one such algorithm; it mimics the response of the desired filter by estimating the filter coefficients that minimize the mean square of the error signal, where the error signal is the difference between the desired signal and the filter output [7]. The LMS algorithm suffers from a fixed step size parameter, which leads to the gradient noise amplification problem and to weak convergence. To overcome these problems, the normalized least mean square (NLMS) algorithm is used. NLMS normalizes the step size and adds a small positive number to the weight update, which makes its performance better than that of the LMS algorithm [8]. Many other algorithms have also been proposed, such as least mean logarithmic square (LMLS), which combines the advantages of the LMS and least mean fourth (LMF) algorithms, and least logarithmic absolute difference (LLAD), which offers the advantages of the LMS and sign LMS (SLMS) algorithms [7], [8].
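The difference between the LMS and NLMS weight updates can be sketched as follows. The names `mu` (step size) and `eps` (the small positive number added to the normalization term) and their default values are assumptions made for illustration; this is the textbook form of the two updates, not the implementation used in the present work.

```python
def lms_update(w, x, d, mu=0.05):
    """One LMS step with a fixed step size mu.

    w: current weights, x: current input vector, d: desired sample.
    Returns (updated weights, error).
    """
    y = sum(wi * xi for wi, xi in zip(w, x))   # filter output
    e = d - y                                  # error signal
    return [wi + mu * e * xi for wi, xi in zip(w, x)], e


def nlms_update(w, x, d, mu=0.5, eps=1e-6):
    """One NLMS step: the step size is divided by the input energy.

    eps keeps the division well defined when the input energy is near zero.
    """
    y = sum(wi * xi for wi, xi in zip(w, x))
    e = d - y
    energy = eps + sum(xi * xi for xi in x)
    return [wi + (mu / energy) * e * xi for wi, xi in zip(w, x)], e
```

Because the NLMS correction is scaled by the input energy, a large input vector does not amplify the gradient noise the way it does in the fixed-step LMS update.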
Proportionate LMS (PLMS) algorithms were introduced in order to track sparse impulse responses faster. Proportionate normalized least mean square (PNLMS) gives better performance than NLMS, with faster convergence and improved mean square error (MSE) [9]. The proposed algorithm, delayed μ-law proportionate normalized least mean square (DMPNLMS), is an improvement over the μ-law proportionate normalized least mean square (MPNLMS) algorithm. The remainder of the paper is organized as follows: section 2 describes the proposed architecture and the implementation of the DMPNLMS algorithm, section 3 discusses the simulation results, and section 4 concludes the present work.

PROPOSED DMPNLMS ARCHITECTURE
The architecture of the DMPNLMS filter is shown in Figure 1. The input signal is fed into the tap coefficients, each having an arithmetic delay of 'X' units; unit delay registers are used to introduce this delay. The outputs of the tap coefficients are fed into a parallel prefix logarithmic adder. The output of the adder is multiplexed with the desired signal, which contains some error. This signal is fed into the desired function block, which produces the DMPNLMS filter residues. A loopback path is formed so that adaptation continues over n iterations [10]-[12]. Thus, the architecture of the designed DMPNLMS filter consists of the tap coefficients, a parallel prefix logarithmic adder, a desired function and a desired block. In adaptive filtering, the tap coefficients are crucial: they represent the weights used to form the filter's output from the input values. To improve the effectiveness of the filter, these coefficients are modified during the learning process.

A crucial element for effective computation inside the filter is the parallel prefix logarithmic adder. To speed up the filter, it executes arithmetic calculations, frequently in parallel. The filter's objective is defined by the desired function, which stands in for the result that the filter seeks to produce [13]-[16]. Each of the subcomponents is discussed in the following subsections. The coefficient update equation of the DMPNLMS is shown in (1); it differs slightly from that of NLMS in the extra step size update matrix Q.
The diagonal matrix Q controls the step size and is evaluated using (2) and (3).
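For orientation, the μ-law PNLMS recursion as it is commonly written in the literature has the form sketched below. This is stated under assumptions: the symbols β (step size), δ (regularization constant), ρ and δ_p (gain floors), L (filter length) and the relation μ = 1/ε are assumed names and conventions, and the delay that distinguishes DMPNLMS is not shown.

```latex
% Coefficient update with the step-size control matrix Q(n)
\hat{\mathbf{h}}(n+1) = \hat{\mathbf{h}}(n)
  + \frac{\beta\,\mathbf{Q}(n)\,\mathbf{x}(n)\,e(n)}
         {\mathbf{x}^{T}(n)\,\mathbf{Q}(n)\,\mathbf{x}(n) + \delta}

% Diagonal control matrix and its normalized elements
\mathbf{Q}(n) = \operatorname{diag}\{q_1(n), \ldots, q_L(n)\}, \qquad
q_l(n) = \frac{\gamma_l(n)}{\frac{1}{L}\sum_{i=1}^{L}\gamma_i(n)}

\gamma_l(n) = \max\Bigl\{\rho\,\max\bigl\{\delta_p,\,
  F(|\hat{h}_1(n)|), \ldots, F(|\hat{h}_L(n)|)\bigr\},\;
  F(|\hat{h}_l(n)|)\Bigr\}

% mu-law compression of the coefficient magnitude
F(|\hat{h}_l(n)|) = \frac{\ln\bigl(1 + \mu\,|\hat{h}_l(n)|\bigr)}{\ln(1 + \mu)},
\qquad \mu = \frac{1}{\varepsilon}
```

In this form, F equals 0 for a zero coefficient and 1 for a coefficient of unit magnitude, so the compressed magnitudes stay in [0, 1] for coefficients no larger than 1.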
The control matrix elements can be expressed as (3), where the negative infinity at the initial stage is avoided by inserting the constant 1 in the logarithm function. The denominator ln(1 + μ) normalizes F(|ĥ_l(n)|) to the range [0, 1]. The value of ε is a small positive number and should be chosen based on the background noise level; ε = 0.001 is a good choice, as echo below -60 dB is negligible. The general design methodology used in the current work is summarized as follows [17]-[20]:
− The convergence rate and stability are evaluated using a MATLAB simulation, which validates the concept for the current and previous works. The same simulation verifies the functional correctness of the algorithm.
− Field programmable gate array (FPGA) synthesis is carried out using Vivado, targeting a Kintex-7 device, to implement the digital system.
− Application specific integrated circuit (ASIC) synthesis is also carried out to obtain area, timing and power figures.
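A software model of the μ-law proportionate update described above can be sketched as follows. It is a minimal floating-point illustration, not the MATLAB code of the present work: the parameter names and values (`beta`, `delta`, `rho`, `delta_p`), the noise-free test setup and the helper `identify` are assumptions, and the delay that distinguishes DMPNLMS is not modeled.

```python
import math
import random


def mu_law(h_abs, eps=0.001):
    """F(|h|) = ln(1 + mu*|h|) / ln(1 + mu), with mu = 1/eps (assumed)."""
    mu = 1.0 / eps
    return math.log(1.0 + mu * h_abs) / math.log(1.0 + mu)


def control_gains(h_hat, rho=0.01, delta_p=0.01, eps=0.001):
    """Diagonal of Q: mu-law magnitudes, floored, normalized to mean 1."""
    f = [mu_law(abs(h), eps) for h in h_hat]
    floor = rho * max([delta_p] + f)       # keeps inactive taps adapting
    gamma = [max(floor, fi) for fi in f]
    mean_gamma = sum(gamma) / len(gamma)
    return [g / mean_gamma for g in gamma]


def mpnlms_step(h_hat, x, d, beta=0.5, delta=1e-2):
    """One coefficient update; returns (new coefficients, error)."""
    e = d - sum(h * xi for h, xi in zip(h_hat, x))
    q = control_gains(h_hat)
    norm = sum(qi * xi * xi for qi, xi in zip(q, x)) + delta
    new_h = [h + beta * qi * xi * e / norm
             for h, qi, xi in zip(h_hat, q, x)]
    return new_h, e


def identify(h_true, n_iter=3000, seed=0):
    """Identify an FIR system from white Gaussian input, noise-free."""
    rng = random.Random(seed)
    length = len(h_true)
    h_hat, x = [0.0] * length, [0.0] * length
    for _ in range(n_iter):
        x = [rng.gauss(0.0, 1.0)] + x[:-1]   # shift register of inputs
        d = sum(h * xi for h, xi in zip(h_true, x))
        h_hat, _ = mpnlms_step(h_hat, x, d)
    return h_hat
```

Calling `identify` on a sparse impulse response, for example 16 taps with only two nonzero entries, returns an estimate of those taps; larger coefficients receive proportionally larger gains, which is the mechanism behind the faster convergence on sparse responses.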

Tap coefficient
The tap coefficient is the primary block of the DMPNLMS filter. It contains an adder that adds the input values to the value from the error control block so as to boost the signal. The output of this adder is fed to an AND gate, which ANDs it with the looped-back output of the error control block [21]-[23]. The output of the AND gate is then passed through cascaded OR gates to introduce a delay; the number of cascaded OR gates sets the number of delay units. Finally, the input is ANDed with the delayed output of the OR gates to produce the tap coefficient output [24], [25].

Ladner-Fischer logarithmic adder
In the current work, a Ladner-Fischer adder is used. It consists of black cells, gray cells and an AO (AND-XOR) block. The black cell is responsible for both generation and propagation, while the gray cell is responsible for generation alone. The black cell combines two AND gates and one OR gate and produces two outputs: the propagate signal from an AND gate and the generate signal from the OR gate. The gray cell combines one AND gate and one OR gate, and its only output is the generate signal.
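The roles of the black and gray cells can be illustrated with a small bit-level model. The sketch below is an assumption-laden illustration: the prefix network uses a Sklansky-style minimum-depth arrangement, which may differ from the exact wiring of the adder in the present work, and the function names and the 8-bit width are hypothetical.

```python
def black_cell(gp_hi, gp_lo):
    """Two ANDs and one OR: returns group (generate, propagate)."""
    g_hi, p_hi = gp_hi
    g_lo, p_lo = gp_lo
    return g_hi | (p_hi & g_lo), p_hi & p_lo


def gray_cell(gp_hi, g_lo):
    """One AND and one OR: returns the group generate only."""
    g_hi, p_hi = gp_hi
    return g_hi | (p_hi & g_lo)


def prefix_add(a, b, width=8):
    """Add two unsigned integers; returns (sum mod 2**width, carry out)."""
    # Pre-processing: AND gives bit generate, XOR gives bit propagate.
    g = [((a >> i) & (b >> i)) & 1 for i in range(width)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]
    G, P = list(g), list(p)
    span = 1
    while span < width:
        new_g, new_p = list(G), list(P)
        for i in range(width):
            if i & span:
                j = (i // span) * span - 1   # top bit of the lower block
                if i < 2 * span:
                    # The group now reaches bit 0: generate is enough.
                    new_g[i] = gray_cell((G[i], P[i]), G[j])
                else:
                    new_g[i], new_p[i] = black_cell((G[i], P[i]),
                                                    (G[j], P[j]))
        G, P = new_g, new_p
        span *= 2
    # Post-processing: sum bit = propagate XOR incoming carry.
    carries = [0] + G[:-1]
    total = 0
    for i in range(width):
        total |= (p[i] ^ carries[i]) << i
    return total, G[-1]
```

For an 8-bit operand the loop runs three levels (spans 1, 2 and 4), which reflects the logarithmic depth that motivates parallel prefix adders.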

Desired function
In the filter's processing sequence, the desired function block immediately follows the parallel prefix Ladner-Fischer logarithmic adder. Its primary role is subtraction: it subtracts the logarithmic adder's output from the initial input signal. This subtraction underlies the filter's adaptation mechanism, since it quantifies the deviation between the expected output, as characterized by the desired function, and the present output produced by the filter.

Desired block and error control block
The desired block is used to extract the error. It subtracts the output of the parallel prefix Ladner-Fischer logarithmic adder from the output of the desired function block. This is the block where the algorithm resides, and it is responsible for forming the loopback path in the system [26]-[28].

CONCLUSION
The DMPNLMS algorithm shows an improvement in MSE and convergence rate, as well as greater stability. The synthesis results show that it is area efficient and delay efficient, which makes it viable for applications where a higher speed of operation is required. Proportionate-type adaptive algorithms offer a substantial enhancement in the convergence performance of sparse adaptive filters compared to the traditional LMS algorithm. Nevertheless, the significant computational burden associated with these algorithms presents a formidable challenge for their implementation in VLSI. In response to this challenge, we have put forth a number of modifications aimed at simplifying the original proportionate-type normalized LMS (Pt-NLMS) algorithms. We have also introduced efficient VLSI designs tailored to these modified algorithms. Among our proposals, the DMPNLMS stands out as a robust VLSI solution. We believe that our research will encourage other researchers to explore more efficient hardware solutions, thus advancing the capabilities of sparse adaptive filter architectures through the use of streamlined arithmetic circuits.

Table 1 shows the comparison between the logic levels, area, fan-out and wire length of different types of parallel prefix logarithmic adders.

Table 1. Comparison of different types of parallel prefix adders

Table 2. Improvement on MSE for different algorithms

Table 3. Delay, area, and power reports for different algorithms with 32-bit and 64-bit filter lengths