VHDL implementation of FWL RLS algorithm
The Frisch-Waugh-Lovell (FWL) Recursive Least Squares (RLS) algorithm has recently been proposed as an RLS algorithm with lower computational cost and better numerical properties. We propose a VHDL design that has been successfully implemented on a Xilinx Virtex-7 FPGA. The FWL RLS algorithm has a complexity of L^2 + O(L) products, instead of 1.5L^2 + O(L) as in conventional RLS algorithms. Because it removes all matrix operations, separating a problem with an L-element input vector into L separate scalar problems, it is stable and often faster in fixed-point arithmetic than conventional RLS. An RLS filter with L inputs is composed of L stages, and the i-th stage (i = 1, 2, ..., L) has L + 2 - i inputs and L + 1 - i outputs. The implementation is based on two blocks: a scalar estimation block (EB), which is instantiated once per stage, and a filtering block (FB), of which L + 1 - i identical copies are instantiated in stage i. For an L-input RLS model, there are L EBs and L(L + 1)/2 FBs. Adding an input involves instantiating one additional EB and L + 1 FBs. Removing one input requires the removal of the first stage. The VHDL structure is modular and can be easily adjusted for different values of L. We also present estimated hardware costs over a wide range of L values.
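The block counts stated above can be checked with a short script. The following is a minimal sketch, not part of the VHDL design: the function name fwl_rls_block_counts is hypothetical, and the only assumption is the per-stage structure given in the abstract (one EB and L + 1 - i FBs in stage i).

```python
def fwl_rls_block_counts(L):
    """Return (EB count, FB count, FBs per stage) for an L-input FWL RLS filter.

    Hypothetical helper for illustration; it only encodes the structure
    described in the abstract: stage i has one EB and L + 1 - i FBs.
    """
    if L < 1:
        raise ValueError("L must be a positive integer")
    # Stages are numbered i = 1, 2, ..., L.
    fbs_per_stage = [L + 1 - i for i in range(1, L + 1)]
    return L, sum(fbs_per_stage), fbs_per_stage


if __name__ == "__main__":
    for L in (1, 2, 4, 8):
        ebs, fbs, per_stage = fwl_rls_block_counts(L)
        # The per-stage sum should match the closed form L(L + 1)/2.
        assert fbs == L * (L + 1) // 2
        print(f"L={L}: EBs={ebs}, FBs={fbs}, FBs per stage={per_stage}")
```

Under these assumptions, going from L to L + 1 inputs adds exactly one EB and L + 1 FBs, which corresponds to the new first stage.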