Large scale optimization

Preconditioned nonlinear conjugate gradient methods based on a modified secant equation

This paper presents a twofold contribution to the Nonlinear Conjugate Gradient (NCG) method in large scale unconstrained optimization. First, we carry out a theoretical analysis in which preconditioning is embedded in a strong convergence framework for an NCG method from the literature. Mild conditions on the preconditioners are defined in order to preserve NCG convergence. Second, we detail the use of novel matrix-free preconditioners for NCG.
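The following is a minimal sketch, not the authors' method, of a preconditioned nonlinear conjugate gradient iteration. It assumes the preconditioner acts matrix-free through a user-supplied function apply_M, a Fletcher-Reeves-type coefficient, and a simple Armijo backtracking line search; all names (pncg, apply_M) are hypothetical.

```python
import numpy as np

def pncg(f, grad, apply_M, x0, tol=1e-6, max_iter=1000):
    """Preconditioned NCG sketch: apply_M(v) applies a matrix-free preconditioner."""
    x = x0.copy()
    g = grad(x)
    z = apply_M(g)              # preconditioned gradient
    d = -z                      # initial search direction
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        # Backtracking Armijo line search (illustrative choice, not the paper's rule)
        alpha, c, rho = 1.0, 1e-4, 0.5
        while f(x + alpha * d) > f(x) + c * alpha * g.dot(d):
            alpha *= rho
        x_new = x + alpha * d
        g_new = grad(x_new)
        z_new = apply_M(g_new)
        beta = g_new.dot(z_new) / g.dot(z)   # Fletcher-Reeves-type coefficient
        d = -z_new + beta * d                # preconditioned direction update
        x, g, z = x_new, g_new, z_new
    return x
```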

Block layer decomposition schemes for training deep neural networks

Weight estimation for deep feedforward neural networks (DFNNs) relies on the solution of a very large nonconvex optimization problem that may have many local (non-global) minimizers, saddle points and large plateaus. Furthermore, the time needed to find good solutions of the training problem depends heavily on both the number of samples and the number of weights (variables). In this work, we show how block coordinate descent (BCD) methods can be fruitfully applied to the DFNN weight optimization problem and embedded in online frameworks, possibly avoiding bad stationary points.
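Below is a minimal sketch of the block-coordinate idea applied layer by layer, not the paper's scheme: the weights are split into per-layer blocks and each block is updated by a few gradient steps while the other blocks stay fixed. It assumes a two-layer network with a tanh hidden layer and squared loss; all names (bcd_train, inner_steps) are hypothetical.

```python
import numpy as np

def bcd_train(X, y, n_hidden=16, outer_iters=100, inner_steps=5, lr=1e-2, seed=0):
    """Block layer decomposition sketch: alternate gradient steps on W1 and W2."""
    rng = np.random.default_rng(seed)
    W1 = rng.standard_normal((X.shape[1], n_hidden)) * 0.1   # block 1: hidden layer
    W2 = rng.standard_normal((n_hidden, 1)) * 0.1            # block 2: output layer
    for _ in range(outer_iters):
        # Block 1: update W1 with W2 frozen
        for _ in range(inner_steps):
            H = np.tanh(X @ W1)
            err = H @ W2 - y
            gH = (err @ W2.T) * (1.0 - H**2)      # backprop through tanh
            W1 -= lr * (X.T @ gH) / len(X)
        # Block 2: update W2 with W1 frozen
        H = np.tanh(X @ W1)
        for _ in range(inner_steps):
            err = H @ W2 - y
            W2 -= lr * (H.T @ err) / len(X)
    return W1, W2
```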
