optimize: Hessian rescaling in Quasi-Newton methods
Created by: btracey
In the Quasi-Newton methods, especially close to the optimum, the search direction often ends up very nearly perpendicular to the gradient. In many cases this slows convergence by at least an order of magnitude or two. We should implement some form of detection switch for this case and do an (approximate) Hessian restart and/or some other form of conditioning.