# .ml.optimize

Optimization functions

## BFGS: the Broyden-Fletcher-Goldfarb-Shanno algorithm
The `.ml.optimize` namespace contains functions relating to the application of numerical optimization techniques. Such techniques are used to find local or global minima of user-provided objective functions and are central to many statistical models.
Version 1.1.0
Version 1.1.0 provides the initial versions of the numerical optimization tools in the Machine Learning Toolkit. The Broyden-Fletcher-Goldfarb-Shanno algorithm is provided first because of its use in the generation of the SARIMA model supplied with the toolkit. This functionality will be expanded over time to include more diverse optimization algorithms.
In numerical optimization, the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm is a quasi-Newton iterative method for solving unconstrained non-linear optimization problems. This is a class of hill-climbing optimization techniques that seek a stationary, preferably twice-differentiable, solution to the objective function.
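In outline, the method maintains an approximation $H_k$ to the inverse Hessian, built from successive gradient evaluations rather than from second derivatives. A standard form of the update is shown here for context; it is not taken from the toolkit implementation:

$$H_{k+1} = \left(I - \rho_k s_k y_k^{T}\right) H_k \left(I - \rho_k y_k s_k^{T}\right) + \rho_k s_k s_k^{T}, \qquad \rho_k = \frac{1}{y_k^{T} s_k}$$

where $s_k = x_{k+1} - x_k$ is the step taken at iteration $k$ and $y_k = \nabla f(x_{k+1}) - \nabla f(x_k)$ is the corresponding change in gradient.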
🌐 Rationale behind the algorithm
Optimize an objective function from a provided initial guess using the BFGS algorithm.

```q
.ml.optimize.BFGS[func;x0;args;params]
```
Where

-   `func` is a lambda or projection defining the objective function to be optimized. It takes one or two arguments, depending on whether non-changing additional arguments `args` are required.
-   `x0` is a numerical list denoting the initial guess for the function arguments which are to be optimized.
-   `args` are any additional non-changing arguments required by the function: a list or dictionary if such arguments are required, otherwise `(::)` or `()`.
-   `params` are optional parameters which can be used to modify the behavior of the algorithm: a dictionary, or `(::)` if no modifications are to be made. The keys listed below outline all possible changes to the default system behavior, along with the currently-accepted default values.
returns a dictionary containing:

key | content
---|---
xVals | optimized function arguments based on the initial guess x0
funcRet | return of the objective function at position xVals
numIter | number of iterations taken to reach the optimal values
Optional entries for `params`:

key | type | default | explanation
---|---|---|---
display | boolean | 0b | whether the results of each optimization iteration are printed to standard output
optimIter | integer | 0W | maximum number of iterations before the optimization procedure is terminated
zoomIter | integer | 10 | maximum number of iterations when finding the optimal zoom position
wolfeIter | integer | 10 | maximum number of iterations when attempting to satisfy the strong Wolfe conditions
norm | integer | 0W | order of the norm used to calculate the gradient norm: 0W = maximum value, -0W = minimum value; otherwise calculated via sum[abs[vec]xexp norm]xexp 1%norm
gtol | float | 1e-5 | gradient norm must be less than this value before successful termination
geps | float | 1.5e-8 | absolute step size used for numerical approximation of the Jacobian via forward differencing
stepSize | float | 0W | maximum allowable alpha step size between calculations during the Wolfe condition search
c1 | float | 1e-4 | Armijo rule condition used in the calculation of the strong Wolfe conditions
c2 | float | 0.9 | curvature rule condition used in the strong Wolfe condition search
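As an illustrative sketch (not taken from the toolkit documentation), several of these keys can be combined in a single dictionary; the leading empty key mapped to the generic null mirrors the dictionary syntax used in the examples below:

```q
// hypothetical settings: cap the optimizer at 20 iterations and
// print per-iteration output to standard output
q)params:``optimIter`display!(::;20;1b)
// pass the dictionary as the final argument
q).ml.optimize.BFGS[{xexp[x 0;2]-4*x 0};enlist 4f;();params]
```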
The following examples outline the optimization algorithm in use across a number of functions and with various starting points in 1-D and 2-D space.
⚠️ Consistency across platforms
When applying optimization algorithms in kdb+, subtracting small values from large ones to generate the deltas that determine the optimization direction may produce inconsistent results across operating systems. This is due to potential floating-point precision differences at the machine level, and to issues with subtraction of floating-point numbers more generally. These issues may be seen in the application of `.ml.ts.SARIMA` and `.ml.optimize.BFGS`.
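The rounding error involved can be seen directly. q evaluates right-to-left, so the expression below computes 0.3-(0.1+0.2), and only the floating-point rounding error of the two nearly equal values survives the subtraction:

```q
q)0.3-0.1+0.2    / difference of two nearly equal doubles
-5.551115e-17
```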
Example: Test minimization of a function with a single global minimum.

-   Define a quadratic equation to be minimized: $$f(x) = x^{2} - 4x$$
-   Test at starting point x0 = 4
-   Test at starting point x0 = -2
-   Plot the results of both, showing the starting points and the point of convergence
```q
// function definition
q)func:{xexp[x[0];2]-4*x[0]}
// define initial starting condition as x0
q)x0:enlist 4f
// apply the BFGS optimization algorithm
q).ml.optimize.BFGS[func;x0;();::]
xVals  | ,2f
funcRet| -4f
numIter| 2
// define a different initial starting condition as x1
q)x1:enlist -2f
q).ml.optimize.BFGS[func;x1;();::]
xVals  | ,2f
funcRet| -4f
numIter| 3
```
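None of the examples on this page pass a non-empty `args` parameter. As an illustrative sketch (not from the toolkit documentation), assuming the two-argument form `func[x;args]` described above, the minimum of the quadratic can be shifted by a non-changing argument supplied through `args`:

```q
// hypothetical objective: minimize (x[0]-a)^2, with offset a passed via args
q)funcArgs:{[x;a]xexp[x[0]-first a;2]}
q)x0:enlist 0f
// pass the offset 3f as the args list; xVals should converge near ,3f
q).ml.optimize.BFGS[funcArgs;x0;enlist 3f;::]
```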
Example: Test minimization of a function with multiple minima, demonstrating the functionality to print optimization information on a per-iteration basis.

-   Define a sinusoidal equation to be minimized: $$f(x) = \sin(x)$$
-   Test at starting point x0 = 7, displaying per-iteration output
-   Test at starting point x1 = 8.5
-   Plot the results of both, showing the starting points and the point of convergence
```q
// function definition
q)func:{sin x 0}
// define initial starting condition as x0
q)x0:enlist 7f
// apply the BFGS optimization algorithm
q).ml.optimize.BFGS[func;x0;();``display!(::;1b)]
xk     | ,3.984391
fk     | -0.7465079
prev_fk| 0.6569866
gk     | ,-0.6653765
prev_xk| ,7f
hess   | ,,2.124748
gnorm  | 0.6653765
I      | ,,1f
idx    | 1
xk     | ,4.691269
fk     | -0.999777
prev_fk| -0.7465079
gk     | ,-0.02111795
prev_xk| ,3.984391
hess   | ,,1.097197
gnorm  | 0.02111795
I      | ,,1f
idx    | 2
xk     | ,4.71444
fk     | -0.9999979
prev_fk| -0.999777
gk     | ,0.002051036
prev_xk| ,4.691269
hess   | ,,1.000068
gnorm  | 0.002051036
I      | ,,1f
idx    | 3
xk     | ,4.712389
fk     | -1f
prev_fk| -0.9999979
gk     | ,-1.415721e-07
prev_xk| ,4.71444
hess   | ,,0.9999985
gnorm  | 1.415721e-07
I      | ,,1f
idx    | 4
xVals  | ,4.712389
funcRet| -1f
numIter| 4
// define a different initial starting condition as x1
q)x1:enlist 8.5f
q).ml.optimize.BFGS[func;x1;();::]
xVals  | ,10.99556
funcRet| -1f
numIter| 3
```
Example: Test minimization of a multi-dimensional function, showing the effect of modifying the number of allowed iterations.

-   Define a function which optimizes on two arguments: $$f(x) = (x[0]-1)^{2} + (x[1]-2.5)^{2}$$
-   Test at starting point x[0] = 10, x[1] = 20 with an unconstrained number of iterations
-   Test at starting point x[0] = 10, x[1] = 20 with a maximum of 3 iterations allowed
```q
// define the function to be minimized
q)func:{xexp[x[0]-1;2]+xexp[x[1]-2.5;2]}
// define the initial starting point
q)x0:10 20f
// optimize the function with an unconstrained number of iterations
q).ml.optimize.BFGS[func;x0;();::]
xVals  | 0.9999915 2.500004
funcRet| 8.848111e-11
numIter| 4
// optimize the function stopping after 3 iterations
q).ml.optimize.BFGS[func;x0;();``optimIter!(::;3)]
xVals  | 2.473882 5.365867
funcRet| 10.38552
numIter| 3
```