
# Numerical optimization

```txt
.ml.optimize   Optimization functions
  BFGS         The Broyden-Fletcher-Goldfarb-Shanno algorithm
```

The .ml.optimize namespace contains functions that relate to the application of numerical optimization techniques. Such techniques are used to find local or global minima of user-provided objective functions and are central to many statistical models.

**Version 1.1.0**

Version 1.1.0 provides the initial set of numerical optimization tools in the Machine Learning Toolkit.

The Broyden-Fletcher-Goldfarb-Shanno algorithm is provided first because of its use in generating the SARIMA model shipped with the toolkit.

This functionality will be expanded over time to include more diverse optimization algorithms.

## Broyden-Fletcher-Goldfarb-Shanno algorithm

In numerical optimization, the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm is an iterative quasi-Newton method for solving unconstrained non-linear optimization problems. It belongs to the class of hill-climbing optimization techniques that seek a stationary point of a, preferably twice continuously differentiable, objective function.
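At each iteration the method maintains an approximation $B_k$ of the Hessian, built from gradient differences alone. For reference, the standard textbook form of the update (the toolkit's internal details may differ) is

$$B_{k+1} = B_k + \frac{y_k y_k^{\top}}{y_k^{\top} s_k} - \frac{B_k s_k s_k^{\top} B_k}{s_k^{\top} B_k s_k}, \qquad s_k = x_{k+1} - x_k, \quad y_k = \nabla f(x_{k+1}) - \nabla f(x_k)$$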

🌐 Rationale behind the algorithm


## .ml.optimize.BFGS

Optimize an objective function based on a provided initial guess using the BFGS algorithm

```txt
.ml.optimize.BFGS[func;x0;args;params]
```

Where

- `func` is a lambda or projection defining the objective function to be optimized. It should take one or two arguments, depending on whether the non-changing additional arguments `args` are required.
- `x0` is a numerical list denoting the initial guess for the numerical function arguments to be optimized.
- `args` is any additional non-changing arguments required by the function: a list or dictionary if such arguments are needed, otherwise `(::)` or `()`.
- `params` is optional parameters that modify the behavior of the algorithm. If no modifications are to be made this should be `(::)`; otherwise it should be a dictionary. The table below outlines all keys that can be changed from the default system behavior, together with their current default values; a construction sketch follows the table.

returns a dictionary containing:

| key       | content |
|-----------|---------|
| `xVals`   | optimized input arguments for the function, based on the initial guess `x0` |
| `funcRet` | return of the objective function at position `xVals` |
| `numIter` | number of iterations taken to reach the optimal values |
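The result is an ordinary q dictionary, so individual entries can be indexed directly. A minimal usage sketch (the objective function here is a placeholder):

```q
// minimize f(x) = x^2 from a starting guess of 3
q)res:.ml.optimize.BFGS[{xexp[x 0;2]};enlist 3f;();::]
q)res`xVals     / optimized argument list
q)res`numIter   / iterations taken
```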

Optional entries for `params`:

| key | type | default | explanation |
|-----|------|---------|-------------|
| `display` | boolean | `0b` | whether the results at each optimization iteration are printed to standard output |
| `optimIter` | integer | `0W` | maximum number of iterations before the optimization procedure is terminated |
| `zoomIter` | integer | `10` | maximum number of iterations when finding the optimal zoom position |
| `wolfeIter` | integer | `10` | maximum number of iterations when attempting to satisfy the strong Wolfe conditions |
| `norm` | integer | `0W` | order of the norm used to calculate the gradient norm: `0W` is the maximum value, `-0W` the minimum value; otherwise calculated as `sum[abs[vec]xexp norm]xexp 1%norm` |
| `gtol` | float | `1e-5` | gradient norm must be less than this value before successful termination |
| `geps` | float | `1.5e-8` | absolute step size used for numerical approximation of the Jacobian via forward differencing |
| `stepSize` | float | `0W` | maximum allowable `alpha` step size between calculations during the Wolfe-condition search |
| `c1` | float | `1e-4` | Armijo-rule condition used in the calculation of the strong Wolfe conditions |
| `c2` | float | `0.9` | curvature-rule condition used in the strong Wolfe-condition search |
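Multiple entries can be supplied in a single dictionary. The sketch below is illustrative only, mirroring the `` ` ``-prefixed dictionary idiom used in the examples that follow; the objective function and values are placeholders:

```q
// cap the optimization at 200 iterations and loosen the gradient tolerance
q)params:``optimIter`gtol!(::;200;1e-4)
q).ml.optimize.BFGS[{xexp[x 0;2]};enlist 5f;();params]
```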

The following examples outline the optimization algorithm in use across a number of functions and with various starting points in 1-D and 2-D space.

⚠️ **Consistency across platforms**

When applying optimization algorithms in kdb+, subtracting small values from large ones to generate the deltas that determine the optimization direction may produce inconsistent results across operating systems.

This is due to potential floating-point precision differences at the machine level and, more generally, to the pitfalls of floating-point subtraction. These issues may be seen in the application of `.ml.ts.SARIMA` and `.ml.optimize.BFGS`.

👉 Precision

**Example:** Test minimization on a function with a single global minimum

  1. Define a quadratic function to be minimized:

    $$f(x) = x^2 - 4x$$

  2. Test at starting point x0 = 4

  3. Test at starting point x1 = -2

  4. Plot results of both showing starting points and point of convergence
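Simple calculus confirms what the optimizer should return, since

$$f'(x) = 2x - 4 = 0 \;\Rightarrow\; x = 2, \qquad f(2) = -4$$

and both runs below converge to this point.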

```q
// function definition
q)func:{xexp[x[0];2]-4*x[0]}
// define initial starting condition as x0
q)x0:enlist 4f
// apply the BFGS optimization algorithm
q).ml.optimize.BFGS[func;x0;();::]
xVals  | ,2f
funcRet| -4f
numIter| 2

// define a different initial starting condition as x1
q)x1:enlist -2f
q).ml.optimize.BFGS[func;x1;();::]
xVals  | ,2f
funcRet| -4f
numIter| 3
```

*(figure: example 1 — plot of both runs showing starting points and point of convergence)*

**Example:** Test minimization on a function with multiple minima, demonstrating the functionality to print optimization information on a per-iteration basis

  1. Define a sinusoidal function to be minimized:

    $$f(x) = \sin(x)$$

  2. Test at starting point x0 = 7, displaying per-iteration output

  3. Test at starting point x1 = 8.5

  4. Plot results of both showing starting points and point of convergence
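For reference, the minima of $\sin(x)$ lie at $x = \frac{3\pi}{2} + 2k\pi$, so the starting point determines which minimum the search descends into:

$$\frac{3\pi}{2} \approx 4.712389 \quad (\text{from } x_0 = 7), \qquad \frac{7\pi}{2} \approx 10.9956 \quad (\text{from } x_1 = 8.5)$$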

```q
// function definition
q)func:{sin x 0}
// define initial starting condition as x0
q)x0:enlist 7f
// apply the BFGS optimization algorithm
q).ml.optimize.BFGS[func;x0;();``display!(::;1b)]
xk     | ,3.984391
fk     | -0.7465079
prev_fk| 0.6569866
gk     | ,-0.6653765
prev_xk| ,7f
hess   | ,,2.124748
gnorm  | 0.6653765
I      | ,,1f
idx    | 1

xk     | ,4.691269
fk     | -0.999777
prev_fk| -0.7465079
gk     | ,-0.02111795
prev_xk| ,3.984391
hess   | ,,1.097197
gnorm  | 0.02111795
I      | ,,1f
idx    | 2

xk     | ,4.71444
fk     | -0.9999979
prev_fk| -0.999777
gk     | ,0.002051036
prev_xk| ,4.691269
hess   | ,,1.000068
gnorm  | 0.002051036
I      | ,,1f
idx    | 3

xk     | ,4.712389
fk     | -1f
prev_fk| -0.9999979
gk     | ,-1.415721e-07
prev_xk| ,4.71444
hess   | ,,0.9999985
gnorm  | 1.415721e-07
I      | ,,1f
idx    | 4

xVals  | ,4.712389
funcRet| -1f
numIter| 4
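
// The per-iteration keys above are inferred from the printed output rather
// than documented: xk, fk and gk are the current point, objective value and
// gradient estimate; prev_xk and prev_fk their previous values; hess the
// running Hessian approximation; gnorm the gradient norm tested against
// gtol; and idx the iteration count.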

// define a different initial starting condition as x1
q)x1:enlist 8.5f
q).ml.optimize.BFGS[func;x1;();::]
xVals  | ,10.99556
funcRet| -1f
numIter| 3
```

*(figure: example 2 — plot of both runs showing starting points and points of convergence)*

**Example:** Test minimization of a multi-dimensional function, showing the effect of modifying the number of iterations allowed during optimization

  1. Define a function of two arguments to be minimized:

    $$f(x) = (x_0 - 1)^2 + (x_1 - 2.5)^2$$

  2. Test at starting point x[0] = 10, x[1] = 20 with an unconstrained number of iterations

  3. Test at starting point x[0] = 10, x[1] = 20 with a maximum of 3 iterations allowed
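By inspection, the function has a unique minimum at $x = (1, 2.5)$, where $f(x) = 0$; the unconstrained run below recovers this to within floating-point tolerance.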

```q
// define the function to be minimized
q)func:{xexp[x[0]-1;2]+xexp[x[1]-2.5;2]}

// define the initial starting point
q)x0:10 20f

// optimize the function with an unconstrained number of iterations
q).ml.optimize.BFGS[func;x0;();::]
xVals  | 0.9999915 2.500004
funcRet| 8.848111e-11
numIter| 4

// optimize the function stopping after 3 iterations
q).ml.optimize.BFGS[func;x0;();``optimIter!(::;3)]
xVals  | 2.473882 5.365867
funcRet| 10.38552
numIter| 3
```
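With `optimIter` capped at 3, the routine stops before the gradient-norm tolerance is met and returns the point reached after three iterations, still some distance from the true minimum at (1, 2.5).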