first commit

This commit is contained in:
Carla Floricel
2022-08-02 09:52:52 -04:00
parent 417ea8660b
commit 05e52aa52b
10444 changed files with 2300232 additions and 0 deletions

View File

@@ -0,0 +1,434 @@
"""
=====================================================
Optimization and root finding (:mod:`scipy.optimize`)
=====================================================
.. currentmodule:: scipy.optimize
SciPy ``optimize`` provides functions for minimizing (or maximizing)
objective functions, possibly subject to constraints. It includes
solvers for nonlinear problems (with support for both local and global
optimization algorithms), linear programing, constrained
and nonlinear least-squares, root finding, and curve fitting.
Common functions and objects, shared across different solvers, are:
.. autosummary::
:toctree: generated/
show_options - Show specific options optimization solvers.
OptimizeResult - The optimization result returned by some optimizers.
OptimizeWarning - The optimization encountered problems.
Optimization
============
Scalar functions optimization
-----------------------------
.. autosummary::
:toctree: generated/
minimize_scalar - Interface for minimizers of univariate functions
The `minimize_scalar` function supports the following methods:
.. toctree::
optimize.minimize_scalar-brent
optimize.minimize_scalar-bounded
optimize.minimize_scalar-golden
Local (multivariate) optimization
---------------------------------
.. autosummary::
:toctree: generated/
minimize - Interface for minimizers of multivariate functions.
The `minimize` function supports the following methods:
.. toctree::
optimize.minimize-neldermead
optimize.minimize-powell
optimize.minimize-cg
optimize.minimize-bfgs
optimize.minimize-newtoncg
optimize.minimize-lbfgsb
optimize.minimize-tnc
optimize.minimize-cobyla
optimize.minimize-slsqp
optimize.minimize-trustconstr
optimize.minimize-dogleg
optimize.minimize-trustncg
optimize.minimize-trustkrylov
optimize.minimize-trustexact
Constraints are passed to `minimize` function as a single object or
as a list of objects from the following classes:
.. autosummary::
:toctree: generated/
NonlinearConstraint - Class defining general nonlinear constraints.
LinearConstraint - Class defining general linear constraints.
Simple bound constraints are handled separately and there is a special class
for them:
.. autosummary::
:toctree: generated/
Bounds - Bound constraints.
Quasi-Newton strategies implementing `HessianUpdateStrategy`
interface can be used to approximate the Hessian in `minimize`
function (available only for the 'trust-constr' method). Available
quasi-Newton methods implementing this interface are:
.. autosummary::
:toctree: generated/
BFGS - Broyden-Fletcher-Goldfarb-Shanno (BFGS) Hessian update strategy.
SR1 - Symmetric-rank-1 Hessian update strategy.
Global optimization
-------------------
.. autosummary::
:toctree: generated/
basinhopping - Basinhopping stochastic optimizer.
brute - Brute force searching optimizer.
differential_evolution - stochastic minimization using differential evolution.
shgo - simplicial homology global optimisation
dual_annealing - Dual annealing stochastic optimizer.
Least-squares and curve fitting
===============================
Nonlinear least-squares
-----------------------
.. autosummary::
:toctree: generated/
least_squares - Solve a nonlinear least-squares problem with bounds on the variables.
Linear least-squares
--------------------
.. autosummary::
:toctree: generated/
nnls - Linear least-squares problem with non-negativity constraint.
lsq_linear - Linear least-squares problem with bound constraints.
Curve fitting
-------------
.. autosummary::
:toctree: generated/
curve_fit -- Fit curve to a set of points.
Root finding
============
Scalar functions
----------------
.. autosummary::
:toctree: generated/
root_scalar - Unified interface for nonlinear solvers of scalar functions.
brentq - quadratic interpolation Brent method.
brenth - Brent method, modified by Harris with hyperbolic extrapolation.
ridder - Ridder's method.
bisect - Bisection method.
newton - Newton's method (also Secant and Halley's methods).
toms748 - Alefeld, Potra & Shi Algorithm 748.
RootResults - The root finding result returned by some root finders.
The `root_scalar` function supports the following methods:
.. toctree::
optimize.root_scalar-brentq
optimize.root_scalar-brenth
optimize.root_scalar-bisect
optimize.root_scalar-ridder
optimize.root_scalar-newton
optimize.root_scalar-toms748
optimize.root_scalar-secant
optimize.root_scalar-halley
The table below lists situations and appropriate methods, along with
*asymptotic* convergence rates per iteration (and per function evaluation)
for successful convergence to a simple root(*).
Bisection is the slowest of them all, adding one bit of accuracy for each
function evaluation, but is guaranteed to converge.
The other bracketing methods all (eventually) increase the number of accurate
bits by about 50% for every function evaluation.
The derivative-based methods, all built on `newton`, can converge quite quickly
if the initial value is close to the root. They can also be applied to
functions defined on (a subset of) the complex plane.
+-------------+----------+----------+-----------+-------------+-------------+----------------+
| Domain of f | Bracket? | Derivatives? | Solvers | Convergence |
+ + +----------+-----------+ +-------------+----------------+
| | | `fprime` | `fprime2` | | Guaranteed? | Rate(s)(*) |
+=============+==========+==========+===========+=============+=============+================+
| `R` | Yes | N/A | N/A | - bisection | - Yes | - 1 "Linear" |
| | | | | - brentq | - Yes | - >=1, <= 1.62 |
| | | | | - brenth | - Yes | - >=1, <= 1.62 |
| | | | | - ridder | - Yes | - 2.0 (1.41) |
| | | | | - toms748 | - Yes | - 2.7 (1.65) |
+-------------+----------+----------+-----------+-------------+-------------+----------------+
| `R` or `C` | No | No | No | secant | No | 1.62 (1.62) |
+-------------+----------+----------+-----------+-------------+-------------+----------------+
| `R` or `C` | No | Yes | No | newton | No | 2.00 (1.41) |
+-------------+----------+----------+-----------+-------------+-------------+----------------+
| `R` or `C` | No | Yes | Yes | halley | No | 3.00 (1.44) |
+-------------+----------+----------+-----------+-------------+-------------+----------------+
.. seealso::
`scipy.optimize.cython_optimize` -- Typed Cython versions of zeros functions
Fixed point finding:
.. autosummary::
:toctree: generated/
fixed_point - Single-variable fixed-point solver.
Multidimensional
----------------
.. autosummary::
:toctree: generated/
root - Unified interface for nonlinear solvers of multivariate functions.
The `root` function supports the following methods:
.. toctree::
optimize.root-hybr
optimize.root-lm
optimize.root-broyden1
optimize.root-broyden2
optimize.root-anderson
optimize.root-linearmixing
optimize.root-diagbroyden
optimize.root-excitingmixing
optimize.root-krylov
optimize.root-dfsane
Linear programming
==================
.. autosummary::
:toctree: generated/
linprog -- Unified interface for minimizers of linear programming problems.
The `linprog` function supports the following methods:
.. toctree::
optimize.linprog-simplex
optimize.linprog-interior-point
optimize.linprog-revised_simplex
optimize.linprog-highs-ipm
optimize.linprog-highs-ds
optimize.linprog-highs
The simplex, interior-point, and revised simplex methods support callback
functions, such as:
.. autosummary::
:toctree: generated/
linprog_verbose_callback -- Sample callback function for linprog (simplex).
Assignment problems
===================
.. autosummary::
:toctree: generated/
linear_sum_assignment -- Solves the linear-sum assignment problem.
quadratic_assignment -- Solves the quadratic assignment problem.
The `quadratic_assignment` function supports the following methods:
.. toctree::
optimize.qap-faq
optimize.qap-2opt
Utilities
=========
Finite-difference approximation
-------------------------------
.. autosummary::
:toctree: generated/
approx_fprime - Approximate the gradient of a scalar function.
check_grad - Check the supplied derivative using finite differences.
Line search
-----------
.. autosummary::
:toctree: generated/
bracket - Bracket a minimum, given two starting points.
line_search - Return a step that satisfies the strong Wolfe conditions.
Hessian approximation
---------------------
.. autosummary::
:toctree: generated/
LbfgsInvHessProduct - Linear operator for L-BFGS approximate inverse Hessian.
HessianUpdateStrategy - Interface for implementing Hessian update strategies
Benchmark problems
------------------
.. autosummary::
:toctree: generated/
rosen - The Rosenbrock function.
rosen_der - The derivative of the Rosenbrock function.
rosen_hess - The Hessian matrix of the Rosenbrock function.
rosen_hess_prod - Product of the Rosenbrock Hessian with a vector.
Legacy functions
================
The functions below are not recommended for use in new scripts;
all of these methods are accessible via a newer, more consistent
interfaces, provided by the interfaces above.
Optimization
------------
General-purpose multivariate methods:
.. autosummary::
:toctree: generated/
fmin - Nelder-Mead Simplex algorithm.
fmin_powell - Powell's (modified) level set method.
fmin_cg - Non-linear (Polak-Ribiere) conjugate gradient algorithm.
fmin_bfgs - Quasi-Newton method (Broydon-Fletcher-Goldfarb-Shanno).
fmin_ncg - Line-search Newton Conjugate Gradient.
Constrained multivariate methods:
.. autosummary::
:toctree: generated/
fmin_l_bfgs_b - Zhu, Byrd, and Nocedal's constrained optimizer.
fmin_tnc - Truncated Newton code.
fmin_cobyla - Constrained optimization by linear approximation.
fmin_slsqp - Minimization using sequential least-squares programming.
Univariate (scalar) minimization methods:
.. autosummary::
:toctree: generated/
fminbound - Bounded minimization of a scalar function.
brent - 1-D function minimization using Brent method.
golden - 1-D function minimization using Golden Section method.
Least-squares
-------------
.. autosummary::
:toctree: generated/
leastsq - Minimize the sum of squares of M equations in N unknowns.
Root finding
------------
General nonlinear solvers:
.. autosummary::
:toctree: generated/
fsolve - Non-linear multivariable equation solver.
broyden1 - Broyden's first method.
broyden2 - Broyden's second method.
Large-scale nonlinear solvers:
.. autosummary::
:toctree: generated/
newton_krylov
anderson
Simple iteration solvers:
.. autosummary::
:toctree: generated/
excitingmixing
linearmixing
diagbroyden
"""
from ._optimize import *
from ._minimize import *
from ._root import *
from ._root_scalar import *
from ._minpack_py import *
from ._zeros_py import *
from ._lbfgsb_py import fmin_l_bfgs_b, LbfgsInvHessProduct
from ._tnc import fmin_tnc
from ._cobyla_py import fmin_cobyla
from ._nonlin import *
from ._slsqp_py import fmin_slsqp
from ._nnls import nnls
from ._basinhopping import basinhopping
from ._linprog import linprog, linprog_verbose_callback
from ._lsap import linear_sum_assignment
from ._differentialevolution import differential_evolution
from ._lsq import least_squares, lsq_linear
from ._constraints import (NonlinearConstraint,
LinearConstraint,
Bounds)
from ._hessian_update_strategy import HessianUpdateStrategy, BFGS, SR1
from ._shgo import shgo
from ._dual_annealing import dual_annealing
from ._qap import quadratic_assignment
# Deprecated namespaces, to be removed in v2.0.0
from . import (
cobyla, lbfgsb, linesearch, minpack, minpack2, moduleTNC, nonlin, optimize,
slsqp, tnc, zeros
)
__all__ = [s for s in dir() if not s.startswith('_')]
from scipy._lib._testutils import PytestTester
test = PytestTester(__name__)
del PytestTester

View File

@@ -0,0 +1,21 @@
from __future__ import annotations
from typing import TYPE_CHECKING, Tuple
import numpy as np
if TYPE_CHECKING:
import numpy.typing as npt
def nnls(
a: npt.ArrayLike,
mda: int,
m: int,
n: int,
b: npt.ArrayLike,
x: npt.ArrayLike,
rnorm: float,
w: float,
zz: float,
index_bn: int,
mode: int,
maxiter: int
) -> Tuple[npt.ArrayLike, float, int]: ...

View File

@@ -0,0 +1,796 @@
"""
basinhopping: The basinhopping global optimization algorithm
"""
import numpy as np
import math
from numpy import cos, sin
import scipy.optimize
from scipy._lib._util import check_random_state
__all__ = ['basinhopping']
class Storage:
"""
Class used to store the lowest energy structure
"""
def __init__(self, minres):
self._add(minres)
def _add(self, minres):
self.minres = minres
self.minres.x = np.copy(minres.x)
def update(self, minres):
if minres.fun < self.minres.fun:
self._add(minres)
return True
else:
return False
def get_lowest(self):
return self.minres
class BasinHoppingRunner:
"""This class implements the core of the basinhopping algorithm.
x0 : ndarray
The starting coordinates.
minimizer : callable
The local minimizer, with signature ``result = minimizer(x)``.
The return value is an `optimize.OptimizeResult` object.
step_taking : callable
This function displaces the coordinates randomly. Signature should
be ``x_new = step_taking(x)``. Note that `x` may be modified in-place.
accept_tests : list of callables
Each test is passed the kwargs `f_new`, `x_new`, `f_old` and
`x_old`. These tests will be used to judge whether or not to accept
the step. The acceptable return values are True, False, or ``"force
accept"``. If any of the tests return False then the step is rejected.
If ``"force accept"``, then this will override any other tests in
order to accept the step. This can be used, for example, to forcefully
escape from a local minimum that ``basinhopping`` is trapped in.
disp : bool, optional
Display status messages.
"""
def __init__(self, x0, minimizer, step_taking, accept_tests, disp=False):
self.x = np.copy(x0)
self.minimizer = minimizer
self.step_taking = step_taking
self.accept_tests = accept_tests
self.disp = disp
self.nstep = 0
# initialize return object
self.res = scipy.optimize.OptimizeResult()
self.res.minimization_failures = 0
# do initial minimization
minres = minimizer(self.x)
if not minres.success:
self.res.minimization_failures += 1
if self.disp:
print("warning: basinhopping: local minimization failure")
self.x = np.copy(minres.x)
self.energy = minres.fun
if self.disp:
print("basinhopping step %d: f %g" % (self.nstep, self.energy))
# initialize storage class
self.storage = Storage(minres)
if hasattr(minres, "nfev"):
self.res.nfev = minres.nfev
if hasattr(minres, "njev"):
self.res.njev = minres.njev
if hasattr(minres, "nhev"):
self.res.nhev = minres.nhev
def _monte_carlo_step(self):
"""Do one Monte Carlo iteration
Randomly displace the coordinates, minimize, and decide whether
or not to accept the new coordinates.
"""
# Take a random step. Make a copy of x because the step_taking
# algorithm might change x in place
x_after_step = np.copy(self.x)
x_after_step = self.step_taking(x_after_step)
# do a local minimization
minres = self.minimizer(x_after_step)
x_after_quench = minres.x
energy_after_quench = minres.fun
if not minres.success:
self.res.minimization_failures += 1
if self.disp:
print("warning: basinhopping: local minimization failure")
if hasattr(minres, "nfev"):
self.res.nfev += minres.nfev
if hasattr(minres, "njev"):
self.res.njev += minres.njev
if hasattr(minres, "nhev"):
self.res.nhev += minres.nhev
# accept the move based on self.accept_tests. If any test is False,
# then reject the step. If any test returns the special string
# 'force accept', then accept the step regardless. This can be used
# to forcefully escape from a local minimum if normal basin hopping
# steps are not sufficient.
accept = True
for test in self.accept_tests:
testres = test(f_new=energy_after_quench, x_new=x_after_quench,
f_old=self.energy, x_old=self.x)
if testres == 'force accept':
accept = True
break
elif testres is None:
raise ValueError("accept_tests must return True, False, or "
"'force accept'")
elif not testres:
accept = False
# Report the result of the acceptance test to the take step class.
# This is for adaptive step taking
if hasattr(self.step_taking, "report"):
self.step_taking.report(accept, f_new=energy_after_quench,
x_new=x_after_quench, f_old=self.energy,
x_old=self.x)
return accept, minres
def one_cycle(self):
"""Do one cycle of the basinhopping algorithm
"""
self.nstep += 1
new_global_min = False
accept, minres = self._monte_carlo_step()
if accept:
self.energy = minres.fun
self.x = np.copy(minres.x)
new_global_min = self.storage.update(minres)
# print some information
if self.disp:
self.print_report(minres.fun, accept)
if new_global_min:
print("found new global minimum on step %d with function"
" value %g" % (self.nstep, self.energy))
# save some variables as BasinHoppingRunner attributes
self.xtrial = minres.x
self.energy_trial = minres.fun
self.accept = accept
return new_global_min
def print_report(self, energy_trial, accept):
"""print a status update"""
minres = self.storage.get_lowest()
print("basinhopping step %d: f %g trial_f %g accepted %d "
" lowest_f %g" % (self.nstep, self.energy, energy_trial,
accept, minres.fun))
class AdaptiveStepsize:
"""
Class to implement adaptive stepsize.
This class wraps the step taking class and modifies the stepsize to
ensure the true acceptance rate is as close as possible to the target.
Parameters
----------
takestep : callable
The step taking routine. Must contain modifiable attribute
takestep.stepsize
accept_rate : float, optional
The target step acceptance rate
interval : int, optional
Interval for how often to update the stepsize
factor : float, optional
The step size is multiplied or divided by this factor upon each
update.
verbose : bool, optional
Print information about each update
"""
def __init__(self, takestep, accept_rate=0.5, interval=50, factor=0.9,
verbose=True):
self.takestep = takestep
self.target_accept_rate = accept_rate
self.interval = interval
self.factor = factor
self.verbose = verbose
self.nstep = 0
self.nstep_tot = 0
self.naccept = 0
def __call__(self, x):
return self.take_step(x)
def _adjust_step_size(self):
old_stepsize = self.takestep.stepsize
accept_rate = float(self.naccept) / self.nstep
if accept_rate > self.target_accept_rate:
# We're accepting too many steps. This generally means we're
# trapped in a basin. Take bigger steps.
self.takestep.stepsize /= self.factor
else:
# We're not accepting enough steps. Take smaller steps.
self.takestep.stepsize *= self.factor
if self.verbose:
print("adaptive stepsize: acceptance rate %f target %f new "
"stepsize %g old stepsize %g" % (accept_rate,
self.target_accept_rate, self.takestep.stepsize,
old_stepsize))
def take_step(self, x):
self.nstep += 1
self.nstep_tot += 1
if self.nstep % self.interval == 0:
self._adjust_step_size()
return self.takestep(x)
def report(self, accept, **kwargs):
"called by basinhopping to report the result of the step"
if accept:
self.naccept += 1
class RandomDisplacement:
"""Add a random displacement of maximum size `stepsize` to each coordinate.
Calling this updates `x` in-place.
Parameters
----------
stepsize : float, optional
Maximum stepsize in any dimension
random_gen : {None, int, `numpy.random.Generator`,
`numpy.random.RandomState`}, optional
If `seed` is None (or `np.random`), the `numpy.random.RandomState`
singleton is used.
If `seed` is an int, a new ``RandomState`` instance is used,
seeded with `seed`.
If `seed` is already a ``Generator`` or ``RandomState`` instance then
that instance is used.
"""
def __init__(self, stepsize=0.5, random_gen=None):
self.stepsize = stepsize
self.random_gen = check_random_state(random_gen)
def __call__(self, x):
x += self.random_gen.uniform(-self.stepsize, self.stepsize,
np.shape(x))
return x
class MinimizerWrapper:
"""
wrap a minimizer function as a minimizer class
"""
def __init__(self, minimizer, func=None, **kwargs):
self.minimizer = minimizer
self.func = func
self.kwargs = kwargs
def __call__(self, x0):
if self.func is None:
return self.minimizer(x0, **self.kwargs)
else:
return self.minimizer(self.func, x0, **self.kwargs)
class Metropolis:
"""Metropolis acceptance criterion.
Parameters
----------
T : float
The "temperature" parameter for the accept or reject criterion.
random_gen : {None, int, `numpy.random.Generator`,
`numpy.random.RandomState`}, optional
If `seed` is None (or `np.random`), the `numpy.random.RandomState`
singleton is used.
If `seed` is an int, a new ``RandomState`` instance is used,
seeded with `seed`.
If `seed` is already a ``Generator`` or ``RandomState`` instance then
that instance is used.
Random number generator used for acceptance test.
"""
def __init__(self, T, random_gen=None):
# Avoid ZeroDivisionError since "MBH can be regarded as a special case
# of the BH framework with the Metropolis criterion, where temperature
# T = 0." (Reject all steps that increase energy.)
self.beta = 1.0 / T if T != 0 else float('inf')
self.random_gen = check_random_state(random_gen)
def accept_reject(self, energy_new, energy_old):
"""
If new energy is lower than old, it will always be accepted.
If new is higher than old, there is a chance it will be accepted,
less likely for larger differences.
"""
with np.errstate(invalid='ignore'):
# The energy values being fed to Metropolis are 1-length arrays, and if
# they are equal, their difference is 0, which gets multiplied by beta,
# which is inf, and array([0]) * float('inf') causes
#
# RuntimeWarning: invalid value encountered in multiply
#
# Ignore this warning so so when the algorithm is on a flat plane, it always
# accepts the step, to try to move off the plane.
prod = -(energy_new - energy_old) * self.beta
w = math.exp(min(0, prod))
rand = self.random_gen.uniform()
return w >= rand
def __call__(self, **kwargs):
"""
f_new and f_old are mandatory in kwargs
"""
return bool(self.accept_reject(kwargs["f_new"],
kwargs["f_old"]))
def basinhopping(func, x0, niter=100, T=1.0, stepsize=0.5,
minimizer_kwargs=None, take_step=None, accept_test=None,
callback=None, interval=50, disp=False, niter_success=None,
seed=None, *, target_accept_rate=0.5, stepwise_factor=0.9):
"""Find the global minimum of a function using the basin-hopping algorithm.
Basin-hopping is a two-phase method that combines a global stepping
algorithm with local minimization at each step. Designed to mimic
the natural process of energy minimization of clusters of atoms, it works
well for similar problems with "funnel-like, but rugged" energy landscapes
[5]_.
As the step-taking, step acceptance, and minimization methods are all
customizable, this function can also be used to implement other two-phase
methods.
Parameters
----------
func : callable ``f(x, *args)``
Function to be optimized. ``args`` can be passed as an optional item
in the dict ``minimizer_kwargs``
x0 : array_like
Initial guess.
niter : integer, optional
The number of basin-hopping iterations. There will be a total of
``niter + 1`` runs of the local minimizer.
T : float, optional
The "temperature" parameter for the accept or reject criterion. Higher
"temperatures" mean that larger jumps in function value will be
accepted. For best results ``T`` should be comparable to the
separation (in function value) between local minima.
stepsize : float, optional
Maximum step size for use in the random displacement.
minimizer_kwargs : dict, optional
Extra keyword arguments to be passed to the local minimizer
``scipy.optimize.minimize()`` Some important options could be:
method : str
The minimization method (e.g. ``"L-BFGS-B"``)
args : tuple
Extra arguments passed to the objective function (``func``) and
its derivatives (Jacobian, Hessian).
take_step : callable ``take_step(x)``, optional
Replace the default step-taking routine with this routine. The default
step-taking routine is a random displacement of the coordinates, but
other step-taking algorithms may be better for some systems.
``take_step`` can optionally have the attribute ``take_step.stepsize``.
If this attribute exists, then ``basinhopping`` will adjust
``take_step.stepsize`` in order to try to optimize the global minimum
search.
accept_test : callable, ``accept_test(f_new=f_new, x_new=x_new, f_old=fold, x_old=x_old)``, optional
Define a test which will be used to judge whether or not to accept the
step. This will be used in addition to the Metropolis test based on
"temperature" ``T``. The acceptable return values are True,
False, or ``"force accept"``. If any of the tests return False
then the step is rejected. If the latter, then this will override any
other tests in order to accept the step. This can be used, for example,
to forcefully escape from a local minimum that ``basinhopping`` is
trapped in.
callback : callable, ``callback(x, f, accept)``, optional
A callback function which will be called for all minima found. ``x``
and ``f`` are the coordinates and function value of the trial minimum,
and ``accept`` is whether or not that minimum was accepted. This can
be used, for example, to save the lowest N minima found. Also,
``callback`` can be used to specify a user defined stop criterion by
optionally returning True to stop the ``basinhopping`` routine.
interval : integer, optional
interval for how often to update the ``stepsize``
disp : bool, optional
Set to True to print status messages
niter_success : integer, optional
Stop the run if the global minimum candidate remains the same for this
number of iterations.
seed : {None, int, `numpy.random.Generator`,
`numpy.random.RandomState`}, optional
If `seed` is None (or `np.random`), the `numpy.random.RandomState`
singleton is used.
If `seed` is an int, a new ``RandomState`` instance is used,
seeded with `seed`.
If `seed` is already a ``Generator`` or ``RandomState`` instance then
that instance is used.
Specify `seed` for repeatable minimizations. The random numbers
generated with this seed only affect the default Metropolis
`accept_test` and the default `take_step`. If you supply your own
`take_step` and `accept_test`, and these functions use random
number generation, then those functions are responsible for the state
of their random number generator.
target_accept_rate : float, optional
The target acceptance rate that is used to adjust the `stepsize`.
If the current acceptance rate is greater than the target,
then the `stepsize` is increased. Otherwise, it is decreased.
Range is (0, 1). Default is 0.5.
.. versionadded:: 1.8.0
stepwise_factor : float, optional
The `stepsize` is multiplied or divided by this stepwise factor upon
each update. Range is (0, 1). Default is 0.9.
.. versionadded:: 1.8.0
Returns
-------
res : OptimizeResult
The optimization result represented as a ``OptimizeResult`` object.
Important attributes are: ``x`` the solution array, ``fun`` the value
of the function at the solution, and ``message`` which describes the
cause of the termination. The ``OptimizeResult`` object returned by the
selected minimizer at the lowest minimum is also contained within this
object and can be accessed through the ``lowest_optimization_result``
attribute. See `OptimizeResult` for a description of other attributes.
See Also
--------
minimize :
The local minimization function called once for each basinhopping step.
``minimizer_kwargs`` is passed to this routine.
Notes
-----
Basin-hopping is a stochastic algorithm which attempts to find the global
minimum of a smooth scalar function of one or more variables [1]_ [2]_ [3]_
[4]_. The algorithm in its current form was described by David Wales and
Jonathan Doye [2]_ http://www-wales.ch.cam.ac.uk/.
The algorithm is iterative with each cycle composed of the following
features
1) random perturbation of the coordinates
2) local minimization
3) accept or reject the new coordinates based on the minimized function
value
The acceptance test used here is the Metropolis criterion of standard Monte
Carlo algorithms, although there are many other possibilities [3]_.
This global minimization method has been shown to be extremely efficient
for a wide variety of problems in physics and chemistry. It is
particularly useful when the function has many minima separated by large
barriers. See the Cambridge Cluster Database
http://www-wales.ch.cam.ac.uk/CCD.html for databases of molecular systems
that have been optimized primarily using basin-hopping. This database
includes minimization problems exceeding 300 degrees of freedom.
See the free software program GMIN (http://www-wales.ch.cam.ac.uk/GMIN) for
a Fortran implementation of basin-hopping. This implementation has many
different variations of the procedure described above, including more
advanced step taking algorithms and alternate acceptance criterion.
For stochastic global optimization there is no way to determine if the true
global minimum has actually been found. Instead, as a consistency check,
the algorithm can be run from a number of different random starting points
to ensure the lowest minimum found in each example has converged to the
global minimum. For this reason, ``basinhopping`` will by default simply
run for the number of iterations ``niter`` and return the lowest minimum
found. It is left to the user to ensure that this is in fact the global
minimum.
Choosing ``stepsize``: This is a crucial parameter in ``basinhopping`` and
depends on the problem being solved. The step is chosen uniformly in the
region from x0-stepsize to x0+stepsize, in each dimension. Ideally, it
should be comparable to the typical separation (in argument values) between
local minima of the function being optimized. ``basinhopping`` will, by
default, adjust ``stepsize`` to find an optimal value, but this may take
many iterations. You will get quicker results if you set a sensible
initial value for ``stepsize``.
Choosing ``T``: The parameter ``T`` is the "temperature" used in the
Metropolis criterion. Basinhopping steps are always accepted if
``func(xnew) < func(xold)``. Otherwise, they are accepted with
probability::
exp( -(func(xnew) - func(xold)) / T )
So, for best results, ``T`` should to be comparable to the typical
difference (in function values) between local minima. (The height of
"walls" between local minima is irrelevant.)
If ``T`` is 0, the algorithm becomes Monotonic Basin-Hopping, in which all
steps that increase energy are rejected.
.. versionadded:: 0.12.0
References
----------
.. [1] Wales, David J. 2003, Energy Landscapes, Cambridge University Press,
Cambridge, UK.
.. [2] Wales, D J, and Doye J P K, Global Optimization by Basin-Hopping and
the Lowest Energy Structures of Lennard-Jones Clusters Containing up to
110 Atoms. Journal of Physical Chemistry A, 1997, 101, 5111.
.. [3] Li, Z. and Scheraga, H. A., Monte Carlo-minimization approach to the
multiple-minima problem in protein folding, Proc. Natl. Acad. Sci. USA,
1987, 84, 6611.
.. [4] Wales, D. J. and Scheraga, H. A., Global optimization of clusters,
crystals, and biomolecules, Science, 1999, 285, 1368.
.. [5] Olson, B., Hashmi, I., Molloy, K., and Shehu1, A., Basin Hopping as
a General and Versatile Optimization Framework for the Characterization
of Biological Macromolecules, Advances in Artificial Intelligence,
Volume 2012 (2012), Article ID 674832, :doi:`10.1155/2012/674832`
Examples
--------
The following example is a 1-D minimization problem, with many
local minima superimposed on a parabola.
>>> from scipy.optimize import basinhopping
>>> func = lambda x: np.cos(14.5 * x - 0.3) + (x + 0.2) * x
>>> x0=[1.]
Basinhopping, internally, uses a local minimization algorithm. We will use
the parameter ``minimizer_kwargs`` to tell basinhopping which algorithm to
use and how to set up that minimizer. This parameter will be passed to
``scipy.optimize.minimize()``.
>>> minimizer_kwargs = {"method": "BFGS"}
>>> ret = basinhopping(func, x0, minimizer_kwargs=minimizer_kwargs,
... niter=200)
>>> print("global minimum: x = %.4f, f(x0) = %.4f" % (ret.x, ret.fun))
global minimum: x = -0.1951, f(x0) = -1.0009
Next consider a 2-D minimization problem. Also, this time, we
will use gradient information to significantly speed up the search.
>>> def func2d(x):
... f = np.cos(14.5 * x[0] - 0.3) + (x[1] + 0.2) * x[1] + (x[0] +
... 0.2) * x[0]
... df = np.zeros(2)
... df[0] = -14.5 * np.sin(14.5 * x[0] - 0.3) + 2. * x[0] + 0.2
... df[1] = 2. * x[1] + 0.2
... return f, df
We'll also use a different local minimization algorithm. Also, we must tell
the minimizer that our function returns both energy and gradient (Jacobian).
>>> minimizer_kwargs = {"method":"L-BFGS-B", "jac":True}
>>> x0 = [1.0, 1.0]
>>> ret = basinhopping(func2d, x0, minimizer_kwargs=minimizer_kwargs,
... niter=200)
>>> print("global minimum: x = [%.4f, %.4f], f(x0) = %.4f" % (ret.x[0],
... ret.x[1],
... ret.fun))
global minimum: x = [-0.1951, -0.1000], f(x0) = -1.0109
Here is an example using a custom step-taking routine. Imagine you want
the first coordinate to take larger steps than the rest of the coordinates.
This can be implemented like so:
>>> class MyTakeStep:
... def __init__(self, stepsize=0.5):
... self.stepsize = stepsize
... self.rng = np.random.default_rng()
... def __call__(self, x):
... s = self.stepsize
... x[0] += self.rng.uniform(-2.*s, 2.*s)
... x[1:] += self.rng.uniform(-s, s, x[1:].shape)
... return x
Since ``MyTakeStep.stepsize`` exists basinhopping will adjust the magnitude
of ``stepsize`` to optimize the search. We'll use the same 2-D function as
before
>>> mytakestep = MyTakeStep()
>>> ret = basinhopping(func2d, x0, minimizer_kwargs=minimizer_kwargs,
... niter=200, take_step=mytakestep)
>>> print("global minimum: x = [%.4f, %.4f], f(x0) = %.4f" % (ret.x[0],
... ret.x[1],
... ret.fun))
global minimum: x = [-0.1951, -0.1000], f(x0) = -1.0109
Now, let's do an example using a custom callback function which prints the
value of every minimum found
>>> def print_fun(x, f, accepted):
... print("at minimum %.4f accepted %d" % (f, int(accepted)))
We'll run it for only 10 basinhopping steps this time.
>>> rng = np.random.default_rng()
>>> ret = basinhopping(func2d, x0, minimizer_kwargs=minimizer_kwargs,
... niter=10, callback=print_fun, seed=rng)
at minimum 0.4159 accepted 1
at minimum -0.4317 accepted 1
at minimum -1.0109 accepted 1
at minimum -0.9073 accepted 1
at minimum -0.4317 accepted 0
at minimum -0.1021 accepted 1
at minimum -0.7425 accepted 1
at minimum -0.9073 accepted 1
at minimum -0.4317 accepted 0
at minimum -0.7425 accepted 1
at minimum -0.9073 accepted 1
The minimum at -1.0109 is actually the global minimum, found already on the
8th iteration.
Now let's implement bounds on the problem using a custom ``accept_test``:
>>> class MyBounds:
... def __init__(self, xmax=[1.1,1.1], xmin=[-1.1,-1.1] ):
... self.xmax = np.array(xmax)
... self.xmin = np.array(xmin)
... def __call__(self, **kwargs):
... x = kwargs["x_new"]
... tmax = bool(np.all(x <= self.xmax))
... tmin = bool(np.all(x >= self.xmin))
... return tmax and tmin
>>> mybounds = MyBounds()
>>> ret = basinhopping(func2d, x0, minimizer_kwargs=minimizer_kwargs,
... niter=10, accept_test=mybounds)
"""
if target_accept_rate <= 0. or target_accept_rate >= 1.:
raise ValueError('target_accept_rate has to be in range (0, 1)')
if stepwise_factor <= 0. or stepwise_factor >= 1.:
raise ValueError('stepwise_factor has to be in range (0, 1)')
x0 = np.array(x0)
# set up the np.random generator
rng = check_random_state(seed)
# set up minimizer
if minimizer_kwargs is None:
minimizer_kwargs = dict()
wrapped_minimizer = MinimizerWrapper(scipy.optimize.minimize, func,
**minimizer_kwargs)
# set up step-taking algorithm
if take_step is not None:
if not callable(take_step):
raise TypeError("take_step must be callable")
# if take_step.stepsize exists then use AdaptiveStepsize to control
# take_step.stepsize
if hasattr(take_step, "stepsize"):
take_step_wrapped = AdaptiveStepsize(
take_step, interval=interval,
accept_rate=target_accept_rate,
factor=stepwise_factor,
verbose=disp)
else:
take_step_wrapped = take_step
else:
# use default
displace = RandomDisplacement(stepsize=stepsize, random_gen=rng)
take_step_wrapped = AdaptiveStepsize(displace, interval=interval,
accept_rate=target_accept_rate,
factor=stepwise_factor,
verbose=disp)
# set up accept tests
accept_tests = []
if accept_test is not None:
if not callable(accept_test):
raise TypeError("accept_test must be callable")
accept_tests = [accept_test]
# use default
metropolis = Metropolis(T, random_gen=rng)
accept_tests.append(metropolis)
if niter_success is None:
niter_success = niter + 2
bh = BasinHoppingRunner(x0, wrapped_minimizer, take_step_wrapped,
accept_tests, disp=disp)
# The wrapped minimizer is called once during construction of
# BasinHoppingRunner, so run the callback
if callable(callback):
callback(bh.storage.minres.x, bh.storage.minres.fun, True)
# start main iteration loop
count, i = 0, 0
message = ["requested number of basinhopping iterations completed"
" successfully"]
for i in range(niter):
new_global_min = bh.one_cycle()
if callable(callback):
# should we pass a copy of x?
val = callback(bh.xtrial, bh.energy_trial, bh.accept)
if val is not None:
if val:
message = ["callback function requested stop early by"
"returning True"]
break
count += 1
if new_global_min:
count = 0
elif count > niter_success:
message = ["success condition satisfied"]
break
# prepare return object
res = bh.res
res.lowest_optimization_result = bh.storage.get_lowest()
res.x = np.copy(res.lowest_optimization_result.x)
res.fun = res.lowest_optimization_result.fun
res.message = message
res.nit = i + 1
res.success = res.lowest_optimization_result.success
return res
def _test_func2d_nograd(x):
f = (cos(14.5 * x[0] - 0.3) + (x[1] + 0.2) * x[1] + (x[0] + 0.2) * x[0]
+ 1.010876184442655)
return f
def _test_func2d(x):
f = (cos(14.5 * x[0] - 0.3) + (x[0] + 0.2) * x[0] + cos(14.5 * x[1] -
0.3) + (x[1] + 0.2) * x[1] + x[0] * x[1] + 1.963879482144252)
df = np.zeros(2)
df[0] = -14.5 * sin(14.5 * x[0] - 0.3) + 2. * x[0] + 0.2 + x[1]
df[1] = -14.5 * sin(14.5 * x[1] - 0.3) + 2. * x[1] + 0.2 + x[0]
return f, df
if __name__ == "__main__":
print("\n\nminimize a 2-D function without gradient")
# minimum expected at ~[-0.195, -0.1]
kwargs = {"method": "L-BFGS-B"}
x0 = np.array([1.0, 1.])
scipy.optimize.minimize(_test_func2d_nograd, x0, **kwargs)
ret = basinhopping(_test_func2d_nograd, x0, minimizer_kwargs=kwargs,
niter=200, disp=False)
print("minimum expected at func([-0.195, -0.1]) = 0.0")
print(ret)
print("\n\ntry a harder 2-D problem")
kwargs = {"method": "L-BFGS-B", "jac": True}
x0 = np.array([1.0, 1.0])
ret = basinhopping(_test_func2d, x0, minimizer_kwargs=kwargs, niter=200,
disp=False)
print("minimum expected at ~, func([-0.19415263, -0.19415263]) = 0")
print(ret)

View File

@@ -0,0 +1,296 @@
"""
Interface to Constrained Optimization By Linear Approximation
Functions
---------
.. autosummary::
:toctree: generated/
fmin_cobyla
"""
import functools
from threading import RLock
import numpy as np
from scipy.optimize import _cobyla as cobyla
from ._optimize import OptimizeResult, _check_unknown_options
try:
from itertools import izip
except ImportError:
izip = zip
__all__ = ['fmin_cobyla']
# Workarund as _cobyla.minimize is not threadsafe
# due to an unknown f2py bug and can segfault,
# see gh-9658.
_module_lock = RLock()
def synchronized(func):
@functools.wraps(func)
def wrapper(*args, **kwargs):
with _module_lock:
return func(*args, **kwargs)
return wrapper
@synchronized
def fmin_cobyla(func, x0, cons, args=(), consargs=None, rhobeg=1.0,
rhoend=1e-4, maxfun=1000, disp=None, catol=2e-4,
*, callback=None):
"""
Minimize a function using the Constrained Optimization By Linear
Approximation (COBYLA) method. This method wraps a FORTRAN
implementation of the algorithm.
Parameters
----------
func : callable
Function to minimize. In the form func(x, \\*args).
x0 : ndarray
Initial guess.
cons : sequence
Constraint functions; must all be ``>=0`` (a single function
if only 1 constraint). Each function takes the parameters `x`
as its first argument, and it can return either a single number or
an array or list of numbers.
args : tuple, optional
Extra arguments to pass to function.
consargs : tuple, optional
Extra arguments to pass to constraint functions (default of None means
use same extra arguments as those passed to func).
Use ``()`` for no extra arguments.
rhobeg : float, optional
Reasonable initial changes to the variables.
rhoend : float, optional
Final accuracy in the optimization (not precisely guaranteed). This
is a lower bound on the size of the trust region.
disp : {0, 1, 2, 3}, optional
Controls the frequency of output; 0 implies no output.
maxfun : int, optional
Maximum number of function evaluations.
catol : float, optional
Absolute tolerance for constraint violations.
callback : callable, optional
Called after each iteration, as ``callback(x)``, where ``x`` is the
current parameter vector.
Returns
-------
x : ndarray
The argument that minimises `f`.
See also
--------
minimize: Interface to minimization algorithms for multivariate
functions. See the 'COBYLA' `method` in particular.
Notes
-----
This algorithm is based on linear approximations to the objective
function and each constraint. We briefly describe the algorithm.
Suppose the function is being minimized over k variables. At the
jth iteration the algorithm has k+1 points v_1, ..., v_(k+1),
an approximate solution x_j, and a radius RHO_j.
(i.e., linear plus a constant) approximations to the objective
function and constraint functions such that their function values
agree with the linear approximation on the k+1 points v_1,.., v_(k+1).
This gives a linear program to solve (where the linear approximations
of the constraint functions are constrained to be non-negative).
However, the linear approximations are likely only good
approximations near the current simplex, so the linear program is
given the further requirement that the solution, which
will become x_(j+1), must be within RHO_j from x_j. RHO_j only
decreases, never increases. The initial RHO_j is rhobeg and the
final RHO_j is rhoend. In this way COBYLA's iterations behave
like a trust region algorithm.
Additionally, the linear program may be inconsistent, or the
approximation may give poor improvement. For details about
how these issues are resolved, as well as how the points v_i are
updated, refer to the source code or the references below.
References
----------
Powell M.J.D. (1994), "A direct search optimization method that models
the objective and constraint functions by linear interpolation.", in
Advances in Optimization and Numerical Analysis, eds. S. Gomez and
J-P Hennart, Kluwer Academic (Dordrecht), pp. 51-67
Powell M.J.D. (1998), "Direct search algorithms for optimization
calculations", Acta Numerica 7, 287-336
Powell M.J.D. (2007), "A view of algorithms for optimization without
derivatives", Cambridge University Technical Report DAMTP 2007/NA03
Examples
--------
Minimize the objective function f(x,y) = x*y subject
to the constraints x**2 + y**2 < 1 and y > 0::
>>> def objective(x):
... return x[0]*x[1]
...
>>> def constr1(x):
... return 1 - (x[0]**2 + x[1]**2)
...
>>> def constr2(x):
... return x[1]
...
>>> from scipy.optimize import fmin_cobyla
>>> fmin_cobyla(objective, [0.0, 0.1], [constr1, constr2], rhoend=1e-7)
array([-0.70710685, 0.70710671])
The exact solution is (-sqrt(2)/2, sqrt(2)/2).
"""
err = "cons must be a sequence of callable functions or a single"\
" callable function."
try:
len(cons)
except TypeError as e:
if callable(cons):
cons = [cons]
else:
raise TypeError(err) from e
else:
for thisfunc in cons:
if not callable(thisfunc):
raise TypeError(err)
if consargs is None:
consargs = args
# build constraints
con = tuple({'type': 'ineq', 'fun': c, 'args': consargs} for c in cons)
# options
opts = {'rhobeg': rhobeg,
'tol': rhoend,
'disp': disp,
'maxiter': maxfun,
'catol': catol,
'callback': callback}
sol = _minimize_cobyla(func, x0, args, constraints=con,
**opts)
if disp and not sol['success']:
print("COBYLA failed to find a solution: %s" % (sol.message,))
return sol['x']
@synchronized
def _minimize_cobyla(fun, x0, args=(), constraints=(),
rhobeg=1.0, tol=1e-4, maxiter=1000,
disp=False, catol=2e-4, callback=None,
**unknown_options):
"""
Minimize a scalar function of one or more variables using the
Constrained Optimization BY Linear Approximation (COBYLA) algorithm.
Options
-------
rhobeg : float
Reasonable initial changes to the variables.
tol : float
Final accuracy in the optimization (not precisely guaranteed).
This is a lower bound on the size of the trust region.
disp : bool
Set to True to print convergence messages. If False,
`verbosity` is ignored as set to 0.
maxiter : int
Maximum number of function evaluations.
catol : float
Tolerance (absolute) for constraint violations
callback : callable, optional
Called after each iteration, as ``callback(x)``, where ``x`` is the
current parameter vector.
"""
_check_unknown_options(unknown_options)
maxfun = maxiter
rhoend = tol
iprint = int(bool(disp))
# check constraints
if isinstance(constraints, dict):
constraints = (constraints, )
for ic, con in enumerate(constraints):
# check type
try:
ctype = con['type'].lower()
except KeyError as e:
raise KeyError('Constraint %d has no type defined.' % ic) from e
except TypeError as e:
raise TypeError('Constraints must be defined using a '
'dictionary.') from e
except AttributeError as e:
raise TypeError("Constraint's type must be a string.") from e
else:
if ctype != 'ineq':
raise ValueError("Constraints of type '%s' not handled by "
"COBYLA." % con['type'])
# check function
if 'fun' not in con:
raise KeyError('Constraint %d has no function defined.' % ic)
# check extra arguments
if 'args' not in con:
con['args'] = ()
# m is the total number of constraint values
# it takes into account that some constraints may be vector-valued
cons_lengths = []
for c in constraints:
f = c['fun'](x0, *c['args'])
try:
cons_length = len(f)
except TypeError:
cons_length = 1
cons_lengths.append(cons_length)
m = sum(cons_lengths)
def calcfc(x, con):
f = fun(np.copy(x), *args)
i = 0
for size, c in izip(cons_lengths, constraints):
con[i: i + size] = c['fun'](x, *c['args'])
i += size
return f
def wrapped_callback(x):
if callback is not None:
callback(np.copy(x))
info = np.zeros(4, np.float64)
xopt, info = cobyla.minimize(calcfc, m=m, x=np.copy(x0), rhobeg=rhobeg,
rhoend=rhoend, iprint=iprint, maxfun=maxfun,
dinfo=info, callback=wrapped_callback)
if info[3] > catol:
# Check constraint violation
info[0] = 4
return OptimizeResult(x=xopt,
status=int(info[0]),
success=info[0] == 1,
message={1: 'Optimization terminated successfully.',
2: 'Maximum number of function evaluations '
'has been exceeded.',
3: 'Rounding errors are becoming damaging '
'in COBYLA subroutine.',
4: 'Did not converge to a solution '
'satisfying the constraints. See '
'`maxcv` for magnitude of violation.',
5: 'NaN result encountered.'
}.get(info[0], 'Unknown exit status.'),
nfev=int(info[1]),
fun=info[2],
maxcv=info[3])

View File

@@ -0,0 +1,479 @@
"""Constraints definition for minimize."""
import numpy as np
from ._hessian_update_strategy import BFGS
from ._differentiable_functions import (
VectorFunction, LinearVectorFunction, IdentityVectorFunction)
from ._optimize import OptimizeWarning
from warnings import warn
from numpy.testing import suppress_warnings
from scipy.sparse import issparse
def _arr_to_scalar(x):
# If x is a numpy array, return x.item(). This will
# fail if the array has more than one element.
return x.item() if isinstance(x, np.ndarray) else x
class NonlinearConstraint:
"""Nonlinear constraint on the variables.
The constraint has the general inequality form::
lb <= fun(x) <= ub
Here the vector of independent variables x is passed as ndarray of shape
(n,) and ``fun`` returns a vector with m components.
It is possible to use equal bounds to represent an equality constraint or
infinite bounds to represent a one-sided constraint.
Parameters
----------
fun : callable
The function defining the constraint.
The signature is ``fun(x) -> array_like, shape (m,)``.
lb, ub : array_like
Lower and upper bounds on the constraint. Each array must have the
shape (m,) or be a scalar, in the latter case a bound will be the same
for all components of the constraint. Use ``np.inf`` with an
appropriate sign to specify a one-sided constraint.
Set components of `lb` and `ub` equal to represent an equality
constraint. Note that you can mix constraints of different types:
interval, one-sided or equality, by setting different components of
`lb` and `ub` as necessary.
jac : {callable, '2-point', '3-point', 'cs'}, optional
Method of computing the Jacobian matrix (an m-by-n matrix,
where element (i, j) is the partial derivative of f[i] with
respect to x[j]). The keywords {'2-point', '3-point',
'cs'} select a finite difference scheme for the numerical estimation.
A callable must have the following signature:
``jac(x) -> {ndarray, sparse matrix}, shape (m, n)``.
Default is '2-point'.
hess : {callable, '2-point', '3-point', 'cs', HessianUpdateStrategy, None}, optional
Method for computing the Hessian matrix. The keywords
{'2-point', '3-point', 'cs'} select a finite difference scheme for
numerical estimation. Alternatively, objects implementing
`HessianUpdateStrategy` interface can be used to approximate the
Hessian. Currently available implementations are:
- `BFGS` (default option)
- `SR1`
A callable must return the Hessian matrix of ``dot(fun, v)`` and
must have the following signature:
``hess(x, v) -> {LinearOperator, sparse matrix, array_like}, shape (n, n)``.
Here ``v`` is ndarray with shape (m,) containing Lagrange multipliers.
keep_feasible : array_like of bool, optional
Whether to keep the constraint components feasible throughout
iterations. A single value set this property for all components.
Default is False. Has no effect for equality constraints.
finite_diff_rel_step: None or array_like, optional
Relative step size for the finite difference approximation. Default is
None, which will select a reasonable value automatically depending
on a finite difference scheme.
finite_diff_jac_sparsity: {None, array_like, sparse matrix}, optional
Defines the sparsity structure of the Jacobian matrix for finite
difference estimation, its shape must be (m, n). If the Jacobian has
only few non-zero elements in *each* row, providing the sparsity
structure will greatly speed up the computations. A zero entry means
that a corresponding element in the Jacobian is identically zero.
If provided, forces the use of 'lsmr' trust-region solver.
If None (default) then dense differencing will be used.
Notes
-----
Finite difference schemes {'2-point', '3-point', 'cs'} may be used for
approximating either the Jacobian or the Hessian. We, however, do not allow
its use for approximating both simultaneously. Hence whenever the Jacobian
is estimated via finite-differences, we require the Hessian to be estimated
using one of the quasi-Newton strategies.
The scheme 'cs' is potentially the most accurate, but requires the function
to correctly handles complex inputs and be analytically continuable to the
complex plane. The scheme '3-point' is more accurate than '2-point' but
requires twice as many operations.
Examples
--------
Constrain ``x[0] < sin(x[1]) + 1.9``
>>> from scipy.optimize import NonlinearConstraint
>>> con = lambda x: x[0] - np.sin(x[1])
>>> nlc = NonlinearConstraint(con, -np.inf, 1.9)
"""
def __init__(self, fun, lb, ub, jac='2-point', hess=BFGS(),
keep_feasible=False, finite_diff_rel_step=None,
finite_diff_jac_sparsity=None):
self.fun = fun
self.lb = lb
self.ub = ub
self.finite_diff_rel_step = finite_diff_rel_step
self.finite_diff_jac_sparsity = finite_diff_jac_sparsity
self.jac = jac
self.hess = hess
self.keep_feasible = keep_feasible
class LinearConstraint:
"""Linear constraint on the variables.
The constraint has the general inequality form::
lb <= A.dot(x) <= ub
Here the vector of independent variables x is passed as ndarray of shape
(n,) and the matrix A has shape (m, n).
It is possible to use equal bounds to represent an equality constraint or
infinite bounds to represent a one-sided constraint.
Parameters
----------
A : {array_like, sparse matrix}, shape (m, n)
Matrix defining the constraint.
lb, ub : array_like
Lower and upper bounds on the constraint. Each array must have the
shape (m,) or be a scalar, in the latter case a bound will be the same
for all components of the constraint. Use ``np.inf`` with an
appropriate sign to specify a one-sided constraint.
Set components of `lb` and `ub` equal to represent an equality
constraint. Note that you can mix constraints of different types:
interval, one-sided or equality, by setting different components of
`lb` and `ub` as necessary.
keep_feasible : array_like of bool, optional
Whether to keep the constraint components feasible throughout
iterations. A single value set this property for all components.
Default is False. Has no effect for equality constraints.
"""
def __init__(self, A, lb, ub, keep_feasible=False):
self.A = A
self.lb = lb
self.ub = ub
self.keep_feasible = keep_feasible
class Bounds:
"""Bounds constraint on the variables.
The constraint has the general inequality form::
lb <= x <= ub
It is possible to use equal bounds to represent an equality constraint or
infinite bounds to represent a one-sided constraint.
Parameters
----------
lb, ub : array_like
Lower and upper bounds on independent variables. Each array must
have the same size as x or be a scalar, in which case a bound will be
the same for all the variables. Set components of `lb` and `ub` equal
to fix a variable. Use ``np.inf`` with an appropriate sign to disable
bounds on all or some variables. Note that you can mix constraints of
different types: interval, one-sided or equality, by setting different
components of `lb` and `ub` as necessary.
keep_feasible : array_like of bool, optional
Whether to keep the constraint components feasible throughout
iterations. A single value set this property for all components.
Default is False. Has no effect for equality constraints.
"""
def __init__(self, lb, ub, keep_feasible=False):
self.lb = np.asarray(lb)
self.ub = np.asarray(ub)
self.keep_feasible = keep_feasible
def __repr__(self):
start = f"{type(self).__name__}({self.lb!r}, {self.ub!r}"
if np.any(self.keep_feasible):
end = f", keep_feasible={self.keep_feasible!r})"
else:
end = ")"
return start + end
class PreparedConstraint:
"""Constraint prepared from a user defined constraint.
On creation it will check whether a constraint definition is valid and
the initial point is feasible. If created successfully, it will contain
the attributes listed below.
Parameters
----------
constraint : {NonlinearConstraint, LinearConstraint`, Bounds}
Constraint to check and prepare.
x0 : array_like
Initial vector of independent variables.
sparse_jacobian : bool or None, optional
If bool, then the Jacobian of the constraint will be converted
to the corresponded format if necessary. If None (default), such
conversion is not made.
finite_diff_bounds : 2-tuple, optional
Lower and upper bounds on the independent variables for the finite
difference approximation, if applicable. Defaults to no bounds.
Attributes
----------
fun : {VectorFunction, LinearVectorFunction, IdentityVectorFunction}
Function defining the constraint wrapped by one of the convenience
classes.
bounds : 2-tuple
Contains lower and upper bounds for the constraints --- lb and ub.
These are converted to ndarray and have a size equal to the number of
the constraints.
keep_feasible : ndarray
Array indicating which components must be kept feasible with a size
equal to the number of the constraints.
"""
def __init__(self, constraint, x0, sparse_jacobian=None,
finite_diff_bounds=(-np.inf, np.inf)):
if isinstance(constraint, NonlinearConstraint):
fun = VectorFunction(constraint.fun, x0,
constraint.jac, constraint.hess,
constraint.finite_diff_rel_step,
constraint.finite_diff_jac_sparsity,
finite_diff_bounds, sparse_jacobian)
elif isinstance(constraint, LinearConstraint):
fun = LinearVectorFunction(constraint.A, x0, sparse_jacobian)
elif isinstance(constraint, Bounds):
fun = IdentityVectorFunction(x0, sparse_jacobian)
else:
raise ValueError("`constraint` of an unknown type is passed.")
m = fun.m
lb = np.asarray(constraint.lb, dtype=float)
ub = np.asarray(constraint.ub, dtype=float)
if lb.ndim == 0:
lb = np.resize(lb, m)
if ub.ndim == 0:
ub = np.resize(ub, m)
keep_feasible = np.asarray(constraint.keep_feasible, dtype=bool)
if keep_feasible.ndim == 0:
keep_feasible = np.resize(keep_feasible, m)
if keep_feasible.shape != (m,):
raise ValueError("`keep_feasible` has a wrong shape.")
mask = keep_feasible & (lb != ub)
f0 = fun.f
if np.any(f0[mask] < lb[mask]) or np.any(f0[mask] > ub[mask]):
raise ValueError("`x0` is infeasible with respect to some "
"inequality constraint with `keep_feasible` "
"set to True.")
self.fun = fun
self.bounds = (lb, ub)
self.keep_feasible = keep_feasible
def violation(self, x):
"""How much the constraint is exceeded by.
Parameters
----------
x : array-like
Vector of independent variables
Returns
-------
excess : array-like
How much the constraint is exceeded by, for each of the
constraints specified by `PreparedConstraint.fun`.
"""
with suppress_warnings() as sup:
sup.filter(UserWarning)
ev = self.fun.fun(np.asarray(x))
excess_lb = np.maximum(self.bounds[0] - ev, 0)
excess_ub = np.maximum(ev - self.bounds[1], 0)
return excess_lb + excess_ub
def new_bounds_to_old(lb, ub, n):
"""Convert the new bounds representation to the old one.
The new representation is a tuple (lb, ub) and the old one is a list
containing n tuples, ith containing lower and upper bound on a ith
variable.
If any of the entries in lb/ub are -np.inf/np.inf they are replaced by
None.
"""
lb = np.asarray(lb)
ub = np.asarray(ub)
if lb.ndim == 0:
lb = np.resize(lb, n)
if ub.ndim == 0:
ub = np.resize(ub, n)
lb = [float(x) if x > -np.inf else None for x in lb]
ub = [float(x) if x < np.inf else None for x in ub]
return list(zip(lb, ub))
def old_bound_to_new(bounds):
"""Convert the old bounds representation to the new one.
The new representation is a tuple (lb, ub) and the old one is a list
containing n tuples, ith containing lower and upper bound on a ith
variable.
If any of the entries in lb/ub are None they are replaced by
-np.inf/np.inf.
"""
lb, ub = zip(*bounds)
# Convert occurrences of None to -inf or inf, and replace occurrences of
# any numpy array x with x.item(). Then wrap the results in numpy arrays.
lb = np.array([float(_arr_to_scalar(x)) if x is not None else -np.inf
for x in lb])
ub = np.array([float(_arr_to_scalar(x)) if x is not None else np.inf
for x in ub])
return lb, ub
def strict_bounds(lb, ub, keep_feasible, n_vars):
"""Remove bounds which are not asked to be kept feasible."""
strict_lb = np.resize(lb, n_vars).astype(float)
strict_ub = np.resize(ub, n_vars).astype(float)
keep_feasible = np.resize(keep_feasible, n_vars)
strict_lb[~keep_feasible] = -np.inf
strict_ub[~keep_feasible] = np.inf
return strict_lb, strict_ub
def new_constraint_to_old(con, x0):
"""
Converts new-style constraint objects to old-style constraint dictionaries.
"""
if isinstance(con, NonlinearConstraint):
if (con.finite_diff_jac_sparsity is not None or
con.finite_diff_rel_step is not None or
not isinstance(con.hess, BFGS) or # misses user specified BFGS
con.keep_feasible):
warn("Constraint options `finite_diff_jac_sparsity`, "
"`finite_diff_rel_step`, `keep_feasible`, and `hess`"
"are ignored by this method.", OptimizeWarning)
fun = con.fun
if callable(con.jac):
jac = con.jac
else:
jac = None
else: # LinearConstraint
if con.keep_feasible:
warn("Constraint option `keep_feasible` is ignored by this "
"method.", OptimizeWarning)
A = con.A
if issparse(A):
A = A.toarray()
fun = lambda x: np.dot(A, x)
jac = lambda x: A
# FIXME: when bugs in VectorFunction/LinearVectorFunction are worked out,
# use pcon.fun.fun and pcon.fun.jac. Until then, get fun/jac above.
pcon = PreparedConstraint(con, x0)
lb, ub = pcon.bounds
i_eq = lb == ub
i_bound_below = np.logical_xor(lb != -np.inf, i_eq)
i_bound_above = np.logical_xor(ub != np.inf, i_eq)
i_unbounded = np.logical_and(lb == -np.inf, ub == np.inf)
if np.any(i_unbounded):
warn("At least one constraint is unbounded above and below. Such "
"constraints are ignored.", OptimizeWarning)
ceq = []
if np.any(i_eq):
def f_eq(x):
y = np.array(fun(x)).flatten()
return y[i_eq] - lb[i_eq]
ceq = [{"type": "eq", "fun": f_eq}]
if jac is not None:
def j_eq(x):
dy = jac(x)
if issparse(dy):
dy = dy.toarray()
dy = np.atleast_2d(dy)
return dy[i_eq, :]
ceq[0]["jac"] = j_eq
cineq = []
n_bound_below = np.sum(i_bound_below)
n_bound_above = np.sum(i_bound_above)
if n_bound_below + n_bound_above:
def f_ineq(x):
y = np.zeros(n_bound_below + n_bound_above)
y_all = np.array(fun(x)).flatten()
y[:n_bound_below] = y_all[i_bound_below] - lb[i_bound_below]
y[n_bound_below:] = -(y_all[i_bound_above] - ub[i_bound_above])
return y
cineq = [{"type": "ineq", "fun": f_ineq}]
if jac is not None:
def j_ineq(x):
dy = np.zeros((n_bound_below + n_bound_above, len(x0)))
dy_all = jac(x)
if issparse(dy_all):
dy_all = dy_all.toarray()
dy_all = np.atleast_2d(dy_all)
dy[:n_bound_below, :] = dy_all[i_bound_below]
dy[n_bound_below:, :] = -dy_all[i_bound_above]
return dy
cineq[0]["jac"] = j_ineq
old_constraints = ceq + cineq
if len(old_constraints) > 1:
warn("Equality and inequality constraints are specified in the same "
"element of the constraint list. For efficient use with this "
"method, equality and inequality constraints should be specified "
"in separate elements of the constraint list. ", OptimizeWarning)
return old_constraints
def old_constraint_to_new(ic, con):
"""
Converts old-style constraint dictionaries to new-style constraint objects.
"""
# check type
try:
ctype = con['type'].lower()
except KeyError as e:
raise KeyError('Constraint %d has no type defined.' % ic) from e
except TypeError as e:
raise TypeError(
'Constraints must be a sequence of dictionaries.'
) from e
except AttributeError as e:
raise TypeError("Constraint's type must be a string.") from e
else:
if ctype not in ['eq', 'ineq']:
raise ValueError("Unknown constraint type '%s'." % con['type'])
if 'fun' not in con:
raise ValueError('Constraint %d has no function defined.' % ic)
lb = 0
if ctype == 'eq':
ub = 0
else:
ub = np.inf
jac = '2-point'
if 'args' in con:
args = con['args']
fun = lambda x: con['fun'](x, *args)
if 'jac' in con:
jac = lambda x: con['jac'](x, *args)
else:
fun = con['fun']
if 'jac' in con:
jac = con['jac']
return NonlinearConstraint(fun, lb, ub, jac)

View File

@@ -0,0 +1,616 @@
import numpy as np
import scipy.sparse as sps
from ._numdiff import approx_derivative, group_columns
from ._hessian_update_strategy import HessianUpdateStrategy
from scipy.sparse.linalg import LinearOperator
FD_METHODS = ('2-point', '3-point', 'cs')
class ScalarFunction:
"""Scalar function and its derivatives.
This class defines a scalar function F: R^n->R and methods for
computing or approximating its first and second derivatives.
Parameters
----------
fun : callable
evaluates the scalar function. Must be of the form ``fun(x, *args)``,
where ``x`` is the argument in the form of a 1-D array and ``args`` is
a tuple of any additional fixed parameters needed to completely specify
the function. Should return a scalar.
x0 : array-like
Provides an initial set of variables for evaluating fun. Array of real
elements of size (n,), where 'n' is the number of independent
variables.
args : tuple, optional
Any additional fixed parameters needed to completely specify the scalar
function.
grad : {callable, '2-point', '3-point', 'cs'}
Method for computing the gradient vector.
If it is a callable, it should be a function that returns the gradient
vector:
``grad(x, *args) -> array_like, shape (n,)``
where ``x`` is an array with shape (n,) and ``args`` is a tuple with
the fixed parameters.
Alternatively, the keywords {'2-point', '3-point', 'cs'} can be used
to select a finite difference scheme for numerical estimation of the
gradient with a relative step size. These finite difference schemes
obey any specified `bounds`.
hess : {callable, '2-point', '3-point', 'cs', HessianUpdateStrategy}
Method for computing the Hessian matrix. If it is callable, it should
return the Hessian matrix:
``hess(x, *args) -> {LinearOperator, spmatrix, array}, (n, n)``
where x is a (n,) ndarray and `args` is a tuple with the fixed
parameters. Alternatively, the keywords {'2-point', '3-point', 'cs'}
select a finite difference scheme for numerical estimation. Or, objects
implementing `HessianUpdateStrategy` interface can be used to
approximate the Hessian.
Whenever the gradient is estimated via finite-differences, the Hessian
cannot be estimated with options {'2-point', '3-point', 'cs'} and needs
to be estimated using one of the quasi-Newton strategies.
finite_diff_rel_step : None or array_like
Relative step size to use. The absolute step size is computed as
``h = finite_diff_rel_step * sign(x0) * max(1, abs(x0))``, possibly
adjusted to fit into the bounds. For ``method='3-point'`` the sign
of `h` is ignored. If None then finite_diff_rel_step is selected
automatically,
finite_diff_bounds : tuple of array_like
Lower and upper bounds on independent variables. Defaults to no bounds,
(-np.inf, np.inf). Each bound must match the size of `x0` or be a
scalar, in the latter case the bound will be the same for all
variables. Use it to limit the range of function evaluation.
epsilon : None or array_like, optional
Absolute step size to use, possibly adjusted to fit into the bounds.
For ``method='3-point'`` the sign of `epsilon` is ignored. By default
relative steps are used, only if ``epsilon is not None`` are absolute
steps used.
Notes
-----
This class implements a memoization logic. There are methods `fun`,
`grad`, hess` and corresponding attributes `f`, `g` and `H`. The following
things should be considered:
1. Use only public methods `fun`, `grad` and `hess`.
2. After one of the methods is called, the corresponding attribute
will be set. However, a subsequent call with a different argument
of *any* of the methods may overwrite the attribute.
"""
def __init__(self, fun, x0, args, grad, hess, finite_diff_rel_step,
finite_diff_bounds, epsilon=None):
if not callable(grad) and grad not in FD_METHODS:
raise ValueError(
f"`grad` must be either callable or one of {FD_METHODS}."
)
if not (callable(hess) or hess in FD_METHODS
or isinstance(hess, HessianUpdateStrategy)):
raise ValueError(
f"`hess` must be either callable, HessianUpdateStrategy"
f" or one of {FD_METHODS}."
)
if grad in FD_METHODS and hess in FD_METHODS:
raise ValueError("Whenever the gradient is estimated via "
"finite-differences, we require the Hessian "
"to be estimated using one of the "
"quasi-Newton strategies.")
# the astype call ensures that self.x is a copy of x0
self.x = np.atleast_1d(x0).astype(float)
self.n = self.x.size
self.nfev = 0
self.ngev = 0
self.nhev = 0
self.f_updated = False
self.g_updated = False
self.H_updated = False
self._lowest_x = None
self._lowest_f = np.inf
finite_diff_options = {}
if grad in FD_METHODS:
finite_diff_options["method"] = grad
finite_diff_options["rel_step"] = finite_diff_rel_step
finite_diff_options["abs_step"] = epsilon
finite_diff_options["bounds"] = finite_diff_bounds
if hess in FD_METHODS:
finite_diff_options["method"] = hess
finite_diff_options["rel_step"] = finite_diff_rel_step
finite_diff_options["abs_step"] = epsilon
finite_diff_options["as_linear_operator"] = True
# Function evaluation
def fun_wrapped(x):
self.nfev += 1
# Send a copy because the user may overwrite it.
# Overwriting results in undefined behaviour because
# fun(self.x) will change self.x, with the two no longer linked.
fx = fun(np.copy(x), *args)
# Make sure the function returns a true scalar
if not np.isscalar(fx):
try:
fx = np.asarray(fx).item()
except (TypeError, ValueError) as e:
raise ValueError(
"The user-provided objective function "
"must return a scalar value."
) from e
if fx < self._lowest_f:
self._lowest_x = x
self._lowest_f = fx
return fx
def update_fun():
self.f = fun_wrapped(self.x)
self._update_fun_impl = update_fun
self._update_fun()
# Gradient evaluation
if callable(grad):
def grad_wrapped(x):
self.ngev += 1
return np.atleast_1d(grad(np.copy(x), *args))
def update_grad():
self.g = grad_wrapped(self.x)
elif grad in FD_METHODS:
def update_grad():
self._update_fun()
self.ngev += 1
self.g = approx_derivative(fun_wrapped, self.x, f0=self.f,
**finite_diff_options)
self._update_grad_impl = update_grad
self._update_grad()
# Hessian Evaluation
if callable(hess):
self.H = hess(np.copy(x0), *args)
self.H_updated = True
self.nhev += 1
if sps.issparse(self.H):
def hess_wrapped(x):
self.nhev += 1
return sps.csr_matrix(hess(np.copy(x), *args))
self.H = sps.csr_matrix(self.H)
elif isinstance(self.H, LinearOperator):
def hess_wrapped(x):
self.nhev += 1
return hess(np.copy(x), *args)
else:
def hess_wrapped(x):
self.nhev += 1
return np.atleast_2d(np.asarray(hess(np.copy(x), *args)))
self.H = np.atleast_2d(np.asarray(self.H))
def update_hess():
self.H = hess_wrapped(self.x)
elif hess in FD_METHODS:
def update_hess():
self._update_grad()
self.H = approx_derivative(grad_wrapped, self.x, f0=self.g,
**finite_diff_options)
return self.H
update_hess()
self.H_updated = True
elif isinstance(hess, HessianUpdateStrategy):
self.H = hess
self.H.initialize(self.n, 'hess')
self.H_updated = True
self.x_prev = None
self.g_prev = None
def update_hess():
self._update_grad()
self.H.update(self.x - self.x_prev, self.g - self.g_prev)
self._update_hess_impl = update_hess
if isinstance(hess, HessianUpdateStrategy):
def update_x(x):
self._update_grad()
self.x_prev = self.x
self.g_prev = self.g
# ensure that self.x is a copy of x. Don't store a reference
# otherwise the memoization doesn't work properly.
self.x = np.atleast_1d(x).astype(float)
self.f_updated = False
self.g_updated = False
self.H_updated = False
self._update_hess()
else:
def update_x(x):
# ensure that self.x is a copy of x. Don't store a reference
# otherwise the memoization doesn't work properly.
self.x = np.atleast_1d(x).astype(float)
self.f_updated = False
self.g_updated = False
self.H_updated = False
self._update_x_impl = update_x
def _update_fun(self):
if not self.f_updated:
self._update_fun_impl()
self.f_updated = True
def _update_grad(self):
if not self.g_updated:
self._update_grad_impl()
self.g_updated = True
def _update_hess(self):
if not self.H_updated:
self._update_hess_impl()
self.H_updated = True
def fun(self, x):
if not np.array_equal(x, self.x):
self._update_x_impl(x)
self._update_fun()
return self.f
def grad(self, x):
if not np.array_equal(x, self.x):
self._update_x_impl(x)
self._update_grad()
return self.g
def hess(self, x):
if not np.array_equal(x, self.x):
self._update_x_impl(x)
self._update_hess()
return self.H
def fun_and_grad(self, x):
if not np.array_equal(x, self.x):
self._update_x_impl(x)
self._update_fun()
self._update_grad()
return self.f, self.g
class VectorFunction:
"""Vector function and its derivatives.
This class defines a vector function F: R^n->R^m and methods for
computing or approximating its first and second derivatives.
Notes
-----
This class implements a memoization logic. There are methods `fun`,
`jac`, hess` and corresponding attributes `f`, `J` and `H`. The following
things should be considered:
1. Use only public methods `fun`, `jac` and `hess`.
2. After one of the methods is called, the corresponding attribute
will be set. However, a subsequent call with a different argument
of *any* of the methods may overwrite the attribute.
"""
def __init__(self, fun, x0, jac, hess,
finite_diff_rel_step, finite_diff_jac_sparsity,
finite_diff_bounds, sparse_jacobian):
if not callable(jac) and jac not in FD_METHODS:
raise ValueError("`jac` must be either callable or one of {}."
.format(FD_METHODS))
if not (callable(hess) or hess in FD_METHODS
or isinstance(hess, HessianUpdateStrategy)):
raise ValueError("`hess` must be either callable,"
"HessianUpdateStrategy or one of {}."
.format(FD_METHODS))
if jac in FD_METHODS and hess in FD_METHODS:
raise ValueError("Whenever the Jacobian is estimated via "
"finite-differences, we require the Hessian to "
"be estimated using one of the quasi-Newton "
"strategies.")
self.x = np.atleast_1d(x0).astype(float)
self.n = self.x.size
self.nfev = 0
self.njev = 0
self.nhev = 0
self.f_updated = False
self.J_updated = False
self.H_updated = False
finite_diff_options = {}
if jac in FD_METHODS:
finite_diff_options["method"] = jac
finite_diff_options["rel_step"] = finite_diff_rel_step
if finite_diff_jac_sparsity is not None:
sparsity_groups = group_columns(finite_diff_jac_sparsity)
finite_diff_options["sparsity"] = (finite_diff_jac_sparsity,
sparsity_groups)
finite_diff_options["bounds"] = finite_diff_bounds
self.x_diff = np.copy(self.x)
if hess in FD_METHODS:
finite_diff_options["method"] = hess
finite_diff_options["rel_step"] = finite_diff_rel_step
finite_diff_options["as_linear_operator"] = True
self.x_diff = np.copy(self.x)
if jac in FD_METHODS and hess in FD_METHODS:
raise ValueError("Whenever the Jacobian is estimated via "
"finite-differences, we require the Hessian to "
"be estimated using one of the quasi-Newton "
"strategies.")
# Function evaluation
def fun_wrapped(x):
self.nfev += 1
return np.atleast_1d(fun(x))
def update_fun():
self.f = fun_wrapped(self.x)
self._update_fun_impl = update_fun
update_fun()
self.v = np.zeros_like(self.f)
self.m = self.v.size
# Jacobian Evaluation
if callable(jac):
self.J = jac(self.x)
self.J_updated = True
self.njev += 1
if (sparse_jacobian or
sparse_jacobian is None and sps.issparse(self.J)):
def jac_wrapped(x):
self.njev += 1
return sps.csr_matrix(jac(x))
self.J = sps.csr_matrix(self.J)
self.sparse_jacobian = True
elif sps.issparse(self.J):
def jac_wrapped(x):
self.njev += 1
return jac(x).toarray()
self.J = self.J.toarray()
self.sparse_jacobian = False
else:
def jac_wrapped(x):
self.njev += 1
return np.atleast_2d(jac(x))
self.J = np.atleast_2d(self.J)
self.sparse_jacobian = False
def update_jac():
self.J = jac_wrapped(self.x)
elif jac in FD_METHODS:
self.J = approx_derivative(fun_wrapped, self.x, f0=self.f,
**finite_diff_options)
self.J_updated = True
if (sparse_jacobian or
sparse_jacobian is None and sps.issparse(self.J)):
def update_jac():
self._update_fun()
self.J = sps.csr_matrix(
approx_derivative(fun_wrapped, self.x, f0=self.f,
**finite_diff_options))
self.J = sps.csr_matrix(self.J)
self.sparse_jacobian = True
elif sps.issparse(self.J):
def update_jac():
self._update_fun()
self.J = approx_derivative(fun_wrapped, self.x, f0=self.f,
**finite_diff_options).toarray()
self.J = self.J.toarray()
self.sparse_jacobian = False
else:
def update_jac():
self._update_fun()
self.J = np.atleast_2d(
approx_derivative(fun_wrapped, self.x, f0=self.f,
**finite_diff_options))
self.J = np.atleast_2d(self.J)
self.sparse_jacobian = False
self._update_jac_impl = update_jac
# Define Hessian
if callable(hess):
self.H = hess(self.x, self.v)
self.H_updated = True
self.nhev += 1
if sps.issparse(self.H):
def hess_wrapped(x, v):
self.nhev += 1
return sps.csr_matrix(hess(x, v))
self.H = sps.csr_matrix(self.H)
elif isinstance(self.H, LinearOperator):
def hess_wrapped(x, v):
self.nhev += 1
return hess(x, v)
else:
def hess_wrapped(x, v):
self.nhev += 1
return np.atleast_2d(np.asarray(hess(x, v)))
self.H = np.atleast_2d(np.asarray(self.H))
def update_hess():
self.H = hess_wrapped(self.x, self.v)
elif hess in FD_METHODS:
def jac_dot_v(x, v):
return jac_wrapped(x).T.dot(v)
def update_hess():
self._update_jac()
self.H = approx_derivative(jac_dot_v, self.x,
f0=self.J.T.dot(self.v),
args=(self.v,),
**finite_diff_options)
update_hess()
self.H_updated = True
elif isinstance(hess, HessianUpdateStrategy):
self.H = hess
self.H.initialize(self.n, 'hess')
self.H_updated = True
self.x_prev = None
self.J_prev = None
def update_hess():
self._update_jac()
# When v is updated before x was updated, then x_prev and
# J_prev are None and we need this check.
if self.x_prev is not None and self.J_prev is not None:
delta_x = self.x - self.x_prev
delta_g = self.J.T.dot(self.v) - self.J_prev.T.dot(self.v)
self.H.update(delta_x, delta_g)
self._update_hess_impl = update_hess
if isinstance(hess, HessianUpdateStrategy):
def update_x(x):
self._update_jac()
self.x_prev = self.x
self.J_prev = self.J
self.x = np.atleast_1d(x).astype(float)
self.f_updated = False
self.J_updated = False
self.H_updated = False
self._update_hess()
else:
def update_x(x):
self.x = np.atleast_1d(x).astype(float)
self.f_updated = False
self.J_updated = False
self.H_updated = False
self._update_x_impl = update_x
def _update_v(self, v):
if not np.array_equal(v, self.v):
self.v = v
self.H_updated = False
def _update_x(self, x):
if not np.array_equal(x, self.x):
self._update_x_impl(x)
def _update_fun(self):
if not self.f_updated:
self._update_fun_impl()
self.f_updated = True
def _update_jac(self):
if not self.J_updated:
self._update_jac_impl()
self.J_updated = True
def _update_hess(self):
if not self.H_updated:
self._update_hess_impl()
self.H_updated = True
def fun(self, x):
self._update_x(x)
self._update_fun()
return self.f
def jac(self, x):
self._update_x(x)
self._update_jac()
return self.J
def hess(self, x, v):
# v should be updated before x.
self._update_v(v)
self._update_x(x)
self._update_hess()
return self.H
class LinearVectorFunction:
"""Linear vector function and its derivatives.
Defines a linear function F = A x, where x is N-D vector and
A is m-by-n matrix. The Jacobian is constant and equals to A. The Hessian
is identically zero and it is returned as a csr matrix.
"""
def __init__(self, A, x0, sparse_jacobian):
if sparse_jacobian or sparse_jacobian is None and sps.issparse(A):
self.J = sps.csr_matrix(A)
self.sparse_jacobian = True
elif sps.issparse(A):
self.J = A.toarray()
self.sparse_jacobian = False
else:
# np.asarray makes sure A is ndarray and not matrix
self.J = np.atleast_2d(np.asarray(A))
self.sparse_jacobian = False
self.m, self.n = self.J.shape
self.x = np.atleast_1d(x0).astype(float)
self.f = self.J.dot(self.x)
self.f_updated = True
self.v = np.zeros(self.m, dtype=float)
self.H = sps.csr_matrix((self.n, self.n))
def _update_x(self, x):
if not np.array_equal(x, self.x):
self.x = np.atleast_1d(x).astype(float)
self.f_updated = False
def fun(self, x):
self._update_x(x)
if not self.f_updated:
self.f = self.J.dot(x)
self.f_updated = True
return self.f
def jac(self, x):
self._update_x(x)
return self.J
def hess(self, x, v):
self._update_x(x)
self.v = v
return self.H
class IdentityVectorFunction(LinearVectorFunction):
"""Identity vector function and its derivatives.
The Jacobian is the identity matrix, returned as a dense array when
`sparse_jacobian=False` and as a csr matrix otherwise. The Hessian is
identically zero and it is returned as a csr matrix.
"""
def __init__(self, x0, sparse_jacobian):
n = len(x0)
if sparse_jacobian or sparse_jacobian is None:
A = sps.eye(n, format='csr')
sparse_jacobian = True
else:
A = np.eye(n)
sparse_jacobian = False
super().__init__(A, x0, sparse_jacobian)

View File

@@ -0,0 +1,717 @@
# Dual Annealing implementation.
# Copyright (c) 2018 Sylvain Gubian <sylvain.gubian@pmi.com>,
# Yang Xiang <yang.xiang@pmi.com>
# Author: Sylvain Gubian, Yang Xiang, PMP S.A.
"""
A Dual Annealing global optimization algorithm
"""
import warnings
import numpy as np
from scipy.optimize import OptimizeResult
from scipy.optimize import minimize
from scipy.special import gammaln
from scipy._lib._util import check_random_state
__all__ = ['dual_annealing']
class VisitingDistribution:
"""
Class used to generate new coordinates based on the distorted
Cauchy-Lorentz distribution. Depending on the steps within the strategy
chain, the class implements the strategy for generating new location
changes.
Parameters
----------
lb : array_like
A 1-D NumPy ndarray containing lower bounds of the generated
components. Neither NaN or inf are allowed.
ub : array_like
A 1-D NumPy ndarray containing upper bounds for the generated
components. Neither NaN or inf are allowed.
visiting_param : float
Parameter for visiting distribution. Default value is 2.62.
Higher values give the visiting distribution a heavier tail, this
makes the algorithm jump to a more distant region.
The value range is (1, 3]. It's value is fixed for the life of the
object.
rand_gen : {`~numpy.random.RandomState`, `~numpy.random.Generator`}
A `~numpy.random.RandomState`, `~numpy.random.Generator` object
for using the current state of the created random generator container.
"""
TAIL_LIMIT = 1.e8
MIN_VISIT_BOUND = 1.e-10
def __init__(self, lb, ub, visiting_param, rand_gen):
# if you wish to make _visiting_param adjustable during the life of
# the object then _factor2, _factor3, _factor5, _d1, _factor6 will
# have to be dynamically calculated in `visit_fn`. They're factored
# out here so they don't need to be recalculated all the time.
self._visiting_param = visiting_param
self.rand_gen = rand_gen
self.lower = lb
self.upper = ub
self.bound_range = ub - lb
# these are invariant numbers unless visiting_param changes
self._factor2 = np.exp((4.0 - self._visiting_param) * np.log(
self._visiting_param - 1.0))
self._factor3 = np.exp((2.0 - self._visiting_param) * np.log(2.0)
/ (self._visiting_param - 1.0))
self._factor4_p = np.sqrt(np.pi) * self._factor2 / (self._factor3 * (
3.0 - self._visiting_param))
self._factor5 = 1.0 / (self._visiting_param - 1.0) - 0.5
self._d1 = 2.0 - self._factor5
self._factor6 = np.pi * (1.0 - self._factor5) / np.sin(
np.pi * (1.0 - self._factor5)) / np.exp(gammaln(self._d1))
def visiting(self, x, step, temperature):
""" Based on the step in the strategy chain, new coordinated are
generated by changing all components is the same time or only
one of them, the new values are computed with visit_fn method
"""
dim = x.size
if step < dim:
# Changing all coordinates with a new visiting value
visits = self.visit_fn(temperature, dim)
upper_sample, lower_sample = self.rand_gen.uniform(size=2)
visits[visits > self.TAIL_LIMIT] = self.TAIL_LIMIT * upper_sample
visits[visits < -self.TAIL_LIMIT] = -self.TAIL_LIMIT * lower_sample
x_visit = visits + x
a = x_visit - self.lower
b = np.fmod(a, self.bound_range) + self.bound_range
x_visit = np.fmod(b, self.bound_range) + self.lower
x_visit[np.fabs(
x_visit - self.lower) < self.MIN_VISIT_BOUND] += 1.e-10
else:
# Changing only one coordinate at a time based on strategy
# chain step
x_visit = np.copy(x)
visit = self.visit_fn(temperature, 1)[0]
if visit > self.TAIL_LIMIT:
visit = self.TAIL_LIMIT * self.rand_gen.uniform()
elif visit < -self.TAIL_LIMIT:
visit = -self.TAIL_LIMIT * self.rand_gen.uniform()
index = step - dim
x_visit[index] = visit + x[index]
a = x_visit[index] - self.lower[index]
b = np.fmod(a, self.bound_range[index]) + self.bound_range[index]
x_visit[index] = np.fmod(b, self.bound_range[
index]) + self.lower[index]
if np.fabs(x_visit[index] - self.lower[
index]) < self.MIN_VISIT_BOUND:
x_visit[index] += self.MIN_VISIT_BOUND
return x_visit
def visit_fn(self, temperature, dim):
""" Formula Visita from p. 405 of reference [2] """
x, y = self.rand_gen.normal(size=(dim, 2)).T
factor1 = np.exp(np.log(temperature) / (self._visiting_param - 1.0))
factor4 = self._factor4_p * factor1
# sigmax
x *= np.exp(-(self._visiting_param - 1.0) * np.log(
self._factor6 / factor4) / (3.0 - self._visiting_param))
den = np.exp((self._visiting_param - 1.0) * np.log(np.fabs(y)) /
(3.0 - self._visiting_param))
return x / den
class EnergyState:
"""
Class used to record the energy state. At any time, it knows what is the
currently used coordinates and the most recent best location.
Parameters
----------
lower : array_like
A 1-D NumPy ndarray containing lower bounds for generating an initial
random components in the `reset` method.
upper : array_like
A 1-D NumPy ndarray containing upper bounds for generating an initial
random components in the `reset` method
components. Neither NaN or inf are allowed.
callback : callable, ``callback(x, f, context)``, optional
A callback function which will be called for all minima found.
``x`` and ``f`` are the coordinates and function value of the
latest minimum found, and `context` has value in [0, 1, 2]
"""
# Maximimum number of trials for generating a valid starting point
MAX_REINIT_COUNT = 1000
def __init__(self, lower, upper, callback=None):
self.ebest = None
self.current_energy = None
self.current_location = None
self.xbest = None
self.lower = lower
self.upper = upper
self.callback = callback
def reset(self, func_wrapper, rand_gen, x0=None):
"""
Initialize current location is the search domain. If `x0` is not
provided, a random location within the bounds is generated.
"""
if x0 is None:
self.current_location = rand_gen.uniform(self.lower, self.upper,
size=len(self.lower))
else:
self.current_location = np.copy(x0)
init_error = True
reinit_counter = 0
while init_error:
self.current_energy = func_wrapper.fun(self.current_location)
if self.current_energy is None:
raise ValueError('Objective function is returning None')
if (not np.isfinite(self.current_energy) or np.isnan(
self.current_energy)):
if reinit_counter >= EnergyState.MAX_REINIT_COUNT:
init_error = False
message = (
'Stopping algorithm because function '
'create NaN or (+/-) infinity values even with '
'trying new random parameters'
)
raise ValueError(message)
self.current_location = rand_gen.uniform(self.lower,
self.upper,
size=self.lower.size)
reinit_counter += 1
else:
init_error = False
# If first time reset, initialize ebest and xbest
if self.ebest is None and self.xbest is None:
self.ebest = self.current_energy
self.xbest = np.copy(self.current_location)
# Otherwise, we keep them in case of reannealing reset
def update_best(self, e, x, context):
self.ebest = e
self.xbest = np.copy(x)
if self.callback is not None:
val = self.callback(x, e, context)
if val is not None:
if val:
return('Callback function requested to stop early by '
'returning True')
def update_current(self, e, x):
self.current_energy = e
self.current_location = np.copy(x)
class StrategyChain:
"""
Class that implements within a Markov chain the strategy for location
acceptance and local search decision making.
Parameters
----------
acceptance_param : float
Parameter for acceptance distribution. It is used to control the
probability of acceptance. The lower the acceptance parameter, the
smaller the probability of acceptance. Default value is -5.0 with
a range (-1e4, -5].
visit_dist : VisitingDistribution
Instance of `VisitingDistribution` class.
func_wrapper : ObjectiveFunWrapper
Instance of `ObjectiveFunWrapper` class.
minimizer_wrapper: LocalSearchWrapper
Instance of `LocalSearchWrapper` class.
rand_gen : {None, int, `numpy.random.Generator`,
`numpy.random.RandomState`}, optional
If `seed` is None (or `np.random`), the `numpy.random.RandomState`
singleton is used.
If `seed` is an int, a new ``RandomState`` instance is used,
seeded with `seed`.
If `seed` is already a ``Generator`` or ``RandomState`` instance then
that instance is used.
energy_state: EnergyState
Instance of `EnergyState` class.
"""
def __init__(self, acceptance_param, visit_dist, func_wrapper,
minimizer_wrapper, rand_gen, energy_state):
# Local strategy chain minimum energy and location
self.emin = energy_state.current_energy
self.xmin = np.array(energy_state.current_location)
# Global optimizer state
self.energy_state = energy_state
# Acceptance parameter
self.acceptance_param = acceptance_param
# Visiting distribution instance
self.visit_dist = visit_dist
# Wrapper to objective function
self.func_wrapper = func_wrapper
# Wrapper to the local minimizer
self.minimizer_wrapper = minimizer_wrapper
self.not_improved_idx = 0
self.not_improved_max_idx = 1000
self._rand_gen = rand_gen
self.temperature_step = 0
self.K = 100 * len(energy_state.current_location)
def accept_reject(self, j, e, x_visit):
r = self._rand_gen.uniform()
pqv_temp = 1.0 - ((1.0 - self.acceptance_param) *
(e - self.energy_state.current_energy) / self.temperature_step)
if pqv_temp <= 0.:
pqv = 0.
else:
pqv = np.exp(np.log(pqv_temp) / (
1. - self.acceptance_param))
if r <= pqv:
# We accept the new location and update state
self.energy_state.update_current(e, x_visit)
self.xmin = np.copy(self.energy_state.current_location)
# No improvement for a long time
if self.not_improved_idx >= self.not_improved_max_idx:
if j == 0 or self.energy_state.current_energy < self.emin:
self.emin = self.energy_state.current_energy
self.xmin = np.copy(self.energy_state.current_location)
def run(self, step, temperature):
self.temperature_step = temperature / float(step + 1)
self.not_improved_idx += 1
for j in range(self.energy_state.current_location.size * 2):
if j == 0:
if step == 0:
self.energy_state_improved = True
else:
self.energy_state_improved = False
x_visit = self.visit_dist.visiting(
self.energy_state.current_location, j, temperature)
# Calling the objective function
e = self.func_wrapper.fun(x_visit)
if e < self.energy_state.current_energy:
# We have got a better energy value
self.energy_state.update_current(e, x_visit)
if e < self.energy_state.ebest:
val = self.energy_state.update_best(e, x_visit, 0)
if val is not None:
if val:
return val
self.energy_state_improved = True
self.not_improved_idx = 0
else:
# We have not improved but do we accept the new location?
self.accept_reject(j, e, x_visit)
if self.func_wrapper.nfev >= self.func_wrapper.maxfun:
return ('Maximum number of function call reached '
'during annealing')
# End of StrategyChain loop
def local_search(self):
# Decision making for performing a local search
# based on strategy chain results
# If energy has been improved or no improvement since too long,
# performing a local search with the best strategy chain location
if self.energy_state_improved:
# Global energy has improved, let's see if LS improves further
e, x = self.minimizer_wrapper.local_search(self.energy_state.xbest,
self.energy_state.ebest)
if e < self.energy_state.ebest:
self.not_improved_idx = 0
val = self.energy_state.update_best(e, x, 1)
if val is not None:
if val:
return val
self.energy_state.update_current(e, x)
if self.func_wrapper.nfev >= self.func_wrapper.maxfun:
return ('Maximum number of function call reached '
'during local search')
# Check probability of a need to perform a LS even if no improvement
do_ls = False
if self.K < 90 * len(self.energy_state.current_location):
pls = np.exp(self.K * (
self.energy_state.ebest - self.energy_state.current_energy) /
self.temperature_step)
if pls >= self._rand_gen.uniform():
do_ls = True
# Global energy not improved, let's see what LS gives
# on the best strategy chain location
if self.not_improved_idx >= self.not_improved_max_idx:
do_ls = True
if do_ls:
e, x = self.minimizer_wrapper.local_search(self.xmin, self.emin)
self.xmin = np.copy(x)
self.emin = e
self.not_improved_idx = 0
self.not_improved_max_idx = self.energy_state.current_location.size
if e < self.energy_state.ebest:
val = self.energy_state.update_best(
self.emin, self.xmin, 2)
if val is not None:
if val:
return val
self.energy_state.update_current(e, x)
if self.func_wrapper.nfev >= self.func_wrapper.maxfun:
return ('Maximum number of function call reached '
'during dual annealing')
class ObjectiveFunWrapper:
def __init__(self, func, maxfun=1e7, *args):
self.func = func
self.args = args
# Number of objective function evaluations
self.nfev = 0
# Number of gradient function evaluation if used
self.ngev = 0
# Number of hessian of the objective function if used
self.nhev = 0
self.maxfun = maxfun
def fun(self, x):
self.nfev += 1
return self.func(x, *self.args)
class LocalSearchWrapper:
"""
Class used to wrap around the minimizer used for local search
Default local minimizer is SciPy minimizer L-BFGS-B
"""
LS_MAXITER_RATIO = 6
LS_MAXITER_MIN = 100
LS_MAXITER_MAX = 1000
def __init__(self, search_bounds, func_wrapper, **kwargs):
self.func_wrapper = func_wrapper
self.kwargs = kwargs
self.minimizer = minimize
bounds_list = list(zip(*search_bounds))
self.lower = np.array(bounds_list[0])
self.upper = np.array(bounds_list[1])
# If no minimizer specified, use SciPy minimize with 'L-BFGS-B' method
if not self.kwargs:
n = len(self.lower)
ls_max_iter = min(max(n * self.LS_MAXITER_RATIO,
self.LS_MAXITER_MIN),
self.LS_MAXITER_MAX)
self.kwargs['method'] = 'L-BFGS-B'
self.kwargs['options'] = {
'maxiter': ls_max_iter,
}
self.kwargs['bounds'] = list(zip(self.lower, self.upper))
def local_search(self, x, e):
# Run local search from the given x location where energy value is e
x_tmp = np.copy(x)
mres = self.minimizer(self.func_wrapper.fun, x, **self.kwargs)
if 'njev' in mres:
self.func_wrapper.ngev += mres.njev
if 'nhev' in mres:
self.func_wrapper.nhev += mres.nhev
# Check if is valid value
is_finite = np.all(np.isfinite(mres.x)) and np.isfinite(mres.fun)
in_bounds = np.all(mres.x >= self.lower) and np.all(
mres.x <= self.upper)
is_valid = is_finite and in_bounds
# Use the new point only if it is valid and return a better results
if is_valid and mres.fun < e:
return mres.fun, mres.x
else:
return e, x_tmp
def dual_annealing(func, bounds, args=(), maxiter=1000,
minimizer_kwargs=None, initial_temp=5230.,
restart_temp_ratio=2.e-5, visit=2.62, accept=-5.0,
maxfun=1e7, seed=None, no_local_search=False,
callback=None, x0=None, local_search_options=None):
"""
Find the global minimum of a function using Dual Annealing.
Parameters
----------
func : callable
The objective function to be minimized. Must be in the form
``f(x, *args)``, where ``x`` is the argument in the form of a 1-D array
and ``args`` is a tuple of any additional fixed parameters needed to
completely specify the function.
bounds : sequence, shape (n, 2)
Bounds for variables. ``(min, max)`` pairs for each element in ``x``,
defining bounds for the objective function parameter.
args : tuple, optional
Any additional fixed parameters needed to completely specify the
objective function.
maxiter : int, optional
The maximum number of global search iterations. Default value is 1000.
minimizer_kwargs : dict, optional
Extra keyword arguments to be passed to the local minimizer
(`minimize`). Some important options could be:
``method`` for the minimizer method to use and ``args`` for
objective function additional arguments.
initial_temp : float, optional
The initial temperature, use higher values to facilitates a wider
search of the energy landscape, allowing dual_annealing to escape
local minima that it is trapped in. Default value is 5230. Range is
(0.01, 5.e4].
restart_temp_ratio : float, optional
During the annealing process, temperature is decreasing, when it
reaches ``initial_temp * restart_temp_ratio``, the reannealing process
is triggered. Default value of the ratio is 2e-5. Range is (0, 1).
visit : float, optional
Parameter for visiting distribution. Default value is 2.62. Higher
values give the visiting distribution a heavier tail, this makes
the algorithm jump to a more distant region. The value range is (1, 3].
accept : float, optional
Parameter for acceptance distribution. It is used to control the
probability of acceptance. The lower the acceptance parameter, the
smaller the probability of acceptance. Default value is -5.0 with
a range (-1e4, -5].
maxfun : int, optional
Soft limit for the number of objective function calls. If the
algorithm is in the middle of a local search, this number will be
exceeded, the algorithm will stop just after the local search is
done. Default value is 1e7.
seed : {None, int, `numpy.random.Generator`,
`numpy.random.RandomState`}, optional
If `seed` is None (or `np.random`), the `numpy.random.RandomState`
singleton is used.
If `seed` is an int, a new ``RandomState`` instance is used,
seeded with `seed`.
If `seed` is already a ``Generator`` or ``RandomState`` instance then
that instance is used.
Specify `seed` for repeatable minimizations. The random numbers
generated with this seed only affect the visiting distribution function
and new coordinates generation.
no_local_search : bool, optional
If `no_local_search` is set to True, a traditional Generalized
Simulated Annealing will be performed with no local search
strategy applied.
callback : callable, optional
A callback function with signature ``callback(x, f, context)``,
which will be called for all minima found.
``x`` and ``f`` are the coordinates and function value of the
latest minimum found, and ``context`` has value in [0, 1, 2], with the
following meaning:
- 0: minimum detected in the annealing process.
- 1: detection occurred in the local search process.
- 2: detection done in the dual annealing process.
If the callback implementation returns True, the algorithm will stop.
x0 : ndarray, shape(n,), optional
Coordinates of a single N-D starting point.
local_search_options : dict, optional
Backwards compatible flag for `minimizer_kwargs`, only one of these
should be supplied.
Returns
-------
res : OptimizeResult
The optimization result represented as a `OptimizeResult` object.
Important attributes are: ``x`` the solution array, ``fun`` the value
of the function at the solution, and ``message`` which describes the
cause of the termination.
See `OptimizeResult` for a description of other attributes.
Notes
-----
This function implements the Dual Annealing optimization. This stochastic
approach derived from [3]_ combines the generalization of CSA (Classical
Simulated Annealing) and FSA (Fast Simulated Annealing) [1]_ [2]_ coupled
to a strategy for applying a local search on accepted locations [4]_.
An alternative implementation of this same algorithm is described in [5]_
and benchmarks are presented in [6]_. This approach introduces an advanced
method to refine the solution found by the generalized annealing
process. This algorithm uses a distorted Cauchy-Lorentz visiting
distribution, with its shape controlled by the parameter :math:`q_{v}`
.. math::
g_{q_{v}}(\\Delta x(t)) \\propto \\frac{ \\
\\left[T_{q_{v}}(t) \\right]^{-\\frac{D}{3-q_{v}}}}{ \\
\\left[{1+(q_{v}-1)\\frac{(\\Delta x(t))^{2}} { \\
\\left[T_{q_{v}}(t)\\right]^{\\frac{2}{3-q_{v}}}}}\\right]^{ \\
\\frac{1}{q_{v}-1}+\\frac{D-1}{2}}}
Where :math:`t` is the artificial time. This visiting distribution is used
to generate a trial jump distance :math:`\\Delta x(t)` of variable
:math:`x(t)` under artificial temperature :math:`T_{q_{v}}(t)`.
From the starting point, after calling the visiting distribution
function, the acceptance probability is computed as follows:
.. math::
p_{q_{a}} = \\min{\\{1,\\left[1-(1-q_{a}) \\beta \\Delta E \\right]^{ \\
\\frac{1}{1-q_{a}}}\\}}
Where :math:`q_{a}` is a acceptance parameter. For :math:`q_{a}<1`, zero
acceptance probability is assigned to the cases where
.. math::
[1-(1-q_{a}) \\beta \\Delta E] < 0
The artificial temperature :math:`T_{q_{v}}(t)` is decreased according to
.. math::
T_{q_{v}}(t) = T_{q_{v}}(1) \\frac{2^{q_{v}-1}-1}{\\left( \\
1 + t\\right)^{q_{v}-1}-1}
Where :math:`q_{v}` is the visiting parameter.
.. versionadded:: 1.2.0
References
----------
.. [1] Tsallis C. Possible generalization of Boltzmann-Gibbs
statistics. Journal of Statistical Physics, 52, 479-487 (1998).
.. [2] Tsallis C, Stariolo DA. Generalized Simulated Annealing.
Physica A, 233, 395-406 (1996).
.. [3] Xiang Y, Sun DY, Fan W, Gong XG. Generalized Simulated
Annealing Algorithm and Its Application to the Thomson Model.
Physics Letters A, 233, 216-220 (1997).
.. [4] Xiang Y, Gong XG. Efficiency of Generalized Simulated
Annealing. Physical Review E, 62, 4473 (2000).
.. [5] Xiang Y, Gubian S, Suomela B, Hoeng J. Generalized
Simulated Annealing for Efficient Global Optimization: the GenSA
Package for R. The R Journal, Volume 5/1 (2013).
.. [6] Mullen, K. Continuous Global Optimization in R. Journal of
Statistical Software, 60(6), 1 - 45, (2014).
:doi:`10.18637/jss.v060.i06`
Examples
--------
The following example is a 10-D problem, with many local minima.
The function involved is called Rastrigin
(https://en.wikipedia.org/wiki/Rastrigin_function)
>>> from scipy.optimize import dual_annealing
>>> func = lambda x: np.sum(x*x - 10*np.cos(2*np.pi*x)) + 10*np.size(x)
>>> lw = [-5.12] * 10
>>> up = [5.12] * 10
>>> ret = dual_annealing(func, bounds=list(zip(lw, up)))
>>> ret.x
array([-4.26437714e-09, -3.91699361e-09, -1.86149218e-09, -3.97165720e-09,
-6.29151648e-09, -6.53145322e-09, -3.93616815e-09, -6.55623025e-09,
-6.05775280e-09, -5.00668935e-09]) # random
>>> ret.fun
0.000000
""" # noqa: E501
if x0 is not None and not len(x0) == len(bounds):
raise ValueError('Bounds size does not match x0')
lu = list(zip(*bounds))
lower = np.array(lu[0])
upper = np.array(lu[1])
# Check that restart temperature ratio is correct
if restart_temp_ratio <= 0. or restart_temp_ratio >= 1.:
raise ValueError('Restart temperature ratio has to be in range (0, 1)')
# Checking bounds are valid
if (np.any(np.isinf(lower)) or np.any(np.isinf(upper)) or np.any(
np.isnan(lower)) or np.any(np.isnan(upper))):
raise ValueError('Some bounds values are inf values or nan values')
# Checking that bounds are consistent
if not np.all(lower < upper):
raise ValueError('Bounds are not consistent min < max')
# Checking that bounds are the same length
if not len(lower) == len(upper):
raise ValueError('Bounds do not have the same dimensions')
# Wrapper for the objective function
func_wrapper = ObjectiveFunWrapper(func, maxfun, *args)
# Wrapper for the minimizer
if local_search_options and minimizer_kwargs:
raise ValueError("dual_annealing only allows either 'minimizer_kwargs' (preferred) or "
"'local_search_options' (deprecated); not both!")
if local_search_options is not None:
warnings.warn("dual_annealing argument 'local_search_options' is "
"deprecated in favor of 'minimizer_kwargs'",
category=DeprecationWarning, stacklevel=2)
minimizer_kwargs = local_search_options
# minimizer_kwargs has to be a dict, not None
minimizer_kwargs = minimizer_kwargs or {}
minimizer_wrapper = LocalSearchWrapper(
bounds, func_wrapper, **minimizer_kwargs)
# Initialization of random Generator for reproducible runs if seed provided
rand_state = check_random_state(seed)
# Initialization of the energy state
energy_state = EnergyState(lower, upper, callback)
energy_state.reset(func_wrapper, rand_state, x0)
# Minimum value of annealing temperature reached to perform
# re-annealing
temperature_restart = initial_temp * restart_temp_ratio
# VisitingDistribution instance
visit_dist = VisitingDistribution(lower, upper, visit, rand_state)
# Strategy chain instance
strategy_chain = StrategyChain(accept, visit_dist, func_wrapper,
minimizer_wrapper, rand_state, energy_state)
need_to_stop = False
iteration = 0
message = []
# OptimizeResult object to be returned
optimize_res = OptimizeResult()
optimize_res.success = True
optimize_res.status = 0
t1 = np.exp((visit - 1) * np.log(2.0)) - 1.0
# Run the search loop
while(not need_to_stop):
for i in range(maxiter):
# Compute temperature for this step
s = float(i) + 2.0
t2 = np.exp((visit - 1) * np.log(s)) - 1.0
temperature = initial_temp * t1 / t2
if iteration >= maxiter:
message.append("Maximum number of iteration reached")
need_to_stop = True
break
# Need a re-annealing process?
if temperature < temperature_restart:
energy_state.reset(func_wrapper, rand_state)
break
# starting strategy chain
val = strategy_chain.run(i, temperature)
if val is not None:
message.append(val)
need_to_stop = True
optimize_res.success = False
break
# Possible local search at the end of the strategy chain
if not no_local_search:
val = strategy_chain.local_search()
if val is not None:
message.append(val)
need_to_stop = True
optimize_res.success = False
break
iteration += 1
# Setting the OptimizeResult values
optimize_res.x = energy_state.xbest
optimize_res.fun = energy_state.ebest
optimize_res.nit = iteration
optimize_res.nfev = func_wrapper.nfev
optimize_res.njev = func_wrapper.ngev
optimize_res.nhev = func_wrapper.nhev
optimize_res.message = message
return optimize_res

View File

@@ -0,0 +1,97 @@
"""
Pythran implementation of columns grouping for finite difference Jacobian
estimation. Used by ._numdiff.group_columns and based on the Cython version.
"""
import numpy as np
#pythran export group_dense(int, int, intc[:,:])
#pythran export group_dense(int, int, int[:,:])
def group_dense(m, n, A):
B = A.T # Transposed view for convenience.
# FIXME: use np.full once pythran supports it
groups = -np.ones(n, dtype=np.intp)
current_group = 0
union = np.empty(m, dtype=np.intp)
# Loop through all the columns.
for i in range(n):
if groups[i] >= 0: # A group was already assigned.
continue
groups[i] = current_group
all_grouped = True
union[:] = B[i] # Here we store the union of grouped columns.
for j in range(groups.shape[0]):
if groups[j] < 0:
all_grouped = False
else:
continue
# Determine if j-th column intersects with the union.
intersect = False
for k in range(m):
if union[k] > 0 and B[j, k] > 0:
intersect = True
break
# If not, add it to the union and assign the group to it.
if not intersect:
union += B[j]
groups[j] = current_group
if all_grouped:
break
current_group += 1
return groups
#pythran export group_sparse(int, int, intc[], intc[])
#pythran export group_sparse(int, int, int[], int[])
#pythran export group_sparse(int, int, intc[::], intc[::])
#pythran export group_sparse(int, int, int[::], int[::])
def group_sparse(m, n, indices, indptr):
groups = -np.ones(n, dtype=np.intp)
current_group = 0
union = np.empty(m, dtype=np.intp)
for i in range(n):
if groups[i] >= 0:
continue
groups[i] = current_group
all_grouped = True
union.fill(0)
for k in range(indptr[i], indptr[i + 1]):
union[indices[k]] = 1
for j in range(groups.shape[0]):
if groups[j] < 0:
all_grouped = False
else:
continue
intersect = False
for k in range(indptr[j], indptr[j + 1]):
if union[indices[k]] == 1:
intersect = True
break
if not intersect:
for k in range(indptr[j], indptr[j + 1]):
union[indices[k]] = 1
groups[j] = current_group
if all_grouped:
break
current_group += 1
return groups

View File

@@ -0,0 +1,429 @@
"""Hessian update strategies for quasi-Newton optimization methods."""
import numpy as np
from numpy.linalg import norm
from scipy.linalg import get_blas_funcs
from warnings import warn
__all__ = ['HessianUpdateStrategy', 'BFGS', 'SR1']
class HessianUpdateStrategy:
"""Interface for implementing Hessian update strategies.
Many optimization methods make use of Hessian (or inverse Hessian)
approximations, such as the quasi-Newton methods BFGS, SR1, L-BFGS.
Some of these approximations, however, do not actually need to store
the entire matrix or can compute the internal matrix product with a
given vector in a very efficiently manner. This class serves as an
abstract interface between the optimization algorithm and the
quasi-Newton update strategies, giving freedom of implementation
to store and update the internal matrix as efficiently as possible.
Different choices of initialization and update procedure will result
in different quasi-Newton strategies.
Four methods should be implemented in derived classes: ``initialize``,
``update``, ``dot`` and ``get_matrix``.
Notes
-----
Any instance of a class that implements this interface,
can be accepted by the method ``minimize`` and used by
the compatible solvers to approximate the Hessian (or
inverse Hessian) used by the optimization algorithms.
"""
def initialize(self, n, approx_type):
"""Initialize internal matrix.
Allocate internal memory for storing and updating
the Hessian or its inverse.
Parameters
----------
n : int
Problem dimension.
approx_type : {'hess', 'inv_hess'}
Selects either the Hessian or the inverse Hessian.
When set to 'hess' the Hessian will be stored and updated.
When set to 'inv_hess' its inverse will be used instead.
"""
raise NotImplementedError("The method ``initialize(n, approx_type)``"
" is not implemented.")
def update(self, delta_x, delta_grad):
"""Update internal matrix.
Update Hessian matrix or its inverse (depending on how 'approx_type'
is defined) using information about the last evaluated points.
Parameters
----------
delta_x : ndarray
The difference between two points the gradient
function have been evaluated at: ``delta_x = x2 - x1``.
delta_grad : ndarray
The difference between the gradients:
``delta_grad = grad(x2) - grad(x1)``.
"""
raise NotImplementedError("The method ``update(delta_x, delta_grad)``"
" is not implemented.")
def dot(self, p):
"""Compute the product of the internal matrix with the given vector.
Parameters
----------
p : array_like
1-D array representing a vector.
Returns
-------
Hp : array
1-D represents the result of multiplying the approximation matrix
by vector p.
"""
raise NotImplementedError("The method ``dot(p)``"
" is not implemented.")
def get_matrix(self):
"""Return current internal matrix.
Returns
-------
H : ndarray, shape (n, n)
Dense matrix containing either the Hessian
or its inverse (depending on how 'approx_type'
is defined).
"""
raise NotImplementedError("The method ``get_matrix(p)``"
" is not implemented.")
class FullHessianUpdateStrategy(HessianUpdateStrategy):
"""Hessian update strategy with full dimensional internal representation.
"""
_syr = get_blas_funcs('syr', dtype='d') # Symmetric rank 1 update
_syr2 = get_blas_funcs('syr2', dtype='d') # Symmetric rank 2 update
# Symmetric matrix-vector product
_symv = get_blas_funcs('symv', dtype='d')
def __init__(self, init_scale='auto'):
self.init_scale = init_scale
# Until initialize is called we can't really use the class,
# so it makes sense to set everything to None.
self.first_iteration = None
self.approx_type = None
self.B = None
self.H = None
def initialize(self, n, approx_type):
"""Initialize internal matrix.
Allocate internal memory for storing and updating
the Hessian or its inverse.
Parameters
----------
n : int
Problem dimension.
approx_type : {'hess', 'inv_hess'}
Selects either the Hessian or the inverse Hessian.
When set to 'hess' the Hessian will be stored and updated.
When set to 'inv_hess' its inverse will be used instead.
"""
self.first_iteration = True
self.n = n
self.approx_type = approx_type
if approx_type not in ('hess', 'inv_hess'):
raise ValueError("`approx_type` must be 'hess' or 'inv_hess'.")
# Create matrix
if self.approx_type == 'hess':
self.B = np.eye(n, dtype=float)
else:
self.H = np.eye(n, dtype=float)
def _auto_scale(self, delta_x, delta_grad):
# Heuristic to scale matrix at first iteration.
# Described in Nocedal and Wright "Numerical Optimization"
# p.143 formula (6.20).
s_norm2 = np.dot(delta_x, delta_x)
y_norm2 = np.dot(delta_grad, delta_grad)
ys = np.abs(np.dot(delta_grad, delta_x))
if ys == 0.0 or y_norm2 == 0 or s_norm2 == 0:
return 1
if self.approx_type == 'hess':
return y_norm2 / ys
else:
return ys / y_norm2
def _update_implementation(self, delta_x, delta_grad):
raise NotImplementedError("The method ``_update_implementation``"
" is not implemented.")
def update(self, delta_x, delta_grad):
"""Update internal matrix.
Update Hessian matrix or its inverse (depending on how 'approx_type'
is defined) using information about the last evaluated points.
Parameters
----------
delta_x : ndarray
The difference between two points the gradient
function have been evaluated at: ``delta_x = x2 - x1``.
delta_grad : ndarray
The difference between the gradients:
``delta_grad = grad(x2) - grad(x1)``.
"""
if np.all(delta_x == 0.0):
return
if np.all(delta_grad == 0.0):
warn('delta_grad == 0.0. Check if the approximated '
'function is linear. If the function is linear '
'better results can be obtained by defining the '
'Hessian as zero instead of using quasi-Newton '
'approximations.', UserWarning)
return
if self.first_iteration:
# Get user specific scale
if self.init_scale == "auto":
scale = self._auto_scale(delta_x, delta_grad)
else:
scale = float(self.init_scale)
# Scale initial matrix with ``scale * np.eye(n)``
if self.approx_type == 'hess':
self.B *= scale
else:
self.H *= scale
self.first_iteration = False
self._update_implementation(delta_x, delta_grad)
def dot(self, p):
"""Compute the product of the internal matrix with the given vector.
Parameters
----------
p : array_like
1-D array representing a vector.
Returns
-------
Hp : array
1-D represents the result of multiplying the approximation matrix
by vector p.
"""
if self.approx_type == 'hess':
return self._symv(1, self.B, p)
else:
return self._symv(1, self.H, p)
def get_matrix(self):
"""Return the current internal matrix.
Returns
-------
M : ndarray, shape (n, n)
Dense matrix containing either the Hessian or its inverse
(depending on how `approx_type` was defined).
"""
if self.approx_type == 'hess':
M = np.copy(self.B)
else:
M = np.copy(self.H)
li = np.tril_indices_from(M, k=-1)
M[li] = M.T[li]
return M
class BFGS(FullHessianUpdateStrategy):
"""Broyden-Fletcher-Goldfarb-Shanno (BFGS) Hessian update strategy.
Parameters
----------
exception_strategy : {'skip_update', 'damp_update'}, optional
Define how to proceed when the curvature condition is violated.
Set it to 'skip_update' to just skip the update. Or, alternatively,
set it to 'damp_update' to interpolate between the actual BFGS
result and the unmodified matrix. Both exceptions strategies
are explained in [1]_, p.536-537.
min_curvature : float
This number, scaled by a normalization factor, defines the
minimum curvature ``dot(delta_grad, delta_x)`` allowed to go
unaffected by the exception strategy. By default is equal to
1e-8 when ``exception_strategy = 'skip_update'`` and equal
to 0.2 when ``exception_strategy = 'damp_update'``.
init_scale : {float, 'auto'}
Matrix scale at first iteration. At the first
iteration the Hessian matrix or its inverse will be initialized
with ``init_scale*np.eye(n)``, where ``n`` is the problem dimension.
Set it to 'auto' in order to use an automatic heuristic for choosing
the initial scale. The heuristic is described in [1]_, p.143.
By default uses 'auto'.
Notes
-----
The update is based on the description in [1]_, p.140.
References
----------
.. [1] Nocedal, Jorge, and Stephen J. Wright. "Numerical optimization"
Second Edition (2006).
"""
def __init__(self, exception_strategy='skip_update', min_curvature=None,
init_scale='auto'):
if exception_strategy == 'skip_update':
if min_curvature is not None:
self.min_curvature = min_curvature
else:
self.min_curvature = 1e-8
elif exception_strategy == 'damp_update':
if min_curvature is not None:
self.min_curvature = min_curvature
else:
self.min_curvature = 0.2
else:
raise ValueError("`exception_strategy` must be 'skip_update' "
"or 'damp_update'.")
super().__init__(init_scale)
self.exception_strategy = exception_strategy
def _update_inverse_hessian(self, ys, Hy, yHy, s):
"""Update the inverse Hessian matrix.
BFGS update using the formula:
``H <- H + ((H*y).T*y + s.T*y)/(s.T*y)^2 * (s*s.T)
- 1/(s.T*y) * ((H*y)*s.T + s*(H*y).T)``
where ``s = delta_x`` and ``y = delta_grad``. This formula is
equivalent to (6.17) in [1]_ written in a more efficient way
for implementation.
References
----------
.. [1] Nocedal, Jorge, and Stephen J. Wright. "Numerical optimization"
Second Edition (2006).
"""
self.H = self._syr2(-1.0 / ys, s, Hy, a=self.H)
self.H = self._syr((ys+yHy)/ys**2, s, a=self.H)
def _update_hessian(self, ys, Bs, sBs, y):
"""Update the Hessian matrix.
BFGS update using the formula:
``B <- B - (B*s)*(B*s).T/s.T*(B*s) + y*y^T/s.T*y``
where ``s`` is short for ``delta_x`` and ``y`` is short
for ``delta_grad``. Formula (6.19) in [1]_.
References
----------
.. [1] Nocedal, Jorge, and Stephen J. Wright. "Numerical optimization"
Second Edition (2006).
"""
self.B = self._syr(1.0 / ys, y, a=self.B)
self.B = self._syr(-1.0 / sBs, Bs, a=self.B)
def _update_implementation(self, delta_x, delta_grad):
# Auxiliary variables w and z
if self.approx_type == 'hess':
w = delta_x
z = delta_grad
else:
w = delta_grad
z = delta_x
# Do some common operations
wz = np.dot(w, z)
Mw = self.dot(w)
wMw = Mw.dot(w)
# Guarantee that wMw > 0 by reinitializing matrix.
# While this is always true in exact arithmetics,
# indefinite matrix may appear due to roundoff errors.
if wMw <= 0.0:
scale = self._auto_scale(delta_x, delta_grad)
# Reinitialize matrix
if self.approx_type == 'hess':
self.B = scale * np.eye(self.n, dtype=float)
else:
self.H = scale * np.eye(self.n, dtype=float)
# Do common operations for new matrix
Mw = self.dot(w)
wMw = Mw.dot(w)
# Check if curvature condition is violated
if wz <= self.min_curvature * wMw:
# If the option 'skip_update' is set
# we just skip the update when the condion
# is violated.
if self.exception_strategy == 'skip_update':
return
# If the option 'damp_update' is set we
# interpolate between the actual BFGS
# result and the unmodified matrix.
elif self.exception_strategy == 'damp_update':
update_factor = (1-self.min_curvature) / (1 - wz/wMw)
z = update_factor*z + (1-update_factor)*Mw
wz = np.dot(w, z)
# Update matrix
if self.approx_type == 'hess':
self._update_hessian(wz, Mw, wMw, z)
else:
self._update_inverse_hessian(wz, Mw, wMw, z)
class SR1(FullHessianUpdateStrategy):
"""Symmetric-rank-1 Hessian update strategy.
Parameters
----------
min_denominator : float
This number, scaled by a normalization factor,
defines the minimum denominator magnitude allowed
in the update. When the condition is violated we skip
the update. By default uses ``1e-8``.
init_scale : {float, 'auto'}, optional
Matrix scale at first iteration. At the first
iteration the Hessian matrix or its inverse will be initialized
with ``init_scale*np.eye(n)``, where ``n`` is the problem dimension.
Set it to 'auto' in order to use an automatic heuristic for choosing
the initial scale. The heuristic is described in [1]_, p.143.
By default uses 'auto'.
Notes
-----
The update is based on the description in [1]_, p.144-146.
References
----------
.. [1] Nocedal, Jorge, and Stephen J. Wright. "Numerical optimization"
Second Edition (2006).
"""
def __init__(self, min_denominator=1e-8, init_scale='auto'):
self.min_denominator = min_denominator
super().__init__(init_scale)
def _update_implementation(self, delta_x, delta_grad):
# Auxiliary variables w and z
if self.approx_type == 'hess':
w = delta_x
z = delta_grad
else:
w = delta_grad
z = delta_x
# Do some common operations
Mw = self.dot(w)
z_minus_Mw = z - Mw
denominator = np.dot(w, z_minus_Mw)
# If the denominator is too small
# we just skip the update.
if np.abs(denominator) <= self.min_denominator*norm(w)*norm(z_minus_Mw):
return
# Update matrix
if self.approx_type == 'hess':
self.B = self._syr(1/denominator, z_minus_Mw, a=self.B)
else:
self.H = self._syr(1/denominator, z_minus_Mw, a=self.H)

View File

@@ -0,0 +1,64 @@
# distutils: language=c++
# cython: language_level=3
from libcpp cimport bool
from libcpp.string cimport string
cdef extern from "HConst.h" nogil:
const int HIGHS_CONST_I_INF
const double HIGHS_CONST_INF
const double HIGHS_CONST_TINY
const double HIGHS_CONST_ZERO
const int HIGHS_THREAD_LIMIT
const bool allow_infinite_costs
const string FILENAME_DEFAULT
ctypedef enum HighsModelStatus:
HighsModelStatusNOTSET "HighsModelStatus::NOTSET" = 0
HighsModelStatusHIGHS_MODEL_STATUS_MIN "HighsModelStatus::HIGHS_MODEL_STATUS_MIN" = HighsModelStatusNOTSET
HighsModelStatusLOAD_ERROR "HighsModelStatus::LOAD_ERROR"
HighsModelStatusMODEL_ERROR "HighsModelStatus::MODEL_ERROR"
HighsModelStatusPRESOLVE_ERROR "HighsModelStatus::PRESOLVE_ERROR"
HighsModelStatusSOLVE_ERROR "HighsModelStatus::SOLVE_ERROR"
HighsModelStatusPOSTSOLVE_ERROR "HighsModelStatus::POSTSOLVE_ERROR"
HighsModelStatusMODEL_EMPTY "HighsModelStatus::MODEL_EMPTY"
HighsModelStatusPRIMAL_INFEASIBLE "HighsModelStatus::PRIMAL_INFEASIBLE"
HighsModelStatusPRIMAL_UNBOUNDED "HighsModelStatus::PRIMAL_UNBOUNDED"
HighsModelStatusOPTIMAL "HighsModelStatus::OPTIMAL"
HighsModelStatusREACHED_DUAL_OBJECTIVE_VALUE_UPPER_BOUND "HighsModelStatus::REACHED_DUAL_OBJECTIVE_VALUE_UPPER_BOUND"
HighsModelStatusREACHED_TIME_LIMIT "HighsModelStatus::REACHED_TIME_LIMIT"
HighsModelStatusREACHED_ITERATION_LIMIT "HighsModelStatus::REACHED_ITERATION_LIMIT"
HighsModelStatusPRIMAL_DUAL_INFEASIBLE "HighsModelStatus::PRIMAL_DUAL_INFEASIBLE"
HighsModelStatusDUAL_INFEASIBLE "HighsModelStatus::DUAL_INFEASIBLE"
HighsModelStatusHIGHS_MODEL_STATUS_MAX "HighsModelStatus::HIGHS_MODEL_STATUS_MAX" = HighsModelStatusDUAL_INFEASIBLE
cdef enum HighsBasisStatus:
HighsBasisStatusLOWER "HighsBasisStatus::LOWER" = 0, # (slack) variable is at its lower bound [including fixed variables]
HighsBasisStatusBASIC "HighsBasisStatus::BASIC" # (slack) variable is basic
HighsBasisStatusUPPER "HighsBasisStatus::UPPER" # (slack) variable is at its upper bound
HighsBasisStatusZERO "HighsBasisStatus::ZERO" # free variable is non-basic and set to zero
HighsBasisStatusNONBASIC "HighsBasisStatus::NONBASIC" # nonbasic with no specific bound information - useful for users and postsolve
HighsBasisStatusSUPER "HighsBasisStatus::SUPER" # Super-basic variable: non-basic and either free and
# nonzero or not at a bound. No SCIP equivalent
cdef enum SolverOption:
SOLVER_OPTION_SIMPLEX "SolverOption::SOLVER_OPTION_SIMPLEX" = -1
SOLVER_OPTION_CHOOSE "SolverOption::SOLVER_OPTION_CHOOSE"
SOLVER_OPTION_IPM "SolverOption::SOLVER_OPTION_IPM"
cdef enum PrimalDualStatus:
PrimalDualStatusSTATUS_NOT_SET "PrimalDualStatus::STATUS_NOT_SET" = -1
PrimalDualStatusSTATUS_MIN "PrimalDualStatus::STATUS_MIN" = PrimalDualStatusSTATUS_NOT_SET
PrimalDualStatusSTATUS_NO_SOLUTION "PrimalDualStatus::STATUS_NO_SOLUTION"
PrimalDualStatusSTATUS_UNKNOWN "PrimalDualStatus::STATUS_UNKNOWN"
PrimalDualStatusSTATUS_INFEASIBLE_POINT "PrimalDualStatus::STATUS_INFEASIBLE_POINT"
PrimalDualStatusSTATUS_FEASIBLE_POINT "PrimalDualStatus::STATUS_FEASIBLE_POINT"
PrimalDualStatusSTATUS_MAX "PrimalDualStatus::STATUS_MAX" = PrimalDualStatusSTATUS_FEASIBLE_POINT
cdef enum HighsOptionType:
HighsOptionTypeBOOL "HighsOptionType::BOOL" = 0
HighsOptionTypeINT "HighsOptionType::INT"
HighsOptionTypeDOUBLE "HighsOptionType::DOUBLE"
HighsOptionTypeSTRING "HighsOptionType::STRING"

View File

@@ -0,0 +1,54 @@
# distutils: language=c++
# cython: language_level=3
from libc.stdio cimport FILE
from libcpp cimport bool
from libcpp.string cimport string
from .HighsStatus cimport HighsStatus
from .HighsOptions cimport HighsOptions
from .HighsInfo cimport HighsInfo
from .HighsLp cimport (
HighsLp,
HighsSolution,
HighsBasis,
ObjSense,
)
from .HConst cimport HighsModelStatus
cdef extern from "Highs.h":
# From HiGHS/src/Highs.h
cdef cppclass Highs:
HighsStatus passHighsOptions(const HighsOptions& options)
HighsStatus passModel(const HighsLp& lp)
HighsStatus run()
HighsStatus setHighsLogfile(FILE* logfile)
HighsStatus setHighsOutput(FILE* output)
HighsStatus writeHighsOptions(const string filename, const bool report_only_non_default_values = true)
# split up for cython below
#const HighsModelStatus& getModelStatus(const bool scaled_model = False) const
const HighsModelStatus & getModelStatus() const
const HighsModelStatus & getModelStatus(const bool scaled_model) const
const HighsInfo& getHighsInfo() const
string highsModelStatusToString(const HighsModelStatus model_status) const
#HighsStatus getHighsInfoValue(const string& info, int& value)
HighsStatus getHighsInfoValue(const string& info, double& value) const
const HighsOptions& getHighsOptions() const
HighsStatus writeSolution(const string filename, const bool pretty) const
HighsStatus setBasis()
const HighsSolution& getSolution() const
const HighsBasis& getBasis() const
bool changeObjectiveSense(const ObjSense sense)
HighsStatus setHighsOptionValueBool "setHighsOptionValue" (const string & option, const bool value)
HighsStatus setHighsOptionValueInt "setHighsOptionValue" (const string & option, const int value)
HighsStatus setHighsOptionValueStr "setHighsOptionValue" (const string & option, const string & value)
HighsStatus setHighsOptionValueDbl "setHighsOptionValue" (const string & option, const double value)
string primalDualStatusToString(const int primal_dual_status)

View File

@@ -0,0 +1,16 @@
# distutils: language=c++
# cython: language_level=3
from libc.stdio cimport FILE
cdef extern from "HighsIO.h" nogil:
void HighsPrintMessage(FILE* pass_output, const int level, const char* format, ...)
cdef enum HighsPrintMessageLevel:
ML_MIN = 0
ML_NONE = ML_MIN
ML_VERBOSE = 1
ML_DETAILED = 2
ML_MINIMAL = 4
ML_ALWAYS = ML_VERBOSE | ML_DETAILED | ML_MINIMAL
ML_MAX = ML_ALWAYS

View File

@@ -0,0 +1,19 @@
# distutils: language=c++
# cython: language_level=3
cdef extern from "HighsInfo.h" nogil:
# From HiGHS/src/lp_data/HighsInfo.h
cdef cppclass HighsInfo:
# Inherited from HighsInfoStruct:
int simplex_iteration_count
int ipm_iteration_count
int crossover_iteration_count
int primal_status
int dual_status
double objective_function_value
int num_primal_infeasibilities
double max_primal_infeasibility
double sum_primal_infeasibilities
int num_dual_infeasibilities
double max_dual_infeasibility
double sum_dual_infeasibilities

View File

@@ -0,0 +1,49 @@
# distutils: language=c++
# cython: language_level=3
from libcpp cimport bool
from libcpp.string cimport string
from libcpp.vector cimport vector
from .HConst cimport HighsBasisStatus
cdef extern from "HighsLp.h" nogil:
# From HiGHS/src/lp_data/HighsLp.h
cdef cppclass HighsLp:
int numCol_
int numRow_
vector[int] Astart_
vector[int] Aindex_
vector[double] Avalue_
vector[double] colCost_
vector[double] colLower_
vector[double] colUpper_
vector[double] rowLower_
vector[double] rowUpper_
ObjSense sense_
double offset_
string model_name_
string lp_name_
vector[string] row_names_
vector[string] col_names_
vector[int] integrality_
ctypedef enum ObjSense:
ObjSenseMINIMIZE "ObjSense::MINIMIZE" = 1
ObjSenseMAXIMIZE "ObjSense::MAXIMIZE" = -1
cdef cppclass HighsSolution:
vector[double] col_value
vector[double] col_dual
vector[double] row_value
vector[double] row_dual
cdef cppclass HighsBasis:
bool valid_
vector[HighsBasisStatus] col_status
vector[HighsBasisStatus] row_status

View File

@@ -0,0 +1,10 @@
# distutils: language=c++
# cython: language_level=3
from .HighsStatus cimport HighsStatus
from .HighsLp cimport HighsLp
from .HighsOptions cimport HighsOptions
cdef extern from "HighsLpUtils.h" nogil:
# From HiGHS/src/lp_data/HighsLpUtils.h
HighsStatus assessLp(HighsLp& lp, const HighsOptions& options)

View File

@@ -0,0 +1,11 @@
# distutils: language=c++
# cython: language_level=3
from libcpp.string cimport string
from .HConst cimport HighsModelStatus
cdef extern from "HighsModelUtils.h" nogil:
# From HiGHS/src/lp_data/HighsModelUtils.h
string utilHighsModelStatusToString(const HighsModelStatus model_status)
string utilPrimalDualStatusToString(const int primal_dual_status)

View File

@@ -0,0 +1,111 @@
# distutils: language=c++
# cython: language_level=3
from libc.stdio cimport FILE
from libcpp cimport bool
from libcpp.string cimport string
from libcpp.vector cimport vector
from .HConst cimport HighsOptionType
cdef extern from "HighsOptions.h" nogil:
cdef cppclass OptionRecord:
HighsOptionType type
string name
string description
bool advanced
cdef cppclass OptionRecordBool(OptionRecord):
bool* value
bool default_value
cdef cppclass OptionRecordInt(OptionRecord):
int* value
int lower_bound
int default_value
int upper_bound
cdef cppclass OptionRecordDouble(OptionRecord):
double* value
double lower_bound
double default_value
double upper_bound
cdef cppclass OptionRecordString(OptionRecord):
string* value
string default_value
cdef cppclass HighsOptions:
# From HighsOptionsStruct:
# Options read from the command line
string model_file
string presolve
string solver
string parallel
double time_limit
string options_file
# Options read from the file
double infinite_cost
double infinite_bound
double small_matrix_value
double large_matrix_value
double primal_feasibility_tolerance
double dual_feasibility_tolerance
double ipm_optimality_tolerance
double dual_objective_value_upper_bound
int highs_debug_level
int simplex_strategy
int simplex_scale_strategy
int simplex_crash_strategy
int simplex_dual_edge_weight_strategy
int simplex_primal_edge_weight_strategy
int simplex_iteration_limit
int simplex_update_limit
int ipm_iteration_limit
int highs_min_threads
int highs_max_threads
int message_level
string solution_file
bool write_solution_to_file
bool write_solution_pretty
# Advanced options
bool run_crossover
bool mps_parser_type_free
int keep_n_rows
int allowed_simplex_matrix_scale_factor
int allowed_simplex_cost_scale_factor
int simplex_dualise_strategy
int simplex_permute_strategy
int dual_simplex_cleanup_strategy
int simplex_price_strategy
int dual_chuzc_sort_strategy
bool simplex_initial_condition_check
double simplex_initial_condition_tolerance
double dual_steepest_edge_weight_log_error_threshhold
double dual_simplex_cost_perturbation_multiplier
double start_crossover_tolerance
bool less_infeasible_DSE_check
bool less_infeasible_DSE_choose_row
bool use_original_HFactor_logic
# Options for MIP solver
int mip_max_nodes
int mip_report_level
# Switch for MIP solver
bool mip
# Options for HighsPrintMessage and HighsLogMessage
FILE* logfile
FILE* output
int message_level
string solution_file
bool write_solution_to_file
bool write_solution_pretty
vector[OptionRecord*] records

View File

@@ -0,0 +1,10 @@
# distutils: language=c++
# cython: language_level=3
from libcpp cimport bool
from .HighsOptions cimport HighsOptions
cdef extern from "HighsRuntimeOptions.h" nogil:
# From HiGHS/src/lp_data/HighsRuntimeOptions.h
bool loadOptions(int argc, char** argv, HighsOptions& options)

View File

@@ -0,0 +1,12 @@
# distutils: language=c++
# cython: language_level=3
from libcpp.string cimport string
cdef extern from "HighsStatus.h" nogil:
ctypedef enum HighsStatus:
HighsStatusOK "HighsStatus::OK"
HighsStatusWarning "HighsStatus::Warning"
HighsStatusError "HighsStatus::Error"
string HighsStatusToString(HighsStatus status)

View File

@@ -0,0 +1,119 @@
# distutils: language=c++
# cython: language_level=3
from libcpp cimport bool
cdef extern from "SimplexConst.h" nogil:
cdef enum SimplexAlgorithm:
PRIMAL "SimplexAlgorithm::PRIMAL" = 0
DUAL "SimplexAlgorithm::DUAL"
cdef enum SimplexStrategy:
SIMPLEX_STRATEGY_MIN = 0
SIMPLEX_STRATEGY_CHOOSE = SIMPLEX_STRATEGY_MIN
SIMPLEX_STRATEGY_DUAL
SIMPLEX_STRATEGY_DUAL_PLAIN = SIMPLEX_STRATEGY_DUAL
SIMPLEX_STRATEGY_DUAL_TASKS
SIMPLEX_STRATEGY_DUAL_MULTI
SIMPLEX_STRATEGY_PRIMAL
SIMPLEX_STRATEGY_MAX = SIMPLEX_STRATEGY_PRIMAL
SIMPLEX_STRATEGY_NUM
cdef enum DualSimplexCleanupStrategy:
DUAL_SIMPLEX_CLEANUP_STRATEGY_MIN = 0
DUAL_SIMPLEX_CLEANUP_STRATEGY_NONE = DUAL_SIMPLEX_CLEANUP_STRATEGY_MIN
DUAL_SIMPLEX_CLEANUP_STRATEGY_HPRIMAL
DUAL_SIMPLEX_CLEANUP_STRATEGY_HQPRIMAL
DUAL_SIMPLEX_CLEANUP_STRATEGY_MAX = DUAL_SIMPLEX_CLEANUP_STRATEGY_HQPRIMAL
cdef enum SimplexScaleStrategy:
SIMPLEX_SCALE_STRATEGY_MIN = 0
SIMPLEX_SCALE_STRATEGY_OFF = SIMPLEX_SCALE_STRATEGY_MIN
SIMPLEX_SCALE_STRATEGY_HIGHS
SIMPLEX_SCALE_STRATEGY_HIGHS_FORCED
SIMPLEX_SCALE_STRATEGY_015
SIMPLEX_SCALE_STRATEGY_0157
SIMPLEX_SCALE_STRATEGY_MAX = SIMPLEX_SCALE_STRATEGY_0157
cdef enum SimplexCrashStrategy:
SIMPLEX_CRASH_STRATEGY_MIN = 0
SIMPLEX_CRASH_STRATEGY_OFF = SIMPLEX_CRASH_STRATEGY_MIN
SIMPLEX_CRASH_STRATEGY_LTSSF_K
SIMPLEX_CRASH_STRATEGY_LTSSF = SIMPLEX_CRASH_STRATEGY_LTSSF_K
SIMPLEX_CRASH_STRATEGY_BIXBY
SIMPLEX_CRASH_STRATEGY_LTSSF_PRI
SIMPLEX_CRASH_STRATEGY_LTSF_K
SIMPLEX_CRASH_STRATEGY_LTSF_PRI
SIMPLEX_CRASH_STRATEGY_LTSF
SIMPLEX_CRASH_STRATEGY_BIXBY_NO_NONZERO_COL_COSTS
SIMPLEX_CRASH_STRATEGY_BASIC
SIMPLEX_CRASH_STRATEGY_TEST_SING
SIMPLEX_CRASH_STRATEGY_MAX = SIMPLEX_CRASH_STRATEGY_TEST_SING
cdef enum SimplexDualEdgeWeightStrategy:
SIMPLEX_DUAL_EDGE_WEIGHT_STRATEGY_MIN = -1
SIMPLEX_DUAL_EDGE_WEIGHT_STRATEGY_CHOOSE = SIMPLEX_DUAL_EDGE_WEIGHT_STRATEGY_MIN
SIMPLEX_DUAL_EDGE_WEIGHT_STRATEGY_DANTZIG
SIMPLEX_DUAL_EDGE_WEIGHT_STRATEGY_DEVEX
SIMPLEX_DUAL_EDGE_WEIGHT_STRATEGY_STEEPEST_EDGE
SIMPLEX_DUAL_EDGE_WEIGHT_STRATEGY_STEEPEST_EDGE_UNIT_INITIAL
SIMPLEX_DUAL_EDGE_WEIGHT_STRATEGY_MAX = SIMPLEX_DUAL_EDGE_WEIGHT_STRATEGY_STEEPEST_EDGE_UNIT_INITIAL
cdef enum SimplexPrimalEdgeWeightStrategy:
SIMPLEX_PRIMAL_EDGE_WEIGHT_STRATEGY_MIN = -1
SIMPLEX_PRIMAL_EDGE_WEIGHT_STRATEGY_CHOOSE = SIMPLEX_PRIMAL_EDGE_WEIGHT_STRATEGY_MIN
SIMPLEX_PRIMAL_EDGE_WEIGHT_STRATEGY_DANTZIG
SIMPLEX_PRIMAL_EDGE_WEIGHT_STRATEGY_DEVEX
SIMPLEX_PRIMAL_EDGE_WEIGHT_STRATEGY_MAX = SIMPLEX_PRIMAL_EDGE_WEIGHT_STRATEGY_DEVEX
cdef enum SimplexPriceStrategy:
SIMPLEX_PRICE_STRATEGY_MIN = 0
SIMPLEX_PRICE_STRATEGY_COL = SIMPLEX_PRICE_STRATEGY_MIN
SIMPLEX_PRICE_STRATEGY_ROW
SIMPLEX_PRICE_STRATEGY_ROW_SWITCH
SIMPLEX_PRICE_STRATEGY_ROW_SWITCH_COL_SWITCH
SIMPLEX_PRICE_STRATEGY_MAX = SIMPLEX_PRICE_STRATEGY_ROW_SWITCH_COL_SWITCH
cdef enum SimplexDualChuzcStrategy:
SIMPLEX_DUAL_CHUZC_STRATEGY_MIN = 0
SIMPLEX_DUAL_CHUZC_STRATEGY_CHOOSE = SIMPLEX_DUAL_CHUZC_STRATEGY_MIN
SIMPLEX_DUAL_CHUZC_STRATEGY_QUAD
SIMPLEX_DUAL_CHUZC_STRATEGY_HEAP
SIMPLEX_DUAL_CHUZC_STRATEGY_BOTH
SIMPLEX_DUAL_CHUZC_STRATEGY_MAX = SIMPLEX_DUAL_CHUZC_STRATEGY_BOTH
cdef enum InvertHint:
INVERT_HINT_NO = 0
INVERT_HINT_UPDATE_LIMIT_REACHED
INVERT_HINT_SYNTHETIC_CLOCK_SAYS_INVERT
INVERT_HINT_POSSIBLY_OPTIMAL
INVERT_HINT_POSSIBLY_PRIMAL_UNBOUNDED
INVERT_HINT_POSSIBLY_DUAL_UNBOUNDED
INVERT_HINT_POSSIBLY_SINGULAR_BASIS
INVERT_HINT_PRIMAL_INFEASIBLE_IN_PRIMAL_SIMPLEX
INVERT_HINT_CHOOSE_COLUMN_FAIL
INVERT_HINT_Count
cdef enum DualEdgeWeightMode:
DANTZIG "DualEdgeWeightMode::DANTZIG" = 0
DEVEX "DualEdgeWeightMode::DEVEX"
STEEPEST_EDGE "DualEdgeWeightMode::STEEPEST_EDGE"
Count "DualEdgeWeightMode::Count"
cdef enum PriceMode:
ROW "PriceMode::ROW" = 0
COL "PriceMode::COL"
const int PARALLEL_THREADS_DEFAULT
const int DUAL_TASKS_MIN_THREADS
const int DUAL_MULTI_MIN_THREADS
const bool invert_if_row_out_negative
const int NONBASIC_FLAG_TRUE
const int NONBASIC_FLAG_FALSE
const int NONBASIC_MOVE_UP
const int NONBASIC_MOVE_DN
const int NONBASIC_MOVE_ZE

View File

@@ -0,0 +1,8 @@
# distutils: language=c++
# cython: language_level=3
cdef extern from "highs_c_api.h" nogil:
int Highs_passLp(void* highs, int numcol, int numrow, int numnz,
double* colcost, double* collower, double* colupper,
double* rowlower, double* rowupper,
int* astart, int* aindex, double* avalue)

View File

@@ -0,0 +1,159 @@
'''
setup.py for HiGHS scipy interface
Some CMake files are used to create source lists for compilation
'''
import pathlib
from datetime import datetime
import os
from os.path import join
def pre_build_hook(build_ext, ext):
from scipy._build_utils.compiler_helper import get_cxx_std_flag
std_flag = get_cxx_std_flag(build_ext._cxx_compiler)
if std_flag is not None:
ext.extra_compile_args.append(std_flag)
def basiclu_pre_build_hook(build_clib, build_info):
from scipy._build_utils.compiler_helper import get_c_std_flag
c_flag = get_c_std_flag(build_clib.compiler)
if c_flag is not None:
if 'extra_compiler_args' not in build_info:
build_info['extra_compiler_args'] = []
build_info['extra_compiler_args'].append(c_flag)
def _get_sources(CMakeLists, start_token, end_token):
# Read in sources from CMakeLists.txt
CMakeLists = pathlib.Path(__file__).parent / CMakeLists
with open(CMakeLists, 'r', encoding='utf-8') as f:
s = f.read()
# Find block where sources are listed
start_idx = s.find(start_token) + len(start_token)
end_idx = s[start_idx:].find(end_token) + len(s[:start_idx])
sources = s[start_idx:end_idx].split('\n')
sources = [s.strip() for s in sources if s[0] != '#']
# Make relative to setup.py
sources = [str(pathlib.Path('src/' + s)) for s in sources]
return sources
# Grab some more info about HiGHS from root CMakeLists
def _get_version(CMakeLists, start_token, end_token=')'):
CMakeLists = pathlib.Path(__file__).parent / CMakeLists
with open(CMakeLists, 'r', encoding='utf-8') as f:
s = f.read()
start_idx = s.find(start_token) + len(start_token) + 1
end_idx = s[start_idx:].find(end_token) + len(s[:start_idx])
return s[start_idx:end_idx].strip()
def configuration(parent_package='', top_path=None):
from numpy.distutils.misc_util import Configuration
config = Configuration('_highs', parent_package, top_path)
# HiGHS info
_major_dot_minor = _get_version(
'CMakeLists.txt', 'project(HIGHS VERSION', 'LANGUAGES CXX C')
HIGHS_VERSION_MAJOR, HIGHS_VERSION_MINOR = _major_dot_minor.split('.')
HIGHS_VERSION_PATCH = _get_version(
'CMakeLists.txt', 'HIGHS_VERSION_PATCH')
GITHASH = 'n/a'
HIGHS_DIR = str(pathlib.Path(__file__).parent.resolve())
# Here are the pound defines that HConfig.h would usually provide;
# We provide an empty HConfig.h file and do the defs and undefs
# here:
TODAY_DATE = datetime.today().strftime('%Y-%m-%d')
DEFINE_MACROS = [
('CMAKE_BUILD_TYPE', '"Release"'),
('HiGHSRELEASE', None),
('IPX_ON', 'ON'),
('HIGHS_GITHASH', '"%s"' % GITHASH),
('HIGHS_COMPILATION_DATE', '"' + TODAY_DATE + '"'),
('HIGHS_VERSION_MAJOR', HIGHS_VERSION_MAJOR),
('HIGHS_VERSION_MINOR', HIGHS_VERSION_MINOR),
('HIGHS_VERSION_PATCH', HIGHS_VERSION_PATCH),
('HIGHS_DIR', '"' + HIGHS_DIR + '"'),
# ('NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION', None),
]
UNDEF_MACROS = [
'OPENMP', # unconditionally disable openmp
'EXT_PRESOLVE',
'SCIP_DEV',
'HiGHSDEV',
'OSI_FOUND',
]
# Compile BASICLU as a static library to appease clang:
# (won't allow -std=c++11/14 option for C sources)
basiclu_sources = _get_sources('src/CMakeLists.txt',
'set(basiclu_sources\n', ')')
config.add_library(
'basiclu',
sources=basiclu_sources,
include_dirs=[
'src',
join('src', 'ipm', 'basiclu', 'include'),
],
language='c',
macros=DEFINE_MACROS,
_pre_build_hook=basiclu_pre_build_hook,
)
# highs_wrapper:
ipx_sources = _get_sources('src/CMakeLists.txt', 'set(ipx_sources\n', ')')
highs_sources = _get_sources('src/CMakeLists.txt', 'set(sources\n', ')')
# filter out MIP sources until MIP is officially supported
highs_sources = [s for s in highs_sources
if pathlib.Path(s).parent.name != 'mip']
ext = config.add_extension(
'_highs_wrapper',
sources=([join('cython', 'src', '_highs_wrapper.cxx')] +
highs_sources + ipx_sources),
include_dirs=[
# highs_wrapper
'src',
join('cython', 'src'),
join('src', 'lp_data'),
# highs
join('src', 'io'),
join('src', 'ipm', 'ipx', 'include'),
# IPX
join('src', 'ipm', 'ipx', 'include'),
join('src', 'ipm', 'basiclu', 'include'),
],
language='c++',
libraries=['basiclu'],
define_macros=DEFINE_MACROS,
undef_macros=UNDEF_MACROS,
)
# Add c++11/14 support:
ext._pre_build_hook = pre_build_hook
# Export constants and enums from HiGHS:
ext = config.add_extension(
'_highs_constants',
sources=[join('cython', 'src', '_highs_constants.cxx')],
include_dirs=[
'src',
join('cython', 'src'),
join('src', 'io'),
join('src', 'lp_data'),
join('src', 'simplex'),
],
language='c++',
)
ext._pre_build_hook = pre_build_hook
config.add_data_files(os.path.join('cython', 'src', '*.pxd'))
return config
if __name__ == '__main__':
from numpy.distutils.core import setup
setup(**configuration(top_path='').todict())

View File

@@ -0,0 +1,497 @@
"""
Functions
---------
.. autosummary::
:toctree: generated/
fmin_l_bfgs_b
"""
## License for the Python wrapper
## ==============================
## Copyright (c) 2004 David M. Cooke <cookedm@physics.mcmaster.ca>
## Permission is hereby granted, free of charge, to any person obtaining a
## copy of this software and associated documentation files (the "Software"),
## to deal in the Software without restriction, including without limitation
## the rights to use, copy, modify, merge, publish, distribute, sublicense,
## and/or sell copies of the Software, and to permit persons to whom the
## Software is furnished to do so, subject to the following conditions:
## The above copyright notice and this permission notice shall be included in
## all copies or substantial portions of the Software.
## THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
## IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
## FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
## AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
## LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
## FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
## DEALINGS IN THE SOFTWARE.
## Modifications by Travis Oliphant and Enthought, Inc. for inclusion in SciPy
import numpy as np
from numpy import array, asarray, float64, zeros
from . import _lbfgsb
from ._optimize import (MemoizeJac, OptimizeResult,
_check_unknown_options, _prepare_scalar_function)
from ._constraints import old_bound_to_new
from scipy.sparse.linalg import LinearOperator
__all__ = ['fmin_l_bfgs_b', 'LbfgsInvHessProduct']
def fmin_l_bfgs_b(func, x0, fprime=None, args=(),
approx_grad=0,
bounds=None, m=10, factr=1e7, pgtol=1e-5,
epsilon=1e-8,
iprint=-1, maxfun=15000, maxiter=15000, disp=None,
callback=None, maxls=20):
"""
Minimize a function func using the L-BFGS-B algorithm.
Parameters
----------
func : callable f(x,*args)
Function to minimize.
x0 : ndarray
Initial guess.
fprime : callable fprime(x,*args), optional
The gradient of `func`. If None, then `func` returns the function
value and the gradient (``f, g = func(x, *args)``), unless
`approx_grad` is True in which case `func` returns only ``f``.
args : sequence, optional
Arguments to pass to `func` and `fprime`.
approx_grad : bool, optional
Whether to approximate the gradient numerically (in which case
`func` returns only the function value).
bounds : list, optional
``(min, max)`` pairs for each element in ``x``, defining
the bounds on that parameter. Use None or +-inf for one of ``min`` or
``max`` when there is no bound in that direction.
m : int, optional
The maximum number of variable metric corrections
used to define the limited memory matrix. (The limited memory BFGS
method does not store the full hessian but uses this many terms in an
approximation to it.)
factr : float, optional
The iteration stops when
``(f^k - f^{k+1})/max{|f^k|,|f^{k+1}|,1} <= factr * eps``,
where ``eps`` is the machine precision, which is automatically
generated by the code. Typical values for `factr` are: 1e12 for
low accuracy; 1e7 for moderate accuracy; 10.0 for extremely
high accuracy. See Notes for relationship to `ftol`, which is exposed
(instead of `factr`) by the `scipy.optimize.minimize` interface to
L-BFGS-B.
pgtol : float, optional
The iteration will stop when
``max{|proj g_i | i = 1, ..., n} <= pgtol``
where ``pg_i`` is the i-th component of the projected gradient.
epsilon : float, optional
Step size used when `approx_grad` is True, for numerically
calculating the gradient
iprint : int, optional
Controls the frequency of output. ``iprint < 0`` means no output;
``iprint = 0`` print only one line at the last iteration;
``0 < iprint < 99`` print also f and ``|proj g|`` every iprint iterations;
``iprint = 99`` print details of every iteration except n-vectors;
``iprint = 100`` print also the changes of active set and final x;
``iprint > 100`` print details of every iteration including x and g.
disp : int, optional
If zero, then no output. If a positive number, then this over-rides
`iprint` (i.e., `iprint` gets the value of `disp`).
maxfun : int, optional
Maximum number of function evaluations. Note that this function
may violate the limit because of evaluating gradients by numerical
differentiation.
maxiter : int, optional
Maximum number of iterations.
callback : callable, optional
Called after each iteration, as ``callback(xk)``, where ``xk`` is the
current parameter vector.
maxls : int, optional
Maximum number of line search steps (per iteration). Default is 20.
Returns
-------
x : array_like
Estimated position of the minimum.
f : float
Value of `func` at the minimum.
d : dict
Information dictionary.
* d['warnflag'] is
- 0 if converged,
- 1 if too many function evaluations or too many iterations,
- 2 if stopped for another reason, given in d['task']
* d['grad'] is the gradient at the minimum (should be 0 ish)
* d['funcalls'] is the number of function calls made.
* d['nit'] is the number of iterations.
See also
--------
minimize: Interface to minimization algorithms for multivariate
functions. See the 'L-BFGS-B' `method` in particular. Note that the
`ftol` option is made available via that interface, while `factr` is
provided via this interface, where `factr` is the factor multiplying
the default machine floating-point precision to arrive at `ftol`:
``ftol = factr * numpy.finfo(float).eps``.
Notes
-----
License of L-BFGS-B (FORTRAN code):
The version included here (in fortran code) is 3.0
(released April 25, 2011). It was written by Ciyou Zhu, Richard Byrd,
and Jorge Nocedal <nocedal@ece.nwu.edu>. It carries the following
condition for use:
This software is freely available, but we expect that all publications
describing work using this software, or all commercial products using it,
quote at least one of the references given below. This software is released
under the BSD License.
References
----------
* R. H. Byrd, P. Lu and J. Nocedal. A Limited Memory Algorithm for Bound
Constrained Optimization, (1995), SIAM Journal on Scientific and
Statistical Computing, 16, 5, pp. 1190-1208.
* C. Zhu, R. H. Byrd and J. Nocedal. L-BFGS-B: Algorithm 778: L-BFGS-B,
FORTRAN routines for large scale bound constrained optimization (1997),
ACM Transactions on Mathematical Software, 23, 4, pp. 550 - 560.
* J.L. Morales and J. Nocedal. L-BFGS-B: Remark on Algorithm 778: L-BFGS-B,
FORTRAN routines for large scale bound constrained optimization (2011),
ACM Transactions on Mathematical Software, 38, 1.
"""
# handle fprime/approx_grad
if approx_grad:
fun = func
jac = None
elif fprime is None:
fun = MemoizeJac(func)
jac = fun.derivative
else:
fun = func
jac = fprime
# build options
if disp is None:
disp = iprint
opts = {'disp': disp,
'iprint': iprint,
'maxcor': m,
'ftol': factr * np.finfo(float).eps,
'gtol': pgtol,
'eps': epsilon,
'maxfun': maxfun,
'maxiter': maxiter,
'callback': callback,
'maxls': maxls}
res = _minimize_lbfgsb(fun, x0, args=args, jac=jac, bounds=bounds,
**opts)
d = {'grad': res['jac'],
'task': res['message'],
'funcalls': res['nfev'],
'nit': res['nit'],
'warnflag': res['status']}
f = res['fun']
x = res['x']
return x, f, d
def _minimize_lbfgsb(fun, x0, args=(), jac=None, bounds=None,
disp=None, maxcor=10, ftol=2.2204460492503131e-09,
gtol=1e-5, eps=1e-8, maxfun=15000, maxiter=15000,
iprint=-1, callback=None, maxls=20,
finite_diff_rel_step=None, **unknown_options):
"""
Minimize a scalar function of one or more variables using the L-BFGS-B
algorithm.
Options
-------
disp : None or int
If `disp is None` (the default), then the supplied version of `iprint`
is used. If `disp is not None`, then it overrides the supplied version
of `iprint` with the behaviour you outlined.
maxcor : int
The maximum number of variable metric corrections used to
define the limited memory matrix. (The limited memory BFGS
method does not store the full hessian but uses this many terms
in an approximation to it.)
ftol : float
The iteration stops when ``(f^k -
f^{k+1})/max{|f^k|,|f^{k+1}|,1} <= ftol``.
gtol : float
The iteration will stop when ``max{|proj g_i | i = 1, ..., n}
<= gtol`` where ``pg_i`` is the i-th component of the
projected gradient.
eps : float or ndarray
If `jac is None` the absolute step size used for numerical
approximation of the jacobian via forward differences.
maxfun : int
Maximum number of function evaluations.
maxiter : int
Maximum number of iterations.
iprint : int, optional
Controls the frequency of output. ``iprint < 0`` means no output;
``iprint = 0`` print only one line at the last iteration;
``0 < iprint < 99`` print also f and ``|proj g|`` every iprint iterations;
``iprint = 99`` print details of every iteration except n-vectors;
``iprint = 100`` print also the changes of active set and final x;
``iprint > 100`` print details of every iteration including x and g.
callback : callable, optional
Called after each iteration, as ``callback(xk)``, where ``xk`` is the
current parameter vector.
maxls : int, optional
Maximum number of line search steps (per iteration). Default is 20.
finite_diff_rel_step : None or array_like, optional
If `jac in ['2-point', '3-point', 'cs']` the relative step size to
use for numerical approximation of the jacobian. The absolute step
size is computed as ``h = rel_step * sign(x0) * max(1, abs(x0))``,
possibly adjusted to fit into the bounds. For ``method='3-point'``
the sign of `h` is ignored. If None (default) then step is selected
automatically.
Notes
-----
The option `ftol` is exposed via the `scipy.optimize.minimize` interface,
but calling `scipy.optimize.fmin_l_bfgs_b` directly exposes `factr`. The
relationship between the two is ``ftol = factr * numpy.finfo(float).eps``.
I.e., `factr` multiplies the default machine floating-point precision to
arrive at `ftol`.
"""
_check_unknown_options(unknown_options)
m = maxcor
pgtol = gtol
factr = ftol / np.finfo(float).eps
x0 = asarray(x0).ravel()
n, = x0.shape
if bounds is None:
bounds = [(None, None)] * n
if len(bounds) != n:
raise ValueError('length of x0 != length of bounds')
# unbounded variables must use None, not +-inf, for optimizer to work properly
bounds = [(None if l == -np.inf else l, None if u == np.inf else u) for l, u in bounds]
# LBFGSB is sent 'old-style' bounds, 'new-style' bounds are required by
# approx_derivative and ScalarFunction
new_bounds = old_bound_to_new(bounds)
# check bounds
if (new_bounds[0] > new_bounds[1]).any():
raise ValueError("LBFGSB - one of the lower bounds is greater than an upper bound.")
# initial vector must lie within the bounds. Otherwise ScalarFunction and
# approx_derivative will cause problems
x0 = np.clip(x0, new_bounds[0], new_bounds[1])
if disp is not None:
if disp == 0:
iprint = -1
else:
iprint = disp
sf = _prepare_scalar_function(fun, x0, jac=jac, args=args, epsilon=eps,
bounds=new_bounds,
finite_diff_rel_step=finite_diff_rel_step)
func_and_grad = sf.fun_and_grad
fortran_int = _lbfgsb.types.intvar.dtype
nbd = zeros(n, fortran_int)
low_bnd = zeros(n, float64)
upper_bnd = zeros(n, float64)
bounds_map = {(None, None): 0,
(1, None): 1,
(1, 1): 2,
(None, 1): 3}
for i in range(0, n):
l, u = bounds[i]
if l is not None:
low_bnd[i] = l
l = 1
if u is not None:
upper_bnd[i] = u
u = 1
nbd[i] = bounds_map[l, u]
if not maxls > 0:
raise ValueError('maxls must be positive.')
x = array(x0, float64)
f = array(0.0, float64)
g = zeros((n,), float64)
wa = zeros(2*m*n + 5*n + 11*m*m + 8*m, float64)
iwa = zeros(3*n, fortran_int)
task = zeros(1, 'S60')
csave = zeros(1, 'S60')
lsave = zeros(4, fortran_int)
isave = zeros(44, fortran_int)
dsave = zeros(29, float64)
task[:] = 'START'
n_iterations = 0
while 1:
# x, f, g, wa, iwa, task, csave, lsave, isave, dsave = \
_lbfgsb.setulb(m, x, low_bnd, upper_bnd, nbd, f, g, factr,
pgtol, wa, iwa, task, iprint, csave, lsave,
isave, dsave, maxls)
task_str = task.tobytes()
if task_str.startswith(b'FG'):
# The minimization routine wants f and g at the current x.
# Note that interruptions due to maxfun are postponed
# until the completion of the current minimization iteration.
# Overwrite f and g:
f, g = func_and_grad(x)
elif task_str.startswith(b'NEW_X'):
# new iteration
n_iterations += 1
if callback is not None:
callback(np.copy(x))
if n_iterations >= maxiter:
task[:] = 'STOP: TOTAL NO. of ITERATIONS REACHED LIMIT'
elif sf.nfev > maxfun:
task[:] = ('STOP: TOTAL NO. of f AND g EVALUATIONS '
'EXCEEDS LIMIT')
else:
break
task_str = task.tobytes().strip(b'\x00').strip()
if task_str.startswith(b'CONV'):
warnflag = 0
elif sf.nfev > maxfun or n_iterations >= maxiter:
warnflag = 1
else:
warnflag = 2
# These two portions of the workspace are described in the mainlb
# subroutine in lbfgsb.f. See line 363.
s = wa[0: m*n].reshape(m, n)
y = wa[m*n: 2*m*n].reshape(m, n)
# See lbfgsb.f line 160 for this portion of the workspace.
# isave(31) = the total number of BFGS updates prior the current iteration;
n_bfgs_updates = isave[30]
n_corrs = min(n_bfgs_updates, maxcor)
hess_inv = LbfgsInvHessProduct(s[:n_corrs], y[:n_corrs])
task_str = task_str.decode()
return OptimizeResult(fun=f, jac=g, nfev=sf.nfev,
njev=sf.ngev,
nit=n_iterations, status=warnflag, message=task_str,
x=x, success=(warnflag == 0), hess_inv=hess_inv)
class LbfgsInvHessProduct(LinearOperator):
"""Linear operator for the L-BFGS approximate inverse Hessian.
This operator computes the product of a vector with the approximate inverse
of the Hessian of the objective function, using the L-BFGS limited
memory approximation to the inverse Hessian, accumulated during the
optimization.
Objects of this class implement the ``scipy.sparse.linalg.LinearOperator``
interface.
Parameters
----------
sk : array_like, shape=(n_corr, n)
Array of `n_corr` most recent updates to the solution vector.
(See [1]).
yk : array_like, shape=(n_corr, n)
Array of `n_corr` most recent updates to the gradient. (See [1]).
References
----------
.. [1] Nocedal, Jorge. "Updating quasi-Newton matrices with limited
storage." Mathematics of computation 35.151 (1980): 773-782.
"""
def __init__(self, sk, yk):
"""Construct the operator."""
if sk.shape != yk.shape or sk.ndim != 2:
raise ValueError('sk and yk must have matching shape, (n_corrs, n)')
n_corrs, n = sk.shape
super().__init__(dtype=np.float64, shape=(n, n))
self.sk = sk
self.yk = yk
self.n_corrs = n_corrs
self.rho = 1 / np.einsum('ij,ij->i', sk, yk)
def _matvec(self, x):
"""Efficient matrix-vector multiply with the BFGS matrices.
This calculation is described in Section (4) of [1].
Parameters
----------
x : ndarray
An array with shape (n,) or (n,1).
Returns
-------
y : ndarray
The matrix-vector product
"""
s, y, n_corrs, rho = self.sk, self.yk, self.n_corrs, self.rho
q = np.array(x, dtype=self.dtype, copy=True)
if q.ndim == 2 and q.shape[1] == 1:
q = q.reshape(-1)
alpha = np.empty(n_corrs)
for i in range(n_corrs-1, -1, -1):
alpha[i] = rho[i] * np.dot(s[i], q)
q = q - alpha[i]*y[i]
r = q
for i in range(n_corrs):
beta = rho[i] * np.dot(y[i], r)
r = r + s[i] * (alpha[i] - beta)
return r
def todense(self):
"""Return a dense array representation of this operator.
Returns
-------
arr : ndarray, shape=(n, n)
An array with the same shape and containing
the same data represented by this `LinearOperator`.
"""
s, y, n_corrs, rho = self.sk, self.yk, self.n_corrs, self.rho
I = np.eye(*self.shape, dtype=self.dtype)
Hk = I
for i in range(n_corrs):
A1 = I - s[i][:, np.newaxis] * y[i][np.newaxis, :] * rho[i]
A2 = I - y[i][:, np.newaxis] * s[i][np.newaxis, :] * rho[i]
Hk = np.dot(A1, np.dot(Hk, A2)) + (rho[i] * s[i][:, np.newaxis] *
s[i][np.newaxis, :])
return Hk

View File

@@ -0,0 +1,880 @@
"""
Functions
---------
.. autosummary::
:toctree: generated/
line_search_armijo
line_search_wolfe1
line_search_wolfe2
scalar_search_wolfe1
scalar_search_wolfe2
"""
from warnings import warn
from scipy.optimize import _minpack2 as minpack2
import numpy as np
__all__ = ['LineSearchWarning', 'line_search_wolfe1', 'line_search_wolfe2',
'scalar_search_wolfe1', 'scalar_search_wolfe2',
'line_search_armijo']
class LineSearchWarning(RuntimeWarning):
pass
#------------------------------------------------------------------------------
# Minpack's Wolfe line and scalar searches
#------------------------------------------------------------------------------
def line_search_wolfe1(f, fprime, xk, pk, gfk=None,
old_fval=None, old_old_fval=None,
args=(), c1=1e-4, c2=0.9, amax=50, amin=1e-8,
xtol=1e-14):
"""
As `scalar_search_wolfe1` but do a line search to direction `pk`
Parameters
----------
f : callable
Function `f(x)`
fprime : callable
Gradient of `f`
xk : array_like
Current point
pk : array_like
Search direction
gfk : array_like, optional
Gradient of `f` at point `xk`
old_fval : float, optional
Value of `f` at point `xk`
old_old_fval : float, optional
Value of `f` at point preceding `xk`
The rest of the parameters are the same as for `scalar_search_wolfe1`.
Returns
-------
stp, f_count, g_count, fval, old_fval
As in `line_search_wolfe1`
gval : array
Gradient of `f` at the final point
"""
if gfk is None:
gfk = fprime(xk, *args)
gval = [gfk]
gc = [0]
fc = [0]
def phi(s):
fc[0] += 1
return f(xk + s*pk, *args)
def derphi(s):
gval[0] = fprime(xk + s*pk, *args)
gc[0] += 1
return np.dot(gval[0], pk)
derphi0 = np.dot(gfk, pk)
stp, fval, old_fval = scalar_search_wolfe1(
phi, derphi, old_fval, old_old_fval, derphi0,
c1=c1, c2=c2, amax=amax, amin=amin, xtol=xtol)
return stp, fc[0], gc[0], fval, old_fval, gval[0]
def scalar_search_wolfe1(phi, derphi, phi0=None, old_phi0=None, derphi0=None,
c1=1e-4, c2=0.9,
amax=50, amin=1e-8, xtol=1e-14):
"""
Scalar function search for alpha that satisfies strong Wolfe conditions
alpha > 0 is assumed to be a descent direction.
Parameters
----------
phi : callable phi(alpha)
Function at point `alpha`
derphi : callable phi'(alpha)
Objective function derivative. Returns a scalar.
phi0 : float, optional
Value of phi at 0
old_phi0 : float, optional
Value of phi at previous point
derphi0 : float, optional
Value derphi at 0
c1 : float, optional
Parameter for Armijo condition rule.
c2 : float, optional
Parameter for curvature condition rule.
amax, amin : float, optional
Maximum and minimum step size
xtol : float, optional
Relative tolerance for an acceptable step.
Returns
-------
alpha : float
Step size, or None if no suitable step was found
phi : float
Value of `phi` at the new point `alpha`
phi0 : float
Value of `phi` at `alpha=0`
Notes
-----
Uses routine DCSRCH from MINPACK.
"""
if phi0 is None:
phi0 = phi(0.)
if derphi0 is None:
derphi0 = derphi(0.)
if old_phi0 is not None and derphi0 != 0:
alpha1 = min(1.0, 1.01*2*(phi0 - old_phi0)/derphi0)
if alpha1 < 0:
alpha1 = 1.0
else:
alpha1 = 1.0
phi1 = phi0
derphi1 = derphi0
isave = np.zeros((2,), np.intc)
dsave = np.zeros((13,), float)
task = b'START'
maxiter = 100
for i in range(maxiter):
stp, phi1, derphi1, task = minpack2.dcsrch(alpha1, phi1, derphi1,
c1, c2, xtol, task,
amin, amax, isave, dsave)
if task[:2] == b'FG':
alpha1 = stp
phi1 = phi(stp)
derphi1 = derphi(stp)
else:
break
else:
# maxiter reached, the line search did not converge
stp = None
if task[:5] == b'ERROR' or task[:4] == b'WARN':
stp = None # failed
return stp, phi1, phi0
line_search = line_search_wolfe1
#------------------------------------------------------------------------------
# Pure-Python Wolfe line and scalar searches
#------------------------------------------------------------------------------
def line_search_wolfe2(f, myfprime, xk, pk, gfk=None, old_fval=None,
old_old_fval=None, args=(), c1=1e-4, c2=0.9, amax=None,
extra_condition=None, maxiter=10):
"""Find alpha that satisfies strong Wolfe conditions.
Parameters
----------
f : callable f(x,*args)
Objective function.
myfprime : callable f'(x,*args)
Objective function gradient.
xk : ndarray
Starting point.
pk : ndarray
Search direction.
gfk : ndarray, optional
Gradient value for x=xk (xk being the current parameter
estimate). Will be recomputed if omitted.
old_fval : float, optional
Function value for x=xk. Will be recomputed if omitted.
old_old_fval : float, optional
Function value for the point preceding x=xk.
args : tuple, optional
Additional arguments passed to objective function.
c1 : float, optional
Parameter for Armijo condition rule.
c2 : float, optional
Parameter for curvature condition rule.
amax : float, optional
Maximum step size
extra_condition : callable, optional
A callable of the form ``extra_condition(alpha, x, f, g)``
returning a boolean. Arguments are the proposed step ``alpha``
and the corresponding ``x``, ``f`` and ``g`` values. The line search
accepts the value of ``alpha`` only if this
callable returns ``True``. If the callable returns ``False``
for the step length, the algorithm will continue with
new iterates. The callable is only called for iterates
satisfying the strong Wolfe conditions.
maxiter : int, optional
Maximum number of iterations to perform.
Returns
-------
alpha : float or None
Alpha for which ``x_new = x0 + alpha * pk``,
or None if the line search algorithm did not converge.
fc : int
Number of function evaluations made.
gc : int
Number of gradient evaluations made.
new_fval : float or None
New function value ``f(x_new)=f(x0+alpha*pk)``,
or None if the line search algorithm did not converge.
old_fval : float
Old function value ``f(x0)``.
new_slope : float or None
The local slope along the search direction at the
new value ``<myfprime(x_new), pk>``,
or None if the line search algorithm did not converge.
Notes
-----
Uses the line search algorithm to enforce strong Wolfe
conditions. See Wright and Nocedal, 'Numerical Optimization',
1999, pp. 59-61.
Examples
--------
>>> from scipy.optimize import line_search
A objective function and its gradient are defined.
>>> def obj_func(x):
... return (x[0])**2+(x[1])**2
>>> def obj_grad(x):
... return [2*x[0], 2*x[1]]
We can find alpha that satisfies strong Wolfe conditions.
>>> start_point = np.array([1.8, 1.7])
>>> search_gradient = np.array([-1.0, -1.0])
>>> line_search(obj_func, obj_grad, start_point, search_gradient)
(1.0, 2, 1, 1.1300000000000001, 6.13, [1.6, 1.4])
"""
fc = [0]
gc = [0]
gval = [None]
gval_alpha = [None]
def phi(alpha):
fc[0] += 1
return f(xk + alpha * pk, *args)
fprime = myfprime
def derphi(alpha):
gc[0] += 1
gval[0] = fprime(xk + alpha * pk, *args) # store for later use
gval_alpha[0] = alpha
return np.dot(gval[0], pk)
if gfk is None:
gfk = fprime(xk, *args)
derphi0 = np.dot(gfk, pk)
if extra_condition is not None:
# Add the current gradient as argument, to avoid needless
# re-evaluation
def extra_condition2(alpha, phi):
if gval_alpha[0] != alpha:
derphi(alpha)
x = xk + alpha * pk
return extra_condition(alpha, x, phi, gval[0])
else:
extra_condition2 = None
alpha_star, phi_star, old_fval, derphi_star = scalar_search_wolfe2(
phi, derphi, old_fval, old_old_fval, derphi0, c1, c2, amax,
extra_condition2, maxiter=maxiter)
if derphi_star is None:
warn('The line search algorithm did not converge', LineSearchWarning)
else:
# derphi_star is a number (derphi) -- so use the most recently
# calculated gradient used in computing it derphi = gfk*pk
# this is the gradient at the next step no need to compute it
# again in the outer loop.
derphi_star = gval[0]
return alpha_star, fc[0], gc[0], phi_star, old_fval, derphi_star
def scalar_search_wolfe2(phi, derphi, phi0=None,
old_phi0=None, derphi0=None,
c1=1e-4, c2=0.9, amax=None,
extra_condition=None, maxiter=10):
"""Find alpha that satisfies strong Wolfe conditions.
alpha > 0 is assumed to be a descent direction.
Parameters
----------
phi : callable phi(alpha)
Objective scalar function.
derphi : callable phi'(alpha)
Objective function derivative. Returns a scalar.
phi0 : float, optional
Value of phi at 0.
old_phi0 : float, optional
Value of phi at previous point.
derphi0 : float, optional
Value of derphi at 0
c1 : float, optional
Parameter for Armijo condition rule.
c2 : float, optional
Parameter for curvature condition rule.
amax : float, optional
Maximum step size.
extra_condition : callable, optional
A callable of the form ``extra_condition(alpha, phi_value)``
returning a boolean. The line search accepts the value
of ``alpha`` only if this callable returns ``True``.
If the callable returns ``False`` for the step length,
the algorithm will continue with new iterates.
The callable is only called for iterates satisfying
the strong Wolfe conditions.
maxiter : int, optional
Maximum number of iterations to perform.
Returns
-------
alpha_star : float or None
Best alpha, or None if the line search algorithm did not converge.
phi_star : float
phi at alpha_star.
phi0 : float
phi at 0.
derphi_star : float or None
derphi at alpha_star, or None if the line search algorithm
did not converge.
Notes
-----
Uses the line search algorithm to enforce strong Wolfe
conditions. See Wright and Nocedal, 'Numerical Optimization',
1999, pp. 59-61.
"""
if phi0 is None:
phi0 = phi(0.)
if derphi0 is None:
derphi0 = derphi(0.)
alpha0 = 0
if old_phi0 is not None and derphi0 != 0:
alpha1 = min(1.0, 1.01*2*(phi0 - old_phi0)/derphi0)
else:
alpha1 = 1.0
if alpha1 < 0:
alpha1 = 1.0
if amax is not None:
alpha1 = min(alpha1, amax)
phi_a1 = phi(alpha1)
#derphi_a1 = derphi(alpha1) evaluated below
phi_a0 = phi0
derphi_a0 = derphi0
if extra_condition is None:
extra_condition = lambda alpha, phi: True
for i in range(maxiter):
if alpha1 == 0 or (amax is not None and alpha0 == amax):
# alpha1 == 0: This shouldn't happen. Perhaps the increment has
# slipped below machine precision?
alpha_star = None
phi_star = phi0
phi0 = old_phi0
derphi_star = None
if alpha1 == 0:
msg = 'Rounding errors prevent the line search from converging'
else:
msg = "The line search algorithm could not find a solution " + \
"less than or equal to amax: %s" % amax
warn(msg, LineSearchWarning)
break
not_first_iteration = i > 0
if (phi_a1 > phi0 + c1 * alpha1 * derphi0) or \
((phi_a1 >= phi_a0) and not_first_iteration):
alpha_star, phi_star, derphi_star = \
_zoom(alpha0, alpha1, phi_a0,
phi_a1, derphi_a0, phi, derphi,
phi0, derphi0, c1, c2, extra_condition)
break
derphi_a1 = derphi(alpha1)
if (abs(derphi_a1) <= -c2*derphi0):
if extra_condition(alpha1, phi_a1):
alpha_star = alpha1
phi_star = phi_a1
derphi_star = derphi_a1
break
if (derphi_a1 >= 0):
alpha_star, phi_star, derphi_star = \
_zoom(alpha1, alpha0, phi_a1,
phi_a0, derphi_a1, phi, derphi,
phi0, derphi0, c1, c2, extra_condition)
break
alpha2 = 2 * alpha1 # increase by factor of two on each iteration
if amax is not None:
alpha2 = min(alpha2, amax)
alpha0 = alpha1
alpha1 = alpha2
phi_a0 = phi_a1
phi_a1 = phi(alpha1)
derphi_a0 = derphi_a1
else:
# stopping test maxiter reached
alpha_star = alpha1
phi_star = phi_a1
derphi_star = None
warn('The line search algorithm did not converge', LineSearchWarning)
return alpha_star, phi_star, phi0, derphi_star
def _cubicmin(a, fa, fpa, b, fb, c, fc):
"""
Finds the minimizer for a cubic polynomial that goes through the
points (a,fa), (b,fb), and (c,fc) with derivative at a of fpa.
If no minimizer can be found, return None.
"""
# f(x) = A *(x-a)^3 + B*(x-a)^2 + C*(x-a) + D
with np.errstate(divide='raise', over='raise', invalid='raise'):
try:
C = fpa
db = b - a
dc = c - a
denom = (db * dc) ** 2 * (db - dc)
d1 = np.empty((2, 2))
d1[0, 0] = dc ** 2
d1[0, 1] = -db ** 2
d1[1, 0] = -dc ** 3
d1[1, 1] = db ** 3
[A, B] = np.dot(d1, np.asarray([fb - fa - C * db,
fc - fa - C * dc]).flatten())
A /= denom
B /= denom
radical = B * B - 3 * A * C
xmin = a + (-B + np.sqrt(radical)) / (3 * A)
except ArithmeticError:
return None
if not np.isfinite(xmin):
return None
return xmin
def _quadmin(a, fa, fpa, b, fb):
"""
Finds the minimizer for a quadratic polynomial that goes through
the points (a,fa), (b,fb) with derivative at a of fpa.
"""
# f(x) = B*(x-a)^2 + C*(x-a) + D
with np.errstate(divide='raise', over='raise', invalid='raise'):
try:
D = fa
C = fpa
db = b - a * 1.0
B = (fb - D - C * db) / (db * db)
xmin = a - C / (2.0 * B)
except ArithmeticError:
return None
if not np.isfinite(xmin):
return None
return xmin
def _zoom(a_lo, a_hi, phi_lo, phi_hi, derphi_lo,
phi, derphi, phi0, derphi0, c1, c2, extra_condition):
"""Zoom stage of approximate linesearch satisfying strong Wolfe conditions.
Part of the optimization algorithm in `scalar_search_wolfe2`.
Notes
-----
Implements Algorithm 3.6 (zoom) in Wright and Nocedal,
'Numerical Optimization', 1999, pp. 61.
"""
maxiter = 10
i = 0
delta1 = 0.2 # cubic interpolant check
delta2 = 0.1 # quadratic interpolant check
phi_rec = phi0
a_rec = 0
while True:
# interpolate to find a trial step length between a_lo and
# a_hi Need to choose interpolation here. Use cubic
# interpolation and then if the result is within delta *
# dalpha or outside of the interval bounded by a_lo or a_hi
# then use quadratic interpolation, if the result is still too
# close, then use bisection
dalpha = a_hi - a_lo
if dalpha < 0:
a, b = a_hi, a_lo
else:
a, b = a_lo, a_hi
# minimizer of cubic interpolant
# (uses phi_lo, derphi_lo, phi_hi, and the most recent value of phi)
#
# if the result is too close to the end points (or out of the
# interval), then use quadratic interpolation with phi_lo,
# derphi_lo and phi_hi if the result is still too close to the
# end points (or out of the interval) then use bisection
if (i > 0):
cchk = delta1 * dalpha
a_j = _cubicmin(a_lo, phi_lo, derphi_lo, a_hi, phi_hi,
a_rec, phi_rec)
if (i == 0) or (a_j is None) or (a_j > b - cchk) or (a_j < a + cchk):
qchk = delta2 * dalpha
a_j = _quadmin(a_lo, phi_lo, derphi_lo, a_hi, phi_hi)
if (a_j is None) or (a_j > b-qchk) or (a_j < a+qchk):
a_j = a_lo + 0.5*dalpha
# Check new value of a_j
phi_aj = phi(a_j)
if (phi_aj > phi0 + c1*a_j*derphi0) or (phi_aj >= phi_lo):
phi_rec = phi_hi
a_rec = a_hi
a_hi = a_j
phi_hi = phi_aj
else:
derphi_aj = derphi(a_j)
if abs(derphi_aj) <= -c2*derphi0 and extra_condition(a_j, phi_aj):
a_star = a_j
val_star = phi_aj
valprime_star = derphi_aj
break
if derphi_aj*(a_hi - a_lo) >= 0:
phi_rec = phi_hi
a_rec = a_hi
a_hi = a_lo
phi_hi = phi_lo
else:
phi_rec = phi_lo
a_rec = a_lo
a_lo = a_j
phi_lo = phi_aj
derphi_lo = derphi_aj
i += 1
if (i > maxiter):
# Failed to find a conforming step size
a_star = None
val_star = None
valprime_star = None
break
return a_star, val_star, valprime_star
#------------------------------------------------------------------------------
# Armijo line and scalar searches
#------------------------------------------------------------------------------
def line_search_armijo(f, xk, pk, gfk, old_fval, args=(), c1=1e-4, alpha0=1):
"""Minimize over alpha, the function ``f(xk+alpha pk)``.
Parameters
----------
f : callable
Function to be minimized.
xk : array_like
Current point.
pk : array_like
Search direction.
gfk : array_like
Gradient of `f` at point `xk`.
old_fval : float
Value of `f` at point `xk`.
args : tuple, optional
Optional arguments.
c1 : float, optional
Value to control stopping criterion.
alpha0 : scalar, optional
Value of `alpha` at start of the optimization.
Returns
-------
alpha
f_count
f_val_at_alpha
Notes
-----
Uses the interpolation algorithm (Armijo backtracking) as suggested by
Wright and Nocedal in 'Numerical Optimization', 1999, pp. 56-57
"""
xk = np.atleast_1d(xk)
fc = [0]
def phi(alpha1):
fc[0] += 1
return f(xk + alpha1*pk, *args)
if old_fval is None:
phi0 = phi(0.)
else:
phi0 = old_fval # compute f(xk) -- done in past loop
derphi0 = np.dot(gfk, pk)
alpha, phi1 = scalar_search_armijo(phi, phi0, derphi0, c1=c1,
alpha0=alpha0)
return alpha, fc[0], phi1
def line_search_BFGS(f, xk, pk, gfk, old_fval, args=(), c1=1e-4, alpha0=1):
"""
Compatibility wrapper for `line_search_armijo`
"""
r = line_search_armijo(f, xk, pk, gfk, old_fval, args=args, c1=c1,
alpha0=alpha0)
return r[0], r[1], 0, r[2]
def scalar_search_armijo(phi, phi0, derphi0, c1=1e-4, alpha0=1, amin=0):
"""Minimize over alpha, the function ``phi(alpha)``.
Uses the interpolation algorithm (Armijo backtracking) as suggested by
Wright and Nocedal in 'Numerical Optimization', 1999, pp. 56-57
alpha > 0 is assumed to be a descent direction.
Returns
-------
alpha
phi1
"""
phi_a0 = phi(alpha0)
if phi_a0 <= phi0 + c1*alpha0*derphi0:
return alpha0, phi_a0
# Otherwise, compute the minimizer of a quadratic interpolant:
alpha1 = -(derphi0) * alpha0**2 / 2.0 / (phi_a0 - phi0 - derphi0 * alpha0)
phi_a1 = phi(alpha1)
if (phi_a1 <= phi0 + c1*alpha1*derphi0):
return alpha1, phi_a1
# Otherwise, loop with cubic interpolation until we find an alpha which
# satisfies the first Wolfe condition (since we are backtracking, we will
# assume that the value of alpha is not too small and satisfies the second
# condition.
while alpha1 > amin: # we are assuming alpha>0 is a descent direction
factor = alpha0**2 * alpha1**2 * (alpha1-alpha0)
a = alpha0**2 * (phi_a1 - phi0 - derphi0*alpha1) - \
alpha1**2 * (phi_a0 - phi0 - derphi0*alpha0)
a = a / factor
b = -alpha0**3 * (phi_a1 - phi0 - derphi0*alpha1) + \
alpha1**3 * (phi_a0 - phi0 - derphi0*alpha0)
b = b / factor
alpha2 = (-b + np.sqrt(abs(b**2 - 3 * a * derphi0))) / (3.0*a)
phi_a2 = phi(alpha2)
if (phi_a2 <= phi0 + c1*alpha2*derphi0):
return alpha2, phi_a2
if (alpha1 - alpha2) > alpha1 / 2.0 or (1 - alpha2/alpha1) < 0.96:
alpha2 = alpha1 / 2.0
alpha0 = alpha1
alpha1 = alpha2
phi_a0 = phi_a1
phi_a1 = phi_a2
# Failed to find a suitable step length
return None, phi_a1
#------------------------------------------------------------------------------
# Non-monotone line search for DF-SANE
#------------------------------------------------------------------------------
def _nonmonotone_line_search_cruz(f, x_k, d, prev_fs, eta,
gamma=1e-4, tau_min=0.1, tau_max=0.5):
"""
Nonmonotone backtracking line search as described in [1]_
Parameters
----------
f : callable
Function returning a tuple ``(f, F)`` where ``f`` is the value
of a merit function and ``F`` the residual.
x_k : ndarray
Initial position.
d : ndarray
Search direction.
prev_fs : float
List of previous merit function values. Should have ``len(prev_fs) <= M``
where ``M`` is the nonmonotonicity window parameter.
eta : float
Allowed merit function increase, see [1]_
gamma, tau_min, tau_max : float, optional
Search parameters, see [1]_
Returns
-------
alpha : float
Step length
xp : ndarray
Next position
fp : float
Merit function value at next position
Fp : ndarray
Residual at next position
References
----------
[1] "Spectral residual method without gradient information for solving
large-scale nonlinear systems of equations." W. La Cruz,
J.M. Martinez, M. Raydan. Math. Comp. **75**, 1429 (2006).
"""
f_k = prev_fs[-1]
f_bar = max(prev_fs)
alpha_p = 1
alpha_m = 1
alpha = 1
while True:
xp = x_k + alpha_p * d
fp, Fp = f(xp)
if fp <= f_bar + eta - gamma * alpha_p**2 * f_k:
alpha = alpha_p
break
alpha_tp = alpha_p**2 * f_k / (fp + (2*alpha_p - 1)*f_k)
xp = x_k - alpha_m * d
fp, Fp = f(xp)
if fp <= f_bar + eta - gamma * alpha_m**2 * f_k:
alpha = -alpha_m
break
alpha_tm = alpha_m**2 * f_k / (fp + (2*alpha_m - 1)*f_k)
alpha_p = np.clip(alpha_tp, tau_min * alpha_p, tau_max * alpha_p)
alpha_m = np.clip(alpha_tm, tau_min * alpha_m, tau_max * alpha_m)
return alpha, xp, fp, Fp
def _nonmonotone_line_search_cheng(f, x_k, d, f_k, C, Q, eta,
gamma=1e-4, tau_min=0.1, tau_max=0.5,
nu=0.85):
"""
Nonmonotone line search from [1]
Parameters
----------
f : callable
Function returning a tuple ``(f, F)`` where ``f`` is the value
of a merit function and ``F`` the residual.
x_k : ndarray
Initial position.
d : ndarray
Search direction.
f_k : float
Initial merit function value.
C, Q : float
Control parameters. On the first iteration, give values
Q=1.0, C=f_k
eta : float
Allowed merit function increase, see [1]_
nu, gamma, tau_min, tau_max : float, optional
Search parameters, see [1]_
Returns
-------
alpha : float
Step length
xp : ndarray
Next position
fp : float
Merit function value at next position
Fp : ndarray
Residual at next position
C : float
New value for the control parameter C
Q : float
New value for the control parameter Q
References
----------
.. [1] W. Cheng & D.-H. Li, ''A derivative-free nonmonotone line
search and its application to the spectral residual
method'', IMA J. Numer. Anal. 29, 814 (2009).
"""
alpha_p = 1
alpha_m = 1
alpha = 1
while True:
xp = x_k + alpha_p * d
fp, Fp = f(xp)
if fp <= C + eta - gamma * alpha_p**2 * f_k:
alpha = alpha_p
break
alpha_tp = alpha_p**2 * f_k / (fp + (2*alpha_p - 1)*f_k)
xp = x_k - alpha_m * d
fp, Fp = f(xp)
if fp <= C + eta - gamma * alpha_m**2 * f_k:
alpha = -alpha_m
break
alpha_tm = alpha_m**2 * f_k / (fp + (2*alpha_m - 1)*f_k)
alpha_p = np.clip(alpha_tp, tau_min * alpha_p, tau_max * alpha_p)
alpha_m = np.clip(alpha_tm, tau_min * alpha_m, tau_max * alpha_m)
# Update C and Q
Q_next = nu * Q + 1
C = (nu * Q * (C + eta) + fp) / Q_next
Q = Q_next
return alpha, xp, fp, Fp, C, Q

View File

@@ -0,0 +1,685 @@
"""
A top-level linear programming interface. Currently this interface solves
linear programming problems via the Simplex and Interior-Point methods.
.. versionadded:: 0.15.0
Functions
---------
.. autosummary::
:toctree: generated/
linprog
linprog_verbose_callback
linprog_terse_callback
"""
import numpy as np
from ._optimize import OptimizeResult, OptimizeWarning
from warnings import warn
from ._linprog_highs import _linprog_highs
from ._linprog_ip import _linprog_ip
from ._linprog_simplex import _linprog_simplex
from ._linprog_rs import _linprog_rs
from ._linprog_doc import (_linprog_highs_doc, _linprog_ip_doc,
_linprog_rs_doc, _linprog_simplex_doc,
_linprog_highs_ipm_doc, _linprog_highs_ds_doc)
from ._linprog_util import (
_parse_linprog, _presolve, _get_Abc, _LPProblem, _autoscale,
_postsolve, _check_result, _display_summary)
from copy import deepcopy
__all__ = ['linprog', 'linprog_verbose_callback', 'linprog_terse_callback']
__docformat__ = "restructuredtext en"
LINPROG_METHODS = ['simplex', 'revised simplex', 'interior-point', 'highs', 'highs-ds', 'highs-ipm']
def linprog_verbose_callback(res):
"""
A sample callback function demonstrating the linprog callback interface.
This callback produces detailed output to sys.stdout before each iteration
and after the final iteration of the simplex algorithm.
Parameters
----------
res : A `scipy.optimize.OptimizeResult` consisting of the following fields:
x : 1-D array
The independent variable vector which optimizes the linear
programming problem.
fun : float
Value of the objective function.
success : bool
True if the algorithm succeeded in finding an optimal solution.
slack : 1-D array
The values of the slack variables. Each slack variable corresponds
to an inequality constraint. If the slack is zero, then the
corresponding constraint is active.
con : 1-D array
The (nominally zero) residuals of the equality constraints, that is,
``b - A_eq @ x``
phase : int
The phase of the optimization being executed. In phase 1 a basic
feasible solution is sought and the T has an additional row
representing an alternate objective function.
status : int
An integer representing the exit status of the optimization::
0 : Optimization terminated successfully
1 : Iteration limit reached
2 : Problem appears to be infeasible
3 : Problem appears to be unbounded
4 : Serious numerical difficulties encountered
nit : int
The number of iterations performed.
message : str
A string descriptor of the exit status of the optimization.
"""
x = res['x']
fun = res['fun']
phase = res['phase']
status = res['status']
nit = res['nit']
message = res['message']
complete = res['complete']
saved_printoptions = np.get_printoptions()
np.set_printoptions(linewidth=500,
formatter={'float': lambda x: "{0: 12.4f}".format(x)})
if status:
print('--------- Simplex Early Exit -------\n')
print('The simplex method exited early with status {0:d}'.format(status))
print(message)
elif complete:
print('--------- Simplex Complete --------\n')
print('Iterations required: {}'.format(nit))
else:
print('--------- Iteration {0:d} ---------\n'.format(nit))
if nit > 0:
if phase == 1:
print('Current Pseudo-Objective Value:')
else:
print('Current Objective Value:')
print('f = ', fun)
print()
print('Current Solution Vector:')
print('x = ', x)
print()
np.set_printoptions(**saved_printoptions)
def linprog_terse_callback(res):
"""
A sample callback function demonstrating the linprog callback interface.
This callback produces brief output to sys.stdout before each iteration
and after the final iteration of the simplex algorithm.
Parameters
----------
res : A `scipy.optimize.OptimizeResult` consisting of the following fields:
x : 1-D array
The independent variable vector which optimizes the linear
programming problem.
fun : float
Value of the objective function.
success : bool
True if the algorithm succeeded in finding an optimal solution.
slack : 1-D array
The values of the slack variables. Each slack variable corresponds
to an inequality constraint. If the slack is zero, then the
corresponding constraint is active.
con : 1-D array
The (nominally zero) residuals of the equality constraints, that is,
``b - A_eq @ x``.
phase : int
The phase of the optimization being executed. In phase 1 a basic
feasible solution is sought and the T has an additional row
representing an alternate objective function.
status : int
An integer representing the exit status of the optimization::
0 : Optimization terminated successfully
1 : Iteration limit reached
2 : Problem appears to be infeasible
3 : Problem appears to be unbounded
4 : Serious numerical difficulties encountered
nit : int
The number of iterations performed.
message : str
A string descriptor of the exit status of the optimization.
"""
nit = res['nit']
x = res['x']
if nit == 0:
print("Iter: X:")
print("{0: <5d} ".format(nit), end="")
print(x)
def linprog(c, A_ub=None, b_ub=None, A_eq=None, b_eq=None,
bounds=None, method='interior-point', callback=None,
options=None, x0=None):
r"""
Linear programming: minimize a linear objective function subject to linear
equality and inequality constraints.
Linear programming solves problems of the following form:
.. math::
\min_x \ & c^T x \\
\mbox{such that} \ & A_{ub} x \leq b_{ub},\\
& A_{eq} x = b_{eq},\\
& l \leq x \leq u ,
where :math:`x` is a vector of decision variables; :math:`c`,
:math:`b_{ub}`, :math:`b_{eq}`, :math:`l`, and :math:`u` are vectors; and
:math:`A_{ub}` and :math:`A_{eq}` are matrices.
Alternatively, that's:
minimize::
c @ x
such that::
A_ub @ x <= b_ub
A_eq @ x == b_eq
lb <= x <= ub
Note that by default ``lb = 0`` and ``ub = None`` unless specified with
``bounds``.
Parameters
----------
c : 1-D array
The coefficients of the linear objective function to be minimized.
A_ub : 2-D array, optional
The inequality constraint matrix. Each row of ``A_ub`` specifies the
coefficients of a linear inequality constraint on ``x``.
b_ub : 1-D array, optional
The inequality constraint vector. Each element represents an
upper bound on the corresponding value of ``A_ub @ x``.
A_eq : 2-D array, optional
The equality constraint matrix. Each row of ``A_eq`` specifies the
coefficients of a linear equality constraint on ``x``.
b_eq : 1-D array, optional
The equality constraint vector. Each element of ``A_eq @ x`` must equal
the corresponding element of ``b_eq``.
bounds : sequence, optional
A sequence of ``(min, max)`` pairs for each element in ``x``, defining
the minimum and maximum values of that decision variable. Use ``None``
to indicate that there is no bound. By default, bounds are
``(0, None)`` (all decision variables are non-negative).
If a single tuple ``(min, max)`` is provided, then ``min`` and
``max`` will serve as bounds for all decision variables.
method : str, optional
The algorithm used to solve the standard form problem.
:ref:`'highs-ds' <optimize.linprog-highs-ds>`,
:ref:`'highs-ipm' <optimize.linprog-highs-ipm>`,
:ref:`'highs' <optimize.linprog-highs>`,
:ref:`'interior-point' <optimize.linprog-interior-point>` (default),
:ref:`'revised simplex' <optimize.linprog-revised_simplex>`, and
:ref:`'simplex' <optimize.linprog-simplex>` (legacy)
are supported.
callback : callable, optional
If a callback function is provided, it will be called at least once per
iteration of the algorithm. The callback function must accept a single
`scipy.optimize.OptimizeResult` consisting of the following fields:
x : 1-D array
The current solution vector.
fun : float
The current value of the objective function ``c @ x``.
success : bool
``True`` when the algorithm has completed successfully.
slack : 1-D array
The (nominally positive) values of the slack,
``b_ub - A_ub @ x``.
con : 1-D array
The (nominally zero) residuals of the equality constraints,
``b_eq - A_eq @ x``.
phase : int
The phase of the algorithm being executed.
status : int
An integer representing the status of the algorithm.
``0`` : Optimization proceeding nominally.
``1`` : Iteration limit reached.
``2`` : Problem appears to be infeasible.
``3`` : Problem appears to be unbounded.
``4`` : Numerical difficulties encountered.
nit : int
The current iteration number.
message : str
A string descriptor of the algorithm status.
Callback functions are not currently supported by the HiGHS methods.
options : dict, optional
A dictionary of solver options. All methods accept the following
options:
maxiter : int
Maximum number of iterations to perform.
Default: see method-specific documentation.
disp : bool
Set to ``True`` to print convergence messages.
Default: ``False``.
presolve : bool
Set to ``False`` to disable automatic presolve.
Default: ``True``.
All methods except the HiGHS solvers also accept:
tol : float
A tolerance which determines when a residual is "close enough" to
zero to be considered exactly zero.
autoscale : bool
Set to ``True`` to automatically perform equilibration.
Consider using this option if the numerical values in the
constraints are separated by several orders of magnitude.
Default: ``False``.
rr : bool
Set to ``False`` to disable automatic redundancy removal.
Default: ``True``.
rr_method : string
Method used to identify and remove redundant rows from the
equality constraint matrix after presolve. For problems with
dense input, the available methods for redundancy removal are:
"SVD":
Repeatedly performs singular value decomposition on
the matrix, detecting redundant rows based on nonzeros
in the left singular vectors that correspond with
zero singular values. May be fast when the matrix is
nearly full rank.
"pivot":
Uses the algorithm presented in [5]_ to identify
redundant rows.
"ID":
Uses a randomized interpolative decomposition.
Identifies columns of the matrix transpose not used in
a full-rank interpolative decomposition of the matrix.
None:
Uses "svd" if the matrix is nearly full rank, that is,
the difference between the matrix rank and the number
of rows is less than five. If not, uses "pivot". The
behavior of this default is subject to change without
prior notice.
Default: None.
For problems with sparse input, this option is ignored, and the
pivot-based algorithm presented in [5]_ is used.
For method-specific options, see
:func:`show_options('linprog') <show_options>`.
x0 : 1-D array, optional
Guess values of the decision variables, which will be refined by
the optimization algorithm. This argument is currently used only by the
'revised simplex' method, and can only be used if `x0` represents a
basic feasible solution.
Returns
-------
res : OptimizeResult
A :class:`scipy.optimize.OptimizeResult` consisting of the fields:
x : 1-D array
The values of the decision variables that minimizes the
objective function while satisfying the constraints.
fun : float
The optimal value of the objective function ``c @ x``.
slack : 1-D array
The (nominally positive) values of the slack variables,
``b_ub - A_ub @ x``.
con : 1-D array
The (nominally zero) residuals of the equality constraints,
``b_eq - A_eq @ x``.
success : bool
``True`` when the algorithm succeeds in finding an optimal
solution.
status : int
An integer representing the exit status of the algorithm.
``0`` : Optimization terminated successfully.
``1`` : Iteration limit reached.
``2`` : Problem appears to be infeasible.
``3`` : Problem appears to be unbounded.
``4`` : Numerical difficulties encountered.
nit : int
The total number of iterations performed in all phases.
message : str
A string descriptor of the exit status of the algorithm.
See Also
--------
show_options : Additional options accepted by the solvers.
Notes
-----
This section describes the available solvers that can be selected by the
'method' parameter.
`'highs-ds'` and
`'highs-ipm'` are interfaces to the
HiGHS simplex and interior-point method solvers [13]_, respectively.
`'highs'` chooses between
the two automatically. These are the fastest linear
programming solvers in SciPy, especially for large, sparse problems;
which of these two is faster is problem-dependent.
`'interior-point'` is the default
as it was the fastest and most robust method before the recent
addition of the HiGHS solvers.
`'revised simplex'` is more
accurate than interior-point for the problems it solves.
`'simplex'` is the legacy method and is
included for backwards compatibility and educational purposes.
Method *highs-ds* is a wrapper of the C++ high performance dual
revised simplex implementation (HSOL) [13]_, [14]_. Method *highs-ipm*
is a wrapper of a C++ implementation of an **i**\ nterior-\ **p**\ oint
**m**\ ethod [13]_; it features a crossover routine, so it is as accurate
as a simplex solver. Method *highs* chooses between the two automatically.
For new code involving `linprog`, we recommend explicitly choosing one of
these three method values.
.. versionadded:: 1.6.0
Method *interior-point* uses the primal-dual path following algorithm
as outlined in [4]_. This algorithm supports sparse constraint matrices and
is typically faster than the simplex methods, especially for large, sparse
problems. Note, however, that the solution returned may be slightly less
accurate than those of the simplex methods and will not, in general,
correspond with a vertex of the polytope defined by the constraints.
.. versionadded:: 1.0.0
Method *revised simplex* uses the revised simplex method as described in
[9]_, except that a factorization [11]_ of the basis matrix, rather than
its inverse, is efficiently maintained and used to solve the linear systems
at each iteration of the algorithm.
.. versionadded:: 1.3.0
Method *simplex* uses a traditional, full-tableau implementation of
Dantzig's simplex algorithm [1]_, [2]_ (*not* the
Nelder-Mead simplex). This algorithm is included for backwards
compatibility and educational purposes.
.. versionadded:: 0.15.0
Before applying *interior-point*, *revised simplex*, or *simplex*,
a presolve procedure based on [8]_ attempts
to identify trivial infeasibilities, trivial unboundedness, and potential
problem simplifications. Specifically, it checks for:
- rows of zeros in ``A_eq`` or ``A_ub``, representing trivial constraints;
- columns of zeros in ``A_eq`` `and` ``A_ub``, representing unconstrained
variables;
- column singletons in ``A_eq``, representing fixed variables; and
- column singletons in ``A_ub``, representing simple bounds.
If presolve reveals that the problem is unbounded (e.g. an unconstrained
and unbounded variable has negative cost) or infeasible (e.g., a row of
zeros in ``A_eq`` corresponds with a nonzero in ``b_eq``), the solver
terminates with the appropriate status code. Note that presolve terminates
as soon as any sign of unboundedness is detected; consequently, a problem
may be reported as unbounded when in reality the problem is infeasible
(but infeasibility has not been detected yet). Therefore, if it is
important to know whether the problem is actually infeasible, solve the
problem again with option ``presolve=False``.
If neither infeasibility nor unboundedness are detected in a single pass
of the presolve, bounds are tightened where possible and fixed
variables are removed from the problem. Then, linearly dependent rows
of the ``A_eq`` matrix are removed, (unless they represent an
infeasibility) to avoid numerical difficulties in the primary solve
routine. Note that rows that are nearly linearly dependent (within a
prescribed tolerance) may also be removed, which can change the optimal
solution in rare cases. If this is a concern, eliminate redundancy from
your problem formulation and run with option ``rr=False`` or
``presolve=False``.
Several potential improvements can be made here: additional presolve
checks outlined in [8]_ should be implemented, the presolve routine should
be run multiple times (until no further simplifications can be made), and
more of the efficiency improvements from [5]_ should be implemented in the
redundancy removal routines.
After presolve, the problem is transformed to standard form by converting
the (tightened) simple bounds to upper bound constraints, introducing
non-negative slack variables for inequality constraints, and expressing
unbounded variables as the difference between two non-negative variables.
Optionally, the problem is automatically scaled via equilibration [12]_.
The selected algorithm solves the standard form problem, and a
postprocessing routine converts the result to a solution to the original
problem.
References
----------
.. [1] Dantzig, George B., Linear programming and extensions. Rand
Corporation Research Study Princeton Univ. Press, Princeton, NJ,
1963
.. [2] Hillier, S.H. and Lieberman, G.J. (1995), "Introduction to
Mathematical Programming", McGraw-Hill, Chapter 4.
.. [3] Bland, Robert G. New finite pivoting rules for the simplex method.
Mathematics of Operations Research (2), 1977: pp. 103-107.
.. [4] Andersen, Erling D., and Knud D. Andersen. "The MOSEK interior point
optimizer for linear programming: an implementation of the
homogeneous algorithm." High performance optimization. Springer US,
2000. 197-232.
.. [5] Andersen, Erling D. "Finding all linearly dependent rows in
large-scale linear programming." Optimization Methods and Software
6.3 (1995): 219-227.
.. [6] Freund, Robert M. "Primal-Dual Interior-Point Methods for Linear
Programming based on Newton's Method." Unpublished Course Notes,
March 2004. Available 2/25/2017 at
https://ocw.mit.edu/courses/sloan-school-of-management/15-084j-nonlinear-programming-spring-2004/lecture-notes/lec14_int_pt_mthd.pdf
.. [7] Fourer, Robert. "Solving Linear Programs by Interior-Point Methods."
Unpublished Course Notes, August 26, 2005. Available 2/25/2017 at
http://www.4er.org/CourseNotes/Book%20B/B-III.pdf
.. [8] Andersen, Erling D., and Knud D. Andersen. "Presolving in linear
programming." Mathematical Programming 71.2 (1995): 221-245.
.. [9] Bertsimas, Dimitris, and J. Tsitsiklis. "Introduction to linear
programming." Athena Scientific 1 (1997): 997.
.. [10] Andersen, Erling D., et al. Implementation of interior point
methods for large scale linear programming. HEC/Universite de
Geneve, 1996.
.. [11] Bartels, Richard H. "A stabilization of the simplex method."
Journal in Numerische Mathematik 16.5 (1971): 414-434.
.. [12] Tomlin, J. A. "On scaling linear programming problems."
Mathematical Programming Study 4 (1975): 146-166.
.. [13] Huangfu, Q., Galabova, I., Feldmeier, M., and Hall, J. A. J.
"HiGHS - high performance software for linear optimization."
Accessed 4/16/2020 at https://www.maths.ed.ac.uk/hall/HiGHS/#guide
.. [14] Huangfu, Q. and Hall, J. A. J. "Parallelizing the dual revised
simplex method." Mathematical Programming Computation, 10 (1),
119-142, 2018. DOI: 10.1007/s12532-017-0130-5
Examples
--------
Consider the following problem:
.. math::
\min_{x_0, x_1} \ -x_0 + 4x_1 & \\
\mbox{such that} \ -3x_0 + x_1 & \leq 6,\\
-x_0 - 2x_1 & \geq -4,\\
x_1 & \geq -3.
The problem is not presented in the form accepted by `linprog`. This is
easily remedied by converting the "greater than" inequality
constraint to a "less than" inequality constraint by
multiplying both sides by a factor of :math:`-1`. Note also that the last
constraint is really the simple bound :math:`-3 \leq x_1 \leq \infty`.
Finally, since there are no bounds on :math:`x_0`, we must explicitly
specify the bounds :math:`-\infty \leq x_0 \leq \infty`, as the
default is for variables to be non-negative. After collecting coeffecients
into arrays and tuples, the input for this problem is:
>>> c = [-1, 4]
>>> A = [[-3, 1], [1, 2]]
>>> b = [6, 4]
>>> x0_bounds = (None, None)
>>> x1_bounds = (-3, None)
>>> from scipy.optimize import linprog
>>> res = linprog(c, A_ub=A, b_ub=b, bounds=[x0_bounds, x1_bounds])
Note that the default method for `linprog` is 'interior-point', which is
approximate by nature.
>>> print(res)
con: array([], dtype=float64)
fun: -21.99999984082494 # may vary
message: 'Optimization terminated successfully.'
nit: 6 # may vary
slack: array([3.89999997e+01, 8.46872439e-08] # may vary
status: 0
success: True
x: array([ 9.99999989, -2.99999999]) # may vary
If you need greater accuracy, try 'revised simplex'.
>>> res = linprog(c, A_ub=A, b_ub=b, bounds=[x0_bounds, x1_bounds], method='revised simplex')
>>> print(res)
con: array([], dtype=float64)
fun: -22.0 # may vary
message: 'Optimization terminated successfully.'
nit: 1 # may vary
slack: array([39., 0.]) # may vary
status: 0
success: True
x: array([10., -3.]) # may vary
You can use the ``options`` parameter, e.g.,
to restrict the maximum number of iterations.
>>> res = linprog(c, A_ub=A, b_ub=b, bounds=[x0_bounds, x1_bounds],
... options={'maxiter': 4})
>>> print(res)
con: array([], dtype=float64)
fun: -21.35207150630407 # may vary
message: 'The iteration limit was reached before the algorithm converged.'
nit: 4
slack: array([37.19406046, 0.5727398 ])
status: 1
success: False
x: array([ 9.4021973 , -2.98746855])
"""
meth = method.lower()
methods = {"simplex", "revised simplex", "interior-point",
"highs", "highs-ds", "highs-ipm"}
if meth not in methods:
raise ValueError(f"Unknown solver '{method}'")
if x0 is not None and meth != "revised simplex":
warning_message = "x0 is used only when method is 'revised simplex'. "
warn(warning_message, OptimizeWarning)
lp = _LPProblem(c, A_ub, b_ub, A_eq, b_eq, bounds, x0)
lp, solver_options = _parse_linprog(lp, options, meth)
tol = solver_options.get('tol', 1e-9)
# Give unmodified problem to HiGHS
if meth.startswith('highs'):
if callback is not None:
raise NotImplementedError("HiGHS solvers do not support the "
"callback interface.")
highs_solvers = {'highs-ipm': 'ipm', 'highs-ds': 'simplex',
'highs': None}
sol = _linprog_highs(lp, solver=highs_solvers[meth],
**solver_options)
sol['status'], sol['message'] = (
_check_result(sol['x'], sol['fun'], sol['status'], sol['slack'],
sol['con'], lp.bounds, tol, sol['message']))
sol['success'] = sol['status'] == 0
return OptimizeResult(sol)
iteration = 0
complete = False # will become True if solved in presolve
undo = []
# Keep the original arrays to calculate slack/residuals for original
# problem.
lp_o = deepcopy(lp)
# Solve trivial problem, eliminate variables, tighten bounds, etc.
rr_method = solver_options.pop('rr_method', None) # need to pop these;
rr = solver_options.pop('rr', True) # they're not passed to methods
c0 = 0 # we might get a constant term in the objective
if solver_options.pop('presolve', True):
(lp, c0, x, undo, complete, status, message) = _presolve(lp, rr,
rr_method,
tol)
C, b_scale = 1, 1 # for trivial unscaling if autoscale is not used
postsolve_args = (lp_o._replace(bounds=lp.bounds), undo, C, b_scale)
if not complete:
A, b, c, c0, x0 = _get_Abc(lp, c0)
if solver_options.pop('autoscale', False):
A, b, c, x0, C, b_scale = _autoscale(A, b, c, x0)
postsolve_args = postsolve_args[:-2] + (C, b_scale)
if meth == 'simplex':
x, status, message, iteration = _linprog_simplex(
c, c0=c0, A=A, b=b, callback=callback,
postsolve_args=postsolve_args, **solver_options)
elif meth == 'interior-point':
x, status, message, iteration = _linprog_ip(
c, c0=c0, A=A, b=b, callback=callback,
postsolve_args=postsolve_args, **solver_options)
elif meth == 'revised simplex':
x, status, message, iteration = _linprog_rs(
c, c0=c0, A=A, b=b, x0=x0, callback=callback,
postsolve_args=postsolve_args, **solver_options)
# Eliminate artificial variables, re-introduce presolved variables, etc.
disp = solver_options.get('disp', False)
x, fun, slack, con = _postsolve(x, postsolve_args, complete)
status, message = _check_result(x, fun, status, slack, con, lp_o.bounds, tol, message)
if disp:
_display_summary(message, status, fun, iteration)
sol = {
'x': x,
'fun': fun,
'slack': slack,
'con': con,
'status': status,
'message': message,
'nit': iteration,
'success': status == 0}
return OptimizeResult(sol)

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,442 @@
"""HiGHS Linear Optimization Methods
Interface to HiGHS linear optimization software.
https://www.maths.ed.ac.uk/hall/HiGHS/
.. versionadded:: 1.5.0
References
----------
.. [1] Q. Huangfu and J.A.J. Hall. "Parallelizing the dual revised simplex
method." Mathematical Programming Computation, 10 (1), 119-142,
2018. DOI: 10.1007/s12532-017-0130-5
"""
import inspect
import numpy as np
from ._optimize import _check_unknown_options, OptimizeWarning, OptimizeResult
from warnings import warn
from ._highs._highs_wrapper import _highs_wrapper
from ._highs._highs_constants import (
CONST_I_INF,
CONST_INF,
MESSAGE_LEVEL_MINIMAL,
MODEL_STATUS_NOTSET,
MODEL_STATUS_LOAD_ERROR,
MODEL_STATUS_MODEL_ERROR,
MODEL_STATUS_PRESOLVE_ERROR,
MODEL_STATUS_SOLVE_ERROR,
MODEL_STATUS_POSTSOLVE_ERROR,
MODEL_STATUS_MODEL_EMPTY,
MODEL_STATUS_PRIMAL_INFEASIBLE,
MODEL_STATUS_PRIMAL_UNBOUNDED,
MODEL_STATUS_OPTIMAL,
MODEL_STATUS_REACHED_DUAL_OBJECTIVE_VALUE_UPPER_BOUND
as MODEL_STATUS_RDOVUB,
MODEL_STATUS_REACHED_TIME_LIMIT,
MODEL_STATUS_REACHED_ITERATION_LIMIT,
MODEL_STATUS_PRIMAL_DUAL_INFEASIBLE,
MODEL_STATUS_DUAL_INFEASIBLE,
HIGHS_SIMPLEX_STRATEGY_CHOOSE,
HIGHS_SIMPLEX_STRATEGY_DUAL,
HIGHS_SIMPLEX_CRASH_STRATEGY_OFF,
HIGHS_SIMPLEX_DUAL_EDGE_WEIGHT_STRATEGY_CHOOSE,
HIGHS_SIMPLEX_DUAL_EDGE_WEIGHT_STRATEGY_DANTZIG,
HIGHS_SIMPLEX_DUAL_EDGE_WEIGHT_STRATEGY_DEVEX,
HIGHS_SIMPLEX_DUAL_EDGE_WEIGHT_STRATEGY_STEEPEST_EDGE,
)
from scipy.sparse import csc_matrix, vstack, issparse
def _replace_inf(x):
# Replace `np.inf` with CONST_INF
infs = np.isinf(x)
x[infs] = np.sign(x[infs])*CONST_INF
return x
def _convert_to_highs_enum(option, option_str, choices):
# If option is in the choices we can look it up, if not use
# the default value taken from function signature and warn:
try:
return choices[option.lower()]
except AttributeError:
return choices[option]
except KeyError:
sig = inspect.signature(_linprog_highs)
default_str = sig.parameters[option_str].default
warn(f"Option {option_str} is {option}, but only values in "
f"{set(choices.keys())} are allowed. Using default: "
f"{default_str}.",
OptimizeWarning, stacklevel=3)
return choices[default_str]
def _linprog_highs(lp, solver, time_limit=None, presolve=True,
disp=False, maxiter=None,
dual_feasibility_tolerance=None,
primal_feasibility_tolerance=None,
ipm_optimality_tolerance=None,
simplex_dual_edge_weight_strategy=None,
**unknown_options):
r"""
Solve the following linear programming problem using one of the HiGHS
solvers:
User-facing documentation is in _linprog_doc.py.
Parameters
----------
lp : _LPProblem
A ``scipy.optimize._linprog_util._LPProblem`` ``namedtuple``.
solver : "ipm" or "simplex" or None
Which HiGHS solver to use. If ``None``, "simplex" will be used.
Options
-------
maxiter : int
The maximum number of iterations to perform in either phase. For
``solver='ipm'``, this does not include the number of crossover
iterations. Default is the largest possible value for an ``int``
on the platform.
disp : bool
Set to ``True`` if indicators of optimization status are to be printed
to the console each iteration; default ``False``.
time_limit : float
The maximum time in seconds allotted to solve the problem; default is
the largest possible value for a ``double`` on the platform.
presolve : bool
Presolve attempts to identify trivial infeasibilities,
identify trivial unboundedness, and simplify the problem before
sending it to the main solver. It is generally recommended
to keep the default setting ``True``; set to ``False`` if presolve is
to be disabled.
dual_feasibility_tolerance : double
Dual feasibility tolerance. Default is 1e-07.
The minimum of this and ``primal_feasibility_tolerance``
is used for the feasibility tolerance when ``solver='ipm'``.
primal_feasibility_tolerance : double
Primal feasibility tolerance. Default is 1e-07.
The minimum of this and ``dual_feasibility_tolerance``
is used for the feasibility tolerance when ``solver='ipm'``.
ipm_optimality_tolerance : double
Optimality tolerance for ``solver='ipm'``. Default is 1e-08.
Minimum possible value is 1e-12 and must be smaller than the largest
possible value for a ``double`` on the platform.
simplex_dual_edge_weight_strategy : str (default: None)
Strategy for simplex dual edge weights. The default, ``None``,
automatically selects one of the following.
``'dantzig'`` uses Dantzig's original strategy of choosing the most
negative reduced cost.
``'devex'`` uses the strategy described in [15]_.
``steepest`` uses the exact steepest edge strategy as described in
[16]_.
``'steepest-devex'`` begins with the exact steepest edge strategy
until the computation is too costly or inexact and then switches to
the devex method.
Curently, using ``None`` always selects ``'steepest-devex'``, but this
may change as new options become available.
unknown_options : dict
Optional arguments not used by this particular solver. If
``unknown_options`` is non-empty, a warning is issued listing all
unused options.
Returns
-------
sol : dict
A dictionary consisting of the fields:
x : 1D array
The values of the decision variables that minimizes the
objective function while satisfying the constraints.
fun : float
The optimal value of the objective function ``c @ x``.
slack : 1D array
The (nominally positive) values of the slack,
``b_ub - A_ub @ x``.
con : 1D array
The (nominally zero) residuals of the equality constraints,
``b_eq - A_eq @ x``.
success : bool
``True`` when the algorithm succeeds in finding an optimal
solution.
status : int
An integer representing the exit status of the algorithm.
``0`` : Optimization terminated successfully.
``1`` : Iteration or time limit reached.
``2`` : Problem appears to be infeasible.
``3`` : Problem appears to be unbounded.
``4`` : The HiGHS solver ran into a problem.
message : str
A string descriptor of the exit status of the algorithm.
nit : int
The total number of iterations performed.
For ``solver='simplex'``, this includes iterations in all
phases. For ``solver='ipm'``, this does not include
crossover iterations.
crossover_nit : int
The number of primal/dual pushes performed during the
crossover routine for ``solver='ipm'``. This is ``0``
for ``solver='simplex'``.
ineqlin : OptimizeResult
Solution and sensitivity information corresponding to the
inequality constraints, `b_ub`. A dictionary consisting of the
fields:
residual : np.ndnarray
The (nominally positive) values of the slack variables,
``b_ub - A_ub @ x``. This quantity is also commonly
referred to as "slack".
marginals : np.ndarray
The sensitivity (partial derivative) of the objective
function with respect to the right-hand side of the
inequality constraints, `b_ub`.
eqlin : OptimizeResult
Solution and sensitivity information corresponding to the
equality constraints, `b_eq`. A dictionary consisting of the
fields:
residual : np.ndarray
The (nominally zero) residuals of the equality constraints,
``b_eq - A_eq @ x``.
marginals : np.ndarray
The sensitivity (partial derivative) of the objective
function with respect to the right-hand side of the
equality constraints, `b_eq`.
lower, upper : OptimizeResult
Solution and sensitivity information corresponding to the
lower and upper bounds on decision variables, `bounds`.
residual : np.ndarray
The (nominally positive) values of the quantity
``x - lb`` (lower) or ``ub - x`` (upper).
marginals : np.ndarray
The sensitivity (partial derivative) of the objective
function with respect to the lower and upper
`bounds`.
Notes
-----
The result fields `ineqlin`, `eqlin`, `lower`, and `upper` all contain
`marginals`, or partial derivatives of the objective function with respect
to the right-hand side of each constraint. These partial derivatives are
also referred to as "Lagrange multipliers", "dual values", and
"shadow prices". The sign convention of `marginals` is opposite that
of Lagrange multipliers produced by many nonlinear solvers.
References
----------
.. [15] Harris, Paula MJ. "Pivot selection methods of the Devex LP code."
Mathematical programming 5.1 (1973): 1-28.
.. [16] Goldfarb, Donald, and John Ker Reid. "A practicable steepest-edge
simplex algorithm." Mathematical Programming 12.1 (1977): 361-371.
"""
_check_unknown_options(unknown_options)
# Map options to HiGHS enum values
simplex_dual_edge_weight_strategy_enum = _convert_to_highs_enum(
simplex_dual_edge_weight_strategy,
'simplex_dual_edge_weight_strategy',
choices={'dantzig': HIGHS_SIMPLEX_DUAL_EDGE_WEIGHT_STRATEGY_DANTZIG,
'devex': HIGHS_SIMPLEX_DUAL_EDGE_WEIGHT_STRATEGY_DEVEX,
'steepest-devex': HIGHS_SIMPLEX_DUAL_EDGE_WEIGHT_STRATEGY_CHOOSE,
'steepest':
HIGHS_SIMPLEX_DUAL_EDGE_WEIGHT_STRATEGY_STEEPEST_EDGE,
None: None})
statuses = {
MODEL_STATUS_NOTSET: (
4,
'HiGHS Status Code 0: HighsModelStatusNOTSET',
),
MODEL_STATUS_LOAD_ERROR: (
4,
'HiGHS Status Code 1: HighsModelStatusLOAD_ERROR',
),
MODEL_STATUS_MODEL_ERROR: (
2,
'HiGHS Status Code 2: HighsModelStatusMODEL_ERROR',
),
MODEL_STATUS_PRESOLVE_ERROR: (
4,
'HiGHS Status Code 4: HighsModelStatusPRESOLVE_ERROR',
),
MODEL_STATUS_SOLVE_ERROR: (
4,
'HiGHS Status Code 5: HighsModelStatusSOLVE_ERROR',
),
MODEL_STATUS_POSTSOLVE_ERROR: (
4,
'HiGHS Status Code 6: HighsModelStatusPOSTSOLVE_ERROR',
),
MODEL_STATUS_MODEL_EMPTY: (
4,
'HiGHS Status Code 3: HighsModelStatusMODEL_EMPTY',
),
MODEL_STATUS_RDOVUB: (
4,
'HiGHS Status Code 10: '
'HighsModelStatusREACHED_DUAL_OBJECTIVE_VALUE_UPPER_BOUND',
),
MODEL_STATUS_PRIMAL_INFEASIBLE: (
2,
"The problem is infeasible.",
),
MODEL_STATUS_PRIMAL_UNBOUNDED: (
3,
"The problem is unbounded.",
),
MODEL_STATUS_OPTIMAL: (
0,
"Optimization terminated successfully.",
),
MODEL_STATUS_REACHED_TIME_LIMIT: (
1,
"Time limit reached.",
),
MODEL_STATUS_REACHED_ITERATION_LIMIT: (
1,
"Iteration limit reached.",
),
MODEL_STATUS_PRIMAL_DUAL_INFEASIBLE: (
2,
"The problem is primal/dual infeasible.",
),
MODEL_STATUS_DUAL_INFEASIBLE: (
2,
"The problem is dual infeasible.",
),
}
c, A_ub, b_ub, A_eq, b_eq, bounds, x0 = lp
lb, ub = bounds.T.copy() # separate bounds, copy->C-cntgs
# highs_wrapper solves LHS <= A*x <= RHS, not equality constraints
lhs_ub = -np.ones_like(b_ub)*np.inf # LHS of UB constraints is -inf
rhs_ub = b_ub # RHS of UB constraints is b_ub
lhs_eq = b_eq # Equality constaint is inequality
rhs_eq = b_eq # constraint with LHS=RHS
lhs = np.concatenate((lhs_ub, lhs_eq))
rhs = np.concatenate((rhs_ub, rhs_eq))
if issparse(A_ub) or issparse(A_eq):
A = vstack((A_ub, A_eq))
else:
A = np.vstack((A_ub, A_eq))
A = csc_matrix(A)
options = {
'presolve': presolve,
'sense': 1, # minimization
'solver': solver,
'time_limit': time_limit,
'message_level': MESSAGE_LEVEL_MINIMAL * disp,
'dual_feasibility_tolerance': dual_feasibility_tolerance,
'ipm_optimality_tolerance': ipm_optimality_tolerance,
'primal_feasibility_tolerance': primal_feasibility_tolerance,
'simplex_dual_edge_weight_strategy':
simplex_dual_edge_weight_strategy_enum,
'simplex_strategy': HIGHS_SIMPLEX_STRATEGY_DUAL,
'simplex_crash_strategy': HIGHS_SIMPLEX_CRASH_STRATEGY_OFF,
'ipm_iteration_limit': maxiter,
'simplex_iteration_limit': maxiter,
}
# np.inf doesn't work; use very large constant
rhs = _replace_inf(rhs)
lhs = _replace_inf(lhs)
lb = _replace_inf(lb)
ub = _replace_inf(ub)
res = _highs_wrapper(c, A.indptr, A.indices, A.data, lhs, rhs,
lb, ub, options)
# HiGHS represents constraints as lhs/rhs, so
# Ax + s = b => Ax = b - s
# and we need to split up s by A_ub and A_eq
if 'slack' in res:
slack = res['slack']
con = np.array(slack[len(b_ub):])
slack = np.array(slack[:len(b_ub)])
else:
slack, con = None, None
# lagrange multipliers for equalities/inequalities and upper/lower bounds
if 'lambda' in res:
lamda = res['lambda']
marg_ineqlin = np.array(lamda[:len(b_ub)])
marg_eqlin = np.array(lamda[len(b_ub):])
marg_upper = np.array(res['marg_bnds'][1, :])
marg_lower = np.array(res['marg_bnds'][0, :])
else:
marg_ineqlin, marg_eqlin = None, None
marg_upper, marg_lower = None, None
# this needs to be updated if we start choosing the solver intelligently
solvers = {"ipm": "highs-ipm", "simplex": "highs-ds", None: "highs-ds"}
# HiGHS will report OPTIMAL if the scaled model is solved to optimality
# even if the unscaled original model is infeasible;
# Catch that case here and provide a more useful message
if ((res['status'] == MODEL_STATUS_OPTIMAL) and
(res['unscaled_status'] != res['status'])):
_unscaled_status, unscaled_message = statuses[res["unscaled_status"]]
status, message = 4, ('An optimal solution to the scaled model was '
f'found but was {unscaled_message} in the '
'unscaled model. For more information run with '
'the option `disp: True`.')
else:
status, message = statuses[res['status']]
x = np.array(res['x']) if 'x' in res else None
sol = {'x': x,
'slack': slack,
'con': con,
'ineqlin': OptimizeResult({
'residual': slack,
'marginals': marg_ineqlin,
}),
'eqlin': OptimizeResult({
'residual': con,
'marginals': marg_eqlin,
}),
'lower': OptimizeResult({
'residual': None if x is None else x - lb,
'marginals': marg_lower,
}),
'upper': OptimizeResult({
'residual': None if x is None else ub - x,
'marginals': marg_upper
}),
'fun': res.get('fun'),
'status': status,
'success': res['status'] == MODEL_STATUS_OPTIMAL,
'message': message,
'nit': res.get('simplex_nit', 0) or res.get('ipm_nit', 0),
'crossover_nit': res.get('crossover_nit'),
}
return sol

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,572 @@
"""Revised simplex method for linear programming
The *revised simplex* method uses the method described in [1]_, except
that a factorization [2]_ of the basis matrix, rather than its inverse,
is efficiently maintained and used to solve the linear systems at each
iteration of the algorithm.
.. versionadded:: 1.3.0
References
----------
.. [1] Bertsimas, Dimitris, and J. Tsitsiklis. "Introduction to linear
programming." Athena Scientific 1 (1997): 997.
.. [2] Bartels, Richard H. "A stabilization of the simplex method."
Journal in Numerische Mathematik 16.5 (1971): 414-434.
"""
# Author: Matt Haberland
import numpy as np
from numpy.linalg import LinAlgError
from scipy.linalg import solve
from ._optimize import _check_unknown_options
from ._bglu_dense import LU
from ._bglu_dense import BGLU as BGLU
from ._linprog_util import _postsolve
from ._optimize import OptimizeResult
def _phase_one(A, b, x0, callback, postsolve_args, maxiter, tol, disp,
maxupdate, mast, pivot):
"""
The purpose of phase one is to find an initial basic feasible solution
(BFS) to the original problem.
Generates an auxiliary problem with a trivial BFS and an objective that
minimizes infeasibility of the original problem. Solves the auxiliary
problem using the main simplex routine (phase two). This either yields
a BFS to the original problem or determines that the original problem is
infeasible. If feasible, phase one detects redundant rows in the original
constraint matrix and removes them, then chooses additional indices as
necessary to complete a basis/BFS for the original problem.
"""
m, n = A.shape
status = 0
# generate auxiliary problem to get initial BFS
A, b, c, basis, x, status = _generate_auxiliary_problem(A, b, x0, tol)
if status == 6:
residual = c.dot(x)
iter_k = 0
return x, basis, A, b, residual, status, iter_k
# solve auxiliary problem
phase_one_n = n
iter_k = 0
x, basis, status, iter_k = _phase_two(c, A, x, basis, callback,
postsolve_args,
maxiter, tol, disp,
maxupdate, mast, pivot,
iter_k, phase_one_n)
# check for infeasibility
residual = c.dot(x)
if status == 0 and residual > tol:
status = 2
# drive artificial variables out of basis
# TODO: test redundant row removal better
# TODO: make solve more efficient with BGLU? This could take a while.
keep_rows = np.ones(m, dtype=bool)
for basis_column in basis[basis >= n]:
B = A[:, basis]
try:
basis_finder = np.abs(solve(B, A)) # inefficient
pertinent_row = np.argmax(basis_finder[:, basis_column])
eligible_columns = np.ones(n, dtype=bool)
eligible_columns[basis[basis < n]] = 0
eligible_column_indices = np.where(eligible_columns)[0]
index = np.argmax(basis_finder[:, :n]
[pertinent_row, eligible_columns])
new_basis_column = eligible_column_indices[index]
if basis_finder[pertinent_row, new_basis_column] < tol:
keep_rows[pertinent_row] = False
else:
basis[basis == basis_column] = new_basis_column
except LinAlgError:
status = 4
# form solution to original problem
A = A[keep_rows, :n]
basis = basis[keep_rows]
x = x[:n]
m = A.shape[0]
return x, basis, A, b, residual, status, iter_k
def _get_more_basis_columns(A, basis):
"""
Called when the auxiliary problem terminates with artificial columns in
the basis, which must be removed and replaced with non-artificial
columns. Finds additional columns that do not make the matrix singular.
"""
m, n = A.shape
# options for inclusion are those that aren't already in the basis
a = np.arange(m+n)
bl = np.zeros(len(a), dtype=bool)
bl[basis] = 1
options = a[~bl]
options = options[options < n] # and they have to be non-artificial
# form basis matrix
B = np.zeros((m, m))
B[:, 0:len(basis)] = A[:, basis]
if (basis.size > 0 and
np.linalg.matrix_rank(B[:, :len(basis)]) < len(basis)):
raise Exception("Basis has dependent columns")
rank = 0 # just enter the loop
for i in range(n): # somewhat arbitrary, but we need another way out
# permute the options, and take as many as needed
new_basis = np.random.permutation(options)[:m-len(basis)]
B[:, len(basis):] = A[:, new_basis] # update the basis matrix
rank = np.linalg.matrix_rank(B) # check the rank
if rank == m:
break
return np.concatenate((basis, new_basis))
def _generate_auxiliary_problem(A, b, x0, tol):
"""
Modifies original problem to create an auxiliary problem with a trivial
initial basic feasible solution and an objective that minimizes
infeasibility in the original problem.
Conceptually, this is done by stacking an identity matrix on the right of
the original constraint matrix, adding artificial variables to correspond
with each of these new columns, and generating a cost vector that is all
zeros except for ones corresponding with each of the new variables.
A initial basic feasible solution is trivial: all variables are zero
except for the artificial variables, which are set equal to the
corresponding element of the right hand side `b`.
Runnning the simplex method on this auxiliary problem drives all of the
artificial variables - and thus the cost - to zero if the original problem
is feasible. The original problem is declared infeasible otherwise.
Much of the complexity below is to improve efficiency by using singleton
columns in the original problem where possible, thus generating artificial
variables only as necessary, and using an initial 'guess' basic feasible
solution.
"""
status = 0
m, n = A.shape
if x0 is not None:
x = x0
else:
x = np.zeros(n)
r = b - A@x # residual; this must be all zeros for feasibility
A[r < 0] = -A[r < 0] # express problem with RHS positive for trivial BFS
b[r < 0] = -b[r < 0] # to the auxiliary problem
r[r < 0] *= -1
# Rows which we will need to find a trivial way to zero.
# This should just be the rows where there is a nonzero residual.
# But then we would not necessarily have a column singleton in every row.
# This makes it difficult to find an initial basis.
if x0 is None:
nonzero_constraints = np.arange(m)
else:
nonzero_constraints = np.where(r > tol)[0]
# these are (at least some of) the initial basis columns
basis = np.where(np.abs(x) > tol)[0]
if len(nonzero_constraints) == 0 and len(basis) <= m: # already a BFS
c = np.zeros(n)
basis = _get_more_basis_columns(A, basis)
return A, b, c, basis, x, status
elif (len(nonzero_constraints) > m - len(basis) or
np.any(x < 0)): # can't get trivial BFS
c = np.zeros(n)
status = 6
return A, b, c, basis, x, status
# chooses existing columns appropriate for inclusion in initial basis
cols, rows = _select_singleton_columns(A, r)
# find the rows we need to zero that we _can_ zero with column singletons
i_tofix = np.isin(rows, nonzero_constraints)
# these columns can't already be in the basis, though
# we are going to add them to the basis and change the corresponding x val
i_notinbasis = np.logical_not(np.isin(cols, basis))
i_fix_without_aux = np.logical_and(i_tofix, i_notinbasis)
rows = rows[i_fix_without_aux]
cols = cols[i_fix_without_aux]
# indices of the rows we can only zero with auxiliary variable
# these rows will get a one in each auxiliary column
arows = nonzero_constraints[np.logical_not(
np.isin(nonzero_constraints, rows))]
n_aux = len(arows)
acols = n + np.arange(n_aux) # indices of auxiliary columns
basis_ng = np.concatenate((cols, acols)) # basis columns not from guess
basis_ng_rows = np.concatenate((rows, arows)) # rows we need to zero
# add auxiliary singleton columns
A = np.hstack((A, np.zeros((m, n_aux))))
A[arows, acols] = 1
# generate initial BFS
x = np.concatenate((x, np.zeros(n_aux)))
x[basis_ng] = r[basis_ng_rows]/A[basis_ng_rows, basis_ng]
# generate costs to minimize infeasibility
c = np.zeros(n_aux + n)
c[acols] = 1
# basis columns correspond with nonzeros in guess, those with column
# singletons we used to zero remaining constraints, and any additional
# columns to get a full set (m columns)
basis = np.concatenate((basis, basis_ng))
basis = _get_more_basis_columns(A, basis) # add columns as needed
return A, b, c, basis, x, status
def _select_singleton_columns(A, b):
"""
Finds singleton columns for which the singleton entry is of the same sign
as the right-hand side; these columns are eligible for inclusion in an
initial basis. Determines the rows in which the singleton entries are
located. For each of these rows, returns the indices of the one singleton
column and its corresponding row.
"""
# find indices of all singleton columns and corresponding row indicies
column_indices = np.nonzero(np.sum(np.abs(A) != 0, axis=0) == 1)[0]
columns = A[:, column_indices] # array of singleton columns
row_indices = np.zeros(len(column_indices), dtype=int)
nonzero_rows, nonzero_columns = np.nonzero(columns)
row_indices[nonzero_columns] = nonzero_rows # corresponding row indicies
# keep only singletons with entries that have same sign as RHS
# this is necessary because all elements of BFS must be non-negative
same_sign = A[row_indices, column_indices]*b[row_indices] >= 0
column_indices = column_indices[same_sign][::-1]
row_indices = row_indices[same_sign][::-1]
# Reversing the order so that steps below select rightmost columns
# for initial basis, which will tend to be slack variables. (If the
# guess corresponds with a basic feasible solution but a constraint
# is not satisfied with the corresponding slack variable zero, the slack
# variable must be basic.)
# for each row, keep rightmost singleton column with an entry in that row
unique_row_indices, first_columns = np.unique(row_indices,
return_index=True)
return column_indices[first_columns], unique_row_indices
def _find_nonzero_rows(A, tol):
"""
Returns logical array indicating the locations of rows with at least
one nonzero element.
"""
return np.any(np.abs(A) > tol, axis=1)
def _select_enter_pivot(c_hat, bl, a, rule="bland", tol=1e-12):
"""
Selects a pivot to enter the basis. Currently Bland's rule - the smallest
index that has a negative reduced cost - is the default.
"""
if rule.lower() == "mrc": # index with minimum reduced cost
return a[~bl][np.argmin(c_hat)]
else: # smallest index w/ negative reduced cost
return a[~bl][c_hat < -tol][0]
def _display_iter(phase, iteration, slack, con, fun):
"""
Print indicators of optimization status to the console.
"""
header = True if not iteration % 20 else False
if header:
print("Phase",
"Iteration",
"Minimum Slack ",
"Constraint Residual",
"Objective ")
# :<X.Y left aligns Y digits in X digit spaces
fmt = '{0:<6}{1:<10}{2:<20.13}{3:<20.13}{4:<20.13}'
try:
slack = np.min(slack)
except ValueError:
slack = "NA"
print(fmt.format(phase, iteration, slack, np.linalg.norm(con), fun))
def _display_and_callback(phase_one_n, x, postsolve_args, status,
iteration, disp, callback):
if phase_one_n is not None:
phase = 1
x_postsolve = x[:phase_one_n]
else:
phase = 2
x_postsolve = x
x_o, fun, slack, con = _postsolve(x_postsolve,
postsolve_args)
if callback is not None:
res = OptimizeResult({'x': x_o, 'fun': fun, 'slack': slack,
'con': con, 'nit': iteration,
'phase': phase, 'complete': False,
'status': status, 'message': "",
'success': False})
callback(res)
if disp:
_display_iter(phase, iteration, slack, con, fun)
def _phase_two(c, A, x, b, callback, postsolve_args, maxiter, tol, disp,
maxupdate, mast, pivot, iteration=0, phase_one_n=None):
"""
The heart of the simplex method. Beginning with a basic feasible solution,
moves to adjacent basic feasible solutions successively lower reduced cost.
Terminates when there are no basic feasible solutions with lower reduced
cost or if the problem is determined to be unbounded.
This implementation follows the revised simplex method based on LU
decomposition. Rather than maintaining a tableau or an inverse of the
basis matrix, we keep a factorization of the basis matrix that allows
efficient solution of linear systems while avoiding stability issues
associated with inverted matrices.
"""
m, n = A.shape
status = 0
a = np.arange(n) # indices of columns of A
ab = np.arange(m) # indices of columns of B
if maxupdate:
# basis matrix factorization object; similar to B = A[:, b]
B = BGLU(A, b, maxupdate, mast)
else:
B = LU(A, b)
for iteration in range(iteration, maxiter):
if disp or callback is not None:
_display_and_callback(phase_one_n, x, postsolve_args, status,
iteration, disp, callback)
bl = np.zeros(len(a), dtype=bool)
bl[b] = 1
xb = x[b] # basic variables
cb = c[b] # basic costs
try:
v = B.solve(cb, transposed=True) # similar to v = solve(B.T, cb)
except LinAlgError:
status = 4
break
# TODO: cythonize?
c_hat = c - v.dot(A) # reduced cost
c_hat = c_hat[~bl]
# Above is much faster than:
# N = A[:, ~bl] # slow!
# c_hat = c[~bl] - v.T.dot(N)
# Can we perform the multiplication only on the nonbasic columns?
if np.all(c_hat >= -tol): # all reduced costs positive -> terminate
break
j = _select_enter_pivot(c_hat, bl, a, rule=pivot, tol=tol)
u = B.solve(A[:, j]) # similar to u = solve(B, A[:, j])
i = u > tol # if none of the u are positive, unbounded
if not np.any(i):
status = 3
break
th = xb[i]/u[i]
l = np.argmin(th) # implicitly selects smallest subscript
th_star = th[l] # step size
x[b] = x[b] - th_star*u # take step
x[j] = th_star
B.update(ab[i][l], j) # modify basis
b = B.b # similar to b[ab[i][l]] =
else:
# If the end of the for loop is reached (without a break statement),
# then another step has been taken, so the iteration counter should
# increment, info should be displayed, and callback should be called.
iteration += 1
status = 1
if disp or callback is not None:
_display_and_callback(phase_one_n, x, postsolve_args, status,
iteration, disp, callback)
return x, b, status, iteration
def _linprog_rs(c, c0, A, b, x0, callback, postsolve_args,
maxiter=5000, tol=1e-12, disp=False,
maxupdate=10, mast=False, pivot="mrc",
**unknown_options):
"""
Solve the following linear programming problem via a two-phase
revised simplex algorithm.::
minimize: c @ x
subject to: A @ x == b
0 <= x < oo
User-facing documentation is in _linprog_doc.py.
Parameters
----------
c : 1-D array
Coefficients of the linear objective function to be minimized.
c0 : float
Constant term in objective function due to fixed (and eliminated)
variables. (Currently unused.)
A : 2-D array
2-D array which, when matrix-multiplied by ``x``, gives the values of
the equality constraints at ``x``.
b : 1-D array
1-D array of values representing the RHS of each equality constraint
(row) in ``A_eq``.
x0 : 1-D array, optional
Starting values of the independent variables, which will be refined by
the optimization algorithm. For the revised simplex method, these must
correspond with a basic feasible solution.
callback : callable, optional
If a callback function is provided, it will be called within each
iteration of the algorithm. The callback function must accept a single
`scipy.optimize.OptimizeResult` consisting of the following fields:
x : 1-D array
Current solution vector.
fun : float
Current value of the objective function ``c @ x``.
success : bool
True only when an algorithm has completed successfully,
so this is always False as the callback function is called
only while the algorithm is still iterating.
slack : 1-D array
The values of the slack variables. Each slack variable
corresponds to an inequality constraint. If the slack is zero,
the corresponding constraint is active.
con : 1-D array
The (nominally zero) residuals of the equality constraints,
that is, ``b - A_eq @ x``.
phase : int
The phase of the algorithm being executed.
status : int
For revised simplex, this is always 0 because if a different
status is detected, the algorithm terminates.
nit : int
The number of iterations performed.
message : str
A string descriptor of the exit status of the optimization.
postsolve_args : tuple
Data needed by _postsolve to convert the solution to the standard-form
problem into the solution to the original problem.
Options
-------
maxiter : int
The maximum number of iterations to perform in either phase.
tol : float
The tolerance which determines when a solution is "close enough" to
zero in Phase 1 to be considered a basic feasible solution or close
enough to positive to serve as an optimal solution.
disp : bool
Set to ``True`` if indicators of optimization status are to be printed
to the console each iteration.
maxupdate : int
The maximum number of updates performed on the LU factorization.
After this many updates is reached, the basis matrix is factorized
from scratch.
mast : bool
Minimize Amortized Solve Time. If enabled, the average time to solve
a linear system using the basis factorization is measured. Typically,
the average solve time will decrease with each successive solve after
initial factorization, as factorization takes much more time than the
solve operation (and updates). Eventually, however, the updated
factorization becomes sufficiently complex that the average solve time
begins to increase. When this is detected, the basis is refactorized
from scratch. Enable this option to maximize speed at the risk of
nondeterministic behavior. Ignored if ``maxupdate`` is 0.
pivot : "mrc" or "bland"
Pivot rule: Minimum Reduced Cost (default) or Bland's rule. Choose
Bland's rule if iteration limit is reached and cycling is suspected.
unknown_options : dict
Optional arguments not used by this particular solver. If
`unknown_options` is non-empty a warning is issued listing all
unused options.
Returns
-------
x : 1-D array
Solution vector.
status : int
An integer representing the exit status of the optimization::
0 : Optimization terminated successfully
1 : Iteration limit reached
2 : Problem appears to be infeasible
3 : Problem appears to be unbounded
4 : Numerical difficulties encountered
5 : No constraints; turn presolve on
6 : Guess x0 cannot be converted to a basic feasible solution
message : str
A string descriptor of the exit status of the optimization.
iteration : int
The number of iterations taken to solve the problem.
"""
_check_unknown_options(unknown_options)
messages = ["Optimization terminated successfully.",
"Iteration limit reached.",
"The problem appears infeasible, as the phase one auxiliary "
"problem terminated successfully with a residual of {0:.1e}, "
"greater than the tolerance {1} required for the solution to "
"be considered feasible. Consider increasing the tolerance to "
"be greater than {0:.1e}. If this tolerance is unnaceptably "
"large, the problem is likely infeasible.",
"The problem is unbounded, as the simplex algorithm found "
"a basic feasible solution from which there is a direction "
"with negative reduced cost in which all decision variables "
"increase.",
"Numerical difficulties encountered; consider trying "
"method='interior-point'.",
"Problems with no constraints are trivially solved; please "
"turn presolve on.",
"The guess x0 cannot be converted to a basic feasible "
"solution. "
]
if A.size == 0: # address test_unbounded_below_no_presolve_corrected
return np.zeros(c.shape), 5, messages[5], 0
x, basis, A, b, residual, status, iteration = (
_phase_one(A, b, x0, callback, postsolve_args,
maxiter, tol, disp, maxupdate, mast, pivot))
if status == 0:
x, basis, status, iteration = _phase_two(c, A, x, basis, callback,
postsolve_args,
maxiter, tol, disp,
maxupdate, mast, pivot,
iteration)
return x, status, messages[status].format(residual, tol), iteration

View File

@@ -0,0 +1,661 @@
"""Simplex method for linear programming
The *simplex* method uses a traditional, full-tableau implementation of
Dantzig's simplex algorithm [1]_, [2]_ (*not* the Nelder-Mead simplex).
This algorithm is included for backwards compatibility and educational
purposes.
.. versionadded:: 0.15.0
Warnings
--------
The simplex method may encounter numerical difficulties when pivot
values are close to the specified tolerance. If encountered try
remove any redundant constraints, change the pivot strategy to Bland's
rule or increase the tolerance value.
Alternatively, more robust methods maybe be used. See
:ref:`'interior-point' <optimize.linprog-interior-point>` and
:ref:`'revised simplex' <optimize.linprog-revised_simplex>`.
References
----------
.. [1] Dantzig, George B., Linear programming and extensions. Rand
Corporation Research Study Princeton Univ. Press, Princeton, NJ,
1963
.. [2] Hillier, S.H. and Lieberman, G.J. (1995), "Introduction to
Mathematical Programming", McGraw-Hill, Chapter 4.
"""
import numpy as np
from warnings import warn
from ._optimize import OptimizeResult, OptimizeWarning, _check_unknown_options
from ._linprog_util import _postsolve
def _pivot_col(T, tol=1e-9, bland=False):
"""
Given a linear programming simplex tableau, determine the column
of the variable to enter the basis.
Parameters
----------
T : 2-D array
A 2-D array representing the simplex tableau, T, corresponding to the
linear programming problem. It should have the form:
[[A[0, 0], A[0, 1], ..., A[0, n_total], b[0]],
[A[1, 0], A[1, 1], ..., A[1, n_total], b[1]],
.
.
.
[A[m, 0], A[m, 1], ..., A[m, n_total], b[m]],
[c[0], c[1], ..., c[n_total], 0]]
for a Phase 2 problem, or the form:
[[A[0, 0], A[0, 1], ..., A[0, n_total], b[0]],
[A[1, 0], A[1, 1], ..., A[1, n_total], b[1]],
.
.
.
[A[m, 0], A[m, 1], ..., A[m, n_total], b[m]],
[c[0], c[1], ..., c[n_total], 0],
[c'[0], c'[1], ..., c'[n_total], 0]]
for a Phase 1 problem (a problem in which a basic feasible solution is
sought prior to maximizing the actual objective. ``T`` is modified in
place by ``_solve_simplex``.
tol : float
Elements in the objective row larger than -tol will not be considered
for pivoting. Nominally this value is zero, but numerical issues
cause a tolerance about zero to be necessary.
bland : bool
If True, use Bland's rule for selection of the column (select the
first column with a negative coefficient in the objective row,
regardless of magnitude).
Returns
-------
status: bool
True if a suitable pivot column was found, otherwise False.
A return of False indicates that the linear programming simplex
algorithm is complete.
col: int
The index of the column of the pivot element.
If status is False, col will be returned as nan.
"""
ma = np.ma.masked_where(T[-1, :-1] >= -tol, T[-1, :-1], copy=False)
if ma.count() == 0:
return False, np.nan
if bland:
# ma.mask is sometimes 0d
return True, np.nonzero(np.logical_not(np.atleast_1d(ma.mask)))[0][0]
return True, np.ma.nonzero(ma == ma.min())[0][0]
def _pivot_row(T, basis, pivcol, phase, tol=1e-9, bland=False):
"""
Given a linear programming simplex tableau, determine the row for the
pivot operation.
Parameters
----------
T : 2-D array
A 2-D array representing the simplex tableau, T, corresponding to the
linear programming problem. It should have the form:
[[A[0, 0], A[0, 1], ..., A[0, n_total], b[0]],
[A[1, 0], A[1, 1], ..., A[1, n_total], b[1]],
.
.
.
[A[m, 0], A[m, 1], ..., A[m, n_total], b[m]],
[c[0], c[1], ..., c[n_total], 0]]
for a Phase 2 problem, or the form:
[[A[0, 0], A[0, 1], ..., A[0, n_total], b[0]],
[A[1, 0], A[1, 1], ..., A[1, n_total], b[1]],
.
.
.
[A[m, 0], A[m, 1], ..., A[m, n_total], b[m]],
[c[0], c[1], ..., c[n_total], 0],
[c'[0], c'[1], ..., c'[n_total], 0]]
for a Phase 1 problem (a Problem in which a basic feasible solution is
sought prior to maximizing the actual objective. ``T`` is modified in
place by ``_solve_simplex``.
basis : array
A list of the current basic variables.
pivcol : int
The index of the pivot column.
phase : int
The phase of the simplex algorithm (1 or 2).
tol : float
Elements in the pivot column smaller than tol will not be considered
for pivoting. Nominally this value is zero, but numerical issues
cause a tolerance about zero to be necessary.
bland : bool
If True, use Bland's rule for selection of the row (if more than one
row can be used, choose the one with the lowest variable index).
Returns
-------
status: bool
True if a suitable pivot row was found, otherwise False. A return
of False indicates that the linear programming problem is unbounded.
row: int
The index of the row of the pivot element. If status is False, row
will be returned as nan.
"""
if phase == 1:
k = 2
else:
k = 1
ma = np.ma.masked_where(T[:-k, pivcol] <= tol, T[:-k, pivcol], copy=False)
if ma.count() == 0:
return False, np.nan
mb = np.ma.masked_where(T[:-k, pivcol] <= tol, T[:-k, -1], copy=False)
q = mb / ma
min_rows = np.ma.nonzero(q == q.min())[0]
if bland:
return True, min_rows[np.argmin(np.take(basis, min_rows))]
return True, min_rows[0]
def _apply_pivot(T, basis, pivrow, pivcol, tol=1e-9):
"""
Pivot the simplex tableau inplace on the element given by (pivrow, pivol).
The entering variable corresponds to the column given by pivcol forcing
the variable basis[pivrow] to leave the basis.
Parameters
----------
T : 2-D array
A 2-D array representing the simplex tableau, T, corresponding to the
linear programming problem. It should have the form:
[[A[0, 0], A[0, 1], ..., A[0, n_total], b[0]],
[A[1, 0], A[1, 1], ..., A[1, n_total], b[1]],
.
.
.
[A[m, 0], A[m, 1], ..., A[m, n_total], b[m]],
[c[0], c[1], ..., c[n_total], 0]]
for a Phase 2 problem, or the form:
[[A[0, 0], A[0, 1], ..., A[0, n_total], b[0]],
[A[1, 0], A[1, 1], ..., A[1, n_total], b[1]],
.
.
.
[A[m, 0], A[m, 1], ..., A[m, n_total], b[m]],
[c[0], c[1], ..., c[n_total], 0],
[c'[0], c'[1], ..., c'[n_total], 0]]
for a Phase 1 problem (a problem in which a basic feasible solution is
sought prior to maximizing the actual objective. ``T`` is modified in
place by ``_solve_simplex``.
basis : 1-D array
An array of the indices of the basic variables, such that basis[i]
contains the column corresponding to the basic variable for row i.
Basis is modified in place by _apply_pivot.
pivrow : int
Row index of the pivot.
pivcol : int
Column index of the pivot.
"""
basis[pivrow] = pivcol
pivval = T[pivrow, pivcol]
T[pivrow] = T[pivrow] / pivval
for irow in range(T.shape[0]):
if irow != pivrow:
T[irow] = T[irow] - T[pivrow] * T[irow, pivcol]
# The selected pivot should never lead to a pivot value less than the tol.
if np.isclose(pivval, tol, atol=0, rtol=1e4):
message = (
"The pivot operation produces a pivot value of:{0: .1e}, "
"which is only slightly greater than the specified "
"tolerance{1: .1e}. This may lead to issues regarding the "
"numerical stability of the simplex method. "
"Removing redundant constraints, changing the pivot strategy "
"via Bland's rule or increasing the tolerance may "
"help reduce the issue.".format(pivval, tol))
warn(message, OptimizeWarning, stacklevel=5)
def _solve_simplex(T, n, basis, callback, postsolve_args,
maxiter=1000, tol=1e-9, phase=2, bland=False, nit0=0,
):
"""
Solve a linear programming problem in "standard form" using the Simplex
Method. Linear Programming is intended to solve the following problem form:
Minimize::
c @ x
Subject to::
A @ x == b
x >= 0
Parameters
----------
T : 2-D array
A 2-D array representing the simplex tableau, T, corresponding to the
linear programming problem. It should have the form:
[[A[0, 0], A[0, 1], ..., A[0, n_total], b[0]],
[A[1, 0], A[1, 1], ..., A[1, n_total], b[1]],
.
.
.
[A[m, 0], A[m, 1], ..., A[m, n_total], b[m]],
[c[0], c[1], ..., c[n_total], 0]]
for a Phase 2 problem, or the form:
[[A[0, 0], A[0, 1], ..., A[0, n_total], b[0]],
[A[1, 0], A[1, 1], ..., A[1, n_total], b[1]],
.
.
.
[A[m, 0], A[m, 1], ..., A[m, n_total], b[m]],
[c[0], c[1], ..., c[n_total], 0],
[c'[0], c'[1], ..., c'[n_total], 0]]
for a Phase 1 problem (a problem in which a basic feasible solution is
sought prior to maximizing the actual objective. ``T`` is modified in
place by ``_solve_simplex``.
n : int
The number of true variables in the problem.
basis : 1-D array
An array of the indices of the basic variables, such that basis[i]
contains the column corresponding to the basic variable for row i.
Basis is modified in place by _solve_simplex
callback : callable, optional
If a callback function is provided, it will be called within each
iteration of the algorithm. The callback must accept a
`scipy.optimize.OptimizeResult` consisting of the following fields:
x : 1-D array
Current solution vector
fun : float
Current value of the objective function
success : bool
True only when a phase has completed successfully. This
will be False for most iterations.
slack : 1-D array
The values of the slack variables. Each slack variable
corresponds to an inequality constraint. If the slack is zero,
the corresponding constraint is active.
con : 1-D array
The (nominally zero) residuals of the equality constraints,
that is, ``b - A_eq @ x``
phase : int
The phase of the optimization being executed. In phase 1 a basic
feasible solution is sought and the T has an additional row
representing an alternate objective function.
status : int
An integer representing the exit status of the optimization::
0 : Optimization terminated successfully
1 : Iteration limit reached
2 : Problem appears to be infeasible
3 : Problem appears to be unbounded
4 : Serious numerical difficulties encountered
nit : int
The number of iterations performed.
message : str
A string descriptor of the exit status of the optimization.
postsolve_args : tuple
Data needed by _postsolve to convert the solution to the standard-form
problem into the solution to the original problem.
maxiter : int
The maximum number of iterations to perform before aborting the
optimization.
tol : float
The tolerance which determines when a solution is "close enough" to
zero in Phase 1 to be considered a basic feasible solution or close
enough to positive to serve as an optimal solution.
phase : int
The phase of the optimization being executed. In phase 1 a basic
feasible solution is sought and the T has an additional row
representing an alternate objective function.
bland : bool
If True, choose pivots using Bland's rule [3]_. In problems which
fail to converge due to cycling, using Bland's rule can provide
convergence at the expense of a less optimal path about the simplex.
nit0 : int
The initial iteration number used to keep an accurate iteration total
in a two-phase problem.
Returns
-------
nit : int
The number of iterations. Used to keep an accurate iteration total
in the two-phase problem.
status : int
An integer representing the exit status of the optimization::
0 : Optimization terminated successfully
1 : Iteration limit reached
2 : Problem appears to be infeasible
3 : Problem appears to be unbounded
4 : Serious numerical difficulties encountered
"""
nit = nit0
status = 0
message = ''
complete = False
if phase == 1:
m = T.shape[1]-2
elif phase == 2:
m = T.shape[1]-1
else:
raise ValueError("Argument 'phase' to _solve_simplex must be 1 or 2")
if phase == 2:
# Check if any artificial variables are still in the basis.
# If yes, check if any coefficients from this row and a column
# corresponding to one of the non-artificial variable is non-zero.
# If found, pivot at this term. If not, start phase 2.
# Do this for all artificial variables in the basis.
# Ref: "An Introduction to Linear Programming and Game Theory"
# by Paul R. Thie, Gerard E. Keough, 3rd Ed,
# Chapter 3.7 Redundant Systems (pag 102)
for pivrow in [row for row in range(basis.size)
if basis[row] > T.shape[1] - 2]:
non_zero_row = [col for col in range(T.shape[1] - 1)
if abs(T[pivrow, col]) > tol]
if len(non_zero_row) > 0:
pivcol = non_zero_row[0]
_apply_pivot(T, basis, pivrow, pivcol, tol)
nit += 1
if len(basis[:m]) == 0:
solution = np.empty(T.shape[1] - 1, dtype=np.float64)
else:
solution = np.empty(max(T.shape[1] - 1, max(basis[:m]) + 1),
dtype=np.float64)
while not complete:
# Find the pivot column
pivcol_found, pivcol = _pivot_col(T, tol, bland)
if not pivcol_found:
pivcol = np.nan
pivrow = np.nan
status = 0
complete = True
else:
# Find the pivot row
pivrow_found, pivrow = _pivot_row(T, basis, pivcol, phase, tol, bland)
if not pivrow_found:
status = 3
complete = True
if callback is not None:
solution[:] = 0
solution[basis[:n]] = T[:n, -1]
x = solution[:m]
x, fun, slack, con = _postsolve(
x, postsolve_args
)
res = OptimizeResult({
'x': x,
'fun': fun,
'slack': slack,
'con': con,
'status': status,
'message': message,
'nit': nit,
'success': status == 0 and complete,
'phase': phase,
'complete': complete,
})
callback(res)
if not complete:
if nit >= maxiter:
# Iteration limit exceeded
status = 1
complete = True
else:
_apply_pivot(T, basis, pivrow, pivcol, tol)
nit += 1
return nit, status
def _linprog_simplex(c, c0, A, b, callback, postsolve_args,
maxiter=1000, tol=1e-9, disp=False, bland=False,
**unknown_options):
"""
Minimize a linear objective function subject to linear equality and
non-negativity constraints using the two phase simplex method.
Linear programming is intended to solve problems of the following form:
Minimize::
c @ x
Subject to::
A @ x == b
x >= 0
User-facing documentation is in _linprog_doc.py.
Parameters
----------
c : 1-D array
Coefficients of the linear objective function to be minimized.
c0 : float
Constant term in objective function due to fixed (and eliminated)
variables. (Purely for display.)
A : 2-D array
2-D array such that ``A @ x``, gives the values of the equality
constraints at ``x``.
b : 1-D array
1-D array of values representing the right hand side of each equality
constraint (row) in ``A``.
callback : callable, optional
If a callback function is provided, it will be called within each
iteration of the algorithm. The callback function must accept a single
`scipy.optimize.OptimizeResult` consisting of the following fields:
x : 1-D array
Current solution vector
fun : float
Current value of the objective function
success : bool
True when an algorithm has completed successfully.
slack : 1-D array
The values of the slack variables. Each slack variable
corresponds to an inequality constraint. If the slack is zero,
the corresponding constraint is active.
con : 1-D array
The (nominally zero) residuals of the equality constraints,
that is, ``b - A_eq @ x``
phase : int
The phase of the algorithm being executed.
status : int
An integer representing the status of the optimization::
0 : Algorithm proceeding nominally
1 : Iteration limit reached
2 : Problem appears to be infeasible
3 : Problem appears to be unbounded
4 : Serious numerical difficulties encountered
nit : int
The number of iterations performed.
message : str
A string descriptor of the exit status of the optimization.
postsolve_args : tuple
Data needed by _postsolve to convert the solution to the standard-form
problem into the solution to the original problem.
Options
-------
maxiter : int
The maximum number of iterations to perform.
disp : bool
If True, print exit status message to sys.stdout
tol : float
The tolerance which determines when a solution is "close enough" to
zero in Phase 1 to be considered a basic feasible solution or close
enough to positive to serve as an optimal solution.
bland : bool
If True, use Bland's anti-cycling rule [3]_ to choose pivots to
prevent cycling. If False, choose pivots which should lead to a
converged solution more quickly. The latter method is subject to
cycling (non-convergence) in rare instances.
unknown_options : dict
Optional arguments not used by this particular solver. If
`unknown_options` is non-empty a warning is issued listing all
unused options.
Returns
-------
x : 1-D array
Solution vector.
status : int
An integer representing the exit status of the optimization::
0 : Optimization terminated successfully
1 : Iteration limit reached
2 : Problem appears to be infeasible
3 : Problem appears to be unbounded
4 : Serious numerical difficulties encountered
message : str
A string descriptor of the exit status of the optimization.
iteration : int
The number of iterations taken to solve the problem.
References
----------
.. [1] Dantzig, George B., Linear programming and extensions. Rand
Corporation Research Study Princeton Univ. Press, Princeton, NJ,
1963
.. [2] Hillier, S.H. and Lieberman, G.J. (1995), "Introduction to
Mathematical Programming", McGraw-Hill, Chapter 4.
.. [3] Bland, Robert G. New finite pivoting rules for the simplex method.
Mathematics of Operations Research (2), 1977: pp. 103-107.
Notes
-----
The expected problem formulation differs between the top level ``linprog``
module and the method specific solvers. The method specific solvers expect a
problem in standard form:
Minimize::
c @ x
Subject to::
A @ x == b
x >= 0
Whereas the top level ``linprog`` module expects a problem of form:
Minimize::
c @ x
Subject to::
A_ub @ x <= b_ub
A_eq @ x == b_eq
lb <= x <= ub
where ``lb = 0`` and ``ub = None`` unless set in ``bounds``.
The original problem contains equality, upper-bound and variable constraints
whereas the method specific solver requires equality constraints and
variable non-negativity.
``linprog`` module converts the original problem to standard form by
converting the simple bounds to upper bound constraints, introducing
non-negative slack variables for inequality constraints, and expressing
unbounded variables as the difference between two non-negative variables.
"""
_check_unknown_options(unknown_options)
status = 0
messages = {0: "Optimization terminated successfully.",
1: "Iteration limit reached.",
2: "Optimization failed. Unable to find a feasible"
" starting point.",
3: "Optimization failed. The problem appears to be unbounded.",
4: "Optimization failed. Singular matrix encountered."}
n, m = A.shape
# All constraints must have b >= 0.
is_negative_constraint = np.less(b, 0)
A[is_negative_constraint] *= -1
b[is_negative_constraint] *= -1
# As all constraints are equality constraints the artificial variables
# will also be basic variables.
av = np.arange(n) + m
basis = av.copy()
# Format the phase one tableau by adding artificial variables and stacking
# the constraints, the objective row and pseudo-objective row.
row_constraints = np.hstack((A, np.eye(n), b[:, np.newaxis]))
row_objective = np.hstack((c, np.zeros(n), c0))
row_pseudo_objective = -row_constraints.sum(axis=0)
row_pseudo_objective[av] = 0
T = np.vstack((row_constraints, row_objective, row_pseudo_objective))
nit1, status = _solve_simplex(T, n, basis, callback=callback,
postsolve_args=postsolve_args,
maxiter=maxiter, tol=tol, phase=1,
bland=bland
)
# if pseudo objective is zero, remove the last row from the tableau and
# proceed to phase 2
nit2 = nit1
if abs(T[-1, -1]) < tol:
# Remove the pseudo-objective row from the tableau
T = T[:-1, :]
# Remove the artificial variable columns from the tableau
T = np.delete(T, av, 1)
else:
# Failure to find a feasible starting point
status = 2
messages[status] = (
"Phase 1 of the simplex method failed to find a feasible "
"solution. The pseudo-objective function evaluates to {0:.1e} "
"which exceeds the required tolerance of {1} for a solution to be "
"considered 'close enough' to zero to be a basic solution. "
"Consider increasing the tolerance to be greater than {0:.1e}. "
"If this tolerance is unacceptably large the problem may be "
"infeasible.".format(abs(T[-1, -1]), tol)
)
if status == 0:
# Phase 2
nit2, status = _solve_simplex(T, n, basis, callback=callback,
postsolve_args=postsolve_args,
maxiter=maxiter, tol=tol, phase=2,
bland=bland, nit0=nit1
)
solution = np.zeros(n + m)
solution[basis[:n]] = T[:n, -1]
x = solution[:m]
return x, status, messages[status], int(nit2)

File diff suppressed because it is too large Load Diff

View File

@@ -0,0 +1,86 @@
# Wrapper for the shortest augmenting path algorithm for solving the
# rectangular linear sum assignment problem. The original code was an
# implementation of the Hungarian algorithm (Kuhn-Munkres) taken from
# scikit-learn, based on original code by Brian Clapper and adapted to NumPy
# by Gael Varoquaux. Further improvements by Ben Root, Vlad Niculae, Lars
# Buitinck, and Peter Larsen.
#
# Copyright (c) 2008 Brian M. Clapper <bmc@clapper.org>, Gael Varoquaux
# Author: Brian M. Clapper, Gael Varoquaux
# License: 3-clause BSD
from . import _lsap_module
def linear_sum_assignment(cost_matrix, maximize=False):
"""Solve the linear sum assignment problem.
Parameters
----------
cost_matrix : array
The cost matrix of the bipartite graph.
maximize : bool (default: False)
Calculates a maximum weight matching if true.
Returns
-------
row_ind, col_ind : array
An array of row indices and one of corresponding column indices giving
the optimal assignment. The cost of the assignment can be computed
as ``cost_matrix[row_ind, col_ind].sum()``. The row indices will be
sorted; in the case of a square cost matrix they will be equal to
``numpy.arange(cost_matrix.shape[0])``.
See Also
--------
scipy.sparse.csgraph.min_weight_full_bipartite_matching : for sparse inputs
Notes
-----
The linear sum assignment problem [1]_ is also known as minimum weight
matching in bipartite graphs. A problem instance is described by a matrix
C, where each C[i,j] is the cost of matching vertex i of the first partite
set (a "worker") and vertex j of the second set (a "job"). The goal is to
find a complete assignment of workers to jobs of minimal cost.
Formally, let X be a boolean matrix where :math:`X[i,j] = 1` iff row i is
assigned to column j. Then the optimal assignment has cost
.. math::
\\min \\sum_i \\sum_j C_{i,j} X_{i,j}
where, in the case where the matrix X is square, each row is assigned to
exactly one column, and each column to exactly one row.
This function can also solve a generalization of the classic assignment
problem where the cost matrix is rectangular. If it has more rows than
columns, then not every row needs to be assigned to a column, and vice
versa.
This implementation is a modified Jonker-Volgenant algorithm with no
initialization, described in ref. [2]_.
.. versionadded:: 0.17.0
References
----------
.. [1] https://en.wikipedia.org/wiki/Assignment_problem
.. [2] DF Crouse. On implementing 2D rectangular assignment algorithms.
*IEEE Transactions on Aerospace and Electronic Systems*,
52(4):1679-1696, August 2016, :doi:`10.1109/TAES.2016.140952`
Examples
--------
>>> cost = np.array([[4, 1, 3], [2, 0, 5], [3, 2, 2]])
>>> from scipy.optimize import linear_sum_assignment
>>> row_ind, col_ind = linear_sum_assignment(cost)
>>> col_ind
array([1, 0, 2])
>>> cost[row_ind, col_ind].sum()
5
"""
return _lsap_module.calculate_assignment(cost_matrix, maximize)

View File

@@ -0,0 +1,5 @@
"""This module contains least-squares algorithms."""
from .least_squares import least_squares
from .lsq_linear import lsq_linear
__all__ = ['least_squares', 'lsq_linear']

Some files were not shown because too many files have changed in this diff Show More