Analyzing and Handling Numerical Issues

There are many optimization applications that give rise to numerically challenging models. You might notice that the Optimizer takes unexpectedly long for a simplex reoptimization, that minimal changes to the model lead to an unexpectedly large change in the optimal solution value, or that the optimal solution shows a certain amount of violation after postsolve. The Optimizer provides various tools to analyze whether a model is numerically challenging and to handle numerical issues when they occur.

Analyzing Models for Numerical Issues

There are two main reasons why a model can be numerically challenging. Firstly, its coefficients may span many orders of magnitude, e.g., numbers as large as 100 million mixed with numbers as small as 1 over 100 million. These span 16 orders of magnitude, while a double-precision floating point number can only represent about 15 precise decimal digits, so round-off errors are inevitable. Secondly, the model may contain structures that amplify the effect of numerical error propagation, e.g., when the result of subtracting two almost identical values is scaled up and then used for further computations.
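
These round-off effects are easy to reproduce with ordinary double-precision arithmetic. The following Python snippet is purely illustrative and does not involve the Optimizer; it shows both the loss of small coefficients next to large ones and the effect of cancellation:

    # Plain Python floats are IEEE 754 double-precision numbers.
    big = 1e8      # a coefficient of 100 million
    small = 1e-8   # a coefficient of 1 over 100 million

    # Mathematically this is exactly 1e-8, but doubles near 1e8 are spaced
    # about 1.49e-8 apart, so the small coefficient cannot be resolved:
    print((big + small) - big)   # prints ~1.4901161193847656e-08, not 1e-08

    # Subtracting two almost identical values leaves only the imprecise
    # trailing digits; scaling the residual up amplifies the error:
    a = 1.0000000000000002
    b = 1.0
    print((a - b) * 1e16)        # prints ~2.22 although the literals suggest 2.0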

To address both issues, Xpress provides features to analyze models for numerical stability. Regarding the first issue, Xpress reports the coefficient ranges of both the original problem and the problem that is actually solved after presolving and scaling have been applied. In the log, the minimum and maximum absolute values of the matrix coefficients, the right-hand side/bounds and the objective are printed. What is relevant for the numerical behavior of the solution process are the coefficient ranges of the solved model. The difference between the exponents of the minimum and maximum values tells you how many orders of magnitude are covered. As a rule of thumb, this should not be more than nine (and not more than six in an individual row or column of the original matrix).
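
To illustrate the rule of thumb, the following small Python helper computes the spread of a set of coefficient magnitudes; it is purely illustrative and not part of the Xpress API:

    import math

    def magnitude_spread(coefficients):
        """Orders of magnitude covered by the non-zero coefficients, i.e.,
        the difference of the decimal exponents of the largest and smallest
        absolute values."""
        nonzero = [abs(c) for c in coefficients if c != 0.0]
        return math.log10(max(nonzero)) - math.log10(min(nonzero))

    # Coefficients ranging from 1e-4 to 1e7 cover 11 orders of magnitude
    # and therefore violate the rule of thumb of at most nine.
    coeffs = [1e-4, 1.0, 300.0, 1e7]
    spread = magnitude_spread(coeffs)
    print(f"spread: {spread:.1f} orders of magnitude")
    if spread > 9:
        print("consider rescaling the model data")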

The second issue, error propagation, is a bit trickier to trace. The most important source to consider is the multiplication of a vector with the constraint matrix, which is stored in a factorized fashion. Hence, it makes sense to examine the condition number of the basis inverse matrix. Computing it can be expensive and is therefore not done by default; you can activate it by setting the MIPKAPPAFREQ control to 1. With this control set, you will get a final statistics report that summarizes the condition numbers collected during the search. Besides the percentage of stable, suspicious, unstable, and ill-posed basis inverse matrices, the Optimizer reports a quantity called the attention level after the solve. The attention level takes values between zero and one: it is zero if all basis inverse matrices are stable, and one if all of them are ill-posed. The higher the attention level, the more likely numerical errors are. As a rule of thumb, models with an attention level larger than 0.1 should be investigated further.
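
The following sketch shows how this could look with the Xpress Python interface; it assumes that the 'xpress' module is installed and licensed, that a model is available in a placeholder file "model.mps", and that the ATTENTIONLEVEL attribute can be queried through getAttrib:

    import xpress as xp

    prob = xp.problem()
    prob.read("model.mps")               # placeholder file name

    # Compute the basis condition number at every node.
    prob.setControl("mipkappafreq", 1)
    prob.solve()

    # Attention level collected during the search; values above 0.1
    # suggest that the model should be investigated further.
    print("attention level:", prob.getAttrib("attentionlevel"))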

Finally, if the Optimizer encounters numerical failures during the optimization process, it reports them at the end of the solve. If you see dual failures, primal failures, barrier failures or singular inverts being reported, it might be worthwhile to try some of the methods described in the following sections.

Scaling

Scaling is a widely used preconditioning technique that aims at reducing the condition number of the constraint matrix, at reducing error propagation, and at reducing the number of LP iterations required to solve the problem. In Xpress, both rows and columns are scaled, but only by powers of 2, to avoid introducing additional round-off errors. By default, Xpress applies a machine-learning algorithm to choose the scaling variant that is predicted to give the most stable performance. Although this prediction is correct in most cases, it can be worth trying the opposite setting, i.e., setting SCALING to 163 when autoscaling selected Curtis-Reid scaling, and setting SCALING to 16 when autoscaling selected standard scaling. Furthermore, disabling the special handling of big-M rows and performing scaling already before presolve, represented by bits 6 and 9 of the SCALING control, is useful for some problems.
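
As a sketch, again using the Python interface and a placeholder model file, the scaling variant can be changed as follows; the bit values assume the usual 0-based bit numbering, so that bit 6 corresponds to 64 and bit 9 to 512:

    import xpress as xp

    prob = xp.problem()
    prob.read("model.mps")               # placeholder file name

    # If autoscaling selected Curtis-Reid scaling, try standard scaling:
    scaling = 163
    # If it selected standard scaling, try Curtis-Reid instead:
    # scaling = 16

    # Optionally add bit 6 (no special handling of big-M rows) and
    # bit 9 (scale before presolve) on top of the chosen value.
    scaling |= (1 << 6) | (1 << 9)

    prob.setControl("scaling", scaling)
    prob.solve()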

Solution Refinement

The Optimizer offers two methods of refining solutions, which are independent of each other and complementary. The first is called LP Refinement and aims at providing LP solutions of higher precision, i.e., with more significant bits. It consists of two parts. Standard LP Refinement iteratively attempts to increase the accuracy of the solution until either both FEASTOLTARGET and OPTIMALITYTOLTARGET are satisfied, the accuracy cannot be increased any further, or some effort limit is exhausted. It is applied by default both to LP solutions and to MIP solutions. Iterative Refinement has the same goal, but uses more expensive and also more promising techniques, e.g., quad-precision computations. If the postsolved LP solution is slightly infeasible, setting bits 5 and 6 of the REFINEOPS control aims at reducing those infeasibilities.
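
A possible way of setting these controls with the Python interface is sketched below; the target tolerances are illustrative values only, the model file name is a placeholder, and the bit values again assume 0-based bit numbering (bit 5 = 32, bit 6 = 64):

    import xpress as xp

    prob = xp.problem()
    prob.read("model.mps")               # placeholder file name

    # Tighter targets for standard LP Refinement (illustrative values).
    prob.setControl("feastoltarget", 1e-9)
    prob.setControl("optimalitytoltarget", 1e-9)

    # Switch on bits 5 and 6 of REFINEOPS to reduce infeasibilities of the
    # postsolved LP solution.
    refineops = prob.getControl("refineops")
    prob.setControl("refineops", refineops | (1 << 5) | (1 << 6))

    prob.solve()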

The second refinement scheme is called MIP Refinement and aims at providing MIP solutions that are truly integral and will not lead to infeasibilities when the integer variables are fixed in the original space. Note that both Iterative Refinement and MIP Refinement can slow down the solution process, the more so the more numerically challenging the matrix is.

Other Ways to Handle Numerical Issues

In addition to the methods named above, the Optimizer allows the user to change numerical tolerances, such as FEASTOL and MATRIXTOL, although caution is advised here. Finally, if the numerical issues mainly stem from the behavior of the simplex algorithm, setting DUALSTRATEGY to the value 7 or 32 might help, or even using only the barrier algorithm for solving LPs during a MIP solve, achieved by setting DEFAULTALG to 4.
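
A sketch of these settings with the Python interface, again with a placeholder model file and an illustrative tolerance value, could look as follows:

    import xpress as xp

    prob = xp.problem()
    prob.read("model.mps")               # placeholder file name

    # Relax the feasibility tolerance; use such changes with caution.
    prob.setControl("feastol", 1e-5)

    # Alternatively, adjust the dual simplex strategy as suggested above:
    # prob.setControl("dualstrategy", 7)

    # Use only the barrier algorithm for the LP solves of a MIP solve.
    prob.setControl("defaultalg", 4)

    prob.solve()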

In any case, it is best practice to reconsider the model. If you have very small and/or very large values in there — are those really necessary? Or could they be adapted to some significantly more stable value while still representing the same logic? Can you determine places where large values might cancel each other out and the residual is used for further computations? Have you tried using indicator instead of big-M formulations?
