PSORLab
diff --git a/‎Manifest.toml‎
Lines changed: 260 additions & 373 deletions b/‎Manifest.toml‎
Lines changed: 260 additions & 373 deletions
diff --git a/‎Project.toml‎
Lines changed: 5 additions & 5 deletions b/‎Project.toml‎
Lines changed: 5 additions & 5 deletions
diff --git a/‎README.md‎
Lines changed: 104 additions & 64 deletions b/‎README.md‎
Lines changed: 104 additions & 64 deletions
diff --git a/‎examples/ParBB/extension.jl‎
Lines changed: 58 additions & 0 deletions b/‎examples/ParBB/extension.jl‎
Lines changed: 58 additions & 0 deletions
@@ -10,16 +10,16 @@ ModelingToolkit = "961ee093-0014-501f-94e3-6117800e7a78"
 SymbolicUtils = "d1185830-fcd6-423d-90d6-eec64667417b"
 
 [compat]
-DocStringExtensions = "~0.8"
+DocStringExtensions = "0.8 - 0.9"
 IfElse = "0.1.0 - 1.1.1"
-ModelingToolkit = "~8"
-SymbolicUtils = "~0.19"
-julia = "~1.7"
+ModelingToolkit = "8"
+SymbolicUtils = "0.19.7"
+julia = "1.7"
 
 [extras]
 McCormick = "53c679d3-6890-5091-8386-c291e8c8aaa1"
 Symbolics = "0c5d862f-8b57-4792-8d23-62f2024744c7"
 Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
 
 [targets]
-test = ["McCormick", "Symbolics", "Test"]
+test = ["McCormick", "Symbolics", "Test"]
@@ -1,72 +1,78 @@
 # SourceCodeMcCormick.jl
 
 This package is an experimental approach to use source-code transformation to apply McCormick relaxations
-to symbolic functions for use in deterministic global optimization. While packages like `McCormick.jl`
+to symbolic functions for use in deterministic global optimization. While packages like `McCormick.jl` [1]
 take set-valued McCormick objects and utilize McCormick relaxation rules to overload standard math operations,
-`SourceCodeMcCormick.jl` aims to interpret symbolic expressions, apply McCormick transformations, and
-return a new symbolic expression representing those relaxations. This functionality is designed to be
-used for both algebraic and dynamic systems.
+`SourceCodeMcCormick.jl` (SCMC) aims to interpret symbolic expressions, apply generalized McCormick rules,
+create source code that computes the McCormick relaxations and natural interval extension of the input,
+and compile the source code into functions that return pointwise values of the natural interval extension
+and convex/concave relaxations. This functionality is designed to be used for both algebraic and dynamic
+(in development) systems.
 
 ## Algebraic Systems
 
-For a given algebraic equation or system of equations, `SourceCodeMcCormick` is designed to provide
-symbolic transformations that represent the lower/upper bounds and convex/concave relaxations of the
-provided equation(s). Most notably, `SourceCodeMcCormick` uses this symbolic transformation to
-generate "evaluation functions" which, for a given expression, return the lower/upper bounds and
-convex/concave relaxations of an expression. E.g.:
+For a given algebraic equation or system of equations, `SCMC` is designed to provide symbolic transformations 
+that represent the lower/upper bounds and convex/concave relaxations of the provided equation(s). Most notably, 
+`SCMC` uses this symbolic transformation to generate "evaluation functions" which, for a given expression, 
+return the natural interval extension and convex/concave relaxations of an expression. E.g.:
 
 ```
 using SourceCodeMcCormick, Symbolics
 
-@variables x
-to_compute = x*(15+x)^2
-x_lo_eval, x_hi_eval, x_cv_eval, x_cc_eval, order = all_evaluators(to_compute)
+@variables x, y
+expr = exp(x/y) - (x*y^2)/(y+1)
+expr_lo_eval, expr_hi_eval, expr_cv_eval, expr_cc_eval, order = all_evaluators(expr)
 ```
 
 Here, the outputs marked `_eval` are the evaluation functions for the lower bound (`lo`), upper
-bound (`hi`), convex underestimator (`cv`), and concave overestimator (`cc`) of the `to_compute`
-expression. The inputs to each of these functions are described by the `order` vector, which 
-in this case is `[x_cc, x_cv, x_hi, x_lo]`. 
-
-There are several important benefits of using these functions. First, they are fast. Normally,
-applying McCormick relaxations takes some time as the relaxation depends on the provided bounds
-and relaxation values of the input(s), but also on the form of the expression itself. Packages 
-like `McCormick.jl` use control flow to quickly determine the form of the relaxations and provide
-McCormick objects of the results, but because the forms of the expressions are not known *a priori*,
-this control flow takes time to process. In contrast, `SourceCodeMcCormick` effectively hard-codes
-the relaxations for a specific, pre-defined function into the evaluation functions, making it
-inherently faster. For example:
+bound (`hi`), convex underestimator (`cv`), and concave overestimator (`cc`) of the symbolic
+expression given by `expr`. The inputs to each of these functions are described by the `order` 
+vector, which in this case is `[x_cc, x_cv, x_hi, x_lo, y_cc, y_cv, y_hi, y_lo]`, representing
+the concave/convex relaxations and interval bounds of the variables `x` and `y`. E.g., if being
+used in a branch-and-bound (B&B) scheme, the interval bounds for each variable will be the lower and
+upper bounds of the B&B node for that variable, and the convex/concave relaxations will take the
+value where the relaxation of the original expression is desired.
+
+One benefit of using a source code transformation approach such as this over a multiple dispatch
+approach like `McCormick.jl` is speed. When McCormick relaxations of functions are evaluated using
+`McCormick.jl`, there is overhead associated with finding the correct functions to call for each
+overloaded math operation. The functions generated by `SCMC`, however, eliminate this overhead by
+creating static functions with the correct McCormick rules already applied. While the `McCormick.jl`
+approach is flexible in quickly evaluating any new expression you provide, in the `SCMC` approach,
+one expression is selected up front, and relaxations and interval extension values are calculated
+for that expression quickly. For example:
 
 ```
 using BenchmarkTools, McCormick
 
-xMC = MC{1, NS}(2.5, Interval(-1.0, 4.0), 1)
+xMC = MC{2, NS}(2.5, Interval(-1.0, 4.0), 1)
+yMC = MC{2, NS}(1.5, Interval(0.5, 3.0), 2)
 
-@btime xMC*(15+xMC)^2
-# 228.381 ns (15 allocations: 384 bytes)
+@btime exp(xMC/yMC) - (xMC*yMC^2)/(yMC+1)
+# 497.382 ns (7 allocations: 560 bytes)
 
-@btime x_cv_eval(2.5, 2.5, 4.0, -1.0)
-# 30.382 ns (1 allocation: 16 bytes)
+@btime expr_cv_eval(2.5, 2.5, 4.0, -1.0, 1.5, 1.5, 3.0, 0.5)
+# 184.964 ns (1 allocation: 16 bytes)
 ```
 
-This is not an *entirely* fair example because the operation with xMC is simultaneously calculating
-lower/upper bounds, both relaxations, and (sub)gradients of the convex/concave relaxations, but the
-`SourceCodeMcCormick` result is just under an order of magnitude faster. As the expression inceases
-in complexity, this speedup becomes even more apparent.
+Note that this is not an entirely fair comparison because `McCormick.jl`, by using the `MC` type and
+multiple dispatch, simultaneously calculates all of the following: natural interval extensions,
+convex and concave relaxations, and corresponding subgradients. 
 
-Second, these evaluation functions are compatible with `CUDA.jl` in that they are broadcastable
-over `CuArray`s. Depending on the GPU, number of evaluations, and complexity of the function,
-this can dramatically decrease the time to compute the numerical values of function bounds and
-relaxations. E.g.:
+Another benefit of the `SCMC` approach is its compatibility with `CUDA.jl` [2]: `SCMC` functions are
+broadcastable over `CuArray`s. Depending on the GPU, number of evaluations, and complexity of the
+function, this can dramatically decrease the time to compute the numerical values of interval
+extensions and relaxations. E.g.:
 
 ```
 using CUDA
 
 # Using McCormick.jl
-xMC_array = MC{1,NS}.(rand(10000), Interval.(zeros(10000), ones(10000)), ones(Int, 10000))
+xMC_array = MC{2,NS}.(rand(10000), Interval.(zeros(10000), ones(10000)), ones(Int, 10000))
+yMC_array = MC{2,NS}.(rand(10000), Interval.(zeros(10000), ones(10000)), ones(Int, 10000).*2)
 
-@btime xMC_array.*(15 .+ xMC_array).^2
-# 1.616 ms (120012 allocations: 2.37 MiB)
+@btime @. exp(xMC_array/yMC_array) - (xMC_array*yMC_array^2)/(yMC_array+1)
+# 2.365 ms (18 allocations: 703.81 KiB)
 
 
 # Using SourceCodeMcCormick.jl, broadcast using CPU
@@ -75,33 +81,40 @@ xcv = copy(xcc)
 xhi = ones(10000)
 xlo = zeros(10000)
 
-@btime x_cv_eval.(xcc, xcv, xhi, xlo)
-# 100.100 μs (4 allocations: 78.27 KiB)
+ycc = rand(10000)
+ycv = copy(xcv)
+yhi = ones(10000)
+ylo = zeros(10000)
 
+@btime expr_cv_eval.(xcc, xcv, xhi, xlo, ycc, ycv, yhi, ylo);
+# 1.366 ms (20 allocations: 78.84 KiB)
 
-# Using SourceCodeMcCormick.jl and CUDA.jl, broadcast using GPU
-xcc_GPU = cu(xcc)
-xcv_GPU = cu(xcv)
-xhi_GPU = cu(xhi)
-xlo_GPU = cu(xlo)
 
-@btime x_cv_eval.(xcc_GPU, xcv_GPU, xhi_GPU, xlo_GPU)
-# 6.575 μs (33 allocations: 2.34 KiB)
+# Using SourceCodeMcCormick.jl and CUDA.jl, broadcast using GPU
+xcc_GPU = CuArray(xcc)
+xcv_GPU = CuArray(xcv)
+xhi_GPU = CuArray(xhi)
+xlo_GPU = CuArray(xlo)
+ycc_GPU = CuArray(ycc)
+ycv_GPU = CuArray(ycv)
+yhi_GPU = CuArray(yhi)
+ylo_GPU = CuArray(ylo)
+
+@btime CUDA.@sync expr_cv_eval.(xcc_GPU, xcv_GPU, xhi_GPU, xlo_GPU, ycc_GPU, ycv_GPU, yhi_GPU, ylo_GPU);
+# 29.800 μs (52 allocations: 3.88 KiB)
 ```
 
 
 ## Dynamic Systems
 
-For dynamic systems, `SourceCodeMcCormick` was built with a differential inequalities
-approach in mind where the relaxations of derivatives are calculated in advance and
-the resulting (larger) differential equation system, with explicit definitions of
-the relaxations of derivatives, can be solved. For algebraic systems, the main
-product of this package is the broadcastable evaluation functions. For dynamic
-systems, this package follows the same idea as in algebraic systems but stops at
-the symbolic representations of relaxations. This functionality is designed to work
-with a `ModelingToolkit`-type `ODESystem` with factorable equations--`SourceCodeMcCormick`
-will take such a system and return a new `ODESystem` with expanded equations to
-provide interval extensions and (if desired) McCormick relaxations. E.g.:
+(In development) For dynamic systems, `SCMC` assumes a differential inequalities approach where the 
+relaxations of derivatives are calculated in advance and the resulting (larger) differential equation 
+system, with explicit definitions of the relaxations of derivatives, can be solved. For algebraic 
+systems, the main product of this package is the broadcastable evaluation functions. For dynamic
+systems, this package follows the same idea as in algebraic systems but stops at the symbolic 
+representations of relaxations. This functionality is designed to work with a `ModelingToolkit`-type 
+`ODESystem` with factorable equations [3]--`SCMC` will take such a system and return a new `ODESystem` 
+with expanded equations to provide interval extensions and (if desired) McCormick relaxations. E.g.:
 
 ```
 using SourceCodeMcCormick, ModelingToolkit
@@ -150,8 +163,35 @@ using `DiffEqGPU.jl`.
 
 ## Limitations
 
-Currently, as proof-of-concept, `SourceCodeMcCormick` can only handle functions with 
-addition (+), subtraction (-), multiplication (\*), division (/), powers of 2 (^2), natural base
-exponentials (exp), and minimum/maximum (min/max) expressions. Future work will include
-adding other operations found in `McCormick.jl`. 
-
+`SCMC` has several limitations, some of which are described here. Ongoing research effort seeks
+to address several of these.
+- SCMC does not calculate subgradients, which are used in the lower bounding routines of many 
+global optimizers
+- Complicated expressions may cause significant compilation time. This can be manually avoided
+by combining results together in a user-defined function
+- `SCMC` is currently compatible with elementary arithmetic operations +, -, *, /, and the
+univariate intrinsic functions ^2 and exp. More diverse functions will be added in the future
+- Functions created with `SCMC` may only accept 32 CUDA arrays as inputs, so functions with more
+than 8 unique variables will need to be split/factored by the user to be accommodated
+- Due to the large number of floating point calculations required to calculate McCormick-based
+relaxations, it is highly recommended to use double-precision floating point numbers, including
+for operations on a GPU. Since most GPUs are designed for single-precision floating point operation,
+forcing double-precision will often result in a significant performance hit. GPUs designed for
+scientific computing, with a higher proportion of double-precision-capable cores, are recommended
+for optimal performance with `SCMC`.
+- Due to the high branching factor of McCormick-based relaxations and the possibility of warp
+divergence, there will likely be a performance gap between optimizations with variables covering
+positive-only domains and variables with mixed domains. Additionally, more complicated expressions
+where the structure of a McCormick relaxation changes more frequently with respect to the bounds
+on its domain will likely perform worse than problems where the structure of the relaxation is
+more consistent.
+
+## References
+
+1. M.E. Wilhelm, R.X. Gottlieb, and M.D. Stuber, PSORLab/McCormick.jl (2020), URL
+https://github.com/PSORLab/McCormick.jl.
+2. T. Besard, C. Foket, and B. De Sutter, Effective extensible programming: Unleashing Julia
+on GPUs, IEEE Transactions on Parallel and Distributed Systems (2018).
+3. Y. Ma, S. Gowda, R. Anantharaman, C. Laughman, V. Shah, C. Rackauckas, ModelingToolkit: 
+A composable graph transformation system for equation-based modeling. arXiv preprint 
+arXiv:2103.05244, 2021. doi: 10.48550/ARXIV.2103.05244.
@@ -0,0 +1,58 @@
+
+# This extension is meant to be used with SourceCodeMcCormick to overload
+# EAGO's standard branch-and-bound algorithm, to enable the parallel
+# evaluation of multiple B&B nodes.
+# See also: subroutines.jl, in this folder
+
+using DocStringExtensions, EAGO
+"""
+$(TYPEDEF)
+
+The ExtendGPU integrator is meant to be paired with the SourceCodeMcCormick
+package. A required component of ExtendGPU is the function `convex_func`, 
+which should take arguments corresponding to the McCormick tuple [cc, cv, hi, lo]
+for each branch variable in the problem and return a vector of convex
+relaxation evaluations of the objective function, of length equal to the
+length of the inputs.
+
+$(TYPEDFIELDS)
+"""
+Base.@kwdef mutable struct ExtendGPU <: EAGO.ExtensionType
+    "A user-defined function taking argument `p` and returning a vector
+    of convex evaluations of the objective function"
+    convex_func
+    "Number of decision variables"
+    np::Int
+    "The number of nodes to evaluate in parallel (default = 10000)"
+    node_limit::Int64 = 50000
+    "A parameter from Song et al. 2021 that determines how spread out 
+    blackbox points are (default = 0.01)"
+    α::Float64 = 0.01
+    "Lower bound storage to hold calculated lower bounds for multiple nodes."
+    lower_bound_storage::Vector{Float64} = Vector{Float64}()
+    "Upper bound storage to hold calculated upper bounds for multiple nodes."
+    upper_bound_storage::Vector{Float64} = Vector{Float64}()
+    "Node storage to hold individual nodes outside of the main stack"
+    node_storage::Vector{NodeBB} = Vector{NodeBB}()
+    "An internal tracker of nodes in internal storage"
+    node_len::Int = 0
+    "Variable lower bounds to evaluate"
+    all_lvbs::Matrix{Float64} = Matrix{Float64}()
+    "Variable upper bounds to evaluate"
+    all_uvbs::Matrix{Float64} = Matrix{Float64}()
+    "Internal tracker for the count in the main stack"
+    # node_count::Int = 0
+    "Flag for stack prepopulation. Good if the total number
+    of nodes throughout the solve is expected to be large (default = true)"
+    prepopulate::Bool = true
+    "(In development) Number of points to use for multistarting the NLP solver"
+    multistart_points::Int = 1
+end
+
+function ExtendGPU(convex_func, var_count::Int; alpha::Float64 = 0.01, node_limit::Int = 50000, 
+                    prepopulate::Bool = true, multistart_points::Int = 1)
+    return ExtendGPU(convex_func, var_count, node_limit, alpha, 
+                    Vector{Float64}(undef, node_limit), Vector{Float64}(undef, node_limit), Vector{NodeBB}(undef, node_limit), 0,
+                    Matrix{Float64}(undef, node_limit, var_count),
+                    Matrix{Float64}(undef, node_limit, var_count), prepopulate, multistart_points)
+end