
INFORMS Journal on Computing
Vol. 27, No. 2, Spring 2015, pp. 238–248
ISSN 1091-9856 (print) | ISSN 1526-5528 (online)
http://dx.doi.org/10.1287/ijoc.2014.0623
© 2015 INFORMS

Computing in Operations Research Using Julia

Miles Lubin, Iain Dunning
MIT Operations Research Center, Cambridge, Massachusetts 02139, {[email protected], [email protected]}

The state of numerical computing is currently characterized by a divide between highly efficient yet typically cumbersome low-level languages such as C, C++, and Fortran and highly expressive yet typically slow high-level languages such as Python and MATLAB. This paper explores how Julia, a modern programming language for numerical computing that claims to bridge this divide by incorporating recent advances in language and compiler design (such as just-in-time compilation), can be used for implementing software and algorithms fundamental to the field of operations research, with a focus on mathematical optimization. In particular, we demonstrate algebraic modeling for linear and nonlinear optimization and a partial implementation of a practical simplex code. Extensive cross-language benchmarks suggest that Julia is capable of obtaining state-of-the-art performance.

Data, as supplemental material, are available at http://dx.doi.org/10.1287/ijoc.2014.0623.

Keywords: algebraic modeling; scientific computing; programming languages; metaprogramming
History: Accepted by Jean-Paul Watson, Area Editor for Modeling; received May 2013; revised June 2014; accepted September 2014.

Operations research and digital computing have grown hand-in-hand over the last 60 years, with historically large amounts of available computing power being dedicated to the solution of linear programs (Bixby 2002). Linear programming is one of the key tools in the operations research toolbox and concerns selection of variable values to maximize a linear function subject to a set of linear constraints. This foundational problem, the algorithms to solve it, and its extensions form a large part of operations research-related computation. The purpose of this paper is to explore modern advances in programming languages that will affect how algorithms for operations research computation are implemented; we will use linear and nonlinear programming as motivating cases.

The primary languages of high-performance computing have been Fortran, C, and C++ for a multitude of reasons, including their interoperability, their ability to compile to highly efficient machine code, and their sufficient level of abstraction over programming in an assembly language. These languages are compiled offline and have strict variable typing, allowing advanced optimizations of the code to be made by the compiler.

A second class of more modern languages has arisen that is also popular for scientific computing. These languages are typically interpreted languages that are highly expressive but do not match the speed of lower-level languages in most tasks. They make up for this by focusing on "glue code" that links together, or provides wrappers around, high-performance code written in C and Fortran. Examples of languages of this type would be Python (especially with the NumPy (van der Walt et al. 2011) package), R, and MATLAB. Besides being interpreted rather than statically compiled, these languages are slower for a variety of additional reasons, including the lack of strict variable typing.

Just-in-time (JIT) compilation has emerged as a way to have the expressiveness of modern scripting languages and the performance of lower-level languages such as C. JIT compilers attempt to compile at run time by inferring information not explicitly stated by the programmer and use these inferences to optimize the machine code that is produced. Attempts to retrofit this functionality to the languages mentioned have had mixed success because of issues with language design conflicting with the ability of the JIT compiler to make these inferences and problems with the compatibility of the JIT functionality with the wider package ecosystems.

Julia (Bezanson et al. 2012) is a new programming language that is designed to address these issues. The language is designed from the ground up to be both expressive and to enable the LLVM-based JIT compiler (Lattner and Adve 2004) to generate efficient code. In benchmarks reported by its authors, Julia performed within a factor of two of C on a set of common basic tasks. The contributions of this paper are twofold: first, we develop publicly available codes to demonstrate the technical features of Julia that greatly facilitate the implementation of optimization-related tools. Second, we will confirm that the aforementioned performance results hold for realistic problems of interest to the field of operations research.

This paper is not a tutorial. We encourage interested readers to view the language documentation at julialang.org. An introduction to Julia's syntax will not be provided here, although the examples of code presented should be comprehensible to readers with a background in programming. The source code for all the experiments in the paper is available in the online supplement (available as supplemental material at http://dx.doi.org/10.1287/ijoc.2014.0623). JuMP (Julia for Mathematical Programming), a library developed by the authors for mixed-integer algebraic modeling, is available directly through the Julia package manager, together with community-developed low-level interfaces to a number of existing commercial and open-source solvers for mixed-integer and linear optimization.

The rest of the paper is organized as follows: In §1, we present the package JuMP. In §2, we explore nonlinear extensions. In §3, we evaluate the suitability of Julia for low-level implementation of numerical optimization algorithms by examining its performance on a realistic partial implementation of the simplex algorithm for linear programming.

1. JuMP
Algebraic Modeling Languages (AMLs) are an essential component in any operations researcher's toolbox. AMLs enable researchers and programmers to describe optimization models in a natural way, meaning that the description of the model in code resembles the mathematical statement of the model. One of the most well-known AMLs is AMPL (Fourer et al. 1993), a commercial tool that is both fast and expressive. This speed comes at a cost: AMPL is not a fully fledged modern programming language, which makes it a less-than-ideal choice for manipulating data to create the model, for working with the results of an optimization, and for linking optimization into a larger project.

Interpreted languages such as Python and MATLAB have become popular with researchers and practitioners alike because of their expressiveness, package ecosystems, and acceptable speeds. Packages for these languages that add AML functionality, such as YALMIP (Lofberg 2004) for MATLAB and PuLP (Mitchell et al. 2011) and Pyomo (Hart et al. 2011) for Python, address the general-purpose-computing issues of AMPL but sacrifice speed. These AMLs take a nontrivial amount of time to build the sparse representation of the model in memory, which is especially noticeable if models are being rebuilt many times, which happens in the development of models and in practice, e.g., in simulations of decision processes. They achieve a similar "look" to AMPL by utilizing the operator overloading functionality in their respective languages, which introduces significant overhead and inefficient memory usage. Interfaces in C++ based on operator overloading, such as those provided by the commercial solvers Gurobi and CPLEX, are often significantly faster than AMLs in interpreted languages, although they sacrifice ease of use and solver independence.

We propose a new AML, JuMP, implemented and released as a Julia package, that combines the speed of commercial products with the benefits of remaining within a fully functional high-level modern language. We achieve this by using Julia's metaprogramming features to turn natural mathematical expressions into sparse internal representations of the model without using operator overloading. In this way we achieve performance comparable to AMPL and an order of magnitude faster than other embedded AMLs.

1.1. Metaprogramming with Macros
Julia can represent its own code as a Julia data structure and provides easy access to its built-in syntax parser. This feature is also found in languages such as Lisp. To make the concept of metaprogramming more clear, consider the following Julia code snippet:

    1 macro m(ex)
    2     ex.args[1] = :(-)   # Replace operation with subtraction
    3     return esc(ex)      # Escape expression (see below)
    4 end
    5 x = 2; y = 5            # Initialize variables
    6 2x + y^x                # Prints 29
    7 @m(2x + y^x)            # Prints -21

On lines 1–4 we define the macro m. Macros are compile-time source-transformation functions, similar in concept to the preprocessing features of C but operating at the syntactic level instead of performing textual substitution. When the macro is invoked on line 7 with the expression 2x + y^x, the value of ex is a Julia object that contains a representation of the expression as a tree. Although we will not describe the precise structure of this object, one may essentially visualize it by using the compact Polish (prefix) notation:

$$(+,\ (*,\ 2,\ x),\ (\wedge,\ y,\ x)).$$

Line 2 replaces the + in the above expression with −, where :(-) is Julia's syntax for the symbol −. Line 3 returns the escaped output, indicating that the expression refers to variables in the surrounding scope. Hence, the output of the macro is the expression 2x − y^x, which is subsequently compiled and finally evaluated to the value −21.
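To make the tree structure concrete, the following snippet (ours, not from the paper) uses Julia's quoting syntax :( ) to obtain the same kind of object that macro m receives; head and args are the actual fields of Julia's Expr type:

    ex = :(2x + y^x)     # quote: parse the expression without evaluating it
    ex.head              # :call -- the root node is a function call
    ex.args              # 3-element array: :+, :(2x), :(y^x)
    ex.args[1] = :(-)    # the same mutation performed inside macro m
    ex                   # :(2x - y^x)

Quoting differs from the macro only in how the expression is captured; the resulting tree can be manipulated identically.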

Macros provide powerful functionality to perform arbitrary transformation of expressions efficiently. The complete language features of Julia are available within macros, unlike the limited syntax for macros in C. Additionally, macros are evaluated only once, at compile time, and so have no runtime overhead, unlike eval functions in MATLAB and Python. (Note: with JIT compilation, "compile time" in fact occurs during the program's execution, e.g., the first time a function is called.) We will use macros as a basis for both linear and nonlinear modeling.

1.2. Language Design
Although we did not set out to design a full modeling language with as wide a variety of options as AMPL, we have sufficient functionality to model any linear optimization problem with a similar number of lines of code. Consider the following simple AMPL model of a "knapsack" problem (we will assume the data are provided before the following lines):

    var x{j in 1..N} >= 0.0, <= 1.0;
    maximize Obj:
        sum {j in 1..N} profit[j] * x[j];
    subject to CapacityCon:
        sum {j in 1..N} weight[j] * x[j] <= capacity;

The previous model would be written in Julia using JuMP with the following code:

    m = Model()
    @defVar(m, 0 <= x[1:N] <= 1)
    @setObjective(m, Max, sum{ profit[j] * x[j], j = 1:N })
    @addConstraint(m, sum{ weight[j] * x[j], j = 1:N } <= capacity)

The syntax is mostly self-explanatory and is not the focus of this paper, but we draw attention to the similarities between the syntax of our Julia AML and existing AMLs. In particular, macros permit us to define new syntax such as sum{}, which is not part of the Julia language.

1.3. Building Expressions
The model is stored internally as a set of rows until it is completely specified. Each row is defined by two arrays: the first is the indices of the columns that appear in this row, and the second contains the corresponding coefficients. This representation is essentially the best possible while the model is being built and can be converted to a sparse column-wise format with relative efficiency. The challenge then is to convert the user's statement of the problem into this sparse row representation as quickly as possible, while not requiring the user to express rows in a way that loses the readability expected from AMLs.

AMLs like PuLP achieve this with operator overloading. By defining new types to represent variables, new definitions are provided for the basic mathematical operators when one or both of the operands is a variable. The expression is then built by combining subexpressions together until the full expression is obtained. This typically leads to an excessive number of intermediate memory allocations. One of the advantages of AMPL is that, as a purpose-built tool, it has the ability to statically analyze the expression to determine the storage required for its final representation. One way it may achieve this is by doing an initial pass to determine the size of the arrays to allocate, and then a second pass to store the correct coefficients. Our goal with Julia was to use the metaprogramming features to achieve a similar effect and bypass the need for operator overloading.

1.4. Metaprogramming Implementation
Our solution is similar to what is possible with AMPL and does not rely on operator overloading at all. Consider the knapsack constraint provided in the previous example. We will change the constraint into an equality by adding a slack variable to make the expression more complex than a single sum. The addConstraint macro converts the expression

    @addConstraint(m, sum{weight[j] * x[j], j = 1:N} + s == capacity)

into the following code, transparently to the user:

    aff = AffExpr()
    sumlen = length(1:N)
    sizehint!(aff.vars, sumlen)
    sizehint!(aff.coeffs, sumlen)
    for i = 1:N
        addToExpression(aff, 1.0 * weight[i], x[i])
    end
    addToExpression(aff, 1.0, s)
    addToExpression(aff, -1.0, capacity)
    addConstraint(m, Constraint(aff, "=="))

The macro breaks the expression into parts and then stitches them back together, as in our desired data structure. AffExpr represents the custom type that contains the variable indices (vars) and coefficients (coeffs). In the first segment of code the macro pulls the indexing scheme out of the sum and determines how long an array is required. Sufficient space to store the sum is reserved in one pass using the built-in function sizehint! before addToExpression (defined elsewhere) is used to fill it out. We use multiple dispatch to let Julia decide what type of object x[i] is, either a constant or a variable placeholder, using its efficient built-in type inference mechanism. After the sum is handled, the single slack variable s is appended, and finally the right-hand side of the constraint is set. Note the invocation of addToExpression with different argument types in the last usage—this time two constants instead of a constant and a variable. The last step is to construct the Constraint object that is essentially a wrapper around the expression and the sense. The function addConstraint is defined separately from the macro with the same name. We note that our implementation is not as efficient as AMPL's can be; space for the coefficients of single variables like s is not preallocated, and so additional memory allocations are required. However, we still avoid the creation of many small temporary objects that would be produced with operator overloading.

A technical limitation of this approach is that the code generated by a macro may depend only on the syntax of the given expression and not on any runtime values or type information. The abovementioned implementation, therefore, generates very efficient code for linear expressions but cannot easily be generalized to quadratic or general nonlinear expressions.

1.5. Benchmarks
Different languages produce the final internal representation of the problem at different stages, making pure in-memory "model construction" time difficult to isolate. Our approach was to force all the AMLs to output the resulting model in the LP and/or MPS file formats and record the total time from executing the script until the file is output. Note that this approach is imperfect for comparing timings of the same magnitude, as different AMLs may use different precisions for their output.

We evaluated the performance of Julia relative to other AMLs by implementing two models whose size can be controlled by varying a parameter. Experiments were performed on a Linux system with an Intel Xeon E5-2650 processor with the following combinations of software: Julia 0.2.1, AMPL 2008-11-20, Gurobi 5.6, PuLP 1.5.3 with PyPy 2.2.1, and Pyomo 3.4 with Python 2.7.

1. P-median: This model was used by Hart et al. (2011) to compare Pyomo with AMPL. The model determines the location of M facilities over L possible locations to minimize the distance between each of N customers and the closest facility. Vector C_i contains customer locations that we generate randomly. In our benchmarks we fixed M = 100 and N = 100 and varied L. The results are in Table 1 and show that JuMP is safely within a factor of two of the speed of AMPL, comparable in speed to, if not occasionally faster than, Gurobi's C++ modeling interface, and an order of magnitude faster than the Python-based modeling languages. Subsequent to the initial publication of these benchmarks, Gurobi developers achieved a 25% improvement on the times reported.

Table 1 P-Median Benchmark Results

            JuMP/Julia       AMPL     Gurobi/C++       PuLP/PyPy       Pyomo
L           LP      MPS      MPS      LP      MPS      LP      MPS     LP
1,000       0.5     1.0      0.7      0.8     0.8      5.5     4.8     10.7
5,000       2.3     4.2      3.3      3.6     3.9      26.4    23.2    54.6
10,000      5.0     8.9      6.7      7.3     8.3      53.5    50.6    110.0
50,000      27.9    48.3     35.0     37.2    39.3     224.1   225.6   583.7

Notes. L is the number of locations. Total time (in seconds) to process the model definition and produce the output file in LP and MPS formats (as available).

Note that JuMP does not need to transpose the naturally row-wise data when outputting in LP format, which explains the observed difference in execution times. In JuMP and AMPL, model construction was observed to consume 20%–50% of the total time for this model:

$$\min \; \sum_{i=1}^{N} \sum_{j=1}^{L} |C_i - j|\, x_{ij}$$
$$\text{s.t.}\quad x_{ij} \le y_j, \qquad i = 1,\dots,N,\; j = 1,\dots,L,$$
$$\sum_{j=1}^{L} x_{ij} = 1, \qquad i = 1,\dots,N,$$
$$\sum_{j=1}^{L} y_j = M.$$

2. Linear-quadratic control problem (cont5_2_1): This quadratic programming model is part of the collection maintained by Hans Mittelmann (2013). Not all the compared AMLs support quadratic objectives, and the quadratic objective sections of the file format specifications are ill-defined, so the objective was dropped and set to zero. The results in Table 2 mirror the general pattern of results observed in the p-median model:

$$\min_{y_{i,j},\,u_i} \; 0.0$$
$$\text{s.t.}\quad \frac{y_{i+1,j} - y_{i,j}}{\Delta t} = \frac{1}{2(\Delta x)^2}\bigl(y_{i,j-1} - 2y_{i,j} + y_{i,j+1} + y_{i+1,j-1} - 2y_{i+1,j} + y_{i+1,j+1}\bigr),$$
$$\qquad\qquad i = 0,\dots,M-1,\; j = 1,\dots,N-1,$$
$$y_{i,2} - 4y_{i,1} + 3y_{i,0} = 0, \qquad i = 1,\dots,M,$$
$$y_{i,n-2} - 4y_{i,n-1} + 3y_{i,n} = (2\Delta x)(u_i - y_{i,n}), \qquad i = 1,\dots,M,$$
$$-1 \le u_i \le 1, \qquad i = 1,\dots,M,$$
$$y_{0,j} = 0, \qquad j = 0,\dots,N,$$
$$0 \le y_{i,j} \le 1, \qquad i = 1,\dots,M,\; j = 0,\dots,N,$$

where $g_j = \tfrac{1}{2}\bigl(1 - (j\Delta x)^2\bigr)$.

Table 2 Linear-Quadratic Control Benchmark Results

            JuMP/Julia       AMPL     Gurobi/C++       PuLP/PyPy       Pyomo
N           LP      MPS      MPS      LP      MPS      LP      MPS     LP
250         0.5     0.9      0.8      1.2     1.1      8.3     7.2     13.3
500         2.0     3.6      3.0      4.5     4.4      27.6    24.4    53.4
750         5.0     8.4      6.7      10.2    10.1     61.0    54.5    121.0
1,000       9.2     15.5     11.6     17.6    17.3     108.2   97.5    214.7

Notes. N = M is the grid size. Total time (in seconds) to process the model definition and produce the output file in LP and MPS formats (as available).

We note that both AMPL and Pyomo include features that are not present in JuMP, the Gurobi C++ interface, or PuLP, for example, support for nonlinear modeling and the concept of parameters, which may be used to generate and solve a modified problem without explicitly recreating a model. These modeling systems, and JuMP in particular, have an advantage in the performance comparisons because the implementations take advantage of the special case of linear expressions with explicitly provided coefficients.
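For concreteness, the p-median model above could be stated in JuMP roughly as follows. This is our sketch in the paper's JuMP 0.x-era syntax (not the authors' benchmark code, which is in the online supplement); it assumes the data M, N, L, and the customer-location vector C are already defined:

    m = Model()
    @defVar(m, 0 <= x[1:N, 1:L] <= 1)   # assignment of customer i to location j
    @defVar(m, 0 <= y[1:L] <= 1)        # whether location j is opened
    @setObjective(m, Min, sum{ abs(C[i] - j) * x[i,j], i = 1:N, j = 1:L })
    for i = 1:N, j = 1:L
        @addConstraint(m, x[i,j] <= y[j])               # assign only to open locations
    end
    for i = 1:N
        @addConstraint(m, sum{ x[i,j], j = 1:L } == 1)  # each customer served exactly once
    end
    @addConstraint(m, sum{ y[j], j = 1:L } == M)        # open exactly M locations

Note that abs(C[i] - j) is plain data, so every coefficient is an explicitly provided constant, which is exactly the special case of linear expressions mentioned above.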

1.6. Availability
JuMP (https://github.com/JuliaOpt/JuMP.jl) has been released with documentation as a Julia package. It remains under active development. It currently interfaces with Cbc, Clp, CPLEX, GLPK, Gurobi, and MOSEK. Linux, OS X, and Windows platforms are supported.

2. Nonlinear Modeling
Commercial AMLs such as AMPL and GAMS (Brooke et al. 1988) are widely used for specifying large-scale nonlinear optimization problems, that is, problems of the form

$$\min_x \; f(x) \quad \text{subject to} \quad g_i(x) \le 0, \; i = 1,\dots,m, \qquad (1)$$

where f and g_i are given by closed-form expressions. Similar to the case of modeling linear optimization problems, open-source AMLs exist and provide comparable, if not superior, functionality; however, they may be significantly slower to build the model, even impractically slower on some large-scale problems. The user guide of CVX, an award-winning open-source AML for convex optimization built on MATLAB, states that "CVX is not meant for very large problems" (Grant and Boyd 2013). This statement refers to two cases that are important to distinguish:

• An appropriate solver is available for the problem, but the time it takes to build the model in memory and pass it to the solver is a bottleneck. In this case, users are directed to use a low-level interface to the solver in place of the AML.
• The problem specified is simply too difficult for available solvers, whether in terms of memory use or computation time. In this case, users are directed to consider reformulating the problem or implementing specialized algorithms for it.

Our focus in this section is on the first case, which is typical of the large-scale performance of open-source AMLs implemented in high-level languages, as will be demonstrated in the numerical experiments in this section.

This performance gap between commercial and open-source AMLs can be partially explained by considering the possibly very different motivations of their respective authors; however, we posit that there is a more technical reason. In languages such as MATLAB and Python there is no programmatic access to the language's highly optimized expression parser. Instead, to handle nonlinear expressions such as y * sin(x), one must either overload both the multiplication operator and the sin function—which leads to the expensive creation of many temporary objects as previously discussed—or manually parse the expression as a string, which itself may be slow and breaks the connection with the surrounding language. YALMIP and Pyomo implement the operator overloading approach, while CVX implements a combination of both approaches.

Julia, in contrast, provides first-class access to its expression parser through its previously discussed metaprogramming features, which facilitates the generation of resulting code with performance comparable to that of commercial AMLs. In §2.1 we describe our proof-of-concept implementation, followed by computational results in §2.2.

2.1. Implementation in Julia
Whereas linear expressions are represented as sparse vectors of nonzero values, nonlinear expressions are represented as algebraic expression graphs, as illustrated later. Expression graphs, when available, are integral to the solution of nonlinear optimization problems. The AML is responsible for using these graphs to evaluate function values and first and second derivatives as requested by the solver (typically through callbacks). Additionally, they may be used by the AML to infer problem structure to decide which solution methods are appropriate (Fourer and Orban 2010) or by the solver itself to perform important problem reductions in the case of mixed-integer nonlinear programming (Belotti et al. 2009).

Analogously to the linear case, where macros are used to generate code that forms sparse vector representations, a macro was implemented that generates code to form nonlinear expression trees. Macros, when called, are provided with an expression tree of the input; however, symbols are not resolved to values. Indeed, values do not exist at compile time when macros are evaluated. The task of the macro, therefore, is to generate code that replicates the input expression with runtime values (both numeric constants and variable placeholder objects) spliced in, as illustrated in Figure 1. This splicing of values is by construction and does not require expensive runtime calls, as MATLAB's eval function does.

[Figure 1 omitted: paired expression-tree diagrams. Caption: A macro called with the text expression x[i+1] - x[i] - (0.5h) * (sin(t[i+1]) + sin(t[i])) is given as input as the expression tree on the left. Notes. Parentheses indicate subtrees combined for brevity. At this stage, symbols are abstract and not resolved. To prepare the nonlinear expression, the macro produces code that generates the expression tree on the right with variable placeholders spliced in from the runtime context.]

The implementation of this macro to generate expression trees is compact, approximately 20 lines of code including support for the sum{} syntax presented in §1.2. Although nontrivial understanding of Julia's metaprogramming syntax is required to implement such a macro, the effort should be compared with what would be necessary to obtain the equivalent output and performance from a low-level language; in particular, one would need to write a custom expression parser.

Given expression trees for the constraints (1), we consider computing the Jacobian matrix

$$J(x) = \begin{bmatrix} \nabla g_1(x) \\ \nabla g_2(x) \\ \vdots \\ \nabla g_m(x) \end{bmatrix},$$

where ∇g_i(x) is a row-oriented gradient vector. Unlike the typical approach of using automatic differentiation for computing derivatives in AMLs (Gay 1996), a simpler method based on symbolic differentiation can be equally efficient in Julia for the case of first-order derivatives. In particular, we derive the form of the sparse Jacobian matrix by applying the chain rule symbolically and then, using JIT compilation, compile a function that evaluates the Jacobian for any given input vector. This process is accelerated by identifying equivalent expression trees (those that are symbolically identical and for which there exists a one-to-one correspondence between the variables present) and only performing symbolic differentiation once per equivalent expression.

The implementation of the Jacobian computation, which supports general expression trees, spans approximately 250 lines of code, including the basic logic for the chain rule. In the following section it is demonstrated that evaluating the Jacobian using the JIT-compiled function is as fast as using AMPL through the low-level AMPL solver library (Gay 1997), presently a de facto standard for evaluating derivatives in nonlinear models. Interestingly, there is an executable accompanying the AMPL solver library (nlc), which generates and compiles C code to evaluate derivatives for a specific model, although it is seldom used in practice because of the cost of compilation and the marginal gains in performance. However, in a language such as Julia with JIT compilation, compiling functions generated at runtime can be a technique to both simplify an implementation and obtain performance comparable to that of low-level languages.

The idea of using JIT compilation for derivatives is not new; CasADi (Andersson 2013), a recently developed framework for computing derivatives in nonlinear optimization, contains experimental but nonfunctioning code from 2012 to use the JIT compiler of LLVM, the same compiler used internally by Julia. CasADi, however, is implemented in C++. We do not know why this approach was eventually abandoned, but we suppose that the difficulty of interfacing with a JIT compiler from C++ was a significant contributing factor. Note that CasADi, like AMPL, uses techniques from automatic differentiation instead of computing derivatives of symbolic expressions.

2.2. Computational Tests
We test our implementation on two nonlinear optimization problems obtained from Hans Mittelmann's AMPL-NLP benchmark set (http://plato.asu.edu/ftp/ampl-nlp.html). Experiments were performed on a Linux system with an Intel Xeon E5-2650 processor. Note that we have not developed a complete nonlinear AML; the implementation is intended to serve as a proof of concept only. The operations implemented are solely the construction of the model and the evaluation of the Jacobian of the constraints. Hence, objective functions and right-hand side expressions are omitted or simplified here. We are careful to avoid forcing AMPL to perform unnecessary work by using the fg_read interface in the AMPL solver library, with which AMPL prepares only first-order derivative information, and AMPL's preprocessing features are disabled. Despite these measures to produce a fair comparison, the complete AMLs may perform error checking and validation, which are not present in our implementation. Nevertheless, we believe it is valid and illustrative to make comparisons between orders of magnitude in execution times.

The first instance is clnlbeam:

$$\min_{t,\,x,\,u \in \mathbb{R}^{n+1}} \; 0.0$$
$$\text{subject to}\quad x_{i+1} - x_i - \frac{1}{2n}\bigl(\sin(t_{i+1}) + \sin(t_i)\bigr) = 0, \qquad i = 1,\dots,n,$$
$$t_{i+1} - t_i - \frac{1}{2n}\,u_{i+1} - \frac{1}{2n}\,u_i = 0, \qquad i = 1,\dots,n,$$
$$-1 \le t_i \le 1, \quad -0.05 \le x_i \le 0.05, \qquad i = 1,\dots,n.$$

We take n = 5,000, 50,000, and 500,000. The following code builds the corresponding model in Julia using our proof-of-concept implementation:

    m = Model()
    h = 1/n
    @defVar(m, -1 <= t[1:(n+1)] <= 1)
    @defVar(m, -0.05 <= x[1:(n+1)] <= 0.05)
    @defVar(m, u[1:(n+1)])
    for i in 1:n
        @addNLConstr(m, x[i+1] - x[i] - (0.5h) * (sin(t[i+1]) + sin(t[i])) == 0)
    end
    for i in 1:n
        @addNLConstr(m, t[i+1] - t[i] - (0.5h) * u[i+1] - (0.5h) * u[i] == 0)
    end

The second instance is cont5_1:

$$\min_{y \in \mathbb{R}^{(n+1)\times(n+1)},\; u \in \mathbb{R}^{n}} \; 0.0$$
$$\text{subject to}\quad n(y_{i+1,j+1} - y_{i,j+1}) - a\bigl(y_{i,j} - 2y_{i,j+1} + y_{i,j+2} + y_{i+1,j} - 2y_{i+1,j+1} + y_{i+1,j+2}\bigr) = 0,$$
$$\qquad\qquad i = 1,\dots,n,\; j = 1,\dots,n-1,$$
$$y_{i+1,3} - 4y_{i+1,2} + 3y_{i+1,1} = 0, \qquad i = 1,\dots,n,$$
$$c\bigl(y_{i+1,n-1} - 4y_{i+1,n} + 3y_{i+1,n+1}\bigr) + y_{i+1,n+1} + 1 - u_i + y_{i+1,n+1}\bigl((y_{i+1,n+1})^2\bigr)^{3/2} = 0,$$
$$\qquad\qquad i = 1,\dots,n,$$

where $a = (8n^2)/\pi^2$ and $c = (2n)/\pi$. We take n = 200, 400, and 1,000.

The dimensions of these instances and the number of nonzero elements in the corresponding Jacobian matrices are listed in Table 3. In Table 4 we present a benchmark of our implementation compared with AMPL, YALMIP version 20130322 (MATLAB R2013b), and Pyomo 3.4 (Python 2.7). JIT compilation of the Jacobian function is included in the "Build model" phase for Julia. Observe that Julia performs as fast as AMPL, if not faster. Julia's advantage over AMPL is partly explained by AMPL's need to write the model to an intermediate nl file before evaluating Jacobians; this input/output (I/O) time is included. YALMIP performs well on the mostly linear cont5_1 instance but is unable to process the largest clnlbeam instance in less than an hour. Pyomo's performance is more consistent but more than 50× slower than our implementation in Julia on the largest instances. Pyomo is run under pure Python; it does not support JIT accelerators such as PyPy.

Table 3 Nonlinear Test Instance Dimensions

Instance        No. of vars.    No. of constr.    No. of nz
clnlbeam-5      15,003          10,000            40,000
clnlbeam-50     150,003         100,000           400,000
clnlbeam-500    1,500,003       1,000,000         4,000,000
cont5_1-2       40,601          40,200            240,200
cont5_1-4       161,201         160,400           960,400
cont5_1-10      1,003,001       1,001,000         6,001,000

Note. nz = nonzero elements in Jacobian matrix.

Table 4 Nonlinear Benchmark Results

                        Build model (s)                Evaluate Jacobian (ms)
Instance        AMPL    Julia   YALMIP    Pyomo      AMPL    Julia   YALMIP
clnlbeam-5      0.2     0.1     36.0      2.3        0.4     0.3     8.3
clnlbeam-50     1.8     0.3     1,344.8   23.7       7.3     4.2     96.4
clnlbeam-500    18.3    3.3     >3,600    233.9      74.1    74.6    *
cont5_1-2       1.1     0.3     2.0       12.2       1.1     0.8     9.3
cont5_1-4       4.4     1.4     1.9       49.4       5.4     3.0     37.4
cont5_1-10      27.6    6.1     13.5      310.4      33.7    39.4    260.0

Notes. "Build model" includes writing and reading model files, if required, and precomputing the structure of the Jacobian. Pyomo uses AMPL for Jacobian evaluations. * indicates that the "Evaluate Jacobian" experiment could not be completed because the "Build model" phase did not complete within the time limit.
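To illustrate the symbolic-differentiation-plus-JIT approach described in §2.1, the following is a minimal self-contained sketch (ours, not the authors' roughly 250-line implementation). It handles only +, *, and sin, and the names deriv and f_x are our own:

    # Differentiate an expression tree with respect to the symbol `wrt`.
    deriv(ex::Number, wrt::Symbol) = 0                 # constants vanish
    deriv(ex::Symbol, wrt::Symbol) = ex == wrt ? 1 : 0
    function deriv(ex::Expr, wrt::Symbol)              # recursive chain rule
        op, args = ex.args[1], ex.args[2:end]
        if op == :+
            Expr(:call, :+, [deriv(a, wrt) for a in args]...)
        elseif op == :*                                # product rule, two factors
            a, b = args
            :($(deriv(a, wrt)) * $b + $a * $(deriv(b, wrt)))
        elseif op == :sin
            :(cos($(args[1])) * $(deriv(args[1], wrt)))
        else
            error("operator $op not handled in this sketch")
        end
    end

    dfdx = deriv(:(x + sin(x)*y), :x)   # unsimplified derivative expression
    @eval f_x(x, y) = $dfdx             # JIT-compiled the first time it is called
    f_x(0.0, 2.0)                       # returns 3.0

The same pattern, generating an expression and then turning it into a function with @eval, is what lets a runtime-generated Jacobian evaluator run at compiled-code speed.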

3. Implementing Optimization Algorithms
In this section we evaluate the performance of Julia for implementation of the simplex method for LP, arguably one of the most important algorithms in the field of operations research. Our aim is not to develop a complete implementation but instead to compare the performance of Julia to that of other popular languages, both high and low level, on a benchmark of core operations.

Although high-level languages can achieve good performance when performing vectorized operations (that is, block operations on dense vectors and matrices), state-of-the-art implementations of the simplex method are characterized by their effective exploitation of sparsity (the presence of many zeros) in all operations; hence, they use sparse linear algebra. Opportunities for vectorized operations are small in scale and do not represent a majority of the execution time; see Hall (2010). Furthermore, the sparse linear algebra operations used, such as the LU factorization of Suhl and Suhl (1990), are specialized and not provided by standard libraries.

The simplex method is therefore an example of an algorithm that requires a low-level coding style, in particular, manually coded loops, which are known to have poor performance in languages such as MATLAB or Python (see, e.g., van der Walt et al. 2011). To achieve performance in such cases, one would be required to code time-consuming loops in another language and link to these separate routines from the high-level language, using, for example, MATLAB's MEX interface. Our benchmarks will demonstrate, however, that within Julia, the native performance of this style of computation can nearly achieve that of low-level languages.

3.1. Benchmark Operations
A presentation of the simplex algorithm and a discussion of its computational components are beyond the scope of this paper. We refer the reader to Maros (2003) and Koberstein (2005) for a comprehensive treatment of modern implementations, which include significant advances over versions presented in most textbooks. We instead present three selected operations from the revised dual simplex method in a mostly self-contained manner. Knowledge of the simplex algorithm is helpful but not required. The descriptions are realistic and reflect the essence of the routines as they might be implemented in an efficient implementation.

The first operation considered is a matrix-transpose-vector product (Mat-Vec). In the revised simplex method, this operation is required to form a row of the tableau. A nonstandard aspect of this Mat-Vec is that we would like to consider the matrix formed by a constantly changing subset of the columns (those corresponding to the nonbasic variables). Another important aspect is the treatment of sparsity of the vector itself, in addition to that of the matrix (Hall and McKinnon 2005). This is achieved algorithmically using the nonzero elements of the vector to form a linear combination of the rows of the matrix, instead of the more common approach of computing dot-products with the columns, as illustrated in (2). This follows from viewing the matrix A equivalently as either a collection of column vectors A_i or as row vectors a_i^T:

$$A = [\,A_1\; A_2\; \cdots\; A_n\,] = \begin{bmatrix} a_1^T \\ a_2^T \\ \vdots \\ a_m^T \end{bmatrix} \;\longrightarrow\; A^T x = \begin{bmatrix} A_1^T x \\ A_2^T x \\ \vdots \\ A_n^T x \end{bmatrix} = \sum_{i:\, x_i \ne 0} a_i x_i. \qquad (2)$$

Algorithm 1 (Restricted sparse matrix transpose-dense vector product)
Input: Sparse column-oriented m×n matrix A, dense vector x ∈ ℝ^m, and flag vector N ∈ {0,1}^n (with n−m nonzero elements)
Output: y := A_N^T x as a dense vector, where N selects columns of A
for i in {1,...,n} do
    if N_i = 1 then
        s ← 0
        for each nonzero element q (in row j) of the ith column of A do
            s ← s + q·x_j        ▷ Compute dot-product of x with column i
        end for
        y_i ← s
    end if
end for

Algorithm 2 (Sparse matrix transpose-sparse vector product)
Input: Sparse row-oriented m×n matrix A and sparse vector x ∈ ℝ^m
Output: Sparse representation of A^T x
for each nonzero element p (in index j) in x do
    for each nonzero element q (in column i) of the jth row of A do
        Add p·q to index i of output    ▷ Compute linear combination of rows of A
    end for
end for
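A direct Julia transcription of Algorithm 1 might look as follows (our sketch; the function name is ours, but colptr, rowval, and nzval are the actual internal fields of Julia's built-in SparseMatrixCSC type, available in current Julia via using SparseArrays):

    function restricted_transpose_matvec(A::SparseMatrixCSC{Float64,Int},
                                         x::Vector{Float64}, N::Vector{Bool})
        m, n = size(A)
        y = zeros(n)
        for i in 1:n
            N[i] || continue                        # skip unflagged columns
            s = 0.0
            for k in A.colptr[i]:(A.colptr[i+1] - 1)
                s += A.nzval[k] * x[A.rowval[k]]    # dot-product of x with column i
            end
            y[i] = s
        end
        return y
    end

The manually coded inner loop over a column's nonzeros is precisely the low-level style whose performance is benchmarked in §3.2.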

The Mat-Vec operation is illustrated for dense vectors in Algorithm 1 and for sparse vectors in Algorithm 2. Sparse matrices are provided in either compressed sparse column or compressed sparse row format as appropriate (Duff et al. 1989). Note that in Algorithm 1 we use a flag vector to indicate the selected columns of the matrix A. This corresponds to skipping particular dot-products. The result vector has a memory layout with n, not n−m, entries. This form could be desired in some cases for subsequent operations and is illustrative of the common practice in simplex implementations of designing data structures with a global view of the operations in which they will be used (Maros 2003, Chap. 5). In Algorithm 2 we omit what would be a costly flag check for each nonzero element of the row-wise matrix; the gains of exploiting sparsity often outweigh the extra floating-point operations.

Algorithm 3 (Two-pass stabilized minimum ratio test (dual simplex))
Input: Vectors d, α ∈ ℝ^n, state vector s ∈ {"lower", "basic"}^n, parameters ε_P, ε_D > 0
Output: Solution index result.
Θ_max ← ∞
for i in {1,...,n} do
    if s_i = "lower" and α_i > ε_P then
        Add index i to list of candidates
        Θ_max ← min{(d_i + ε_D)/α_i, Θ_max}
    end if
end for
α_max ← 0, result ← 0
for i in list of candidates do
    if d_i/α_i ≤ Θ_max and α_i > α_max then
        α_max ← α_i
        result ← i
    end if
end for

The second operation is the minimum ratio test, which determines both the step size of the next iteration and the constraint that prevents further progress. Mathematically this may be expressed as

$$\min_{i:\, \alpha_i > 0} \frac{d_i}{\alpha_i}$$

for given vectors d and α. Although seemingly simple, this operation is one of the more complex parts of an implementation, as John Forrest observes in the source of Clp. We implement a relatively simple two-pass variant (Algorithm 3) from Harris (1973), described more recently in Koberstein (2005, §6.2.2.2), whose aim is to avoid numerical instability caused by small values of α_i. In the process, small infeasibilities up to a numerical tolerance ε_D may be created. Note that the implementation in the benchmark handles both upper and lower bounds; Algorithm 3 is simplified in this respect for brevity. A sparse variant is easily obtained by looping over the nonzero elements of α in the first pass.

The third operation is a modified form of the vector update y ← αx + y (Axpy). In the variant used in the simplex algorithm, the value of each updated component is tested for membership in an interval. For example, given a tolerance ε, a component belonging to the interval (−∞, −ε) may indicate loss of numerical feasibility, in which case a certain corrective action, such as local perturbation of problem data, may be triggered. This procedure is more naturally expressed using an explicit loop over elements of x instead of performing operations on vectors.

The three operations discussed represent a nontrivial proportion of execution time of the simplex method, between 20% and 50%, depending on the problem instance (Hall and McKinnon 2005). Most of the remaining execution time is spent factorizing and solving linear systems using specialized procedures, which we do not implement because of their complexity. We claim, however, that the performance observed on the benchmark operations may reasonably be expected to generalize to an implementation of these specialized linear algebra operations, because their implementation is similarly based on low-level sparsity-exploiting loops and scalar operations.

3.2. Results
The benchmark operations described previously were implemented in Julia, C++, MATLAB, Python, and Java. Examples of code have been omitted for brevity. The style of the code in Julia is qualitatively similar to that of the other high-level languages. Readers are encouraged to view the implementations available in the online supplement. To measure the overhead of bounds checking, a validity check performed on array indices in high-level languages, we implemented a variant in C++ with explicit bounds checking. We also consider executing the Python code under the PyPy engine (Bolz et al. 2009), a JIT-compiled implementation of Python. We have not used the popular NumPy library in Python because it would not alleviate the need for manually coded loops and so would provide little speed benefit. No special runtime parameters are used, and the C++ code is compiled with -O2. For languages with JIT compilers (Java, PyPy, and Julia) we attempt to exclude measuring compilation time by executing the code a number of times before timing the results.

Realistic input data were generated by running a modified implementation of the dual simplex algorithm on a small set of standard LP problems and recording the required input data for each operation from iterations sampled uniformly over the course of the algorithm. At least 200 iterations are recorded from each instance. Using such data from real instances is important because execution times depend significantly on the sparsity patterns of the input. The instances we consider are greenbea, stocfor3, and ken-13 from the NETLIB repository (Gay 1985) and the fome12 instance from Hans Mittelmann's benchmark set (Mittelmann 2013). These instances represent a range of problem sizes and sparsity structures.

Experiments were performed under the Linux operating system on a laptop with an Intel i5-3320M processor. See Table 5 for a summary of results. Julia consistently performs within 50% of the implementation in C++ with bounds checking; MATLAB and PyPy are within a factor of 3–18. Java has more consistent performance within a factor of 3–7 of C++ with bounds checking. Pure Python is far from competitive, being at least 70× slower than C++.

Table 5 Execution Time of Each Language (Version Listed) Relative to C++ with Bounds Checking

                          Julia   C++         MATLAB   PyPy    Python   Java
Operation                 0.2     GCC 4.8.2   R2013b   2.2.1   2.7.6    SE 1.8.0
Dense
  Mat-Vec (A_N^T x)       1.19    0.78        7.76     5.36    85.06    3.91
  Min. ratio test         1.33    0.84        5.20     4.24    74.68    4.45
  Axpy (y ← αx + y)       1.20    0.70        11.76    3.16    84.70    3.61
Sparse
  Mat-Vec (A^T x)         1.10    0.90        5.54     6.10    68.08    4.44
  Min. ratio test         1.33    0.90        4.10     15.12   70.34    6.80
  Axpy (y ← αx + y)       1.38    0.74        19.12    9.49    83.13    4.14

Notes. Lower values are better. Figures are geometric means of average execution times over iterations over four standard LP problems. Recorded value is fastest time of three repetitions. Dense/sparse distinction refers to the vector x; all matrices are sparse.

Figure 2 displays the absolute execution times broken down by instance. We observe the consistent performance of Julia, whereas those of MATLAB and PyPy are subject to more variability. In all cases except the smaller greenbea instance, use of the vector-sparse routines significantly decreases execution time, although PyPy's performance is relatively poorer on these routines.

[Figure 2 (color online) omitted: grouped bar chart of average execution time for each operation and language, by instance. Panels: greenbea, stocfor3, ken-13, fome12; y-axis: time per iteration (sec., log scale); bars: Julia, C++, MATLAB, PyPy, and Java for each dense and sparse operation. Note. Compared with MATLAB, PyPy, and Java, the execution time of Julia is significantly closer to that of C++.]

Our results are qualitatively similar to those reported by Bezanson et al. (2012) on a set of unrelated general language benchmarks and thus serve as an independent corroboration of their findings that Julia's performance is within a factor of 2 of equivalent low-level compiled code.

Supplemental Material
Supplemental material to this paper is available at http://dx.doi.org/10.1287/ijoc.2014.0623.

Acknowledgments
This work would not be possible without the effort of the Julia team—Jeff Bezanson, Stefan Karpinski, Viral Shah, and Alan Edelman—as well as that of the larger community of Julia contributors. The authors acknowledge, in particular, Carlo Baldassi and Dahua Lin for significant contributions to the development of interfaces for linear programming solvers. They thank Juan Pablo Vielma for his comments on this manuscript, which substantially improved its presentation. They thank Chris Maes of Gurobi Optimization for his feedback. M. Lubin was supported by the DOE Computational Science Graduate Fellowship [Grant no. DE-FG02-97ER25308].

References
Andersson J (2013) A general-purpose software framework for dynamic optimization. Ph.D. thesis, Arenberg Doctoral School, KU Leuven, Department of Electrical Engineering (ESAT/SCD) and Optimization in Engineering Center, Heverlee, Belgium.
Belotti P, Lee J, Liberti L, Margot F, Wächter A (2009) Branching and bounds tightening techniques for non-convex MINLP. Optim. Methods Software 24(4–5):597–634.
Bezanson J, Karpinski S, Shah VB, Edelman A (2012) Julia: A fast dynamic language for technical computing. Accessed January 29, 2015, http://arxiv.org/abs/1209.5145.
Bixby RE (2002) Solving real-world linear programs: A decade and more of progress. Oper. Res. 50(1):3–15.
Bolz CF, Cuni A, Fijalkowski M, Rigo A (2009) Tracing the meta-level: PyPy's tracing JIT compiler. Proc. 4th Workshop Implementation, Compilation, Optim. Object-Oriented Languages Programming Systems, ICOOOLPS '09 (ACM, New York), 18–25.
Brooke A, Kendrick D, Meeraus A, Raman R (1988) GAMS: A User's Guide (Scientific Press, Redwood City, CA).
Duff IS, Grimes RG, Lewis JG (1989) Sparse matrix test problems. ACM Trans. Math. Software 15(1):1–14.
Fourer R, Orban D (2010) DrAmpl: A meta solver for optimization problem analysis. Comput. Management Sci. 7(4):437–463.
Fourer R, Gay DM, Kernighan BW (1993) AMPL: A Modeling Language for Mathematical Programming, 2nd ed. (Brooks/Cole, Pacific Grove, CA).
Gay DM (1985) Electronic mail distribution of linear programming test problems. Math. Programming Soc. COAL Newsletter 13:10–12.
Gay DM (1996) More AD of nonlinear AMPL models: Computing Hessian information and exploiting partial separability. Berz M, Bischof C, Corliss G, Griewank A, eds. Computational Differentiation: Applications, Techniques, and Tools (SIAM, Philadelphia), 173–184.
Gay DM (1997) Hooking your solver to AMPL. Technical report, Bell Laboratories, Murray Hill, NJ.
Grant MC, Boyd SP (2013) The CVX users' guide (release 2.0). Accessed May 1, 2013, http://cvxr.com/cvx/doc/CVX.pdf.
Hall J (2010) Towards a practical parallelisation of the simplex method. Comput. Management Sci. 7(2):139–170.
Hall J, McKinnon K (2005) Hyper-sparsity in the revised simplex method and how to exploit it. Comput. Optim. Appl. 32(3):259–283.
Harris PMJ (1973) Pivot selection methods of the DEVEX LP code. Math. Programming 5(1):1–28.
Hart WE, Watson J-P, Woodruff DL (2011) Pyomo: Modeling and solving mathematical programs in Python. Math. Programming Comput. 3(3):219–260.
Koberstein A (2005) The dual simplex method, techniques for a fast and stable implementation. Unpublished doctoral thesis, Universität Paderborn, Paderborn, Germany.
Lattner C, Adve V (2004) LLVM: A compilation framework for lifelong program analysis and transformation. Code Generation Optim., 2004, Internat. Sympos., Palo Alto, CA, 75–86.
Lofberg J (2004) YALMIP: A toolbox for modeling and optimization in MATLAB. Comput. Aided Control Systems Design, 2004 IEEE Internat. Sympos., Taipei, Taiwan, 284–289.
Maros I (2003) Computational Techniques of the Simplex Method (Kluwer Academic Publishers, Norwell, MA).
Mitchell S, O'Sullivan M, Dunning I (2011) PuLP: A linear programming toolkit for Python. Accessed May 1, 2013, https://code.google.com/p/pulp-or/.
Mittelmann H (2013) Benchmarks for optimization software. Accessed April 28, 2013, http://plato.la.asu.edu/bench.html.
Suhl UH, Suhl LM (1990) Computing sparse LU factorizations for large-scale linear programming bases. ORSA J. Comput. 2(4):325–335.
van der Walt S, Colbert SC, Varoquaux G (2011) The NumPy array: A structure for efficient numerical computation. Comput. Sci. Engrg. 13(2):22–30.
