INFORMS Journal on Computing
Vol. 27, No. 2, Spring 2015, pp. 238–248
ISSN 1091-9856 (print), ISSN 1526-5528 (online)
http://dx.doi.org/10.1287/ijoc.2014.0623
© 2015 INFORMS

Computing in Operations Research Using Julia
Miles Lubin, Iain Dunning
The state of numerical computing is currently characterized by a divide between highly efficient yet typically
cumbersome low-level languages such as C, C++, and Fortran and highly expressive yet typically slow
high-level languages such as Python and MATLAB. This paper explores how Julia, a modern programming
language for numerical computing that claims to bridge this divide by incorporating recent advances in language
and compiler design (such as just-in-time compilation), can be used for implementing software and algorithms
fundamental to the field of operations research, with a focus on mathematical optimization. In particular, we
demonstrate algebraic modeling for linear and nonlinear optimization and a partial implementation of a practical
simplex code. Extensive cross-language benchmarks suggest that Julia is capable of obtaining state-of-the-art
performance.
Data, as supplemental material, are available at http://dx.doi.org/10.1287/ijoc.2014.0623.
Keywords: algebraic modeling; scientific computing; programming languages; metaprogramming
History: Accepted by Jean-Paul Watson, Area Editor for Modeling; received May 2013; revised June 2014;
accepted September 2014.
Operations research and digital computing have grown hand-in-hand over the last 60 years, with historically large amounts of available computing power being dedicated to the solution of linear programs (Bixby 2002). Linear programming is one of the key tools in the operations research toolbox and concerns the selection of variable values to maximize a linear function subject to a set of linear constraints. This foundational problem, the algorithms to solve it, and its extensions form a large part of operations research-related computation. The purpose of this paper is to explore modern advances in programming languages that will affect how algorithms for operations research computation are implemented; we will use linear and nonlinear programming as motivating cases.

The primary languages of high-performance computing have been Fortran, C, and C++, for a multitude of reasons, including their interoperability, their ability to compile to highly efficient machine code, and their sufficient level of abstraction over programming in an assembly language. These languages are compiled offline and have strict variable typing, allowing advanced optimizations of the code to be made by the compiler.

A second class of more modern languages has arisen that is also popular for scientific computing. These languages are typically interpreted languages that are highly expressive but do not match the speed of lower-level languages in most tasks. They make up for this by focusing on "glue code" that links together, or provides wrappers around, high-performance code written in C and Fortran. Examples of languages of this type would be Python (especially with the NumPy (van der Walt et al. 2011) package), R, and MATLAB. Besides being interpreted rather than statically compiled, these languages are slower for a variety of additional reasons, including the lack of strict variable typing.

Just-in-time (JIT) compilation has emerged as a way to have the expressiveness of modern scripting languages and the performance of lower-level languages such as C. JIT compilers attempt to compile at run time by inferring information not explicitly stated by the programmer and use these inferences to optimize the machine code that is produced. Attempts to retrofit this functionality to the languages mentioned have had mixed success, because of issues with language design conflicting with the ability of the JIT compiler to make these inferences, and because of problems with the compatibility of the JIT functionality with the wider package ecosystems.

Julia (Bezanson et al. 2012) is a new programming language that is designed to address these issues. The language is designed from the ground up to be both expressive and to enable the LLVM-based JIT compiler (Lattner and Adve 2004) to generate efficient code. In benchmarks reported by its authors, Julia performed within a factor of two of C on a set of common basic tasks. The contributions of this paper are twofold: first, we develop publicly available codes to demonstrate the technical features of Julia that greatly facilitate the implementation of optimization-related
tools. Second, we will confirm that the aforementioned performance results hold for realistic problems of interest to the field of operations research.

This paper is not a tutorial. We encourage interested readers to view the language documentation at julialang.org. An introduction to Julia's syntax will not be provided here, although the examples of code presented should be comprehensible to readers with a background in programming. The source code for all the experiments in the paper is available in the online supplement (available as supplemental material at http://dx.doi.org/10.1287/ijoc.2014.0623). JuMP (Julia for Mathematical Programming), a library developed by the authors for mixed-integer algebraic modeling, is available directly through the Julia package manager, together with community-developed low-level interfaces to a number of existing commercial and open-source solvers for mixed-integer and linear optimization.

The rest of the paper is organized as follows: In §1, we present the package JuMP. In §2, we explore nonlinear extensions. In §3, we evaluate the suitability of Julia for low-level implementation of numerical optimization algorithms by examining its performance on a realistic partial implementation of the simplex algorithm for linear programming.

1. JuMP
Algebraic Modeling Languages (AMLs) are an essential component in any operations researcher's toolbox. AMLs enable researchers and programmers to describe optimization models in a natural way, meaning that the description of the model in code resembles the mathematical statement of the model. One of the most well-known AMLs is AMPL (Fourer et al. 1993), a commercial tool that is both fast and expressive. This speed comes at a cost: AMPL is not a fully fledged modern programming language, which makes it a less-than-ideal choice for manipulating data to create the model, for working with the results of an optimization, and for linking optimization into a larger project.

Interpreted languages such as Python and MATLAB have become popular with researchers and practitioners alike because of their expressiveness, package ecosystems, and acceptable speeds. Packages for these languages that add AML functionality, such as YALMIP (Lofberg 2004) for MATLAB and PuLP (Mitchell et al. 2011) and Pyomo (Hart et al. 2011) for Python, address the general-purpose-computing issues of AMPL but sacrifice speed. These AMLs take a nontrivial amount of time to build the sparse representation of the model in memory, which is especially noticeable if models are being rebuilt many times, which happens in the development of models and in practice, e.g., in simulations of decision processes. They achieve a similar "look" to AMPL by utilizing the operator overloading functionality in their respective languages, which introduces significant overhead and inefficient memory usage. Interfaces in C++ based on operator overloading, such as those provided by the commercial solvers Gurobi and CPLEX, are often significantly faster than AMLs in interpreted languages, although they sacrifice ease of use and solver independence.

We propose a new AML, JuMP, implemented and released as a Julia package, that combines the speed of commercial products with the benefits of remaining within a fully functional high-level modern language. We achieve this by using Julia's metaprogramming features to turn natural mathematical expressions into sparse internal representations of the model without using operator overloading. In this way we achieve performance comparable to AMPL and an order of magnitude faster than other embedded AMLs.

1.1. Metaprogramming with Macros
Julia can represent its own code as a Julia data structure and provides easy access to its built-in syntax parser. This feature is also found in languages such as Lisp. To make the concept of metaprogramming clearer, consider the following Julia code snippet:

1 macro m(ex)
2     ex.args[1] = :(-)  # Replace operation with subtraction
3     return esc(ex)     # Escape expression (see below)
4 end
5 x = 2; y = 5  # Initialize variables
6 2x + y^x      # Prints 29
7 @m(2x + y^x)  # Prints -21

On lines 1–4 we define the macro m. Macros are compile-time source-transformation functions, similar in concept to the preprocessing features of C but operating at the syntactic level instead of performing textual substitution. When the macro is invoked on line 7 with the expression 2x + y^x, the value of ex is a Julia object that contains a representation of the expression as a tree. Although we will not describe the precise structure of this object, one may essentially visualize it by using the compact Polish (prefix) notation:

(+, (*, 2, x), (^, y, x)).

Line 2 replaces the + in the above expression with −, where :(-) is Julia's syntax for the symbol −. Line 3 returns the escaped output, indicating that the expression refers to variables in the surrounding scope. Hence, the output of the macro is the expression 2x − y^x, which is subsequently compiled and finally evaluated to the value −21.
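The structure of ex can also be inspected interactively. The following REPL sketch (illustrative, using the standard fields of Julia's built-in Expr type rather than anything specific to JuMP) shows the tree directly:

    ex = :(2x + y^x)    # quoting yields an Expr object without evaluating it
    ex.head             # :call -- the root node is a function call
    ex.args             # Any[:+, :(2x), :(y^x)] -- the operator, then its operands
    ex.args[1] = :(-)   # the same rewrite performed by macro m above
    ex                  # :(2x - y^x)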
Macros provide powerful functionality to perform arbitrary transformation of expressions efficiently. The complete language features of Julia are available within macros, unlike the limited syntax for macros in C. Additionally, macros are evaluated only once, at compile time, and so have no runtime overhead, unlike eval functions in MATLAB and Python. (Note: with JIT compilation, "compile time" in fact occurs during the program's execution, e.g., the first time a function is called.) We will use macros as a basis for both linear and nonlinear modeling.

1.2. Language Design
Although we did not set out to design a full modeling language with as wide a variety of options as AMPL, we have sufficient functionality to model any linear optimization problem with a similar number of lines of code. Consider the following simple AMPL model of a "knapsack" problem (we will assume the data are provided before the following lines):

var x{j in 1..N} >= 0.0, <= 1.0;
maximize Obj:
    sum {j in 1..N} profit[j] * x[j];
subject to CapacityCon:
    sum {j in 1..N} weight[j] * x[j] <= capacity;

The previous model would be written in Julia using JuMP with the following code:

m = Model()
@defVar(m, 0 <= x[1:N] <= 1)
@setObjective(m, Max, sum{ profit[j] * x[j], j = 1:N })
@addConstraint(m, sum{ weight[j] * x[j], j = 1:N } <= capacity)

The syntax is mostly self-explanatory and is not the focus of this paper, but we draw attention to the similarities between the syntax of our Julia AML and existing AMLs. In particular, macros permit us to define new syntax such as sum{}, which is not part of the Julia language.
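Once built, the model would be solved and inspected with calls along the following lines (a sketch using the JuMP 0.x-era API of this paper; the function names changed in later releases):

    status = solve(m)              # dispatches to whichever LP solver is installed
    println(getObjectiveValue(m))  # optimal profit
    println(getValue(x[1]))        # value of the first decision variable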
1.3. Building Expressions
The model is stored internally as a set of rows until it is completely specified. Each row is defined by two arrays: the first contains the indices of the columns that appear in this row, and the second contains the corresponding coefficients. This representation is essentially the best possible while the model is being built and can be converted to a sparse column-wise format with relative efficiency. The challenge then is to convert the user's statement of the problem into this sparse row representation as quickly as possible, while not requiring the user to express rows in a way that loses the readability expected from AMLs.

AMLs like PuLP achieve this with operator overloading. By defining new types to represent variables, new definitions are provided for the basic mathematical operators when one or both of the operands is a variable. The expression is then built by combining subexpressions together until the full expression is obtained. This typically leads to an excessive number of intermediate memory allocations. One of the advantages of AMPL is that, as a purpose-built tool, it has the ability to statically analyze the expression to determine the storage required for its final representation. One way it may achieve this is by doing an initial pass to determine the size of the arrays to allocate, and then a second pass to store the correct coefficients. Our goal with Julia was to use the metaprogramming features to achieve a similar effect and bypass the need for operator overloading.

1.4. Metaprogramming Implementation
Our solution is similar to what is possible with AMPL and does not rely on operator overloading at all. Consider the knapsack constraint provided in the previous example. We will change the constraint into an equality by adding a slack variable to make the expression more complex than a single sum. The addConstraint macro converts the expression

@addConstraint(m, sum{ weight[j] * x[j], j = 1:N } + s == capacity)

into the following code, transparently to the user:

aff = AffExpr()
sumlen = length(1:N)
sizehint!(aff.vars, sumlen)
sizehint!(aff.coeffs, sumlen)
for i = 1:N
    addToExpression(aff, 1.0 * weight[i], x[i])
end
addToExpression(aff, 1.0, s)
addToExpression(aff, -1.0, capacity)
addConstraint(m, Constraint(aff, "=="))

The macro breaks the expression into parts and then stitches them back together, as in our desired data structure. AffExpr represents the custom type that contains the variable indices (vars) and coefficients (coeffs). In the first segment of code the macro pulls the indexing scheme out of the sum and determines how long an array is required. Sufficient space to store the sum is reserved in one pass using the built-in function sizehint! before addToExpression (defined elsewhere) is used to fill it out. We use multiple dispatch to let Julia decide what type of object x[i] is, either a constant or a variable placeholder, using its efficient built-in type inference mechanism. After the sum is handled, the single slack variable s is appended, and finally the right-hand side of the constraint is set. Note the invocation of addToExpression with different argument types in the last usage—this time two constants instead of a constant and a variable. The last step is to construct the Constraint object that is essentially a wrapper around the expression and the sense. The function addConstraint is defined […]
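The dispatch on the type of x[i] can be pictured with definitions along the following lines (a sketch in the Julia 0.x syntax of this era, with assumed simplified types; the actual definitions in the package differ):

    immutable Variable            # placeholder for a model variable (assumed type)
        idx::Int
    end
    type AffExpr                  # sparse affine expression: sum of coeffs[i]*vars[i] plus a constant
        vars::Vector{Variable}
        coeffs::Vector{Float64}
        constant::Float64
    end
    AffExpr() = AffExpr(Variable[], Float64[], 0.0)

    # Multiple dispatch selects a method from the runtime type of the last argument:
    addToExpression(aff::AffExpr, c::Real, x::Variable) =
        (push!(aff.vars, x); push!(aff.coeffs, c))
    addToExpression(aff::AffExpr, c::Real, x::Real) =
        (aff.constant += c * x)   # two constants fold into the constant term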
1.5. Benchmarks
[…] Note that JuMP does not need to transpose the naturally row-wise data when outputting in LP format, which explains the observed difference in execution times. In JuMP and AMPL, model construction was observed to consume 20%–50% of the total time for this model. [The displayed p-median formulation, a double summation, is garbled in the source.]

Table 1    P-Median Benchmark Results

            JuMP/Julia     AMPL    Gurobi/C++     PuLP/PyPy      Pyomo
L           LP     MPS     MPS     LP     MPS     LP      MPS    LP
1,000       0.5    1.0     0.7     0.8    0.8     5.5     4.8    10.7
5,000       2.3    4.2     3.3     3.6    3.9     26.4    23.2   54.6
10,000      5.0    8.9     6.7     7.3    8.3     53.5    50.6   110.0
50,000      27.9   48.3    35.0    37.2   39.3    224.1   225.6  583.7

Notes. L is the number of locations. Total time (in seconds) to process the model definition and produce the output file in LP and MPS formats (as available).

Table 2    Linear-Quadratic Control Benchmark Results

            JuMP/Julia     AMPL    Gurobi/C++     PuLP/PyPy      Pyomo
N           LP     MPS     MPS     LP     MPS     LP      MPS    LP
250         0.5    0.9     0.8     1.2    1.1     8.3     7.2    13.3
500         2.0    3.6     3.0     4.5    4.4     27.6    24.4   53.4
750         5.0    8.4     6.7     10.2   10.1    61.0    54.5   121.0
1,000       9.2    15.5    11.6    17.6   17.3    108.2   97.5   214.7

Notes. N = M is the grid size. Total time (in seconds) to process the model definition and produce the output file in LP and MPS formats (as available).

1.6. Availability
JuMP (https://github.com/JuliaOpt/JuMP.jl) has been released with documentation as a Julia package. It remains under active development. It currently interfaces with Cbc, Clp, CPLEX, GLPK, Gurobi, and MOSEK. Linux, OS X, and Windows platforms are supported.

2. Nonlinear Modeling
Commercial AMLs such as AMPL and GAMS (Brooke et al. 1988) are widely used for specifying large-scale nonlinear optimization problems, that is, problems of the form

    min_x  f(x)
    subject to  g_i(x) ≤ 0,  i = 1, …, m,        (1)

where f and g_i are given by closed-form expressions. Similar to the case of modeling linear optimization problems, open-source AMLs exist and provide comparable, if not superior, functionality; however, they may be significantly slower to build the model, even impractically slower on some large-scale problems. The user guide of CVX, an award-winning open-source AML for convex optimization built on MATLAB, states that "CVX is not meant for very large problems" (Grant and Boyd 2013). This statement refers to two cases that are important to distinguish:
• An appropriate solver is available for the problem, but the time it takes to build the model in memory and pass it to the solver is a bottleneck. In this case, users are directed to use a low-level interface to the solver in place of the AML.
• The problem specified is simply too difficult for available solvers, whether in terms of memory use or computation time. In this case, users are directed to consider reformulating the problem or implementing specialized algorithms for it.

Our focus in this section is on the first case, which is typical of the large-scale performance of open-source AMLs implemented in high-level languages, as will be demonstrated in the numerical experiments in this section.

This performance gap between commercial and open-source AMLs can be partially explained by considering the possibly very different motivations of their respective authors; however, we posit that there is a more technical reason. In languages such as MATLAB and Python there is no programmatic access to the language's highly optimized expression parser. Instead, to handle nonlinear expressions such as y * sin(x), one must either overload both the multiplication operator and the sin function—which leads to the expensive creation of many temporary objects as previously discussed—or manually parse the expression as a string, which itself may be slow and breaks the connection with the surrounding language. YALMIP and Pyomo implement the operator overloading approach, while CVX implements a combination of both approaches.

Julia, in contrast, provides first-class access to its expression parser through its previously discussed metaprogramming features, which facilitates the generation of resulting code with performance comparable to that of commercial AMLs. In §2.1 we describe our proof-of-concept implementation, followed by computational results in §2.2.

2.1. Implementation in Julia
Whereas linear expressions are represented as sparse vectors of nonzero values, nonlinear expressions are represented as algebraic expression graphs, as illustrated later. Expression graphs, when available, are integral to the solution of nonlinear optimization problems. The AML is responsible for using these graphs to evaluate function values and first and second derivatives as requested by the solver (typically through callbacks). Additionally, they may be used by the AML to infer problem structure to decide which solution methods are appropriate (Fourer and Orban 2010) or by the solver itself to perform important problem reductions in the case of mixed-integer nonlinear programming (Belotti et al. 2009).
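To fix ideas, an expression such as y * sin(x) becomes a small graph that can be evaluated recursively. The following is a minimal sketch with an assumed node type (in Julia 0.x syntax), not the representation used by our implementation:

    abstract ExprNode
    immutable Leaf <: ExprNode
        value::Float64                  # a constant, or a variable's current value
    end
    immutable Call <: ExprNode
        op::Function
        children::Vector{ExprNode}
    end

    evaluate(n::Leaf) = n.value
    evaluate(n::Call) = n.op([evaluate(c) for c in n.children]...)

    # y * sin(x) evaluated at x = pi/2, y = 3:
    g = Call(*, ExprNode[Leaf(3.0), Call(sin, ExprNode[Leaf(pi/2)])])
    evaluate(g)                         # returns 3.0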
Figure 1    A Macro Called with the Text Expression x[i+1] - x[i] - (0.5h)*(sin(t[i+1]) + sin(t[i])) Is Given as Input as the Expression Tree on the Left
[Figure: two expression trees with nodes −, sin, getindex and leaves t, i; abstract symbols on the left, spliced-in placeholders and values such as t[i+1] and t[i] on the right. Diagram not fully recoverable from the source.]
Notes. Parentheses indicate subtrees combined for brevity. At this stage, symbols are abstract and not resolved. To prepare the nonlinear expression, the macro produces code that generates the expression tree on the right with variable placeholders spliced in from the runtime context.
Analogously to the linear case, where macros are used to generate code that forms sparse vector representations, a macro was implemented that generates code to form nonlinear expression trees. Macros, when called, are provided with an expression tree of the input; however, symbols are not resolved to values. Indeed, values do not exist at compile time when macros are evaluated. The task of the macro, therefore, is to generate code that replicates the input expression with runtime values (both numeric constants and variable placeholder objects) spliced in, as illustrated in Figure 1. This splicing of values is by construction and does not require expensive runtime calls, as MATLAB's eval function does.

The implementation of this macro to generate expression trees is compact, approximately 20 lines of code including support for the sum{} syntax presented in §1.2. Although nontrivial understanding of Julia's metaprogramming syntax is required to implement such a macro, the effort should be compared with what would be necessary to obtain the equivalent output and performance from a low-level language; in particular, one would need to write a custom expression parser.

Given expression trees for the constraints (1), we consider computing the Jacobian matrix

    J(x) = [ ∇g_1(x)
             ∇g_2(x)
               ⋮
             ∇g_m(x) ],

where ∇g_i(x) is a row-oriented gradient vector. Unlike the typical approach of using automatic differentiation for computing derivatives in AMLs (Gay 1996), a simpler method based on symbolic differentiation can be equally efficient in Julia for the case of first-order derivatives. In particular, we derive the form of the sparse Jacobian matrix by applying the chain rule symbolically and then, using JIT compilation, compile a function that evaluates the Jacobian for any given input vector. This process is accelerated by identifying equivalent expression trees (those that are symbolically identical and for which there exists a one-to-one correspondence between the variables present) and only performing symbolic differentiation once per equivalent expression.

The implementation of the Jacobian computation, which supports general expression trees, spans approximately 250 lines of code, including the basic logic for the chain rule. In the following section it is demonstrated that evaluating the Jacobian using the JIT-compiled function is as fast as using AMPL through the low-level AMPL solver library (Gay 1997), presently a de facto standard for evaluating derivatives in nonlinear models. Interestingly, there is an executable accompanying the AMPL solver library (nlc), which generates and compiles C code to evaluate derivatives for a specific model, although it is seldom used in practice because of the cost of compilation and the marginal gains in performance. However, in a language such as Julia with JIT compilation, compiling functions generated at runtime can be a technique to both simplify an implementation and obtain performance comparable to that of low-level languages.

The idea of using JIT compilation for derivatives is not new; CasADi (Andersson 2013), a recently developed framework for computing derivatives in nonlinear optimization, contains experimental but nonfunctioning code from 2012 to use the JIT compiler of LLVM, the same compiler used internally by Julia. CasADi, however, is implemented in C++. We do not know why this approach was eventually abandoned, but we suppose that the difficulty of interfacing with a JIT compiler from C++ was a significant contributing factor. Note that CasADi, like AMPL, uses techniques from automatic differentiation instead of computing derivatives of symbolic expressions.
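The approach can be illustrated in miniature. The following sketch (ours, handling only +, *, and sin under simplifying assumptions; the actual implementation supports general expression trees) symbolically differentiates a Julia expression and then JIT-compiles the result by splicing it into a generated function:

    function differentiate(ex, v::Symbol)
        ex == v && return 1                      # d v / d v = 1
        isa(ex, Expr) || return 0                # constants and other symbols
        op, args = ex.args[1], ex.args[2:end]
        if op == :+
            return Expr(:call, :+, [differentiate(a, v) for a in args]...)
        elseif op == :*                          # product rule (two factors)
            a, b = args
            return :($(differentiate(a, v)) * $b + $a * $(differentiate(b, v)))
        elseif op == :sin                        # chain rule
            return :(cos($(args[1])) * $(differentiate(args[1], v)))
        end
        error("unsupported operation $op")
    end

    dexpr = differentiate(:(y * sin(x)), :x)     # :(0 * sin(x) + y * (cos(x) * 1))
    @eval dfdx(x, y) = $dexpr                    # JIT-compiled the first time it is called
    dfdx(0.0, 2.0)                               # 2.0

In a real implementation a simplification pass would remove the multiplications by 0 and 1 before compiling the generated expression.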
2.2. Computational Tests
We test our implementation on two nonlinear optimization problems obtained from Hans Mittelmann's […]

[…]

The Mat-Vec operation is illustrated for dense vectors in Algorithm 1 and for sparse vectors in Algorithm 2. Sparse matrices are provided in either compressed sparse column or compressed sparse row format as appropriate (Duff et al. 1989). Note that in Algorithm 1 we use a flag vector to indicate the selected columns of the matrix A. This corresponds to skipping particular dot-products. The result vector has a memory layout with n, not n − m, entries. This form could be desired in some cases for subsequent operations and is illustrative of the common practice in simplex implementations of designing data structures with a global view of the operations in which they will be used (Maros 2003, Chap. 5). In Algorithm 2 we omit what would be a costly flag check for each nonzero element of the row-wise matrix; the gains of exploiting sparsity often outweigh the extra floating-point operations.

Algorithm 3 (Two-pass stabilized minimum ratio test (dual simplex))
    Input: Vectors d, α ∈ R^n, state vector s ∈ {"lower", "basic"}^n, parameters ε_P, ε_D > 0.
    Output: Solution index result.
    Θ_max ← ∞
    for i in {1, …, n} do
        if s_i = "lower" and α_i > ε_P then
            Add index i to list of candidates
            Θ_max ← min{(d_i + ε_D)/α_i, Θ_max}
        end if
    end for
    α_max ← 0, result ← 0
    for i in list of candidates do
        if d_i/α_i ≤ Θ_max and α_i > α_max then
            α_max ← α_i
            result ← i
        end if
    end for

Note that the implementation in the benchmark handles both upper and lower bounds; Algorithm 3 is simplified in this respect for brevity. A sparse variant is easily obtained by looping over the nonzero elements of α in the first pass.

The third operation is a modified form of the vector update y ← αx + y (Axpy). In the variant used in the simplex algorithm, the value of each updated component is tested for membership in an interval. For example, given a tolerance ε, a component belonging to the interval (−∞, −ε) may indicate loss of numerical feasibility, in which case a certain corrective action, such as local perturbation of problem data, may be triggered. This procedure is more naturally expressed using an explicit loop over elements of x instead of performing operations on vectors.
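For concreteness, the modified Axpy kernel looks roughly as follows in Julia (our rendering of the operation just described, not code from the benchmark suite):

    # y <- alpha*x + y, flagging entries that fall into (-Inf, -eps),
    # i.e., possible loss of numerical feasibility.
    function checked_axpy!(alpha, x, y, eps)
        flagged = Int[]                  # indices that may need corrective action
        @inbounds for i in 1:length(x)
            y[i] += alpha * x[i]
            if y[i] < -eps
                push!(flagged, i)
            end
        end
        return flagged
    end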
The three operations discussed represent a nontrivial proportion of execution time of the simplex method, between 20% and 50%, depending on the problem instance (Hall and McKinnon 2005). Most of the remaining execution time is spent factorizing and solving linear systems using specialized procedures, which we do not implement because of their complexity. We claim, however, that the performance observed on the benchmark operations may reasonably be expected to generalize to an implementation of these specialized linear algebra operations, because their implementation is similarly based on low-level sparsity-exploiting loops and scalar operations.

3.2. Results
The benchmark operations described previously were implemented in Julia, C++, MATLAB, Python, and Java. Examples of code have been omitted for brevity. The style of the code in Julia is qualitatively similar to that of the other high-level languages. Readers are encouraged to view the implementations available in the online supplement.
[Figure 2: log-scale average execution time per operation (dense and sparse variants of Mat-Vec, Axpy, and the ratio test, for each test instance), with one series per language: Julia, C++, MATLAB, PyPy, and Java. Plot not recoverable from the source.]
Figure 2 (Color online) Average Execution Time for Each Operation and Language, by Instance
Note. Compared with MATLAB, PyPy, and Java, the execution time of Julia is significantly closer to that of C++.
To measure the overhead of bounds checking, a validity check performed on array indices in high-level languages, we implemented a variant in C++ with explicit bounds checking. We also consider executing the Python code under the PyPy engine (Bolz et al. 2009), a JIT-compiled implementation of Python. We have not used the popular NumPy library in Python because it would not alleviate the need for manually coded loops and so would provide little speed benefit. No special runtime parameters are used, and the C++ code is compiled with -O2. For languages with JIT compilers (Java, PyPy, and Julia) we attempt to exclude measuring compilation time by executing the code a number of times before timing the results.
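This warm-up procedure amounts to a harness along the following lines (a sketch reusing the checked_axpy! sketch from §3.1; the timing code in the supplement may differ):

    # Call the kernel once to trigger JIT compilation, then time repeated calls.
    function benchmark(f, reps, args...)
        f(args...)                       # warm-up run; excludes compile time
        return @elapsed for _ in 1:reps
            f(args...)
        end
    end

    benchmark(checked_axpy!, 200, 0.5, rand(10^6), rand(10^6), 1e-7)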
Realistic input data were generated by running a modified implementation of the dual simplex algorithm on a small set of standard LP problems and recording the required input data for each operation from iterations sampled uniformly over the course of the algorithm. At least 200 iterations are recorded from each instance. Using such data from real instances is important because execution times depend significantly on the sparsity patterns of the input. The instances we consider are greenbea, stocfor3, and ken-13 from the NETLIB repository (Gay 1985) and the fome12 instance from Hans Mittelmann's benchmark set (Mittelmann 2013). These instances represent a range of problem sizes and sparsity structures.

Experiments were performed under the Linux operating system on a laptop with an Intel i5-3320M processor. See Table 5 for a summary of results. Julia consistently performs within 50% of the implementation in C++ with bounds checking; MATLAB and PyPy are within a factor of 3–18. Java has more consistent performance, within a factor of 3–7 of C++ with bounds checking. Pure Python is far from competitive, being at least 70× slower than C++.

Figure 2 displays the absolute execution times broken down by instance. We observe the consistent performance of Julia, whereas those of MATLAB and PyPy are subject to more variability. In all cases except the smaller greenbea instance, use of the vector-sparse routines significantly decreases execution time, although PyPy's performance is relatively poorer on these routines.

Our results are qualitatively similar to those reported by Bezanson et al. (2012) on a set of unrelated general language benchmarks and thus serve as an independent corroboration of their findings that Julia's performance is within a factor of 2 of equivalent low-level compiled code.

Supplemental Material
Supplemental material to this paper is available at http://dx.doi.org/10.1287/ijoc.2014.0623.

Acknowledgments
This work would not be possible without the effort of the Julia team—Jeff Bezanson, Stefan Karpinski, Viral Shah, and Alan Edelman—as well as that of the larger community of Julia contributors. The authors acknowledge, in particular, Carlo Baldassi and Dahua Lin for significant contributions to the development of interfaces for linear programming solvers. They thank Juan Pablo Vielma for his comments on this manuscript, which substantially improved its presentation. They thank Chris Maes of Gurobi Optimization for
his feedback. M. Lubin was supported by the DOE Computational Science Graduate Fellowship [Grant no. DE-FG02-97ER25308].
References
Andersson J (2013) A general-purpose software framework for dynamic optimization. PhD thesis, KU Leuven, Leuven, Belgium.
[…]
Gay DM (1996) More AD of nonlinear AMPL models: Computing Hessian information and exploiting partial separability. Berz M, Bischof C, Corliss G, Griewank A, eds. Computational Differentiation: Applications, Techniques, and Tools (SIAM, Philadelphia), 173–184.
Gay DM (1997) Hooking your solver to AMPL. Technical report, Bell Laboratories, Murray Hill, NJ.
Grant MC, Boyd SP (2013) The CVX users' guide (release 2.0). Accessed May 1, 2013, http://cvxr.com/cvx/doc/CVX.pdf.
Hall J (2010) Towards a practical parallelisation of the simplex method. Comput. Management Sci. 7(2):139–170.
[…]