Related To Optimization
The purpose of this document is to present an introduction to optimization with the Petrel 2010.1
uncertainty and optimization process. The first part is devoted to defining optimization and its
associated vocabulary. The second part explains usage of optimization for a reservoir simulation
forecast in Petrel while the last part contains a number of "lessons learned" from previous optimization
projects. A Petrel Optimization license is necessary for all the following.
What is optimization?
What follows is a somewhat simplified description of forecast optimization. A more detailed, and
general, description of optimization (with or without uncertainty) is provided in the Appendix of this
document. This text comes from a chapter written by Bailey et al. (2010) in the AAPG Memoir #96
"Uncertainty Analysis and Reservoir Modeling" (Y.Z.Ma and P.R. LaPointe, eds). One should ideally
read this reference first if one is either new to the concept of optimization or desires a more thorough
understanding of the topic. Indeed, the whole subject is very broad with a profusion of issues, both
nuanced and major, that may have a material impact on your own optimization.
Definition
Optimization
A reasonably good definition of optimization is: “Optimization is the design and operation of a system or
process to make it as good as possible in some pre-defined sense. The most common goals are
minimizing cost, maximizing throughput, profit and/or efficiency, etc.”
More precisely, in the context of this content, optimization consists of improving a pre-defined objective
function (the target to optimize) by varying the control variables (the operational parameters).
Optimization algorithms modify the control variables in order to reach this desired optimum, exploring a
potentially complex multi-dimensional solution space more efficiently than could any human mind
(assuming the number of control variables > 2).
A simulation is necessary at each step, until the maximum number of iterations has been reached or
until the final result is satisfactory (i.e., until the optimizer does not see the possibility of any further
improvement in the answer). The optimizer can only have one target to aim for, the objective function,
F. For example, to maximize the total oil produced and also (at the same time) minimize the amount of
produced water, one must define a unique function combining both oil and water production, such that
oil production is sufficiently rewarded and water production sufficiently penalized to yield an equable
balance between these two competing demands.
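Such a combined objective can be sketched as follows; the oil price and water-handling cost below are illustrative assumptions, not Petrel defaults:

```python
# Hypothetical combined objective: reward cumulative oil (FOPT), penalize
# cumulative water (FWPT). Price and cost values are illustrative only.
OIL_PRICE = 60.0    # $/bbl (assumed)
WATER_COST = 4.0    # $/bbl water-handling cost (assumed)

def objective(fopt_bbl, fwpt_bbl):
    """Single-valued objective F combining two competing quantities."""
    return OIL_PRICE * fopt_bbl - WATER_COST * fwpt_bbl

# 1.2 MMbbl of oil and 0.5 MMbbl of water gives F = 70.0 MM$:
f = objective(1.2e6, 0.5e6)
```

Choosing the relative weights is itself part of the problem definition: too small a water penalty and the optimizer will happily flood the field.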
Objective function: The optimization approaches we consider here are designed to maximize (or
minimize) a single-valued function comprising one or more variables. In fact the optimization algorithm
is specifically designed for functions that are computationally expensive to evaluate (long run times,
such as most reservoir simulations). The form of this function can be very simple (e.g., the Total Field Oil
Production as forecast by a reservoir simulator, the "FOPT" solution vector in Eclipse) or it may be
complex in form (e.g., the post-tax NPV computed from the composition of output from both reservoir
and financial simulators). A note of caution here: defining the right objective function is not always as
simple as it may seem (refer to the appendix for more discussion).
Control variables: A control variable (CV), also called a decision variable, is defined as a parameter of
the optimization problem that we have physical control over. For example, liquid flow rate from a well is
usually a legitimate CV because it can be manipulated by the operator (via chokes) to comply with the
rates suggested by the optimizer. A large array of legitimate CVs are possible for forecast optimization
and include: flowing area of an ICD (at design) or an FCV, ACTION keyword trigger thresholds (e.g., water
cut, time), optimum BHP targets or completion flow rates, etc.
Constraint: A constraint is a requirement that the CVs must satisfy in order to be valid solutions of the
optimization problem. For example, a bounds constraint requires that the CVs must lie between pre-
specified upper and lower bounds. During optimization, any sample of a CV that exceeds these bounds is
rejected and only values within the bounds are utilized. Two primary classes of constraint exist: linear
and non-linear.
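The bounds-constraint check can be sketched as follows (the BHP bounds are illustrative assumptions):

```python
# Sketch of a bounds constraint: a proposed CV sample is valid only if every
# component lies within its pre-specified [lower, upper] interval.
def satisfies_bounds(cv, lower, upper):
    return all(lo <= x <= hi for x, lo, hi in zip(cv, lower, upper))

# Illustrative bounds, e.g., BHP targets (bar) for two producers (assumed):
lower = [20.0, 20.0]
upper = [30.0, 30.0]

ok = satisfies_bounds([25.0, 28.0], lower, upper)        # accepted
rejected = satisfies_bounds([19.0, 28.0], lower, upper)  # below lower bound
```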
Uncertainty: these are parameters that impact the objective function but are outside of our control, like
oil price (known as "market" or "external" uncertainty as it is completely market-driven and is not
controllable by the asset holder) or permeability (and other reservoir rock properties), which are known
as "internal" or "private" uncertainties as these are under the domain of the asset holder and are not
related to any external market.
As illustrated below, it is easy to identify the maximum for a one-dimensional problem (the left-hand
image labeled "1D") - here the maximum is found at CV=1. The solution lies on the line shown. It is
somewhat more complex for a two dimensional problem (image labeled "2D"). For a 2D problem, if one
of the variables, say x1, is kept fixed, the maximum for the objective function may be found by varying
the second control variable, say x2. However, if x1 is fixed to a different value than before, another
maximum may exist for a totally different value of x2. This new maximum may, or may not, be higher
than the previous one. Repeating this exercise (varying x2 while keeping x1 fixed) over all values of x1 yields a solution surface, which is shown below. The maximum value is still visible, but not as immediately so as in the 1D problem.
[Figure: the objective function plotted against the control variable(s) for a 1D, a 2D, and a 3+D problem. The 1D maximum at CV=1 is obvious; the 2D solution surface is harder to read; the 3+D case cannot be drawn.]
If one imagines a problem with more than 2 control variables, say 3 or 4, it is not possible to visualize
the problem. The solution surface lies in a multi-dimensional terrain which we are unable to fully
conceptualize. As such, we utilize numerical optimization techniques as these are specifically designed
to operate on multidimensional problems.
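Petrel's optimizers are proprietary, but the idea of derivative-free search over a multi-dimensional CV space can be sketched with a simple pattern (compass) search; the cheap quadratic objective below is a stand-in for an expensive simulation run:

```python
# Pure-Python compass (pattern) search, illustrating derivative-free
# maximization: try +/- step along each axis, keep improvements, shrink
# the step when no move helps. This is a sketch, not Petrel's algorithm.
def compass_search(f, x0, step=1.0, tol=1e-4, max_iter=1000):
    x, fx = list(x0), f(x0)
    for _ in range(max_iter):
        improved = False
        for i in range(len(x)):
            for d in (step, -step):
                trial = list(x)
                trial[i] += d
                ft = f(trial)
                if ft > fx:
                    x, fx, improved = trial, ft, True
        if not improved:
            step *= 0.5
            if step < tol:
                break
    return x, fx

# Stand-in objective with a known maximum at (1, 2):
best_x, best_f = compass_search(
    lambda x: -((x[0] - 1.0) ** 2 + (x[1] - 2.0) ** 2), [0.0, 0.0])
```

Note that each call to `f` here stands for a full reservoir simulation, which is why the efficiency of the search matters so much in practice.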
Improving oil production from a field is often a complex problem due to interference and interaction
between wells. For example, reducing the production at one well may increase the production at
another, or injecting water may improve oil production, but the subsequent increase in water
production may prove to be detrimental to the economics of the asset being considered.
In a utopian world, our optimization would yield the global optimum with just a few simulations (ideally
just one). In practice, a few simulations are usually insufficient and the “global optimum” is probably an
unrealistic target to achieve with a finite amount of available computing time. Many local optima for the
problem at hand may exist. These local solutions may be spread quite far from each other and possibly
quite distant from the global one. Unless one can invest in a full and complete exploration of the
solution space, the optimum value returned by the algorithm may prove to be one of these local
solutions. Could one improve on this solution? Possibly yes, but there is a cost. Remember we must pay
a price for more complete exploration of the solution space - the price being more (expensive)
simulation runs. Furthermore, and especially in a search domain of continuous variables, it is impossible to know precisely whether you have achieved the best (global) optimum without first exploring the whole space.
[Figure: a function f(X) over 0 ≤ X ≤ 20 exhibiting a global maximum, several local maxima, a local minimum, and a global minimum.]
As such, we promote the following practical (and somewhat wise) operational definition of what constitutes a "good optimization", as stated by Cullick et al. (2003): “optimization means solution improvement, i.e., it is not necessarily achieving a provably optimal answer but will result in better-quality solutions.” We urge the reader to employ this practical and realistic policy for optimization.
For the optimization to be efficient, the selected control variables should fulfill two main conditions:
• Be influential (i.e., changing them will change [influence] the value of the objective function)
• Be as independent from one another as possible, otherwise they may interfere with each other in the optimization process.
A badly defined problem (objective function and/or CV set) will often require many iterations only to deliver a result that falls short of expectations.
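The influence condition can be screened with a one-at-a-time sensitivity sketch; the toy objective and the 5% bump are illustrative assumptions:

```python
# One-at-a-time sensitivity sketch: bump each CV by +5% around the base case
# and record the change in the objective. Weakly influential CVs (tiny
# impact) are candidates for removal before the real optimization.
def sensitivity(f, base, delta=0.05):
    f0 = f(base)
    return {name: f(dict(base, **{name: value * (1.0 + delta)})) - f0
            for name, value in base.items()}

# Toy objective in which rate1 dominates and rate2 barely matters:
toy = lambda cv: 10.0 * cv["rate1"] + 0.01 * cv["rate2"]
impact = sensitivity(toy, {"rate1": 100.0, "rate2": 100.0})
```

With a real case, `f` would be one simulation per bump, so this pre-screen costs one run per candidate CV.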
2 Petrel Optimization
This section outlines the main steps required to create a (reservoir) forecast optimization in Petrel.
While the workflow is intuitive, one should bear in mind the issues outlined above (and in the appendix).
Note that the case to be optimized does not need to have been generated by Petrel. The only prerequisite is that the case has been run previously. If the case was not generated by Petrel, however, the only control variables that can be selected are numerical values inside the reservoir simulation data deck. Users unfamiliar with raw Eclipse/Frontsim data decks should refer to the respective manuals for details.
Preliminary steps
In all cases, you should make a copy of the original base case, as it will be modified in the process. Once control variables or uncertainty variables have been defined, trying to run the case directly from the Petrel Cases pane will lead to an error, because numerical values have been replaced by $variable strings.
If the model output is large, a very good way to avoid running out of disk space is to remove the RPTRST, RPTSOL, RPTSCHED and INIT keywords and to set all arguments of the GRIDFILE keyword to 0 (zero). The simulation footprint can be reduced further by specifying only those solution vectors (in the SUMMARY section of the data deck) that are actually used by the objective function. For example, if the defined objective function uses only the FOPT, FWPT and FWIT arrays, there is no reason to specify the ubiquitous "ALL" keyword (or other individual solution vectors). Petrel keeps the results of all simulations, so the user should stay vigilant about available spare hard-disk space. Furthermore, if the optimization is to run for several days, XP users should switch from Radia to Radia Light to avoid being rebooted in the middle of an optimization, resulting in an incomplete run and wasted time.
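As a sketch of the trimmed deck (not a complete SUMMARY section), assuming the objective uses only those three vectors:

```
SUMMARY
-- Request only the vectors the objective function actually uses,
-- instead of the ubiquitous ALL mnemonic:
FOPT
FWPT
FWIT
```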
Customers with access to only a few ECLIPSE licenses should also consider using ECLIPSE Multiple Realization licenses. This keeps “basic” ECLIPSE license tokens available for other users in the company. Because an optimization requires a significant number of simulation runs, this solution also prevents a single user from suddenly consuming all of the company's ECLIPSE licenses on their own.
Tutorial
The DEMOOPT project is the result of the execution of this tutorial.
The example presented here aims at maximizing the oil production from a three-well model. First, a case (e.g., EXAMPLE6.DATA) is run outside of Petrel with ECLIPSE, and then loaded into a new Petrel project (from the Cases tab > Import on tree).
A video attached to this document retraces the steps described below. If your video player cannot
recognize the format, please download the Codec from http://camstudio.org/CamStudioCodec14.exe
If the video plays too fast, you can slow it down using your preferred video player. For example, in
Windows Media Player, right click on the play button to select Slow Playback.
Go to the Processes tab and select Define objective function under Simulation. Drop your case into the Simulation field.
Next, select the kind of objective function you want; in our example, we select cumulative oil production. If you choose net present value, you will be asked to fill in the corresponding parameters below (such as the oil price). The last step is to validate everything.
Then select Uncertainty and optimization under Utilities in the Processes tab, and select Optimization as the task. Finally, place the case to optimize in the Base case field above. The workflow will automatically include the “Define objective function” command defined previously.
Select the keyword containing your control variable and click on it:
A text editor will open. Remove the numerical value you want to vary and replace it with a name of the form $NAME. Save in the editor and close it. In the example above, we replaced the rate of each well with a control variable ($CV1, $CV2 and $CV3). Ensure that the variable being defined corresponds to the
control mode of the keyword. A possible error is to have, say, ORAT as a control mode (argument #3 of
keyword WCONPROD) but define the argument for control as the one corresponding to LRAT. The user
should consult the Eclipse reference manual to ensure correct definitions of control variables in
keywords.
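As an illustration (the well name and BHP limit below are hypothetical), a WCONPROD keyword whose oil-rate argument (item 4, matching the ORAT control mode in item 3) has been replaced by a control variable might look like:

```
WCONPROD
-- well    status  mode   ORAT  (items 5-8 defaulted)  BHP
  'PROD1'  'OPEN'  'ORAT' $CV1  4*                     100.0 /
/
```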
A small trick: to avoid opening each keyword separately, open the full data file in an external editor and replace all the control-variable values there with their $ equivalents. Save in the external editor, then go back to the Petrel keyword editor: open any keyword, save, click Apply and return to the workflow screen. All the $values introduced with the external editor should now be loaded.
The names of the control variables are now visible to the left of the Run simulation case step.
You can open any step, such as ‘Make development strategy’, and replace any numerical value there with a $NAME control variable.
You can now see the control variable on the left side:
More control variables can be selected inside Petrel. For example, in addition to the numerical values
from the DATA file, numerical values involved in the case definition prior to the DATA generation (like
well inclination or length) can be selected.
Each variable can be set to one of three modes:
- Disabled: the CV will not be used; it reverts to its base value.
- Control: a control variable that the software will change during the optimization.
- Expression: if you want a control variable to depend on other variables, define it as an expression to be filled in below.
A control variable needs a base value (used when it is disabled), minimum and maximum values, and an initial guess used at the start of the optimization. The Int switch lets you constrain a value to be an integer. Let us consider the following values for the control variables.
Here, you can select the optimizer among Simplex optimizer, Simplex non-linear and Neural network. An important value to consider is the maximum number of iterations: the smaller the case, the more simulations you can afford. If you have time, you can enter a large number; this does not mean that the maximum number of simulations will be run, because if the optimizer reaches a maximum earlier, the optimization will stop. You can also select whether the optimizer will minimize or maximize the objective function (Optimizer mode).
The convergence parameters can be very important: if the control variables appear to have a relatively small influence on the final results, the optimization will stop after just a few steps. You may want to increase the required precision (decrease the convergence value).
The optimization process obviously has no way of knowing anything a priori about the objective function value. The user is the only person who can have an idea of a typical value for the objective function over the domain defined by the control variable bounds. Depending on the optimization problem, the relative variation of the objective function may be large or small. For example, when optimizing a field's economics, a gain of one hundred thousand dollars appears relatively less significant if the total is of the order of tens of millions of dollars than if it is of the order of several hundreds of thousands. If the variations are small and the convergence parameters lax, the optimization will quickly decide that it has reached a final result. The user should therefore input such a typical value as the objective function scale. This is necessary to ensure that the convergence criteria are properly scaled.
Linear constraints (i.e., constraints that are a linear combination of the control variables, for example aX1 + bX2 + … < D) can be input for the Simplex and Neural network optimizers. Non-linear constraints involving more complex expressions can be input only for the Simplex non-linear optimizer, by combining an expression and a constraint.
When the optimization starts, it runs N+1 simulations to construct the simplex or train the neural network. Afterwards it either progresses simulation after simulation in the estimated best direction (Simplex) or tests its predictions (Neural network). If the simulations are run on a cluster, the first N+1 simulations are submitted all at the same time, instead of one after another as on a single-core computer.
Reading the results
Following the optimization run, requesting “Show variables spreadsheet” from the Cases pane will produce a report such as the one above. In the Cases pane, an uncertainty and optimization folder will contain all the simulations.
Remark: Taking the uncertainty into account during optimization
Uncertainty can be taken into account during optimization. To do so, one must select the Optimization under uncertainty task. The uncertainty itself is defined in a way similar to a control variable, by using a $NAME variable either in a Petrel process or in a data file. The Variables tab lets you select the type of uncertainty distribution (Triangular, Normal, Log-normal…).
Lists can be used to import different include files (containing, for example, different reservoir property realizations). For example,
INCLUDE
GRIDPROP_$UNCERT1.INC /
The syntax above makes it possible to use the different grids GRIDPROP_1.INC, GRIDPROP_2.INC and GRIDPROP_3.INC.
A new Uncertainty tab will appear next to the Optimizer tab, where the number of uncertainty grids and the sampling method can be set. At each iteration of the optimizer, the chosen number of simulations will be run.
If run on a cluster, these simulations are executed at the same time. A single objective function value is then calculated from the results of these simulations, depending on each grid's results and the risk considered (more details in the next part).
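How the per-grid objectives collapse into one value depends on the risk setting; a common sketch (an assumption here, not necessarily Petrel's exact formula) is the mean minus a risk-aversion penalty on the spread:

```python
from statistics import mean, pstdev

# Sketch: one risk-adjusted objective from the per-grid results. The
# aggregation formula and risk_aversion value are illustrative assumptions.
def risk_adjusted(objectives, risk_aversion=0.5):
    return mean(objectives) - risk_aversion * pstdev(objectives)

npv_per_grid = [90.0, 100.0, 110.0]   # objective on each uncertainty grid
value = risk_adjusted(npv_per_grid)   # below the plain mean of 100.0
```

A risk-neutral setting would correspond to `risk_aversion=0.0`, i.e., optimizing the plain expected value.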
Frequently Asked Questions/Lessons Learned
If the control variables have almost no effect on the final result or hinder each other (for example, if the
same well has one control variable on the liquid rate and an equivalent one on the oil producer rate),
the risk is that optimization will take a lot of simulations (and time to run), without leading to significant
gains. Optimization needs some degrees of freedom to provide significant gains. Running optimization
on only one or two parameters will likely result in limited gains and one may have been better off
running the model “manually” by analysis of different sensitivities. Similarly, control variables must have
some space to evolve: benefits will probably be small if one bounds the control variables within very
limited range(s). For example if a CV specifying bottom hole pressure of a producer can only vary
between 25 and 29 bars, then there is only limited potential for improvement.
A problem posed in a smart way has a better chance of yielding good results. Consider a SAGD process where a well injects steam above a producer, and the positions of both wells are to be optimized. Rather than defining the injector and producer positions each as a control variable (creating the risk of the producer being placed above the injector in some cases), one can either prevent such a situation through constraints on the variables, or define the producer position as one control variable and the injector-producer distance as another.
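The reparameterized version can be sketched as follows (the depth and separation values are illustrative):

```python
# Sketch of the "smart" SAGD parameterization: optimize the producer depth
# and a strictly positive injector-producer separation, then derive the
# injector depth. The injector can then never be placed below the producer.
def well_depths(producer_depth_m, separation_m):
    if separation_m <= 0.0:
        raise ValueError("separation must be positive")
    injector_depth_m = producer_depth_m - separation_m  # smaller depth = above
    return producer_depth_m, injector_depth_m

prod, inj = well_depths(850.0, 5.0)  # injector sits 5 m above the producer
```

Because the separation CV is bounded below by zero, every sample the optimizer draws is geometrically valid by construction.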
Q: I defined 100 separate CVs in my problem; the optimizer took a long time to run and the results were not as good as expected.
The general rule-of-thumb is to not specify more than around 40 separate CVs in any single optimization
run. This is not an absolute upper limit - rather it represents a practical upper bound for most real-world
situations. One of the reasons the above run took so long is that it takes N+1 runs (where N = 100 in this instance) before the algorithm even starts to optimize: this many simulation calls are needed just to build the simplex (or train the neural network). As for the complaint that the result was less than expected, this is an example of "more" not necessarily meaning "better". By introducing 100 CVs, the optimizer might have found a local optimum, which is not wholly unexpected considering the size and complexity of the 101-dimension solution space it is traversing.
There are two main steps the user can take to remedy this problem:
1: Sub-divide the problem into, say, 3 smaller optimizations, each one containing only, say, 33 CVs each
and launched sequentially with the successive run using the optimum values found from the previous.
Such optimization needs to be conducted with some caution as it requires bookkeeping. Say we sub-
divide our 100-CV simulation model into 3 (arbitrary) regions (call them R1, R2 and R3) - and the time
period for simulation is the same for each region. We first optimize over R1, still including R2 and R3 but keeping their values constant. Once R1 is optimized, these optimum values (say these are LRATs for simplicity) are kept fixed and a second optimization run is launched, this time over the R2 CVs. Once this is done, we keep those CVs fixed and perform a third (and final) optimization run over the CVs belonging to R3. This is not guaranteed to improve the solution, but it can reduce the likelihood of the optimizer landing in one of the many local optima that might be present in a 101-dimension optimization.
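The bookkeeping for this sequential scheme can be sketched as follows; `optimize_region` is a stand-in for one complete Petrel optimization run restricted to one region's CVs (a crude discrete search here, purely for illustration):

```python
# Sequential sub-optimization sketch: optimize one region's CVs at a time,
# holding all other CVs fixed at their current best values.
def optimize_region(f, cv, region, candidates):
    best = dict(cv)
    for name in region:
        for value in candidates:
            trial = dict(best)
            trial[name] = value
            if f(trial) > f(best):
                best = trial
    return best

regions = [["r1a", "r1b"], ["r2a"], ["r3a"]]        # R1, R2, R3 CV groups
cv = {"r1a": 0.0, "r1b": 0.0, "r2a": 0.0, "r3a": 0.0}
f = lambda c: -sum((c[k] - 1.0) ** 2 for k in c)    # toy objective, max at 1
for region in regions:
    cv = optimize_region(f, cv, region, candidates=[0.0, 0.5, 1.0])
```

The toy objective is separable, so the sequential passes recover the full optimum here; with strongly interacting regions this is not guaranteed, which is why the text recommends caution.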
Sector modelling, restarts and optimization are compatible, which means it may be possible to limit simulation times by using a reduced (sector) model or restarts.
2: If one insists on keeping 100 CVs, one should consider restarting a second optimization using the
optimum values found in the first run (see below). The concept of restarting is not exclusive to runs with
large numbers of CVs, as it applies to many optimizations, especially runs with uncertainty present.
If all else fails try a combination of points 1 & 2 (above) or, failing that, perform a sensitivity study to try
and reduce the number of CVs being declared.
Another way to check if you have unfortunately located a local optimum is to restart the optimization,
but this time from a significantly different initial set of control variable starting values (initial guesses).
Q: Why did some of the optimization runs take much longer than others?
It is good practice to "stress test" (double check) your data to ensure that it indeed functions properly
over the range of values entered for the control variables. You should also ensure that the stop limit for
the different message types in Eclipse/Frontsim keyword "MESSAGES" is not too low. Also check that the
TUNING family of keywords is not going to hamper an optimization run.
One should also try to avoid any PVT extrapolation, for this will often lead to convergence problems, longer simulations and possible crashes. This includes ensuring that the PVT is defined over the range of
situations generated by the control variables. Use EXTRAPMS 3 to test for extrapolations. If these
extrapolations, even erroneous, lead to better results (by underestimating viscosity at higher pressures,
for example), the optimization process will continue and new cases will be generated. This will only lead
to erroneous conclusions as we have found a good optimum on an incorrect model.
Q: Should I just define an optimization case and run it directly?
No. It is usually worth running a sensitivity study on the control variables first. Such a study can:
- Indicate which control variables were the most influential on a given problem. For example, the
tornado plot below displays the influence of each of the control variables on the final net
present value of a case. We can see that the last four ones have absolutely no effect.
- See if the simulations could run into problems or errors when certain values are used.
- Obtain a better understanding of the effect of each control variable on the final result.
- Better define the problem: if the sensitivities indicate that setting far too high a pressure limit for a producer well leads to a disaster, it may be valuable to make sure that the values this producer can take during optimization never reach that level. The time spent on a pre-analysis of the variables can be more than repaid by the gain in optimization time.
The optimization features of Avocet Integrated Asset Management also use the same solver library as Petrel.
E300 has a gradient-based optimization option. When set up properly, it can give results faster than the Petrel optimization module. The gradient method, however, makes it more prone to becoming trapped in overly local optima (underwhelming gains). It also has many limitations.
Miscellaneous:
Time can be a control variable: operations can be divided into variable-length periods, or a specific well can be opened at a particular time. Intouch 5029249 details how to do so.
Flexible field management, particularly ACTIONX (trigger a set of actions on defined conditions), UDQs (user-defined summary vectors) and UDAs (user-defined arguments), offers many possibilities. In particular, you can use a UDQ to define your own objective function (Intouch 5029199).
References:
Bailey, W.J., Couët, B., and Wilkinson, D., Framework for Field Optimization to Maximize Asset Value,
SPE 87026-PA, SPE Reservoir Engineering J., 8, 1, February 2005, pp. 7-21.
Bailey, W.J., Couët, B., and Prange, M., Forecast Optimization and Value of Information Under
Uncertainty, in Uncertainty Analysis and Reservoir Modeling, Y. Z. Ma & P. R. LaPointe (editors), Chapter
14, AAPG Memoir 96, 2010, pp. 1-17.
Conn, A.R., Scheinberg, K., and Vicente, L.N., Introduction to Derivative-Free Optimization, MPS-SIAM
Series on Optimization, SIAM, Philadelphia, 2009.
Couët, B., Bailey, W.J., and Wilkinson, D., Reservoir Optimization Tool for Risk and Decision Analysis,
Proceedings of the 9th European Conference on the Mathematics of Oil Recovery, Cannes, France,
August 30 - September 2, 2004.
Couët, B., Djikpesse, Wilkinson, D., and Tonkin, T., Production Enhancement through Integrated Asset
Modeling Optimization, SPE-135901-PP, 2010 SPE Production and Operations Conference and
Exhibition, Tunis, Tunisia, June 8-10, 2010.
Cullick, A.S., Heath, D., Narayanan, K., April, J., and Kelly, J., Optimizing Multiple-Field Scheduling and
Production Strategy with Reduced Risk, SPE 84239, Annual Technical Conference and Exhibition, October
5-8, 2003, Denver, Colorado.
Gossuin, J., Bailey, W.J., Couët, B., and Naccache, P., Steam-Assisted Gravity Drainage Optimization for
Extra Heavy Oil, Proceedings of the 12th European Conference on the Mathematics of Oil Recovery,
Oxford, UK, September 6-8, 2010.
Nelder, J.A. and Mead, R., A Simplex Method for Function Minimization, Comput. J., 7, 1965, pp. 308-
313.
What follows is an extract from the following book chapter:
"Forecast Optimization and Value of Information under Uncertainty", by William J. Bailey, Benoît Couët and Michael Prange, in Y. Zee Ma & Paul R. LaPointe (editors), "Uncertainty Analysis and Reservoir Modeling", AAPG Memoir Series #96, Chapter 14, scheduled for publication late 2010.
Schlumberger-Doll Research
Optimization algorithms provide methods to explore complex solution spaces efficiently and accurately
to achieve a desired outcome. Optimization problems are common in our daily lives. If planning to drive
a car, one usually decides on the "best" (optimum) route to the desired destination. For oilfield
exploration and development, optimization can take many forms, but essentially the goal is to maximize
recovery, total production, or net monetary profit from the asset. This chapter discusses forecast optimization.
Introduction
The goal in optimization is to find the optimum value of an objective function. This article focuses on
optimization methods and practices applicable to computationally expensive objective functions (those
with long computational run times). This governs much of how we approach optimization and
necessitates efficient use of time as each function evaluation (often referred to as a "trial") may take
many hours to complete. This is an important distinction, for the literature is rich with methods and
techniques developed and proven on a variety of problems with names such as “traveling salesman”,
“knapsack”, “airline scheduling”, where the underlying function being optimized requires only relatively
short computational times. This is not to say that such optimization problems are mathematically trivial
or computationally simple — indeed the opposite is often true. Obtaining the globally optimal solution
to the “traveling salesman” problem is very complex and non-trivial. However, when considering which
optimization approaches to use to solve problems with expensive underlying functions, such as those
we use to model hydrocarbon assets, flows and structure, many standard optimization approaches
become untenable because they would result in overly long computation times.
Another facet of many conventional optimization approaches is their need for derivative information,
which is often not available with engineering simulators. As such, we assume throughout that such derivative information is unavailable.
Valuation is somewhat more elusive as a term. It is easy to compute a simple Net Present Value (NPV)
using a spreadsheet and basic data. This type of valuation is trivial. However, where “valuation”
becomes complex is when one introduces the concept of future operational decisions which materially
impact the underlying value of the asset and also when uncertainty is present. This type of valuation is in
the realm of "Real Options" (RO), which refers to valuation under uncertainty and flexibility. While real
option valuation has been the subject of many texts, the debate continues over its usefulness in making
practical engineering decisions. A detailed discussion of real options is beyond our scope here (see
Wilkinson et al., 2005). Nevertheless, the presence of uncertainty does impact our ability to formulate a
meaningful valuation when we also have some operational flexibility, which itself is related to the
optimization.
We first introduce an example problem that will be used throughout this article, followed by definitions
of essential terms and concepts. Our example will be used to demonstrate a number of concepts
including a deterministic optimization involving well completions, well shut-in criteria and production
targets. This example is then expanded to include reservoir uncertainty, and finally is used to show the
value of obtaining better fluid gradient information from this reservoir over a conventional single-point
fluid sample.
Figure 1 shows two perspectives of the model we shall be using throughout this chapter. The model has
two producing wells (#1 and #2) and the reservoir is not compartmentalized. Other pertinent details are given in the caption of Figure 1.
Figure 1: Model used for all the examples. (a) shows the field looking northeast. (b) shows the location of the 2 producing wells (#1 and #2) located on the crestal part of the field, with green sections representing individual completions. There is an aquifer located on the eastern flank. Other details of the model are as follows:
Grid dimensions: I=16, J=40, K=25 (16,000 cells in total)
Thickness of pay at well #1: 400 ft
Simple Example
Given that the two producing wells in our example are in communication, a reasonable question would
be: what is the best way to allocate production in each producing well in order to maximize a specific
objective function (e.g., NPV)? One way would be a “brute force” approach whereby we compute each
and every possible combination of the flow rates in each well, allowing us to construct the complete
solution surface of NPV versus flow rates. From this we can then locate the flow rates corresponding to
the maximum of the objective function. This method will certainly furnish the best solution, but such
exhaustive enumeration is clearly inefficient and almost certainly would require so much time that it is
computationally intractable. Figure 2 illustrates an objective function on the vertical axis plotted for all
possible combinations of flow rates for the two wells. Three distinct maxima (peaks) are apparent in this
fully enumerated solution space, while only one of these is the best solution to the optimization
problem.
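To make the brute-force idea concrete, the sketch below enumerates a coarse grid of candidate rates. The `npv_model` function is a cheap toy stand-in for the expensive simulator-plus-economics run; its shape and numbers are purely illustrative and are not taken from the example model.

```python
import itertools

def npv_model(q1, q2):
    """Stand-in for the expensive simulation + economics run.
    In practice each call would be a full reservoir simulation."""
    # A toy single-peaked surface, purely illustrative.
    return -((q1 - 18_000) ** 2 + (q2 - 22_000) ** 2) / 1e6 + 400.0

# Brute force: evaluate NPV at every combination of candidate rates.
rates = range(0, 50_001, 1_000)                      # blpd grid for each well
grid = [(q1, q2, npv_model(q1, q2))
        for q1, q2 in itertools.product(rates, rates)
        if q1 + q2 <= 50_000]                        # honour the facility constraint
best = max(grid, key=lambda t: t[2])                 # (q1, q2, NPV) at the maximum
```

Even this coarse 1,000-blpd grid requires roughly 2,600 evaluations; with a simulator that takes hours per run, full enumeration is plainly intractable.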
In this example, it is possible to visualize the solution surface only because the number of controls is
small. When we expand the number of controls to more than two, it becomes much more difficult to
compute and to view the solution space. The surface we wish to traverse becomes a hyperspace — a
multi-dimensional object — that is nigh impossible to conceptualize, let alone draw. This inherent
difficulty is why efficient optimization algorithms are needed.
Figure 2: Solution surface of a simple problem involving 2 production wells (#1 & #2). The vertical axis is
the objective function, F (e.g., NPV) plotted against all possible values of the two control variables. The
plot reveals 3 discernible peaks (or local maxima) and one global maximum (the solution).
Definition of Terms
It is important to define what key terms mean, as there are some contradictory uses in the literature.
Objective Function F: The optimization approaches we discuss here are designed to maximize (or
minimize) a single function of one or more variables, and are specifically targeted to functions that are
expensive to evaluate. The form of this function can be very simple in nature (e.g., the total oil
production as forecast by a reservoir simulator) or it may be complex (e.g., the post-tax NPV computed
from the composition of output from both reservoir and financial simulators). It could be a misfit
function, such as those used to minimize error in history matching or some inversion process.
A key issue in optimization is the proper identification of F, both in terms of the values being maximized
and the functional arguments, as this may have a large impact on the success of the optimization. This
step is not always as straightforward as one would expect. An NPV-type function provides a good deal of
flexibility and allows one to penalize non-revenue generating fluid streams and other facets impacting
operational viability. Apart from time, there are two main obstacles to effective optimization: objective
functions that are overly flat and objective functions that are overly rough. Like a desert floor, a flat
solution surface offers no feature in the solution terrain from which the optimizer can establish its
bearings. Roughness, on the other
hand, makes it difficult for an optimizer to anticipate the function values at points that are near
neighbors to the existing trial set. Such roughness may be a feature of the true solution surface (some
solution surface roughness is real) or it may be spurious (e.g., resulting from poorly-constructed black-
box functions). If one is unlucky, roughness may come from both sources. Either way, roughness is
ruinous as it is both time consuming (as the optimizer jumps around a multi-peaked surface) and worse,
misleading (it may stumble across a noise-induced spike that is not a real solution). Unless one is deeply
cognizant of the problem at hand, there is almost no way to know the nature of the solution surface
beforehand. Thus, to achieve good optimization results, one must ensure that the choice of control
variables, simulators and objective function is such that the solution surface is neither too flat nor too
rough.
Global vs. Local: Cullick et al. (2003) stated that “optimization means solution improvement … it is not
necessarily achieving a provably optimal answer but ... better quality solutions”. This maxim provides a
philosophical beacon for our approach to optimization, and acknowledges the fact that provably global
solutions are, in practice, unlikely to be found unless one has sufficient time to fully explore the solution
space. While some optimizers claim to provide global optima (albeit unproven), such capabilities will
come at the cost of a large number of function evaluations. Figure 2 shows the global maximum (the
highest peak on the plane) along with two sub-optimal local maxima (the two lesser peaks). However,
such an illustration is impractical for most optimization problems because it requires a full enumeration
of the whole solution space — a luxury we cannot afford with real multi-dimensional problems.
Control Variable: A control variable (CV), also called a decision variable, is defined as a parameter of the
optimization problem that we have physical control over. For example, well flow rate is a legitimate CV
because it can be manipulated by the operator in order to maximize reservoir output. Permeability and
oil price, for example, are not legitimate CVs in a forecast reservoir optimization problem because they
are not controllable by the operator, even though their values will most likely have a large impact on the
objective function value. Such impactful variables are separately treated as uncertainty variables, whose role is discussed below.
We must identify the most suitable set of CVs for the problem at hand. This is not always
straightforward. It is too easy to throw every available parameter at the optimization engine and hope
that it will find its way out of the tangle of parameters. This is poor practice and is computationally
wasteful, as effort is spent optimizing parameters that have little or no impact on the objective function.
Instead, one should perform some form of preliminary sensitivity analysis to select the best candidate
controls. This can be undertaken either manually (through simple batch scripts) or in a semi-automated fashion.
It should be noted that some optimization approaches make no distinction between CV and uncertainty
variables, allowing both to be arguments of the objective function. History matching is such an example,
as is any inversion process that tries to constrain the possible values of uncertain variables based on
fitting predictions to observations. Sensitivity analysis, while not strictly a formal optimization process, is
another example in which it is sensible to allow both CVs and uncertainty variables as arguments of the
objective function. However, this approach is not appropriate for complex forecasting objectives, as we
shall see.
Constraint: A constraint is a requirement that the CVs must satisfy in order to be valid solutions of the
optimization problem. For example, a bounds constraint requires that the CVs must lie between pre-specified upper and lower bounds. During optimization, any sample of a CV that exceeds these bounds is rejected.
Two main classes of constraints exist in optimization: linear and non-linear. Linear constraints place
bounds on a linear function of the CVs. An example of a linear constraint in our example problem is that
the combined production from both wells (Q1 and Q2) must not exceed some known value, e.g., Q1 +
Q2 ≤ 50,000 blpd. Non-linear constraints place bounds on a non-linear function of the CVs. These are
further classified as either “cheap” or “expensive”, based on the time needed to compute the non-linear function. An expensive constraint,
for example, might itself be computed from an expensive reservoir simulator. If our two wells are
connected to a facility by a flow line, an upper temperature limit in that flow line can only be computed
using the full simulation model, with the CV, flow rate, being a key input. Some optimization problems
have mixtures of both linear and non-linear constraints. The nature of the constraints dictates which
optimization algorithms are suitable.
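The two constraint classes can be sketched as follows. The flow-line temperature correlation, its limit, and all numbers are hypothetical placeholders for what would, in reality, be an expensive simulation run.

```python
def satisfies_linear(q1, q2, cap=50_000):
    """Cheap linear constraint: combined liquid rate must not exceed the cap."""
    return q1 + q2 <= cap

def satisfies_nonlinear(q1, q2, flowline_temp_model, t_max=210.0):
    """Expensive non-linear constraint: a flow-line temperature limit.
    `flowline_temp_model` stands in for a full simulation run."""
    return flowline_temp_model(q1, q2) <= t_max

def toy_temp(q1, q2):
    # Hypothetical correlation, NOT a real flow-line model.
    return 150.0 + (q1 + q2) / 1_000.0

# A trial point must satisfy both classes of constraint to be valid.
ok = satisfies_linear(20_000, 25_000) and satisfies_nonlinear(20_000, 25_000, toy_temp)
```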
Uncertainties (public / private): These are parameters that impact the objective function but are
outside of our control. Being uncontrollable clearly means that they are unsuitable candidates as CVs.
We distinguish between public and private uncertainty. For example, oil price is a public uncertainty
because, while it is publicly known, it is outside of our direct control. Permeability, on the other hand,
is a private uncertainty because it is not known publicly and is also outside of our control. Private
uncertainties typically relate to actual physical uncertainties in the reservoir and may or may not be
reasonably well understood. One could perform measurements to better resolve private uncertainty
variables, but their values are still not known with certainty.
In order to account for uncertainty in a meaningful and consistent manner, we need to have some idea
about the statistical distribution of possible values prior to initiating an optimization. The impact of this
distribution is that the objective function, F, will have a distribution of possible values for each choice of
the CVs (see Fig. 3). If we choose to do so, we may invest in some measurement or analysis to reduce, or
refine, a particular uncertainty, and it is this facet that lies at the heart of the valuation approach we
discuss later.
Figure 3: Uncertainty results in a range of values for the objective function, F, for any point on the
solution surface. The range of values associated with the unique combination of controls (marked CV1
and CV2) is shown. The magnitude and distribution of the variation of F might not be uniform and might
vary depending on the location on the solution surface. The range and distributional form of F is labelled
F (CV,U) where CV is the unique set of controls applicable to that particular point.
Proxy Model: This is an inexpensive function that tries to emulate an expensive function. We use
proxies to reduce the number of calls to expensive functions, replacing them with calls to an inexpensive
surrogate. Many proxies are discussed in the literature, such as
experimental design, neural networks and radial basis functions. However, proxies tend not to model
the physics of the problem itself, and as such they should not be used alone to make design or capital
decisions. A few general guidelines on using proxies should be stated. The first is the need to
continually compare proxy predictions against the actual (expensive) physical model and, if possible,
revise the proxy accordingly using the new information. The second is to use proxies for the
objective function itself and not as a proxy for the physical model. The third is to understand the
best method to train the proxy. With too many training points, the net gain of using a proxy is
diminished. With too few, the proxy may be unable to perform properly. An empirical rule-of-thumb is
to initially make a linear approximation of the objective function, i.e., to pick N+1 data points for the
initial training set, where N is the number of control variables.
Deterministic: This refers to optimization problems in which there are no uncertainty variables.
Basic Approach and Preliminary Steps to Optimization
We restrict our discussion to the approaches proven useful for our brand of optimization: forecasting
with expensive black-box functions. There are two traditional approaches to optimizing problems with
expensive functions: replace the expensive function with a proxy and then use one of the many
optimization algorithms suitable for inexpensive functions, or perform the optimization directly on the
expensive function using specialized optimizers. The former might sacrifice some accuracy for speed,
while the latter has the advantage of honoring the actual underlying objective function exactly at the
expense of high computational cost. Note that optimization methods take their simplest form for
unconstrained problems. Techniques for transforming a constrained problem into an unconstrained one, such as penalty methods, are well established.
The methods we have found most useful for forecast optimization involve a hybrid of these approaches.
A proxy function (e.g., neural network, radial basis functions) is first trained on samples from the
expensive function, and is then used with a standard solver to compute an optimum objective function
value along with its corresponding CVs. This result is then tested against the expensive model to ensure
that the objective function value is in agreement with the proxy, within some tolerance, for the same
CVs. If not, this new trial point, (CVs, F), is included in the re-training of the proxy, and the process
continues until the stated matching tolerance on the optimal value is achieved.
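The hybrid loop just described can be sketched as below. To keep the sketch self-contained we use a deliberately crude nearest-neighbour surrogate in place of a neural network or radial-basis-function proxy, and a dense scan in place of a standard solver; the function names, the one-dimensional CV, and the tolerances are our own illustrative choices.

```python
import random

def hybrid_optimize(expensive_f, lo, hi, n_train=11, tol=0.01, max_iter=50):
    """Proxy-assisted loop: fit a cheap surrogate to samples of the expensive
    function, optimise the surrogate, verify the candidate against the
    expensive function, and retrain on any mismatch."""
    random.seed(0)                                    # reproducible training set
    xs = [lo + (hi - lo) * random.random() for _ in range(n_train)]
    data = {x: expensive_f(x) for x in xs}            # expensive evaluations

    def proxy(x):
        # Nearest-neighbour surrogate: a crude stand-in for a trained proxy.
        return data[min(data, key=lambda p: abs(p - x))]

    cand, true_val = None, None
    for _ in range(max_iter):
        # "Optimise the proxy": a cheap dense scan of the surrogate.
        cand = max((lo + (hi - lo) * i / 200 for i in range(201)), key=proxy)
        true_val = expensive_f(cand)                  # one verification run
        if abs(true_val - proxy(cand)) <= tol * max(abs(true_val), 1e-9):
            return cand, true_val                     # proxy and truth agree
        data[cand] = true_val                         # retrain with the new point
    return cand, true_val
```

Each pass through the loop costs exactly one expensive evaluation, which is the whole point: the many trial evaluations happen on the cheap surrogate.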
Preliminary Steps: There are a few practical issues to consider before conducting any optimization. First,
it is essential to identify the appropriate objective function, CVs and constraints, as discussed earlier, in
order to focus the optimization on the appropriate sub-problem of interest and to identify a suitable
optimizer.
Another input is the solution tolerance, which dictates when satisfactory solution convergence has been
reached. There is a trade-off between fine accuracy (small tolerance) and speed (large tolerance). Set
the tolerance too small and the optimizer may continue to run for a long time, meandering about a final
solution for many iterations before reaching convergence. Conversely, if the tolerance is too large, then
we will most likely find a suboptimal value — although optimization run time will be much shorter.
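The fractional-improvement stopping test described here can be written as a small helper; the function name and the division guard are our own illustrative choices.

```python
def converged(f_prev, f_curr, tol=0.01):
    """Stop when the fractional (dimensionless) change in the objective
    between successive trials falls below `tol`."""
    # Guard against division by zero for a zero-valued objective.
    return abs(f_curr - f_prev) / max(abs(f_prev), 1e-12) <= tol
```

A tolerance of 0.01 accepts convergence once successive trials improve the objective by less than 1%.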
Choosing appropriate CV starting values for the optimization is important because an optimizer may find
different optima depending on the starting values. As we shall discuss later, good practice states that we
should, if time permits, restart an optimization using the best values from a previous run. These starting
points also furnish us with the baseline upon which we may judge the quality of the optimization itself.
It is important that the baseline value be “reasonable”, in other words that we use our best guess
and/or judgment to provide an initial value that makes sense and furnishes a cogent benchmark against
which the quality of the optimization can be judged.
The steps described in the previous section were applied to the example shown in Figure 1. The
CVs used are shown in Table 1. The forecast period for our optimization is two years. Table 2
lists the remaining optimization inputs.
3 Maximum water cut to trigger shut-in of Well #1 95% 55% 100%
4 Maximum water cut to trigger shut-in of Well #2 95% 55% 100%
Table 1: Control variables used in the 2-well problem. Period 1 - January through the end of June. Period
2 - July to end of December. Period 3 - the whole of the 2nd year of the forecast (12 months).
Objective Function, F NPV (computed at the end of the 2 years of the production forecast). Computed using
a discounted cash flow model with the following parameters: Oil Price $56.75/bbl
(oil); Lift Cost $17.85/bbl (liquid); Water processing cost $2.75/bbl (water); Fixed
Costs $5k/day; Gas processing cost $1.57 /Mscf; discount factor 4% p.a.
Constraint Q1+Q2 <= 50,000 where Q1 and Q2 are the liquid flowrates (blpd) of Well #1 and
Well #2 respectively (a linear constraint).
Convergence tolerance 0.01. This is the fractional (thus dimensionless) improvement of the objective
function between iterates (i.e., trial n and trial n-1). Typically, setting a smaller
tolerance value results in longer run-times. The value used here is deemed suitable
for most cases.
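A minimal discounted-cash-flow sketch using the Table 2 economic parameters might look as follows. The yearly time-step and the (oil, water, gas) stream layout are simplifying assumptions of ours, not the actual economics engine.

```python
def npv(daily_streams, oil_price=56.75, lift_cost=17.85, water_cost=2.75,
        gas_cost=1.57, fixed_cost=5_000.0, discount=0.04):
    """Discounted cash flow with the Table 2 parameters.
    `daily_streams` is a list of (oil_bpd, water_bpd, gas_mscfd) tuples,
    one per year of the forecast (a coarse, illustrative time-step)."""
    total = 0.0
    for year, (oil, water, gas) in enumerate(daily_streams, start=1):
        liquid = oil + water                       # lift cost applies per bbl liquid
        daily_cf = (oil * oil_price - liquid * lift_cost
                    - water * water_cost - gas * gas_cost - fixed_cost)
        total += daily_cf * 365.0 / (1.0 + discount) ** year
    return total

# One illustrative forecast year: 10,000 bopd, 2,000 bwpd, 5,000 Mscf/d.
example_npv = npv([(10_000.0, 2_000.0, 5_000.0)])
```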
The initial guesses (Table 1) were obtained from an engineering design and yield an NPV of $411.3
million. This will be the baseline upon which we can determine the quality of our optimization.
Figure 4 shows the objective function value (NPV) against trials (each trial being a call to the underlying
Eclipse E300 simulation model). Here we compare the results of two optimization approaches. The
difference between the two approaches is that one directly evaluates the expensive objective function
value throughout the optimization process, while the other follows the strategy of optimizing a proxy,
confirming the proxy solution against the expensive objective function value, and, if the proxy solution is
a poor match, retraining the proxy and re-optimizing. As determined by the given convergence tolerance
(Table 2), the direct optimizer converged after 28 trials with a value of $444.4 million, an 8.05% gain over the baseline.
The same problem was then optimized using a proxy. Initial training of the proxy used, for convenience,
the first N+1 points evaluated by the direct optimizer — hence the first 11 points in Figure 4 are
perfectly coincident for the two approaches. Including training points, the proxy approach converged
(using the same tolerance) after 23 trials with $449.2 million, a 9.21% gain.
Figure 4: Deterministic optimization of the example problem using the direct approach and the proxy-
enhanced approach.
A legitimate question would be: Why does one optimization scheme provide better results than another?
One could also add: why bother with the direct optimizer at all if the proxy approach is clearly better
and faster? The answer, however, is not so clear cut and demonstrates why we cannot prescribe a single
optimization method for all applications. While we do not discuss the internal workings of the
optimizers here, the proxy approach provides a better solution — for this particular case — because of
its iterative re-training, as mentioned earlier. In so doing, it found a better optimum while the direct
approach was clearly caught in a local maximum from which it could not escape.
There was no way of knowing beforehand that the proxy approach would find a better optimum.
Indeed, if we had trained the proxy differently, then there is a possibility that it may not have performed
better. Conversely, a different training set may have allowed it to find an even better optimum than
shown here. This is the fate of optimization — the final solution is contingent not only on the problem at
hand, but also on how you initialize the process and what method you use to get there. In response to
the second question, "Why bother with direct optimization at all?", the answer is “We cannot tell
whether one optimization method is going to be better than another until we try it.” Admittedly, this is
not particularly satisfactory when we have expensive functions and only limited time and resources to
obtain a solution. This is where experience plays a part — and even then there is no guarantee that an
experienced user will find the best optimum possible or choose exactly the best optimizer. Nevertheless,
it is unwise to simply throw a problem blindly at an optimizer while hoping that it will find a good
solution. This is why the preliminary activities discussed earlier are so crucial to the whole process.
Restarting
Another good practice is to re-launch the optimization using the best values of the previous optimization
as the initial guess. Although this means increasing the total computational run-time, the new (and
better) initial starting point will result in a different initialization and the optimizer may locate a better
optimum. Nonetheless, there is no guarantee that a better optimum will be found. Figure 5 shows the
results of a single restart of the deterministic problem shown earlier. While the proxy delivered a better
optimum than the direct approach in the first optimization, the reverse is true after the second, restarted
optimization, and at the expense of more trials. Because we are working with such expensive functions,
the law of diminishing returns applies, and it is up to the user to decide the appropriate balance of study
time and reward. Unfortunately, no universal rules about when and how one should restart can be
stipulated.
Figure 5: This shows NPV against trial number for the deterministic problem for both the direct and the
proxy approaches. In both cases, a second optimization (dashed lines) is launched immediately after the
first, but using the best values of the Control Variables (CVs) obtained from the previous optimization run.
We have outlined the basic approaches to optimization without uncertainty. Using the example
previously described, we now add reservoir uncertainties in the form of uncertain oil-water contacts and
aquifer strengths. Financial (public) uncertainty is beyond the scope of this article; suffice it to say that
public uncertainty can introduce additional complexities to a problem. As such, the discussion is
restricted to reservoir (private) uncertainty (see Wilkinson et al., 2005, for a discussion of this issue).
Defining Uncertainties
Even under ideal conditions (numerous logs, history, analogs, cores, etc.), much about the subsurface
hydrocarbon reservoir remains uncertain. We may reduce uncertainties through measurements and
testing, and we may have extensive experience that narrows down the range; nevertheless, residual
uncertainty will remain and, as such, should be accounted for in our analysis. The critical point to
note here is that we should make every attempt to identify the critical uncertainties and only bring them
into the calculation. While one may be tempted to include all the uncertainties into the analysis, we
shall explain how this is not expedient and will result in unduly long computational run-times. The
following steps should provide a rough guide to good practice for optimization with uncertainty:
1. Identify all major uncertainties in the system that will likely impact the objective function. There is
no point in including an uncertainty that has no impact whatsoever on the objective function.
2. Establish reasonable ranges for these uncertainties. This step is actually non-trivial and can be the
most challenging of all to properly define. Expert interviews and/or panels, data collection, etc., can all help here.
3. Perform a sensitivity analysis on the uncertainties. Various methods exist to aid in this process. A
popular one is Experimental Design (ED), which can be illuminating in identifying those uncertainties
that most impact the objective function. ED can also be used to identify important CVs. Other
approaches could include evaluating the model at the lower and upper bounds of each uncertainty in turn.
4. Based on these results, bring into the optimization the single (most impactful) uncertainty or
possibly the top two uncertainties, unless the available time and computational resources will allow
more.
5. Once the main uncertainties are identified, one needs to establish their sampling points. These may
be samples from a continuous probability density function of a given form (i.e., normal, log-normal,
etc.). Alternatively, these may be discrete lower-, upper- and mid-values of a parameter of unknown
distribution.
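The two sampling styles in step 5 can be sketched as follows. The continuous distribution, its mean, and its spread are assumed for illustration only; the discrete levels are those of the OWC uncertainty used in the later example.

```python
import random

random.seed(42)  # reproducible draws for the sketch

# Continuous: draw OWC samples from an assumed normal distribution
# (mean 15,400 ft, spread 40 ft; both purely illustrative).
owc_continuous = [random.gauss(15_400.0, 40.0) for _ in range(3)]

# Discrete: low / mid / high values when the distribution form is unknown.
owc_discrete = [15_340.0, 15_400.0, 15_460.0]
```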
Optimize the Mean
The deterministic example shown previously provides a simple case where the objective function was
deterministic. However, when uncertainty is present, each objective function evaluation results in a
distribution of values. While we are still optimizing for NPV, instead of a single value, we will have a
range of values of NPV for each sample of the CVs. For example, assume that the location of the oil-
water contact (OWC) is the main uncertainty in our asset and that everything else is known. We sample
this parameter at 3 specific points (low, medium and high). Without any optimization, we know that we
will obtain 3 different values of NPV for each sample of the CVs. Which of these NPV values do we
optimize over? One incorrect approach is to perform a deterministic optimization for each of the
samples of uncertainty separately and then take the mean of the results. This approach, taking the
mean of the optima, is erroneous because it is unrealizable: the optimizer has not found the single set
of CVs that performs best across all realizations.
In contrast, the accepted approach in its simplest form is to take the optimum of the mean. This yields
the risk-neutral solution to the problem. The risk-neutral solution is one in which the values being
optimized are equally likely to be overestimates as underestimates. However, we have been successful
in applying a generalization of this approach, one which includes an allowance for the risk tolerance of
the decision maker. Allowing for risk tolerance means that a risk-averse person could optimize on, say,
the P10 value, meaning that there is only a ten percent chance that the actual value will be less than
the optimal P10 value. Of course, this prediction of a “ten percent chance” is contingent on the
simulation model being correct, the simulator itself being correct, and the input uncertainties being
properly quantified. The prediction is just the logical consequence of all the input and engines used. We
express the risk tolerance by using the following simple objective function:
Fλ = µλ − λ σλ ,   (1)
where λ is a user-defined, dimensionless, risk preference factor, Fλ is the objective function (NPV being
the underlying value in the case of our example) for that specific value of λ, and µλ and σλ are the mean
and standard deviation (for a specific value of λ) of the ensemble of NPVs resulting from the model
uncertainties for a given set of CVs. The utility of this formulation is not only that we have a single value
for our objective function, but also that this Fλ is accompanied by a specific set of CVs — ones which we can
actually apply operationally to the asset. Furthermore, by including risk preference in the optimization,
this approach allows one to place a value on reducing the uncertainty on a parameter through measurement.
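Equation (1) is straightforward to compute from an ensemble of NPVs. The ensemble values below (in $MM) are invented for illustration, echoing the six realizations (3 OWC levels × 2 aquifer strengths) used later.

```python
import statistics

def f_lambda(npv_ensemble, lam):
    """Risk-adjusted objective of Eq. (1): F_lambda = mean - lambda * stdev.
    `npv_ensemble` holds one NPV per uncertainty realization for a fixed
    set of control variables."""
    mu = statistics.mean(npv_ensemble)
    sigma = statistics.pstdev(npv_ensemble)   # population st. dev. of the ensemble
    return mu - lam * sigma

# Example: 6 invented realizations (3 OWC levels x 2 aquifer strengths), $MM.
ensemble = [402.0, 418.0, 431.0, 396.0, 425.0, 440.0]
risk_neutral = f_lambda(ensemble, 0.0)   # lambda = 0: just the mean
risk_averse = f_lambda(ensemble, 2.0)    # lambda = 2: heavily penalise spread
```

Raising λ pulls the objective below the mean in proportion to the spread, which is exactly what a risk-averse decision maker wants penalised.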
We do not suggest that this is the only approach to defining an objective function under uncertainty —
utility functions and percentiles are also possible. However, what has been presented here has proven
effective in practice.
Number of Realizations
The standard approach to drawing p samples in an M-dimensional space is to first rotate the coordinate
system so that the model variables are uncorrelated. There are then a number of sampling approaches
that can be applied to draw random samples from these uncorrelated variables. The simplest approach,
which is appropriate when the number of variables is small, is to independently sample the distribution
of the i-th variable with mi realizations. Then the total number of realizations, p, used to describe the
uncertainty is the product p = m1 × m2 × … × mM.
This approach becomes geometrically more expensive as the number of dimensions grows. There are, of
course, many other approaches to sampling the uncertainty more efficiently when the number of
dimensions is large (see Prange et al., 2008), but this simple approach will be used as the focus of our
discussion. As p refers to the number of simulation realizations needed to describe the uncertainty in
the objective function, F, for each trial CV, we can see that the computational load grows rapidly as the
number of uncertain variables and/or the values of the mi increase. For practical purposes, we need to
keep the number of realizations, p, as small as possible. Parallel processing certainly helps in reducing
total runtime, but that will depend on the computational capability available.
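The realization count for independent sampling is simply the product of the per-variable sample counts, as a quick sketch shows. The OWC levels are those of the example (Table 3); the aquifer permeability values are hypothetical, since the document does not state them.

```python
import itertools

# Each uncertainty is sampled independently at m_i discrete values, so the
# number of realizations is the product of the counts: p = m1 * m2 * ...
owc_samples = [15_340.0, 15_400.0, 15_460.0]   # 3 oil-water contacts (ft)
aquifer_perm_samples = [50.0, 500.0]           # 2 aquifer permeabilities (mD, hypothetical)

realizations = list(itertools.product(owc_samples, aquifer_perm_samples))
p = len(realizations)   # 3 * 2 = 6 simulation runs per trial set of CVs
```

Adding a third uncertainty sampled at three levels would triple p to 18, which is why the text urges keeping the number of uncertainties small.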
The efficient frontier was first proposed by Harry Markowitz (1952, 1959) for portfolio optimization. His
original work consisted of a plot of return versus risk for an ensemble of portfolios. The concept of an
efficient frontier carries over directly to optimization under uncertainty.
Figure 6. The efficient frontier formed from the optimization of the example problem with two
uncertainties. Each point represents the solution of 6 realizations to sample the uncertainty space. The
frontier is formed from the points shown for each value of risk preference factor λ. The region above the
efficient frontier, as indicated by the “Unfeasible point" shown, is unrealizable space. Conversely, a point
lying beneath the frontier, as shown, is sub-optimal.
In optimization, we can think of the µ and σ in Eq. (1) as the “return” and “risk”, respectively. The trade-
off between expected return and risk preference can be visualized by running a suite of optimizations,
one for each trial value of risk preference λ, and plotting µλ against σλ for each of these optima, as we
show in Figure 6 (Raghuraman et al., 2003; Bailey et al., 2005). The resulting curve is called the Efficient
Frontier. Its key properties are:
• It can fold into itself as many different uncertainties as the user can handle (as all these are
subsumed in the µ and σ of the ensemble).
• Each point represents a set of CVs that correspond to a particular level of risk preference and
that can be applied operationally.
• The region above the frontier represents “unobtainable space”. Given that the model and its
mathematical implementation are correct, values in that region are not feasible and cannot be
realized.
• Points located below the frontier are sub-optimal, as better optimum values can be obtained along
the frontier itself.
• Values along the frontier are optimal and each one corresponds to a unique value of risk preference.
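A sketch of how the frontier is traced: one optimization per value of λ, each recording the (σ, µ) of its optimum. Here the “optimization” is a trivial argmax over two invented candidate strategies, standing in for the full suite of expensive runs.

```python
import statistics

def efficient_frontier(ensemble_for_cv, candidate_cvs, lambdas):
    """For each risk preference lambda, pick the candidate CV set that
    maximises F_lambda = mu - lambda*sigma, and record its (sigma, mu).
    `ensemble_for_cv` maps a CV choice to its ensemble of NPVs (hiding
    the expensive simulation runs behind a lookup)."""
    frontier = []
    for lam in lambdas:
        best = max(candidate_cvs,
                   key=lambda cv: statistics.mean(ensemble_for_cv[cv])
                                  - lam * statistics.pstdev(ensemble_for_cv[cv]))
        mu = statistics.mean(ensemble_for_cv[best])
        sigma = statistics.pstdev(ensemble_for_cv[best])
        frontier.append((lam, sigma, mu))
    return frontier

# Invented data: one high-return/high-risk strategy, one safer one ($MM).
ensembles = {"aggressive": [350.0, 420.0, 490.0],
             "cautious":   [400.0, 410.0, 420.0]}
pts = efficient_frontier(ensembles, list(ensembles), [0.0, 1.0, 2.0])
```

As λ grows, the selected strategy shifts from the high-mean, high-spread choice to the low-spread one, which is the frontier trade-off in miniature.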
So what is the relationship between the risk preference parameter λ and the standard percentile notion
used in decision analysis? Our underlying function (e.g., NPV) has uncertainty resulting from the model
uncertainty. Percentiles describe the probability that NPV will be less than the defined objective, Fλ.
Thus a particular percentile is a function of λ, but in general this function is unknown except when one
assumes a specific form of probability distribution, e.g., a normal distribution. The inset table in Figure 6
presents the confidence level of an outcome for a given λ under the assumption that NPV is normally
distributed. This table can assist decision makers by allowing them to adopt a control strategy that is
consistent with their own risk tolerance (either personal or defined by policy). Thus a decision maker
with high risk tolerance may be willing to accept a 50% likelihood of achieving, or surpassing, the
optimized value of the defined objective function, Fλ (associated with λ = 0.0). Conversely, a decision
maker with a low risk tolerance may only be comfortable with the result associated with, say, λ = 2.0. In
this case, there is a 97.72% likelihood of achieving, or surpassing, the optimized value of the defined
objective function, Fλ.
We cannot overstate the power and utility the Efficient Frontier can bring to an optimization problem
where uncertainty is present. Not only does it provide a managerial route-map for operation of the
given CVs, but also allows one to visualize the best points and how they correspond to their associated
risk preferences, irrespective of the number of uncertainties and realizations involved. There are
numerous ways to utilize this construct. We may first build it and then use it to make decisions (once we
know the appropriate risk tolerance). We may have our own risk preference (λ), say P15 (equivalent to
roughly one standard deviation from the mean), from which we may compute the optimal
corresponding mean NPV (µNPV) and its standard deviation (σNPV). We may then compare these
estimates to an Efficient Frontier constructed later and use it to evaluate the suitability of this initial
estimate and risk tolerance. Conversely, we may use it to quality-check expectations of NPV (at
associated levels of confidence) from decision makers. For example, there is an expectation for a given
NPV at a level of confidence of, say, P15 (see above). We can then review such expectations against
what is realizable using the Efficient Frontier. Values lying above the frontier (e.g., the point shown in
Figure 6) are not possible, as the region above the frontier line represents unfeasible space.
Alternatively, if this NPV expectation is located below this frontier (e.g., the “Sub-optimal point" shown
in Figure 6), then we can advise that this is sub-optimal and, with the strategy forthcoming from our
optimization, we may increase this expected mean NPV while retaining the same degree of confidence.
While there is no single, 'correct' way to approach utilization of the Efficient Frontier, our primary
intention is to stress the multipurpose virtue of this tool and encourage its adoption by decision makers.
Example: Forecast Optimization with Uncertainty
Continuing with the example stated earlier in Figure 1 and keeping the same CVs (Table 1), we now
introduce two uncorrelated uncertainties. From the preliminary analysis described earlier, these were
identified as the depth of the oil-water contact and the permeability of the aquifer model (Table 3).
OWC is sampled at 3 discrete values, while aquifer permeability is sampled at two discrete values. Thus
p = 3 × 2 = 6 realizations are required for each trial set of CVs.
Oil Water Contact (OWC) Shallowest: 15340ft; Mid-level: 15400ft; Deepest: 15460ft
Table 3: The two uncertainties in the example along with their sample values.
The results of this optimization are shown in Table 4 for several values of risk preference (λ), and the
resulting Efficient Frontier is shown in Figure 6.
Table 4: Optimization results for selected values of risk preference λ. Each optimized value has a
unique set of control variables associated with it. The confidence percentiles assume a normal
distribution of NPV.