SGeMS User’s Guide
Nicolas Remy, Alexandre Boucher & Jianbing Wu
April 13, 2006

Chapter 1: General Overview

Section ?? indicated how GsTL can be integrated into already existing software. SGeMS, the Geostatistical Earth Modeling Software, is an example of software built from scratch using GsTL. The source code of SGeMS serves as an example of how to use GsTL facilities.

SGeMS was designed with two aims in mind. The first, geared toward the end user, is to provide user-friendly software offering a large range of geostatistics tools: the most common geostatistics algorithms are implemented, in addition to more recent developments such as multiple-point statistics simulation. The user-friendliness of SGeMS comes mainly from its non-obtrusive graphical user interface and the possibility to directly visualize data sets and results in a full 3-D interactive environment. The second objective was to design software whose functionalities could conveniently be augmented. New features can be added to SGeMS through a system of plug-ins, i.e. pieces of software which cannot be run by themselves but complement a main software. In SGeMS, plug-ins can be used to add new (geostatistics) tools, add new grid data structures (faulted stratigraphic grids for example) or define new import/export filters.

1.1 First Steps with SGeMS

1.1.1 A quick tour of the graphical user interface

The graphical user interface (GUI) of SGeMS is divided into 3 main parts, see Fig. 1.1:

Figure 1.1: SGeMS’s graphical interface. The three main panels are highlighted in red. The top-right panel is the Algorithm Panel, the top-left is the Visualization Panel and the bottom is the Command Panel.

The Algorithm Panel. The user selects in this panel which geostatistics tool to use and inputs the required parameters (see Fig. 1.2). The top part of that panel shows a list of available algorithms, e.g. kriging, sequential Gaussian simulation.
When an algorithm from that list is selected, a form containing the corresponding input parameters appears below the tools list.

Figure 1.2: The 3 parts of the Algorithm Panel highlighted in red. The top part displays the list of available tools. The middle part is where the input parameters for the selected tool are entered.

The Visualization Panel. One or multiple objects can be displayed in this panel, e.g. a Cartesian grid and a set of points, in an interactive 3-D environment. Visualization options such as color-maps are also set in the Visualization Panel. The Visualization Panel is shown in more detail in Fig. 1.3.

The Command Panel. This panel is not shown by default when SGeMS is started. It gives the possibility to control the software from a command line rather than from the GUI. It displays a history of all commands executed so far and provides an input field where new commands can be typed (see Fig. 1.4). See tutorial 1.1.2 for more details about the Command Panel.

1.1.2 A simple tutorial

This short tutorial describes a SGeMS session in which ordinary kriging is performed on a 100 × 130 × 30 Cartesian grid. The data consist of a set of 400 points in 3-D space, with a rock porosity value associated with each data point. This point-set object is called porosity data. The steps involved are the following:

1. Load the data set
2. Create a Cartesian grid
3. Select the kriging tool and enter the necessary parameters
4. Display and save the result

Figure 1.3: The Visualization Panel. The left-hand side (highlighted in red) controls which objects (e.g. grids) are visible. It is also used to set display options, such as which color-map to use.

Figure 1.4: The Command Panel

Loading the data set

The first step is to load the data set into the objects database of SGeMS. Click Objects | Load Object and select the file containing the data. Refer to section 1.2.1 for a description of the available data file formats.
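Section 1.2.1 describes the available file formats in detail. As an illustration only, the following Python sketch writes a small point-set file in the GSLIB ASCII layout described there (title line, property count, one property name per line, then one row of values per point); the coordinates and porosity values are made up:

```python
# Write a minimal GSLIB-style ASCII file (see section 1.2.1):
# title, number of properties, property names, then data rows.
# The points below are made-up example values, not the tutorial data.
points = [
    (10.5, 22.0, 3.0, 0.21),
    (48.0, 81.5, 7.0, 0.16),
    (75.2, 14.3, 12.0, 0.24),
]

with open("porosity_data.dat", "w") as f:
    f.write("porosity data\n")          # title line
    f.write("4\n")                      # number of properties
    for name in ("X", "Y", "Z", "porosity"):
        f.write(name + "\n")            # one property name per line
    for row in points:
        f.write(" ".join(str(v) for v in row) + "\n")
```

A file written this way can then be loaded through Objects | Load Object as described above.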
When the object is loaded a new entry called porosity data appears in the Objects section of the Visualization Panel, as shown in Fig. 1.5.

Figure 1.5: Object list after the data set is loaded

Click in the square before the point-set name to display it. Displayed objects have a little eye painted inside the rectangle before their name. The plus sign before the square indicates that the object contains properties. Click on the plus sign to display the list of properties. Click in the square before the property name to paint the object with the corresponding property (see Fig. 1.6).

Figure 1.6: Showing/hiding an object or a property

Creating a grid

The next step is to create the grid on which kriging will be performed. The grid we will create is a 3-D Cartesian grid with 100 × 130 × 30 cells. Click Objects | New Cartesian Grid to open the grid creation dialog. Enter the dimensions of the grid, the coordinates of the origin of the grid, i.e. the lower left corner (the grid node with smallest x, y, z coordinates), and the dimensions of each grid cell. Provide a name for the new grid, working_grid for example. Click Create Grid to create the grid. A new entry called working_grid appears in the Objects panel of the Visualization Panel, as shown in Fig. 1.7.

Figure 1.7: Object list after the Cartesian grid is created

The object database now contains two objects: a point-set with the rock porosity property, and a Cartesian grid with no property attached to it yet. We can proceed to the kriging run.

Running the kriging algorithm

Select the kriging tool from the list in the Algorithm Panel. A form prompting for the kriging parameters appears below the algorithms list. You can either type in the parameters or load them from a file. Fig. 1.8 shows an example of a parameter file for kriging (refer to section 1.2.2 for a description of the parameter file format and to section 5.0.2 for details on kriging parameters). Using the parameters of Fig.
1.8, ordinary kriging is performed with an isotropic search ellipsoid of radius 50 (section ?? describes how to specify a 3-D ellipsoid in SGeMS) and an isotropic spherical variogram of range 30, sill 1, and a nugget effect of 0.1 (section ?? explains how to specify a variogram). Once all parameters have been entered, click the Run Algorithm button. If some parameters are not correctly set, they are highlighted in red and a description of the error appears if the mouse is left a few seconds on the offending parameter.

<parameters>
  <algorithm name="kriging" />
  <Grid_Name value="working_grid" />
  <Property_Name value="krig_porosity" />
  <Hard_Data grid="porosity data" property="porosity" />
  <Kriging_Type type="Ordinary Kriging (OK)" >
    <parameters />
  </Kriging_Type>
  <Max_Conditioning_Data value="12" />
  <Search_Ellipsoid value="50 50 50 0 0 0" />
  <Variogram nugget="0.1" structures_count="1" >
    <structure_1 contribution="0.9" type="Spherical" >
      <ranges max="30" medium="30" min="30" />
      <angles x="0" y="0" z="0" />
    </structure_1>
  </Variogram>
</parameters>

Figure 1.8: Kriging parameter file

If kriging was run with the parameters shown in Fig. 1.8, the grid named working_grid now contains a new property called krig_porosity.

Displaying and saving the result

The algorithm Kriging created a new property krig_porosity in the grid working_grid. Click on the plus sign before the working_grid entry in the objects list to show the list of properties attached to the grid, and click in the square before the newly created property to display it. To save the results, click Object | Save Object to open the Save Object dialog. Provide a file name, the name of the object to save, e.g. working_grid, and the file format to use (see section 1.2.1). It is also possible to save all the objects at once by saving the project (File | Save Project).

1.1.3 Automating tasks in SGeMS

Tutorial 1.1.2 showed how to perform a single run of the kriging algorithm.
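Since the parameter file of Fig. 1.8 is plain XML, it can also be generated programmatically instead of typed by hand. A minimal Python sketch using the standard xml.etree module (element and attribute names are copied from Fig. 1.8; the output file name is arbitrary):

```python
import xml.etree.ElementTree as ET

# Rebuild the kriging parameter file of Fig. 1.8 programmatically.
root = ET.Element("parameters")
ET.SubElement(root, "algorithm", name="kriging")
ET.SubElement(root, "Grid_Name", value="working_grid")
ET.SubElement(root, "Property_Name", value="krig_porosity")
ET.SubElement(root, "Hard_Data", grid="porosity data", property="porosity")
kt = ET.SubElement(root, "Kriging_Type", type="Ordinary Kriging (OK)")
ET.SubElement(kt, "parameters")
ET.SubElement(root, "Max_Conditioning_Data", value="12")
ET.SubElement(root, "Search_Ellipsoid", value="50 50 50 0 0 0")
vg = ET.SubElement(root, "Variogram", nugget="0.1", structures_count="1")
s1 = ET.SubElement(vg, "structure_1", contribution="0.9", type="Spherical")
ET.SubElement(s1, "ranges", max="30", medium="30", min="30")
ET.SubElement(s1, "angles", x="0", y="0", z="0")
ET.ElementTree(root).write("kriging.par")
```

The resulting file can be loaded from the kriging parameter form like any hand-written parameter file.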
Next, one would like to study the sensitivity of the algorithm to the parameter Max Conditioning Data, the maximum number of conditioning data retained for each kriging. The user would like to vary that number from 1 to 50 in increments of 1. It would be very impractical to perform such a study in successive sequences as explained in Tutorial 1.1.2. SGeMS provides a solution to this problem through its command line interface. Many actions in SGeMS can either be performed with mouse clicks or by typing a command in the Command Panel. For example, loading the data set in step 1 of Tutorial 1.1.2 could have been achieved by typing the following command:

LoadObjectFromFile /home/user/data_file.dat:All

Each command has the following format:

• the name of the command, e.g. LoadObjectFromFile
• a list of parameters, separated by a colon “:”. In the previous example two parameters were supplied: the name of the file to load, and the file format All (meaning that every available file format should be considered).

Every command performed in SGeMS, either typed or resulting from mouse clicks, is recorded to both the “Commands History” section of the Command Panel and to a file called gstlappli history.log. Hence if one does not know a command name, one can use the GUI to perform the corresponding action and check the command name and syntax in the command history. It is possible to combine several commands into a single file and have SGeMS execute them all sequentially. For the sensitivity study example, one would have to write a “script” file containing 50 kriging commands, each time changing the Max Conditioning Data parameter. Note that there are no control structures, e.g. “for” loops, in SGeMS script files. However there are many programming tools available (Perl, Awk, Shell, . . . ) that can be used to generate such a script file.
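For instance, the 50 runs of the sensitivity study can be prepared by templating the parameter file of Fig. 1.8. The Python sketch below writes 50 parameter files, varying Max_Conditioning_Data from 1 to 50; the file names are hypothetical, and the command needed to execute each run should be looked up in the Commands History as explained above:

```python
# Generate 50 kriging parameter files for the sensitivity study,
# varying Max_Conditioning_Data from 1 to 50. The XML template is
# the parameter file of Fig. 1.8 with two {n} placeholders.
TEMPLATE = """<parameters>
  <algorithm name="kriging" />
  <Grid_Name value="working_grid" />
  <Property_Name value="krig_porosity_{n}" />
  <Hard_Data grid="porosity data" property="porosity" />
  <Kriging_Type type="Ordinary Kriging (OK)"> <parameters /> </Kriging_Type>
  <Max_Conditioning_Data value="{n}" />
  <Search_Ellipsoid value="50 50 50 0 0 0" />
  <Variogram nugget="0.1" structures_count="1">
    <structure_1 contribution="0.9" type="Spherical">
      <ranges max="30" medium="30" min="30" />
      <angles x="0" y="0" z="0" />
    </structure_1>
  </Variogram>
</parameters>
"""

for n in range(1, 51):
    with open(f"kriging_{n:02d}.par", "w") as f:
        f.write(TEMPLATE.format(n=n))
```

Each run writes its result to a distinct property (krig_porosity_1, ..., krig_porosity_50), so the 50 outputs can be compared side by side in the same grid.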
1.2 File Formats

1.2.1 Objects Files

SGeMS supports two file formats by default to describe grids and sets of points: the GSLIB format and the SGeMS binary format.

The GSLIB file format

It is a simple ASCII format used by the GSLIB software (ref ??). It is organized by lines:

• the first line gives the title.
• the second line is a single number n indicating the number of properties in the object.
• the n following lines contain the names of each property (one property name per line).
• each remaining line contains the values of each property (n values per line) separated by spaces or tabulations. The order of the property values along each line is given by the order in which the property names were entered.

Note that the same format is used for describing both point sets and Cartesian grids. When a GSLIB file is loaded into SGeMS the user has to supply all the information that is not provided in the file itself, e.g. the name of the object, or the number of cells in each direction if it is a Cartesian grid.

The SGeMS binary file format

SGeMS can use an uncompressed binary file format to store objects. Binary formats have two main advantages over ASCII files: they occupy less disk space and they can be loaded and saved faster. The drawback is a lack of portability between platforms. The SGeMS binary format is self-contained, hence the user need not provide any additional information when loading such a file.

1.2.2 Parameter Files

When an algorithm is selected from the Algorithm Panel (see step 3 of Tutorial 1.1.2), several parameters are called for. It is possible to save those parameters to a file and later retrieve them. The format of a parameter file in SGeMS is based on the Extensible Markup Language (XML), a standard formatting language of the World Wide Web Consortium (www.w3.org). Fig. 1.8 shows an example of such a parameter file. In a parameter file, each parameter is represented by an XML element. An element consists of an opening and a closing tag, e.g.
<tag> and </tag>, and one or several attributes. Following is an example of an element called algorithm which contains a single attribute “name”:

<algorithm name="kriging"> </algorithm>

Elements can themselves contain other elements:

<Variogram nugget="0.1" structures_count="1" >
  <structure_1 contribution="0.9" type="Spherical" >
    <ranges max="30" medium="30" min="30"> </ranges>
    <angles x="0" y="0" z="0"> </angles>
  </structure_1>
</Variogram>

Here the element Variogram contains an element structure_1, which itself contains two elements: ranges and angles. Each of these elements has attributes. Note that if an element only contains attributes the closing tag can be abbreviated: in the previous example, element ranges only contains attributes and could have been written:

<ranges max="30" medium="30" min="30" />

The /> sequence indicates the end of the element.

A SGeMS parameter file always has the two following elements:

• element parameters. It is the root element: it contains all other elements.
• element algorithm. It has a single attribute name which gives the name of the algorithm for which parameters are specified.

All other elements are algorithm-dependent and are described in sections 5 and ??. Such an XML-formatted parameter file has several advantages:

• Elements can be entered in any order.
• Comments can be inserted anywhere in the file. A comment block starts with <!-- and ends with -->. Comments can span multiple lines. For example:

<!-- An example of a comment block
     spanning multiple lines -->
<parameters>
  <algorithm name="kriging" />
  <!-- the name of the working grid -->
  <Grid_Name value="working_grid" />
</parameters>

Chapter 2: Data Sets & SGeMS EDA Tools

This chapter presents the data sets which will be used to demonstrate the geostatistics algorithms in the following chapters. It also provides an introduction to the exploratory data analysis (EDA) tools of the SGeMS software. Section 2.1 presents the two data sets: one in 2D and one in 3D.
The smaller 2D data set will be used for most of the examples of running geostatistics algorithms. The 3D data set, which mimics a large deltaic channel reservoir, will be used to demonstrate the practice of these algorithms on real and large 3D applications. Section 2.2 introduces some basic EDA tools, such as the histogram, the Q-Q (quantile to quantile) plot, the P-P (probability to probability) plot and the scatter plot. These EDA tools are very useful to check and compare, both visually and statistically, any given data sets.

2.1 The Data Sets

The application example given for each algorithm should be short yet informative, allowing the user to check his understanding of the algorithm and its input-output parameters. The results of any application based on a specific data set should NOT be extended into a general conclusion. Although the SGeMS software was initially designed for reservoir modeling, the application examples should stress the generality of the algorithms and their software implementation.

2.1.1 The 2D data

To be done ...

2.1.2 The 3D data

The 3D data set retained in this book is extracted from a layer of Stanford VI, a synthetic fluvial channel reservoir (Castro et al., 2005). The corresponding SGeMS project is located at ‘data set/stanford6.prj’. This project contains three SGeMS objects: well, grid and container.

• The well object contains the well data set. There are a total of 26 wells (21 vertical wells, 4 deviated wells and 1 horizontal well). The properties associated with the wells are density, binary facies (sand channel or mud floodplain), P-wave impedance, P-wave velocity, permeability and porosity. These data can be used as hard or soft conditioning data for the geostatistics algorithms. Fig. 2.1 shows the well locations and the porosity distribution along the wells.

• The grid object is a Cartesian reservoir grid (see its rectangular boundary on Fig.
2.1), with:

- grid size: 150 × 200 × 80,
- origin point at (0, 0, 0),
- unit cell size in each x/y/z direction.

This reservoir grid holds the petrophysical properties; currently it contains:

1. Seismic data. The original Stanford VI contains a cube of P-wave seismic impedance, which is the product of density and P-wave velocity. However, for most geostatistics algorithms such seismic data cannot be used directly as is; it must be calibrated into, e.g., facies probability distributions. Here, two sand probability cubes (properties P(sand|seis) and P(sand|seis) 2) are provided: one displaying sharp channel boundaries (best quality data, see Fig. 2.2a); and a second displaying more fuzzy channel boundaries with noise (poor quality data, see Fig. 2.2b). These probability data will be used as soft data to constrain the facies modeling.

2. Region code. Typically a large reservoir would be divided into different regions, with each individual region having its own characteristics, for instance different channel orientations or channel thickness. The regions associated with the Stanford VI reservoir are rotation regions (property angle) corresponding to different channel orientations (Fig. 2.3), and affinity regions (property affinity) corresponding to different channel thickness (Fig. 2.4). Each rotation region is labeled with an indicator number, and assigned an angle value, see Table 2.1. The affinity indicators and the attached affinity values are given in Table 2.2. An affinity value must be assigned for each x/y/z direction; the smaller the affinity value, the thicker the channel in that direction.

Angle category:         0    1    2    3    4   5    6    7    8    9
Angle value (degree):  -63  -49  -35  -21  -7   7   21   35   49   63

Table 2.1: Rotation region indicators

Affinity category:          0                1            2
Affinity value ([x,y,z]):  [0.5, 0.5, 0.5]  [1, 1, 1]    [2, 2, 2]

Table 2.2: Affinity region indicators

This grid object is used to demonstrate the application examples.
• The container object is composed of all the reservoir nodes which are located inside the channels, hence it is a point set with (x, y, z) coordinates. The user can perform geostatistics on this channel container, for example to obtain the within-channel petrophysical properties. In Fig. 2.5 the channel container is represented by all nodes with value 1.

Although this 3D data set is taken from a reservoir model, it could represent any 3D spatially-distributed attribute and applications other than reservoir modeling. For example, one can interpret each layer of the seismic data cube as satellite measurements defined over the same area but recorded at different times. The application could then be modeling landscape change in both space and time.

2.2 The SGeMS EDA Tools

The SGeMS software provides some useful exploratory data analysis (EDA) tools, such as histogram, quantile (Q-Q) plot, probability (P-P) plot, scatter plot, and variogram and cross-variogram calculation and fitting. In this chapter, the first four elementary tools are presented; the two latter tools are presented in the next chapter.

Figure 2.1: Well locations and the porosity distribution along the wells.
Figure 2.2: Sand probability cube calibrated from seismic impedance. (a) Good quality probability. (b) Poor quality probability.
Figure 2.3: Angle indicator cube.
Figure 2.4: Affinity indicator cube.
Figure 2.5: Channel container (nodes with value 1).

All the EDA tools can be invoked through the Data Analysis menu from the main SGeMS graphical interface. Once a specific tool is selected, the corresponding SGeMS window pops up. The EDA tool window is independent of the main SGeMS interface, and the user can have multiple windows for each EDA tool.

2.2.1 Common Parameters

The main interface for any of the EDA tools presented in this chapter has 3 main panels (see Fig. 2.6, Fig. 2.7 and Fig. 2.8):

A.
Parameter Panel. The user selects in this panel the properties to be analyzed and the display options. This panel has two pages: ‘Data’ and ‘Display Options’, the latter being common to all EDA tools;

B. Visualization Panel. This panel shows the graphic result of the selected statistics;

C. Statistics Panel. This panel displays some relevant statistical summaries.

In the lower part of the main interface, there are two buttons: Save as Image and Close. The Save as Image button is used to save a graphical visualization (for example a histogram) into an image file in either ‘png’ or ‘bmp’ format. The user can also write the statistical summaries and/or paint the grid into the data file by selecting the corresponding options in the figure saving dialog box. The Close button is used to close the current EDA tool interface.

Parameters Description

The parameters of the ‘Display Options’ page are listed as follows:

• X Axis: The X axis for variable 1. Only the property values between ‘Min’ and ‘Max’ are displayed in the plot; values less than ‘Min’ or greater than ‘Max’ still contribute to the statistical summaries. The default values of ‘Min’ and ‘Max’ are the minimum and maximum of the selected Property. The X Axis can be set to a logarithmic scale by marking the corresponding check box.

• Y Axis: The Y axis for variable 2. The previous remarks made for the X Axis apply.

The user can modify the parameters through either the keyboard or the mouse. Any modification through the mouse is instantly reflected in the visualization or the statistical summaries, while a change through the keyboard must be confirmed by pressing the ‘Enter’ key.

2.2.2 Histogram

The histogram tool creates a visual output of the frequency distribution, and displays some statistical summaries, such as the mean and variance of the selected variable. The histogram tool is activated by clicking Data Analysis | Histogram.
Although the program will automatically scale the histogram, the user can set the histogram limits in the Parameter Panel. The main histogram interface is given in Fig. 2.6, and the parameters of the ‘Data’ page are listed below.

Parameters Description

• Object: A Cartesian grid or a point set containing variables.

• Property: The values or distribution of a certain variable listed in the Object above.

• bins: The number of classes. The user can change this number through the keyboard, or by clicking the scroll bar. Any value change is instantly reflected on the histogram display.

• Clipping Values: Statistical calculation settings. All values less than ‘Min’ and greater than ‘Max’ are ignored, and any change of ‘Min’ and ‘Max’ will affect the statistics calculation. The default values of ‘Min’ and ‘Max’ are the minimum and maximum of the selected Property. After modifying ‘Min’ and/or ‘Max’, the user can go back to the default setting by clicking ‘Reset’.

2.2.3 Q-Q plot and P-P plot

The Q-Q plot compares the equal p-value quantiles of two variables; the P-P plot compares the cumulative probability distributions of two variables. The two variables need not be in the same Object or have the same number of data. The Q-Q plot and P-P plot are combined into one program, which can be invoked from Data Analysis | QQ-plot. This EDA tool generates both a graph in the Visualization Panel and some statistical summaries (mean and variance for each variable) in the Statistics Panel, see Fig. 2.7. The parameters in the ‘Data’ page are listed below.

Parameters Description

• Analysis Type: Algorithm selection. The user can choose either a Q-Q plot or a P-P plot.

• Variable 1: The variable selection for the X axis. The user must first choose an Object, then the Property name.

• Clipping Values for Variable 1: All values strictly less than ‘Min’ and strictly greater than ‘Max’ are ignored; any change of ‘Min’ and ‘Max’ will affect the statistics calculation.
The user can go back to the default setting by clicking ‘Reset’.

• Variable 2: The variable selection for the Y axis. The user must first choose an Object, then the Property name. Note that Variable 2 and Variable 1 might be from different objects.

• Clipping Values for Variable 2: Remarks similar to those for Clipping Values for Variable 1.

2.2.4 Scatter Plot

The scatterplot tool (executed by clicking Data Analysis | Scatter-plot) is used to compare a pair of variables by displaying their bivariate scatter plot and some statistics. All available data pairs are used to compute the summary statistics, such as the correlation coefficient and the mean and variance of each variable (see part [C] in Fig. 2.8). To avoid a crowded figure in the Visualization Panel, only up to 10,000 data pairs are shown in the scatter plot. The parameters in the ‘Data’ page are listed below.

Parameters Description

• Object: A Cartesian grid or a point set containing variables. This Object must contain at least two properties.

• Variable 1: The variable property listed in the Object above. This variable is associated with the X axis.

• Clipping Values for Variable 1: All values strictly less than ‘Min’ and strictly greater than ‘Max’ are ignored, hence any change of ‘Min’ and ‘Max’ will affect the statistics calculation. The user can go back to the default setting by clicking ‘Reset’. If Variable 1 has more than 10,000 data, then the ‘Reset’ button can be used to generate a new scatter plot with a new set of data pairs containing up to 10,000 data.

• Variable 2: The variable property listed in the upper Object. This variable is associated with the Y axis. Variable 2 must have the same number of data as Variable 1.

• Clipping Values for Variable 2: Remarks similar to those for Variable 1.

• Options: The choice of visualizing the least square fit line in the scatter plot. The slope and the intercept are always given below the check box ‘Show Least Square Fit’.

Figure 2.6: Histogram interface. [A]: parameter panel; [B]: visualization panel; [C]: statistics panel.

Figure 2.7: Q-Q plot interface. [A]: parameter panel; [B]: visualization panel; [C]: statistics panel.

Figure 2.8: Scatter plot interface. [A]: parameter panel; [B]: visualization panel; [C]: statistics panel.

Chapter 3: Utilities

3.1 Non-parametric distribution

In SGeMS a non-parametric cumulative distribution function, cdf F(z), is inferred from a set of values z_1 ≤ . . . ≤ z_L which can either be read from a file or from a property. F(z) is built such that F(z_1) = 1/L and F(z_L) = 1 − 1/L. This requires modeling of the tails of the distribution F(z) if the values z_1 and z_L are not the minimum and maximum values. The lower tail extrapolation function provides the shape of the distribution between the minimum z_min and the first value z_1. The options for the lower tail are:

• Z is bounded: F(z_min) = 0. The lower tail of F is then modeled with a power model:

(F(z_1) − F(z)) / F(z_1) = ((z_1 − z) / (z_1 − z_min))^ω, ∀ z ∈ ]z_min, z_1[    (3.1)

The parameter ω controls the decrease of the function, with the constraint ω ≥ 1. The greater ω, the less likely are low values close to z_min. For ω = 1, all values between z_min and z_1 are equally likely.

• Z is not bounded: the lower tail is modeled with an exponential function:

F(z) = F(z_1) exp(−(z − z_1)²), ∀ z < z_1    (3.2)

The options for the upper tail extrapolation function are similar but applied to the interval [z_L, z_max].

• Z is bounded: F(z_max) = 1. The upper tail of F is then modeled with a power model:

(F(z) − F(z_L)) / (1 − F(z_L)) = ((z − z_L) / (z_max − z_L))^ω, ∀ z ∈ ]z_L, z_max[    (3.3)

The parameter ω controls the decrease of the function, with the constraint ω ∈ [0, 1]. The lower the ω value, the less likely are extreme values close to z_max. For ω = 1, all values between z_L and z_max are equally likely.
• Z is not bounded: the upper tail is modeled by a hyperbolic model:

(1 − F(z)) / (1 − F(z_L)) = z_L / z, ∀ z > z_L    (3.4)

All L − 1 intermediary intervals [z_i, z_{i+1}] for i = 1, ..., L − 1 are interpolated linearly, corresponding to a power model with ω = 1.

Note: When the z_min and z_max values are set to z_1 and z_L, the functional shape of the tail extrapolation function becomes irrelevant.

Parameters description

1. Reference Distribution: Read the data either from the file [ref on file] or on the grid [ref on grid].

2. File with reference distribution [filename]: File containing the reference distribution in one column without header. Required if [ref on file] is selected.

3. Property with reference distribution: The grid [grid] and property [property] contain the values for the non-parametric distributions.

4. Lower Tail Extrapolation: Parametrization of the lower tail. The type of extrapolation function is selected with [LTI function]. If the power model is selected, the minimum value [LTI min] and the parameter ω [LTI omega] must be specified. Note that the minimum [LTI min] must be less than or equal to the minimum datum value as entered in the reference distribution, and the power ω [LTI omega] must be greater than or equal to 1. The exponential model does not require any parameter.

5. Upper Tail Extrapolation: Parametrization of the upper tail. The type of extrapolation function is selected with [UTI function]. If the power model is selected, the maximum value [UTI max] and the parameter ω [UTI omega] must be specified. Note that the maximum [UTI max] must be greater than or equal to the maximum datum value as entered in the reference distribution, and the power ω [UTI omega] must be less than or equal to 1. The hyperbolic model only requires the parameter ω [UTI omega]; the upper tail is unbounded.

Histogram Transformation

The algorithm TRANS allows transforming any histogram into any other one.
For example, the Gaussian simulation algorithms (SGSIM and COSGSIM), described in Sections 6.1.1 and 6.1.2, assume Gaussian variables. If the attribute of interest is not Gaussian, it is possible to transform the marginal distribution of that attribute into a Gaussian distribution, then work on the transformed variable. The algorithm TRANS transforms a property following a source distribution into a new variable following a target distribution. The transformation of a variable Z with a source cdf F_Z into a variable Y with target cdf F_Y is written:

Y = F_Y^{-1}(F_Z(Z))    (3.5)

Note: The transformation of a source distribution to a Gaussian distribution does not ensure that Y is multivariate Gaussian, only that its marginal distribution is. One should check that the multivariate (or at least bi-variate) Gaussian hypothesis holds for Y before performing Gaussian simulation. If the hypothesis is not appropriate, other algorithms that do not require Gaussianity, e.g. sequential indicator simulation (SISIM), should be considered.

Histogram transformation with conditioning data

It is possible to apply a weighting factor to control how much specific values are transformed:

y = z − ω(z − F_Y^{-1}(F_Z(z)))

When ω = 0, y = z: there is no transformation. When ω = 1, y = F_Y^{-1}(F_Z(z)), which is the standard rank transform. The weight ω can be set equal to the standardized kriging variance. At data locations that kriging variance is zero, hence there is no transform and the datum value is unchanged. Away from data locations the kriging variance increases, allowing for a larger transform. That option is to be used for slight adjustment of the marginal distribution. When weights are used the transform is not rank preserving anymore and would only approximately match the target distribution.
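As an illustration, the plain and weighted rank transforms can be sketched with empirical cdfs built from samples. This is a minimal numpy version; the function and variable names are mine, not TRANS keywords, and the empirical cdf construction is one simple choice among several:

```python
import numpy as np

def rank_transform(z, source_samples, target_samples, omega=1.0):
    """Weighted rank transform y = z - omega * (z - F_Y^{-1}(F_Z(z))).
    omega = 1 gives the plain rank transform; omega = 0 leaves z unchanged."""
    src = np.sort(np.asarray(source_samples, dtype=float))
    tgt = np.sort(np.asarray(target_samples, dtype=float))
    # Empirical cdf p = F_Z(z): rank of z among the source samples.
    p = np.searchsorted(src, z, side="right") / (len(src) + 1)
    # Quantile F_Y^{-1}(p): linear interpolation through the target samples.
    q = np.interp(p, np.arange(1, len(tgt) + 1) / (len(tgt) + 1), tgt)
    return z - omega * (z - q)
```

With omega set to the standardized kriging variance at each location, data locations (omega = 0) keep their values while locations far from the data are transformed fully, as described above.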
The histogram transformation algorithm is:

Algorithm 3.1 Histogram transformation
1: for each value z_k to be rank transformed do
2:   Get the quantile from the source histogram associated with z_k: p_k = F_Z(z_k)
3:   Get the value y_k from the target distribution associated with p_k: y_k = F_Y^{-1}(p_k)
4:   if weighted transformation then
5:     Apply weighting to the transform.
6:   end if
7: end for

Parameters description

The TRANS algorithm is activated from Utilities | trans in the algorithm panel. The TRANS interface contains 3 pages: Data, Source and Target (see Figure 3.2). The text inside [ ] is the corresponding keyword in the TRANS parameter file.

1. Object Name [grid]: Selection of the grid containing the properties to be transformed.

2. Properties [props]: Properties to be transformed.

3. Suffix for output [out suffix]: The name for each output property consists of the original name plus the suffix entered here.

4. Local Conditioning [is cond]: Enables the use of weights for the histogram transformation.

5. Weight Property [cond prop]: Property with weights for transformation of the histogram conditional to data, see Eq. 3.1. The standardized kriging variance is a good weighting option. Only required if Local Conditioning [is cond] is selected.

6. Control Parameter [weight factor]: Value between 0 and 1 adjusting the weights. Only required if Local Conditioning [is cond] is selected.

7. Source histogram [ref type source]: Defines the type of the source histogram. The histogram may be either non-parametric, Gaussian, LogNormal, or Uniform. Each has its own interface for parameters.

8. Gaussian parameters: The mean is given in Mean [G mean source] and the variance in Variance [G variance source].

9. LogNormal parameters: The mean is given in Mean [LN mean source] and the variance in Variance [LN variance source].

10. Uniform parameters: The minimum value is given in Min [Unif min source] and the maximum in Max [Unif max source].

11.
Non Parametric: The non parametric distribution is entered in [nonParamCdf source], see Section 3.1
12. Target histogram [ref type target]: Defines the type of the target histogram. The histogram may be either non parametric, Gaussian, LogNormal, or Uniform. Each type has its own parameter interface
13. Gaussian parameters: The mean is given in Mean [G mean target] and the variance in Variance [G variance target]
14. LogNormal parameters: The mean is given in Mean [LN mean target] and the variance in Variance [LN variance target]
15. Uniform parameters: The minimum value is given in Min [Unif min target] and the maximum in Max [Unif max target]
16. Non Parametric: The non parametric distribution is entered in [nonParamCdf target], see Section 3.1

Figure 3.1: Widget for non parametric distribution
Figure 3.2: User interface for TRANS. (a) Data tab, (b) Source tab, (c) Target tab

Chapter 4

Variograms and variography

Variogram models are used by many geostatistical algorithms; SGeMS allows four basic models and any positive linear combination of them, that is, a linear model of regionalization. Linear models of coregionalization are also possible, but the software does not check the permissibility of the model; that is the user's responsibility. The four basic models are the nugget effect model, the spherical model, the exponential model and the Gaussian model.

Nugget effect model:

γ(h) = 0 if ||h|| = 0, 1 otherwise    (4.1)

A pure nugget effect model for a variable Z(u) expresses a lack of (linear) dependence between variables Z(u) and Z(u + h).

Spherical model with actual range a:

γ(h) = (3/2)(||h||/a) − (1/2)(||h||/a)^3 if ||h|| ≤ a, 1 otherwise    (4.2)

Exponential model with practical range a:

γ(h) = 1 − exp(−3||h||/a)    (4.3)

Gaussian model with practical range a:

γ(h) = 1 − exp(−3||h||^2/a^2)    (4.4)

Notes: Gaussian models are not permissible for binary indicator variables.
The term variogram is used to designate what is, strictly speaking, a semi-variogram γ(h); depending on the context, it may also refer to a cross semi-variogram. All the above models are permissible in 3-D and have a covariance counterpart:

C(h) = C(0) − γ(h)

In SGeMS a variogram model

γ(h) = c_0 γ^{(0)}(h) + sum_{l=1}^{L} c_l γ^{(l)}(h)

is characterized by the following parameters:

• a nugget effect c_0 γ^{(0)} with nugget constant c_0 ≥ 0
• the number L of nested structures. Each structure γ^{(l)}(h) is then defined by:
  – a variance contribution c_l ≥ 0
  – the type of the variogram: spherical, exponential or Gaussian
  – six parameters: the three practical ranges along the main directions of the anisotropy ellipsoid, and the three corresponding rotation angles defining that ellipsoid. Note that each nested structure can have a different anisotropy.

Example: Consider the variogram model γ(h) = 0.3 γ^{(0)}(h) + 0.4 γ^{(1)}(h) + 0.3 γ^{(2)}(h), with:

• γ^{(0)}(h) a nugget effect with sill 0.3
• γ^{(1)}(h) an anisotropic spherical variogram with major range 40, medium range 20 and minor range 5, and angles α = 45°, β = 0, θ = 0
• γ^{(2)}(h) an isotropic Gaussian variogram of range 200

The corresponding XML parameter file snippet would be:

<[Parameter_name] nugget="0.3" structures_count="2" >
  <structure_1 contribution="0.4" type="Spherical" >
    <ranges max="40" medium="20" min="5" />
    <angles x="45" y="0" z="0" />
  </structure_1>
  <structure_2 contribution="0.3" type="Gaussian" >
    <ranges max="200" medium="200" min="200" />
    <angles x="0" y="0" z="0" />
  </structure_2>
</[Parameter_name]>

where [Parameter_name] is the name of the parameter.

4.1 Modeling variogram in SGeMS

SGeMS offers the computation and modeling of experimental variograms for data both on regular grids and on point sets. The variogram module is accessible from the Data Analysis—Variogram menu. In addition to the variogram, the experimental covariance and correlogram can also be calculated. SGeMS only fits permissible models to variograms.
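The four basic models and their nested combination can be sketched as follows. This is an illustrative Python sketch, not part of SGeMS; it handles the isotropic case only, with anisotropy ellipsoids and rotation angles omitted.

```python
import numpy as np

def spherical(h, a):
    """Spherical model with actual range a (Eq. 4.2)."""
    h = np.minimum(np.abs(h), a)          # gamma = 1 beyond the range
    return 1.5 * h / a - 0.5 * (h / a) ** 3

def exponential(h, a):
    """Exponential model with practical range a (Eq. 4.3)."""
    return 1.0 - np.exp(-3.0 * np.abs(h) / a)

def gaussian(h, a):
    """Gaussian model with practical range a (Eq. 4.4)."""
    return 1.0 - np.exp(-3.0 * h ** 2 / a ** 2)

def nested(h, nugget, structures):
    """Linear model of regionalization: nugget plus nested structures.
    structures: list of (contribution, model_fn, range) tuples."""
    g = np.where(np.abs(h) > 0, nugget, 0.0)  # nugget vanishes at h = 0
    for contribution, model_fn, a in structures:
        g = g + contribution * model_fn(h, a)
    return g
```

For the example model above (nugget 0.3, spherical contribution 0.4 with range 40, Gaussian contribution 0.3 with range 200), `nested` reaches the total sill of 1.0 at large separation distances.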
The variography in SGeMS is done in three steps: 1) choosing the grid and properties, 2) calculating the experimental variograms, and 3) fitting a model. Step 2) differs depending on whether the properties reside on a point set or on a grid. For all steps, the next window is accessed by clicking Next.

The parameters for the variogram computation are entered in the second window. That window takes two forms depending on whether the property resides on a point set, shown in Figure 4.1(b), or on a cartesian grid, Figure 4.1(c).

Selecting properties

The selection of the properties on which to perform the variography is done in the first window, shown in Figure 4.1(a). The experimental variogram can either be loaded from a previous session or computed from scratch. When the computation is required, the user has to specify two properties, termed head and tail. Specifying different properties for head and tail results in computing their cross-variogram. If the experimental variogram is retrieved from a previous session, the next window is the modeling window of Figure 4.1(d); the parametrization window is skipped.

1. Experimental variogram: Either compute the experimental variogram from scratch or retrieve it from a previous modeling session
2. Grid and Properties: Select the properties for the experimental variograms. Specifying different properties for head and tail results in computing a cross-variogram; using the same property for head and tail produces a univariate experimental variogram

Experimental variogram on point set

Calculating experimental variograms on a point set is challenging due to the irregularity of data locations. For a fixed lag and direction, it is unlikely that enough pairs, if any, would be found to calculate the corresponding variogram. Thus, tolerances on the distance and angles are introduced such that enough data pairs can be found on a point set. Parameters 3 to 13, shown in Figure 4.1(b), are required when working on a point set.
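The tolerance-based pairing of point-set data can be sketched as follows. This is an illustrative 2-D sketch with a hypothetical function name, not SGeMS code; it bins pairs by lag distance and direction-angle tolerance only, ignoring dip and bandwidth.

```python
import numpy as np

def pair_lags(coords, n_lags, lag_sep, lag_tol, azimuth_deg, angle_tol_deg):
    """Collect data pairs per lag for one direction on a 2-D point set.
    A pair (i, j) enters lag k when its separation distance is within
    lag_sep*(k+1) +/- lag_tol and the pair vector deviates from the
    azimuth by less than the half-window angle tolerance."""
    coords = np.asarray(coords, dtype=float)
    # azimuth measured clockwise from North: North is +y, East is +x
    direction = np.array([np.sin(np.radians(azimuth_deg)),
                          np.cos(np.radians(azimuth_deg))])
    bins = [[] for _ in range(n_lags)]
    n = len(coords)
    for i in range(n):
        for j in range(i + 1, n):
            sep = coords[j] - coords[i]
            dist = np.hypot(*sep)
            if dist == 0.0:
                continue
            # angle between the pair vector and the direction
            # (a pair counts in both orientations, hence the abs)
            cos_a = abs(np.dot(sep, direction)) / dist
            angle = np.degrees(np.arccos(np.clip(cos_a, -1.0, 1.0)))
            if angle > angle_tol_deg:
                continue
            k = int(round(dist / lag_sep)) - 1  # nearest lag index
            if 0 <= k < n_lags and abs(dist - lag_sep * (k + 1)) <= lag_tol:
                bins[k].append((i, j))
    return bins
```

Each bin then yields one experimental variogram value, computed from the squared differences of the head and tail values of its pairs.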
Parameters entered in this window can be saved for future use.

3. Number of lags: Number of variogram lags to compute
4. Lag separation: Distance between two lags. The experimental variogram is computed from h = 0 to h = (Number of lags) × (Lag separation)
5. Lag tolerance: Tolerance around the lag separation. All pairs of data separated by Lag separation ± Lag tolerance are assigned to the same lag
6. Number of directions: Number of directions along which to compute the experimental variograms, each with the same number of lags. Each direction is parametrized by the next four items
7. Azimuth: Angle in degrees from North, measured clockwise in the horizontal plane
8. Dip: Angle in degrees from the horizontal plane, measured clockwise
9. Tolerance: Half-window angle tolerance on the direction used to pair data
10. Bandwidth: Maximum acceptable deviation from the direction vector
11. Measure type: Choice of the metric measuring bivariate spatial patterns. The options are: variogram, indicator variogram, covariance and correlogram. The modeling capability of SGeMS only permits fitting a model to a variogram or an indicator variogram
12. Head indicator cutoff: Threshold defining the indicator variable. The value of the head continuous property is coded one if it is below that threshold, and zero otherwise. This field is used only for indicator variograms
13. Tail indicator cutoff: Threshold defining the indicator variable. The value of the tail continuous property is coded one if it is below that threshold, and zero otherwise. This field is used only for indicator variograms. Different head and tail thresholds lead to a cross-indicator variogram if the tail and head refer to the same property, or to a cross-indicator cross-variogram if the head and tail properties are different

Experimental variogram on cartesian grid

Pairs of data are easier to find on a cartesian grid.
Experimental variograms on a cartesian grid do not require tolerances. The computation takes advantage of the grid by finding pairs along a discrete direction vector increment (∆x, ∆y, ∆z) specified by the user. For example, a direction vector of (1, 0, 0) is horizontal East-West, (1, 1, 0) corresponds to azimuth 45 degrees, while (0, 0, 1) is vertical. Parameters 14 to 18, shown in Figure 4.1(c), are required when the selected head and tail properties are located on a cartesian grid. Parameters entered in this window can be saved for future use. The parameters without reference numbers in Figure 4.1(c) have already been described for Figure 4.1(b).

14. Number of lags: Number of variogram lags to consider
15. Number of directions: Number of discretized direction vectors
16. x: East-West increment
17. y: North-South increment
18. z: Vertical increment

Fitting a permissible model

The tools to fit a model to the experimental variogram are shown in Figure 4.1(d). The plots can be reorganized in the window either manually or by using the Window menu. The axes of one or all plots can be changed from the Settings menu. The File menu allows saving or loading the variogram model in XML format. The experimental variograms can also be saved to a file from that menu. The last option is to take a snapshot of all, or of a particular, variogram display.

Right-clicking on any variogram display toggles the display of the number of pairs next to each experimental value. When fitting, points with few pairs should be given less consideration.

19. Nugget Effect: Contribution of the nugget effect to the sill
20. Nb of structures: Number of structures for the linear model of regionalization
21. Sill Contribution: Contribution of that structure to the sill
22. Type: Type of variogram for that structure. There are three variogram types: spherical, exponential and Gaussian
23. Ranges: Ranges of the variogram. A range can be changed either by manually entering the value or by sliding the bar
24.
Angles: Angles defining the anisotropy

Figure 4.1: User interface for variogram modeling. (a) Property selection, (b) Parameters for point set, (c) Parameters for cartesian grid, (d) Model fitting

Chapter 5

Estimation Algorithms

SGeMS provides several tools for estimating a spatially distributed variable Z(u) from a limited number of samples. All algorithms rely on the linear, least-square estimation paradigm called kriging. The kriging estimator can be extended to deal with a variable whose mean is not stationary, and to the case of multiple covariates (cokriging).

5.0.1 Common Parameters

All estimation algorithms in SGeMS work on a 3-D object, loosely called the estimation grid: in the current version, that object can either be a cartesian grid or an unstructured set of points. When an estimation algorithm is run on an object, a new property containing the result of the estimation is attached to that object. All estimation algorithms require the following two parameters:

Parameters description

Grid Name: The grid (or, more generally, the object) on which the estimation is to be performed
Property Name: The name of the new property resulting from the estimation

5.0.2 Kriging

Kriging is a 3-D estimation program for variables defined on a constant volume support. Estimation can be performed by simple kriging (SK), ordinary kriging (OK), kriging with a polynomial trend (KT) or simple kriging with a locally varying mean (LVM). SK, OK and KT were introduced in ??, p. ??. LVM is another variant of kriging, in which the mean m(u) is assumed not constant but known for every u. The kriging problem consists of finding the weights {λ_α}, α = 1, ..., n such that:

Var( sum_{α=1}^{n} λ_α [Z(u_α) − m(u_α)] − [Z(u) − m(u)] )  is minimum

Example: Simple kriging is used to estimate porosity in the second layer of Stanford V. Porosity is modeled by a stationary random function with an anisotropic spherical variogram. Its expected value, inferred from the well data, is 0.17.
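The weights minimizing this error variance solve a linear system built from the covariance model. A minimal 1-D simple kriging sketch, for illustration only (not SGeMS code; an exponential covariance and unit sill are assumed):

```python
import numpy as np

def cov_exp(h, sill=1.0, a=10.0):
    """Exponential covariance with practical range a: C(h) = sill*exp(-3h/a)."""
    return sill * np.exp(-3.0 * np.abs(h) / a)

def simple_kriging(coords, values, mean, u, cov=cov_exp):
    """Simple kriging estimate and variance at location u (1-D coordinates).
    Solves K lambda = k, where K is the data-to-data covariance matrix
    and k the data-to-target covariance vector."""
    coords = np.asarray(coords, dtype=float)
    values = np.asarray(values, dtype=float)
    K = cov(np.abs(coords[:, None] - coords[None, :]))
    k = cov(np.abs(coords - u))
    lam = np.linalg.solve(K, k)
    estimate = mean + lam @ (values - mean)   # SK: residuals from known mean
    variance = cov(0.0) - lam @ k             # SK (kriging) variance
    return estimate, variance
```

At a datum location the estimate honors the datum exactly with zero variance; far from all data it falls back to the SK mean with the full sill as variance.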
The input parameters for algorithm Kriging are reproduced in Fig. 5.1. Fig. 5.2 shows the estimated porosity field.

<parameters>
  <algorithm name="kriging" />
  <Grid_Name value="layer 2" />
  <Property_Name value="krig" />
  <Hard_Data grid="layer 2 well data" property="porosity" />
  <Kriging_Type type="Simple Kriging (SK)" >
    <parameters mean="0.17" />
  </Kriging_Type>
  <Max_Conditioning_Data value="200" />
  <Search_Ellipsoid value="100 100 5 0 0 0 " />
  <Variogram nugget="0.2" structures_count="1" >
    <structure_1 contribution="0.8" type="Spherical" >
      <ranges max="80" medium="40" min="2" />
      <angles x="30" y="0" z="0" />
    </structure_1>
  </Variogram>
</parameters>

Figure 5.1: Simple kriging parameters

Figure 5.2: Porosity estimated by simple kriging. (a) 3D view, (b) Cross-section 1, (c) Cross-section 2

Parameters description

1. Hard Data [Hard Data]: The grid and the name of the property containing the hard data
2. Kriging Type [Kriging Type]: Possible types of kriging include: Simple Kriging (SK), Ordinary Kriging (OK), Kriging with Trend (KT) and Simple Kriging with Locally Varying Mean (LVM)
3. Trend components [Trend]: The trend components. Possible components are, with u = (x, y, z): t1(u) = x, t2(u) = y, t3(u) = z, t4(u) = x·y, t5(u) = x·z, t6(u) = y·z, t7(u) = x², t8(u) = y², t9(u) = z². The trend is coded by a string of 9 flags separated by spaces; for example the string "0 1 0 0 0 1 0 1 0" corresponds to a trend with components t2, t6 and t8, i.e. T(u) = α y + β yz + δ y²
4. Local Mean Property [Local Mean Property]: The property of the simulation grid containing the non-stationary mean. A mean value must be available at each location to be estimated
5. Max Conditioning Data [Max Conditioning Data]: The maximum number of conditioning data to be used for kriging at any location
6. Search Ellipsoid [Search Ellipsoid]: The ranges and angles defining the search ellipsoid
7.
Variogram [Variogram]: The variogram of the variable to be estimated by kriging

5.0.3 Indicator Kriging

Let Z(u) be a continuous random variable, and I(u, z_k) the binary indicator function defined at cutoff z_k:

I(u, z_k) = 1 if Z(u) ≤ z_k, 0 otherwise

The aim of indicator kriging is to estimate the conditional cumulative distribution function (ccdf) at any cutoff z_k, conditional to the data (n):

I*(u, z_k) = E*( I(u, z_k) | (n) ) = Prob*( Z(u) ≤ z_k | (n) )    (5.1)

I*(u, z_k) is estimated by the simple kriging estimator:

I*(u, z_k) − E{I(u, z_k)} = sum_{α=1}^{n} λ_α ( I(u_α, z_k) − E{I(u_α, z_k)} )

Estimating I*(u, z_k) for different cutoffs z_k, k = 1, ..., K, yields a discrete estimate of the conditional cumulative distribution function at the threshold values z_1, ..., z_K. The algorithm Indicator Kriging assumes that the marginal probabilities E{I(u_α, z_k)} are constant and known for all u_α and z_k (simple indicator kriging, ref ??).

Median IK

Estimating a conditional cumulative distribution function at a given location u requires the knowledge of the variogram of each of the indicator variables I(·, z_k), k = 1, ..., K. Inference of these K variograms can be a daunting task. Moreover, a kriging system has to be solved for each indicator variable. The inference and computational burden can be alleviated if two conditions are met:

• the K indicator variables are intrinsically correlated (ref ??):

γ_Z(h) = γ_I(h, z_k) = γ_I(h, z_k, z_k')  for all z_k, z_k'

where γ_Z(h) is the variogram of variable Z(u) and γ_I(h, z_k, z_k') is the cross-variogram between indicator variables I(·, z_k) and I(·, z_k'). All these variograms are standardized to a unit sill
• all vectors of hard indicator data are complete: there are no missing values, as could result from inequality constraints on the Z value.
When those two conditions are met, it is only necessary to infer one variogram γ_Z(h), and only one kriging system has to be solved for all thresholds z_k (indeed, the data locations and the variogram are the same for all thresholds). This simplification is called median indicator kriging.

Ensuring the validity of the estimated cdf

Since kriging is a non-convex estimator, the cdf values Prob*( Z(u) ≤ z_k ), k = 1, ..., K estimated by indicator kriging may not satisfy the properties of a cdf F, that is:

F(z_k) = Prob( Z(u) ≤ z_k ) ∈ [0, 1]  for all k ∈ {1, ..., K}    (5.2)
F(z_k) ≥ F(z_k')  for all z_k ≥ z_k'    (5.3)

If properties 5.2 and 5.3 are not verified, the program Indicator Kriging modifies the kriged values so as to ensure the validity of the ccdf. The corrections are performed in 3 steps:

1. Upward correction:
• Loop through all the cutoffs z_1, ..., z_K, starting with the lowest cutoff z_1.
• At cutoff z_k, if the estimated probability I*(u, z_k) is not in [I*(u, z_{k−1}), 1], reset it to the closest bound.
2. Downward correction:
• Loop through all the cutoffs z_1, ..., z_K, starting with the highest cutoff z_K.
• At cutoff z_k, if the estimated probability I*(u, z_k) is not in [0, I*(u, z_{k+1})], reset it to the closest bound.
3. The final corrected probability values are the averages of the upward and downward corrections.

Categorical Variables

Indicator Kriging can also be applied to categorical variables, i.e. variables that take a discrete, finite number of values (also called classes or categories): z(u) ∈ {0, ..., K − 1}. The indicator variable for class k is then defined as:

I(u, k) = 1 if Z(u) = k, 0 otherwise

and the probability I*(u, k) of Z(u) belonging to class k is estimated by simple kriging:

I*(u, k) − E{I(u, k)} = sum_{α=1}^{n} λ_α ( I(u_α, k) − E{I(u_α, k)} )

In the case of categorical variables, the estimated probabilities must all be in [0, 1] and verify:

sum_{k=0}^{K−1} Prob( Z(u) = k | (n) ) = 1    (5.4)

If not, they are corrected as follows:

1.
If I*(u, k) ∉ [0, 1], reset it to the closest bound. If all the probability values are lesser than or equal to 0, no correction is made and a warning is issued.
2. Standardize the values so that they sum to 1:

I*_corrected(u, k) = I*(u, k) / sum_{i=0}^{K−1} I*(u, i)

Parameters description

1. Hard Data Grid [Hard Data Grid]: The grid containing the hard data
2. Hard Data Property [Hard Data Property]: The names of the properties containing the hard data. If there are K indicators, K properties must be specified. Each property contains the indicator values I(u, k), k = 1, ..., K
3. Categorical Variable Flag [Categorical Variable Flag]: A flag indicating whether the variable to be kriged is categorical or continuous. If the flag is on (equal to 1), the variable is assumed categorical
4. Marginal Probabilities [Marginal Probabilities]: The marginal probabilities of each indicator, separated by one or more spaces. The first probability corresponds to the first indicator, and so on
5. Max Conditioning Data [Max Conditioning Data]: The maximum number of conditioning data to be used for each kriging
6. Search Ellipsoid [Search Ellipsoid]: The ranges and angles defining the search ellipsoid
7. Median IK Flag [Median Ik Flag]: A flag indicating whether median IK should be performed
8. Full IK Flag [Full Ik Flag]: A flag indicating whether full IK should be performed. If Median Ik Flag is on/off, Full Ik Flag is off/on
9. Median IK Variogram [Variogram Median Ik]: The variogram of the indicator variables used for median IK. If Median Ik Flag is off, this parameter is ignored
10. Full IK Variogram [Variogram Full Ik]: The variograms of each indicator variable. If Median Ik Flag is on, this parameter is ignored

5.0.4 CoKriging

The aim of cokriging is to estimate a variable Z1(u) accounting for data on related variables Z2, ..., Z_{J+1}.
The cokriging estimator is given by:

Z_1*(u) − m_1(u) = sum_{α=1}^{n} λ_α [ Z_1(u_α) − m_1(u_α) ] + sum_{j=2}^{J+1} sum_{β_j=1}^{N_j} λ_{β_j} [ Z_j(u_{β_j}) − m_j(u_{β_j}) ]    (5.5)

Algorithm CoKriging distinguishes between the two cases of simple and ordinary cokriging:

• the means m_1(u), ..., m_{J+1}(u) are known and constant: simple cokriging
• the means are locally constant but unknown: ordinary cokriging. The weights λ_α and λ_{β_j} must then satisfy:

sum_{α} λ_α = 1    (5.6)
sum_{β_j} λ_{β_j} = 0  for all j ∈ {2, ..., J+1}    (5.7)

Solving the cokriging system calls for the inference and modeling of multiple variograms: the variogram of each variable and all cross-variograms between any two variables. In order to alleviate the burden of modeling all these variograms, two models have been proposed: the Markov Model 1 and the Markov Model 2 (ref ??). For text clarity, we will only consider the case of a single secondary variable (J = 1).

Markov Model 1

The Markov Model 1 (MM1) considers the following Markov-type screening hypothesis:

E( Z_2(u) | Z_1(u), Z_1(u+h) ) = E( Z_2(u) | Z_1(u) )    (5.8)

i.e. the dependence of the secondary variable on the primary is limited to the co-located primary variable. The cross-variogram (or cross-covariance) is then proportional to the auto-variogram (auto-covariance) of the primary variable (??):

C_12(h) = ( C_12(0) / C_11(0) ) C_11(h)    (5.9)

where C_12 is the covariance between Z_1 and Z_2 and C_11 is the covariance of Z_1. Solving the cokriging system under the MM1 model only calls for knowledge of C_11, hence the inference and modeling effort is the same as for univariate kriging. Although very congenial, the MM1 model should not be used when the support of the secondary variable Z_2 is larger than that of Z_1, lest the variance of Z_1 be underestimated. It is better to use the Markov Model 2 in that case.

Markov Model 2

The Markov Model 2 (MM2) was developed for the case where the volume support of the secondary variable is larger than that of the primary variable (ref ??). This is often the case with remote sensing and seismic-related data.
The Markov-type hypothesis is now:

E( Z_1(u) | Z_2(u), Z_2(u+h) ) = E( Z_1(u) | Z_2(u) )    (5.10)

i.e. the dependence of the primary variable on the secondary is limited to the co-located secondary variable. The cross-covariance is now proportional to the covariance of the secondary variable:

C_12(h) = ( C_12(0) / C_22(0) ) C_22(h)    (5.11)

In order for all three covariances C_11, C_12 and C_22 to be consistent, ref(??) proposed to model C_11 as a linear combination of C_22 and any permissible residual covariance C_R. Expressed in terms of correlograms (C_11(h) = C_11(0) ρ_11(h)), this is written:

ρ_11(h) = ρ_12² ρ_22(h) + (1 − ρ_12²) ρ_R(h)    (5.12)

Parameters description

1. Primary Hard Data Grid [Primary Harddata Grid]: The grid containing the primary hard data
2. Primary Variable [Primary Variable]: The name of the property containing the primary hard data
3. Assign Hard Data [Assign Hard Data]: A flag specifying whether the hard data should be re-located onto the simulation grid. See ??
4. Secondary Hard Data [Secondary Harddata Grid]: The grid containing the secondary hard data
5. Secondary Variable [Secondary Variable]: The name of the property containing the secondary hard data
6. Kriging Type [Kriging Type]: Possible types of cokriging include: Simple Kriging (SK) and Ordinary Kriging (OK). Note that selecting OK while retaining only a single secondary datum (e.g. when doing co-located cokriging) amounts to completely ignoring the secondary information: from Eq. (5.7), the weight associated with the single secondary datum will be 0
7. SK Means [SK Means]: The means of the primary and secondary variables. If the selected kriging type is Ordinary Kriging, this parameter is ignored
8. Co-kriging Type [Cokriging Type]: The cokriging option. Available options are: Full Cokriging, Co-located Cokriging with Markov Model 1 (MM1), and Co-located Cokriging with Markov Model 2 (MM2)
9.
Max Conditioning Data [Max Conditioning Data]: The maximum number of primary conditioning data to be used for each cokriging
10. Search Ellipsoid - Primary Variable [Search Ellipsoid 1]: The ranges and angles defining the search ellipsoid used to find the primary conditioning data
11. Primary Variable Variogram [Variogram C11]: The variogram model for the primary variable
12. Max Conditioning Data [Max Conditioning Data 2]: The maximum number of secondary conditioning data used for each cokriging. This parameter is only required if Cokriging Type is set to Full Cokriging
13. Search Ellipsoid - Secondary Variable [Search Ellipsoid 2]: The ranges and angles defining the search ellipsoid used to find the secondary conditioning data. This parameter is only required if Cokriging Type is set to Full Cokriging
14. Cross-Variogram [Variogram C12]: The cross-variogram between the primary and secondary variables. This parameter is only required if Cokriging Type is set to Full Cokriging
15. Secondary Variable Variogram [Variogram C22]: The variogram of the secondary variable. This parameter is only required if Cokriging Type is set to Full Cokriging
16. Coefficient of correlation [Correl Z1Z2]: The coefficient of correlation between the primary and secondary variables. This parameter is only required if Cokriging Type is set to MM1
17. Coefficient of correlation [MM2 Correl Z1Z2]: The coefficient of correlation between the primary and secondary variables. This parameter is only required if Cokriging Type is set to MM2
18. Secondary Variable Variogram [MM2 Variogram C22]: The variogram of the secondary variable. This parameter is only required if Cokriging Type is set to MM2. Note that the variograms of the secondary and primary variables must verify condition (5.12)

Chapter 6

Simulation Algorithms

This chapter presents the collection of stochastic simulation algorithms provided by SGeMS.
Section 6.1 presents the traditional variogram-based (two-point) algorithms: SGSIM (sequential Gaussian simulation), SISIM (sequential indicator simulation), COSGSIM (sequential Gaussian co-simulation) and COSISIM (sequential indicator co-simulation). SGSIM and COSGSIM are the algorithms of choice for most applications with continuous variables, while SISIM and COSISIM are the choice for categorical variables.

Section 6.2 gives detailed descriptions of the two multiple-point simulation (mps) algorithms: SNESIM (single normal equation simulation) and FILTERSIM (filter-based simulation). SNESIM only works for categorical variables, as involved in facies distributions, while FILTERSIM is suited to both continuous and categorical variables.

Each simulation algorithm presented in this chapter is demonstrated with some example runs.

6.1 Variogram-based Simulations

This section covers the variogram-based sequential simulation algorithms implemented in SGeMS. The stochastic realizations from any of these algorithms draw their spatial patterns from the input variogram models. Variogram-based algorithms are to be used when the critical patterns are reasonably amorphous (high entropy) and can be represented by bivariate statistics. Cases where variograms fail to reproduce critical spatial patterns should lead to using the multiple-point geostatistics algorithms described in the next section.

Variogram-based sequential simulations are the most popular stochastic simulation algorithms, mostly owing to their robustness and ease of conditioning, both to hard and soft data. Moreover, they do not require a rasterized (regular or cartesian) grid and allow simulation on irregular grids such as point sets. SGeMS takes full advantage of this flexibility: all the simulation algorithms in this section work both on point sets and on cartesian grids. Moreover, the conditioning data may or may not be on the simulation grid.
However, working on a point set induces a performance penalty, as the search for neighboring data is significantly more costly than on a regular (cartesian) grid. When the simulation grid is cartesian, all algorithms have the option to relocate the conditioning data to the nearest grid node for increased execution speed. The re-allocation strategy is to move each datum to its closest grid node; if two data share the same closest grid node, the farthest one is ignored.

This section first presents the simulation algorithms requiring some Gaussian assumption: the sequential Gaussian simulation SGSIM, and the sequential Gaussian co-simulation COSGSIM for integration of secondary information through a coregionalization model. Next, the direct sequential simulation DSSIM is presented; DSSIM does not require any Gaussian assumption, but may not reproduce the target distribution. Finally, the implementations of the indicator simulation algorithms SISIM and COSISIM are discussed. These last two algorithms rely on the decomposition of the cumulative distribution function by a set of thresholds: at any location, the probability of exceeding each threshold is estimated by kriging, and these probabilities are combined to construct the local conditional cumulative distribution function (ccdf) from which simulation is performed.

6.1.1 SGSIM: sequential Gaussian simulation

Let Y(u) be a multivariate Gaussian random function with zero mean, unit variance, and a given variogram model γ(h). Realizations of Y(u) conditioned to the data (n) can be generated by Algorithm 6.1.

A non-Gaussian random function Z(u) must first be transformed into a Gaussian random function Y(u): Z(u) ↦ Y(u). If no analytical model is available, a normal score transform may be applied; Algorithm 6.1 then becomes Algorithm 6.2.

Using the normal score transform involves more than just transforming and back-transforming the data at simulation time.
Algorithm 6.1 Sequential Gaussian Simulation
1: Define a random path visiting each node of the grid
2: for each node u along the path do
3:   Get the conditioning data, consisting of both neighboring original data (n) and previously simulated values
4:   Estimate the local conditional cdf as a Gaussian distribution, with mean given by a form of kriging and variance given by the kriging variance
5:   Draw a value from that Gaussian ccdf and add the simulated value to the data set
6: end for
7: Repeat for another realization

Algorithm 6.2 Sequential Gaussian Simulation with normal score transform
1: Transform the data into normal score space: Z(u) ↦ Y(u)
2: Perform Algorithm 6.1
3: Back-transform the simulated Gaussian field into the data space: Y(u) ↦ Z(u)

First, the algorithm calls for the variogram of the normal scores, not of the original data. Only that normal score variogram is guaranteed to be reproduced, not the original Z-value variogram. However, in many cases the back transform does not adversely affect the reproduction of the Z-value variogram model. If required, the SGeMS implementation of SGSIM can automatically perform the normal score transform of the original hard data and back-transform the simulated realizations. The user must still independently perform the normal score transform of the hard data with TRANS (see Algorithm 3.1) in order to calculate and model the normal score variogram.

At each node along the path, the algorithm can solve either a simple kriging, ordinary kriging, kriging with a local mean, or kriging with a trend system. Theory only guarantees reproduction of the variogram with the simple kriging system.

SGSIM with local varying mean

In most cases, the locally varying mean z_m(u) = E{Z(u)} is given as a Z-value and must be converted into Gaussian space, such that y_m(u) = E{Y(u)}. Transforming z_m(u) into y_m(u) using F_Z and the rank-preserving transform y_m(u) = F_Y^{-1}( F_Z( z_m(u) ) ) would not, in all generality, ensure that y_m(u) = E{Y(u)}.
A better alternative, when possible, is to obtain y_m(u) directly by calibration of the secondary information with the normal scores y of the primary attribute z.

A note on the Gaussian spatial law

Gaussian random functions have very specific and consequential spatial structure and distribution laws. Median values are maximally correlated, while extreme values are increasingly less correlated; that property is known as destructuration. If the phenomenon under study is known to have well-correlated extreme values, a Gaussian-related simulation algorithm is not appropriate.

Example: Figure 6.6 shows two conditional realizations on a point set using simple kriging and a normal score transform of the hard data.

Parameters description

The SGSIM algorithm is activated from Simulation—sgsim in the algorithm panel. The main SGSIM interface contains 3 pages: General, Data and Variogram (see Figure 6.1). The text inside '[ ]' is the corresponding keyword in the SGSIM parameter file.

1. Simulation Grid Name [Grid Name]: Name of the simulation grid
2. Property Name Prefix [Property Name]: Prefix for the simulation output. The suffix real# is added for each realization
3. # of realizations [Nb Realizations]: Number of simulations to generate
4. Seed [Seed]: Seed for the random number generator
5. Kriging Type [Kriging Type]: The type of kriging system to be solved at each node along the random path. The simple kriging (SK) mean is set to zero
6. Hard Data—Object [Hard Data.grid]: Name of the grid containing the conditioning data. If no grid is selected, the realizations are unconditional
7. Hard Data—Property [Hard Data.property]: Property for the conditioning data. Only required if a grid has been selected in Hard Data—Object [Hard Data.grid]
8. Assign hard data to simulation grid [Assign Hard Data]: If selected, the hard data are copied onto the simulation grid. The program does not proceed if the copying fails.
This option significantly increases execution speed.
9. Max Conditioning data [Max Conditioning Data]: Maximum number of data to be retained in the search neighborhood.
10. Search Ellipsoid Geometry [Search Ellipsoid]: Parametrization of the search ellipsoid.
11. Target Histogram: If used, the data are normal score transformed prior to simulation and the simulated field is transformed back to the original space. Check the Use Target Histogram [Use Target Histogram] flag to enable the normal score transform; [nonParamCdf] parametrizes the target histogram (see section 3.1).
12. Variogram [Variogram]: Parametrization of the normal score variogram.

6.1.2 COSGSIM: Sequential Gaussian co-simulation

COSGSIM simulates a Gaussian variable accounting for secondary information. Let Y1(u) and Y2(u) be two correlated multiGaussian random variables; Y1(u) is called the primary variable, and Y2(u) the secondary variable. The COSGSIM simulation of the primary variable Y1, conditioned to both primary and secondary data, is described in Algorithm 6.3.

Algorithm 6.3 Sequential Gaussian Cosimulation
1: Define a path visiting each node u of the grid
2: for each node u do
3:    Get the conditioning data, consisting of neighboring original data, previously simulated values and secondary data
4:    Get the local Gaussian ccdf for the primary attribute, with its mean given by a cokriging estimate and its variance by the cokriging variance
5:    Draw a value from that Gaussian ccdf and add it to the data set
6: end for

If the primary and secondary variables are not Gaussian, make sure that the transformed variables Y1 and Y2 are at least biGaussian. If they are not, another simulation algorithm should be considered, for example COSISIM (see section 6.1.5). If no analytical model is available for such a transformation, a normal score transform may be applied to both variables. The TRANS algorithm, however, only ensures that the respective marginals are Gaussian.
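The limits of that guarantee are easy to demonstrate numerically. The following sketch (illustrative only, not SGeMS code) builds two variables that are each exactly standard normal marginally, yet are not biGaussian:

```python
import numpy as np

# Marginally Gaussian but not biGaussian: a classic sign-flip construction.
rng = np.random.default_rng(0)
y1 = rng.standard_normal(100_000)
flip = np.where(np.abs(y1) < 1.54, 1.0, -1.0)   # flip only the extreme values
y2 = flip * y1      # y2 is still standard normal marginally (by symmetry)

# Both marginals pass a normality check and the pair is nearly uncorrelated,
# yet (y1, y2) is NOT bivariate Gaussian: their extremes are anti-correlated,
# so the correlation coefficient does not describe their joint behavior.
corr = np.corrcoef(y1, y2)[0, 1]
```

Here |y2| equals |y1| exactly, so the two variables are strongly dependent even though their linear correlation is nearly zero; a coregionalization model based on correlation alone would miss this dependence entirely.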
Algorithm 6.3 then becomes Algorithm 6.4.

Algorithm 6.4 Sequential Gaussian Cosimulation for non-Gaussian variables
1: Transform Z1 and Z2 into Gaussian variables Y1 and Y2, according to Eq. (3.5)
2: Perform Algorithm 6.3
3: Back-transform the simulated values y1,1, ..., y1,N into z1,1, ..., z1,N

Example: Figure 6.11 shows two conditional realizations with an MM1-type coregionalization and simple cokriging. Both the primary and secondary information are normal score transformed.

Parameters description

The COSGSIM algorithm is activated from Simulation—cosgsim in the Algorithm Panel. The COSGSIM interface contains 5 pages: General, Primary Data, Secondary Data, Primary Variogram and Secondary Variogram (see Figure 6.2). The text inside ‘[ ]’ is the corresponding keyword in the COSGSIM parameter file.

1. Simulation Grid Name [Grid Name]: Name of the simulation grid.
2. Property Name Prefix [Property Name]: Prefix for the simulation output. The suffix real# is added for each realization.
3. # of realizations [Nb Realizations]: Number of simulations to generate.
4. Seed [Seed]: Seed for the random number generator.
5. Kriging Type [Kriging Type]: Type of kriging system to be solved at each node along the random path.
6. Cokriging Option [Cokriging Type]: Type of coregionalization model: LMC, MM1 or MM2.
7. Primary Hard Data Grid [Primary Harddata Grid]: Grid containing the primary variable. If no grid is selected, the realizations are unconditional.
8. Primary Property [Primary Variable]: Hard data property for the primary variable.
9. Assign hard data to simulation grid [Assign Hard Data]: If selected, the hard data are copied onto the simulation grid. The program does not proceed if the copying fails. This option significantly increases execution speed.
10. Primary Max Conditioning data [Max Conditioning Data 1]: Maximum number of primary data to be retained in the search neighborhood.
11. Primary Search Ellipsoid Geometry [Search Ellipsoid 1]: Parametrization of the search ellipsoid for the primary variable.
12. Target Histogram: If used, the primary data are normal score transformed prior to the simulation and the simulated field is transformed back to the original space. Check the Transform Primary Variable [Transform Primary Variable] flag to enable the normal score transform; [nonParamCdf primary] parametrizes the primary variable target histogram (see section 3.1).
13. Secondary Data Grid [Secondary Harddata Grid]: Grid containing the secondary variable.
14. Secondary Property [Secondary Variable]: Data property for the secondary variable.
15. Secondary Max Conditioning data [Max Conditioning Data 2]: Maximum number of secondary data to be retained in the search neighborhood.
16. Secondary Search Ellipsoid Geometry [Search Ellipsoid 2]: Parametrization of the search ellipsoid for the secondary variable.
17. Target Histogram: If used, the secondary data are normal score transformed prior to the simulation and the simulated field is transformed back to the original space. Check the Transform Secondary Variable [Transform Secondary Variable] flag to enable the normal score transform; [nonParamCdf secondary] parametrizes the secondary variable target histogram (see section 3.1).
18. Variogram for primary variable [Variogram C11]: Parametrization of the normal score variogram for the primary variable.
19. Cross-variogram between primary and secondary variables [Variogram C12]: Parametrization of the cross-variogram for the normal score primary and secondary variables. Required if Cokriging Option [Cokriging Type] is set to Full Cokriging.
20. Variogram for secondary variable [Variogram C22]: Parametrization of the normal score variogram for the secondary variable. Required if Cokriging Option [Cokriging Type] is set to Full Cokriging.
21. Coef. Correlation Z1,Z2: Coefficient of correlation between the primary and secondary variables.
Only required if Cokriging Option [Cokriging Type] is set to MM1 or MM2. The correlation keyword is [Correl Z1Z2] for the MM1 coregionalization, and [MM2 Correl Z1Z2] for the MM2 coregionalization.
22. Covariance for secondary variable [MM2 Variogram C22]: Parametrization of the normal score variogram for the secondary variable. Required if Cokriging Option [Cokriging Type] is set to MM2.

6.1.3 DSSIM: Direct Sequential Simulation

The direct sequential simulation algorithm DSSIM simulates continuous attributes without prior indicator coding or Gaussian transform. It can be shown that the only condition for the model variogram to be reproduced is that the ccdf has as its mean and variance the simple kriging estimate and variance. The shape of the ccdf does not matter; it need not even be the same at each node. The drawback is that there is no guarantee that the marginal distribution is reproduced.

One solution is to post-process the simulated realizations with a rank transform algorithm to identify the target histogram (see algorithm TRANS in section 3.1); this may affect variogram reproduction. The second alternative is to set the shape of the local ccdf, at all locations along the path, such that the marginal distribution is approximated at the end of each realization.

SGeMS offers two distribution types for the first alternative: the ccdf is either a uniform distribution or a lognormal distribution. Neither of these distributions would produce realizations with uniform or lognormal marginal distributions, so some rank transformation may be required to identify the target marginal histogram. For the second alternative, the method proposed by Soares (2001) is implemented: the ccdf is sampled from the data marginal distribution, modified to be centered on the kriging estimate with spread equal to the kriging variance. The resulting shape of each local ccdf differs from location to location.
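The re-centering idea can be mimicked with a short sketch. This is a simplified illustration under assumed names, not the SGeMS implementation: the global marginal sample is standardized, rescaled to the kriging mean and variance, and then sampled uniformly.

```python
import numpy as np

def soares_local_draw(global_sample, sk_mean, sk_var, rng):
    """Hedged sketch of a Soares (2001)-style local draw: standardize the
    global marginal sample, recenter it on the kriging estimate with spread
    given by the kriging variance, then draw from that local distribution."""
    g = np.asarray(global_sample, dtype=float)
    z = (g - g.mean()) / g.std()              # shape of the global histogram
    local = sk_mean + np.sqrt(sk_var) * z     # recentered, rescaled ccdf
    return rng.choice(local)                  # sample the local ccdf

rng = np.random.default_rng(1)
global_sample = rng.lognormal(0.0, 0.5, 5000)   # a skewed global marginal
draws = np.array([soares_local_draw(global_sample, 2.0, 0.25, rng)
                  for _ in range(4000)])
# draws are centered on the kriging mean (2.0) with the kriging variance
# (0.25), while keeping the skewed shape of the global histogram
```

Note how the local distribution keeps the shape of the global one; only its center and spread change from location to location.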
The method gives reasonable results for a symmetric (even multi-modal) target distribution, but less so for a highly skewed one; in this latter case the first option of a lognormal ccdf type followed by a rank transform may give better results.

Algorithm 6.5 Direct Sequential Simulation
1: Define a random path visiting each node u of the grid
2: for each location u along the path do
3:    Get the conditioning data, consisting of both neighboring original data and previously simulated values
4:    Define the local ccdf with its mean and variance given by the kriging estimate and variance
5:    Draw a value from that ccdf and add the simulated value to the data set
6: end for

Example: Figure 6.8 shows two conditional realizations on a point set using the Soares method and simple kriging. A local varying mean with the Soares method is used in Figure 6.9.

Parameters description

The DSSIM algorithm is activated from Simulation—dssim in the upper part of the Algorithm Panel. The main DSSIM interface contains 3 pages: General, Data and Variogram (see Figure 6.3). The text inside ‘[ ]’ is the corresponding keyword in the DSSIM parameter file.

1. Simulation Grid Name [Grid Name]: Name of the simulation grid.
2. Property Name Prefix [Property Name]: Prefix for the simulation output. The suffix real# is added for each realization.
3. # of realizations [Nb Realizations]: Number of simulations to generate.
4. Seed [Seed]: Seed for the random number generator.
5. Kriging Type [Kriging Type]: Form of kriging system to be solved at each node along the random path.
6. SK mean [SK mean]: Mean of the attribute. Only required if Kriging Type [Kriging Type] is set to Simple Kriging (SK).
7. Hard Data—Object [Hard Data.grid]: Name of the grid containing the conditioning data. If no grid is selected, the realizations are unconditional.
8. Hard Data—Property [Hard Data.property]: Property for the conditioning data.
Only required if a grid has been selected in Hard Data—Object [Hard Data.grid].
9. Assign hard data to simulation grid [Assign Hard Data]: If selected, the hard data are copied onto the simulation grid. The program does not proceed if the copying fails. This option significantly increases execution speed.
10. Max Conditioning data [Max Conditioning Data]: Maximum number of data to be retained in the search neighborhood.
11. Search Ellipsoid Geometry [Search Ellipsoid]: Parametrization of the search ellipsoid.
12. Distribution type [cdf type]: Type of cdf to be built at each location along the random path.
13. LogNormal parameters: Only activated if Distribution type [cdf type] is set to LogNormal. Parametrization of the global lognormal distribution: the mean is specified by Mean [LN mean] and the variance by Variance [LN variance].
14. Uniform parameters: Only activated if Distribution type [cdf type] is set to Uniform. Parametrization of the global uniform distribution: the minimum is specified by Min [U min] and the maximum by Max [U max].
15. Soares Distribution [nonParamCdf]: Only activated if Distribution type [cdf type] is set to Soares. Parametrization of the global distribution from which the local distributions are sampled (see section 3.1).
16. Variogram [Variogram]: Parametrization of the variogram. For this algorithm, the sill of the variogram is critical information and should not be standardized to 1.

6.1.4 SISIM: sequential indicator simulation

Sequential indicator simulation SISIM combines the indicator formalism