Skip to content
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -128,3 +128,5 @@ docs/source/sg_execution_times.rst
*.vtu
*.vtr
*.vtk

*.npz
2 changes: 2 additions & 0 deletions docs/source/conf.py
Original file line number Diff line number Diff line change
Expand Up @@ -302,6 +302,7 @@ def setup(app):
"../../examples/10_normalizer/",
"../../examples/11_plurigaussian/",
"../../examples/12_sum_model/",
"../../examples/13_mps/",
],
# path where to save gallery generated examples
"gallery_dirs": [
Expand All @@ -318,6 +319,7 @@ def setup(app):
"examples/10_normalizer/",
"examples/11_plurigaussian/",
"examples/12_sum_model/",
"examples/13_mps/",
],
# Pattern to search for example files
"filename_pattern": r"\.py",
Expand Down
1 change: 1 addition & 0 deletions docs/source/tutorials.rst
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ explore its whole beauty and power.
examples/10_normalizer/index
examples/11_plurigaussian/index
examples/12_sum_model/index
examples/13_mps/index
examples/00_misc/index

.. only:: html
Expand Down
656 changes: 656 additions & 0 deletions examples/13_mps/00_mps_overview.ipynb

Large diffs are not rendered by default.

54 changes: 54 additions & 0 deletions examples/13_mps/00_simple_unconditional.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,54 @@
r"""
A first Direct Sampling simulation
----------------------------------

This is the minimal Multiple Point Statistics example: build a training image,
wrap it in a :any:`TrainingImage`, and generate one unconditional realization
with :any:`DirectSampling`.

We use a small, synthetic *channelized* training image generated with NumPy, so
the example is fast and needs no downloads.
"""

import matplotlib.pyplot as plt
import numpy as np

import gstools as gs

###############################################################################
# Create a synthetic binary training image with curvilinear "channels".
# The two facies (0 and 1) form connected, meandering bands — exactly the kind
# of structure two-point statistics struggles to reproduce.

gx, gy = np.meshgrid(np.arange(60), np.arange(60), indexing="ij")
ti_data = ((np.sin(gx / 5.0) + np.sin((gx + gy) / 8.0)) > 0).astype(float)

###############################################################################
# Wrap the array in a :any:`TrainingImage`. For a categorical variable (facies
# codes) the distance is the fraction of mismatching neighbours, so the
# ``distance`` argument is ignored here.

ti = gs.TrainingImage(ti_data, categorical=True)
print(ti)

###############################################################################
# Create the :any:`DirectSampling` generator and simulate on a 40x40 grid.
#
# * ``n_neighbors`` — how many already-known cells define each data event.
# * ``scan_fraction`` — fraction of the training image scanned per cell
# (smaller is faster, slightly noisier).
# * ``threshold=0.0`` — the recommended DSBC mode: always take the best match.

ds = gs.DirectSampling(ti, n_neighbors=12, scan_fraction=0.3, threshold=0.0)
field = ds([np.arange(40, dtype=float)] * 2, seed=20250616)

###############################################################################
# Plot the training image next to the realization. The realization is not a
# copy of the TI, but it reproduces the same channel patterns.

fig, (ax0, ax1) = plt.subplots(1, 2, figsize=(10, 5))
ax0.imshow(ti_data, cmap="cividis", origin="lower")
ax0.set_title("Training image")
ax1.imshow(field, cmap="cividis", origin="lower")
ax1.set_title("DS realization")
fig.tight_layout()
56 changes: 56 additions & 0 deletions examples/13_mps/01_conditional.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,56 @@
r"""
Conditioning to hard data
-------------------------

Real simulations must honour measured data (boreholes, samples). Direct
Sampling supports this through :meth:`~gstools.DirectSampling.set_condition`:
conditioning values are pinned into the grid before simulation and preserved
exactly, while the rest of the field is filled in around them.

We reuse the synthetic channel training image from the first example.
"""

import matplotlib.pyplot as plt
import numpy as np

import gstools as gs

###############################################################################
# Same synthetic channelized training image as before.

gx, gy = np.meshgrid(np.arange(60), np.arange(60), indexing="ij")
ti_data = ((np.sin(gx / 5.0) + np.sin((gx + gy) / 8.0)) > 0).astype(float)
ti = gs.TrainingImage(ti_data, categorical=True)

###############################################################################
# Draw 40 random "hard data" points and read their facies from the TI. In a
# real study these would be field measurements; here we sample the TI so the
# conditioning data are consistent with the patterns.

rng = np.random.default_rng(0)
cond_x = rng.integers(0, 40, 40).astype(float)
cond_y = rng.integers(0, 40, 40).astype(float)
cond_val = ti_data[cond_x.astype(int), cond_y.astype(int)]

###############################################################################
# Set the conditioning data and simulate. ``set_condition`` snaps each point to
# its nearest grid node, so the values are honoured exactly at those cells.

ds = gs.DirectSampling(ti, n_neighbors=12, scan_fraction=0.3, threshold=0.0)
ds.set_condition([cond_x, cond_y], cond_val)
field = ds([np.arange(40, dtype=float)] * 2, seed=7)

honored = int(
(field[cond_x.astype(int), cond_y.astype(int)] == cond_val).sum()
)
print(f"conditioning honoured: {honored}/{cond_val.size}")

###############################################################################
# Plot the realization with the conditioning points overlaid. Every marker sits
# on a cell whose simulated facies matches the datum.

fig, ax = plt.subplots(figsize=(6, 5))
ax.imshow(field, cmap="cividis", origin="lower")
ax.scatter(cond_y, cond_x, c=cond_val, cmap="cividis", edgecolors="red", s=30)
ax.set_title("Conditional DS realization (red-edged = hard data)")
fig.tight_layout()
60 changes: 60 additions & 0 deletions examples/13_mps/02_continuous.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,60 @@
r"""
Continuous variables and distance metrics
-----------------------------------------

Direct Sampling is not limited to categorical facies: with ``categorical=False``
it simulates continuous variables (permeability, porosity, elevation, ...). The
``distance`` argument then selects how two patterns are compared:

* ``"l1"`` / ``"l2"`` — Manhattan / Euclidean distance on the raw values
(Mariethoz et al., 2010, Eq. 6 / Eq. 4).
* ``"variation"`` — compares only the *relative* variations within a pattern
(Eq. 9), tolerating a locally varying mean. Useful for non-stationary data.

Here we use a smooth continuous training image and a small acceptance
``threshold`` (standard DS mode) to allow approximate matches.

.. note::

The acceptance ``threshold`` is **not comparable across metrics**: the
``"variation"`` distance is normalised by ``2·d_max``, so the same value is
stricter than it would be for ``"l1"``/``"l2"``. Re-tune it when you switch
the distance metric.
"""

import matplotlib.pyplot as plt
import numpy as np

import gstools as gs

###############################################################################
# A smooth, continuous synthetic training image.

gx, gy = np.meshgrid(np.arange(60), np.arange(60), indexing="ij")
ti_data = np.sin(gx / 6.0) * np.cos(gy / 8.0)

###############################################################################
# Build a continuous training image with the Euclidean (``"l2"``) distance.

ti = gs.TrainingImage(ti_data, categorical=False, distance="l2")
print(ti)

###############################################################################
# Simulate. ``threshold=0.03`` accepts the first pattern within that distance
# (standard DS), which is faster than the exhaustive best-candidate search for
# continuous variables.

ds = gs.DirectSampling(ti, n_neighbors=12, scan_fraction=0.3, threshold=0.03)
field = ds([np.arange(32, dtype=float)] * 2, seed=3)

###############################################################################
# Plot. The realization reproduces the smooth wavy structure of the TI without
# copying it. A shared colour scale makes the comparison fair.

vmin, vmax = ti_data.min(), ti_data.max()
fig, (ax0, ax1) = plt.subplots(1, 2, figsize=(10, 5))
im = ax0.imshow(ti_data, cmap="RdBu_r", origin="lower", vmin=vmin, vmax=vmax)
ax0.set_title("Training image (continuous)")
ax1.imshow(field, cmap="RdBu_r", origin="lower", vmin=vmin, vmax=vmax)
ax1.set_title("DS realization")
fig.colorbar(im, ax=(ax0, ax1), shrink=0.7)
83 changes: 83 additions & 0 deletions examples/13_mps/03_channel_strebelle.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,83 @@
r"""
A real training image: the Strebelle channels
----------------------------------------------

The previous examples used tiny synthetic training images. Here we use the
classic **Strebelle (2002) channelized fluvial training image**, the de-facto
benchmark for MPS, and condition the simulation on random hard data.

.. note::

**Data source / license.** The training image is downloaded from the
`GeoDataSets <https://github.com/GeostatsGuy/GeoDataSets>`_ repository by
Michael Pyrcz (GeostatsGuy), which is distributed under the **MIT license**
(redistribution permitted with attribution). The underlying channel TI is
due to Strebelle, S. (2002), *Conditional simulation of complex geological
structures using multiple-point statistics*, Mathematical Geology, 34(1),
1-21. If the download is unavailable, the example falls back to a synthetic
training image so it still runs offline.
"""

import os
import urllib.request

import matplotlib.pyplot as plt
import numpy as np
from matplotlib.colors import ListedColormap

import gstools as gs

###############################################################################
# Load the Strebelle training image, with a synthetic fallback for offline use.

TI_URL = (
"https://raw.githubusercontent.com/GeostatsGuy/"
"GeoDataSets/master/MPS_Training_image_and_Realizations_500.npz"
)
CACHE = "mps_strebelle.npz"
try:
if not os.path.exists(CACHE):
urllib.request.urlretrieve(TI_URL, CACHE)
ti_arr = np.load(CACHE)["array1"].astype(float)
source = "Strebelle (2002) via GeoDataSets (MIT)"
except Exception as err: # pragma: no cover - network fallback
print(f"download failed ({err}); using a synthetic channel TI instead")
gx, gy = np.meshgrid(np.arange(150), np.arange(150), indexing="ij")
ti_arr = ((np.sin(gx / 6.0) + np.sin((gx + gy) / 10.0)) > 0).astype(float)
source = "synthetic fallback"

ti = gs.TrainingImage(ti_arr, categorical=True)
print(f"TI {ti.shape} ({source}), sand fraction = {ti_arr.mean():.3f}")

###############################################################################
# Take 80 random conditioning points from the training image patterns.

sg_size = 80
rng = np.random.default_rng(0)
cond_x = rng.integers(0, sg_size, 80).astype(float)
cond_y = rng.integers(0, sg_size, 80).astype(float)
cond_val = ti_arr[cond_x.astype(int), cond_y.astype(int)]

###############################################################################
# Simulate with DSBC-style parameters (best-candidate + partial scan).

ds = gs.DirectSampling(ti, n_neighbors=30, scan_fraction=0.2, threshold=0.0)
ds.set_condition([cond_x, cond_y], cond_val)
field = ds([np.arange(sg_size, dtype=float)] * 2, seed=42)

honored = int(
(field[cond_x.astype(int), cond_y.astype(int)] == cond_val).sum()
)
print(f"conditioning honoured: {honored}/{cond_val.size}")

###############################################################################
# Plot the training image crop next to the conditional realization.

cmap = ListedColormap(["#c9a96e", "#2b6cb0"]) # shale / sand
fig, (ax0, ax1) = plt.subplots(1, 2, figsize=(11, 5.5))
ax0.imshow(ti_arr[:sg_size, :sg_size], cmap=cmap, origin="lower")
ax0.set_title("Training image (crop)")
ax1.imshow(field, cmap=cmap, origin="lower")
ax1.scatter(cond_y, cond_x, c=cond_val, cmap=cmap, edgecolors="k", s=18)
ax1.set_title("Conditional DS realization")
fig.tight_layout()
38 changes: 38 additions & 0 deletions examples/13_mps/README.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,38 @@
Multiple Point Statistics
=========================

Two-point geostatistics (covariance models, kriging, SRFs) describes spatial
structure through pairs of points and a variogram. This is powerful, but it
cannot reproduce *curvilinear* or *connected* features such as meandering
channels, fractures, or other patterns that depend on the joint configuration
of many points at once.

**Multiple Point Statistics (MPS)** addresses this by learning patterns
directly from a **training image (TI)** — an example image deemed
representative of the spatial structure to simulate. Instead of fitting a
variogram, MPS borrows whole patterns from the TI.

GSTools provides the **Direct Sampling (DS)** algorithm
(`Mariethoz et al., 2010 <https://doi.org/10.1029/2008WR007621>`_), together
with the **Direct Sampling Best Candidate (DSBC)** parametrization
(`Juda et al., 2022 <https://doi.org/10.1016/j.acags.2022.100091>`_), through
two classes:

* :any:`TrainingImage` — the MPS model: the training image plus the distance
used to compare patterns (the analogue of a :any:`CovModel`).
* :any:`DirectSampling` — the generator that produces realizations on a
structured grid (the analogue of :any:`SRF`).

The core idea: to fill each grid cell, DS looks at the values already present
around it (its *data event*), scans the training image for a location whose
surroundings look similar enough, and copies that cell's value over.
"Similar enough" is decided by a **distance** between the two surroundings,
controlled by three parameters — the number of neighbours ``n``, the scan
fraction ``f``, and the acceptance threshold ``t`` (with ``t = 0`` giving the
recommended DSBC mode).

The following tutorials build up from a minimal unconditional simulation to
conditioning and continuous variables.

Examples
--------
17 changes: 16 additions & 1 deletion src/gstools/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -23,10 +23,21 @@
tools
transform
normalizer
mps

Classes
=======

Multiple Point Statistics
^^^^^^^^^^^^^^^^^^^^^^^^^
Classes for Multiple Point Statistics (MPS) simulations

.. currentmodule:: gstools.mps

.. autosummary::
DirectSampling
TrainingImage

Kriging
^^^^^^^
Swiss-Army-Knife for Kriging. For short cut classes see: :any:`gstools.krige`
Expand Down Expand Up @@ -139,6 +150,7 @@
covmodel,
field,
krige,
mps,
normalizer,
random,
tools,
Expand Down Expand Up @@ -169,6 +181,7 @@
)
from gstools.field import PGS, SRF, CondSRF
from gstools.krige import Krige
from gstools.mps import DirectSampling, TrainingImage
from gstools.tools import (
DEGREE_SCALE,
EARTH_RADIUS,
Expand Down Expand Up @@ -200,7 +213,7 @@

__all__ = ["__version__"]
__all__ += ["covmodel", "field", "variogram", "krige", "random", "tools"]
__all__ += ["transform", "normalizer", "config"]
__all__ += ["transform", "normalizer", "config", "mps"]
__all__ += [
"CovModel",
"SumModel",
Expand Down Expand Up @@ -237,6 +250,8 @@
"SRF",
"CondSRF",
"PGS",
"DirectSampling",
"TrainingImage",
"rotated_main_axes",
"generate_grid",
"generate_st_grid",
Expand Down
Loading
Loading