pygmt.nearneighbor: Grid table data using a "Nearest neighbor" algorithm #4561

@seisman

This issue serves as the central place for discussing and tracking the implementation of the pygmt.nearneighbor function in PyGMT. The issue will be closed when the initial implementation is complete. Progress is tracked at PyGMT: Wrapping GMT modules.

Documentation

GMT Option Flags and Modifiers

☑️: Implemented; ⬜: To be implemented/discussed; Strikethrough: Won't implement.

  • ☑️ -E (empty): Set the value assigned to empty nodes [Default is NaN].
  • ☑️ -G (outgrid): Output grid file name (the outgrid parameter in PyGMT).
  • ☑️ -I (spacing): x_inc[/y_inc]. Grid spacing.
  • ☑️ -N (sectors): sectors[+m min_sectors]|n. Divide the circular search area around each node into the given number of sectors, and compute a node value only if at least min_sectors of them contain a data point. The modifier is written without a space, e.g. "8+m4".
  • ☑️ -R (region): Output grid region.
  • ☑️ -S (search_radius): Search radius that determines which data points are considered close to a node.
  • ☑️ -V (verbose): Verbosity level.
  • ⬜ -W: Read a 4th input column with data weights; the weights are used in the averaging.
  • -X/-Y: Won't implement; use Figure.shift_origin instead.
  • ☑️ -a (aspatial): Aspatial column assignment.
  • ☑️ -b (binary): Binary input/output.
  • ☑️ -d (nodata): Replace NaN with a specified nodata value on input/output.
  • ☑️ -e (find): Pattern matching to select input rows.
  • ☑️ -f (coltypes): Column data types.
  • ☑️ -g (gap): Gap detection.
  • ☑️ -h (header): Read/write header records.
  • ☑️ -i (incols): Select input columns.
  • ☑️ -r (registration): Set grid node registration to gridline or pixel.
  • ☑️ -w (wrap): Wrap repeated cycles.
  • --PAR=value: Won't implement; use pygmt.config instead.
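
The -N and -S entries above describe a sector search around each grid node. As a rough illustration only, the pure-Python sketch below computes the value of a single node: it keeps the nearest data point in each sector within the search radius and averages those points if enough sectors are occupied. The function name node_value is hypothetical, and the weighting 1 / (1 + d^2) with d = 3r / search_radius is taken from the GMT nearneighbor documentation as an assumption; this is not GMT or PyGMT source code.

```python
import math


def node_value(node_x, node_y, points, search_radius,
               sectors=4, min_sectors=4, empty=math.nan):
    """Estimate one grid node from scattered (x, y, z) points.

    Hypothetical helper for illustration. The defaults (4 sectors, all
    required) are an assumption about GMT's quadrant-search default.
    """
    nearest = {}  # sector index -> (distance, z) of the closest point so far
    for x, y, z in points:
        dx, dy = x - node_x, y - node_y
        r = math.hypot(dx, dy)
        if r > search_radius:
            continue  # outside the search circle (-S)
        # Map the direction to a sector index in [0, sectors - 1].
        angle = math.atan2(dy, dx) % (2.0 * math.pi)
        sector = min(int(angle / (2.0 * math.pi) * sectors), sectors - 1)
        if sector not in nearest or r < nearest[sector][0]:
            nearest[sector] = (r, z)

    # Too few occupied sectors: the node stays "empty" (-E, NaN by default).
    if len(nearest) < min_sectors:
        return empty

    # Weighted mean of the nearest point per sector.
    weight_sum = 0.0
    value_sum = 0.0
    for r, z in nearest.values():
        d = 3.0 * r / search_radius
        w = 1.0 / (1.0 + d * d)
        weight_sum += w
        value_sum += w * z
    return value_sum / weight_sum
```

Only the nearest point per sector enters the average, so adding more distant points to an already occupied sector does not change the result.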

Notes on Input Formats

  • data: Accepts a file path, 2-D numpy.ndarray, or pandas.DataFrame with at least 3 columns (x, y, z). Alternatively, provide x, y, and z as separate 1-D arrays.
  • outgrid: If not set, returns an xarray.DataArray; if set to a file path, writes the grid to disk and returns None.
  • spacing (-I) and search_radius (-S) are both required; GMT's nearneighbor also requires the output region (-R).
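
The input and output conventions above can be sketched as a small wrapper. This is a minimal example, assuming a working PyGMT and GMT installation; grid_points is a hypothetical helper name, and pygmt is imported inside the function so the snippet can be loaded even where GMT is missing.

```python
def grid_points(x, y, z, region, spacing, search_radius, outgrid=None):
    """Grid scattered samples with pygmt.nearneighbor.

    Hypothetical wrapper for illustration. Returns an xarray.DataArray
    when outgrid is None; otherwise writes the file and returns None.
    """
    # Lazy imports: both packages are only needed when the function runs.
    import numpy as np
    import pygmt

    return pygmt.nearneighbor(
        x=np.asarray(x, dtype=float),  # separate 1-D arrays instead of `data`
        y=np.asarray(y, dtype=float),
        z=np.asarray(z, dtype=float),
        region=region,                 # [xmin, xmax, ymin, ymax]
        spacing=spacing,               # -I
        search_radius=search_radius,   # -S
        outgrid=outgrid,               # -G
    )


# Example call (needs PyGMT and GMT installed):
# grid = grid_points(
#     x=[0.5, 1.5, 2.5, 0.5, 1.5, 2.5],
#     y=[0.5, 0.5, 0.5, 1.5, 1.5, 1.5],
#     z=[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
#     region=[0, 3, 0, 2],
#     spacing=1,
#     search_radius=2,
# )
```

Passing a 2-D numpy.ndarray or pandas.DataFrame through the data parameter instead of x, y, and z should work the same way, per the note on data above.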

Linked Pull Requests

Related Issues and Discussions

  • pygmt.nearneighbor is appropriate when the input data are dense and approximately uniformly distributed; for sparser or irregularly distributed data, pygmt.surface (minimum curvature interpolation) may be preferable.
  • The sectors parameter (-N) can be set to "n" to use GDAL's nearest-neighbor algorithm instead of GMT's own sector search; whether this is faster for very large datasets has not been verified here.
  • The search radius (-S) is interpreted in the units of the data coordinates unless a unit is appended; for geographic coordinates, append a unit character (e.g., "10m" for 10 arc-minutes or "10k" for 10 km).
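
The option strings mentioned in these notes are easy to get wrong by hand, so here is a small sketch that assembles them. The helper nearneighbor_kwargs is hypothetical and not part of PyGMT; it only formats the search_radius and sectors values in the syntax described above.

```python
def nearneighbor_kwargs(radius, unit="", sectors=None, min_sectors=None,
                        use_gdal=False):
    """Build search_radius/sectors keyword arguments as strings.

    Hypothetical helper: `unit` is a GMT distance unit character such as
    "m" (arc-minutes) or "k" (km); use_gdal=True selects -Nn.
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    kwargs = {"search_radius": f"{radius:g}{unit}"}

    if use_gdal:
        kwargs["sectors"] = "n"
    elif sectors is not None:
        value = str(sectors)
        if min_sectors is not None:
            if not 1 <= min_sectors <= sectors:
                raise ValueError("min_sectors must be between 1 and sectors")
            value += f"+m{min_sectors}"  # no space between +m and the number
        kwargs["sectors"] = value
    return kwargs
```

The resulting dictionary could be unpacked into a call such as pygmt.nearneighbor(data=..., region=..., spacing=..., **kwargs).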
