Skip to contents

Introduction

runoptGPP is an R package developed for optimizing the random walk (path/dispersion) and PCM (distance) components of mass-movement runout simulations. It provides functions to perform grid search optimization, evaluate performance metrics, and visualize results - with support for runoutSIM.

The optimization follows a two-stage approach:

  1. Optimize random walk model for best path simulation.

  2. Optimize PCM model for best runout distance using the previously optimized path.

Performance is evaluated using AUROC for path accuracy, and relative runout distance error for distance modeling (see Goetz et al. 2021, NHESS).

This vignette covers:

  • Installing runoptGPP from GitHub

  • Finding optimal global parameters using grid search

  • Performing spatial cross-validation to assess model sensitivity

Installing runoptGPP

runoptGPP is not currently available on CRAN. To install it directly from GitHub, use:

remotes::install_github("jngtz/runoptGPP")

Loading packages and input data

We begin by loading the required packages, a digital elevation model (DEM), and vector data for debris-flow source points and mapped runout polygons.

# load packages
library(runoutSIM)
library(runoptGPP)
library(terra)
#> Warning: package 'terra' was built under R version 4.4.3
library(sf)
#> Warning: package 'sf' was built under R version 4.4.3

# Load digital elevation model (DEM)
dem <- rast("Data/elev_fillsinks_WangLiu.tif")

# Compute hillshade for visualization 
slope <- terrain(dem, "slope", unit="radians")
aspect <- terrain(dem, "aspect", unit="radians")
hill <- round(shade(slope, aspect, 40, 270, normalize = TRUE))

# Load debris flow runout source points and polygons
source_points <- st_read("Data/debris_flow_source_points.shp")
runout_polygons <- st_read("Data/debris_flow_runout_polygons.shp")

# Plot input data
plot(hill, col=grey(150:255/255), legend=FALSE,
     mar=c(2,2,1,4))
plot(dem, col=viridis::mako(100), alpha = .5, add = TRUE)
plot(st_geometry(runout_polygons), add = TRUE)

Optimizing Random Walk Parameters

Step 1: Define the parameter search space

In our first stage of optimization, we will set up vectors to define our grid search space for the random walk runout path model component. This is an exhaustive list of the parameters that will be tested.

To simulate runout paths, we define a parameter grid for:

  • rwexp: divergence exponent (controls spread)

  • rwper: persistence factor (controls directionality)

  • rwslp: slope threshold (affects frictional resistance)

steps <- 5
rwexp_vec <- seq(1.3, 3, len=steps) # Expondent of divergence
rwper_vec <- seq(1.5, 2, len=steps) # Persistence factor
rwslp_vec <- seq(20, 40, len=steps) # Slope threshold

rwexp_vec
#> [1] 1.300 1.725 2.150 2.575 3.000
rwper_vec
#> [1] 1.500 1.625 1.750 1.875 2.000
rwslp_vec
#> [1] 20 25 30 35 40

We use the foreach and doParallel packages to speed up model runs by distributing them across multiple CPU cores. For each mapped runout polygon, all combinations of parameters are evaluated using the rwGridsearch() function.

Depending on the number of runout events and the grid search space size, this computation can take some time. The rwGridsearch function has an option save_res = TRUE that allows us to save the grid search results for each runout individually. This is useful in the case where processing fails since it avoids the need to re-run the grid search for all runout events.

In this example we are creating a limited grid search space (i.e. only 444 possible parameter combinations) to reduce computational time.

library(foreach)
library(raster)
polyid_vec <- 1:nrow(source_points)

n_cores <- parallel::detectCores() -2
cl <- parallel::makeCluster(n_cores)
doParallel::registerDoParallel(cl)

#Coerce dem to raster() dem.
dem <- raster(dem)

rw_gridsearch_multi  <-
  foreach(poly_id=polyid_vec, .packages=c('terra','raster', 'ROCR', 'sf', 'runoptGPP', 'runoutSIM')) %dopar% {
    
    rwGridsearch(dem, slide_plys = runout_polygons, slide_src = source_points,
                 slide_id = poly_id, slp_v = rwslp_vec, ex_v = rwexp_vec, 
                 per_v = rwper_vec, gpp_iter = 1000, buffer_ext = 500, buffer_source = NULL,
                 save_res = TRUE, plot_eval = FALSE)
    
  }

parallel::stopCluster(cl)

Step 3: Get the optimal parameters

We extract the optimal parameters across all runouts by aggregating their performance (here, using the median AUROC).

rw_opt <- rwGetOpt(rw_gridsearch_multi, 
                   measure = median)
rw_opt
#>   rw_slp_opt rw_exp_opt rw_per_opt  rw_auroc
#> 1         40          3      1.875 0.8748927

Optimizing PCM Parameters (Runout Distance)

Step 1: Define the parameter space

We now define grid vectors for:

  • pcmmd: mass-to-drag ratio (affects how far material travels)
  • pcmmu: sliding friction coefficient

In this example we are creating a limited grid search space (i.e. only 20*10 possible parameter combinations) to reduce computational time.

# Define PCM model grid seach space
pcmmd_vec <- seq(20, 140, by=20) # mass-to-drag ratio (m)
pcmmu_vec <- seq(0.01, 0.1, by=0.01) # sliding friction coefficient 

pcmmd_vec
#> [1]  20  40  60  80 100 120 140

Step 2: Run PCM grid search in parallel

Here, we re-use parallelization to optimize the PCM model using the previously selected random walk parameters. Each runout is simulated independently.

# Run using parallelization
cl <- parallel::detectCores() -2
doParallel::registerDoParallel(cl)

pcm_gridsearch_multi <-
  foreach(poly_id=polyid_vec, .packages=c('terra','raster', 'ROCR', 'sf', 'runoptGPP', 'runoutSIM')) %dopar% {
    
    pcmGridsearch(dem,
                  slide_plys = runout_polygons, slide_src = source_points, 
                  slide_id = poly_id, rw_slp = 40, rw_ex = 3, rw_per = 1.9, 
                  pcm_mu_v = pcmmu_vec, pcm_md_v = pcmmd_vec,
                  gpp_iter = 1000,
                  buffer_ext = NULL, buffer_source = NULL,
                  predict_threshold = 0.5, save_res = FALSE)
  }

parallel::stopCluster(cl)

Step 3: Extract optimal PCM parameters

We find the PCM parameter set that results in the lowest median relative error across all simulated runouts. This gives us a regionally optimized distance model. The median model performances across grid search space can be visualized using plot_opt=TRUE.

pcmGetOpt(pcm_gridsearch_multi, 
          performance = "relerr", 
          measure = "median", 
          plot_opt = TRUE)
#>   pcm_mu pcm_md median_relerr median_auroc
#> 1   0.04     40    0.03576527    0.9042586

Spatial Cross-Validation of Parameters

To evaluate the sensitivity of our optimal parameters to spatial sampling, we apply spatial cross-validation using k-means partitioning from the sperrorest package. This simulates training/testing under different spatial configurations.

We run 5-fold cross-validation with 10 repetitions using the results from the random walk and PCM grid searches.

par(mfrow = c(1,2))
rw_spcv <- rwSPCV(x = rw_gridsearch_multi, slide_plys = runout_polygons,
                  n_folds = 5, repetitions = 10)

freq_rw <- rwPoolSPCV(rw_spcv, plot_freq = TRUE)

pcm_spcv <- pcmSPCV(pcm_gridsearch_multi, slide_plys = runout_polygons,
                    n_folds = 5, repetitions = 10, from_save = FALSE)

freq_pcm <- pcmPoolSPCV(pcm_spcv, plot_freq = TRUE)

freq_rw
#>   slp   per exp freq rel_freq median_auroc   iqr_auroc
#> 1  40 1.500   3    9       18    0.8843889 0.020861264
#> 2  40 1.625   3    7       14    0.8215202 0.080693418
#> 3  40 1.750   3    7       14    0.9083694 0.005970888
#> 4  40 1.875   3   24       48    0.8619509 0.029733802
#> 5  40 2.000   3    3        6    0.8119542 0.000000000
freq_pcm
#>     mu  md freq rel_freq    rel_err  iqr_relerr
#> 1 0.04  40   35       70 0.05290689 0.049399707
#> 2 0.05  40    9       18 0.03205946 0.004627829
#> 3 0.07  40    3        6 0.07844790 0.000000000
#> 4 0.09 100    3        6 0.04196954 0.000000000