subsettools.subsetting

Functions to subset gridded files from national datasets in HydroData.

The following functions can be used to subset gridded input files to set up a ParFlow simulation.

  • subset static model inputs

  • subset meteorological forcings

  • subset initial pressure data

  • subset gridded CLM inputs (vegm)

Functions

subset_static(ij_bounds, dataset, write_dir[, var_list])

Subset static input files from national datasets in HydroData.

subset_press_init(ij_bounds, dataset, date, write_dir[, time_zone])

Subset a pressure file from a national dataset in HydroData.

subset_forcing(ij_bounds, grid, start, end, dataset, ...)

Subset forcing files from national datasets in HydroData.

subsettools.subsetting.subset_static(ij_bounds, dataset, write_dir, var_list=('slope_x', 'slope_y', 'pf_indicator', 'mannings', 'pf_flowbarrier', 'pme', 'ss_pressure_head'))[source]

Subset static input files from national datasets in HydroData.

The subset values will be written as ParFlow binary files (pfbs) in write_dir. By default the following variables will be subset.

  • Slope in the east/west direction (slope_x)

  • Slope in the north/south direction (slope_y)

  • Subsurface units indicator file (pf_indicator)

  • Manning's roughness coefficients (mannings)

  • Depth to bedrock (pf_flowbarrier)

  • Long term average precipitation minus evaporation (i.e. recharge) (pme)

  • Steady state pressure head used to initialize transient simulations (ss_pressure_head)

Note that some datasets might not contain all 7 static input variables. In that case, subset_static raises a ValueError for any requested variable that does not exist in the dataset. The default variable list contains the static variables needed for the CONUS2 grid. For CONUS1-based datasets, “mannings” and “pf_flowbarrier” should be removed from the list.
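As a minimal sketch of the note above, the CONUS1 variable list can be derived from the default tuple instead of being typed by hand. The helper name static_vars_for and the two constants are hypothetical and not part of subsettools; only the variable names come from the signature above.

```python
# Default var_list, copied from the subset_static signature above.
DEFAULT_STATIC_VARS = (
    "slope_x", "slope_y", "pf_indicator", "mannings",
    "pf_flowbarrier", "pme", "ss_pressure_head",
)

# Variables the note above says to remove for CONUS1-based datasets.
CONUS1_MISSING = ("mannings", "pf_flowbarrier")


def static_vars_for(grid):
    """Return a var_list tuple for a grid name (hypothetical helper)."""
    if grid == "conus1":
        # Keep the original order, dropping the variables CONUS1 lacks.
        return tuple(v for v in DEFAULT_STATIC_VARS if v not in CONUS1_MISSING)
    return DEFAULT_STATIC_VARS


conus1_vars = static_vars_for("conus1")
print(conus1_vars)
```

The resulting tuple could then be passed as var_list to subset_static, which avoids triggering the ValueError for variables the dataset does not provide.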

Parameters:
  • ij_bounds (tuple[int]) – bounding box for subset. This should be given as i,j index values where 0,0 is the lower left hand corner of a domain. ij_bounds are given relative to whatever grid is being used for the subset.

  • dataset (str) – static inputs dataset name from the HydroData catalog e.g. “conus1_domain”

  • write_dir (str) – directory where the subset files will be written

  • var_list (tuple[str]) – tuple of variables to subset from the dataset. By default all 7 variables above will be subset. The user can specify a subset of these variables or list additional variables that are available in their dataset of choice.

Returns:

A dictionary mapping the static variable names to the corresponding file paths where the subset data were written.

Example:

# Subsetting static variables for a CONUS1 workflow
# We need to remove "pf_flowbarrier" and "mannings" from the list
filepaths = subset_static(
    ij_bounds=(375, 239, 487, 329),
    dataset="conus1_domain",
    write_dir="/path/to/your/chosen/directory",
    var_list=("slope_x", "slope_y", "pf_indicator", "pme",
              "ss_pressure_head")
)

# Subsetting static variables for a CONUS2 workflow
# Note that we can use the default var_list here
filepaths = subset_static(
    ij_bounds=(3701, 1544, 3792, 1633),
    dataset="conus2_domain",
    write_dir="/path/to/your/chosen/directory",
)
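
To make the ij_bounds convention more concrete, the sketch below computes the size of the subset grid from a bounding box. It assumes the tuple is ordered (imin, jmin, imax, jmax) with exclusive upper bounds; this page does not state the ordering, so treat that as an assumption to verify. The helper subset_shape is hypothetical and not part of subsettools.

```python
def subset_shape(ij_bounds):
    """Return (nx, ny) for a bounding box.

    Assumption: ij_bounds is (imin, jmin, imax, jmax), with (0, 0) at the
    lower left corner of the grid and imax/jmax exclusive.
    """
    imin, jmin, imax, jmax = ij_bounds
    if imin < 0 or jmin < 0:
        raise ValueError("indices are relative to (0, 0) and cannot be negative")
    if imax <= imin or jmax <= jmin:
        raise ValueError("upper bounds must be greater than lower bounds")
    return (imax - imin, jmax - jmin)


# Bounding box taken from the CONUS1 example above.
nx, ny = subset_shape((375, 239, 487, 329))
print(nx, ny)  # 112 90
```

A check like this can catch swapped or out-of-order indices before any data is requested.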

subsettools.subsetting.subset_press_init(ij_bounds, dataset, date, write_dir, time_zone='UTC')[source]

Subset a pressure file from a national dataset in HydroData.

This function will select the pressure file for midnight on the date provided and subset the selected pressure file to the ij_bounds provided. The subset data will be written out as a ParFlow binary file (pfb) to be used as an initial pressure file for a ParFlow simulation.

Parameters:
  • ij_bounds (tuple[int]) – bounding box for subset. This should be given as i,j index values where 0,0 is the lower left hand corner of a domain. ij_bounds are given relative to whatever grid is being used for the subset.

  • dataset (str) – dataset name from the HydroData catalog that the pressure file will be subset from e.g. “conus1_baseline_mod”

  • date (str) – The date of the pressure file that you would like to subset, in the form ‘yyyy-mm-dd’

  • write_dir (str) – directory where the subset file will be written

  • time_zone (str) – timezone information for subset date. Data will be subset at midnight in the specified timezone. Defaults to “UTC”.

Returns:

The filepath of the subset file, which includes datetime information, so that it can be used by later functions (e.g. edit_runscript_for_subset).

Example:

filepath = subset_press_init(
    ij_bounds=(375, 239, 487, 329),
    dataset="conus1_baseline_mod",
    date="2005-12-15",
    write_dir="/path/to/your/chosen/directory",
    time_zone="EST"
)
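
Because the pressure file is selected at midnight in the given time zone, it can help to see which UTC instant that corresponds to. The sketch below is only an illustration using the standard library; it does not show how subsettools resolves time zones internally. EST_FIXED is a hypothetical fixed UTC-5 offset standing in for "EST".

```python
from datetime import datetime, timedelta, timezone

# Assumption: "EST" is modeled here as a fixed UTC-5 offset.
EST_FIXED = timezone(timedelta(hours=-5), "EST")


def local_midnight_as_utc(date_str, tz):
    """Convert midnight of a 'yyyy-mm-dd' date in tz to UTC."""
    local_midnight = datetime.strptime(date_str, "%Y-%m-%d").replace(tzinfo=tz)
    return local_midnight.astimezone(timezone.utc)


utc_time = local_midnight_as_utc("2005-12-15", EST_FIXED)
print(utc_time.isoformat())  # 2005-12-15T05:00:00+00:00
```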

subsettools.subsetting.subset_forcing(ij_bounds, grid, start, end, dataset, write_dir, time_zone='UTC', forcing_vars=('precipitation', 'downward_shortwave', 'downward_longwave', 'specific_humidity', 'air_temp', 'atmospheric_pressure', 'east_windspeed', 'north_windspeed'))[source]

Subset forcing files from national datasets in HydroData.

Subset forcing data will be written out as pfb files formatted for a ParFlow run, with 24 hours per forcing file. Per ParFlow-CLM convention, separate files will be written for each variable, following the standard CLM variable naming convention.

Forcing file outputs will be numbered starting with 0000, and data will start at midnight local time in the timezone provided. If no timezone is provided, data will start at midnight UTC.

Parameters:
  • ij_bounds (tuple[int]) – bounding box for subset. This should be given as i,j index values where 0,0 is the lower left hand corner of a domain. ij_bounds are given relative to whatever grid is being used for the subset.

  • grid (str) – The spatial grid that the ij indices are calculated relative to and that the subset data will be returned on. Possible values: “conus1” or “conus2”

  • start (str) – start date (inclusive), in the form ‘yyyy-mm-dd’

  • end (str) – end date (exclusive), in the form ‘yyyy-mm-dd’

  • dataset (str) – forcing dataset name from the HydroData catalog that the forcing files will be subset from e.g. “NLDAS2”.

  • write_dir (str) – directory where the subset files will be written

  • time_zone (str) – timezone information for start and end dates. Data will be subset starting at midnight in the specified timezone. Defaults to “UTC”.

  • forcing_vars (tuple[str]) – tuple of forcing variables to subset. By default all 8 variables needed to run ParFlow-CLM will be subset.

Returns:

A dictionary mapping the forcing variable names to the corresponding file paths where the subset data were written.

Example:

filepaths = subset_forcing(
    ij_bounds=(1225, 1738, 1347, 1811),
    grid="conus2",
    start="2005-11-01",
    end="2005-12-01",
    dataset="CW3E",
    write_dir="/path/to/your/chosen/directory",
    forcing_vars=("precipitation", "air_temp")
)
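
The date semantics described above (start inclusive, end exclusive, 24 hours per forcing file) can be sketched with the standard library. The helper forcing_days is hypothetical and only counts the daily windows; it makes no claim about the file names subsettools writes.

```python
from datetime import date, timedelta


def forcing_days(start, end):
    """List the days covered by [start, end), one entry per 24-hour file."""
    first = date.fromisoformat(start)
    last = date.fromisoformat(end)  # exclusive
    if last <= first:
        raise ValueError("end must be later than start")
    return [first + timedelta(days=i) for i in range((last - first).days)]


# Dates taken from the subset_forcing example above.
days = forcing_days("2005-11-01", "2005-12-01")
print(len(days), "days,", len(days) * 24, "hours per variable")
print(days[0], "to", days[-1])
```

For the example above this gives 30 daily windows per variable, with 2005-11-30 as the last day covered, since the end date itself is excluded.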