7.2.1. tobac.themes.tobac_v1.feature_detection

Provide feature detection.

This module can work with any two-dimensional field that is either present in or derived from the input data. To identify features, contiguous regions above or below a threshold are determined and labelled individually. To locate a feature at a given point in time, different spatial properties of the identified region can be used. [2]
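
The thresholding and labelling step described above can be sketched in plain Python. This is a minimal illustration, not tobac's implementation: the function name label_regions, the list-of-lists input, the inclusive comparison with the threshold, and the choice of 4-connectivity are all assumptions made for the example.

```python
from collections import deque


def label_regions(field, threshold, target="maximum"):
    """Label 4-connected regions above (or below) a threshold.

    Returns a dict mapping a label (1, 2, ...) to a list of (row, col) tuples.
    """
    n_rows, n_cols = len(field), len(field[0])
    if target == "maximum":
        mask = [[value >= threshold for value in row] for row in field]
    else:
        mask = [[value <= threshold for value in row] for row in field]

    seen = [[False] * n_cols for _ in range(n_rows)]
    regions = {}
    for i in range(n_rows):
        for j in range(n_cols):
            if not mask[i][j] or seen[i][j]:
                continue
            # Breadth-first flood fill collects one contiguous region.
            label = len(regions) + 1
            queue = deque([(i, j)])
            seen[i][j] = True
            cells = []
            while queue:
                r, c = queue.popleft()
                cells.append((r, c))
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    inside = 0 <= rr < n_rows and 0 <= cc < n_cols
                    if inside and mask[rr][cc] and not seen[rr][cc]:
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            regions[label] = cells
    return regions


field = [
    [0, 5, 5, 0, 0],
    [0, 5, 0, 0, 7],
    [0, 0, 0, 7, 7],
]
regions = label_regions(field, threshold=4)
```

With a threshold of 4, the example field contains two separate regions of three cells each, so regions has the keys 1 and 2.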

References

[2] Heikenfeld, M., Marinescu, P. J., Christensen, M., Watson-Parris, D., Senf, F., van den Heever, S. C., and Stier, P.: tobac v1.0: towards a flexible framework for tracking and analysis of clouds in diverse datasets, Geosci. Model Dev. Discuss., https://doi.org/10.5194/gmd-2019-105, in review, 2019, 6f.

Functions

feature_detection_multithreshold(field_in, dxy) Perform feature detection based on contiguous regions.
feature_detection_multithreshold_timestep(…) Find features in each timestep.
feature_detection_threshold(data_i, i_time) Find features based on individual threshold value.
feature_position(hdim1_indices, …) Determine feature position.
filter_min_distance(features, dxy, min_distance) Filter detected features by minimum distance.
remove_parents(features_thresholds, …) Remove parents of newly detected feature regions.
test_overlap(region_inner, region_outer) Test for overlap between two regions.

tobac.themes.tobac_v1.feature_detection.feature_detection_multithreshold(field_in, dxy, threshold=None, min_num=0, target='maximum', position_threshold='center', sigma_threshold=0.5, n_erosion_threshold=0, n_min_threshold=0, min_distance=0, feature_number_start=1)

Perform feature detection based on contiguous regions.

The regions are above/below a threshold.

Parameters:
  • field_in (iris.cube.Cube) – 2D field to perform the tracking on (needs to have the coordinate ‘time’ along one of its dimensions).
  • dxy (float) – Grid spacing of the input data.
  • threshold (list of floats, optional) – Threshold values used to select target regions to track. Default is None.
  • min_num (int, optional) – Default is 0.
  • target ({‘maximum’, ‘minimum’}, optional) – Flag to determine if tracking is targeting minima or maxima in the data. Default is ‘maximum’.
  • position_threshold ({‘center’, ‘extreme’, ‘weighted_diff’, ‘weighted_abs’}, optional) – Flag choosing the method used for the position of the tracked feature. Default is ‘center’.
  • sigma_threshold (float, optional) – Standard deviation for the initial filtering step. Default is 0.5.
  • n_erosion_threshold (int, optional) – Number of pixels by which to erode the identified features. Default is 0.
  • n_min_threshold (int, optional) – Minimum number of identified features. Default is 0.
  • min_distance (float, optional) – Minimum distance between detected features. Default is 0.
  • feature_number_start (int, optional) – Feature id to start with. Default is 1.
Returns:

features – Detected features.

Return type:

pandas.DataFrame

tobac.themes.tobac_v1.feature_detection.feature_detection_multithreshold_timestep(data_i, i_time, threshold=None, min_num=0, target='maximum', position_threshold='center', sigma_threshold=0.5, n_erosion_threshold=0, n_min_threshold=0, min_distance=0, feature_number_start=1)

Find features in each timestep.

Features are found by iteratively identifying regions above/below a set of thresholds. Smoothing the input data with a Gaussian filter beforehand makes the output more reliable. [2]

Parameters:
  • data_i (iris.cube.Cube) – 2D field to perform the feature detection (single timestep) on.
  • threshold (float, optional) – Threshold value used to select target regions to track. Default is None.
  • min_num (int, optional) – Default is 0.
  • target ({‘maximum’, ‘minimum’}, optional) – Flag to determine if tracking is targeting minima or maxima in the data. Default is ‘maximum’.
  • position_threshold ({‘center’, ‘extreme’, ‘weighted_diff’, ‘weighted_abs’}, optional) – Flag choosing the method used for the position of the tracked feature. Default is ‘center’.
  • sigma_threshold (float, optional) – Standard deviation for the initial filtering step. Default is 0.5.
  • n_erosion_threshold (int, optional) – Number of pixels by which to erode the identified features. Default is 0.
  • n_min_threshold (int, optional) – Minimum number of identified features. Default is 0.
  • min_distance (float, optional) – Minimum distance between detected features. Default is 0.
  • feature_number_start (int, optional) – Feature id to start with. Default is 1.
Returns:

features_threshold – Detected features for individual timestep.

Return type:

pandas.DataFrame

Notes

The description of feature_number_start is uncertain.
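
The Gaussian smoothing applied before thresholding, controlled by sigma_threshold, can be illustrated with a small separable filter. This is a sketch under stated assumptions rather than the routine tobac calls: the helper names, the truncation of the kernel at four standard deviations, and the clamping of indices at the grid edges are all choices made for the example.

```python
import math


def gaussian_kernel(sigma, truncate=4.0):
    """Return a normalised 1-D Gaussian kernel."""
    radius = max(1, int(truncate * sigma + 0.5))
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_rows(field, kernel):
    """Filter every row with the kernel, clamping indices at the edges."""
    radius = len(kernel) // 2
    n_cols = len(field[0])
    out = []
    for row in field:
        out.append([
            sum(
                w * row[min(max(j + k - radius, 0), n_cols - 1)]
                for k, w in enumerate(kernel)
            )
            for j in range(n_cols)
        ])
    return out


def transpose(field):
    return [list(col) for col in zip(*field)]


def gaussian_smooth(field, sigma=0.5):
    """Separable Gaussian smoothing: filter the rows, then the columns."""
    kernel = gaussian_kernel(sigma)
    rows_done = smooth_rows(field, kernel)
    return transpose(smooth_rows(transpose(rows_done), kernel))


# A single-pixel spike, as might be produced by noise in the input field.
noisy = [
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 9.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0],
]
smoothed = gaussian_smooth(noisy, sigma=0.5)
```

Smoothing spreads the spike over its neighbours and lowers its peak, so isolated noisy pixels are less likely to exceed a threshold on their own.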

tobac.themes.tobac_v1.feature_detection.feature_detection_threshold(data_i, i_time, threshold=None, min_num=0, target='maximum', position_threshold='center', sigma_threshold=0.5, n_erosion_threshold=0, n_min_threshold=0, min_distance=0, idx_start=0)

Find features based on individual threshold value.

Parameters:
  • data_i (iris.cube.Cube) – 2D field to perform the feature detection (single timestep) on.
  • i_time (int) – Number of the current timestep.
  • threshold (float, optional) – Threshold value used to select target regions to track. Default is None.
  • target ({‘maximum’, ‘minimum’}, optional) – Flag to determine if tracking is targeting minima or maxima in the data. Default is ‘maximum’.
  • position_threshold ({‘center’, ‘extreme’, ‘weighted_diff’, ‘weighted_abs’}, optional) – Flag choosing the method used for the position of the tracked feature. Default is ‘center’.
  • sigma_threshold (float, optional) – Standard deviation for the initial filtering step. Default is 0.5.
  • n_erosion_threshold (int, optional) – Number of pixels by which to erode the identified features. Default is 0.
  • n_min_threshold (int, optional) – Minimum number of identified features. Default is 0.
  • min_distance (float, optional) – Minimum distance between detected features. Default is 0.
  • idx_start (int, optional) – Feature id to start with. Default is 0.

Returns:

  • features_threshold (pandas.DataFrame) – Detected features for individual threshold.
  • regions (dict) – Dictionary containing the regions above/below threshold used for each feature (feature ids as keys).
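
The effect of n_erosion_threshold on a thresholded mask can be sketched as a repeated binary erosion. The function name erode, the 4-neighbour structuring element, and the treatment of grid-border cells as exposed are assumptions for illustration; the routine used inside feature_detection_threshold may differ in these details.

```python
def erode(mask, n_erosion):
    """Shrink a boolean mask by n_erosion pixels using a 4-neighbour cross."""
    n_rows, n_cols = len(mask), len(mask[0])
    for _ in range(n_erosion):
        eroded = [[False] * n_cols for _ in range(n_rows)]
        for i in range(n_rows):
            for j in range(n_cols):
                if not mask[i][j]:
                    continue
                neighbours = ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1))
                # A cell survives only if all four neighbours lie inside the
                # grid and are set; cells on the grid border are removed.
                eroded[i][j] = all(
                    0 <= r < n_rows and 0 <= c < n_cols and mask[r][c]
                    for r, c in neighbours
                )
        mask = eroded
    return mask


# A 3x3 block of set cells in the middle of a 5x5 grid.
mask = [[1 <= i <= 3 and 1 <= j <= 3 for j in range(5)] for i in range(5)]
once = erode(mask, 1)
twice = erode(mask, 2)
```

One erosion step leaves only the central cell of the block, and a second step removes the region entirely, which shows how erosion discards small regions and separates weakly connected ones.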

tobac.themes.tobac_v1.feature_detection.feature_position(hdim1_indices, hdim2_indeces, region, track_data, threshold_i, position_threshold, target)

Determine feature position.

Parameters:
  • hdim1_indices, hdim2_indices (list)
  • region (list) – List of 2-element tuples.
  • track_data (numpy.ndarray) – 2D numpy array containing the data.
  • threshold_i (float)
  • position_threshold (str)
  • target ({‘maximum’, ‘minimum’}) – Flag to determine if tracking is targeting minima or maxima in the data.
Returns:

hdim1_index, hdim2_index – Feature position along 1st and 2nd horizontal dimension.

Return type:

float

Notes

More detailed parameter descriptions are still needed.
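
The documentation above does not define the four position_threshold options. The sketch below is one plausible reading, assumed from the option names: ‘center’ as the unweighted mean of the region's indices, ‘extreme’ as the location of the largest (or smallest) value, ‘weighted_diff’ as a mean weighted by the absolute difference from the threshold, and ‘weighted_abs’ as a mean weighted by the absolute value. The function name feature_position_sketch and the plain-list inputs are illustrative and not part of tobac.

```python
def feature_position_sketch(region, data, threshold,
                            position_threshold="center", target="maximum"):
    """Return (row, col) for a region given as a list of (row, col) tuples."""
    values = [data[r][c] for r, c in region]

    if position_threshold == "center":
        weights = [1.0] * len(region)
    elif position_threshold == "extreme":
        # Pick the cell holding the most extreme value in the region.
        pick = max if target == "maximum" else min
        best = pick(range(len(region)), key=lambda k: values[k])
        return float(region[best][0]), float(region[best][1])
    elif position_threshold == "weighted_diff":
        weights = [abs(v - threshold) for v in values]
    elif position_threshold == "weighted_abs":
        weights = [abs(v) for v in values]
    else:
        raise ValueError(f"unknown position_threshold: {position_threshold!r}")

    total = sum(weights)
    if total == 0:
        # Fall back to the unweighted centre if all weights vanish.
        weights, total = [1.0] * len(region), float(len(region))
    row = sum(w * r for w, (r, _) in zip(weights, region)) / total
    col = sum(w * c for w, (_, c) in zip(weights, region)) / total
    return row, col


data = [
    [0, 0, 0, 0],
    [0, 2, 4, 6],
    [0, 0, 0, 0],
]
region = [(1, 1), (1, 2), (1, 3)]
center = feature_position_sketch(region, data, 2, "center")
extreme = feature_position_sketch(region, data, 2, "extreme")
weighted_diff = feature_position_sketch(region, data, 2, "weighted_diff")
weighted_abs = feature_position_sketch(region, data, 2, "weighted_abs")
```

In this example the weighted variants shift the position from the geometric centre towards the cell with the largest value, while ‘extreme’ places it exactly on that cell. The returned positions are floats, consistent with the documented return type.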

tobac.themes.tobac_v1.feature_detection.filter_min_distance(features, dxy, min_distance)

Filter detected features by minimum distance.

Of features that lie closer together than min_distance, only one is retained.

Parameters:
  • features (pandas.DataFrame)
  • dxy (float) – Grid spacing of the input data.
  • min_distance (float) – Minimum distance between detected features.
Returns:

features – Detected features.

Return type:

pandas.DataFrame
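
A minimum-distance filter of this kind can be sketched without pandas. The dictionary keys "id", "hdim_1", "hdim_2" and "threshold_value" are hypothetical stand-ins for DataFrame columns, and the tie-break (keeping the feature found at the larger threshold) is an assumption made for the example, not a documented rule.

```python
import math
from itertools import combinations


def filter_min_distance_sketch(features, dxy, min_distance):
    """Drop one feature of every pair that is closer than min_distance.

    features: list of dicts with the hypothetical keys "id", "hdim_1",
    "hdim_2" (grid indices) and "threshold_value".
    """
    removed = set()
    for a, b in combinations(features, 2):
        if a["id"] in removed or b["id"] in removed:
            continue
        # Index distance times grid spacing gives a physical distance.
        distance = dxy * math.hypot(
            a["hdim_1"] - b["hdim_1"], a["hdim_2"] - b["hdim_2"]
        )
        if distance < min_distance:
            # Assumed tie-break: keep the feature found at the larger threshold.
            weaker = a if a["threshold_value"] < b["threshold_value"] else b
            removed.add(weaker["id"])
    return [f for f in features if f["id"] not in removed]


features = [
    {"id": 1, "hdim_1": 10, "hdim_2": 10, "threshold_value": 1.0},
    {"id": 2, "hdim_1": 10, "hdim_2": 12, "threshold_value": 2.0},
    {"id": 3, "hdim_1": 40, "hdim_2": 40, "threshold_value": 1.0},
]
kept = filter_min_distance_sketch(features, dxy=1000.0, min_distance=5000.0)
```

Features 1 and 2 are two grid cells apart (2000 in the units of dxy), which is below the minimum distance of 5000, so the one detected at the lower threshold is dropped. With min_distance=0 the strict comparison removes nothing.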

tobac.themes.tobac_v1.feature_detection.remove_parents(features_thresholds, regions_i, regions_old)

Remove parents of newly detected feature regions.

Remove features whose regions surround newly detected feature regions.

Parameters:
  • features_thresholds (pandas.DataFrame) – Dataframe containing detected features.
  • regions_i (dict) – Dictionary containing the regions above/below threshold for the newly detected feature (feature ids as keys).
  • regions_old (dict) – Dictionary containing the regions above/below threshold from previous threshold (feature ids as keys).
Returns:

features_thresholds – Dataframe containing detected features excluding those that are superseded by newly detected ones.

Return type:

pandas.DataFrame
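
The parent-removal step can be sketched with sets of grid cells. As an illustration only, features are represented here as a list of dicts with a hypothetical "id" key instead of a DataFrame, and a feature from the previous threshold counts as a parent as soon as its region shares at least one cell with a newly detected region.

```python
def remove_parents_sketch(features, regions_new, regions_old):
    """Drop old features whose region contains cells of a new region.

    features: list of dicts with a hypothetical "id" key.
    regions_new, regions_old: dicts mapping feature id -> list of (row, col).
    """
    new_cells = set()
    for cells in regions_new.values():
        new_cells.update(cells)

    # An old feature is a parent if any of its cells belongs to a new region.
    parents = {
        feature_id
        for feature_id, cells in regions_old.items()
        if not new_cells.isdisjoint(cells)
    }
    return [f for f in features if f["id"] not in parents]


# Feature 1 splits into features 3 and 4 at the next threshold;
# feature 2 has no counterpart at the next threshold.
regions_old = {1: [(0, 0), (0, 1), (0, 2), (0, 3)], 2: [(5, 5), (5, 6)]}
regions_new = {3: [(0, 0)], 4: [(0, 3)]}
features = [{"id": 1}, {"id": 2}, {"id": 3}, {"id": 4}]
remaining = remove_parents_sketch(features, regions_new, regions_old)
```

Feature 1 is superseded by the two features found inside it and is removed, while feature 2 is kept because no newly detected region lies within it.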

tobac.themes.tobac_v1.feature_detection.test_overlap(region_inner, region_outer)

Test for overlap between two regions.

(probably scope for further speedup here)

Parameters:
  • region_inner, region_outer (list) – List of 2-element tuples defining the indices of all cells in the region.
Returns:

overlap – True if there are any shared points between the two regions.

Return type:

bool

Notes

The extended summary needs rework. The description of region_inner and region_outer is uncertain.
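
An overlap test on two lists of index tuples can be written compactly with a set. This is a sketch of the documented behaviour (True if any point is shared), not the module's own code; the name regions_overlap is chosen for the example.

```python
def regions_overlap(region_inner, region_outer):
    """Return True if the two lists of (row, col) tuples share any cell."""
    # isdisjoint stops at the first shared element it finds.
    return not set(region_inner).isdisjoint(region_outer)


inner = [(2, 2), (2, 3)]
outer = [(1, 1), (1, 2), (2, 2), (3, 3)]
elsewhere = [(7, 7), (7, 8)]

shares_cell = regions_overlap(inner, outer)
no_shared_cell = regions_overlap(inner, elsewhere)
```

Building a set from one region makes each membership check constant time on average, which is one way to address the speedup remark in the summary above.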