Data selection

To access Argo data with a DataFetcher, you need to define how to select your data of interest.

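argopy provides 3 different data selection methods:

  • region, for a space/time domain

  • float, for one or more floats

  • profile, for one or more profiles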

To show how these methods (i.e. access points) work, let's first create a DataFetcher:

In [1]: import argopy

In [2]: f = argopy.DataFetcher()

In [3]: f
Out[3]: 
<datafetcher.erddap> 'No access point initialised'
Available access points: float, profile, region
Performances: cache=False, parallel=False
User mode: standard
Dataset: phy

By default, argopy loads the phy dataset in standard user mode from the erddap data source (each of these settings is detailed elsewhere in the documentation).

The printed representation of a DataFetcher lists all available access points and shows here that none has been selected yet.

🗺 For a space/time domain

Use the fetcher access point argopy.DataFetcher.region() to select data for a rectangular space/time domain. For instance, to retrieve data from 75W to 45W, 20N to 30N, 0 to 10 dbar and from January to May 2011:

In [4]: f = f.region([-75, -45, 20, 30, 0, 10, '2011-01', '2011-06'])

In [5]: f.data
Out[5]: 
<xarray.Dataset>
Dimensions:          (N_POINTS: 998)
Coordinates:
  * N_POINTS         (N_POINTS) int64 0 1 2 3 4 5 6 ... 992 993 994 995 996 997
    LATITUDE         (N_POINTS) float64 24.54 24.54 25.04 ... 26.67 24.96 24.96
    LONGITUDE        (N_POINTS) float64 -45.14 -45.14 -51.58 ... -50.4 -50.4
    TIME             (N_POINTS) datetime64[ns] 2011-01-01T11:49:19 ... 2011-0...
Data variables: (12/15)
    CYCLE_NUMBER     (N_POINTS) int64 23 23 10 10 10 10 10 ... 1 5 2 10 10 38 38
    DATA_MODE        (N_POINTS) <U1 'D' 'D' 'D' 'D' 'D' ... 'D' 'D' 'D' 'D' 'D'
    DIRECTION        (N_POINTS) <U1 'A' 'A' 'A' 'A' 'A' ... 'A' 'A' 'A' 'A' 'A'
    PLATFORM_NUMBER  (N_POINTS) int64 1901463 1901463 ... 1901463 1901463
    POSITION_QC      (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
    PRES             (N_POINTS) float64 5.0 10.0 2.0 4.0 ... 5.12 9.42 5.0 10.0
    ...               ...
    PSAL_ERROR       (N_POINTS) float32 0.01 0.01 0.01 ... 0.01 0.01091 0.01182
    PSAL_QC          (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
    TEMP             (N_POINTS) float64 24.08 24.08 24.03 ... 25.64 25.1 24.79
    TEMP_ERROR       (N_POINTS) float32 0.002 0.002 0.002 ... 0.0025 0.002 0.002
    TEMP_QC          (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
    TIME_QC          (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
Attributes:
    DATA_ID:              ARGO
    DOI:                  http://doi.org/10.17882/42182
    Fetched_from:         https://erddap.ifremer.fr/erddap
    Fetched_by:           docs
    Fetched_date:         2024/04/22
    Fetched_constraints:  [x=-75.00/-45.00; y=20.00/30.00; z=0.0/10.0; t=2011...
    Fetched_uri:          ['https://erddap.ifremer.fr/erddap/tabledap/ArgoFlo...
    history:              Variables filtered according to DATA_MODE; Variable...

If you print the DataFetcher again, its standard print now includes information about the data selection.

Note

  • The constraint on time is not mandatory: if not specified, the fetcher will return all data available in this region.

  • The last time bound is exclusive: that's why here we specify June to retrieve data collected in May.
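The two points in the note above can be captured in a small helper. This is only a sketch: `region_box` is a hypothetical name, not part of argopy, and it assumes you think in whole months with inclusive first and last months. It returns the list to pass to the region access point, shifting the end bound by one month because the last time bound is exclusive.

```python
from datetime import date


def region_box(lon_min, lon_max, lat_min, lat_max, pres_min, pres_max,
               first_month=None, last_month=None):
    """Build the list passed to the region access point (hypothetical helper).

    first_month and last_month are (year, month) tuples, both inclusive.
    The returned end bound is the month after last_month, because the
    fetcher treats the last time bound as exclusive.
    """
    box = [lon_min, lon_max, lat_min, lat_max, pres_min, pres_max]
    if first_month is None and last_month is None:
        return box  # the time constraint is optional
    if first_month is None or last_month is None:
        raise ValueError("give both time bounds or neither")

    start = date(first_month[0], first_month[1], 1)
    year, month = last_month
    # Step to the first day of the following month (January after December).
    end = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    if end <= start:
        raise ValueError("last_month must not be earlier than first_month")
    return box + [start.strftime("%Y-%m"), end.strftime("%Y-%m")]


# January to May 2011, as in the example above.
box = region_box(-75, -45, 20, 30, 0, 10, (2011, 1), (2011, 5))
print(box)  # [-75, -45, 20, 30, 0, 10, '2011-01', '2011-06']
```

The resulting list could then be passed as `f.region(box)`; calling `region_box` without the two month arguments gives the 6-element box with no time constraint.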

🤖 For one or more floats

If you know the unique identifier of an Argo float, called its WMO number, you can use the fetcher access point argopy.DataFetcher.float() to select data for one or more floats.

For instance, to select data for float WMO 6902746:

In [6]: f = f.float(6902746)

In [7]: f.data
Out[7]: 
<xarray.Dataset>
Dimensions:          (N_POINTS: 12518)
Coordinates:
  * N_POINTS         (N_POINTS) int64 0 1 2 3 4 ... 12514 12515 12516 12517
    LATITUDE         (N_POINTS) float64 20.08 20.08 20.08 ... 16.67 16.67 16.67
    LONGITUDE        (N_POINTS) float64 -60.17 -60.17 -60.17 ... -77.13 -77.13
    TIME             (N_POINTS) datetime64[ns] 2017-07-06T14:49:00 ... 2020-0...
Data variables: (12/15)
    CYCLE_NUMBER     (N_POINTS) int64 1 1 1 1 1 1 1 ... 117 117 117 117 117 117
    DATA_MODE        (N_POINTS) <U1 'D' 'D' 'D' 'D' 'D' ... 'D' 'D' 'D' 'D' 'D'
    DIRECTION        (N_POINTS) <U1 'D' 'D' 'D' 'D' 'D' ... 'A' 'A' 'A' 'A' 'A'
    PLATFORM_NUMBER  (N_POINTS) int64 6902746 6902746 ... 6902746 6902746
    POSITION_QC      (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
    PRES             (N_POINTS) float64 9.0 14.0 24.0 ... 1.514e+03 1.526e+03
    ...               ...
    PSAL_ERROR       (N_POINTS) float64 0.01003 0.01003 0.01003 ... 0.01 0.01
    PSAL_QC          (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
    TEMP             (N_POINTS) float64 28.04 28.03 28.02 ... 4.299 4.254 4.238
    TEMP_ERROR       (N_POINTS) float64 0.002 0.002 0.002 ... 0.002 0.002 0.002
    TEMP_QC          (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
    TIME_QC          (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
Attributes:
    DATA_ID:              ARGO
    DOI:                  http://doi.org/10.17882/42182
    Fetched_from:         https://erddap.ifremer.fr/erddap
    Fetched_by:           docs
    Fetched_date:         2024/04/22
    Fetched_constraints:  WMO6902746
    Fetched_uri:          ['https://erddap.ifremer.fr/erddap/tabledap/ArgoFlo...
    history:              Variables filtered according to DATA_MODE; Variable...

To fetch data for a collection of floats, pass their WMO numbers as a list:

In [8]: f = f.float([6902746, 6902755])

In [9]: f.data
Out[9]: 
<xarray.Dataset>
Dimensions:          (N_POINTS: 31289)
Coordinates:
  * N_POINTS         (N_POINTS) int64 0 1 2 3 4 ... 31285 31286 31287 31288
    LATITUDE         (N_POINTS) float64 20.08 20.08 20.08 ... 43.81 43.81 43.81
    LONGITUDE        (N_POINTS) float64 -60.17 -60.17 -60.17 ... -28.85 -28.85
    TIME             (N_POINTS) datetime64[ns] 2017-07-06T14:49:00 ... 2023-0...
Data variables: (12/15)
    CYCLE_NUMBER     (N_POINTS) int64 1 1 1 1 1 1 1 ... 177 177 177 177 177 177
    DATA_MODE        (N_POINTS) <U1 'D' 'D' 'D' 'D' 'D' ... 'A' 'A' 'A' 'A' 'A'
    DIRECTION        (N_POINTS) <U1 'D' 'D' 'D' 'D' 'D' ... 'A' 'A' 'A' 'A' 'A'
    PLATFORM_NUMBER  (N_POINTS) int64 6902746 6902746 ... 6902755 6902755
    POSITION_QC      (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
    PRES             (N_POINTS) float64 9.0 14.0 24.0 34.0 ... 278.0 285.0 296.0
    ...               ...
    PSAL_ERROR       (N_POINTS) float32 0.01003 0.01003 0.01003 ... nan nan nan
    PSAL_QC          (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
    TEMP             (N_POINTS) float64 28.04 28.03 28.02 ... 13.5 13.45 13.49
    TEMP_ERROR       (N_POINTS) float32 0.002 0.002 0.002 0.002 ... nan nan nan
    TEMP_QC          (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
    TIME_QC          (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
Attributes:
    DATA_ID:              ARGO
    DOI:                  http://doi.org/10.17882/42182
    Fetched_from:         https://erddap.ifremer.fr/erddap
    Fetched_by:           docs
    Fetched_date:         2024/04/22
    Fetched_constraints:  WMO6902746;WMO6902755
    Fetched_uri:          ['https://erddap.ifremer.fr/erddap/tabledap/ArgoFlo...
    history:              Variables filtered according to DATA_MODE; Variable...
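Since the float access point accepts either a single WMO number or a list, it can be convenient to normalize user input before calling it. The sketch below is a hypothetical helper, not part of argopy. It assumes that a plausible WMO number is a positive integer with 5 or 7 digits; adjust that check if your identifiers differ.

```python
def as_wmo_list(wmo):
    """Return a list of WMO numbers from a single value or a list/tuple."""
    items = wmo if isinstance(wmo, (list, tuple)) else [wmo]
    result = []
    for item in items:
        number = int(item)  # also accepts strings such as "6902746"
        # Assumption: WMO numbers are positive and have 5 or 7 digits.
        if number <= 0 or len(str(number)) not in (5, 7):
            raise ValueError(f"not a plausible WMO number: {item!r}")
        if number not in result:  # drop duplicates, keep input order
            result.append(number)
    return result


print(as_wmo_list(6902746))                         # [6902746]
print(as_wmo_list(["6902746", 6902755, 6902746]))   # [6902746, 6902755]
```

The returned list could then be passed directly, for example `f.float(as_wmo_list(user_input))`.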

⚓ For one or more profiles

Use the fetcher access point argopy.DataFetcher.profile() to select profiles by float WMO platform number and profile cycle number(s).

For instance, to retrieve data for the 12th profile of float WMO 6902755:

In [10]: f = f.profile(6902755, 12)

In [11]: f.data
Out[11]: 
<xarray.Dataset>
Dimensions:          (N_POINTS: 107)
Coordinates:
  * N_POINTS         (N_POINTS) int64 0 1 2 3 4 5 6 ... 101 102 103 104 105 106
    LATITUDE         (N_POINTS) float64 63.68 63.68 63.68 ... 63.68 63.68 63.68
    LONGITUDE        (N_POINTS) float64 -28.81 -28.81 -28.81 ... -28.81 -28.81
    TIME             (N_POINTS) datetime64[ns] 2018-10-19T23:52:00 ... 2018-1...
Data variables: (12/15)
    CYCLE_NUMBER     (N_POINTS) int64 12 12 12 12 12 12 12 ... 12 12 12 12 12 12
    DATA_MODE        (N_POINTS) <U1 'D' 'D' 'D' 'D' 'D' ... 'D' 'D' 'D' 'D' 'D'
    DIRECTION        (N_POINTS) <U1 'A' 'A' 'A' 'A' 'A' ... 'A' 'A' 'A' 'A' 'A'
    PLATFORM_NUMBER  (N_POINTS) int64 6902755 6902755 ... 6902755 6902755
    POSITION_QC      (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
    PRES             (N_POINTS) float64 3.0 4.0 5.0 ... 1.713e+03 1.732e+03
    ...               ...
    PSAL_ERROR       (N_POINTS) float64 0.01 0.01 0.01 0.01 ... 0.01 0.01 0.01
    PSAL_QC          (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
    TEMP             (N_POINTS) float64 7.598 7.599 7.602 ... 3.588 3.549 3.536
    TEMP_ERROR       (N_POINTS) float64 0.002 0.002 0.002 ... 0.002 0.002 0.002
    TEMP_QC          (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
    TIME_QC          (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
Attributes:
    DATA_ID:              ARGO
    DOI:                  http://doi.org/10.17882/42182
    Fetched_from:         https://erddap.ifremer.fr/erddap
    Fetched_by:           docs
    Fetched_date:         2024/04/22
    Fetched_constraints:  WMO6902755_CYC12
    Fetched_uri:          ['https://erddap.ifremer.fr/erddap/tabledap/ArgoFlo...
    history:              Variables filtered according to DATA_MODE; Variable...

To fetch data for more than one profile, pass the cycle numbers as a list:

In [12]: f = f.profile(6902755, [3, 12])

In [13]: f.data
Out[13]: 
<xarray.Dataset>
Dimensions:          (N_POINTS: 215)
Coordinates:
  * N_POINTS         (N_POINTS) int64 0 1 2 3 4 5 6 ... 209 210 211 212 213 214
    LATITUDE         (N_POINTS) float64 59.72 59.72 59.72 ... 63.68 63.68 63.68
    LONGITUDE        (N_POINTS) float64 -31.24 -31.24 -31.24 ... -28.81 -28.81
    TIME             (N_POINTS) datetime64[ns] 2018-07-22T00:03:00 ... 2018-1...
Data variables: (12/15)
    CYCLE_NUMBER     (N_POINTS) int64 3 3 3 3 3 3 3 3 ... 12 12 12 12 12 12 12
    DATA_MODE        (N_POINTS) <U1 'D' 'D' 'D' 'D' 'D' ... 'D' 'D' 'D' 'D' 'D'
    DIRECTION        (N_POINTS) <U1 'A' 'A' 'A' 'A' 'A' ... 'A' 'A' 'A' 'A' 'A'
    PLATFORM_NUMBER  (N_POINTS) int64 6902755 6902755 ... 6902755 6902755
    POSITION_QC      (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
    PRES             (N_POINTS) float64 3.0 4.0 5.0 ... 1.713e+03 1.732e+03
    ...               ...
    PSAL_ERROR       (N_POINTS) float64 0.01 0.01 0.01 0.01 ... 0.01 0.01 0.01
    PSAL_QC          (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
    TEMP             (N_POINTS) float64 8.742 8.743 8.744 ... 3.588 3.549 3.536
    TEMP_ERROR       (N_POINTS) float64 0.002 0.002 0.002 ... 0.002 0.002 0.002
    TEMP_QC          (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
    TIME_QC          (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
Attributes:
    DATA_ID:              ARGO
    DOI:                  http://doi.org/10.17882/42182
    Fetched_from:         https://erddap.ifremer.fr/erddap
    Fetched_by:           docs
    Fetched_date:         2024/04/22
    Fetched_constraints:  WMO6902755_CYC3_CYC12
    Fetched_uri:          ['https://erddap.ifremer.fr/erddap/tabledap/ArgoFlo...
    history:              Variables filtered according to DATA_MODE; Variable...
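The Fetched_constraints attribute in the float and profile outputs above follows a visible pattern (`WMO6902746;WMO6902755`, `WMO6902755_CYC3_CYC12`). As a sketch, the hypothetical helper below turns such a label back into a mapping from WMO number to cycle numbers. The format is inferred only from the outputs shown on this page, so treat it as an assumption rather than a documented contract; region labels are not handled.

```python
def parse_constraints(label):
    """Map each WMO number in a float/profile label to its cycle numbers.

    An empty list means no cycle was given (whole-float selection).
    """
    selection = {}
    for part in label.split(";"):
        tokens = part.strip().split("_")
        if not tokens[0].startswith("WMO"):
            raise ValueError(f"unexpected token: {tokens[0]!r}")
        wmo = int(tokens[0][len("WMO"):])
        cycles = []
        for token in tokens[1:]:
            if not token.startswith("CYC"):
                raise ValueError(f"unexpected token: {token!r}")
            cycles.append(int(token[len("CYC"):]))
        selection[wmo] = cycles
    return selection


print(parse_constraints("WMO6902746;WMO6902755"))   # {6902746: [], 6902755: []}
print(parse_constraints("WMO6902755_CYC3_CYC12"))   # {6902755: [3, 12]}
```

With a fetched dataset `ds`, the label would come from `ds.attrs["Fetched_constraints"]`.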

Note

You can chain data selection and fetching in a single statement:

f = argopy.DataFetcher().region([-75, -45, 20, 30, 0, 10, '2011-01-01', '2011-06']).load()
f.data
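Chaining of this kind works when each method returns the fetcher object itself. The toy class below illustrates that pattern only; `ToyFetcher` is a hypothetical stand-in written for this example, it downloads nothing, and it is not how argopy is implemented internally.

```python
class ToyFetcher:
    """Hypothetical stand-in for a fetcher; not argopy code."""

    def __init__(self):
        self.selection = None
        self._data = None

    def region(self, box):
        # Record the selection and return the object to allow chaining.
        self.selection = ("region", list(box))
        return self

    def load(self):
        # A real fetcher would download data here; this toy only
        # stores the selection it was given.
        self._data = {"selected": self.selection}
        return self

    @property
    def data(self):
        if self._data is None:
            self.load()  # load on first access if not done explicitly
        return self._data


f = ToyFetcher().region([-75, -45, 20, 30, 0, 10, "2011-01", "2011-06"]).load()
print(f.data["selected"][0])  # region
```

Because `region()` and `load()` both return the same object, the chained form and the step-by-step form (`f = f.region(...)` followed by `f.data`) end up in the same state.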