Data selection
To access Argo data with a DataFetcher, you need to define how to select your data of interest.
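argopy provides 3 different data selection methods: for a space/time domain (region), for one or more floats (float), and for one or more profiles (profile).
To show how these methods (i.e. access points) work, let's first create a DataFetcher: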
In [1]: import argopy
In [2]: f = argopy.DataFetcher()
In [3]: f
Out[3]:
<datafetcher.erddap> 'No access point initialised'
Available access points: float, profile, region
Performances: cache=False, parallel=False
User mode: standard
Dataset: phy
By default, argopy loads the phy dataset, in standard user mode, from the erddap data source.
The standard DataFetcher print indicates all available access points, and here, that none is selected yet.
🗺 For a space/time domain
Use the fetcher access point argopy.DataFetcher.region() to select data for a rectangular space/time domain. For instance, to retrieve data from 75W to 45W, 20N to 30N, 0db to 10db and from January to May 2011:
In [4]: f = f.region([-75, -45, 20, 30, 0, 10, '2011-01', '2011-06'])
In [5]: f.data
Out[5]:
<xarray.Dataset> Size: 120kB
Dimensions: (N_POINTS: 998)
Coordinates:
* N_POINTS (N_POINTS) int64 8kB 0 1 2 3 4 5 ... 993 994 995 996 997
LATITUDE (N_POINTS) float64 8kB 24.54 24.54 25.04 ... 24.96 24.96
LONGITUDE (N_POINTS) float64 8kB -45.14 -45.14 -51.58 ... -50.4 -50.4
TIME (N_POINTS) datetime64[ns] 8kB 2011-01-01T11:49:19 ... 20...
Data variables: (12/15)
CYCLE_NUMBER (N_POINTS) int64 8kB 23 23 10 10 10 10 ... 5 2 10 10 38 38
DATA_MODE (N_POINTS) <U1 4kB 'D' 'D' 'D' 'D' 'D' ... 'D' 'D' 'D' 'D'
DIRECTION (N_POINTS) <U1 4kB 'A' 'A' 'A' 'A' 'A' ... 'A' 'A' 'A' 'A'
PLATFORM_NUMBER (N_POINTS) int64 8kB 1901463 1901463 ... 1901463 1901463
POSITION_QC (N_POINTS) int64 8kB 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
PRES (N_POINTS) float32 4kB 5.0 10.0 2.0 4.0 ... 9.42 5.0 10.0
... ...
PSAL_ERROR (N_POINTS) float32 4kB 0.01 0.01 0.01 ... 0.01091 0.01182
PSAL_QC (N_POINTS) int64 8kB 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
TEMP (N_POINTS) float32 4kB 24.08 24.08 24.03 ... 25.1 24.79
TEMP_ERROR (N_POINTS) float32 4kB 0.002 0.002 0.002 ... 0.002 0.002
TEMP_QC (N_POINTS) int64 8kB 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
TIME_QC (N_POINTS) int64 8kB 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
Attributes:
DATA_ID: ARGO
DOI: http://doi.org/10.17882/42182
Fetched_from: https://erddap.ifremer.fr/erddap
Fetched_by: docs
Fetched_date: 2024/09/23
Fetched_constraints: [x=-75.00/-45.00; y=20.00/30.00; z=0.0/10.0; t=2011...
Fetched_uri: ['https://erddap.ifremer.fr/erddap/tabledap/ArgoFlo...
history: Variables filtered according to DATA_MODE; Variable...
If you display the fetcher again, its standard print is now updated with information about the data selection.
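The box passed to region() above lists its bounds in a fixed order: longitude, latitude, pressure, then the optional time bounds. As a minimal sketch, the helper below assembles such a list from named bounds and rejects reversed numeric ranges. `make_region_box` is a hypothetical name introduced for illustration; it is not part of argopy.

```python
def make_region_box(lon, lat, pres, time=None):
    """Assemble a box list in the order used by the region() call above.

    Hypothetical helper for illustration only, not part of argopy.
    Each of lon, lat and pres is a (min, max) pair; time is an optional
    (start, end) pair of date strings.
    """
    for name, (low, high) in (("lon", lon), ("lat", lat), ("pres", pres)):
        if low > high:
            raise ValueError(f"{name} bounds must be (min, max), got {low} > {high}")
    box = [lon[0], lon[1], lat[0], lat[1], pres[0], pres[1]]
    if time is not None:
        start, end = time  # unpacking fails loudly if time is not a pair
        box.extend([start, end])
    return box


# West longitudes are negative: 75W to 45W becomes (-75, -45).
box = make_region_box(lon=(-75, -45), lat=(20, 30), pres=(0, 10),
                      time=("2011-01", "2011-06"))
print(box)  # [-75, -45, 20, 30, 0, 10, '2011-01', '2011-06']
```

The resulting list has the same content as the one passed to f.region() in the example above.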
Note
The constraint on time is not mandatory: if not specified, the fetcher will return all data available in this region.
The last time bound is exclusive: that's why here we specify June to retrieve data collected in May.
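Because the upper time bound is exclusive, the end date to pass is the first month after the last month you want. The sketch below computes both bounds with the standard library only. `month_span_bounds` is a hypothetical helper name, not an argopy function.

```python
from datetime import date


def month_span_bounds(year, first_month, last_month):
    """Return ('YYYY-MM', 'YYYY-MM') bounds covering first_month..last_month.

    Hypothetical helper: the second value is the month *after* last_month,
    to match an exclusive upper time bound.
    """
    start = date(year, first_month, 1)
    if last_month == 12:
        end = date(year + 1, 1, 1)  # roll over to January of the next year
    else:
        end = date(year, last_month + 1, 1)
    return start.strftime("%Y-%m"), end.strftime("%Y-%m")


# January to May 2011 inclusive requires an end bound in June.
print(month_span_bounds(2011, 1, 5))  # ('2011-01', '2011-06')
```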
🤖 For one or more floats
If you know the unique identifier of an Argo float, called its WMO number, you can use the fetcher access point DataFetcher.float() to select one or more floats by WMO platform number.
For instance, to select data for float WMO 6902746:
In [6]: f = f.float(6902746)
In [7]: f.data
Out[7]:
<xarray.Dataset> Size: 2MB
Dimensions: (N_POINTS: 12518)
Coordinates:
* N_POINTS (N_POINTS) int64 100kB 0 1 2 3 ... 12514 12515 12516 12517
LATITUDE (N_POINTS) float64 100kB 20.08 20.08 20.08 ... 16.67 16.67
LONGITUDE (N_POINTS) float64 100kB -60.17 -60.17 ... -77.13 -77.13
TIME (N_POINTS) datetime64[ns] 100kB 2017-07-06T14:49:00 ... ...
Data variables: (12/15)
CYCLE_NUMBER (N_POINTS) int64 100kB 1 1 1 1 1 1 ... 117 117 117 117 117
DATA_MODE (N_POINTS) <U1 50kB 'D' 'D' 'D' 'D' 'D' ... 'D' 'D' 'D' 'D'
DIRECTION (N_POINTS) <U1 50kB 'D' 'D' 'D' 'D' 'D' ... 'A' 'A' 'A' 'A'
PLATFORM_NUMBER (N_POINTS) int64 100kB 6902746 6902746 ... 6902746 6902746
POSITION_QC (N_POINTS) int64 100kB 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
PRES (N_POINTS) float32 50kB 9.0 14.0 ... 1.514e+03 1.526e+03
... ...
PSAL_ERROR (N_POINTS) float32 50kB 0.01003 0.01003 ... 0.01 0.01
PSAL_QC (N_POINTS) int64 100kB 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
TEMP (N_POINTS) float32 50kB 28.04 28.03 28.02 ... 4.254 4.238
TEMP_ERROR (N_POINTS) float32 50kB 0.002 0.002 0.002 ... 0.002 0.002
TEMP_QC (N_POINTS) int64 100kB 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
TIME_QC (N_POINTS) int64 100kB 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
Attributes:
DATA_ID: ARGO
DOI: http://doi.org/10.17882/42182
Fetched_from: https://erddap.ifremer.fr/erddap
Fetched_by: docs
Fetched_date: 2024/09/23
Fetched_constraints: WMO6902746
Fetched_uri: ['https://erddap.ifremer.fr/erddap/tabledap/ArgoFlo...
history: Variables filtered according to DATA_MODE; Variable...
To fetch data for a collection of floats, input them in a list:
In [8]: f = f.float([6902746, 6902755])
In [9]: f.data
Out[9]:
<xarray.Dataset> Size: 4MB
Dimensions: (N_POINTS: 31289)
Coordinates:
* N_POINTS (N_POINTS) int64 250kB 0 1 2 3 ... 31285 31286 31287 31288
LATITUDE (N_POINTS) float64 250kB 20.08 20.08 20.08 ... 43.81 43.81
LONGITUDE (N_POINTS) float64 250kB -60.17 -60.17 ... -28.85 -28.85
TIME (N_POINTS) datetime64[ns] 250kB 2017-07-06T14:49:00 ... ...
Data variables: (12/15)
CYCLE_NUMBER (N_POINTS) int64 250kB 1 1 1 1 1 1 ... 177 177 177 177 177
DATA_MODE (N_POINTS) <U1 125kB 'D' 'D' 'D' 'D' ... 'A' 'A' 'A' 'A'
DIRECTION (N_POINTS) <U1 125kB 'D' 'D' 'D' 'D' ... 'A' 'A' 'A' 'A'
PLATFORM_NUMBER (N_POINTS) int64 250kB 6902746 6902746 ... 6902755 6902755
POSITION_QC (N_POINTS) int64 250kB 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
PRES (N_POINTS) float32 125kB 9.0 14.0 24.0 ... 285.0 296.0
... ...
PSAL_ERROR (N_POINTS) float32 125kB 0.01003 0.01003 ... nan nan
PSAL_QC (N_POINTS) int64 250kB 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
TEMP (N_POINTS) float32 125kB 28.04 28.03 28.02 ... 13.45 13.49
TEMP_ERROR (N_POINTS) float32 125kB 0.002 0.002 0.002 ... nan nan nan
TEMP_QC (N_POINTS) int64 250kB 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
TIME_QC (N_POINTS) int64 250kB 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
Attributes:
DATA_ID: ARGO
DOI: http://doi.org/10.17882/42182
Fetched_from: https://erddap.ifremer.fr/erddap
Fetched_by: docs
Fetched_date: 2024/09/23
Fetched_constraints: WMO6902746;WMO6902755
Fetched_uri: ['https://erddap.ifremer.fr/erddap/tabledap/ArgoFlo...
history: Variables filtered according to DATA_MODE; Variable...
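Since the access point takes either a single WMO number or a list of them, it can be convenient to normalize user input before building a request. The following is a minimal sketch under that assumption. `as_wmo_list` is a hypothetical helper, not part of argopy, and it only checks that values are positive integers.

```python
def as_wmo_list(wmo):
    """Return unique integer WMO numbers as a list, in first-seen order.

    Hypothetical helper: accepts one number (int or str) or an iterable
    of numbers.
    """
    if isinstance(wmo, (int, str)):
        wmo = [wmo]
    unique = []
    for value in wmo:
        number = int(value)  # raises ValueError on non-numeric strings
        if number <= 0:
            raise ValueError(f"WMO numbers must be positive, got {number}")
        if number not in unique:
            unique.append(number)
    return unique


print(as_wmo_list(6902746))                          # [6902746]
print(as_wmo_list([6902746, "6902755", 6902746]))    # [6902746, 6902755]
```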
⚓ For one or more profiles
Use the fetcher access point argopy.DataFetcher.profile() to specify a float WMO platform number and the cycle number(s) of the profile(s) to retrieve.
For instance, to retrieve data for the 12th profile of float WMO 6902755:
In [10]: f = f.profile(6902755, 12)
In [11]: f.data
Out[11]:
<xarray.Dataset> Size: 13kB
Dimensions: (N_POINTS: 107)
Coordinates:
* N_POINTS (N_POINTS) int64 856B 0 1 2 3 4 5 ... 102 103 104 105 106
LATITUDE (N_POINTS) float64 856B 63.68 63.68 63.68 ... 63.68 63.68
LONGITUDE (N_POINTS) float64 856B -28.81 -28.81 ... -28.81 -28.81
TIME (N_POINTS) datetime64[ns] 856B 2018-10-19T23:52:00 ... 2...
Data variables: (12/15)
CYCLE_NUMBER (N_POINTS) int64 856B 12 12 12 12 12 12 ... 12 12 12 12 12
DATA_MODE (N_POINTS) <U1 428B 'D' 'D' 'D' 'D' 'D' ... 'D' 'D' 'D' 'D'
DIRECTION (N_POINTS) <U1 428B 'A' 'A' 'A' 'A' 'A' ... 'A' 'A' 'A' 'A'
PLATFORM_NUMBER (N_POINTS) int64 856B 6902755 6902755 ... 6902755 6902755
POSITION_QC (N_POINTS) int64 856B 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
PRES (N_POINTS) float32 428B 3.0 4.0 5.0 ... 1.713e+03 1.732e+03
... ...
PSAL_ERROR (N_POINTS) float32 428B 0.01 0.01 0.01 ... 0.01 0.01 0.01
PSAL_QC (N_POINTS) int64 856B 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
TEMP (N_POINTS) float32 428B 7.598 7.599 7.602 ... 3.549 3.536
TEMP_ERROR (N_POINTS) float32 428B 0.002 0.002 0.002 ... 0.002 0.002
TEMP_QC (N_POINTS) int64 856B 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
TIME_QC (N_POINTS) int64 856B 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
Attributes:
DATA_ID: ARGO
DOI: http://doi.org/10.17882/42182
Fetched_from: https://erddap.ifremer.fr/erddap
Fetched_by: docs
Fetched_date: 2024/09/23
Fetched_constraints: WMO6902755_CYC12
Fetched_uri: ['https://erddap.ifremer.fr/erddap/tabledap/ArgoFlo...
history: Variables filtered according to DATA_MODE; Variable...
To fetch data for more than one profile, input them in a list:
In [12]: f = f.profile(6902755, [3, 12])
In [13]: f.data
Out[13]:
<xarray.Dataset> Size: 26kB
Dimensions: (N_POINTS: 215)
Coordinates:
* N_POINTS (N_POINTS) int64 2kB 0 1 2 3 4 5 ... 210 211 212 213 214
LATITUDE (N_POINTS) float64 2kB 59.72 59.72 59.72 ... 63.68 63.68
LONGITUDE (N_POINTS) float64 2kB -31.24 -31.24 ... -28.81 -28.81
TIME (N_POINTS) datetime64[ns] 2kB 2018-07-22T00:03:00 ... 20...
Data variables: (12/15)
CYCLE_NUMBER (N_POINTS) int64 2kB 3 3 3 3 3 3 3 ... 12 12 12 12 12 12 12
DATA_MODE (N_POINTS) <U1 860B 'D' 'D' 'D' 'D' 'D' ... 'D' 'D' 'D' 'D'
DIRECTION (N_POINTS) <U1 860B 'A' 'A' 'A' 'A' 'A' ... 'A' 'A' 'A' 'A'
PLATFORM_NUMBER (N_POINTS) int64 2kB 6902755 6902755 ... 6902755 6902755
POSITION_QC (N_POINTS) int64 2kB 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
PRES (N_POINTS) float32 860B 3.0 4.0 5.0 ... 1.713e+03 1.732e+03
... ...
PSAL_ERROR (N_POINTS) float32 860B 0.01 0.01 0.01 ... 0.01 0.01 0.01
PSAL_QC (N_POINTS) int64 2kB 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
TEMP (N_POINTS) float32 860B 8.742 8.743 8.744 ... 3.549 3.536
TEMP_ERROR (N_POINTS) float32 860B 0.002 0.002 0.002 ... 0.002 0.002
TEMP_QC (N_POINTS) int64 2kB 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
TIME_QC (N_POINTS) int64 2kB 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
Attributes:
DATA_ID: ARGO
DOI: http://doi.org/10.17882/42182
Fetched_from: https://erddap.ifremer.fr/erddap
Fetched_by: docs
Fetched_date: 2024/09/23
Fetched_constraints: WMO6902755_CYC3_CYC12
Fetched_uri: ['https://erddap.ifremer.fr/erddap/tabledap/ArgoFlo...
history: Variables filtered according to DATA_MODE; Variable...
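The Fetched_constraints attribute in the outputs above records which floats and cycles were requested (for example WMO6902755_CYC3_CYC12). As a rough sketch, the parser below turns such a string back into WMO and cycle numbers. The format is inferred only from the float and profile outputs shown on this page, and `parse_constraints` is a hypothetical helper, not an argopy function. Region constraints use a different layout and are rejected.

```python
def parse_constraints(text):
    """Map each WMO number in a constraint string to its list of cycles.

    Hypothetical helper. Assumes items separated by ';', each shaped like
    'WMO<number>' optionally followed by '_CYC<number>' parts.
    """
    selection = {}
    for item in text.split(";"):
        wmo_part, *cycle_parts = item.split("_")
        if not wmo_part.startswith("WMO"):
            raise ValueError(f"unsupported constraint: {item!r}")
        cycles = []
        for part in cycle_parts:
            if not part.startswith("CYC"):
                raise ValueError(f"unsupported cycle token: {part!r}")
            cycles.append(int(part[3:]))  # drop the 'CYC' prefix
        selection[int(wmo_part[3:])] = cycles  # drop the 'WMO' prefix
    return selection


print(parse_constraints("WMO6902755_CYC3_CYC12"))   # {6902755: [3, 12]}
print(parse_constraints("WMO6902746;WMO6902755"))   # {6902746: [], 6902755: []}
```

An empty cycle list here simply means that no cycle was named in the string, as in the float examples.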
Note
You can chain data selection and fetching in a single line:
f = argopy.DataFetcher().region([-75, -45, 20, 30, 0, 10, '2011-01-01', '2011-06']).load()
f.data