Fetching Argo data
To fetch (i.e. access, download and format) Argo data, argopy provides the DataFetcher class. This section of the documentation explains how to use it.
Several DataFetcher arguments let you select the dataset, the data source and the user mode best suited to your application, and also improve performance.
You define the selection of data to fetch with one of the DataFetcher methods: region, float or profile.
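As a rough illustration of the three access points, the sketch below dispatches a selection to region, float or profile by name. The helper `select_data` and the `StubFetcher` class are hypothetical and not part of argopy; only the three method names come from the text above. The stub merely records the arguments it receives, so the example runs without network access.

```python
ACCESS_POINTS = ("region", "float", "profile")


def select_data(fetcher, access_point, *args):
    """Apply one of the three access points to a fetcher and return the result.

    `fetcher` is meant to be an argopy.DataFetcher instance, but any object
    exposing methods with the same names works (handy for offline checks).
    """
    if access_point not in ACCESS_POINTS:
        raise ValueError(
            f"Unknown access point {access_point!r}; expected one of {ACCESS_POINTS}"
        )
    return getattr(fetcher, access_point)(*args)


class StubFetcher:
    """Hypothetical stand-in that records the selection instead of fetching."""

    def region(self, box):
        return ("region", list(box))

    def float(self, wmo):
        return ("float", wmo)

    def profile(self, wmo, cyc):
        return ("profile", wmo, cyc)


stub = StubFetcher()
box = [-75, -45, 20, 30, 0, 10, "2011-01", "2011-06"]

region_sel = select_data(stub, "region", box)            # space/time box
float_sel = select_data(stub, "float", 1901463)          # one float (WMO number)
profile_sel = select_data(stub, "profile", 1901463, 23)  # one float and one cycle
```

With argopy installed, passing `argopy.DataFetcher()` in place of the stub should return a fetcher on which the selection is defined, assuming the access points accept these arguments in your version.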
These methods and arguments are all explained in the following sections:
In a nutshell
Two lines are enough to download Argo data: import and fetch!
In [1]: import argopy
In [2]: ds = argopy.DataFetcher().region([-75, -45, 20, 30, 0, 10, '2011-01', '2011-06']).load().data
In [3]: ds
Out[3]:
<xarray.Dataset>
Dimensions: (N_POINTS: 998)
Coordinates:
* N_POINTS (N_POINTS) int64 0 1 2 3 4 5 6 ... 992 993 994 995 996 997
LATITUDE (N_POINTS) float64 24.54 24.54 25.04 ... 26.67 24.96 24.96
LONGITUDE (N_POINTS) float64 -45.14 -45.14 -51.58 ... -50.4 -50.4
TIME (N_POINTS) datetime64[ns] 2011-01-01T11:49:19 ... 2011-0...
Data variables: (12/15)
CYCLE_NUMBER (N_POINTS) int64 23 23 10 10 10 10 10 ... 1 5 2 10 10 38 38
DATA_MODE (N_POINTS) <U1 'D' 'D' 'D' 'D' 'D' ... 'D' 'D' 'D' 'D' 'D'
DIRECTION (N_POINTS) <U1 'A' 'A' 'A' 'A' 'A' ... 'A' 'A' 'A' 'A' 'A'
PLATFORM_NUMBER (N_POINTS) int64 1901463 1901463 ... 1901463 1901463
POSITION_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
PRES (N_POINTS) float64 5.0 10.0 2.0 4.0 ... 5.12 9.42 5.0 10.0
... ...
PSAL_ERROR (N_POINTS) float32 0.01 0.01 0.01 ... 0.01 0.01091 0.01182
PSAL_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
TEMP (N_POINTS) float64 24.08 24.08 24.03 ... 25.64 25.1 24.79
TEMP_ERROR (N_POINTS) float32 0.002 0.002 0.002 ... 0.0025 0.002 0.002
TEMP_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
TIME_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
Attributes:
DATA_ID: ARGO
DOI: http://doi.org/10.17882/42182
Fetched_from: https://erddap.ifremer.fr/erddap
Fetched_by: docs
Fetched_date: 2024/04/22
Fetched_constraints: [x=-75.00/-45.00; y=20.00/30.00; z=0.0/10.0; t=2011...
Fetched_uri: ['https://erddap.ifremer.fr/erddap/tabledap/ArgoFlo...
history: Variables filtered according to DATA_MODE; Variable...
Workflow explained
Let's explain what happened in the single-line Argo data fetch above.
Import argopy and create an instance of DataFetcher:
In [4]: import argopy
In [5]: f = argopy.DataFetcher()
In [6]: f
Out[6]:
<datafetcher.erddap> 'No access point initialised'
Available access points: float, profile, region
Performances: cache=False, parallel=False
User mode: standard
Dataset: phy
By default, argopy will load the phy dataset, in standard user mode, from the erddap data source.
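These defaults can also be spelled out explicitly. The sketch below assumes that the DataFetcher keyword arguments are named `src`, `mode` and `ds`, and that `cache` and `parallel` are accepted as options, matching the labels in the representation above; check the API reference of your argopy version before relying on it. The helper names `fetcher_options` and `make_fetcher` are hypothetical.

```python
# Defaults as reported by the fetcher representation (erddap / standard / phy).
DEFAULTS = {"src": "erddap", "mode": "standard", "ds": "phy"}
EXTRA_OPTIONS = {"cache", "parallel"}


def fetcher_options(**overrides):
    """Merge user overrides into the documented defaults."""
    unknown = set(overrides) - set(DEFAULTS) - EXTRA_OPTIONS
    if unknown:
        raise ValueError(f"Unsupported option(s) in this sketch: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


def make_fetcher(**overrides):
    """Create a DataFetcher with explicit options (requires argopy)."""
    import argopy  # imported lazily so fetcher_options() works without argopy

    return argopy.DataFetcher(**fetcher_options(**overrides))


default_options = fetcher_options()
cached_options = fetcher_options(cache=True)
```

Calling `make_fetcher()` would then be expected to be equivalent to `argopy.DataFetcher()`, under the naming assumptions stated above.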
Once you have a DataFetcher, you must select data. As an example, here is a space/time data selection:
In [7]: f = f.region([-75, -45, 20, 30, 0, 10, '2011-01', '2011-06'])
In [8]: f
Out[8]:
<datafetcher.erddap>
Name: Ifremer erddap Argo data fetcher for a space/time region
API: https://erddap.ifremer.fr/erddap
Domain: [x=-75.00/-45.00; y=20.00/30.0 ... 10.0; t=2011-01-01/2011-06-01]
Performances: cache=False, parallel=False
User mode: standard
Dataset: phy
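The region selection takes a single list of eight values. A small helper can make the ordering explicit and catch inverted bounds before any request is sent. The ordering assumed here is [lon_min, lon_max, lat_min, lat_max, pres_min, pres_max, date_min, date_max], and the function name `make_box` is hypothetical, not an argopy function.

```python
def make_box(lon, lat, pres, dates):
    """Build a region box from (min, max) pairs.

    This sketch rejects inverted numeric bounds; it does not try to handle
    boxes that cross the antimeridian, and it does not parse the dates.
    """
    for name, (low, high) in (("lon", lon), ("lat", lat), ("pres", pres)):
        if low > high:
            raise ValueError(f"{name} bounds are inverted: {low} > {high}")
    return [lon[0], lon[1], lat[0], lat[1], pres[0], pres[1], dates[0], dates[1]]


box = make_box(
    lon=(-75, -45),
    lat=(20, 30),
    pres=(0, 10),
    dates=("2011-01", "2011-06"),
)
# The resulting list can then be passed to the region access point: f.region(box)
```

Keeping the bounds as named pairs tends to make mistakes, such as swapping latitude and longitude, easier to spot than in a flat list.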
Once you have defined a data selection, data fetching is triggered when you access one of the DataFetcher properties:
- data, an xarray.Dataset with all Argo data in the selection,
- index, a pandas.DataFrame with a list of profiles in the selection.
In [9]: f.data
Out[9]:
<xarray.Dataset>
Dimensions: (N_POINTS: 998)
Coordinates:
* N_POINTS (N_POINTS) int64 0 1 2 3 4 5 6 ... 992 993 994 995 996 997
LATITUDE (N_POINTS) float64 24.54 24.54 25.04 ... 26.67 24.96 24.96
LONGITUDE (N_POINTS) float64 -45.14 -45.14 -51.58 ... -50.4 -50.4
TIME (N_POINTS) datetime64[ns] 2011-01-01T11:49:19 ... 2011-0...
Data variables: (12/15)
CYCLE_NUMBER (N_POINTS) int64 23 23 10 10 10 10 10 ... 1 5 2 10 10 38 38
DATA_MODE (N_POINTS) <U1 'D' 'D' 'D' 'D' 'D' ... 'D' 'D' 'D' 'D' 'D'
DIRECTION (N_POINTS) <U1 'A' 'A' 'A' 'A' 'A' ... 'A' 'A' 'A' 'A' 'A'
PLATFORM_NUMBER (N_POINTS) int64 1901463 1901463 ... 1901463 1901463
POSITION_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
PRES (N_POINTS) float64 5.0 10.0 2.0 4.0 ... 5.12 9.42 5.0 10.0
... ...
PSAL_ERROR (N_POINTS) float32 0.01 0.01 0.01 ... 0.01 0.01091 0.01182
PSAL_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
TEMP (N_POINTS) float64 24.08 24.08 24.03 ... 25.64 25.1 24.79
TEMP_ERROR (N_POINTS) float32 0.002 0.002 0.002 ... 0.0025 0.002 0.002
TEMP_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
TIME_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 1
Attributes:
DATA_ID: ARGO
DOI: http://doi.org/10.17882/42182
Fetched_from: https://erddap.ifremer.fr/erddap
Fetched_by: docs
Fetched_date: 2024/04/22
Fetched_constraints: [x=-75.00/-45.00; y=20.00/30.00; z=0.0/10.0; t=2011...
Fetched_uri: ['https://erddap.ifremer.fr/erddap/tabledap/ArgoFlo...
history: Variables filtered according to DATA_MODE; Variable...
In [10]: f.index
Out[10]:
date latitude longitude wmo cyc
0 2011-01-02 11:26:40 27.829 -56.303 1901461 23
1 2011-01-12 11:44:20 27.988 -56.378 1901461 24
2 2011-01-22 11:23:26 27.891 -55.865 1901461 25
3 2011-02-01 11:18:38 27.538 -54.669 1901461 26
4 2011-02-11 11:46:04 27.487 -53.686 1901461 27
.. ... ... ... ... ...
462 2011-05-02 01:22:10 27.134 -71.040 6901050 2
463 2011-05-12 01:25:11 27.498 -70.044 6901050 3
464 2011-05-20 16:30:09 21.120 -57.985 6901051 1
465 2011-05-30 16:18:21 21.489 -58.071 6901051 2
466 2011-05-25 20:08:09 21.072 -52.625 6901052 1
[467 rows x 5 columns]
These fetcher properties call the DataFetcher.load() method, which in turn calls either DataFetcher.to_xarray() or DataFetcher.to_index() if the data structures are not already in memory.
Note that DataFetcher.to_xarray() and DataFetcher.to_index() force a data download on every call, while the DataFetcher.load() method keeps data in memory in the DataFetcher.data and DataFetcher.index properties.
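The difference between the two behaviours can be pictured with a toy model. The class below is a hypothetical stand-in, not argopy's implementation: to_xarray() always triggers a simulated download, load() downloads only when nothing is in memory yet, and the data property goes through load().

```python
class ToyFetcher:
    """Toy model of the load-once behaviour described above (not argopy code)."""

    def __init__(self):
        self.downloads = 0  # counts simulated downloads
        self._data = None   # in-memory copy, filled by load()

    def to_xarray(self):
        # Every call simulates a fresh download.
        self.downloads += 1
        return {"N_POINTS": 998}  # placeholder for a real dataset

    def load(self):
        # Download only if nothing is in memory yet.
        if self._data is None:
            self._data = self.to_xarray()
        return self

    @property
    def data(self):
        # Accessing the property goes through load().
        self.load()
        return self._data


toy = ToyFetcher()

# Two property accesses: only the first one triggers a download.
first = toy.data
second = toy.data
downloads_after_property_access = toy.downloads

# Two direct calls: each one triggers a new download.
toy.to_xarray()
toy.to_xarray()
downloads_after_direct_calls = toy.downloads
```

In this model the counter reaches 1 after the two property accesses and 3 after the two direct calls, which is the pattern to keep in mind when choosing between the properties and the to_xarray() or to_index() methods.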
If you wish to quickly look at your data selection, you can call the DataFetcher.plot() method:
f.plot('trajectory', add_legend=False)
If you selected data for a float, the DataFetcher.dashboard() method can also be used.
See the Data visualisation section for more details on argopy data visualisation tools.
Hint
The DataFetcher.domain property will also give you the space/time domain covered by your data selection.
In [11]: f.domain # [lon_min, lon_max, lat_min, lat_max, pres_min, pres_max, date_min, date_max]
Out[11]:
[-74.37700000000001,
-45.118,
20.018,
29.995,
1.0,
10.399999618530273,
numpy.datetime64('2011-01-01T11:49:19.000000000'),
numpy.datetime64('2011-05-31T11:34:52.000000000')]
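Since the domain is a flat list of eight values, giving each one a name can make downstream code easier to read. The sketch below is a hypothetical helper, not an argopy function. It relies only on the ordering [lon_min, lon_max, lat_min, lat_max, pres_min, pres_max, date_min, date_max], and for simplicity it uses rounded numbers and ISO strings in place of the numpy.datetime64 values.

```python
from typing import Any, NamedTuple


class Domain(NamedTuple):
    """Named view of the eight values of a space/time domain."""

    lon_min: float
    lon_max: float
    lat_min: float
    lat_max: float
    pres_min: float
    pres_max: float
    date_min: Any
    date_max: Any


def name_domain(values):
    """Wrap a flat list of eight domain values into a Domain tuple."""
    if len(values) != 8:
        raise ValueError(f"Expected 8 values, got {len(values)}")
    return Domain(*values)


# Values rounded from the output above; dates given as ISO strings.
domain = name_domain(
    [-74.377, -45.118, 20.018, 29.995, 1.0, 10.4,
     "2011-01-01T11:49:19", "2011-05-31T11:34:52"]
)

lon_span = domain.lon_max - domain.lon_min  # width of the selection in degrees
```

With a real fetcher, `name_domain(f.domain)` should work the same way, provided the property returns the eight values in this order.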