Manipulating data¶
Once you have fetched data, argopy provides a handy xarray.Dataset accessor namespace, argo, to perform specific manipulations of the data. This means that if your dataset is ds, you can use ds.argo to access more argopy functions.
Let's start with the standard import:
In [1]: from argopy import DataFetcher as ArgoDataFetcher
Transformation¶
Points vs profiles¶
Fetched data are returned as a 1D array collection of measurements:
In [2]: argo_loader = ArgoDataFetcher().region([-75,-55,30.,40.,0,100., '2011-01-01', '2011-01-15'])
In [3]: ds_points = argo_loader.to_xarray()
In [4]: ds_points
Out[4]:
<xarray.Dataset>
Dimensions: (N_POINTS: 524)
Coordinates:
* N_POINTS (N_POINTS) int64 0 1 2 3 4 5 ... 519 520 521 522 523
LATITUDE (N_POINTS) float64 37.28 37.28 37.28 ... 33.07 33.07
LONGITUDE (N_POINTS) float64 -66.77 -66.77 ... -64.59 -64.59
TIME (N_POINTS) datetime64[ns] 2011-01-02T11:14:06 ... ...
Data variables: (12/13)
CONFIG_MISSION_NUMBER (N_POINTS) int64 1 1 1 1 1 1 1 ... 13 13 13 13 13 13
CYCLE_NUMBER (N_POINTS) int64 150 150 150 150 150 ... 13 13 13 13
DATA_MODE (N_POINTS) <U1 'D' 'D' 'D' 'D' ... 'D' 'D' 'D' 'D'
DIRECTION (N_POINTS) <U1 'A' 'A' 'A' 'A' ... 'A' 'A' 'A' 'A'
PLATFORM_NUMBER (N_POINTS) int64 4900803 4900803 ... 5903377 5903377
POSITION_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
... ...
PRES_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
PSAL (N_POINTS) float64 36.67 36.67 36.67 ... 36.67 36.67
PSAL_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
TEMP (N_POINTS) float64 19.46 19.47 19.47 ... 19.2 19.2
TEMP_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
TIME_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
Attributes:
DATA_ID: ARGO
DOI: http://doi.org/10.17882/42182
Fetched_from: https://www.ifremer.fr/erddap
Fetched_by: docs
Fetched_date: 2021/11/02
Fetched_constraints: [x=-75.00/-55.00; y=30.00/40.00; z=0.0/100.0; t=201...
Fetched_uri: ['https://www.ifremer.fr/erddap/tabledap/ArgoFloats...
history: Variables filtered according to DATA_MODE; Variable...
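The list passed to region() in In [2] packs eight values. Judging from the Fetched_constraints attribute above (x, y, z, t), they are the longitude, latitude, pressure and time bounds, in that order. The sketch below makes that ordering explicit; make_region_box is a hypothetical helper written for this page, not an argopy function, and its sanity checks are assumptions rather than argopy's own validation.

```python
from datetime import date


def make_region_box(lon, lat, pres, dates):
    """Hypothetical helper (not part of argopy): build a region box as
    [lon_min, lon_max, lat_min, lat_max, pres_min, pres_max, date_min, date_max].
    """
    lon_min, lon_max = lon
    lat_min, lat_max = lat
    pres_min, pres_max = pres
    date_min, date_max = dates
    # Assumed sanity checks: every (min, max) pair must be strictly increasing
    if not (lon_min < lon_max and lat_min < lat_max and pres_min < pres_max):
        raise ValueError("each (min, max) pair must be strictly increasing")
    # Dates are ISO 8601 strings; parse them so the comparison is chronological
    if date.fromisoformat(date_min) >= date.fromisoformat(date_max):
        raise ValueError("date_min must be earlier than date_max")
    return [lon_min, lon_max, lat_min, lat_max,
            pres_min, pres_max, date_min, date_max]


# Same box as in In [2]
box = make_region_box(lon=(-75, -55), lat=(30.0, 40.0), pres=(0, 100.0),
                      dates=("2011-01-01", "2011-01-15"))
print(box)
```

The resulting list can be passed as is to ArgoDataFetcher().region(box).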
If you prefer to work with a 2D array collection of vertical profiles, transform the dataset with argopy.xarray.ArgoAccessor.point2profile():
In [5]: ds_profiles = ds_points.argo.point2profile()
In [6]: ds_profiles
Out[6]:
<xarray.Dataset>
Dimensions: (N_PROF: 18, N_LEVELS: 50)
Coordinates:
* N_PROF (N_PROF) int64 7 13 15 0 6 2 9 ... 12 10 17 3 8 14 16
* N_LEVELS (N_LEVELS) int64 0 1 2 3 4 5 6 ... 44 45 46 47 48 49
LATITUDE (N_PROF) float64 37.28 33.98 32.88 ... 34.39 33.07
LONGITUDE (N_PROF) float64 -66.77 -71.17 ... -72.75 -64.59
TIME (N_PROF) datetime64[ns] 2011-01-02T11:14:06 ... 20...
Data variables: (12/13)
CONFIG_MISSION_NUMBER (N_PROF) int64 1 1 11 1 1 1 0 1 1 1 1 1 1 2 1 1 1 13
CYCLE_NUMBER (N_PROF) int64 150 3 11 100 180 ... 62 148 151 4 13
DATA_MODE (N_PROF) <U1 'D' 'D' 'D' 'D' 'D' ... 'D' 'D' 'D' 'D'
DIRECTION (N_PROF) <U1 'A' 'A' 'A' 'A' 'A' ... 'A' 'A' 'A' 'A'
PLATFORM_NUMBER (N_PROF) int64 4900803 4901218 ... 4901218 5903377
POSITION_QC (N_PROF) int64 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
... ...
PRES_QC (N_PROF) int64 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
PSAL (N_PROF, N_LEVELS) float64 36.67 36.67 ... 36.67 nan
PSAL_QC (N_PROF) int64 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
TEMP (N_PROF, N_LEVELS) float64 19.46 19.47 ... 19.2 nan
TEMP_QC (N_PROF) int64 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
TIME_QC (N_PROF) int64 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Attributes:
DATA_ID: ARGO
DOI: http://doi.org/10.17882/42182
Fetched_from: https://www.ifremer.fr/erddap
Fetched_by: docs
Fetched_date: 2021/11/02
Fetched_constraints: [x=-75.00/-55.00; y=30.00/40.00; z=0.0/100.0; t=201...
Fetched_uri: ['https://www.ifremer.fr/erddap/tabledap/ArgoFloats...
history: Variables filtered according to DATA_MODE; Variable...
You can reverse this transformation with argopy.xarray.ArgoAccessor.profile2point():
In [7]: ds = ds_profiles.argo.profile2point()
In [8]: ds
Out[8]:
<xarray.Dataset>
Dimensions: (N_POINTS: 524)
Coordinates:
LONGITUDE (N_POINTS) float64 -66.77 -66.77 ... -64.59 -64.59
TIME (N_POINTS) datetime64[ns] 2011-01-02T11:14:06 ... ...
LATITUDE (N_POINTS) float64 37.28 37.28 37.28 ... 33.07 33.07
* N_POINTS (N_POINTS) int64 0 1 2 3 4 5 ... 519 520 521 522 523
Data variables: (12/13)
CONFIG_MISSION_NUMBER (N_POINTS) int64 1 1 1 1 1 1 1 ... 13 13 13 13 13 13
CYCLE_NUMBER (N_POINTS) int64 150 150 150 150 150 ... 13 13 13 13
DATA_MODE (N_POINTS) <U1 'D' 'D' 'D' 'D' ... 'D' 'D' 'D' 'D'
DIRECTION (N_POINTS) <U1 'A' 'A' 'A' 'A' ... 'A' 'A' 'A' 'A'
PLATFORM_NUMBER (N_POINTS) int64 4900803 4900803 ... 5903377 5903377
POSITION_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
... ...
PRES_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
PSAL (N_POINTS) float64 36.67 36.67 36.67 ... 36.67 36.67
PSAL_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
TEMP (N_POINTS) float64 19.46 19.47 19.47 ... 19.2 19.2
TEMP_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
TIME_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
Attributes:
DATA_ID: ARGO
DOI: http://doi.org/10.17882/42182
Fetched_from: https://www.ifremer.fr/erddap
Fetched_by: docs
Fetched_date: 2021/11/02
Fetched_constraints: [x=-75.00/-55.00; y=30.00/40.00; z=0.0/100.0; t=201...
Fetched_uri: ['https://www.ifremer.fr/erddap/tabledap/ArgoFloats...
history: Variables filtered according to DATA_MODE; Variable...
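To make the two layouts concrete, here is a minimal, library-free sketch of the idea: a flat list of measurements is grouped into one row per profile, and shorter rows are padded with NaN (which is why PSAL and TEMP end with nan in Out[6]). This is only a conceptual illustration, not argopy's implementation. The function names are hypothetical, and identifying a profile by its (PLATFORM_NUMBER, CYCLE_NUMBER) pair is an assumption made for this sketch.

```python
import math


def points_to_profiles(platform, cycle, values):
    """Group flat measurements into one NaN-padded row per profile."""
    groups = {}  # dicts keep insertion order, so profiles stay in order of first appearance
    for wmo, cyc, val in zip(platform, cycle, values):
        groups.setdefault((wmo, cyc), []).append(val)
    # The longest profile sets the number of levels (the N_LEVELS dimension)
    n_levels = max(len(row) for row in groups.values())
    keys = list(groups)
    table = [groups[k] + [math.nan] * (n_levels - len(groups[k])) for k in keys]
    return keys, table


def profiles_to_points(keys, table):
    """Flatten the padded table back into parallel lists of points."""
    platform, cycle, values = [], [], []
    for (wmo, cyc), row in zip(keys, table):
        for val in row:
            # Padding is dropped; in this sketch a genuinely missing (NaN)
            # measurement would be dropped as well
            if math.isnan(val):
                continue
            platform.append(wmo)
            cycle.append(cyc)
            values.append(val)
    return platform, cycle, values


# Toy data: two profiles with 3 and 2 measurements
platform = [4900803, 4900803, 4900803, 5903377, 5903377]
cycle = [150, 150, 150, 13, 13]
temp = [19.46, 19.47, 19.47, 19.2, 19.2]

keys, table = points_to_profiles(platform, cycle, temp)
print(len(table), "profiles x", len(table[0]), "levels")

restored = profiles_to_points(keys, table)
print(restored == (platform, cycle, temp))
```

The round trip returns the original points, mirroring how profile2point() brings the dataset back to 524 points in Out[8].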
Interpolation to standard levels¶
Once your dataset is a collection of vertical profiles, you can interpolate variables onto standard pressure levels using argopy.xarray.ArgoAccessor.interp_std_levels(), with your levels as input:
In [9]: ds_interp = ds_profiles.argo.interp_std_levels([0,10,20,30,40,50])
In [10]: ds_interp
Out[10]:
<xarray.Dataset>
Dimensions: (N_PROF: 18, PRES_INTERPOLATED: 6)
Coordinates:
* N_PROF (N_PROF) int64 7 13 15 0 6 2 9 ... 12 10 17 3 8 14 16
LATITUDE (N_PROF) float64 37.28 33.98 32.88 ... 34.39 33.07
LONGITUDE (N_PROF) float64 -66.77 -71.17 ... -72.75 -64.59
TIME (N_PROF) datetime64[ns] 2011-01-02T11:14:06 ... 20...
* PRES_INTERPOLATED (PRES_INTERPOLATED) int64 0 10 20 30 40 50
Data variables:
CONFIG_MISSION_NUMBER (N_PROF) float64 1.0 1.0 11.0 1.0 ... 1.0 1.0 13.0
CYCLE_NUMBER (N_PROF) float64 150.0 3.0 11.0 ... 151.0 4.0 13.0
DATA_MODE (N_PROF) object 'D' 'D' 'D' 'D' ... 'D' 'D' 'D' 'D'
DIRECTION (N_PROF) object 'A' 'A' 'A' 'A' ... 'A' 'A' 'A' 'A'
PLATFORM_NUMBER (N_PROF) float64 4.901e+06 4.901e+06 ... 5.903e+06
PRES (N_PROF, PRES_INTERPOLATED) float64 5.0 10.0 ... 50.0
PSAL (N_PROF, PRES_INTERPOLATED) float64 36.67 ... 36.68
TEMP (N_PROF, PRES_INTERPOLATED) float64 19.46 ... 19.24
Attributes:
DATA_ID: ARGO
DOI: http://doi.org/10.17882/42182
Fetched_from: https://www.ifremer.fr/erddap
Fetched_by: docs
Fetched_date: 2021/11/02
Fetched_constraints: [x=-75.00/-55.00; y=30.00/40.00; z=0.0/100.0; t=201...
Fetched_uri: ['https://www.ifremer.fr/erddap/tabledap/ArgoFloats...
history: Variables filtered according to DATA_MODE; Variable...
Note on the linear interpolation process:
- Only profiles whose maximum pressure is higher than the highest (deepest) standard level are selected for interpolation.
- Remaining profiles must have at least five data points to allow interpolation.
- For each profile, the shallowest data point is repeated to the surface to allow a 0 standard level while avoiding extrapolation.
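The rules in the note above can be sketched for a single profile in plain Python. This is an illustration of the stated rules only, not argopy's code: interp_profile and its min_points parameter are hypothetical names, and the sketch assumes pressures are sorted in strictly increasing order and that standard levels are non-negative.

```python
from bisect import bisect_right


def interp_profile(pres, values, std_levels, min_points=5):
    """Linearly interpolate one profile onto std_levels.

    Returns a list of interpolated values, or None if the profile is rejected.
    Assumes pres is strictly increasing and std_levels are >= 0.
    """
    # Rule 1: the profile must reach deeper than the deepest standard level
    if max(pres) <= max(std_levels):
        return None
    # Rule 2: at least five data points are required
    if len(pres) < min_points:
        return None
    # Rule 3: repeat the shallowest data point at the surface (0 dbar), so a
    # 0 standard level is bracketed by data and no extrapolation is needed
    if pres[0] > 0:
        pres = [0.0] + list(pres)
        values = [values[0]] + list(values)
    result = []
    for level in std_levels:
        # Index of the first pressure strictly greater than the level;
        # rules 1 and 3 guarantee that 1 <= i <= len(pres) - 1
        i = bisect_right(pres, level)
        p0, p1 = pres[i - 1], pres[i]
        v0, v1 = values[i - 1], values[i]
        result.append(v0 + (v1 - v0) * (level - p0) / (p1 - p0))
    return result


std_levels = [0, 10, 20, 30, 40, 50]

# Six points reaching 55 dbar: accepted
temp_std = interp_profile([5, 15, 25, 35, 45, 55],
                          [20, 19, 18, 17, 16, 15], std_levels)
print(temp_std)

# Deepest point (45 dbar) is shallower than the 50 dbar level: rejected
too_shallow = interp_profile([5, 15, 25, 35, 45],
                             [20, 19, 18, 17, 16], std_levels)

# Deep enough, but only three data points: rejected
too_sparse = interp_profile([5, 30, 60], [20, 18, 15], std_levels)
print(too_shallow, too_sparse)
```

In this toy profile the value at the 0 level equals the shallowest measurement (taken at 5 dbar), which is the behaviour the third rule describes.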
Filters¶
If you fetched data in expert mode, you may want to use filters to help you curate the data.
[To be added]
Complementary data¶
TEOS-10 variables¶
You can compute additional ocean variables from TEOS-10. The default list of variables is: ‘SA’, ‘CT’, ‘SIG0’, ‘N2’, ‘PV’, ‘PTEMP’ (‘SOUND_SPEED’ is optional). Raise an issue if you would like another variable to be added.
This can be done using the argopy.xarray.ArgoAccessor.teos10() method, indicating the list of variables you want to compute:
In [11]: ds = ArgoDataFetcher().float(2901623).to_xarray()
In [12]: ds.argo.teos10(['SA', 'CT', 'PV'])
Out[12]:
<xarray.Dataset>
Dimensions: (N_POINTS: 8339)
Coordinates:
* N_POINTS (N_POINTS) int64 0 1 2 3 4 ... 8335 8336 8337 8338
LATITUDE (N_POINTS) float64 0.012 0.012 0.012 ... 3.388 3.388
LONGITUDE (N_POINTS) float64 92.28 92.28 92.28 ... 94.77 94.77
TIME (N_POINTS) datetime64[ns] 2010-05-14T03:35:00 ... ...
Data variables: (12/16)
CONFIG_MISSION_NUMBER (N_POINTS) int64 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
CYCLE_NUMBER (N_POINTS) int64 0 0 0 0 0 0 0 ... 96 96 96 96 96 96
DATA_MODE (N_POINTS) <U1 'R' 'R' 'R' 'R' ... 'R' 'R' 'R' 'R'
DIRECTION (N_POINTS) <U1 'D' 'D' 'D' 'D' ... 'A' 'A' 'A' 'A'
PLATFORM_NUMBER (N_POINTS) int64 2901623 2901623 ... 2901623 2901623
POSITION_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
... ...
TEMP (N_POINTS) float64 30.16 30.17 30.17 ... 6.189 6.071
TEMP_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
TIME_QC (N_POINTS) int64 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1
SA (N_POINTS) float64 34.44 34.44 34.44 ... 35.09 35.08
CT (N_POINTS) float64 30.2 30.2 30.2 ... 6.078 5.959
PV (N_POINTS) float64 nan -1.79e-15 ... 1.574e-12 nan
Attributes:
DATA_ID: ARGO
DOI: http://doi.org/10.17882/42182
Fetched_from: https://www.ifremer.fr/erddap
Fetched_by: docs
Fetched_date: 2021/11/02
Fetched_constraints: phy;WMO2901623
Fetched_uri: ['https://www.ifremer.fr/erddap/tabledap/ArgoFloats...
history: Variables filtered according to DATA_MODE; Variable...
In [13]: ds['SA']
Out[13]:
<xarray.DataArray 'SA' (N_POINTS: 8339)>
array([34.43589931, 34.43691224, 34.43692096, ..., 35.0921157 ,
35.09227648, 35.08238554])
Coordinates:
* N_POINTS (N_POINTS) int64 0 1 2 3 4 5 6 ... 8333 8334 8335 8336 8337 8338
LATITUDE (N_POINTS) float64 0.012 0.012 0.012 0.012 ... 3.388 3.388 3.388
LONGITUDE (N_POINTS) float64 92.28 92.28 92.28 92.28 ... 94.77 94.77 94.77
TIME (N_POINTS) datetime64[ns] 2010-05-14T03:35:00 ... 2013-01-01T0...
Attributes:
long_name: Absolute Salinity
standard_name: sea_water_absolute_salinity
unit: g/kg
Data models¶
By default, argopy works with xarray.Dataset and comes with the accessor namespace argo (see the xarray documentation for more on accessors).
For your own analysis, you may prefer to work with a pandas DataFrame:
In [14]: df = ArgoDataFetcher().profile(6902746, 34).to_dataframe()
In [15]: df
Out[15]:
CONFIG_MISSION_NUMBER CYCLE_NUMBER ... LONGITUDE TIME
N_POINTS ...
0 2 34 ... -58.119 2017-12-20 06:58:00
1 2 34 ... -58.119 2017-12-20 06:58:00
2 2 34 ... -58.119 2017-12-20 06:58:00
3 2 34 ... -58.119 2017-12-20 06:58:00
4 2 34 ... -58.119 2017-12-20 06:58:00
... ... ... ... ... ...
104 2 34 ... -58.119 2017-12-20 06:58:00
105 2 34 ... -58.119 2017-12-20 06:58:00
106 2 34 ... -58.119 2017-12-20 06:58:00
107 2 34 ... -58.119 2017-12-20 06:58:00
108 2 34 ... -58.119 2017-12-20 06:58:00
[109 rows x 16 columns]
Keep in mind, however, that this is merely a shortcut for the xarray.Dataset.to_dataframe() method.
Saving data¶
Once you have your Argo data as an xarray.Dataset, simply use xarray's own export methods, such as xarray.Dataset.to_netcdf() or xarray.Dataset.to_zarr().