Dataset#


Hint

argopy distinguishes between physical and biogeochemical parameters. To make sure you understand which data you're getting, have a look at this section.

Argo data are distributed as a single dataset. It is referenced at https://doi.org/10.17882/42182.

But there are several Argo missions, with specific files and parameters that need special handling by argopy, namely:

  • the core Argo Mission: from floats that measure temperature, salinity and pressure down to 2000 m,

  • the Deep Argo Mission: from floats that measure temperature, salinity and pressure down to 6000 m,

  • and the BGC-Argo Mission: from floats that measure temperature, salinity and pressure, as well as oxygen, pH, nitrate, chlorophyll, backscatter and irradiance, down to 2000 m.

Argo dataset available in argopy#

argopy simply distinguishes between physical and biogeochemical parameters in the Argo dataset. This is because Deep Argo mission data are accessible through the same files and parameters as those of the Core mission. Only BGC-Argo data require specific files and parameters.

In argopy you can thus get access to the following Argo data:

  1. the phy dataset, for physical parameters.

    This dataset provides data from floats that measure temperature, salinity and pressure, without limitation in depth. It is available from all Available data sources. Since this is the most common Argo data subset, it is selected by default in argopy, with the phy keyword.

  2. the bgc dataset, for biogeochemical parameters.

    This dataset provides data from floats that measure temperature, salinity and pressure, as well as oxygen, pH, nitrate, chlorophyll, backscatter and irradiance, without limitation in depth. You can select this dataset with the keyword bgc and the methods described below.

Selecting a dataset#

You have several ways to specify which dataset you want to use:

  • using argopy global options:

In [1]: import argopy

In [2]: argopy.set_options(dataset='bgc')
Out[2]: <argopy.options.set_options at 0x7fd6d9a6b5e0>

  • with an option in a temporary context:

In [3]: import argopy

In [4]: with argopy.set_options(dataset='phy'):
   ...:     argopy.DataFetcher().profile(6904241, 12)
   ...:

  • with the `ds` argument in the data fetcher:

In [5]: argopy.DataFetcher(ds='phy').profile(6902746, 34)
Out[5]: 
<datafetcher.erddap>
Name: Ifremer erddap Argo data fetcher for profiles
API: https://erddap.ifremer.fr/erddap
Domain: WMO6902746_CYC34
Performances: cache=False, parallel=False
User mode: standard
Dataset: phy

Note

In the future, we could consider adding more mission-specific keywords for the dataset option and the ds argument of DataFetcher. This could be deep, for instance. Please raise a GitHub "issue" if you require such a new feature.

The bgc dataset#

Important

At this time, BGC parameters are only available in expert user mode and with the erddap data source.

All argopy features work with the phy dataset. However, there are some specific methods dedicated to the bgc dataset, which we now describe.

Specifics in DataFetcher#

The BGC-Argo Mission gathers data from floats that measure temperature, salinity and pressure, as well as oxygen, pH, nitrate, chlorophyll, backscatter and irradiance, down to 2000 m. However, beyond this short BGC parameter list, the Argo dataset contains more than 120 BGC-related variables. Therefore, we implemented two specific arguments in the DataFetcher to handle BGC variables: params and measured.

The params argument#

With a DataFetcher, the params argument can be used to specify which variables will be returned, regardless of their values or availability in the BGC floats found in the data selection.

By default, the params argument is set to the keyword all, which returns all variables found in the data selection. But the params argument can also be a single variable or a list of variables, in which case only these will be returned and all the others discarded.

Syntax example

  • To return data from a single BGC parameter, just add it as a string, for instance 'DOXY'.

  • To return more than one BGC parameter, give them as a list of strings, for instance ['DOXY', 'BBP700'].

  • To retrieve all available BGC parameters, you can omit the params argument (since this is the default value), or give it explicitly as 'all'.

In [6]: import argopy

In [7]: with argopy.set_options(dataset='bgc', src='erddap', mode='expert'):
   ...:     params = 'all'  # eg: 'DOXY' or ['DOXY', 'BBP700']
   ...:     f = argopy.DataFetcher(params=params)
   ...:     f = f.region([-75, -45, 20, 30, 0, 10, '2021-01', '2021-06'])
   ...:     f.load()
   ...: 

In [8]: print(f.data.argo, "\n")  # Easy print of N profiles and points
<xarray.Dataset.argo>
This is a collection of Argo points
N_POINTS(619) ~ N_PROF(46) x N_LEVELS(26) 


In [9]: print(list(f.data.data_vars))  # List of dataset variables
['BBP532', 'BBP532_ADJUSTED', 'BBP532_ADJUSTED_ERROR', 'BBP532_ADJUSTED_QC', 'BBP532_DATA_MODE', 'BBP532_QC', 'BBP700', 'BBP700_ADJUSTED', 'BBP700_ADJUSTED_ERROR', 'BBP700_ADJUSTED_QC', 'BBP700_DATA_MODE', 'BBP700_QC', 'CHLA', 'CHLA_ADJUSTED', 'CHLA_ADJUSTED_ERROR', 'CHLA_ADJUSTED_QC', 'CHLA_DATA_MODE', 'CHLA_QC', 'CONFIG_MISSION_NUMBER', 'CYCLE_NUMBER', 'DIRECTION', 'DOWNWELLING_PAR', 'DOWNWELLING_PAR_ADJUSTED', 'DOWNWELLING_PAR_ADJUSTED_ERROR', 'DOWNWELLING_PAR_ADJUSTED_QC', 'DOWNWELLING_PAR_DATA_MODE', 'DOWNWELLING_PAR_QC', 'DOWN_IRRADIANCE380', 'DOWN_IRRADIANCE380_ADJUSTED', 'DOWN_IRRADIANCE380_ADJUSTED_ERROR', 'DOWN_IRRADIANCE380_ADJUSTED_QC', 'DOWN_IRRADIANCE380_DATA_MODE', 'DOWN_IRRADIANCE380_QC', 'DOWN_IRRADIANCE412', 'DOWN_IRRADIANCE412_ADJUSTED', 'DOWN_IRRADIANCE412_ADJUSTED_ERROR', 'DOWN_IRRADIANCE412_ADJUSTED_QC', 'DOWN_IRRADIANCE412_DATA_MODE', 'DOWN_IRRADIANCE412_QC', 'DOWN_IRRADIANCE490', 'DOWN_IRRADIANCE490_ADJUSTED', 'DOWN_IRRADIANCE490_ADJUSTED_ERROR', 'DOWN_IRRADIANCE490_ADJUSTED_QC', 'DOWN_IRRADIANCE490_DATA_MODE', 'DOWN_IRRADIANCE490_QC', 'DOXY', 'DOXY_ADJUSTED', 'DOXY_ADJUSTED_ERROR', 'DOXY_ADJUSTED_QC', 'DOXY_DATA_MODE', 'DOXY_QC', 'NITRATE', 'NITRATE_ADJUSTED', 'NITRATE_ADJUSTED_ERROR', 'NITRATE_ADJUSTED_QC', 'NITRATE_DATA_MODE', 'NITRATE_QC', 'PH_IN_SITU_TOTAL', 'PH_IN_SITU_TOTAL_ADJUSTED', 'PH_IN_SITU_TOTAL_ADJUSTED_ERROR', 'PH_IN_SITU_TOTAL_ADJUSTED_QC', 'PH_IN_SITU_TOTAL_DATA_MODE', 'PH_IN_SITU_TOTAL_QC', 'PLATFORM_NUMBER', 'POSITION_QC', 'PRES', 'PRES_ADJUSTED', 'PRES_ADJUSTED_ERROR', 'PRES_ADJUSTED_QC', 'PRES_DATA_MODE', 'PRES_QC', 'PSAL', 'PSAL_ADJUSTED', 'PSAL_ADJUSTED_ERROR', 'PSAL_ADJUSTED_QC', 'PSAL_DATA_MODE', 'PSAL_QC', 'TEMP', 'TEMP_ADJUSTED', 'TEMP_ADJUSTED_ERROR', 'TEMP_ADJUSTED_QC', 'TEMP_DATA_MODE', 'TEMP_QC', 'TIME_QC']
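
The three accepted forms of the params argument (the keyword all, a single string, or a list of strings) can be pictured with a small pure-Python sketch. This is a conceptual illustration only, not argopy code: the helper `resolve_params` and the toy list `AVAILABLE` are hypothetical names introduced here, and the real fetcher works on the variables it finds in the data selection.

```python
# Conceptual sketch only: this is NOT argopy code.
# `resolve_params` and `AVAILABLE` are hypothetical names used for illustration.

AVAILABLE = ["BBP700", "CHLA", "DOXY", "NITRATE"]  # toy list of BGC variables


def resolve_params(params="all", available=AVAILABLE):
    """Turn a `params`-like value into an explicit list of variable names."""
    if isinstance(params, str):
        if params == "all":
            return list(available)  # keyword: keep every variable found
        return [params]  # a single name behaves like a one-item list
    return list(params)  # already a list of names


print(resolve_params())                    # every toy variable
print(resolve_params("DOXY"))              # ['DOXY']
print(resolve_params(["DOXY", "BBP700"]))  # ['DOXY', 'BBP700']
```

In other words, passing 'DOXY' and passing ['DOXY'] are assumed to request the same thing, and omitting params is the same as asking for everything.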

The measured argument#

With a DataFetcher, the measured argument can be used to specify which variables cannot be NaN and must return values. This is very useful to reduce a dataset to points where all or some variables are available.

By default, the measured argument is set to None, which leaves parameter values unconstrained. Conversely, the keyword all requires that none of the variables found in the data selection be NaN. In between, you can specify one or more parameters to limit the constraint to a few variables.

Syntax example

Let's impose that some variables cannot be NaNs, for instance DOXY and BBP700:

In [10]: import argopy

In [11]: with argopy.set_options(dataset='bgc', src='erddap', mode='expert'):
   ....:     f = argopy.DataFetcher(params='all', measured=['DOXY', 'BBP700'])
   ....:     f = f.region([-75, -45, 20, 30, 0, 10, '2021-01', '2021-06'])
   ....:     f.load()
   ....: 

In [12]: print(f.data.argo, "\n")  # Easy print of N profiles and points
<xarray.Dataset.argo>
This is a collection of Argo points
N_POINTS(45) ~ N_PROF(45) x N_LEVELS(1) 


In [13]: print(list(f.data.data_vars))  # List of dataset variables
['BBP532', 'BBP532_ADJUSTED', 'BBP532_ADJUSTED_ERROR', 'BBP532_ADJUSTED_QC', 'BBP532_DATA_MODE', 'BBP532_QC', 'BBP700', 'BBP700_ADJUSTED', 'BBP700_ADJUSTED_ERROR', 'BBP700_ADJUSTED_QC', 'BBP700_DATA_MODE', 'BBP700_QC', 'CHLA', 'CHLA_ADJUSTED', 'CHLA_ADJUSTED_ERROR', 'CHLA_ADJUSTED_QC', 'CHLA_DATA_MODE', 'CHLA_QC', 'CONFIG_MISSION_NUMBER', 'CYCLE_NUMBER', 'DIRECTION', 'DOWNWELLING_PAR', 'DOWNWELLING_PAR_ADJUSTED', 'DOWNWELLING_PAR_ADJUSTED_ERROR', 'DOWNWELLING_PAR_ADJUSTED_QC', 'DOWNWELLING_PAR_DATA_MODE', 'DOWNWELLING_PAR_QC', 'DOWN_IRRADIANCE380', 'DOWN_IRRADIANCE380_ADJUSTED', 'DOWN_IRRADIANCE380_ADJUSTED_ERROR', 'DOWN_IRRADIANCE380_ADJUSTED_QC', 'DOWN_IRRADIANCE380_DATA_MODE', 'DOWN_IRRADIANCE380_QC', 'DOWN_IRRADIANCE412', 'DOWN_IRRADIANCE412_ADJUSTED', 'DOWN_IRRADIANCE412_ADJUSTED_ERROR', 'DOWN_IRRADIANCE412_ADJUSTED_QC', 'DOWN_IRRADIANCE412_DATA_MODE', 'DOWN_IRRADIANCE412_QC', 'DOWN_IRRADIANCE490', 'DOWN_IRRADIANCE490_ADJUSTED', 'DOWN_IRRADIANCE490_ADJUSTED_ERROR', 'DOWN_IRRADIANCE490_ADJUSTED_QC', 'DOWN_IRRADIANCE490_DATA_MODE', 'DOWN_IRRADIANCE490_QC', 'DOXY', 'DOXY_ADJUSTED', 'DOXY_ADJUSTED_ERROR', 'DOXY_ADJUSTED_QC', 'DOXY_DATA_MODE', 'DOXY_QC', 'NITRATE', 'NITRATE_ADJUSTED', 'NITRATE_ADJUSTED_ERROR', 'NITRATE_ADJUSTED_QC', 'NITRATE_DATA_MODE', 'NITRATE_QC', 'PH_IN_SITU_TOTAL', 'PH_IN_SITU_TOTAL_ADJUSTED', 'PH_IN_SITU_TOTAL_ADJUSTED_ERROR', 'PH_IN_SITU_TOTAL_ADJUSTED_QC', 'PH_IN_SITU_TOTAL_DATA_MODE', 'PH_IN_SITU_TOTAL_QC', 'PLATFORM_NUMBER', 'POSITION_QC', 'PRES', 'PRES_ADJUSTED', 'PRES_ADJUSTED_ERROR', 'PRES_ADJUSTED_QC', 'PRES_DATA_MODE', 'PRES_QC', 'PSAL', 'PSAL_ADJUSTED', 'PSAL_ADJUSTED_ERROR', 'PSAL_ADJUSTED_QC', 'PSAL_DATA_MODE', 'PSAL_QC', 'TEMP', 'TEMP_ADJUSTED', 'TEMP_ADJUSTED_ERROR', 'TEMP_ADJUSTED_QC', 'TEMP_DATA_MODE', 'TEMP_QC', 'TIME_QC']

We can see from f.data.argo.N_POINTS that the dataset is reduced: 45 points remain, against 619 in the previous example, which had no constraint on measured variables.
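
To make the effect of measured more concrete, here is a minimal pure-Python sketch of the filtering idea on made-up values. It is not how argopy implements the constraint: the function `keep_measured` and the toy `points` list are hypothetical, and accepting a single string for measured is an assumption made here by analogy with params.

```python
import math

# Toy "points": one dict per measurement. Values are made up for illustration.
points = [
    {"PRES": 5.0, "DOXY": 210.0, "BBP700": 0.0012},
    {"PRES": 10.0, "DOXY": math.nan, "BBP700": 0.0011},
    {"PRES": 15.0, "DOXY": 205.0, "BBP700": math.nan},
    {"PRES": 20.0, "DOXY": 201.0, "BBP700": 0.0009},
]


def keep_measured(points, measured=None):
    """Keep only the points where every variable in `measured` is not NaN.

    Hypothetical helper mimicking the documented behaviour, not argopy code.
    """
    if measured is None:
        return list(points)  # default: no constraint on values
    if measured == "all":
        # Constrain every variable present in the toy points
        measured = sorted({name for p in points for name in p})
    elif isinstance(measured, str):
        measured = [measured]  # assumption: one name acts like a one-item list
    return [p for p in points if not any(math.isnan(p[name]) for name in measured)]


print(len(keep_measured(points)))                               # 4
print(len(keep_measured(points, measured="DOXY")))              # 3
print(len(keep_measured(points, measured=["DOXY", "BBP700"])))  # 2
print(len(keep_measured(points, measured="all")))               # 2
```

The more variables are listed in measured, the fewer points survive, which is the same pattern as the drop from 619 to 45 points observed above.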

Specifics in ArgoIndex#

All details and examples of the BGC-specific methods for ArgoIndex can be found in: Usage with bgc index.