Argo meta-data
Index of profiles
Since the Argo measurements dataset is quite complex, it comes with a collection of index files: lookup tables with metadata. These index files help you determine what to expect before retrieving the full set of measurements.
argopy provides two methods to work with Argo index files: one is high-level and works like the data fetcher, the other is low-level and works like a "store".
Fetcher: High-level Argo index access
argopy has a specific fetcher for index files:
In [1]: from argopy import IndexFetcher as ArgoIndexFetcher
You can use the Index fetcher with the region or float access points, similarly to data fetching:
In [2]: idx = ArgoIndexFetcher(src='gdac').float(2901623).load()
In [3]: idx.index
Out[3]:
file ... profiler
0 nmdis/2901623/profiles/R2901623_000.nc ... Provor, Seabird conductivity sensor
1 nmdis/2901623/profiles/R2901623_000D.nc ... Provor, Seabird conductivity sensor
2 nmdis/2901623/profiles/R2901623_001.nc ... Provor, Seabird conductivity sensor
3 nmdis/2901623/profiles/R2901623_002.nc ... Provor, Seabird conductivity sensor
4 nmdis/2901623/profiles/R2901623_003.nc ... Provor, Seabird conductivity sensor
.. ... ... ...
93 nmdis/2901623/profiles/R2901623_092.nc ... Provor, Seabird conductivity sensor
94 nmdis/2901623/profiles/R2901623_093.nc ... Provor, Seabird conductivity sensor
95 nmdis/2901623/profiles/R2901623_094.nc ... Provor, Seabird conductivity sensor
96 nmdis/2901623/profiles/R2901623_095.nc ... Provor, Seabird conductivity sensor
97 nmdis/2901623/profiles/R2901623_096.nc ... Provor, Seabird conductivity sensor
[98 rows x 11 columns]
Alternatively, you can use argopy.IndexFetcher.to_dataframe():
In [4]: idx = ArgoIndexFetcher(src='gdac').float(2901623)
In [5]: df = idx.to_dataframe()
The difference is that with the load method, data are kept in memory and not fetched again on every access to the index attribute.
The index fetcher has pretty much the same methods as the data fetchers. You can check them all here: argopy.fetchers.ArgoIndexFetcher.
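Since the index attribute is a plain pandas.DataFrame, the usual pandas tooling applies once it is loaded. Here is a minimal sketch using a toy stand-in for the index; the 'file' column layout follows the standard `<dac>/<wmo>/profiles/` convention seen above, and the derived columns are illustrative, not part of the argopy API:

```python
import pandas as pd

# Toy stand-in for the DataFrame returned by the index attribute.
df = pd.DataFrame({
    "file": [
        "nmdis/2901623/profiles/R2901623_000.nc",
        "nmdis/2901623/profiles/R2901623_000D.nc",
        "nmdis/2901623/profiles/R2901623_001.nc",
    ],
})

# The DAC and float WMO number can be recovered from the 'file' path:
parts = df["file"].str.split("/", expand=True)
df["dac"] = parts[0]
df["wmo"] = parts[1].astype(int)

print(df["wmo"].unique())  # all rows belong to float 2901623
```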
Store: Low-level Argo index access
The IndexFetcher shown above is a user-friendly layer on top of our internal Argo index file store. But if you are familiar with Argo index files and/or care about performance, you may be interested in using the Argo index store ArgoIndex directly.
If Pyarrow is installed, this store will rely on pyarrow.Table as the internal storage format for the index; otherwise it will fall back on pandas.DataFrame. Loading the full Argo profile index takes about 2 to 3 seconds with Pyarrow, while it can take up to 6 or 7 seconds with Pandas.
All index store methods and properties are fully documented in ArgoIndex.
Usage
You create an index store with default or custom options:
In [6]: from argopy import ArgoIndex
In [7]: idx = ArgoIndex()
# or:
# ArgoIndex(index_file="argo_bio-profile_index.txt")
# ArgoIndex(host="ftp://ftp.ifremer.fr/ifremer/argo")
# ArgoIndex(host="https://data-argo.ifremer.fr", index_file="ar_index_global_prof.txt")
# ArgoIndex(host="https://data-argo.ifremer.fr", index_file="ar_index_global_prof.txt", cache=True)
You can then trigger loading of the index content:
In [8]: idx.load() # Load the full index in memory
Here is the list of methods and properties of the full index:
idx.load(nrows=12)  # Only load the first N rows of the index
idx.N_RECORDS  # Shortcut for the length of the 1st dimension of the index array
idx.to_dataframe(index=True)  # Convert the index to a user-friendly pandas.DataFrame
idx.to_dataframe(index=True, nrows=2)  # Only return the first nrows of the index
idx.index  # Internal storage structure of the full index (pyarrow.Table or pandas.DataFrame)
idx.uri_full_index  # List of absolute paths to files from the full index table column 'file'
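For illustration, here is how such absolute paths relate to the index 'file' column on a GDAC server. The '/dac/' path segment is an assumption based on the standard GDAC directory layout, not a guaranteed argopy internal:

```python
# Sketch: on a GDAC server, profile files referenced by the index 'file'
# column live under <host>/dac/<file> (the '/dac/' segment is an assumption
# from the standard GDAC layout).
host = "https://data-argo.ifremer.fr"
files = ["nmdis/2901623/profiles/R2901623_000.nc"]

uris = [f"{host}/dac/{f}" for f in files]
print(uris[0])
```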
There are several methods to search the index, for instance:
In [9]: idx.search_lat_lon_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])
Here is the list of all methods to search the index:
idx.search_wmo(1901393)
idx.search_cyc(1)
idx.search_wmo_cyc(1901393, [1,12])
idx.search_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Takes an index BOX definition; only time is used
idx.search_lat_lon([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Takes an index BOX definition; only lat/lon is used
idx.search_lat_lon_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Takes an index BOX definition
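Conceptually, a box search is just a row filter on the index's longitude/latitude/date columns. Here is a pandas sketch of what search_lat_lon_tim amounts to, on a toy index; column names follow the standard profile index format, and this is an illustration, not the actual argopy implementation:

```python
import pandas as pd

# Toy index with the standard 'longitude', 'latitude', 'date' columns.
df = pd.DataFrame({
    "longitude": [-58.0, -10.0, -57.5],
    "latitude": [42.0, 43.0, 50.0],
    "date": pd.to_datetime(["2007-08-15", "2007-08-20", "2007-08-10"]),
})

# An index BOX definition: [lon_min, lon_max, lat_min, lat_max, t_min, t_max]
BOX = [-60, -55, 40.0, 45.0, "2007-08-01", "2007-09-01"]
in_box = (
    df["longitude"].between(BOX[0], BOX[1])
    & df["latitude"].between(BOX[2], BOX[3])
    & df["date"].between(pd.Timestamp(BOX[4]), pd.Timestamp(BOX[5]))
)
print(df[in_box])  # only the first row matches
```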
And finally the list of methods and properties for search results:
idx.N_MATCH  # Shortcut for the length of the 1st dimension of the search results array
idx.to_dataframe()  # Convert search results to a user-friendly pandas.DataFrame
idx.to_dataframe(nrows=2)  # Only return the first nrows of the search results
idx.to_indexfile("search_index.txt")  # Export search results to an Argo standard index file
idx.search  # Internal table with search results
idx.uri  # List of absolute paths to files from the search results table column 'file'
Hint
The argopy index store supports the Bio and Synthetic Profile directory files:
In [10]: idx = ArgoIndex(index_file="argo_bio-profile_index.txt").load()
# idx = ArgoIndex(index_file="argo_synthetic-profile_index.txt").load()
In [11]: idx
Out[11]:
<argoindex.pandas>
Host: https://data-argo.ifremer.fr
Index: argo_bio-profile_index.txt
Convention: argo_bio-profile_index (Bio-Profile directory file of the Argo GDAC)
Loaded: True (288978 records)
Searched: False
This BGC index store comes with an additional search option on parameters:
In [12]: idx.search_params(['C1PHASE_DOXY', 'DOWNWELLING_PAR'])
Out[12]:
<argoindex.pandas>
Host: https://data-argo.ifremer.fr
Index: argo_bio-profile_index.txt
Convention: argo_bio-profile_index (Bio-Profile directory file of the Argo GDAC)
Loaded: True (288978 records)
Searched: True (38271 matches, 13.2436%)
In [13]: idx.to_dataframe()
Out[13]:
file ... profiler
0 bodc/3901496/profiles/BD3901496_001.nc ... Unknown
1 bodc/3901496/profiles/BD3901496_002.nc ... Unknown
2 bodc/3901496/profiles/BD3901496_003.nc ... Unknown
3 bodc/3901496/profiles/BD3901496_004.nc ... Unknown
4 bodc/3901496/profiles/BD3901496_005.nc ... Unknown
... ... ... ...
38266 csiro/7900947/profiles/BR7900947_018.nc ... Unknown
38267 csiro/7900947/profiles/BR7900947_019.nc ... Unknown
38268 csiro/7900947/profiles/BR7900947_020.nc ... Unknown
38269 csiro/7900947/profiles/BR7900947_021.nc ... Unknown
38270 csiro/7900947/profiles/BR7900947_022.nc ... Unknown
[38271 rows x 13 columns]
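Under the hood, a parameter search amounts to keeping rows whose 'parameters' column (a space-separated list of parameter names in the Bio-Profile index format) contains all requested parameters. A minimal stand-alone sketch of this matching logic, on toy data, not the actual argopy implementation:

```python
import pandas as pd

# Toy BGC index: the 'parameters' column holds space-separated parameter
# names, following the Bio-Profile index format.
df = pd.DataFrame({
    "file": ["a.nc", "b.nc", "c.nc"],
    "parameters": [
        "PRES TEMP C1PHASE_DOXY DOWNWELLING_PAR",
        "PRES TEMP C1PHASE_DOXY",
        "PRES TEMP",
    ],
})

# Keep rows where ALL requested parameters are present:
wanted = ["C1PHASE_DOXY", "DOWNWELLING_PAR"]
mask = df["parameters"].apply(lambda s: all(p in s.split() for p in wanted))
matches = df.loc[mask, "file"].tolist()
print(matches)  # ['a.nc']
```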
Index files supported
The table below summarizes the argopy support status of all Argo index files:
Index file | File name | Supported
---|---|---
Profile | ar_index_global_prof.txt | ✅
Synthetic-Profile | argo_synthetic-profile_index.txt | ✅
Bio-Profile | argo_bio-profile_index.txt | ✅
Trajectory | ar_index_global_traj.txt | 🚧
Bio-Trajectory | argo_bio-traj_index.txt | 🚧
Metadata | ar_index_global_meta.txt | ❌
Technical | ar_index_global_tech.txt | ❌
Greylist | ar_greylist.txt | ❌
Index file support can be added on demand. Click here to raise an issue if you'd like to access other index files.
Reference tables
The Argo netCDF format is strict and based on a collection of variables that are fully documented and standardized. All reference tables can be found in the Argo user manual.
However, machine-to-machine access to these tables is often required. This is possible thanks to the work of the Argo Vocabulary Task Team (AVTT), the team of people responsible for the NVS collections under the Argo Data Management Team governance.
Note
The GitHub organization hosting the AVTT is the "NERC Vocabulary Server (NVS)", aka "nvs-vocabs". This holds a list of NVS collection-specific GitHub repositories. Each Argo GitHub repository is named after its corresponding collection ID (e.g. R01, RR2, R03, etc.). The current list is given here.
The management of issues related to vocabularies managed by the Argo Data Management Team is done on this repository.
argopy provides the utility class ArgoNVSReferenceTables to easily fetch and access all Argo reference tables. If you already know the name of the reference table you want to retrieve, you can simply get it like this:
In [14]: from argopy import ArgoNVSReferenceTables
In [15]: NVS = ArgoNVSReferenceTables()
In [16]: NVS.tbl('R01')
Out[16]:
altLabel ... id
0 BPROF ... http://vocab.nerc.ac.uk/collection/R01/current...
1 BTRAJ ... http://vocab.nerc.ac.uk/collection/R01/current...
2 META ... http://vocab.nerc.ac.uk/collection/R01/current...
3 MPROF ... http://vocab.nerc.ac.uk/collection/R01/current...
4 MTRAJ ... http://vocab.nerc.ac.uk/collection/R01/current...
5 PROF ... http://vocab.nerc.ac.uk/collection/R01/current...
6 SPROF ... http://vocab.nerc.ac.uk/collection/R01/current...
7 TECH ... http://vocab.nerc.ac.uk/collection/R01/current...
8 TRAJ ... http://vocab.nerc.ac.uk/collection/R01/current...
[9 rows x 5 columns]
The reference table is returned as a pandas.DataFrame. If you want the exact name of this table:
In [17]: NVS.tbl_name('R01')
Out[17]:
('DATA_TYPE',
'Terms describing the type of data contained in an Argo netCDF file. Argo netCDF variable DATA_TYPE is populated by R01 prefLabel.',
'http://vocab.nerc.ac.uk/collection/R01/current/')
If you're looking for the ID of a specific reference table, you can check the list of all available tables given by the ArgoNVSReferenceTables.all_tbl_name() property. It returns a dictionary with table IDs as keys, and table name, definition and NVS link as values. Use the ArgoNVSReferenceTables.all_tbl() property to retrieve all tables.
In [18]: NVS.all_tbl_name
Out[18]:
OrderedDict([('R01',
('DATA_TYPE',
'Terms describing the type of data contained in an Argo netCDF file. Argo netCDF variable DATA_TYPE is populated by R01 prefLabel.',
'http://vocab.nerc.ac.uk/collection/R01/current/')),
('R03',
('PARAMETER',
'Terms describing individual measured phenomena, used to mark up sets of data in Argo netCDF arrays. Argo netCDF variables PARAMETER and TRAJECTORY_PARAMETERS are populated by R03 altLabel; R03 altLabel is also used to name netCDF profile files parameter variables <PARAMETER>.',
'http://vocab.nerc.ac.uk/collection/R03/current/')),
('R04',
('DATA_CENTRE_CODES',
'Codes for data centres and institutions handling or managing Argo data. Argo netCDF variable DATA_CENTRE is populated by R04 altLabel.',
'http://vocab.nerc.ac.uk/collection/R04/current/')),
('R05',
('POSITION_ACCURACY',
'Accuracy in latitude and longitude measurements received from the positioning system, grouped by location accuracy classes.',
'http://vocab.nerc.ac.uk/collection/R05/current/')),
('R06',
('DATA_STATE_INDICATOR',
'Processing stage of the data based on the concatenation of processing level and class indicators. Argo netCDF variable DATA_STATE_INDICATOR is populated by R06 altLabel.',
'http://vocab.nerc.ac.uk/collection/R06/current/')),
('R07',
('HISTORY_ACTION',
'Coded history information for each action performed on each profile by a data centre. Argo netCDF variable HISTORY_ACTION is populated by R07 altLabel.',
'http://vocab.nerc.ac.uk/collection/R07/current/')),
('R08',
('ARGO_WMO_INST_TYPE',
"Subset of instrument type codes from the World Meteorological Organization (WMO) Common Code Table C-3 (CCT C-3) 1770, named 'Instrument make and type for water temperature profile measurement with fall rate equation coefficients' and available here: https://library.wmo.int/doc_num.php?explnum_id=11283. Argo netCDF variable WMO_INST_TYPE is populated by R08 altLabel.",
'http://vocab.nerc.ac.uk/collection/R08/current/')),
('R09',
('POSITIONING_SYSTEM',
'List of float location measuring systems. Argo netCDF variable POSITIONING_SYSTEM is populated by R09 altLabel.',
'http://vocab.nerc.ac.uk/collection/R09/current/')),
('R10',
('TRANS_SYSTEM',
'List of telecommunication systems. Argo netCDF variable TRANS_SYSTEM is populated by R10 altLabel.',
'http://vocab.nerc.ac.uk/collection/R10/current/')),
('R11',
('RTQC_TESTID',
'List of real-time quality-control tests and corresponding binary identifiers, used as reference to populate the Argo netCDF HISTORY_QCTEST variable.',
'http://vocab.nerc.ac.uk/collection/R11/current/')),
('R12',
('HISTORY_STEP',
'Data processing step codes for history record. Argo netCDF variable TRANS_SYSTEM is populated by R12 altLabel.',
'http://vocab.nerc.ac.uk/collection/R12/current/')),
('R13',
('OCEAN_CODE',
'Ocean area codes assigned to each profile in the Metadata directory (index) file of the Argo Global Assembly Centre.',
'http://vocab.nerc.ac.uk/collection/R13/current/')),
('R15',
('MEASUREMENT_CODE_ID',
'Measurement code IDs used in Argo Trajectory netCDF files. Argo netCDF variable MEASUREMENT_CODE is populated by R15 altLabel.',
'http://vocab.nerc.ac.uk/collection/R15/current/')),
('R16',
('VERTICAL_SAMPLING_SCHEME',
'Profile sampling schemes and sampling methods. Argo netCDF variable VERTICAL_SAMPLING_SCHEME is populated by R16 altLabel.',
'http://vocab.nerc.ac.uk/collection/R16/current/')),
('R19',
('STATUS',
'Flag scale for values in all Argo netCDF cycle timing variables. Argo netCDF cycle timing variables JULD_<RTV>_STATUS are populated by R19 altLabel.',
'http://vocab.nerc.ac.uk/collection/R19/current/')),
('R20',
('GROUNDED',
'Codes to indicate the best estimate of whether the float touched the ground during a specific cycle. Argo netCDF variable GROUNDED in the Trajectory file is populated by R20 altLabel.',
'http://vocab.nerc.ac.uk/collection/R20/current/')),
('R21',
('REPRESENTATIVE_PARK_PRESSURE_STATUS',
'Argo status flag on the Representative Park Pressure (RPP). Argo netCDF variable REPRESENTATIVE_PARK_PRESSURE_STATUS in the Trajectory file is populated by R21 altLabel.',
'http://vocab.nerc.ac.uk/collection/R21/current/')),
('R22',
('PLATFORM_FAMILY',
'List of platform family/category of Argo floats. Argo netCDF variable PLATFORM_FAMILY is populated by R22 altLabel.',
'http://vocab.nerc.ac.uk/collection/R22/current/')),
('R23',
('PLATFORM_TYPE',
'List of Argo float types. Argo netCDF variable PLATFORM_TYPE is populated by R23 altLabel.',
'http://vocab.nerc.ac.uk/collection/R23/current/')),
('R24',
('PLATFORM_MAKER',
'List of Argo float manufacturers. Argo netCDF variable PLATFORM_MAKER is populated by R24 altLabel.',
'http://vocab.nerc.ac.uk/collection/R24/current/')),
('R25',
('SENSOR',
'Terms describing sensor types mounted on Argo floats. Argo netCDF variable SENSOR is populated by R25 altLabel.',
'http://vocab.nerc.ac.uk/collection/R25/current/')),
('R26',
('SENSOR_MAKER',
'Terms describing developers and manufacturers of sensors mounted on Argo floats. Argo netCDF variable SENSOR_MAKER is populated by R26 altLabel.',
'http://vocab.nerc.ac.uk/collection/R26/current/')),
('R27',
('SENSOR_MODEL',
'Terms listing models of sensors mounted on Argo floats. Note: avoid using the manufacturer name and sensor firmware version in new entries when possible. Argo netCDF variable SENSOR_MODEL is populated by R27 altLabel.',
'http://vocab.nerc.ac.uk/collection/R27/current/')),
('RD2',
('DM_QC_FLAG',
"Quality flag scale for delayed-mode measurements. Argo netCDF variables <PARAMETER>_ADJUSTED_QC in 'D' mode are populated by RD2 altLabel.",
'http://vocab.nerc.ac.uk/collection/RD2/current/')),
('RMC',
('MEASUREMENT_CODE_CATEGORY',
"Categories of trajectory measurement codes listed in NVS collection 'R15'",
'http://vocab.nerc.ac.uk/collection/RMC/current/')),
('RP2',
('PROF_QC_FLAG',
'Quality control flag scale for whole profiles. Argo netCDF variables PROFILE_<PARAMETER>_QC are populated by RP2 altLabel.',
'http://vocab.nerc.ac.uk/collection/RP2/current/')),
('RR2',
('RT_QC_FLAG',
"Quality flag scale for real-time measurements. Argo netCDF variables <PARAMETER>_QC in 'R' mode and <PARAMETER>_ADJUSTED_QC in 'A' mode are populated by RR2 altLabel.",
'http://vocab.nerc.ac.uk/collection/RR2/current/')),
('RTV',
('CYCLE_TIMING_VARIABLE',
"Timing variables representing stages of an Argo float profiling cycle, most of which are associated with a trajectory measurement code ID listed in NVS collection 'R15'. Argo netCDF cycle timing variable names JULD_<RTV>_STATUS are constructed by RTV altLabel.",
'http://vocab.nerc.ac.uk/collection/RTV/current/'))])
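Since all_tbl_name maps each table ID to a (name, definition, URL) tuple, individual fields are easy to unpack. A toy stand-in showing the access pattern, with one entry abbreviated from the output above:

```python
# all_tbl_name maps a table ID to a (name, definition, url) tuple;
# toy stand-in restricted to one entry from the output above.
all_tbl_name = {
    "R01": (
        "DATA_TYPE",
        "Terms describing the type of data contained in an Argo netCDF file.",
        "http://vocab.nerc.ac.uk/collection/R01/current/",
    ),
}

name, definition, url = all_tbl_name["R01"]
print(name)  # DATA_TYPE
```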
Deployment Plan
It may be useful to retrieve metadata about Argo deployments. argopy can use the OceanOPS API for metadata access to retrieve this information. The returned deployment plan is a list of all Argo floats ever deployed, together with their deployment location, date, WMO, program, country, float model and current status.
To fetch the Argo deployment plan, argopy provides a dedicated utility class, OceanOPSDeployments, that can be used like this:
In [19]: from argopy import OceanOPSDeployments
In [20]: deployment = OceanOPSDeployments()
In [21]: df = deployment.to_dataframe()
In [22]: df
Out[22]:
date lat lon ... program country model
0 2023-07-25 00:00:00 72.30 -134.00 ... Argo CANADA CANADA ARVOR
1 2023-07-26 11:06:49 40.10 11.20 ... Argo ITALY ITALY ARVOR
2 2023-07-28 00:00:00 73.00 -150.00 ... Argo CANADA CANADA ARVOR
3 2023-07-30 00:00:00 43.42 7.89 ... Coriolis FRANCE ARVOR
4 2023-07-30 00:00:00 40.00 6.99 ... Coriolis FRANCE ARVOR
.. ... ... ... ... ... ... ...
427 2024-12-31 13:49:07 47.80 -3.30 ... Coriolis FRANCE ARVOR_D
428 2024-12-31 13:49:07 47.80 -3.30 ... Coriolis FRANCE ARVOR_D
429 2024-12-31 13:49:07 47.80 -3.30 ... Coriolis FRANCE ARVOR_D
430 2024-12-31 13:49:07 47.80 -3.30 ... Coriolis FRANCE ARVOR_D
431 2024-12-31 13:49:07 47.80 -3.30 ... Coriolis FRANCE ARVOR_D
[432 rows x 9 columns]
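Since the deployment plan is returned as a regular pandas DataFrame, it can be summarized with the usual pandas tools. Below is a minimal sketch, using a small synthetic dataframe that mimics the columns shown above (the real one comes from OceanOPSDeployments().to_dataframe()):

```python
import pandas as pd

# Synthetic rows mimicking the deployment plan columns shown above;
# in practice this dataframe comes from OceanOPSDeployments().to_dataframe()
df = pd.DataFrame({
    "date": pd.to_datetime(["2023-07-25", "2023-07-26", "2023-07-30"]),
    "lat": [72.30, 40.10, 43.42],
    "lon": [-134.00, 11.20, 7.89],
    "program": ["Argo CANADA", "Argo ITALY", "Coriolis"],
    "country": ["CANADA", "ITALY", "FRANCE"],
    "model": ["ARVOR", "ARVOR", "ARVOR"],
})

# Count planned deployments per country, most active first
per_country = df["country"].value_counts()
print(per_country)
```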
OceanOPSDeployments
can also take an index box definition as an argument to restrict the deployment plan selection to a specific region or period:
deployment = OceanOPSDeployments([-90, 0, 0, 90])
# deployment = OceanOPSDeployments([-20, 0, 42, 51, '2020-01', '2021-01'])
# deployment = OceanOPSDeployments([-180, 180, -90, 90, '2020-01', None])
Note that if the starting date is not provided, it will be set automatically to the current date.
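The box examples above follow the pattern [lon_min, lon_max, lat_min, lat_max], optionally extended with [date_min, date_max]. This can be sketched as a small sanity-check helper; note that check_box is a hypothetical illustration, not part of the argopy API:

```python
from datetime import datetime

def check_box(box):
    """Hypothetical sanity check for an index box:
    [lon_min, lon_max, lat_min, lat_max] plus optional [date_min, date_max]."""
    if len(box) not in (4, 6):
        raise ValueError("box must have 4 or 6 elements")
    lon_min, lon_max, lat_min, lat_max = box[:4]
    if not (-180 <= lon_min < lon_max <= 180):
        raise ValueError("invalid longitude bounds")
    if not (-90 <= lat_min < lat_max <= 90):
        raise ValueError("invalid latitude bounds")
    if len(box) == 6:
        date_min, date_max = box[4:]
        # A None starting date is allowed: it falls back to the current date
        if date_min is None:
            date_min = datetime.utcnow().strftime("%Y-%m")
    return True

check_box([-90, 0, 0, 90])
check_box([-20, 0, 42, 51, '2020-01', '2021-01'])
```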
Lastly, OceanOPSDeployments
comes with a plotting method:
fig, ax = deployment.plot_status()
Note
The list of possible deployment status name/code is given by:
OceanOPSDeployments().status_code
| Status | Id | Description |
|---|---|---|
| PROBABLE | 0 | Starting status for some platforms, when only a little metadata is available, such as a rough deployment location and date. The platform may be deployed |
| CONFIRMED | 1 | Automatically set when a ship is attached to the deployment information. The platform is ready to be deployed, deployment is planned |
| REGISTERED | 2 | Starting status for most of the networks, when deployment planning is not done. The deployment is certain, and a notification has been sent via the OceanOPS system |
| OPERATIONAL | 6 | Automatically set when the platform is emitting a pulse and observations are distributed within a certain time interval |
| INACTIVE | 4 | The platform has not emitted a pulse for a certain time |
| CLOSED | 5 | The platform has not emitted a pulse for a long time and is considered dead |
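For quick lookups in your own code, the table above can be transcribed into a plain mapping. This is a sketch based on the table as printed here; the authoritative source remains the OceanOPSDeployments().status_code attribute:

```python
# Transcription of the deployment status table into a plain mapping;
# the authoritative source is OceanOPSDeployments().status_code
STATUS_CODE = {
    "PROBABLE": 0,
    "CONFIRMED": 1,
    "REGISTERED": 2,
    "INACTIVE": 4,
    "CLOSED": 5,
    "OPERATIONAL": 6,
}

# Reverse lookup: from numeric id to status name
STATUS_NAME = {v: k for k, v in STATUS_CODE.items()}
print(STATUS_NAME[6])  # OPERATIONAL
```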
ADMT Documentation#
More than 20 pdf manuals have been produced by the Argo Data Management Team. Using the ArgoDocs
class, it's easy to navigate this great database.
If you don't know where to start, you can simply list all available documents:
In [23]: from argopy import ArgoDocs
In [24]: ArgoDocs().list
Out[24]:
category ... id
0 Argo data formats ... 29825
1 Quality control ... 33951
2 Quality control ... 46542
3 Quality control ... 40879
4 Quality control ... 35385
5 Quality control ... 84370
6 Quality control ... 62466
7 Cookbooks ... 41151
8 Cookbooks ... 29824
9 Cookbooks ... 78994
10 Cookbooks ... 39795
11 Cookbooks ... 39459
12 Cookbooks ... 39468
13 Cookbooks ... 47998
14 Cookbooks ... 54541
15 Cookbooks ... 46121
16 Cookbooks ... 51541
17 Cookbooks ... 57195
18 Cookbooks ... 46120
19 Cookbooks ... 52154
20 Cookbooks ... 55637
21 Cookbooks ... 46202
[22 rows x 4 columns]
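The ArgoDocs().list table is a pandas DataFrame, so it can be filtered by category like any other dataframe. A minimal sketch with a synthetic stand-in for the table shown above (the real table has more columns and rows):

```python
import pandas as pd

# Synthetic stand-in for the ArgoDocs().list dataframe shown above;
# only the "category" and "id" columns are reproduced here
docs = pd.DataFrame({
    "category": ["Argo data formats", "Quality control", "Cookbooks"],
    "id": [29825, 33951, 41151],
})

# Keep only the Quality control documents
qc = docs[docs["category"] == "Quality control"]
print(qc["id"].tolist())
```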
Or search for a word in the title and/or abstract:
In [25]: results = ArgoDocs().search("oxygen")
In [26]: for docid in results:
....: print("\n", ArgoDocs(docid))
....:
Then, using the Argo DOI number of a document, you can easily retrieve it:
In [27]: ArgoDocs(35385)
and open it in your browser:
# ArgoDocs(35385).show()
# ArgoDocs(35385).open_pdf(page=12)