Argo meta-data
Index of profiles
Since the Argo measurements dataset is quite complex, it comes with a collection of index files: lookup tables with metadata. These index files help you determine what to expect before retrieving the full set of measurements.
argopy provides two methods to work with Argo index files: one is high-level and works like the data fetcher, the other is low-level and works like a "store".
Fetcher: High-level Argo index access
argopy has a specific fetcher for index files:
In [1]: from argopy import IndexFetcher as ArgoIndexFetcher
You can use the Index fetcher with the region or float access points, similarly to data fetching:
In [2]: idx = ArgoIndexFetcher(src='gdac').float(2901623).load()
In [3]: idx.index
Out[3]:
file ... profiler
0 nmdis/2901623/profiles/R2901623_000.nc ... Provor, Seabird conductivity sensor
1 nmdis/2901623/profiles/R2901623_000D.nc ... Provor, Seabird conductivity sensor
2 nmdis/2901623/profiles/R2901623_001.nc ... Provor, Seabird conductivity sensor
3 nmdis/2901623/profiles/R2901623_002.nc ... Provor, Seabird conductivity sensor
4 nmdis/2901623/profiles/R2901623_003.nc ... Provor, Seabird conductivity sensor
.. ... ... ...
93 nmdis/2901623/profiles/R2901623_092.nc ... Provor, Seabird conductivity sensor
94 nmdis/2901623/profiles/R2901623_093.nc ... Provor, Seabird conductivity sensor
95 nmdis/2901623/profiles/R2901623_094.nc ... Provor, Seabird conductivity sensor
96 nmdis/2901623/profiles/R2901623_095.nc ... Provor, Seabird conductivity sensor
97 nmdis/2901623/profiles/R2901623_096.nc ... Provor, Seabird conductivity sensor
[98 rows x 11 columns]
Alternatively, you can use argopy.IndexFetcher.to_dataframe():
In [4]: idx = ArgoIndexFetcher(src='gdac').float(2901623)
In [5]: df = idx.to_dataframe()
The difference is that with the load method, data are kept in memory and not fetched again on every access to the index attribute.
The index fetcher has pretty much the same methods as the data fetchers. You can check them all here: argopy.fetchers.ArgoIndexFetcher.
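Since the index attribute is a plain pandas.DataFrame, the usual pandas tooling applies once it is loaded. Here is a minimal sketch using a toy stand-in for the index; the 'file' column layout follows the standard `<dac>/<wmo>/profiles/` convention seen above, and the derived columns are illustrative, not part of the argopy API:

```python
import pandas as pd

# Toy stand-in for the DataFrame returned by the index attribute.
df = pd.DataFrame({
    "file": [
        "nmdis/2901623/profiles/R2901623_000.nc",
        "nmdis/2901623/profiles/R2901623_000D.nc",
        "nmdis/2901623/profiles/R2901623_001.nc",
    ],
})

# The DAC and float WMO number can be recovered from the 'file' path:
parts = df["file"].str.split("/", expand=True)
df["dac"] = parts[0]
df["wmo"] = parts[1].astype(int)

print(df["wmo"].unique())  # all rows belong to float 2901623
```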
Store: Low-level Argo index access
The IndexFetcher shown above is a user-friendly layer on top of our internal Argo index file store. But if you are familiar with Argo index files and/or care about performance, you may be interested in using the Argo index store ArgoIndex directly.
If Pyarrow is installed, this store will rely on pyarrow.Table as the internal storage format for the index; otherwise it will fall back on pandas.DataFrame. Loading the full Argo profile index takes about 2 to 3 seconds with Pyarrow, while it can take up to 6 or 7 seconds with Pandas.
All index store methods and properties are fully documented in ArgoIndex.
Usage
You create an index store with default or custom options:
In [6]: from argopy import ArgoIndex
In [7]: idx = ArgoIndex()
# or:
# ArgoIndex(index_file="argo_bio-profile_index.txt")
# ArgoIndex(host="ftp://ftp.ifremer.fr/ifremer/argo")
# ArgoIndex(host="https://data-argo.ifremer.fr", index_file="ar_index_global_prof.txt")
# ArgoIndex(host="https://data-argo.ifremer.fr", index_file="ar_index_global_prof.txt", cache=True)
You can then trigger loading of the index content:
In [8]: idx.load() # Load the full index in memory
Here is the list of methods and properties of the full index:
idx.load(nrows=12)  # Only load the first N rows of the index
idx.N_RECORDS  # Shortcut for the length of the 1st dimension of the index array
idx.to_dataframe(index=True)  # Convert the index to a user-friendly pandas.DataFrame
idx.to_dataframe(index=True, nrows=2)  # Only return the first nrows of the index
idx.index  # Internal storage structure of the full index (pyarrow.Table or pandas.DataFrame)
idx.uri_full_index  # List of absolute paths to files from the full index table column 'file'
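For illustration, here is how such absolute paths relate to the index 'file' column on a GDAC server. The '/dac/' path segment is an assumption based on the standard GDAC directory layout, not a guaranteed argopy internal:

```python
# Sketch: on a GDAC server, profile files referenced by the index 'file'
# column live under <host>/dac/<file> (the '/dac/' segment is an assumption
# from the standard GDAC layout).
host = "https://data-argo.ifremer.fr"
files = ["nmdis/2901623/profiles/R2901623_000.nc"]

uris = [f"{host}/dac/{f}" for f in files]
print(uris[0])
```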
There are several methods to search the index, for instance:
In [9]: idx.search_lat_lon_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])
Here is the list of all methods to search the index:
idx.search_wmo(1901393)
idx.search_cyc(1)
idx.search_wmo_cyc(1901393, [1,12])
idx.search_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Takes an index BOX definition; only time is used
idx.search_lat_lon([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Takes an index BOX definition; only lat/lon is used
idx.search_lat_lon_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Takes an index BOX definition
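Conceptually, a box search is just a row filter on the index's longitude/latitude/date columns. Here is a pandas sketch of what search_lat_lon_tim amounts to, on a toy index; column names follow the standard profile index format, and this is an illustration, not the actual argopy implementation:

```python
import pandas as pd

# Toy index with the standard 'longitude', 'latitude', 'date' columns.
df = pd.DataFrame({
    "longitude": [-58.0, -10.0, -57.5],
    "latitude": [42.0, 43.0, 50.0],
    "date": pd.to_datetime(["2007-08-15", "2007-08-20", "2007-08-10"]),
})

# An index BOX definition: [lon_min, lon_max, lat_min, lat_max, t_min, t_max]
BOX = [-60, -55, 40.0, 45.0, "2007-08-01", "2007-09-01"]
in_box = (
    df["longitude"].between(BOX[0], BOX[1])
    & df["latitude"].between(BOX[2], BOX[3])
    & df["date"].between(pd.Timestamp(BOX[4]), pd.Timestamp(BOX[5]))
)
print(df[in_box])  # only the first row matches
```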
And finally the list of methods and properties for search results:
idx.N_MATCH  # Shortcut for the length of the 1st dimension of the search results array
idx.to_dataframe()  # Convert search results to a user-friendly pandas.DataFrame
idx.to_dataframe(nrows=2)  # Only return the first nrows of the search results
idx.to_indexfile("search_index.txt")  # Export search results to an Argo standard index file
idx.search  # Internal table with search results
idx.uri  # List of absolute paths to files from the search results table column 'file'
Hint
The argopy index store supports the Bio and Synthetic Profile directory files:
In [10]: idx = ArgoIndex(index_file="argo_bio-profile_index.txt").load()
# idx = ArgoIndex(index_file="argo_synthetic-profile_index.txt").load()
In [11]: idx
Out[11]:
<argoindex.pandas>
Host: https://data-argo.ifremer.fr
Index: argo_bio-profile_index.txt
Convention: argo_bio-profile_index (Bio-Profile directory file of the Argo GDAC)
Loaded: True (288978 records)
Searched: False
This BGC index store comes with an additional search option on parameters:
In [12]: idx.search_params(['C1PHASE_DOXY', 'DOWNWELLING_PAR'])
Out[12]:
<argoindex.pandas>
Host: https://data-argo.ifremer.fr
Index: argo_bio-profile_index.txt
Convention: argo_bio-profile_index (Bio-Profile directory file of the Argo GDAC)
Loaded: True (288978 records)
Searched: True (38271 matches, 13.2436%)
In [13]: idx.to_dataframe()
Out[13]:
file ... profiler
0 bodc/3901496/profiles/BD3901496_001.nc ... Unknown
1 bodc/3901496/profiles/BD3901496_002.nc ... Unknown
2 bodc/3901496/profiles/BD3901496_003.nc ... Unknown
3 bodc/3901496/profiles/BD3901496_004.nc ... Unknown
4 bodc/3901496/profiles/BD3901496_005.nc ... Unknown
... ... ... ...
38266 csiro/7900947/profiles/BR7900947_018.nc ... Unknown
38267 csiro/7900947/profiles/BR7900947_019.nc ... Unknown
38268 csiro/7900947/profiles/BR7900947_020.nc ... Unknown
38269 csiro/7900947/profiles/BR7900947_021.nc ... Unknown
38270 csiro/7900947/profiles/BR7900947_022.nc ... Unknown
[38271 rows x 13 columns]
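Under the hood, a parameter search amounts to keeping rows whose 'parameters' column (a space-separated list of parameter names in the Bio-Profile index format) contains all requested parameters. A minimal stand-alone sketch of this matching logic, on toy data, not the actual argopy implementation:

```python
import pandas as pd

# Toy BGC index: the 'parameters' column holds space-separated parameter
# names, following the Bio-Profile index format.
df = pd.DataFrame({
    "file": ["a.nc", "b.nc", "c.nc"],
    "parameters": [
        "PRES TEMP C1PHASE_DOXY DOWNWELLING_PAR",
        "PRES TEMP C1PHASE_DOXY",
        "PRES TEMP",
    ],
})

# Keep rows where ALL requested parameters are present:
wanted = ["C1PHASE_DOXY", "DOWNWELLING_PAR"]
mask = df["parameters"].apply(lambda s: all(p in s.split() for p in wanted))
matches = df.loc[mask, "file"].tolist()
print(matches)  # ['a.nc']
```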
Index files supported
The table below summarizes the argopy support status of all Argo index files:
Index file | File name | Supported
---|---|---
Profile | ar_index_global_prof.txt | ✅
Synthetic-Profile | argo_synthetic-profile_index.txt | ✅
Bio-Profile | argo_bio-profile_index.txt | ✅
Trajectory | ar_index_global_traj.txt | 🚧
Bio-Trajectory | argo_bio-traj_index.txt | 🚧
Metadata | ar_index_global_meta.txt | ❌
Technical | ar_index_global_tech.txt | ❌
Greylist | ar_greylist.txt | ❌
Index file support can be added on demand. Click here to raise an issue if you'd like to access other index files.
Reference tables
The Argo netCDF format is strict and based on a collection of variables that are fully documented and standardized. All reference tables can be found in the Argo user manual.
However, machine-to-machine access to these tables is often required. This is possible thanks to the work of the Argo Vocabulary Task Team (AVTT), the team of people responsible for the NVS collections under the Argo Data Management Team governance.
Note
The GitHub organization hosting the AVTT is the "NERC Vocabulary Server (NVS)", aka "nvs-vocabs". This holds a list of NVS collection-specific GitHub repositories. Each Argo GitHub repository is named after its corresponding collection ID (e.g. R01, RR2, R03, etc.). The current list is given here.
The management of issues related to vocabularies managed by the Argo Data Management Team is done on this repository.
argopy provides the utility class ArgoNVSReferenceTables to easily fetch and access all Argo reference tables. If you already know the name of the reference table you want to retrieve, you can simply get it like this:
In [14]: from argopy import ArgoNVSReferenceTables
In [15]: NVS = ArgoNVSReferenceTables()
In [16]: NVS.tbl('R01')
Out[16]:
altLabel ... id
0 BPROF ... http://vocab.nerc.ac.uk/collection/R01/current...
1 BTRAJ ... http://vocab.nerc.ac.uk/collection/R01/current...
2 META ... http://vocab.nerc.ac.uk/collection/R01/current...
3 MPROF ... http://vocab.nerc.ac.uk/collection/R01/current...
4 MTRAJ ... http://vocab.nerc.ac.uk/collection/R01/current...
5 PROF ... http://vocab.nerc.ac.uk/collection/R01/current...
6 SPROF ... http://vocab.nerc.ac.uk/collection/R01/current...
7 TECH ... http://vocab.nerc.ac.uk/collection/R01/current...
8 TRAJ ... http://vocab.nerc.ac.uk/collection/R01/current...
[9 rows x 5 columns]
The reference table is returned as a pandas.DataFrame. If you want the exact name of this table:
In [17]: NVS.tbl_name('R01')
Out[17]:
('DATA_TYPE',
'Terms describing the type of data contained in an Argo netCDF file. Argo netCDF variable DATA_TYPE is populated by R01 prefLabel.',
'http://vocab.nerc.ac.uk/collection/R01/current/')
If you're looking for the ID of a specific reference table, you can check the list of all available tables given by the ArgoNVSReferenceTables.all_tbl_name() property. It returns a dictionary with table IDs as keys, and table name, definition and NVS link as values. Use the ArgoNVSReferenceTables.all_tbl() property to retrieve all tables.
In [18]: NVS.all_tbl_name
Out[18]:
OrderedDict([('R01',
('DATA_TYPE',
'Terms describing the type of data contained in an Argo netCDF file. Argo netCDF variable DATA_TYPE is populated by R01 prefLabel.',
'http://vocab.nerc.ac.uk/collection/R01/current/')),
('R03',
('PARAMETER',
'Terms describing individual measured phenomena, used to mark up sets of data in Argo netCDF arrays. Argo netCDF variables PARAMETER and TRAJECTORY_PARAMETERS are populated by R03 altLabel; R03 altLabel is also used to name netCDF profile files parameter variables <PARAMETER>.',
'http://vocab.nerc.ac.uk/collection/R03/current/')),
('R04',
('DATA_CENTRE_CODES',
'Codes for data centres and institutions handling or managing Argo data. Argo netCDF variable DATA_CENTRE is populated by R04 altLabel.',
'http://vocab.nerc.ac.uk/collection/R04/current/')),
('R05',
('POSITION_ACCURACY',
'Accuracy in latitude and longitude measurements received from the positioning system, grouped by location accuracy classes.',
'http://vocab.nerc.ac.uk/collection/R05/current/')),
('R06',
('DATA_STATE_INDICATOR',
'Processing stage of the data based on the concatenation of processing level and class indicators. Argo netCDF variable DATA_STATE_INDICATOR is populated by R06 altLabel.',
'http://vocab.nerc.ac.uk/collection/R06/current/')),
('R07',
('HISTORY_ACTION',
'Coded history information for each action performed on each profile by a data centre. Argo netCDF variable HISTORY_ACTION is populated by R07 altLabel.',
'http://vocab.nerc.ac.uk/collection/R07/current/')),
('R08',
('ARGO_WMO_INST_TYPE',
"Subset of instrument type codes from the World Meteorological Organization (WMO) Common Code Table C-3 (CCT C-3) 1770, named 'Instrument make and type for water temperature profile measurement with fall rate equation coefficients' and available here: https://library.wmo.int/doc_num.php?explnum_id=11283. Argo netCDF variable WMO_INST_TYPE is populated by R08 altLabel.",
'http://vocab.nerc.ac.uk/collection/R08/current/')),
('R09',
('POSITIONING_SYSTEM',
'List of float location measuring systems. Argo netCDF variable POSITIONING_SYSTEM is populated by R09 altLabel.',
'http://vocab.nerc.ac.uk/collection/R09/current/')),
('R10',
('TRANS_SYSTEM',
'List of telecommunication systems. Argo netCDF variable TRANS_SYSTEM is populated by R10 altLabel.',
'http://vocab.nerc.ac.uk/collection/R10/current/')),
('R11',
('RTQC_TESTID',
'List of real-time quality-control tests and corresponding binary identifiers, used as reference to populate the Argo netCDF HISTORY_QCTEST variable.',
'http://vocab.nerc.ac.uk/collection/R11/current/')),
('R12',
('HISTORY_STEP',
'Data processing step codes for history record. Argo netCDF variable TRANS_SYSTEM is populated by R12 altLabel.',
'http://vocab.nerc.ac.uk/collection/R12/current/')),
('R13',
('OCEAN_CODE',
'Ocean area codes assigned to each profile in the Metadata directory (index) file of the Argo Global Assembly Centre.',
'http://vocab.nerc.ac.uk/collection/R13/current/')),
('R15',
('MEASUREMENT_CODE_ID',
'Measurement code IDs used in Argo Trajectory netCDF files. Argo netCDF variable MEASUREMENT_CODE is populated by R15 altLabel.',
'http://vocab.nerc.ac.uk/collection/R15/current/')),
('R16',
('VERTICAL_SAMPLING_SCHEME',
'Profile sampling schemes and sampling methods. Argo netCDF variable VERTICAL_SAMPLING_SCHEME is populated by R16 altLabel.',
'http://vocab.nerc.ac.uk/collection/R16/current/')),
('R19',
('STATUS',
'Flag scale for values in all Argo netCDF cycle timing variables. Argo netCDF cycle timing variables JULD_<RTV>_STATUS are populated by R19 altLabel.',
'http://vocab.nerc.ac.uk/collection/R19/current/')),
('R20',
('GROUNDED',
'Codes to indicate the best estimate of whether the float touched the ground during a specific cycle. Argo netCDF variable GROUNDED in the Trajectory file is populated by R20 altLabel.',
'http://vocab.nerc.ac.uk/collection/R20/current/')),
('R21',
('REPRESENTATIVE_PARK_PRESSURE_STATUS',
'Argo status flag on the Representative Park Pressure (RPP). Argo netCDF variable REPRESENTATIVE_PARK_PRESSURE_STATUS in the Trajectory file is populated by R21 altLabel.',
'http://vocab.nerc.ac.uk/collection/R21/current/')),
('R22',
('PLATFORM_FAMILY',
'List of platform family/category of Argo floats. Argo netCDF variable PLATFORM_FAMILY is populated by R22 altLabel.',
'http://vocab.nerc.ac.uk/collection/R22/current/')),
('R23',
('PLATFORM_TYPE',
'List of Argo float types. Argo netCDF variable PLATFORM_TYPE is populated by R23 altLabel.',
'http://vocab.nerc.ac.uk/collection/R23/current/')),
('R24',
('PLATFORM_MAKER',
'List of Argo float manufacturers. Argo netCDF variable PLATFORM_MAKER is populated by R24 altLabel.',
'http://vocab.nerc.ac.uk/collection/R24/current/')),
('R25',
('SENSOR',
'Terms describing sensor types mounted on Argo floats. Argo netCDF variable SENSOR is populated by R25 altLabel.',
'http://vocab.nerc.ac.uk/collection/R25/current/')),
('R26',
('SENSOR_MAKER',
'Terms describing developers and manufacturers of sensors mounted on Argo floats. Argo netCDF variable SENSOR_MAKER is populated by R26 altLabel.',
'http://vocab.nerc.ac.uk/collection/R26/current/')),
('R27',
('SENSOR_MODEL',
'Terms listing models of sensors mounted on Argo floats. Note: avoid using the manufacturer name and sensor firmware version in new entries when possible. Argo netCDF variable SENSOR_MODEL is populated by R27 altLabel.',
'http://vocab.nerc.ac.uk/collection/R27/current/')),
('RD2',
('DM_QC_FLAG',
"Quality flag scale for delayed-mode measurements. Argo netCDF variables <PARAMETER>_ADJUSTED_QC in 'D' mode are populated by RD2 altLabel.",
'http://vocab.nerc.ac.uk/collection/RD2/current/')),
('RMC',
('MEASUREMENT_CODE_CATEGORY',
"Categories of trajectory measurement codes listed in NVS collection 'R15'",
'http://vocab.nerc.ac.uk/collection/RMC/current/')),
('RP2',
('PROF_QC_FLAG',
'Quality control flag scale for whole profiles. Argo netCDF variables PROFILE_<PARAMETER>_QC are populated by RP2 altLabel.',
'http://vocab.nerc.ac.uk/collection/RP2/current/')),
('RR2',
('RT_QC_FLAG',
"Quality flag scale for real-time measurements. Argo netCDF variables <PARAMETER>_QC in 'R' mode and <PARAMETER>_ADJUSTED_QC in 'A' mode are populated by RR2 altLabel.",
'http://vocab.nerc.ac.uk/collection/RR2/current/')),
('RTV',
('CYCLE_TIMING_VARIABLE',
"Timing variables representing stages of an Argo float profiling cycle, most of which are associated with a trajectory measurement code ID listed in NVS collection 'R15'. Argo netCDF cycle timing variable names JULD_<RTV>_STATUS are constructed by RTV altLabel.",
'http://vocab.nerc.ac.uk/collection/RTV/current/'))])
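Since all_tbl_name maps each table ID to a (name, definition, URL) tuple, individual fields are easy to unpack. A toy stand-in showing the access pattern, with one entry abbreviated from the output above:

```python
# all_tbl_name maps a table ID to a (name, definition, url) tuple;
# toy stand-in restricted to one entry from the output above.
all_tbl_name = {
    "R01": (
        "DATA_TYPE",
        "Terms describing the type of data contained in an Argo netCDF file.",
        "http://vocab.nerc.ac.uk/collection/R01/current/",
    ),
}

name, definition, url = all_tbl_name["R01"]
print(name)  # DATA_TYPE
```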
Deployment Plan
It may be useful to retrieve metadata about Argo deployments. argopy can use the OceanOPS API for metadata access to retrieve this information. The returned deployment plan is a list of all Argo floats ever deployed, together with their deployment location, date, WMO, program, country, float model and current status.
To fetch the Argo deployment plan, argopy provides a dedicated utility class, OceanOPSDeployments, that can be used like this:
In [19]: from argopy import OceanOPSDeployments
In [20]: deployment = OceanOPSDeployments()
In [21]: df = deployment.to_dataframe()
In [22]: df
Out[22]:
date lat lon ... program country model
0 2023-07-25 00:00:00 72.30 -134.00 ... Argo CANADA CANADA ARVOR
1 2023-07-26 11:06:49 40.10 11.20 ... Argo ITALY ITALY ARVOR
2 2023-07-28 00:00:00 73.00 -150.00 ... Argo CANADA CANADA ARVOR
3 2023-07-30 00:00:00 43.42 7.89 ... Coriolis FRANCE ARVOR
4 2023-07-30 00:00:00 40.00 6.99 ... Coriolis FRANCE ARVOR
.. ... ... ... ... ... ... ...
427 2024-12-31 13:49:07 47.80 -3.30 ... Coriolis FRANCE ARVOR_D
428 2024-12-31 13:49:07 47.80 -3.30 ... Coriolis FRANCE ARVOR_D
429 2024-12-31 13:49:07 47.80 -3.30 ... Coriolis FRANCE ARVOR_D
430 2024-12-31 13:49:07 47.80 -3.30 ... Coriolis FRANCE ARVOR_D
431 2024-12-31 13:49:07 47.80 -3.30 ... Coriolis FRANCE ARVOR_D
[432 rows x 9 columns]
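Since the deployment plan is returned as a regular pandas DataFrame, it can be summarized with the usual pandas tools. Below is a minimal sketch, using a small synthetic dataframe that mimics the columns shown above (the real one comes from OceanOPSDeployments().to_dataframe()):

```python
import pandas as pd

# Synthetic rows mimicking the deployment plan columns shown above;
# in practice this dataframe comes from OceanOPSDeployments().to_dataframe()
df = pd.DataFrame({
    "date": pd.to_datetime(["2023-07-25", "2023-07-26", "2023-07-30"]),
    "lat": [72.30, 40.10, 43.42],
    "lon": [-134.00, 11.20, 7.89],
    "program": ["Argo CANADA", "Argo ITALY", "Coriolis"],
    "country": ["CANADA", "ITALY", "FRANCE"],
    "model": ["ARVOR", "ARVOR", "ARVOR"],
})

# Count planned deployments per country, most active first
per_country = df["country"].value_counts()
print(per_country)
```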
OceanOPSDeployments
can also take an index box definition as an argument to restrict the deployment plan selection to a specific region or period:
deployment = OceanOPSDeployments([-90, 0, 0, 90])
# deployment = OceanOPSDeployments([-20, 0, 42, 51, '2020-01', '2021-01'])
# deployment = OceanOPSDeployments([-180, 180, -90, 90, '2020-01', None])
Note that if the starting date is not provided, it will be set automatically to the current date.
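The box examples above follow the pattern [lon_min, lon_max, lat_min, lat_max], optionally extended with [date_min, date_max]. This can be sketched as a small sanity-check helper; note that check_box is a hypothetical illustration, not part of the argopy API:

```python
from datetime import datetime

def check_box(box):
    """Hypothetical sanity check for an index box:
    [lon_min, lon_max, lat_min, lat_max] plus optional [date_min, date_max]."""
    if len(box) not in (4, 6):
        raise ValueError("box must have 4 or 6 elements")
    lon_min, lon_max, lat_min, lat_max = box[:4]
    if not (-180 <= lon_min < lon_max <= 180):
        raise ValueError("invalid longitude bounds")
    if not (-90 <= lat_min < lat_max <= 90):
        raise ValueError("invalid latitude bounds")
    if len(box) == 6:
        date_min, date_max = box[4:]
        # A None starting date is allowed: it falls back to the current date
        if date_min is None:
            date_min = datetime.utcnow().strftime("%Y-%m")
    return True

check_box([-90, 0, 0, 90])
check_box([-20, 0, 42, 51, '2020-01', '2021-01'])
```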
Lastly, OceanOPSDeployments
comes with a plotting method:
fig, ax = deployment.plot_status()
Note
The list of possible deployment status name/code is given by:
OceanOPSDeployments().status_code
| Status | Id | Description |
|---|---|---|
| PROBABLE | 0 | Starting status for some platforms, when only a little metadata is available, such as a rough deployment location and date. The platform may be deployed |
| CONFIRMED | 1 | Automatically set when a ship is attached to the deployment information. The platform is ready to be deployed, deployment is planned |
| REGISTERED | 2 | Starting status for most of the networks, when deployment planning is not done. The deployment is certain, and a notification has been sent via the OceanOPS system |
| OPERATIONAL | 6 | Automatically set when the platform is emitting a pulse and observations are distributed within a certain time interval |
| INACTIVE | 4 | The platform has not emitted a pulse for a certain time |
| CLOSED | 5 | The platform has not emitted a pulse for a long time and is considered dead |
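For quick lookups in your own code, the table above can be transcribed into a plain mapping. This is a sketch based on the table as printed here; the authoritative source remains the OceanOPSDeployments().status_code attribute:

```python
# Transcription of the deployment status table into a plain mapping;
# the authoritative source is OceanOPSDeployments().status_code
STATUS_CODE = {
    "PROBABLE": 0,
    "CONFIRMED": 1,
    "REGISTERED": 2,
    "INACTIVE": 4,
    "CLOSED": 5,
    "OPERATIONAL": 6,
}

# Reverse lookup: from numeric id to status name
STATUS_NAME = {v: k for k, v in STATUS_CODE.items()}
print(STATUS_NAME[6])  # OPERATIONAL
```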
ADMT Documentation#
More than 20 pdf manuals have been produced by the Argo Data Management Team. Using the ArgoDocs
class, it's easy to navigate this great database.
If you don't know where to start, you can simply list all available documents:
In [23]: from argopy import ArgoDocs
In [24]: ArgoDocs().list
Out[24]:
category ... id
0 Argo data formats ... 29825
1 Quality control ... 33951
2 Quality control ... 46542
3 Quality control ... 40879
4 Quality control ... 35385
5 Quality control ... 84370
6 Quality control ... 62466
7 Cookbooks ... 41151
8 Cookbooks ... 29824
9 Cookbooks ... 78994
10 Cookbooks ... 39795
11 Cookbooks ... 39459
12 Cookbooks ... 39468
13 Cookbooks ... 47998
14 Cookbooks ... 54541
15 Cookbooks ... 46121
16 Cookbooks ... 51541
17 Cookbooks ... 57195
18 Cookbooks ... 46120
19 Cookbooks ... 52154
20 Cookbooks ... 55637
21 Cookbooks ... 46202
[22 rows x 4 columns]
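The ArgoDocs().list table is a pandas DataFrame, so it can be filtered by category like any other dataframe. A minimal sketch with a synthetic stand-in for the table shown above (the real table has more columns and rows):

```python
import pandas as pd

# Synthetic stand-in for the ArgoDocs().list dataframe shown above;
# only the "category" and "id" columns are reproduced here
docs = pd.DataFrame({
    "category": ["Argo data formats", "Quality control", "Cookbooks"],
    "id": [29825, 33951, 41151],
})

# Keep only the Quality control documents
qc = docs[docs["category"] == "Quality control"]
print(qc["id"].tolist())
```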
Or search for a word in the title and/or abstract:
In [25]: results = ArgoDocs().search("oxygen")
In [26]: for docid in results:
....: print("\n", ArgoDocs(docid))
....:
Then, using the Argo DOI number of a document, you can easily retrieve it:
In [27]: ArgoDocs(35385)
and open it in your browser:
# ArgoDocs(35385).show()
# ArgoDocs(35385).open_pdf(page=12)