argopy.stores.ArgoIndex

class ArgoIndex(host: str = 'https://data-argo.ifremer.fr', index_file: str = 'ar_index_global_prof.txt', convention: str | None = None, cache: bool = False, cachedir: str = '', timeout: int = 0)

Argo GDAC index store

If PyArrow is available, this class uses pyarrow.Table as its internal storage format; otherwise it falls back to pandas.DataFrame.

The index_file argument accepts either an exact index file name or one of these keywords:

core for the ar_index_global_prof.txt index file,
bgc-b for the argo_bio-profile_index.txt index file,
bgc-s for the argo_synthetic-profile_index.txt index file.
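The keyword shortcuts listed above amount to a simple lookup. The sketch below only illustrates that idea: the `INDEX_KEYWORDS` dictionary and the `resolve_index_file` helper are hypothetical names and are not part of the argopy API.

```python
# Hypothetical helper: the mapping mirrors the keywords documented above,
# but the names and the fallback behavior are illustrative assumptions.
INDEX_KEYWORDS = {
    "core": "ar_index_global_prof.txt",
    "bgc-b": "argo_bio-profile_index.txt",
    "bgc-s": "argo_synthetic-profile_index.txt",
}


def resolve_index_file(name: str) -> str:
    """Return the exact index file name for a keyword, or the name unchanged."""
    return INDEX_KEYWORDS.get(name, name)


print(resolve_index_file("bgc-s"))
print(resolve_index_file("dummy_index.txt"))
```

A name that is not a known keyword is passed through untouched in this sketch, which is one plausible way to support custom index files.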

Examples

An index store is instantiated with a host (any access path: local, http or ftp) and an index file:

>>> idx = ArgoIndex()
>>> idx = ArgoIndex(host="https://data-argo.ifremer.fr")  # Default host
>>> idx = ArgoIndex(host="ftp://ftp.ifremer.fr/ifremer/argo", index_file="ar_index_global_prof.txt")  # Default index
>>> idx = ArgoIndex(index_file="bgc-s")  # Use keywords instead of exact file names
>>> idx = ArgoIndex(host="https://data-argo.ifremer.fr", index_file="bgc-b", cache=True)  # Use cache for performance
>>> idx = ArgoIndex(host=".", index_file="dummy_index.txt", convention="core")  # Load your own index

Full index methods and properties:

>>> idx.load()
>>> idx.load(nrows=12)  # Only load the first N rows of the index
>>> idx.to_dataframe(index=True)  # Convert index to user-friendly :class:`pandas.DataFrame`
>>> idx.to_dataframe(index=True, nrows=2)  # Only return the first nrows rows of the index
>>> idx.N_RECORDS  # Shortcut for length of 1st dimension of the index array
>>> idx.index  # internal storage structure of the full index (:class:`pyarrow.Table` or :class:`pandas.DataFrame`)
>>> idx.shape  # shape of the full index array
>>> idx.uri_full_index  # List of absolute paths to files from the full index table column 'file'

Search methods:

>>> idx.search_wmo(1901393)
>>> idx.search_cyc(1)
>>> idx.search_wmo_cyc(1901393, [1,12])
>>> idx.search_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Takes an index BOX definition
>>> idx.search_lat_lon([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Takes an index BOX definition
>>> idx.search_lat_lon_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Takes an index BOX definition
>>> idx.search_params(['C1PHASE_DOXY', 'DOWNWELLING_PAR'])  # Takes a list of strings; BGC index only
>>> idx.search_parameter_data_mode({'BBP700': 'D', 'DOXY': ['A', 'D']})  # Takes a dict
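The index BOX used by the space and time searches is a six-item list. The sketch below reads it as `[lon_min, lon_max, lat_min, lat_max, date_min, date_max]` and tests a single position against it. The `in_box` function is a hypothetical illustration, and the inclusive bounds are an assumption, not a statement about how argopy filters rows.

```python
from datetime import date


def in_box(lon, lat, day, box):
    """Check one (lon, lat, day) record against a six-item index BOX.

    Assumption: all bounds are treated as inclusive in this sketch.
    """
    lon_min, lon_max, lat_min, lat_max, t_min, t_max = box
    return (
        lon_min <= lon <= lon_max
        and lat_min <= lat <= lat_max
        and date.fromisoformat(t_min) <= day <= date.fromisoformat(t_max)
    )


BOX = [-60, -55, 40.0, 45.0, "2007-08-01", "2007-09-01"]
print(in_box(-57.2, 42.5, date(2007, 8, 15), BOX))   # inside the box
print(in_box(-57.2, 42.5, date(2008, 1, 1), BOX))    # outside the time range
```

Since the same BOX is passed to `search_tim`, `search_lat_lon` and `search_lat_lon_tim` in the examples above, each method presumably reads only the items relevant to its own constraint.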

Search result properties and methods:

>>> idx.N_MATCH  # Shortcut for length of 1st dimension of the search results array
>>> idx.search  # Internal table with search results
>>> idx.uri  # List of absolute paths to files from the search results table column 'file'

>>> idx.run()  # Run the search and save results in cache if necessary
>>> idx.to_dataframe()  # Convert search results to user-friendly :class:`pandas.DataFrame`
>>> idx.to_dataframe(nrows=2)  # Only return the first nrows rows of the search results
>>> idx.to_indexfile("search_index.txt")  # Export search results to Argo standard index file
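The `uri` and `uri_full_index` properties turn the relative entries of the 'file' column into absolute paths. A minimal sketch of that idea follows, assuming a GDAC-compliant layout in which profile files sit under a `dac` folder below the host. The `build_uri` helper is hypothetical, and the file entry is an illustrative value.

```python
def build_uri(host: str, file_entry: str) -> str:
    """Join a host and a relative 'file' column entry into an absolute path.

    Assumption: files live under '<host>/dac/', as in a GDAC-compliant layout.
    """
    return "/".join([host.rstrip("/"), "dac", file_entry.lstrip("/")])


# Illustrative entry shaped like a 'file' column value
entry = "aoml/13857/profiles/R13857_001.nc"
print(build_uri("https://data-argo.ifremer.fr", entry))
```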

Misc:

>>> idx.convention  # What is the expected index format (core vs BGC profile index)
>>> idx.cname
>>> idx.read_wmo
>>> idx.read_params
>>> idx.records_per_wmo
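To make `read_wmo` and `records_per_wmo` more concrete, the sketch below derives WMO and cycle numbers from 'file' column entries and counts records per float. It assumes entries shaped like `<dac>/<wmo>/profiles/<prefix><wmo>_<cycle>[D].nc`. The regular expression, the `parse_wmo_cycle` helper and the sample entries are illustrative, and argopy may extract these values differently.

```python
import re
from collections import Counter

# Assumed entry layout: <dac>/<wmo>/profiles/<prefix><wmo>_<cycle>[D].nc
FILE_PATTERN = re.compile(r"^[^/]+/(\d+)/profiles/[A-Z]{1,2}\d+_(\d+)D?\.nc$")


def parse_wmo_cycle(file_entry):
    """Return (wmo, cycle) as integers from one 'file' column entry."""
    match = FILE_PATTERN.match(file_entry)
    if match is None:
        raise ValueError(f"Unexpected file entry: {file_entry!r}")
    return int(match.group(1)), int(match.group(2))


# Illustrative entries, not taken from a real index
files = [
    "aoml/13857/profiles/R13857_001.nc",
    "aoml/13857/profiles/R13857_002.nc",
    "coriolis/6902746/profiles/BD6902746_010D.nc",
]
counts = Counter(parse_wmo_cycle(f)[0] for f in files)
print(dict(counts))
```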

__init__(host: str = 'https://data-argo.ifremer.fr', index_file: str = 'ar_index_global_prof.txt', convention: str | None = None, cache: bool = False, cachedir: str = '', timeout: int = 0) → object

Create an Argo index file store

Parameters:

host (str, default: https://data-argo.ifremer.fr) – Local or remote (ftp or http) path to a dac folder (GDAC structure compliant). This takes values like: ftp://ftp.ifremer.fr/ifremer/argo, ftp://usgodae.org/pub/outgoing/argo or a local absolute path.
index_file (str, default: ar_index_global_prof.txt) –
Name of the csv-like text file with the index.

Possible values are the standard file names: ar_index_global_prof.txt, argo_bio-profile_index.txt or argo_synthetic-profile_index.txt.

You can also use the following shortcuts: core, bgc-b, bgc-s, respectively.
convention (str, default: None) –

Set the expected format convention of the index file. This is useful when loading an index file with a custom name. If set to None, the convention is inferred from the index_file value.
Possible values: ar_index_global_prof, argo_bio-profile_index, or argo_synthetic-profile_index.

You can also use the following keywords: core, bgc-b, bgc-s, respectively.
cache (bool, default: False) – Whether to use the cache.
cachedir (str, default: OPTIONS['cachedir']) – Folder in which to store cached files.
timeout (int, default: OPTIONS['api_timeout']) – Timeout, in seconds, for connecting to a remote host (ftp or http).
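Since the index is described as a csv-like text file, a rough picture of what loading it with an `nrows` limit involves can be sketched with the standard library alone. The sample text, the column names (assumed to follow the core convention) and the `read_index` helper are illustrative assumptions. This is not how argopy reads the file, since it relies on PyArrow or pandas.

```python
import csv
import io
from itertools import islice

# Illustrative sample: comment header lines, then a CSV header and rows.
# Column names are assumed to follow the core index convention.
SAMPLE = """\
# Title : Profile directory file (illustrative header)
# Real index files carry more comment lines than shown here
file,date,latitude,longitude,ocean,profiler_type,institution,date_update
aoml/13857/profiles/R13857_001.nc,19970729200300,0.267,-16.032,A,845,AO,20181011180520
aoml/13857/profiles/R13857_002.nc,19970809192112,0.072,-17.659,A,845,AO,20181011180521
"""


def read_index(text, nrows=None):
    """Parse csv-like index text, skipping '#' comment lines.

    With nrows=None every record is returned; otherwise only the first nrows.
    """
    lines = (ln for ln in io.StringIO(text) if not ln.startswith("#"))
    reader = csv.DictReader(lines)
    return list(islice(reader, nrows))


rows = read_index(SAMPLE, nrows=1)
print(rows[0]["file"], rows[0]["date"])
```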

Methods

`__init__`([host, index_file, convention, ...])	Create an Argo index file store
`cachepath`(path)	Return path to a cached file
`clear_cache`()	Clear cache registry and files associated with this store instance.
`load`([nrows, force])	Load an Argo-index file content
`read_params`([index])	Return list of unique PARAMETERs in index or search results
`read_wmo`([index])	Return list of unique WMOs in search results
`records_per_wmo`([index])	Return the number of records per unique WMOs in search results
`run`([nrows])	Filter index with search criteria
`search_cyc`(CYCs[, nrows])	Search index for cycle numbers
`search_lat_lon`(BOX[, nrows])	Search index for a rectangular latitude/longitude domain
`search_lat_lon_tim`(BOX[, nrows])	Search index for a rectangular latitude/longitude domain and time range
`search_parameter_data_mode`(PARAMs[, ...])	Search index for profiles with a parameter in a specific data mode
`search_params`(PARAMs[, logical, nrows])	Search index for one or a list of parameters
`search_tim`(BOX[, nrows])	Search index for a time range
`search_wmo`(WMOs[, nrows])	Search index for floats defined by their WMO
`search_wmo_cyc`(WMOs, CYCs[, nrows])	Search index for floats defined by their WMO and specific cycle numbers
`to_dataframe`([nrows, index, completed])	Return index or search results as `pandas.DataFrame`
`to_indexfile`(outputfile)	Save search results on file, following the Argo standard index formats

Attributes

`N_FILES`	Number of rows in search result or index if search not triggered
`N_MATCH`	Number of rows in search result
`N_RECORDS`	Number of rows in the full index
`backend`	Name of store backend
`cname`	Return the search constraint(s) as a pretty formatted string
`convention`	Convention of the index (standard csv file name)
`convention_supported`	List of supported conventions
`convention_title`	Long name for the index convention
`ext`	Storage file extension
`search_path`	Path to search result uri
`search_type`	Dictionary with search meta-data
`sha_df`	Returns a unique SHA for a cname/dataframe
`sha_h5`	Returns a unique SHA for a cname/hdf5
`sha_pq`	Returns a unique SHA for a cname/parquet
`shape`	Shape of the index array
`uri`	List of URI from search results
`uri_full_index`	List of URI from index
