argopy.ArgoIndex

class ArgoIndex(host: str = 'https://data-argo.ifremer.fr', index_file: str = 'ar_index_global_prof.txt', cache: bool = False, cachedir: str = '', timeout: int = 0, convention: str | None = None)

Argo GDAC index store

If pyarrow is available, this class uses pyarrow.Table as its internal storage format; otherwise it falls back to pandas.DataFrame.

Examples

An index store is instantiated with the access path (host) and the index file:

>>> idx = ArgoIndex()
>>> idx = ArgoIndex(host="ftp://ftp.ifremer.fr/ifremer/argo")
>>> idx = ArgoIndex(host="https://data-argo.ifremer.fr", index_file="ar_index_global_prof.txt")
>>> idx = ArgoIndex(host="https://data-argo.ifremer.fr", index_file="ar_index_global_prof.txt", cache=True)
>>> idx = ArgoIndex(host=".", index_file="dummy_index.txt", convention="ar_index_global_prof")

Full index methods and properties:

>>> idx.load()
>>> idx.load(nrows=12)  # Only load the first N rows of the index
>>> idx.N_RECORDS  # Shortcut for length of 1st dimension of the index array
>>> idx.index  # internal storage structure of the full index (:class:`pyarrow.Table` or :class:`pandas.DataFrame`)
>>> idx.shape  # shape of the full index array
>>> idx.uri_full_index  # List of absolute paths to files from the full index table column 'file'
>>> idx.to_dataframe(index=True)  # Convert index to user-friendly :class:`pandas.DataFrame`
>>> idx.to_dataframe(index=True, nrows=2)  # Only return the first nrows rows of the index

Search methods and properties:

>>> idx.search_wmo(1901393)
>>> idx.search_cyc(1)
>>> idx.search_wmo_cyc(1901393, [1,12])
>>> idx.search_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Take an index BOX definition
>>> idx.search_lat_lon([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Take an index BOX definition
>>> idx.search_lat_lon_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Take an index BOX definition
>>> idx.search_params(['C1PHASE_DOXY', 'DOWNWELLING_PAR'])  # Take a list of strings, only for the BGC index
>>> idx.N_MATCH  # Shortcut for length of 1st dimension of the search results array
>>> idx.search  # Internal table with search results
>>> idx.uri  # List of absolute paths to files from the search results table column 'file'
>>> idx.run()  # Run the search and save results in cache if necessary
>>> idx.to_dataframe()  # Convert search results to user-friendly :class:`pandas.DataFrame`
>>> idx.to_dataframe(nrows=2)  # Only return the first nrows rows of the search results
>>> idx.to_indexfile("search_index.txt")  # Export search results to Argo standard index file
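
To make the BOX argument used by the search examples concrete, here is a minimal standard-library sketch of a latitude/longitude/time filter. It is not argopy's implementation. It assumes the BOX order is [lon_min, lon_max, lat_min, lat_max, date_min, date_max], that bounds are inclusive, and that index dates use the YYYYMMDDHHMMSS layout; the rows and the `in_box` helper are hypothetical.

```python
from datetime import datetime

# Hypothetical, already-parsed index rows (values are made up for illustration).
ROWS = [
    {"file": "aoml/1900001/profiles/D1900001_001.nc", "date": "20070805120000",
     "latitude": 42.1, "longitude": -58.3},
    {"file": "aoml/1900001/profiles/D1900001_002.nc", "date": "20071005120000",
     "latitude": 42.5, "longitude": -57.9},
    {"file": "coriolis/1900002/profiles/R1900002_001.nc", "date": "20070810000000",
     "latitude": -10.0, "longitude": 80.0},
]


def in_box(row, box):
    """Return True if a row lies inside the box (inclusive bounds are an assumption)."""
    lon_min, lon_max, lat_min, lat_max, date_min, date_max = box
    when = datetime.strptime(row["date"], "%Y%m%d%H%M%S")
    return (
        lon_min <= row["longitude"] <= lon_max
        and lat_min <= row["latitude"] <= lat_max
        and datetime.fromisoformat(date_min) <= when <= datetime.fromisoformat(date_max)
    )


# Same BOX values as in the search examples above.
BOX = [-60, -55, 40.0, 45.0, "2007-08-01", "2007-09-01"]
matches = [row["file"] for row in ROWS if in_box(row, BOX)]
print(matches)
```

Only the first row passes all three tests here: the second falls outside the time range and the third outside the longitude range.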

Misc:

>>> idx.convention  # What is the expected index format (core vs BGC profile index)
>>> idx.cname
>>> idx.read_wmo
>>> idx.read_params
>>> idx.records_per_wmo

__init__(host: str = 'https://data-argo.ifremer.fr', index_file: str = 'ar_index_global_prof.txt', cache: bool = False, cachedir: str = '', timeout: int = 0, convention: str | None = None)

Create an Argo index file store

Parameters:

host (str, default: https://data-argo.ifremer.fr) – Local or remote (ftp/http) path to a dac folder that follows the GDAC structure, for example ftp://ftp.ifremer.fr/ifremer/argo, ftp://usgodae.org/pub/outgoing/argo, or a local absolute path.
index_file (str, default: ar_index_global_prof.txt) – Name of the csv-like text file with the index
cache (bool, default: False) – Use cache or not.
cachedir (str, default: OPTIONS['cachedir']) – Folder where to store cached files
convention (str, default: ar_index_global_prof) – Set the expected format convention of the index file. This is useful when loading an index file with a custom name.
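
As an illustration of what "csv-like text file" means for index_file, the sketch below parses a small made-up sample with the standard library only. It assumes the layout of a profile index: header comment lines starting with '#', one line of column names, then one row per profile. The `read_index` helper and the sample rows are hypothetical and do not reflect how ArgoIndex reads the file internally; the `nrows` argument only mimics the idea of loading the first N rows.

```python
import csv
import io

# Made-up sample mimicking the assumed layout of a profile index file.
SAMPLE_INDEX = """\
# Title : illustrative sample, not real data
# Description : three fake profile records
file,date,latitude,longitude,ocean,profiler_type,institution,date_update
aoml/1900001/profiles/D1900001_001.nc,20070805120000,42.100,-58.300,A,846,AO,20180101000000
aoml/1900001/profiles/D1900001_002.nc,20070815120000,42.500,-57.900,A,846,AO,20180101000000
coriolis/1900002/profiles/R1900002_001.nc,20090101000000,-10.000,80.000,I,846,IF,20180101000000
"""


def read_index(text, nrows=None):
    """Parse csv-like index text into a list of dicts, skipping '#' comment lines."""
    data_lines = (line for line in io.StringIO(text) if not line.startswith("#"))
    rows = []
    for row in csv.DictReader(data_lines):
        if nrows is not None and len(rows) >= nrows:
            break  # stop once the requested number of rows has been collected
        rows.append(row)
    return rows


rows = read_index(SAMPLE_INDEX)
first_two = read_index(SAMPLE_INDEX, nrows=2)
print(len(rows), len(first_two))
```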

Methods

`__init__`([host, index_file, cache, ...])	Create an Argo index file store
`cachepath`(path)	Return path to a cached file
`clear_cache`()	Clear cache registry and files associated with this store instance.
`load`([nrows, force])	Load an Argo-index file content
`read_params`([index])	Return list of unique PARAMETERs in index or search results
`read_wmo`([index])	Return list of unique WMOs in search results
`records_per_wmo`([index])	Return the number of records per unique WMOs in search results
`run`([nrows])	Filter index with search criteria
`search_cyc`(CYCs[, nrows])	Search index for cycle numbers
`search_lat_lon`(BOX[, nrows])	Search index for a rectangular latitude/longitude domain
`search_lat_lon_tim`(BOX[, nrows])	Search index for a rectangular latitude/longitude domain and time range
`search_params`(PARAMs[, nrows])	Search index for a list of parameters
`search_tim`(BOX[, nrows])	Search index for a time range
`search_wmo`(WMOs[, nrows])	Search index for floats defined by their WMO
`search_wmo_cyc`(WMOs, CYCs[, nrows])	Search index for floats defined by their WMO and specific cycle numbers
`to_dataframe`([nrows, index])	Return index or search results as `pandas.DataFrame`
`to_indexfile`(outputfile)	Save search results on file, following the Argo standard index formats
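
The methods above are typically combined as load, search, then export. The helper below is a hedged sketch of that sequence, not part of argopy: `summarize_float` is a hypothetical name, and it relies only on members listed on this page (`load`, `search_wmo`, `N_RECORDS`, `N_MATCH`, `uri`, `to_dataframe`). It accepts any object exposing those members, so it makes no assumption about return values of the search methods.

```python
def summarize_float(idx, wmo, nrows=5):
    """Search an index store for one float and return a small summary dict.

    `idx` is expected to behave like the ArgoIndex documented here.
    """
    idx.load()            # load the full index
    idx.search_wmo(wmo)   # restrict the search results to this float
    return {
        "wmo": wmo,
        "records_in_index": idx.N_RECORDS,   # rows in the full index
        "matches": idx.N_MATCH,              # rows in the search results
        "first_uris": list(idx.uri)[:nrows],
        "preview": idx.to_dataframe(nrows=nrows),
    }


# Usage, assuming argopy is installed and the default host is reachable:
#   from argopy import ArgoIndex
#   summary = summarize_float(ArgoIndex(), 1901393)
```

Calling `load` explicitly is a conservative choice for this sketch; whether the search methods trigger loading on their own is not stated on this page.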

Attributes

`N_FILES`	Number of rows in search result or index if search not triggered
`N_MATCH`	Number of rows in search result
`N_RECORDS`	Number of rows in the full index
`backend`	Name of store backend
`cname`	Return the search constraint(s) as a pretty formatted string
`convention`	Convention of the index (standard csv file name)
`convention_supported`	List of supported conventions
`convention_title`	Long name for the index convention
`ext`	Storage file extension
`search_path`	Path to search result uri
`search_type`	Dictionary with search meta-data
`sha_df`	Returns a unique SHA for a cname/dataframe
`sha_h5`	Returns a unique SHA for a cname/hdf5
`sha_pq`	Returns a unique SHA for a cname/parquet
`shape`	Shape of the index array
`uri`	List of URI from search results
`uri_full_index`	List of URI from index
