
class ArgoIndex(host: str = '', index_file: str = 'ar_index_global_prof.txt', convention: str | None = None, cache: bool = False, cachedir: str = '', timeout: int = 0)

Argo GDAC index store

If pyarrow is available, this class uses a pyarrow.Table as its internal storage format; otherwise, it falls back to a pandas.DataFrame.

You can use the exact index file names or keywords:

  • core for the ar_index_global_prof.txt index file,

  • bgc-b for the argo_bio-profile_index.txt index file,

  • bgc-s for the argo_synthetic-profile_index.txt index file.
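The keyword table above can be read as a simple lookup. The sketch below is only an illustration of that mapping: `INDEX_FILE_KEYWORDS` and `resolve_index_file` are hypothetical names, not part of the ArgoIndex API, and the fall-through behaviour for unknown values is an assumption.

```python
# Hypothetical helper, not part of the ArgoIndex API: it only mirrors the
# keyword-to-file-name list given above.
INDEX_FILE_KEYWORDS = {
    "core": "ar_index_global_prof.txt",
    "bgc-b": "argo_bio-profile_index.txt",
    "bgc-s": "argo_synthetic-profile_index.txt",
}


def resolve_index_file(index_file: str) -> str:
    """Return the exact index file name for a keyword, or the input unchanged."""
    # Assumption: anything that is not a keyword is treated as a file name.
    return INDEX_FILE_KEYWORDS.get(index_file, index_file)


print(resolve_index_file("bgc-s"))
print(resolve_index_file("dummy_index.txt"))
```

With such a lookup, a custom file name such as `dummy_index.txt` passes through untouched, which is why a `convention` must then be given explicitly.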


An index store is instantiated with a host (any access path, local, http or ftp) and an index file:

>>> idx = ArgoIndex()
>>> idx = ArgoIndex(host="")  # Default host
>>> idx = ArgoIndex(host="", index_file="ar_index_global_prof.txt")  # Default index
>>> idx = ArgoIndex(index_file="bgc-s")  # Use keywords instead of exact file names
>>> idx = ArgoIndex(host="", index_file="bgc-b", cache=True)  # Use cache for better performance
>>> idx = ArgoIndex(host=".", index_file="dummy_index.txt", convention="core")  # Load your own index

Full index methods and properties:

>>> idx.load()
>>> idx.load(nrows=12)  # Only load the first N rows of the index
>>> idx.to_dataframe(index=True)  # Convert index to user-friendly :class:`pandas.DataFrame`
>>> idx.to_dataframe(index=True, nrows=2)  # Only return the first nrows rows of the index
>>> idx.N_RECORDS  # Shortcut for length of 1st dimension of the index array
>>> idx.index  # internal storage structure of the full index (:class:`pyarrow.Table` or :class:`pandas.DataFrame`)
>>> idx.shape  # shape of the full index array
>>> idx.uri_full_index  # List of absolute paths to files from the full index table column 'file'
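To make the `load(nrows=...)` and `shape` ideas concrete, here is a minimal standard-library sketch of reading a csv-like index text. It does not use ArgoIndex itself. The column layout, the `#` comment header and the two sample records are assumptions made for illustration (the records are made up, not real data), and `load_index` is a hypothetical helper.

```python
import csv
import io
from itertools import islice

# Made-up sample in the csv-like layout assumed here for a core index file.
SAMPLE_INDEX = """\
# Title : illustrative sample, not real data
file,date,latitude,longitude,ocean,profiler_type,institution,date_update
aoml/1901393/profiles/R1901393_001.nc,20070801120000,42.5,-57.0,A,846,AO,20080101000000
aoml/1901393/profiles/R1901393_002.nc,20070811120000,43.0,-56.5,A,846,AO,20080101000000
"""


def load_index(text, nrows=None):
    """Parse index text into a list of dicts, keeping at most nrows records."""
    # Skip the comment header; the first remaining line holds the column names.
    lines = (line for line in io.StringIO(text) if not line.startswith("#"))
    reader = csv.DictReader(lines)
    # islice with nrows=None reads every record.
    return list(islice(reader, nrows))


records = load_index(SAMPLE_INDEX)
first_only = load_index(SAMPLE_INDEX, nrows=1)

# Rough analogue of a (rows, columns) shape for this toy table.
shape = (len(records), len(records[0]))
print(shape, len(first_only))
```

Limiting the number of rows at read time, rather than after loading, is what makes a partial load cheap on a large index.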

Search methods:

>>> idx.search_wmo(1901393)
>>> idx.search_cyc(1)
>>> idx.search_wmo_cyc(1901393, [1,12])
>>> idx.search_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Takes an index BOX definition; searches on the time range
>>> idx.search_lat_lon([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Takes an index BOX definition; searches on the latitude/longitude domain
>>> idx.search_lat_lon_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Take an index BOX definition
>>> idx.search_params(['C1PHASE_DOXY', 'DOWNWELLING_PAR'])  # Takes a list of strings; BGC index only
>>> idx.search_parameter_data_mode({'BBP700': 'D', 'DOXY': ['A', 'D']})  # Take a dict.
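The BOX argument used by the three space/time searches can be pictured as a plain filter over index records. The sketch below assumes the BOX order is `[lon_min, lon_max, lat_min, lat_max, date_min, date_max]` and that bounds are inclusive. Both are assumptions for illustration, `in_box` is a hypothetical helper, and the records are made up.

```python
from datetime import datetime


def in_box(record, box):
    """Return True if a record falls inside the lon/lat/time BOX."""
    # Assumed BOX order: lon_min, lon_max, lat_min, lat_max, date_min, date_max.
    lon_min, lon_max, lat_min, lat_max, date_min, date_max = box
    t_min = datetime.strptime(date_min, "%Y-%m-%d")
    t_max = datetime.strptime(date_max, "%Y-%m-%d")
    # Assumption: all bounds are inclusive.
    return (
        lon_min <= record["longitude"] <= lon_max
        and lat_min <= record["latitude"] <= lat_max
        and t_min <= record["date"] <= t_max
    )


# Made-up records, only for illustration.
records = [
    {"wmo": 1901393, "longitude": -57.0, "latitude": 42.5, "date": datetime(2007, 8, 15)},
    {"wmo": 1901393, "longitude": -57.0, "latitude": 42.5, "date": datetime(2007, 10, 1)},
    {"wmo": 6900000, "longitude": -30.0, "latitude": 42.5, "date": datetime(2007, 8, 15)},
]

BOX = [-60, -55, 40.0, 45.0, "2007-08-01", "2007-09-01"]
matches = [r for r in records if in_box(r, BOX)]
print(len(matches))  # 1
```

Only the first record matches: the second falls outside the time range and the third outside the longitude range.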

Search result properties and methods:

>>> idx.N_MATCH  # Shortcut for length of 1st dimension of the search results array
>>>  # Internal table with search results
>>> idx.uri  # List of absolute paths to files from the search results table column 'file'
>>>  # Run the search and save results in cache if necessary
>>> idx.to_dataframe()  # Convert search results to user-friendly :class:`pandas.DataFrame`
>>> idx.to_dataframe(nrows=2)  # Only return the first nrows rows of the search results
>>> idx.to_indexfile("search_index.txt")  # Export search results to Argo standard index file


>>> idx.convention  # What is the expected index format (core vs BGC profile index)
>>> idx.cname
>>> idx.read_wmo
>>> idx.read_params
>>> idx.records_per_wmo
__init__(host: str = '', index_file: str = 'ar_index_global_prof.txt', convention: str | None = None, cache: bool = False, cachedir: str = '', timeout: int = 0) -> object

Create an Argo index file store

  • host (str) – Local or remote (ftp or http) path to a dac folder (GDAC structure compliant). This can be a remote URL or a local absolute path.

  • index_file (str, default: ar_index_global_prof.txt) –

    Name of the csv-like text file with the index.

    Possible values are the standard file names: ar_index_global_prof.txt, argo_bio-profile_index.txt or argo_synthetic-profile_index.txt.

    You can also use the following shortcuts: core, bgc-b, bgc-s, respectively.

  • convention (str, default: None) –

    Set the expected format convention of the index file. This is useful when loading an index file with a custom name. If set to None, the convention is inferred from the index_file value.

    Possible values: ar_index_global_prof, argo_bio-profile_index, or argo_synthetic-profile_index.

    You can also use the keywords: core, bgc-b, bgc-s, respectively.

  • cache (bool, default: False) – Use cache or not.

  • cachedir (str, default: OPTIONS['cachedir']) – Folder where to store cached files

  • timeout (int, default: OPTIONS['api_timeout']) – Timeout in seconds to connect to a remote host (ftp or http).
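How the convention might be derived when `convention` is None can be sketched as follows. This is not the library's implementation: `infer_convention` is a hypothetical helper, and raising `ValueError` for an unrecognised name is an assumption about reasonable behaviour.

```python
from pathlib import PurePosixPath

# Convention names and keywords as listed in the parameter descriptions above.
SUPPORTED_CONVENTIONS = (
    "ar_index_global_prof",
    "argo_bio-profile_index",
    "argo_synthetic-profile_index",
)
CONVENTION_KEYWORDS = {
    "core": "ar_index_global_prof",
    "bgc-b": "argo_bio-profile_index",
    "bgc-s": "argo_synthetic-profile_index",
}


def infer_convention(index_file, convention=None):
    """Return a supported convention name (hypothetical helper)."""
    # An explicit convention wins; otherwise fall back on the index file name.
    candidate = convention if convention is not None else index_file
    candidate = CONVENTION_KEYWORDS.get(candidate, candidate)
    # Drop a trailing extension such as ".txt", if any.
    stem = PurePosixPath(candidate).stem
    if stem not in SUPPORTED_CONVENTIONS:
        # Assumption: an unknown name is an error rather than a silent default.
        raise ValueError(f"Cannot infer index convention from {candidate!r}")
    return stem


print(infer_convention("argo_bio-profile_index.txt"))
print(infer_convention("dummy_index.txt", convention="core"))
```

This mirrors the `ArgoIndex(host=".", index_file="dummy_index.txt", convention="core")` example: a custom file name carries no convention information, so it has to be supplied.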


__init__([host, index_file, convention, ...])

Create an Argo index file store


Return path to a cached file


Clear cache registry and files associated with this store instance.

load([nrows, force])

Load an Argo-index file content


read_params

Return list of unique PARAMETERs in index or search results


read_wmo

Return list of unique WMOs in search results


records_per_wmo

Return the number of records per unique WMO in search results


Filter index with search criteria

search_cyc(CYCs[, nrows])

Search index for cycle numbers

search_lat_lon(BOX[, nrows])

Search index for a rectangular latitude/longitude domain

search_lat_lon_tim(BOX[, nrows])

Search index for a rectangular latitude/longitude domain and time range

search_parameter_data_mode(PARAMs[, ...])

Search index for profiles with a parameter in a specific data mode

search_params(PARAMs[, logical, nrows])

Search index for one or a list of parameters

search_tim(BOX[, nrows])

Search index for a time range

search_wmo(WMOs[, nrows])

Search index for floats defined by their WMO

search_wmo_cyc(WMOs, CYCs[, nrows])

Search index for floats defined by their WMO and specific cycle numbers

to_dataframe([nrows, index, completed])

Return index or search results as pandas.DataFrame


to_indexfile

Save search results to file, following the Argo standard index formats



Number of rows in the search results, or in the full index if no search was triggered


N_MATCH

Number of rows in search result


N_RECORDS

Number of rows in the full index


Name of store backend


cname

Return the search constraint(s) as a pretty formatted string


convention

Convention of the index (standard csv file name)


List of supported conventions


Long name for the index convention


Storage file extension


Path to search result uri


Dictionary with search meta-data


Returns a unique SHA for a cname/dataframe


Returns a unique SHA for a cname/hdf5


Returns a unique SHA for a cname/parquet


shape

Shape of the index array


uri

List of URI from search results


uri_full_index

List of URI from index