argopy.stores.argo_index_pd.indexstore_pandas

class indexstore_pandas(host: str = 'https://data-argo.ifremer.fr', index_file: str = 'ar_index_global_prof.txt', cache: bool = False, cachedir: str = '', timeout: int = 0)

Argo GDAC index store using pandas.DataFrame as internal storage format.

With this store, the index and search results are saved as pickle files in the cache.
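
To illustrate the idea of pickle-based caching, here is a minimal sketch using only the standard library. It is not argopy's implementation: the helper names `cache_path` and `cached_search`, the use of SHA-256, and the `.pkl` extension are all assumptions made for illustration.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def cache_path(cachedir: str, cname: str) -> Path:
    """Hypothetical helper: map a search name to a pickle file path."""
    sha = hashlib.sha256(cname.encode("utf-8")).hexdigest()
    return Path(cachedir) / f"{sha}.pkl"


def cached_search(cachedir: str, cname: str, compute):
    """Return cached results for `cname`; compute and save them on a cache miss."""
    path = cache_path(cachedir, cname)
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    results = compute()
    with path.open("wb") as f:
        pickle.dump(results, f)
    return results


calls = []  # records how many times the search actually ran


def fake_search():
    """Stand-in for a real index search (made-up result)."""
    calls.append(1)
    return [{"wmo": 1901393, "cyc": 1}]


with tempfile.TemporaryDirectory() as tmp:
    first = cached_search(tmp, "WMO1901393", fake_search)
    second = cached_search(tmp, "WMO1901393", fake_search)  # served from the pickle file
```

In this sketch the second call never runs `fake_search`, because the result is read back from the pickle file written by the first call.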

Examples

An index store is instantiated with the access path (host) and the index file:

>>> idx = indexstore()
>>> idx = indexstore(host="ftp://ftp.ifremer.fr/ifremer/argo")
>>> idx = indexstore(host="https://data-argo.ifremer.fr", index_file="ar_index_global_prof.txt")
>>> idx = indexstore(host="https://data-argo.ifremer.fr", index_file="ar_index_global_prof.txt", cache=True)

Index methods and properties:

>>> idx.load()
>>> idx.load(nrows=12)  # Only load the first N rows of the index
>>> idx.N_RECORDS  # Shortcut for length of 1st dimension of the index array
>>> idx.index  # Internal storage structure of the full index (a pandas.DataFrame for this store)
>>> idx.shape  # Shape of the full index array
>>> idx.uri_full_index  # List of absolute paths to files from the full index table column 'file'
>>> idx.to_dataframe(index=True)  # Convert the index to a user-friendly pandas.DataFrame
>>> idx.to_dataframe(index=True, nrows=2)  # Only return the first nrows of the index

Search methods and properties:

>>> idx.search_wmo(1901393)
>>> idx.search_cyc(1)
>>> idx.search_wmo_cyc(1901393, [1,12])
>>> idx.search_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Takes an index BOX definition
>>> idx.search_lat_lon([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Takes an index BOX definition
>>> idx.search_lat_lon_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Takes an index BOX definition
>>> idx.N_MATCH  # Shortcut for length of 1st dimension of the search results array
>>> idx.search  # Internal table with search results
>>> idx.uri  # List of absolute paths to files from the search results table column 'file'
>>> idx.run()  # Run the search and save results in cache if necessary
>>> idx.to_dataframe()  # Convert search results to a user-friendly pandas.DataFrame
>>> idx.to_dataframe(nrows=2)  # Only returns the first nrows of the search results
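
The index BOX used by the box searches above is a six-element list, `[lon_min, lon_max, lat_min, lat_max, date_min, date_max]`. The following sketch shows one way such a box could be applied to index records in plain Python. The function `in_box`, the record layout, the sample values, and the inclusive bounds are assumptions for illustration, not the store's actual filtering code.

```python
from datetime import datetime


def in_box(record: dict, box: list) -> bool:
    """Hypothetical filter: True if the record falls inside the BOX (inclusive bounds assumed)."""
    lon_min, lon_max, lat_min, lat_max, t_min, t_max = box
    t0 = datetime.fromisoformat(t_min)
    t1 = datetime.fromisoformat(t_max)
    return (
        lon_min <= record["longitude"] <= lon_max
        and lat_min <= record["latitude"] <= lat_max
        and t0 <= record["date"] <= t1
    )


BOX = [-60, -55, 40.0, 45.0, "2007-08-01", "2007-09-01"]

# Made-up records, for illustration only
records = [
    {"longitude": -57.5, "latitude": 42.0, "date": datetime(2007, 8, 15)},  # inside the box
    {"longitude": -57.5, "latitude": 42.0, "date": datetime(2008, 1, 1)},   # outside the time range
    {"longitude": -30.0, "latitude": 42.0, "date": datetime(2007, 8, 15)},  # outside the longitude range
]

matches = [r for r in records if in_box(r, BOX)]
```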

Misc:

>>> idx.cname
>>> idx.read_wmo()
>>> idx.records_per_wmo()
__init__(host: str = 'https://data-argo.ifremer.fr', index_file: str = 'ar_index_global_prof.txt', cache: bool = False, cachedir: str = '', timeout: int = 0)

Create an Argo index file store

Parameters
  • host (str, default: https://data-argo.ifremer.fr) – Local or remote (ftp/http) path to a dac folder (GDAC structure compliant). Typical values are ftp://ftp.ifremer.fr/ifremer/argo, ftp://usgodae.org/pub/outgoing/argo, or a local absolute path.

  • index_file (str, default: ar_index_global_prof.txt) – Name of the csv-like text file with the index

  • cache (bool, default: False) – Whether to use the cache.

  • cachedir (str, default: OPTIONS['cachedir']) – Folder in which cached files are stored
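
As a rough illustration of what a "csv-like text file with the index" looks like, the sketch below parses a small sample with the standard library. It assumes the usual GDAC profile index layout: comment lines starting with `#`, then a header row with the columns `file`, `date`, `latitude`, `longitude`, `ocean`, `profiler_type`, `institution`, `date_update`. The function `read_index` is hypothetical and the sample rows are made up; the `nrows` argument only mimics the behaviour of `load(nrows=...)`.

```python
import csv
import io
from itertools import islice

# Illustrative sample only, not real index content
SAMPLE = """\
# Title : illustrative sample of a profile index file
file,date,latitude,longitude,ocean,profiler_type,institution,date_update
aoml/13857/profiles/R13857_001.nc,19970729200300,0.267,-16.032,A,845,AO,20181011180520
aoml/13857/profiles/R13857_002.nc,19970809192112,0.072,-17.659,A,845,AO,20181011180521
coriolis/6902746/profiles/R6902746_001.nc,20170710101500,42.100,-57.200,A,846,IF,20170711090000
"""


def read_index(text: str, nrows=None) -> list:
    """Hypothetical parser: skip '#' comment lines and return rows as dictionaries."""
    lines = (ln for ln in io.StringIO(text) if not ln.startswith("#"))
    reader = csv.DictReader(lines)
    # islice with nrows=None returns every row
    return list(islice(reader, nrows))


all_rows = read_index(SAMPLE)
first_two = read_index(SAMPLE, nrows=2)
```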

Methods

__init__([host, index_file, cache, ...])

Create an Argo index file store

cachepath(path)

Return path to a cached file

clear_cache()

Clear cache registry and files associated with this store instance.

load([nrows, force])

Load an Argo-index file content

read_wmo([index])

Return list of unique WMOs in search results

records_per_wmo([index])

Return the number of records per unique WMO in search results

run([nrows])

Filter index with search criteria

search_cyc(CYCs[, nrows])

Search index for cycle numbers

search_lat_lon(BOX[, nrows])

Search index for a rectangular latitude/longitude domain

search_lat_lon_tim(BOX[, nrows])

Search index for a rectangular latitude/longitude domain and time range

search_tim(BOX[, nrows])

Search index for a time range

search_wmo(WMOs[, nrows])

Search index for floats defined by their WMO

search_wmo_cyc(WMOs, CYCs[, nrows])

Search index for floats defined by their WMO and specific cycle numbers

to_dataframe([nrows, index])

Return index or search results as pandas.DataFrame
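
The summaries returned by read_wmo and records_per_wmo can be pictured with a short standard-library sketch. It assumes that the float WMO number is the second component of each `file` entry (as in `aoml/1901393/profiles/...`); the helper `wmo_of` and the sample paths are hypothetical and do not reflect how the store computes these values internally.

```python
from collections import Counter

# Made-up 'file' column entries, for illustration only
files = [
    "aoml/1901393/profiles/R1901393_001.nc",
    "aoml/1901393/profiles/R1901393_002.nc",
    "coriolis/6902746/profiles/D6902746_001.nc",
]


def wmo_of(path: str) -> int:
    """Hypothetical helper: extract the WMO number from a 'file' entry."""
    return int(path.split("/")[1])


# Number of records per unique WMO
records_per_wmo = Counter(wmo_of(f) for f in files)

# Sorted list of unique WMOs
unique_wmos = sorted(records_per_wmo)
```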

Attributes

N_FILES

Number of rows in the search results, or in the full index if no search has been run

N_MATCH

Number of rows in the search results

N_RECORDS

Number of rows in the full index

backend

Name of store backend

cname

Return the search constraint(s) as a pretty formatted string

ext

Storage file extension

search_path

Path to search result uri

search_type

Dictionary with search meta-data

sha_df

Returns a unique SHA for a cname/dataframe

sha_h5

Returns a unique SHA for a cname/hdf5

sha_pq

Returns a unique SHA for a cname/parquet

shape

Shape of the index array

uri

List of URIs from search results

uri_full_index

List of URIs from index
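
The uri and uri_full_index attributes list absolute paths built from the 'file' column. A minimal sketch of how such a path might be assembled is shown below, assuming the GDAC layout in which profile files sit under a `dac` folder at the host root. The function `to_uri` is hypothetical and is not part of the store's API.

```python
def to_uri(host: str, file_entry: str, sep: str = "/") -> str:
    """Hypothetical helper: join host, the 'dac' folder and a 'file' column entry."""
    return sep.join([host.rstrip(sep), "dac", file_entry])


uri = to_uri("https://data-argo.ifremer.fr", "aoml/13857/profiles/R13857_001.nc")
```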