argopy.ArgoIndex
- class ArgoIndex(host: str = 'https://data-argo.ifremer.fr', index_file: str = 'ar_index_global_prof.txt', cache: bool = False, cachedir: str = '', timeout: int = 0, convention: str | None = None)
Argo GDAC index store
If Pyarrow is available, this class will use pyarrow.Table as internal storage format; otherwise, a pandas.DataFrame will be used.
Examples
An index store is instantiated with the access path (host) and the index file:
>>> idx = ArgoIndex()
>>> idx = ArgoIndex(host="ftp://ftp.ifremer.fr/ifremer/argo")
>>> idx = ArgoIndex(host="https://data-argo.ifremer.fr", index_file="ar_index_global_prof.txt")
>>> idx = ArgoIndex(host="https://data-argo.ifremer.fr", index_file="ar_index_global_prof.txt", cache=True)
>>> idx = ArgoIndex(host=".", index_file="dummy_index.txt", convention="ar_index_global_prof")
Full index methods and properties:
>>> idx.load()
>>> idx.load(nrows=12)  # Only load the first N rows of the index
>>> idx.N_RECORDS  # Shortcut for length of 1st dimension of the index array
>>> idx.index  # internal storage structure of the full index (:class:`pyarrow.Table` or :class:`pandas.DataFrame`)
>>> idx.shape  # shape of the full index array
>>> idx.uri_full_index  # List of absolute path to files from the full index table column 'file'
>>> idx.to_dataframe(index=True)  # Convert index to user-friendly :class:`pandas.DataFrame`
>>> idx.to_dataframe(index=True, nrows=2)  # Only returns the first nrows of the index
Search methods and properties:
>>> idx.search_wmo(1901393)
>>> idx.search_cyc(1)
>>> idx.search_wmo_cyc(1901393, [1,12])
>>> idx.search_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Take an index BOX definition
>>> idx.search_lat_lon([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Take an index BOX definition
>>> idx.search_lat_lon_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Take an index BOX definition
>>> idx.search_params(['C1PHASE_DOXY', 'DOWNWELLING_PAR'])  # Take a list of strings, only for BGC index !
>>> idx.N_MATCH  # Shortcut for length of 1st dimension of the search results array
>>> idx.search  # Internal table with search results
>>> idx.uri  # List of absolute path to files from the search results table column 'file'
>>> idx.run()  # Run the search and save results in cache if necessary
>>> idx.to_dataframe()  # Convert search results to user-friendly :class:`pandas.DataFrame`
>>> idx.to_dataframe(nrows=2)  # Only returns the first nrows of the search results
>>> idx.to_indexfile("search_index.txt")  # Export search results to Argo standard index file
Misc:
>>> idx.convention  # What is the expected index format (core vs BGC profile index)
>>> idx.cname
>>> idx.read_wmo
>>> idx.read_params
>>> idx.records_per_wmo
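To make the WMO and cycle searches above more concrete, here is a minimal standard-library sketch of filtering a csv-like index by float WMO and cycle number. It is not argopy's implementation: the sample rows are made up, the helpers `load_index` and `search_wmo_cyc` are hypothetical, and it assumes that the WMO and cycle number can be read from the profile file name (e.g. `R1901393_012.nc`).

```python
import csv
import re

# Made-up sample laid out like a GDAC profile index file (assumption):
# '#' comment lines, one header row, then one row per profile file.
SAMPLE_INDEX = """\
# Title : sample profile index (illustrative data only)
file,date,latitude,longitude,ocean,profiler_type,institution,date_update
aoml/1901393/profiles/R1901393_001.nc,20100101120000,42.0,-58.0,A,846,AO,20100102000000
aoml/1901393/profiles/R1901393_012.nc,20100420120000,43.5,-57.1,A,846,AO,20100421000000
coriolis/6902746/profiles/R6902746_001.nc,20170701120000,10.0,-30.0,A,844,IF,20170702000000
"""

# Assumption: file names look like <prefix><WMO>_<cycle>[D].nc
FILE_PATTERN = re.compile(r"/[A-Z]+(\d+)_(\d+)D?\.nc$")


def load_index(text):
    """Parse the index text into a list of dicts, adding 'wmo' and 'cyc' keys."""
    lines = [ln for ln in text.splitlines() if ln and not ln.startswith("#")]
    rows = []
    for rec in csv.DictReader(lines):
        m = FILE_PATTERN.search(rec["file"])
        if m is None:
            continue  # skip rows whose file name does not follow the pattern
        rec["wmo"] = int(m.group(1))
        rec["cyc"] = int(m.group(2))
        rows.append(rec)
    return rows


def search_wmo_cyc(rows, wmo, cycs):
    """Keep the rows of one float (wmo) restricted to the given cycle numbers."""
    wanted = set(cycs)
    return [r for r in rows if r["wmo"] == wmo and r["cyc"] in wanted]


rows = load_index(SAMPLE_INDEX)
match = search_wmo_cyc(rows, 1901393, [1, 12])
print(len(rows), "records,", len(match), "matches")
```

In this sketch the full list of rows plays the role of the full index and the filtered list plays the role of the search results.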
- __init__(host: str = 'https://data-argo.ifremer.fr', index_file: str = 'ar_index_global_prof.txt', cache: bool = False, cachedir: str = '', timeout: int = 0, convention: str | None = None)
Create an Argo index file store
- Parameters:
host (str, default: https://data-argo.ifremer.fr) – Local or remote ftp/http path to a dac folder (GDAC structure compliant). This takes values like ftp://ftp.ifremer.fr/ifremer/argo, ftp://usgodae.org/pub/outgoing/argo or a local absolute path.
index_file (str, default: ar_index_global_prof.txt) – Name of the csv-like text file with the index.
cache (bool, default: False) – Use cache or not.
cachedir (str, default: OPTIONS['cachedir']) – Folder where to store cached files.
convention (str, default: ar_index_global_prof) – Set the expected format convention of the index file. This is useful when trying to load an index file with a custom name.
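The role of `convention` for custom file names can be illustrated with a small sketch. This is a hypothetical helper, not argopy code: it assumes the convention defaults to the index file name without its `.txt` extension, and that the supported conventions are the three GDAC index names listed below.

```python
import os

# Assumed list of supported conventions (GDAC index file names without extension).
SUPPORTED = (
    "ar_index_global_prof",
    "argo_bio-profile_index",
    "argo_synthetic-profile_index",
)


def resolve_convention(index_file, convention=None):
    """Hypothetical helper: decide which format convention applies to an index file."""
    if convention is None:
        # Fall back on the file name itself, e.g. 'ar_index_global_prof.txt'
        convention = os.path.splitext(os.path.basename(index_file))[0]
    if convention not in SUPPORTED:
        raise ValueError(
            f"Cannot infer a convention for {index_file!r}; "
            f"pass one of {SUPPORTED} explicitly"
        )
    return convention


# A standard file name is enough on its own
print(resolve_convention("ar_index_global_prof.txt"))
# A custom file name needs an explicit convention
print(resolve_convention("dummy_index.txt", convention="ar_index_global_prof"))
```

Under these assumptions, a file named `dummy_index.txt` cannot be interpreted without an explicit `convention`, which mirrors the last instantiation example in the class docstring.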
Methods
__init__([host, index_file, cache, ...]) – Create an Argo index file store
cachepath(path) – Return path to a cached file
clear_cache() – Clear cache registry and files associated with this store instance.
load([nrows, force]) – Load an Argo-index file content
read_params([index]) – Return list of unique PARAMETERs in index or search results
read_wmo([index]) – Return list of unique WMOs in search results
records_per_wmo([index]) – Return the number of records per unique WMOs in search results
run([nrows]) – Filter index with search criteria
search_cyc(CYCs[, nrows]) – Search index for cycle numbers
search_lat_lon(BOX[, nrows]) – Search index for a rectangular latitude/longitude domain
search_lat_lon_tim(BOX[, nrows]) – Search index for a rectangular latitude/longitude domain and time range
search_params(PARAMs[, nrows]) – Search index for a list of parameters
search_tim(BOX[, nrows]) – Search index for a time range
search_wmo(WMOs[, nrows]) – Search index for floats defined by their WMO
search_wmo_cyc(WMOs, CYCs[, nrows]) – Search index for floats defined by their WMO and specific cycle numbers
to_dataframe([nrows, index]) – Return index or search results as pandas.DataFrame
to_indexfile(outputfile) – Save search results on file, following the Argo standard index formats
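The BOX-based search methods listed above can be pictured as a simple row filter. The sketch below is not argopy's implementation: it assumes the BOX is ordered as [lon_min, lon_max, lat_min, lat_max, date_min, date_max], that bounds are inclusive, and that index dates use the YYYYMMDDHHMMSS layout. The records and the helper `in_box` are made up for illustration.

```python
from datetime import datetime

# Same BOX as in the class examples; the element order is an assumption:
# [lon_min, lon_max, lat_min, lat_max, date_min, date_max]
BOX = [-60, -55, 40.0, 45.0, "2007-08-01", "2007-09-01"]

# Made-up records with the columns needed for a space/time search
records = [
    {"file": "a.nc", "longitude": -57.5, "latitude": 42.0, "date": "20070815120000"},
    {"file": "b.nc", "longitude": -57.5, "latitude": 42.0, "date": "20071001000000"},
    {"file": "c.nc", "longitude": -30.0, "latitude": 42.0, "date": "20070815120000"},
]


def in_box(rec, box):
    """Return True if the record falls inside the lon/lat/time box (inclusive bounds)."""
    lon_min, lon_max, lat_min, lat_max, t_min, t_max = box
    t0 = datetime.strptime(t_min, "%Y-%m-%d")
    t1 = datetime.strptime(t_max, "%Y-%m-%d")
    t = datetime.strptime(rec["date"], "%Y%m%d%H%M%S")
    return (
        lon_min <= rec["longitude"] <= lon_max
        and lat_min <= rec["latitude"] <= lat_max
        and t0 <= t <= t1
    )


matches = [r for r in records if in_box(r, BOX)]
print([r["file"] for r in matches])
```

A search on latitude/longitude only, or on time only, would use the same BOX and simply ignore the other bounds.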
Attributes
N_FILES – Number of rows in search result or index if search not triggered
N_MATCH – Number of rows in search result
N_RECORDS – Number of rows in the full index
backend – Name of store backend
cname – Return the search constraint(s) as a pretty formatted string
convention – Convention of the index (standard csv file name)
convention_supported – List of supported conventions
convention_title – Long name for the index convention
ext – Storage file extension
search_path – Path to search result uri
search_type – Dictionary with search meta-data
sha_df – Returns a unique SHA for a cname/dataframe
sha_h5 – Returns a unique SHA for a cname/hdf5
sha_pq – Returns a unique SHA for a cname/parquet
shape – Shape of the index array
uri – List of URI from search results
uri_full_index – List of URI from index
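The relation between the row-count attributes (N_FILES, and the N_RECORDS and N_MATCH shortcuts shown in the examples) can be summarised with a toy class. This is a sketch under assumptions, not the argopy class: `TinyIndex` is hypothetical, it stores rows in plain lists, and returning 0 for N_MATCH before any search is a choice made here for simplicity.

```python
class TinyIndex:
    """Toy stand-in for an index store, used only to illustrate the counters."""

    def __init__(self, records):
        self.index = list(records)  # full index
        self.search = None          # search results, None until a search is run

    def search_wmo(self, wmo):
        """Filter the full index on a float WMO and keep the result."""
        self.search = [r for r in self.index if r["wmo"] == wmo]
        return self

    @property
    def N_RECORDS(self):
        # Number of rows in the full index
        return len(self.index)

    @property
    def N_MATCH(self):
        # Number of rows in the search results (0 before any search: an assumption)
        return len(self.search) if self.search is not None else 0

    @property
    def N_FILES(self):
        # Search results if a search was triggered, full index otherwise
        return len(self.search) if self.search is not None else len(self.index)


idx = TinyIndex([{"wmo": 1901393}, {"wmo": 1901393}, {"wmo": 6902746}])
before = (idx.N_RECORDS, idx.N_FILES)
idx.search_wmo(1901393)
after = (idx.N_RECORDS, idx.N_MATCH, idx.N_FILES)
print(before, after)
```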