argopy.stores.argo_index_pd.indexstore_pandas
- class indexstore_pandas(host: str = 'https://data-argo.ifremer.fr', index_file: str = 'ar_index_global_prof.txt', cache: bool = False, cachedir: str = '', timeout: int = 0)
Argo GDAC index store using pandas.DataFrame as internal storage format.
With this store, index and search results are saved as pickle files in the cache.
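The pickle-based caching mentioned above can be pictured with the following minimal sketch. It is not argopy's implementation: the helper names (`cache_path`, `cached_search`), the SHA-256 file naming and the `.pkl` extension are assumptions made for illustration only.

```python
import hashlib
import pickle
import tempfile
from pathlib import Path


def cache_path(cachedir: Path, cname: str) -> Path:
    """Map a search-constraint name to a pickle file (hypothetical naming scheme)."""
    sha = hashlib.sha256(cname.encode("utf-8")).hexdigest()
    return cachedir / f"{sha}.pkl"


def cached_search(cachedir: Path, cname: str, compute):
    """Return cached results for cname; compute and pickle them on a cache miss."""
    path = cache_path(cachedir, cname)
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    results = compute()
    with path.open("wb") as f:
        pickle.dump(results, f)
    return results


calls = []  # records how many times the search really ran


def expensive_search():
    calls.append(1)
    return ["aoml/1901393/profiles/R1901393_001.nc"]


with tempfile.TemporaryDirectory() as tmp:
    first = cached_search(Path(tmp), "WMO1901393", expensive_search)
    second = cached_search(Path(tmp), "WMO1901393", expensive_search)  # served from the pickle file
```

In this sketch the second call never runs `expensive_search`, which is the behaviour one would expect from a store created with `cache=True`.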
Examples
An index store is instantiated with the access path (host) and the index file:
>>> idx = indexstore()
>>> idx = indexstore(host="ftp://ftp.ifremer.fr/ifremer/argo")
>>> idx = indexstore(host="https://data-argo.ifremer.fr", index_file="ar_index_global_prof.txt")
>>> idx = indexstore(host="https://data-argo.ifremer.fr", index_file="ar_index_global_prof.txt", cache=True)
Index methods and properties:
>>> idx.load()
>>> idx.load(nrows=12)  # Only load the first N rows of the index
>>> idx.N_RECORDS  # Shortcut for length of 1st dimension of the index array
>>> idx.index  # Internal storage structure of the full index (:class:`pyarrow.Table` or :class:`pandas.DataFrame`)
>>> idx.shape  # Shape of the full index array
>>> idx.uri_full_index  # List of absolute paths to files from the full index table column 'file'
>>> idx.to_dataframe(index=True)  # Convert index to user-friendly :class:`pandas.DataFrame`
>>> idx.to_dataframe(index=True, nrows=2)  # Only return the first nrows of the index
Search methods and properties:
>>> idx.search_wmo(1901393)
>>> idx.search_cyc(1)
>>> idx.search_wmo_cyc(1901393, [1,12])
>>> idx.search_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Take an index BOX definition
>>> idx.search_lat_lon([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Take an index BOX definition
>>> idx.search_lat_lon_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Take an index BOX definition
>>> idx.N_MATCH  # Shortcut for length of 1st dimension of the search results array
>>> idx.search  # Internal table with search results
>>> idx.uri  # List of absolute paths to files from the search results table column 'file'
>>> idx.run()  # Run the search and save results in cache if necessary
>>> idx.to_dataframe()  # Convert search results to user-friendly :class:`pandas.DataFrame`
>>> idx.to_dataframe(nrows=2)  # Only return the first nrows of the search results
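To make the BOX argument concrete, here is a minimal pure-Python sketch of a combined latitude/longitude/time filter. It assumes the BOX is ordered as `[lon_min, lon_max, lat_min, lat_max, date_min, date_max]` and that bounds are inclusive; both are assumptions, not statements about the store's internals. The rows are synthetic and the function is a stand-in written for this example.

```python
from datetime import datetime

# Synthetic index rows for illustration only: (file, date, latitude, longitude)
rows = [
    ("aoml/1901393/profiles/R1901393_001.nc", datetime(2007, 8, 15), 42.0, -57.5),
    ("aoml/1901393/profiles/R1901393_002.nc", datetime(2007, 10, 1), 42.5, -57.0),
    ("coriolis/6900000/profiles/R6900000_001.nc", datetime(2007, 8, 20), 10.0, -57.0),
]


def search_lat_lon_tim(rows, box):
    """Keep rows inside the box (assumed order: lon, lon, lat, lat, date, date)."""
    lon_min, lon_max, lat_min, lat_max, t_min, t_max = box
    t_min = datetime.fromisoformat(t_min)
    t_max = datetime.fromisoformat(t_max)
    return [
        (file, date, lat, lon)
        for file, date, lat, lon in rows
        if lon_min <= lon <= lon_max
        and lat_min <= lat <= lat_max
        and t_min <= date <= t_max
    ]


matches = search_lat_lon_tim(rows, [-60, -55, 40.0, 45.0, "2007-08-01", "2007-09-01"])
```

Only the first synthetic row satisfies all three constraints: the second falls outside the time range and the third outside the latitude range.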
Misc:
>>> idx.cname
>>> idx.read_wmo
>>> idx.records_per_wmo
- __init__(host: str = 'https://data-argo.ifremer.fr', index_file: str = 'ar_index_global_prof.txt', cache: bool = False, cachedir: str = '', timeout: int = 0)
Create an Argo index file store
- Parameters:
host (str, default: https://data-argo.ifremer.fr) – Host is a local or remote ftp/http path to a dac folder (GDAC structure compliant). This takes values like: ftp://ftp.ifremer.fr/ifremer/argo, ftp://usgodae.org/pub/outgoing/argo or a local absolute path.
index_file (str, default: ar_index_global_prof.txt) – Name of the csv-like text file with the index.
cache (bool, default: False) – Use cache or not.
cachedir (str, default: OPTIONS['cachedir']) – Folder where to store cached files.
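As an illustration of what "csv-like text file" means for `index_file`, the sketch below parses a small synthetic excerpt with the standard library. It assumes the usual profile index layout (leading `#` comment lines, then a header row starting with `file,date,latitude,longitude`); `load_index` is a hypothetical helper, not the store's `load` method, and its `nrows` argument only mimics the idea of loading the first rows.

```python
import csv
import io
from itertools import islice

# Synthetic excerpt for illustration; real index files start with several '#' comment lines.
SAMPLE = """\
# synthetic excerpt, not real data
file,date,latitude,longitude,ocean,profiler_type,institution,date_update
aoml/1901393/profiles/R1901393_001.nc,20070801120000,42.000,-57.500,A,846,AO,20080101000000
aoml/1901393/profiles/R1901393_002.nc,20070811120000,42.300,-57.100,A,846,AO,20080101000000
aoml/1901393/profiles/R1901393_003.nc,20070821120000,42.600,-56.800,A,846,AO,20080101000000
"""


def load_index(text_stream, nrows=None):
    """Skip '#' comment lines, then read up to nrows records as dictionaries."""
    data_lines = (line for line in text_stream if not line.startswith("#"))
    reader = csv.DictReader(data_lines)
    return list(islice(reader, nrows))  # nrows=None reads everything


records = load_index(io.StringIO(SAMPLE), nrows=2)
all_records = load_index(io.StringIO(SAMPLE))
```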
Methods
__init__([host, index_file, cache, ...]) – Create an Argo index file store
cachepath(path) – Return path to a cached file
clear_cache() – Clear cache registry and files associated with this store instance.
load([nrows, force]) – Load an Argo-index file content
read_wmo([index]) – Return list of unique WMOs in search results
records_per_wmo([index]) – Return the number of records per unique WMOs in search results
run([nrows]) – Filter index with search criteria
search_cyc(CYCs[, nrows]) – Search index for cycle numbers
search_lat_lon(BOX[, nrows]) – Search index for a rectangular latitude/longitude domain
search_lat_lon_tim(BOX[, nrows]) – Search index for a rectangular latitude/longitude domain and time range
search_tim(BOX[, nrows]) – Search index for a time range
search_wmo(WMOs[, nrows]) – Search index for floats defined by their WMO
search_wmo_cyc(WMOs, CYCs[, nrows]) – Search index for floats defined by their WMO and specific cycle numbers
to_dataframe([nrows, index]) – Return index or search results as pandas.DataFrame
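What `read_wmo` and `records_per_wmo` report can be approximated from the 'file' column alone. The sketch below assumes each entry follows the `<dac>/<wmo>/profiles/<name>.nc` pattern seen in GDAC index files; the functions are simplified stand-ins written for this example and the paths are synthetic.

```python
from collections import Counter

# Synthetic 'file' column entries for illustration only
files = [
    "aoml/1901393/profiles/R1901393_001.nc",
    "aoml/1901393/profiles/R1901393_002.nc",
    "coriolis/6902746/profiles/R6902746_001.nc",
]


def wmo_of(path: str) -> int:
    """Extract the float WMO number, assumed to be the second path component."""
    return int(path.split("/")[1])


def read_wmo(files):
    """Sorted list of unique WMOs."""
    return sorted({wmo_of(f) for f in files})


def records_per_wmo(files):
    """Number of records for each WMO."""
    return dict(Counter(wmo_of(f) for f in files))


unique_wmos = read_wmo(files)
counts = records_per_wmo(files)
```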
Attributes
N_FILES – Number of rows in search result or index if search not triggered
N_MATCH – Number of rows in search result
N_RECORDS – Number of rows in the full index
backend – Name of store backend
cname – Return the search constraint(s) as a pretty formatted string
ext – Storage file extension
search_path – Path to search result uri
search_type – Dictionary with search meta-data
sha_df – Returns a unique SHA for a cname/dataframe
sha_h5 – Returns a unique SHA for a cname/hdf5
sha_pq – Returns a unique SHA for a cname/parquet
shape – Shape of the index array
uri – List of URI from search results
uri_full_index – List of URI from index
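The `uri` and `uri_full_index` attributes turn relative 'file' entries into absolute paths. A minimal sketch of that idea follows, assuming the GDAC layout places profile files under a `dac` folder directly below the host; `to_uri` is a hypothetical helper and the exact joining rule used by the store may differ.

```python
def to_uri(host: str, file_entry: str) -> str:
    """Join host, the assumed 'dac' folder and a relative index entry into one path."""
    return "/".join([host.rstrip("/"), "dac", file_entry.lstrip("/")])


host = "https://data-argo.ifremer.fr"
files = [
    "aoml/1901393/profiles/R1901393_001.nc",
    "aoml/1901393/profiles/R1901393_002.nc",
]
uris = [to_uri(host, f) for f in files]
```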