argopy.stores.ArgoIndex
- class ArgoIndex(host: str = 'https://data-argo.ifremer.fr', index_file: str = 'ar_index_global_prof.txt', convention: str | None = None, cache: bool = False, cachedir: str = '', timeout: int = 0)
Argo GDAC index store
If Pyarrow is available, this class will use pyarrow.Table as internal storage format; otherwise, a pandas.DataFrame will be used.
You can use the exact index file names or keywords:
core for the ar_index_global_prof.txt index file,
bgc-b for the argo_bio-profile_index.txt index file,
bgc-s for the argo_synthetic-profile_index.txt index file.
Examples
An index store is instantiated with a host (any access path, local, http or ftp) and an index file:
>>> idx = ArgoIndex()
>>> idx = ArgoIndex(host="https://data-argo.ifremer.fr")  # Default host
>>> idx = ArgoIndex(host="ftp://ftp.ifremer.fr/ifremer/argo", index_file="ar_index_global_prof.txt")  # Default index
>>> idx = ArgoIndex(index_file="bgc-s")  # Use keywords instead of exact file names
>>> idx = ArgoIndex(host="https://data-argo.ifremer.fr", index_file="bgc-b", cache=True)  # Use cache for performance
>>> idx = ArgoIndex(host=".", index_file="dummy_index.txt", convention="core")  # Load your own index
Full index methods and properties:
>>> idx.load()
>>> idx.load(nrows=12)  # Only load the first N rows of the index
>>> idx.to_dataframe(index=True)  # Convert index to user-friendly pandas.DataFrame
>>> idx.to_dataframe(index=True, nrows=2)  # Only return the first nrows of the index
>>> idx.N_RECORDS  # Shortcut for length of 1st dimension of the index array
>>> idx.index  # Internal storage structure of the full index (pyarrow.Table or pandas.DataFrame)
>>> idx.shape  # Shape of the full index array
>>> idx.uri_full_index  # List of absolute paths to files from the full index table column 'file'
Search methods:
>>> idx.search_wmo(1901393)
>>> idx.search_cyc(1)
>>> idx.search_wmo_cyc(1901393, [1,12])
>>> idx.search_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Take an index BOX definition
>>> idx.search_lat_lon([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Take an index BOX definition
>>> idx.search_lat_lon_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Take an index BOX definition
>>> idx.search_params(['C1PHASE_DOXY', 'DOWNWELLING_PAR'])  # Take a list of strings, only for BGC index !
>>> idx.search_parameter_data_mode({'BBP700': 'D', 'DOXY': ['A', 'D']})  # Take a dict.
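The BOX argument in the calls above is a flat list of six values. The sketch below illustrates how such a list can be unpacked and checked, assuming the layout `[lon_min, lon_max, lat_min, lat_max, date_min, date_max]` (consistent with the example values) and assuming inclusive bounds. `IndexBox` is a hypothetical helper written for this illustration; it is not part of argopy.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class IndexBox:
    """Hypothetical container for an index BOX definition (not an argopy class)."""
    lon_min: float
    lon_max: float
    lat_min: float
    lat_max: float
    date_min: datetime
    date_max: datetime

    @classmethod
    def from_list(cls, box):
        # Assumed layout: [lon_min, lon_max, lat_min, lat_max, date_min, date_max]
        if len(box) != 6:
            raise ValueError("An index BOX must have exactly 6 elements")
        lon_min, lon_max, lat_min, lat_max = (float(v) for v in box[:4])
        date_min, date_max = (datetime.strptime(v, "%Y-%m-%d") for v in box[4:])
        if lon_min > lon_max or lat_min > lat_max or date_min > date_max:
            raise ValueError("Each lower bound must not exceed its upper bound")
        return cls(lon_min, lon_max, lat_min, lat_max, date_min, date_max)

    def contains(self, lon, lat, when):
        # Inclusive bounds are an assumption made for this sketch.
        return (
            self.lon_min <= lon <= self.lon_max
            and self.lat_min <= lat <= self.lat_max
            and self.date_min <= when <= self.date_max
        )


box = IndexBox.from_list([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])
inside = box.contains(-57.5, 42.0, datetime(2007, 8, 15))
```

Validating the list once, before any search is run, makes ordering mistakes (for example swapping latitude and longitude bounds) fail early instead of silently returning no match.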
Search result properties and methods:
>>> idx.N_MATCH  # Shortcut for length of 1st dimension of the search results array
>>> idx.search  # Internal table with search results
>>> idx.uri  # List of absolute paths to files from the search results table column 'file'
>>> idx.run()  # Run the search and save results in cache if necessary
>>> idx.to_dataframe()  # Convert search results to user-friendly pandas.DataFrame
>>> idx.to_dataframe(nrows=2)  # Only return the first nrows of the search results
>>> idx.to_indexfile("search_index.txt")  # Export search results to Argo standard index file
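The calls listed above can be combined into a small end-to-end workflow. The following is a minimal sketch that uses only methods and attributes shown in this page; it assumes argopy is installed and that the default host is reachable, and loading the full index may take a while. The function name `export_float_profiles` is hypothetical.

```python
def export_float_profiles(wmo, outputfile="search_index.txt"):
    """Search the core index for one float and export the matching records."""
    # Imported inside the function so the sketch can be loaded without argopy installed.
    from argopy.stores import ArgoIndex

    idx = ArgoIndex(index_file="core", cache=True)  # default host, with cache
    idx.load()           # load the full index
    idx.search_wmo(wmo)  # filter the index on the float WMO number
    if idx.N_MATCH == 0:
        return None      # nothing to export
    idx.to_indexfile(outputfile)  # write matches as an Argo standard index file
    return idx.to_dataframe()     # matches as a pandas.DataFrame


# Example call (requires network access):
# df = export_float_profiles(1901393)
```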
Misc:
>>> idx.convention  # What is the expected index format (core vs BGC profile index)
>>> idx.cname
>>> idx.read_wmo
>>> idx.read_params
>>> idx.records_per_wmo
- __init__(host: str = 'https://data-argo.ifremer.fr', index_file: str = 'ar_index_global_prof.txt', convention: str | None = None, cache: bool = False, cachedir: str = '', timeout: int = 0) -> object
Create an Argo index file store
- Parameters:
host (str, default: https://data-argo.ifremer.fr) – Local or remote (ftp or http) path to a dac folder (GDAC structure compliant). This takes values like: ftp://ftp.ifremer.fr/ifremer/argo, ftp://usgodae.org/pub/outgoing/argo or a local absolute path.
index_file (str, default: ar_index_global_prof.txt) – Name of the csv-like text file with the index. Possible values are the standard file names: ar_index_global_prof.txt, argo_bio-profile_index.txt or argo_synthetic-profile_index.txt. You can also use the following shortcuts: core, bgc-b, bgc-s, respectively.
convention (str, default: None) – Set the expected format convention of the index file. This is useful when loading an index file with a custom name. If set to None, the convention is inferred from the index_file value. Possible values: ar_index_global_prof, argo_bio-profile_index, or argo_synthetic-profile_index. You can also use the keywords: core, bgc-s, bgc-b.
cache (bool, default: False) – Use cache or not.
cachedir (str, default: OPTIONS['cachedir']) – Folder where to store cached files
timeout (int, default: OPTIONS['api_timeout']) – Time out in seconds to connect to a remote host (ftp or http).
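The interplay between `index_file` and `convention` described above can be sketched as follows. This is an illustration of the documented behaviour under stated assumptions, not argopy's actual implementation: `resolve_index`, `KEYWORD_TO_FILE` and `SUPPORTED_CONVENTIONS` are hypothetical names, and raising `ValueError` for a custom file name without a convention is an assumption.

```python
from typing import Optional, Tuple

# Keyword shortcuts and standard file names, as listed in the parameter description.
KEYWORD_TO_FILE = {
    "core": "ar_index_global_prof.txt",
    "bgc-b": "argo_bio-profile_index.txt",
    "bgc-s": "argo_synthetic-profile_index.txt",
}


def _stem(name: str) -> str:
    """Drop a trailing '.txt' extension, if any."""
    return name[: -len(".txt")] if name.endswith(".txt") else name


SUPPORTED_CONVENTIONS = {_stem(name) for name in KEYWORD_TO_FILE.values()}


def resolve_index(index_file: str, convention: Optional[str] = None) -> Tuple[str, str]:
    """Hypothetical helper returning (file name, convention)."""
    # Expand a keyword to its standard file name; other names are kept unchanged.
    file_name = KEYWORD_TO_FILE.get(index_file, index_file)

    if convention is None:
        # No convention given: infer it from the file name.
        convention = _stem(file_name)
    else:
        # A convention may itself be given as a keyword.
        convention = _stem(KEYWORD_TO_FILE.get(convention, convention))

    if convention not in SUPPORTED_CONVENTIONS:
        raise ValueError(f"Unsupported or missing convention for {index_file!r}")
    return file_name, convention


custom = resolve_index("dummy_index.txt", convention="core")
```

A custom file name such as `dummy_index.txt` carries no information about its format, which is why the convention has to be passed explicitly in that case.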
Methods
__init__([host, index_file, convention, ...]) – Create an Argo index file store
cachepath(path) – Return path to a cached file
clear_cache() – Clear cache registry and files associated with this store instance.
load([nrows, force]) – Load an Argo-index file content
read_params([index]) – Return list of unique PARAMETERs in index or search results
read_wmo([index]) – Return list of unique WMOs in search results
records_per_wmo([index]) – Return the number of records per unique WMOs in search results
run([nrows]) – Filter index with search criteria
search_cyc(CYCs[, nrows]) – Search index for cycle numbers
search_lat_lon(BOX[, nrows]) – Search index for a rectangular latitude/longitude domain
search_lat_lon_tim(BOX[, nrows]) – Search index for a rectangular latitude/longitude domain and time range
search_parameter_data_mode(PARAMs[, ...]) – Search index for profiles with a parameter in a specific data mode
search_params(PARAMs[, logical, nrows]) – Search index for one or a list of parameters
search_tim(BOX[, nrows]) – Search index for a time range
search_wmo(WMOs[, nrows]) – Search index for floats defined by their WMO
search_wmo_cyc(WMOs, CYCs[, nrows]) – Search index for floats defined by their WMO and specific cycle numbers
to_dataframe([nrows, index, completed]) – Return index or search results as pandas.DataFrame
to_indexfile(outputfile) – Save search results on file, following the Argo standard index formats
Attributes
N_FILES – Number of rows in search result or index if search not triggered
N_MATCH – Number of rows in search result
N_RECORDS – Number of rows in the full index
backend – Name of store backend
cname – Return the search constraint(s) as a pretty formatted string
convention – Convention of the index (standard csv file name)
convention_supported – List of supported conventions
convention_title – Long name for the index convention
ext – Storage file extension
search_path – Path to search result uri
search_type – Dictionary with search meta-data
sha_df – Returns a unique SHA for a cname/dataframe
sha_h5 – Returns a unique SHA for a cname/hdf5
sha_pq – Returns a unique SHA for a cname/parquet
shape – Shape of the index array
uri – List of URI from search results
uri_full_index – List of URI from index
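To make the relation between the csv-like index, the `file` column and the URI lists more concrete, here is a minimal standard-library sketch. It assumes the core index layout (comment lines starting with `#`, then a header row beginning with `file`) and assumes that absolute paths are built as `<host>/dac/<file>`. The sample rows are synthetic and for illustration only, and `read_index`, `to_uris` and `count_records_per_wmo` are hypothetical helpers, not argopy functions.

```python
import csv
import io
from collections import Counter

# Synthetic sample for illustration only; these rows are not real index content.
SAMPLE_INDEX = """\
# Title : Profile directory file (synthetic sample)
file,date,latitude,longitude,ocean,profiler_type,institution,date_update
aoml/1901393/profiles/R1901393_001.nc,20070801000000,41.0,-58.0,A,846,AO,20070802000000
aoml/1901393/profiles/R1901393_002.nc,20070811000000,41.5,-57.5,A,846,AO,20070812000000
coriolis/6900001/profiles/R6900001_001.nc,20070805000000,43.0,-56.0,A,846,IF,20070806000000
"""


def read_index(text):
    """Parse csv-like index text into a list of dicts, skipping '#' comment lines."""
    lines = (line for line in io.StringIO(text) if not line.startswith("#"))
    return list(csv.DictReader(lines))


def to_uris(rows, host):
    """Build absolute paths from the 'file' column (assumed layout: <host>/dac/<file>)."""
    return [f"{host.rstrip('/')}/dac/{row['file']}" for row in rows]


def count_records_per_wmo(rows):
    """Count rows per float; the WMO is the second component of the 'file' path."""
    return dict(Counter(int(row["file"].split("/")[1]) for row in rows))


rows = read_index(SAMPLE_INDEX)
uris = to_uris(rows, "https://data-argo.ifremer.fr")
counts = count_records_per_wmo(rows)
```

In the class itself the same information is exposed through `uri`, `uri_full_index` and `records_per_wmo`, backed by a pyarrow.Table or pandas.DataFrame rather than plain dicts.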