argopy.ArgoIndex
- class ArgoIndex(host: str = 'https://data-argo.ifremer.fr', index_file: str = 'ar_index_global_prof.txt', convention: str | None = None, cache: bool = False, cachedir: str = '', timeout: int = 0)
Argo GDAC index store
If Pyarrow is available, this class will use pyarrow.Table as internal storage format; otherwise, a pandas.DataFrame will be used.
You can use the exact index file names or keywords:
- core for the ar_index_global_prof.txt index file,
- bgc-b for the argo_bio-profile_index.txt index file,
- bgc-s for the argo_synthetic-profile_index.txt index file.
Examples
An index store is instantiated with a host (any access path, local, http or ftp) and an index file:
>>> idx = ArgoIndex()
>>> idx = ArgoIndex(host="https://data-argo.ifremer.fr")  # Default host
>>> idx = ArgoIndex(host="ftp://ftp.ifremer.fr/ifremer/argo", index_file="ar_index_global_prof.txt")  # Default index
>>> idx = ArgoIndex(index_file="bgc-s")  # Use keywords instead of exact file names
>>> idx = ArgoIndex(host="https://data-argo.ifremer.fr", index_file="bgc-b", cache=True)  # Use cache for performance
>>> idx = ArgoIndex(host=".", index_file="dummy_index.txt", convention="core")  # Load your own index
Full index methods and properties:
>>> idx.load()
>>> idx.load(nrows=12)  # Only load the first N rows of the index
>>> idx.to_dataframe(index=True)  # Convert index to user-friendly :class:`pandas.DataFrame`
>>> idx.to_dataframe(index=True, nrows=2)  # Only returns the first nrows of the index
>>> idx.N_RECORDS  # Shortcut for length of 1st dimension of the index array
>>> idx.index  # internal storage structure of the full index (:class:`pyarrow.Table` or :class:`pandas.DataFrame`)
>>> idx.shape  # shape of the full index array
>>> idx.uri_full_index  # List of absolute path to files from the full index table column 'file'
Search methods:
>>> idx.search_wmo(1901393)
>>> idx.search_cyc(1)
>>> idx.search_wmo_cyc(1901393, [1,12])
>>> idx.search_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Take an index BOX definition
>>> idx.search_lat_lon([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Take an index BOX definition
>>> idx.search_lat_lon_tim([-60, -55, 40., 45., '2007-08-01', '2007-09-01'])  # Take an index BOX definition
>>> idx.search_params(['C1PHASE_DOXY', 'DOWNWELLING_PAR'])  # Take a list of strings, only for BGC index !
>>> idx.search_parameter_data_mode({'BBP700': 'D', 'DOXY': ['A', 'D']})  # Take a dict.
Search result properties and methods:
>>> idx.N_MATCH  # Shortcut for length of 1st dimension of the search results array
>>> idx.search  # Internal table with search results
>>> idx.uri  # List of absolute path to files from the search results table column 'file'

>>> idx.run()  # Run the search and save results in cache if necessary
>>> idx.to_dataframe()  # Convert search results to user-friendly :class:`pandas.DataFrame`
>>> idx.to_dataframe(nrows=2)  # Only returns the first nrows of the search results
>>> idx.to_indexfile("search_index.txt")  # Export search results to Argo standard index file
Misc:
>>> idx.convention  # What is the expected index format (core vs BGC profile index)
>>> idx.cname
>>> idx.read_wmo
>>> idx.read_params
>>> idx.records_per_wmo
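The BOX argument used in the search examples above is a flat six-element list. The sketch below illustrates what a latitude/longitude/time filter on such a box amounts to, using only the standard library. It is not argopy's implementation: it assumes the element order is [lon_min, lon_max, lat_min, lat_max, date_min, date_max], that all bounds are inclusive, and the rows, file names and the `filter_box` helper are hypothetical.

```python
from datetime import datetime

# Hypothetical index rows (a real index file has more columns).
ROWS = [
    {"file": "dac_a/float1/profiles/R1_001.nc", "longitude": -57.0, "latitude": 42.0,
     "date": datetime(2007, 8, 15)},
    {"file": "dac_a/float1/profiles/R1_002.nc", "longitude": -57.0, "latitude": 42.0,
     "date": datetime(2007, 10, 1)},   # outside the time range
    {"file": "dac_b/float2/profiles/R2_001.nc", "longitude": -30.0, "latitude": 42.0,
     "date": datetime(2007, 8, 15)},   # outside the longitude range
]


def filter_box(rows, box):
    """Keep rows inside the box (hypothetical helper, inclusive bounds assumed)."""
    # Assumed order: lon_min, lon_max, lat_min, lat_max, date_min, date_max
    lon_min, lon_max, lat_min, lat_max, date_min, date_max = box
    t_min = datetime.fromisoformat(date_min)
    t_max = datetime.fromisoformat(date_max)
    return [
        row for row in rows
        if lon_min <= row["longitude"] <= lon_max
        and lat_min <= row["latitude"] <= lat_max
        and t_min <= row["date"] <= t_max
    ]


BOX = [-60, -55, 40.0, 45.0, "2007-08-01", "2007-09-01"]
matches = filter_box(ROWS, BOX)
```

With the real class, the equivalent call would be `idx.search_lat_lon_tim(BOX)`, with the matching file paths then read from `idx.uri`.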
- __init__(host: str = 'https://data-argo.ifremer.fr', index_file: str = 'ar_index_global_prof.txt', convention: str | None = None, cache: bool = False, cachedir: str = '', timeout: int = 0) -> object
Create an Argo index file store
- Parameters:
host (str, default: https://data-argo.ifremer.fr) – Local or remote (ftp or http) path to a dac folder (GDAC structure compliant). This takes values like: ftp://ftp.ifremer.fr/ifremer/argo, ftp://usgodae.org/pub/outgoing/argo or a local absolute path.
index_file (str, default: ar_index_global_prof.txt) – Name of the csv-like text file with the index. Possible values are the standard file names: ar_index_global_prof.txt, argo_bio-profile_index.txt or argo_synthetic-profile_index.txt. You can also use the following shortcuts: core, bgc-b, bgc-s, respectively.
convention (str, default: None) – Set the expected format convention of the index file. This is useful when loading an index file with a custom name. If set to None, we'll try to infer the convention from the index_file value. Possible values: ar_index_global_prof, argo_bio-profile_index, or argo_synthetic-profile_index. You can also use the keywords: core, bgc-s, bgc-b.
cache (bool, default: False) – Use cache or not.
cachedir (str, default: OPTIONS['cachedir']) – Folder where to store cached files
timeout (int, default: OPTIONS['api_timeout']) – Time out in seconds to connect to a remote host (ftp or http).
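The interplay between index_file and convention described above can be sketched as follows. This is a minimal illustration under stated assumptions, not argopy's code: the `SHORTCUTS` table and the `resolve_index_file` and `infer_convention` helpers are hypothetical names, and raising `ValueError` when nothing can be inferred is an assumed behavior.

```python
# Shortcut keywords and the standard index file names they stand for,
# as listed in the parameter descriptions.
SHORTCUTS = {
    "core": "ar_index_global_prof.txt",
    "bgc-b": "argo_bio-profile_index.txt",
    "bgc-s": "argo_synthetic-profile_index.txt",
}
# Convention names are the standard file names without the ".txt" extension.
SUPPORTED = {name[:-len(".txt")] for name in SHORTCUTS.values()}


def resolve_index_file(index_file):
    """Expand a shortcut keyword; leave any other file name untouched."""
    return SHORTCUTS.get(index_file, index_file)


def infer_convention(index_file, convention=None):
    """Return the convention name (hypothetical helper).

    An explicit convention wins; otherwise it is inferred from index_file.
    """
    source = convention if convention is not None else index_file
    name = resolve_index_file(source)
    if name.endswith(".txt"):
        name = name[:-len(".txt")]
    if name not in SUPPORTED:
        # Assumed behavior: a custom file name needs an explicit convention.
        raise ValueError(f"Cannot infer index convention from {source!r}")
    return name
```

This mirrors the `ArgoIndex(host=".", index_file="dummy_index.txt", convention="core")` example: the custom file name carries no information about its format, so the convention has to be given explicitly.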
Methods
__init__([host, index_file, convention, ...]) – Create an Argo index file store
cachepath(path) – Return path to a cached file
clear_cache() – Clear cache registry and files associated with this store instance.
load([nrows, force]) – Load an Argo-index file content
read_params([index]) – Return list of unique PARAMETERs in index or search results
read_wmo([index]) – Return list of unique WMOs in search results
records_per_wmo([index]) – Return the number of records per unique WMOs in search results
run([nrows]) – Filter index with search criteria
search_cyc(CYCs[, nrows]) – Search index for cycle numbers
search_lat_lon(BOX[, nrows]) – Search index for a rectangular latitude/longitude domain
search_lat_lon_tim(BOX[, nrows]) – Search index for a rectangular latitude/longitude domain and time range
search_parameter_data_mode(PARAMs[, ...]) – Search index for profiles with a parameter in a specific data mode
search_params(PARAMs[, logical, nrows]) – Search index for one or a list of parameters
search_tim(BOX[, nrows]) – Search index for a time range
search_wmo(WMOs[, nrows]) – Search index for floats defined by their WMO
search_wmo_cyc(WMOs, CYCs[, nrows]) – Search index for floats defined by their WMO and specific cycle numbers
to_dataframe([nrows, index, completed]) – Return index or search results as pandas.DataFrame
to_indexfile(outputfile) – Save search results on file, following the Argo standard index formats
Attributes
N_FILES – Number of rows in search result or index if search not triggered
N_MATCH – Number of rows in search result
N_RECORDS – Number of rows in the full index
backend – Name of store backend
cname – Return the search constraint(s) as a pretty formatted string
convention – Convention of the index (standard csv file name)
List of supported conventions
convention_title – Long name for the index convention
ext – Storage file extension
search_path – Path to search result uri
search_type – Dictionary with search meta-data
sha_df – Returns a unique SHA for a cname/dataframe
sha_h5 – Returns a unique SHA for a cname/hdf5
sha_pq – Returns a unique SHA for a cname/parquet
shape – Shape of the index array
uri – List of URI from search results
uri_full_index – List of URI from index