argopy.stores.httpstore_erddap_auth.open_mfjson

httpstore_erddap_auth.open_mfjson(urls, max_workers: int = 6, method: str = 'thread', progress: bool | str = False, preprocess=None, preprocess_opts={}, open_json_opts={}, url_follow=False, errors: str = 'ignore', *args, **kwargs)
Download and process a collection of JSON documents from urls.

This is a version of the httpstore.open_json method that is able to handle a list of urls, sequentially or in parallel. This method uses a concurrent.futures.ThreadPoolExecutor by default for parallelization. See the method parameter below for more options.
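For instance, a minimal usage sketch (the store construction and URLs below are hypothetical; an authenticated ERDDAP store may require credentials or other options):

>>> from argopy.stores import httpstore_erddap_auth
>>> fs = httpstore_erddap_auth()  # hypothetical: options depend on the ERDDAP server setup
>>> urls = ["https://erddap.example.org/doc1.json",  # hypothetical URLs
...         "https://erddap.example.org/doc2.json"]
>>> results = fs.open_mfjson(urls, progress=True)  # one processed document per url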
Parameters:

max_workers (int) – Maximum number of threads or processes.
method (str, default: 'thread') – Define the parallelization method (see the usage sketch after the parameter list):

- 'thread' (default): based on concurrent.futures.ThreadPoolExecutor with a pool of at most max_workers threads
- 'process': based on concurrent.futures.ProcessPoolExecutor with a pool of at most max_workers processes
- distributed.client.Client: use a Dask client
- 'sequential' / 'seq': open data sequentially in a simple loop, no parallelization applied
progress (bool, default: False) – Display a progress bar if possible.
preprocess (collections.abc.Callable, optional) – If provided, call this function on each dataset prior to concatenation.

preprocess_opts (dict, optional) – Options passed to the preprocess callable, if any.

url_follow (bool, default: False) – Pass the URL to the preprocess method as the url argument.
errors (str, default: 'ignore') – Define how to handle errors raised during data URIs fetching:

- 'ignore' (default): Do not stop processing, simply issue a debug message in the logging console
- 'raise': Raise any error encountered
- 'silent': Do not stop processing and do not issue any log message
Return type:

list
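As a sketch of the different parallelization and error-handling options (reusing the hypothetical fs store and urls list from above; passing a Dask Client instance as method is an assumption based on the option list):

>>> results = fs.open_mfjson(urls, method='process', max_workers=4, errors='silent')
>>> results = fs.open_mfjson(urls, method='sequential', errors='raise')
>>> from distributed import Client
>>> results = fs.open_mfjson(urls, method=Client())  # assumption: the Client instance itself is passed as method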
Notes
For the distributed.client.Client and concurrent.futures.ProcessPoolExecutor methods to work appropriately, the pre-processing collections.abc.Callable must be serializable. This can be checked with:

>>> from distributed.protocol import serialize
>>> from distributed.protocol.serialize import ToPickle
>>> serialize(ToPickle(preprocess_function))
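Concretely, a preprocess callable intended for the process or Dask methods is best defined at module top level so that it serializes cleanly. The function below is a hypothetical example (the exact structure handed to preprocess depends on the ERDDAP response):

>>> def keep_rows(js):  # hypothetical preprocess, assuming js is the decoded JSON document
...     return js.get('table', {}).get('rows', [])
>>> serialize(ToPickle(keep_rows))  # raises if keep_rows cannot be serialized
>>> results = fs.open_mfjson(urls, preprocess=keep_rows, method='process')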