GDAC snapshot DOI

The Argo Data Management Team (ADMT) maintains an exemplary DOI system to support scientific reproducibility and FAIRness. Every month, the ADMT zips the entire Argo GDAC, archives it and assigns it a specific DOI.

The major DOI, 10.17882/42182, identifies the dataset as a whole. Each monthly snapshot has a minor DOI, formed by appending a hashtag to the major DOI (for example 10.17882/42182#70590).
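The major/minor naming scheme can be illustrated with a few lines of plain Python. This is only a sketch of the string pattern visible in the records on this page; the helper names `snapshot_doi` and `doi_link` are hypothetical and are not part of argopy.

```python
MAJOR_DOI = "10.17882/42182"  # major Argo GDAC DOI quoted on this page


def snapshot_doi(hashtag=None):
    """Return the major DOI, or a minor (snapshot) DOI when a hashtag is given."""
    # Hypothetical helper: a minor DOI is the major DOI plus "#<hashtag>".
    return MAJOR_DOI if hashtag is None else f"{MAJOR_DOI}#{hashtag}"


def doi_link(hashtag=None):
    """Return the dx.doi.org resolver link for the major or a minor DOI."""
    return f"https://dx.doi.org/{snapshot_doi(hashtag)}"


print(snapshot_doi(70590))  # 10.17882/42182#70590
print(doi_link(70590))      # https://dx.doi.org/10.17882/42182#70590
```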

argopy provides the ArgoDOI class to help you access, search and retrieve a DOI for Argo.

DOI discovery

If you don’t know where to start, just load the major Argo DOI record. It points to the latest snapshots and lists the most recent associated files:

In [1]: from argopy import ArgoDOI

In [2]: doi = ArgoDOI()

In [3]: doi
Out[3]: 
<argopy.DOI>
DOI: 10.17882/42182
Title: Argo float data and metadata from Global Data Assembly Centre (Argo GDAC)
Date: 2025-03-13
Network: core+BGC+deep
Link: https://dx.doi.org/10.17882/42182
File: 171 files in total
Files for core+BGC+deep:
     - #118037 Global GDAC Argo data files (2025-03-13 snapshot) https://www.seanoe.org/data/00311/42182/data/118037.tar.gz (68.4GiB, openAccess=True)
     - #117069 Global GDAC Argo data files (2025-02-09 snapshot) https://www.seanoe.org/data/00311/42182/data/117069.tar.gz (67.3GiB, openAccess=True)
     - #116315 Global GDAC Argo data files (2025-01-09 snapshot) https://www.seanoe.org/data/00311/42182/data/116315.tar.gz (66.5GiB, openAccess=True)
     - #115668 Global GDAC Argo data files (2024-12-09 snapshot) https://www.seanoe.org/data/00311/42182/data/115668.tar.gz (65.7GiB, openAccess=True)
     - #114627 Global GDAC Argo data files (2024-11-09 snapshot) https://www.seanoe.org/data/00311/42182/data/114627.tar.gz (65.0GiB, openAccess=True)
     - #113868 Global GDAC Argo data files (2024-10-09 snapshot) https://www.seanoe.org/data/00311/42182/data/113868.tar.gz (64.4GiB, openAccess=True)
     - #112844 Global GDAC Argo data files (2024-09-09 snapshot) https://www.seanoe.org/data/00311/42182/data/112844.tar.gz (63.7GiB, openAccess=True)
     - #112456 Global GDAC Argo data files (2024-08-09 snapshot) https://www.seanoe.org/data/00311/42182/data/112456.tar.gz (62.9GiB, openAccess=True)
     - #112389 Global GDAC Argo data files (2024-07-09 snapshot) https://www.seanoe.org/data/00311/42182/data/112389.tar.gz (62.3GiB, openAccess=True)
     - #110912 Global GDAC Argo data files (2024-06-09 snapshot) https://www.seanoe.org/data/00311/42182/data/110912.tar.gz (61.6GiB, openAccess=True)
Files for BGC only:
     - #118036 BGC Sprof data files (2025-03-13 snapshot) https://www.seanoe.org/data/00311/42182/data/118036.tar.gz (4.5GiB, openAccess=True)
     - #117068 BGC Sprof data files (2025-02-09 snapshot) https://www.seanoe.org/data/00311/42182/data/117068.tar.gz (4.4GiB, openAccess=True)
     - #116312 BGC Sprof data files (2025-01-09 snapshot) https://www.seanoe.org/data/00311/42182/data/116312.tar.gz (4.4GiB, openAccess=True)
     - #115667 BGC Sprof data files (2024-12-09 snapshot) https://www.seanoe.org/data/00311/42182/data/115667.tar.gz (4.3GiB, openAccess=True)
     - #114622 BGC Sprof data files (2024-11-09 snapshot) https://www.seanoe.org/data/00311/42182/data/114622.tar.gz (4.3GiB, openAccess=True)
     - #113867 BGC Sprof data files (2024-10-09 snapshot) https://www.seanoe.org/data/00311/42182/data/113867.tar.gz (4.2GiB, openAccess=True)
     - #112895 BGC Sprof data files (2024-09-09 snapshot) https://www.seanoe.org/data/00311/42182/data/112895.tar.gz (4.2GiB, openAccess=True)
     - #112455 BGC Sprof data files (2024-08-09 snapshot) https://www.seanoe.org/data/00311/42182/data/112455.tar.gz (4.1GiB, openAccess=True)
     - #110911 BGC Sprof data files (2024-06-09 snapshot) https://www.seanoe.org/data/00311/42182/data/110911.tar.gz (4.0GiB, openAccess=True)
     - #110195 BGC Sprof data files (2024-05-09 snapshot) https://www.seanoe.org/data/00311/42182/data/110195.tar.gz (4.0GiB, openAccess=True)

A typical use case is for users to access the data on a specific date and then conduct their analysis. When writing a report or research publication later on, it is not trivial to find the most appropriate DOI for the dataset analysed. In this case, ArgoDOI comes in handy with its search method, which returns the Argo DOI closest to a given date:

In [4]: doi.search('2020-02')
Out[4]: 
<argopy.DOI.record>
DOI: 10.17882/42182#70590
Title: Argo float data and metadata from Global Data Assembly Centre (Argo GDAC) - Snapshot of Argo GDAC of February 10st 2020
Date: 2020-02-10
Network: core+BGC+deep
Link: https://dx.doi.org/10.17882/42182#70590
File: https://www.seanoe.org/data/00311/42182/data/70590.tar.gz (34.4GiB, openAccess=True)
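The idea of picking the snapshot closest to a date can be sketched with the standard library alone. This is an illustration of nearest-date selection under simple assumptions, not argopy's actual implementation. The hashtags and dates are copied from the core+BGC+deep listing above, and `closest_snapshot` is a hypothetical helper name.

```python
from datetime import date

# Snapshot hashtags and dates copied from the core+BGC+deep listing above.
SNAPSHOTS = {
    "118037": date(2025, 3, 13),
    "117069": date(2025, 2, 9),
    "116315": date(2025, 1, 9),
    "115668": date(2024, 12, 9),
    "114627": date(2024, 11, 9),
}


def closest_snapshot(target, snapshots=SNAPSHOTS):
    """Return the hashtag whose snapshot date is nearest to `target`."""
    # Compare absolute time differences; on a tie, the first entry
    # in dictionary order is returned.
    return min(snapshots, key=lambda tag: abs(snapshots[tag] - target))


print(closest_snapshot(date(2024, 12, 1)))  # 115668
```

Note that the nearest snapshot may be dated after the requested date, so it is worth checking the Date field of the returned record before citing it.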

You can also specify the BGC network to select only the DOIs of snapshots that contain synthetic profile (Sprof) files:

In [5]: doi.search('2020-02', network='BGC')
Out[5]: 
<argopy.DOI.record>
DOI: 10.17882/42182#95141
Title: Argo float data and metadata from Global Data Assembly Centre (Argo GDAC) - Snapshot of BGC Sprof data files of July 28st 2022
Date: 2022-07-28
Network: BGC
Link: https://dx.doi.org/10.17882/42182#95141
File: https://www.seanoe.org/data/00311/42182/data/95141.tgz (3.0GiB, openAccess=True)

DOI data

Once you have identified the hashtag of your snapshot of interest, you can point directly to it:

In [6]: doi = ArgoDOI('109847')

In [7]: doi
Out[7]: 
<argopy.DOI>
DOI: 10.17882/42182#109847
Title: Argo float data and metadata from Global Data Assembly Centre (Argo GDAC) - Snapshot of BGC Sprof data files of April 09st 2024
Date: 2024-04-09
Network: BGC
Link: https://dx.doi.org/10.17882/42182#109847
File: https://www.seanoe.org/data/00311/42182/data/109847.tar.gz (3.9GiB, openAccess=True)

This doi object holds attributes such as dx and file.

You can also trigger the tar.gz archive download with the ArgoDOI.download method.
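Snapshot archives are large (several GiB for the BGC files shown on this page), so it can be useful to inspect a downloaded archive before extracting it. The sketch below uses only the standard tarfile module. The helper name `peek_archive` is hypothetical, and nothing is assumed here about the internal layout of the archive.

```python
import tarfile
from itertools import islice


def peek_archive(path, limit=10):
    """Return the names of the first `limit` members of a gzipped tar archive."""
    # Iterating over the TarFile reads member headers one after another;
    # no file is extracted to disk.
    with tarfile.open(path, mode="r:gz") as archive:
        return [member.name for member in islice(archive, limit)]
```

A call such as `peek_archive("109847.tar.gz")` would list the first ten entries of a local copy of that snapshot, assuming the file was saved under that name.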