
The NIST Public Data Repository (PDR) allows users to download a single data file from a dataset by clicking the download icon to the right of the file's name in the file listing; however, more often you will want to download many files from the dataset.
There are a few options to download files in bulk:
*recommended for datasets with more than 300 files
Download all files through the Data Cart
Download selected portions of a dataset with the Data Cart
In the PDR, you have a general Data Cart available for downloading files in bulk from multiple datasets. To use this feature, add files or folders of files from a dataset to your cart by clicking the "Add-to-Cart" icons on the right side of the file listing. Alternatively, add all the files from a dataset to your cart by clicking the "Add-all-to-Cart" icon above and to the left of the file listing.
After adding the files of interest, you can open your view of the cart by clicking either the "Cart" link in the top-right corner of any dataset's home page or the "Data Cart" link in the navigation bar on the right side of the dataset's page. You will see a listing of the files and folders you have added to the cart. You can browse the list, download individual files, or select the files you wish to download in bulk by clicking the selection boxes at the left. Click the "Download Selected" button to prepare the download. As with the "download-all" feature, a pop-up will show you a list of one or more Zip files containing the selected files; click "Start Download" to start the actual download.
Downloading large datasets using the rclone tool
When PDR demand is high and the dataset you want to download contains a large number of files, the Data Cart may struggle to provide the data. When the number of files is larger than about 300, we recommend you try using rclone, a free, open-source tool for transferring many files to and from remote storage (such as a cloud drive) easily and reliably. The NIST PDR is fully compatible with rclone, making it a useful tool for downloading large datasets in bulk. It is available for all major operating systems, including Linux, macOS, and Windows, and can be installed manually or via common OS software package managers (e.g. apt, rpm, Homebrew).
After installing rclone, you can download all the files in a PDR dataset to a folder under your current directory by typing the following, replacing dataset-id with the dataset's identifier:
rclone copy :http: ./dataset-id/ --http-url http://data.nist.gov/od/ds/dataset-id/ -P
If the download process gets interrupted for any reason, you can rerun the same command, and it will resume the download where it left off.
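Because rerunning the same command resumes an interrupted download, the rerun can be automated. The following is a minimal Python sketch, not part of the PDR tooling: it assumes rclone is installed and on your PATH, and the function names, the retry count, and the wait time are hypothetical choices made for illustration. Only the rclone arguments shown above are used.

```python
import subprocess
import time


def build_rclone_command(dataset_id, dest_root="."):
    """Build the rclone argument list shown above for one PDR dataset."""
    return [
        "rclone", "copy", ":http:", f"{dest_root}/{dataset_id}/",
        "--http-url", f"http://data.nist.gov/od/ds/{dataset_id}/",
        "-P",
    ]


def download_with_retries(dataset_id, max_attempts=5, wait_seconds=30,
                          run=subprocess.run):
    """Rerun the same rclone command until it exits successfully.

    Returns the number of attempts used. The `run` parameter exists only
    so the function can be exercised without invoking rclone.
    """
    command = build_rclone_command(dataset_id)
    for attempt in range(1, max_attempts + 1):
        result = run(command)
        if result.returncode == 0:
            return attempt
        if attempt < max_attempts:
            # Assumed pause before retrying; adjust to taste.
            time.sleep(wait_seconds)
    raise RuntimeError(
        f"rclone did not finish after {max_attempts} attempts"
    )
```

Calling download_with_retries("dataset-id"), with dataset-id replaced by a real identifier, would run the command up to five times and stop at the first successful exit.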
Downloading large datasets using the Python script, pdrdownload.py
Download pdrdownload.py
This script requires Python 3.8 or higher.
Another way to download large datasets conveniently and reliably from the PDR is with our custom Python script, pdrdownload.py. For users who can run a Python script, this script has several advantages.
To see a preview of what you will be downloading from your dataset, type:
python pdrdownload.py -I dataset-id
This will construct and save locally a list of the files in the dataset with the identifier dataset-id, and it will display the total number of files and the total number of bytes available as part of this dataset.
To start the download, type:
python pdrdownload.py -I dataset-id -D
The script has other useful features, such as downloading subsets and producing more verbose output. To see the full list of options available, type:
python pdrdownload.py --help
Programmatic access to NIST data products
The NIST Public Data Repository API allows users to create their own scripts for downloading files in bulk. In particular, one can download a JSON-encoded metadata description of a dataset, which provides the URLs for downloadable files along with other useful information for tracking the data.
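As an illustration of how such a script might use the metadata, the sketch below pulls file paths and download URLs out of a JSON description. It makes several assumptions: the sample record is invented, and the field names "components", "filepath", "downloadURL", and "size" are assumed for illustration, so check them against the metadata you actually retrieve. Fetching the metadata over the network is left out on purpose.

```python
import json

# Invented sample metadata; real records will contain more fields and
# may use different names than the ones assumed here.
SAMPLE_RECORD = """
{
  "components": [
    {"filepath": "data/run1.csv",
     "downloadURL": "https://example.org/ds/data/run1.csv",
     "size": 1024},
    {"filepath": "data"},
    {"filepath": "README.txt",
     "downloadURL": "https://example.org/ds/README.txt",
     "size": 200}
  ]
}
"""


def list_downloadable_files(record):
    """Return (filepath, url, size) for each entry that has a download URL."""
    files = []
    for component in record.get("components", []):
        url = component.get("downloadURL")
        if not url:
            # Entries without a URL (e.g. folders) are skipped.
            continue
        # Fall back to the last URL segment when no file path is given.
        path = component.get("filepath", url.rsplit("/", 1)[-1])
        files.append((path, url, component.get("size")))
    return files


if __name__ == "__main__":
    record = json.loads(SAMPLE_RECORD)
    for path, url, size in list_downloadable_files(record):
        print(f"{path}\t{size}\t{url}")
```

The resulting list could then be handed to any download routine, and the recorded sizes give a simple way to check that each file arrived complete.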