Public Data Resource

GIAB Benchmarking of HG002 Assemblies from HPRC Year 1 Bakeoff

Contact: Justin Zook
Identifier: doi:10.18434/mds2-2578
Version: 1.0 First Released: 2022-06-08 Revised: 2022-06-08

Description

The Human Pangenome Reference Consortium (HPRC) tested which combination of current genome sequencing and automated assembly approaches yields the most complete, accurate, and cost-effective diploid genome assemblies with minimal manual curation. Assemblies were generated for GIAB HG002. NIST evaluated twenty-nine assemblies, using dipcall v0.3 (https://github.com/lh3/dipcall) to produce variant calls from each assembly aligned to GRCh38. The small variant calls were then benchmarked against GIAB benchmark v4.2.1 using hap.py v3.12 (https://github.com/Illumina/hap.py).
Research Topics: Bioscience: Genomics    
Subject Keywords: Human genomics; DNA sequencing; Reference materials; genome assembly; variant calling; benchmarking; HG002; HPRC    
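
The two-step evaluation described above (dipcall to call variants from an assembly, then hap.py to compare those calls against a benchmark) can be sketched as follows. This is a minimal illustration, not the exact NIST invocation: all file names and the output prefix are hypothetical, the tools are assumed to be installed and on PATH, and only the basic positional arguments and the options shown in each tool's README are used. The sketch only builds the command lines; it does not run them.

```python
from typing import List


def dipcall_commands(prefix: str, ref_fa: str, pat_fa: str, mat_fa: str) -> List[str]:
    """Build the shell command lines for a basic dipcall run.

    run-dipcall writes a makefile to stdout; running make on it produces
    <prefix>.dip.vcf.gz (variant calls) and <prefix>.dip.bed (covered regions).
    """
    return [
        f"run-dipcall {prefix} {ref_fa} {pat_fa} {mat_fa} > {prefix}.mak",
        f"make -j2 -f {prefix}.mak",
    ]


def happy_command(truth_vcf: str, query_vcf: str, confident_bed: str,
                  ref_fa: str, out_prefix: str) -> List[str]:
    """Build a basic hap.py comparison as an argument list.

    -f restricts the comparison to the benchmark's confident regions,
    -r names the reference FASTA, -o sets the output file prefix.
    """
    return ["hap.py", truth_vcf, query_vcf,
            "-f", confident_bed, "-r", ref_fa, "-o", out_prefix]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    prefix = "HG002_asm01"
    for line in dipcall_commands(prefix, "GRCh38.fa", "asm01.pat.fa.gz", "asm01.mat.fa.gz"):
        print(line)
    print(" ".join(happy_command(
        "HG002_benchmark.vcf.gz",   # hypothetical name for the GIAB benchmark VCF
        f"{prefix}.dip.vcf.gz",     # dipcall output used as the query
        "HG002_benchmark.bed",      # hypothetical name for the benchmark regions BED
        "GRCh38.fa",
        f"{prefix}_happy",
    )))
```

In this sketch the query VCF passed to hap.py is the `.dip.vcf.gz` file that dipcall produces for the same prefix, so one prefix ties both steps together for each assembly.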

Data Access

These data are public. Access rights statement:
The assemblies provided by the Human Pangenome Reference Consortium (HPRC) for evaluation were generated by the HPRC, whose data use protocol is at https://humanpangenome.org/data-use-protocol/

About This Dataset

Cite this dataset
Justin Zook (2021), GIAB Benchmarking of HG002 Assemblies from HPRC Year 1 Bakeoff, National Institute of Standards and Technology, https://doi.org/10.18434/mds2-2578 (Accessed 2023-10-02)
Repository Metadata
Machine-readable descriptions of this dataset are available in the following formats:
NERDm