57 changes: 57 additions & 0 deletions datasets/telomere-to-telomere-korean-pangenome-project.yaml
@@ -0,0 +1,57 @@
Name: Telomere-to-Telomere Korean Pangenome Project
Description: >
  A 1.55 PB open-access collection of telomere-to-telomere (T2T) Korean genomes,
  including long-read sequencing (Oxford Nanopore ultra-long, PacBio HiFi),
  chromosome conformation capture data, phased diploid assemblies, pangenome
  graph representations, and variant datasets. This resource enables accurate
  construction of the first comprehensive Korean pangenome reference and
  supports population genetics, graph-based alignment, structural variant
  discovery, and precision medicine research for East Asian populations.

Documentation: https://github.com/KoreanPangenome/KoreanPangenome
Contact: [email protected]
ManagedBy: "[National Institute of Health, Republic of Korea](https://www.nih.go.kr/ko/main/main.do)"
UpdateFrequency: >
  New data generated annually; expected growth of approximately 350 TB per year
  between 2025 and 2029.

Tags:
  - aws-pds
  - genomics
  - human genetics
  - pangenome
  - long-read
  - sequencing
  - precision-medicine
  - health

License: >
  Korean Pangenome Project Data Use Terms (open-access for scientific, educational,
  and legitimate public health use under the Bioethics and Safety Act of the
  Republic of Korea). Users must not attempt re-identification and must comply
  with all relevant laws and ethical guidelines.

Citation: >
  Please acknowledge the National Institute of Health, Republic of Korea and the
  Telomere-to-Telomere Korean Pangenome Project (sub-program 6634-320) in any
  publications or derivative works.

Resources:
  - Description: >
      Korean Pangenome Project data including long-read sequencing, assemblies,
      and variant resources. Full resource description and S3 bucket details
      will be updated upon public release.
    ARN: ""
    Region: ""
    Type: S3 Bucket

DataAtWork:
  Tutorials:
    - Title: "Get To Know A Dataset: Telomere-to-Telomere Korean Pangenome Project"
      URL: https://github.com/KoreanPangenome/open-data-examples/tree/main/korean-pangenome
      NotebookURL: https://github.com/KoreanPangenome/open-data-examples/blob/main/korean-pangenome/get-to-know-a-dataset.ipynb
      AuthorName: National Institute of Health, Republic of Korea
      AuthorURL: https://www.nih.go.kr/ko/main/main.do

ADXCategories:
  - Healthcare & Life Sciences