Dear @zhengxwen ,
I hope you are keeping well.
Could you please recommend the best practice for addressing the N+1 problem with GDS, i.e. adding new samples incrementally in small batches? For example, our starting batch could be a biobank-scale dataset such as UKB, which we are able to convert from pVCF to GDS. When we want to add new samples, the seqMerge function works, but it has to generate a new output file, so the storage requirement at least doubles (temporarily). Is there a way to avoid this by appending the new samples directly to the existing GDS file? Any advice or guidance would be much appreciated.
Best wishes,
Fengyuan