Clarified INFO END deprecation status #844

d-cameron · 2025-09-09T15:41:01Z

Addresses concerns raised in #784

github-actions · 2025-09-09T15:43:56Z

Changed PDFs as of 5132c8b: VCFv4.5 (diff).

jmarshall · 2025-09-09T19:30:58Z

VCFv4.5.tex

-##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Data">
+##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of \begin{environment-name}
+	Samples
+\end{environment-name} With Data">


Unintended \begin{environment-name} … edit here?

jkbonfield · 2025-09-11T10:47:45Z

VCFv4.5.tex

+
+It is recommended that VCFv4.5 files include END unless that VCF contains any record that could be misinterpreted by the presence of END.
+That is, if there exists a sample or allele in which the END computed for that SVLEN or FORMAT LEN does not equal the maximum END, then no END should be present in any record that VCF.
+This approachs maintains backwards compatibility for unproblematic VCFs while attempting to minimise the probability of downstream data errors by making problematic records not valid for earlier versions of VCF (END was required for $<$*$>$ symbolic alleles).


"approachs" should be "approaches".

pd3 · 2025-09-11T10:58:06Z

VCFv4.5.tex

+Those same tools will incorrectly interpret the size of the smaller symbolic structural variants and $<$*$>$ symbolic alleles when END is present.
+
+It is recommended that VCFv4.5 files include END unless that VCF contains any record that could be misinterpreted by the presence of END.
+That is, if there exists a sample or allele in which the END computed for that SVLEN or FORMAT LEN does not equal the maximum END, then no END should be present in any record that VCF.


I find the current wording confusing. May I suggest rephrasing along the following lines:

Clarify that END is a derived field. If it is absent, it can be computed in such and such way.
(Therefore, not deprecated. Using the term deprecated raises unnecessary doubt: should newly written software still support END? The answer is yes, it must remain supported. So it’s better to avoid language that implies otherwise.)

Clarify the handling of inconsistencies. I do not fully understand what the other paragraphs are trying to convey. My interpretation is that they intend to describe what happens if END is computed incorrectly or conflicts with the primary information. Practically speaking, the responsibility lies with the producer to ensure consistency, and each program may choose how to handle discrepancies. If an analysis relies on the END tag, it will not recompute it from the primary fields (otherwise we would not need END in the first place). Conversely, if an analysis works directly from the primary fields, it is expected to ignore END, since END is derived.

Clarify the comparison of END and LEN. If a comparison between END and LEN is important, the text should explain explicitly in what ways the two differ and in what ways they are equivalent. Although I am fairly familiar with the VCF format, the current paragraph did not make this distinction clear.
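
For reference, the consistency condition in the quoted paragraph can be sketched in Python. This is a minimal illustration under stated assumptions, not wording proposed for the specification. The function names `implied_end` and `record_may_carry_end` are hypothetical, and the per-allele rules are assumptions based on the INFO END definition as commonly read from the VCF specification: `POS + len(REF) - 1` for non-symbolic and `<INS>` alleles, `POS + SVLEN` for `<DEL>`, `<DUP>`, `<INV>` and `<CNV>`, and `POS + LEN - 1` for `<*>`.

```python
from typing import Iterable, Optional

# Symbolic SV types whose END is assumed to be POS + SVLEN.
SVLEN_TYPES = {"DEL", "DUP", "INV", "CNV"}


def implied_end(pos: int, ref: str, alt: str,
                svlen: Optional[int] = None,
                sample_len: Optional[int] = None) -> int:
    """Return the END implied by one allele (or one sample's <*> block).

    Assumed rules (not confirmed by the thread itself):
      <*>                 -> POS + LEN - 1   (LEN is the per-sample FORMAT value)
      <DEL|DUP|INV|CNV>   -> POS + SVLEN
      anything else       -> POS + len(REF) - 1   (includes <INS>)
    """
    if alt == "<*>":
        if sample_len is None:
            raise ValueError("<*> needs a FORMAT LEN value")
        return pos + sample_len - 1
    if alt.startswith("<") and alt.endswith(">"):
        sv_type = alt[1:-1].split(":")[0]  # "<DUP:TANDEM>" -> "DUP"
        if sv_type in SVLEN_TYPES:
            if svlen is None:
                raise ValueError(f"{alt} needs an SVLEN value")
            # abs() tolerates the older convention of negative SVLEN for deletions.
            return pos + abs(svlen)
    return pos + len(ref) - 1


def record_may_carry_end(implied_ends: Iterable[int]) -> bool:
    """True when every implied END equals the maximum, so a single INFO END
    value cannot misrepresent any allele or sample in the record."""
    ends = list(implied_ends)
    return bool(ends) and all(e == max(ends) for e in ends)


# One record at POS=100 with a 50 bp <DEL> and a 20 bp <DUP>:
ends = [implied_end(100, "A", "<DEL>", svlen=50),
        implied_end(100, "A", "<DUP>", svlen=20)]
print(ends, record_may_carry_end(ends))  # [150, 120] False
```

Under the quoted recommendation, a file containing a record like the one above would omit END from every record, because a single END of 150 would overstate the extent of the shorter `<DUP>` allele.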

Clarified INFO END deprecation status

5132c8b

d-cameron mentioned this pull request Sep 9, 2025

INFO/END should not be deprecated #784

Open

jmarshall reviewed Sep 9, 2025

jmarshall added the vcf label Sep 9, 2025

jkbonfield reviewed Sep 11, 2025

pd3 reviewed Sep 11, 2025

5 participants