This repository has been archived by the owner on Jan 25, 2023. It is now read-only.

Is mateName missing something? #277

Open
teemukataja opened this issue Apr 1, 2019 · 4 comments

Comments

@teemukataja
Contributor

#256 added a new property called mateName as a parameter to a variant query. Is this new feature incomplete? Should mateName be paired up with a coordinate to specify where in the mate chromosome the bonding happens?

Looking at https://samtools.github.io/hts-specs/VCFv4.3.pdf chapter 5.4.4 page 20 for reference.

How would one write a mateName query? We would probably need mateStart, mateStartMin and mateStartMax in addition to the newly created parameter.

Queries would then look something like this, for example:

using referenceName, start, mateName, mateStart for
1 : 1000 - 2 : 2000

or using variantType as
1 : 1000 > BND

@mbaudis
Member

mbaudis commented Apr 8, 2019

@teemukataja In the current proposal, mateName specifies the chromosome of the end position. A BND with a specified mateName would correspond to a translocation if the mate is on a different chromosome.

          description: |
            Second chromosome for fusion events. This can be
            * empty (no fusion or unknown partner)
            * identical to `referenceName` (e.g. one side of an inversion)
            * a different chromosome

IMO we don't need a separate mateStart; we would just specify that the chromosomes should be ordered (for search):

"reference_name": "8",
"start_min": 128400000,
"start_max": 129400000,
"mate_name": "22",
"end_min": 23250000,
"end_max": 23280000,

(comments also on #256 (comment)).
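The "ordered chromosomes" idea could be sketched as follows. This is only an illustration, not part of the Beacon specification: the ordering convention (1 to 22, then X, Y, MT) and the helper names `chrom_sort_key` and `order_breakends` are assumptions made for the example.

```python
def chrom_sort_key(name):
    """Sort key for chromosome names: numeric ones first, then X, Y, MT.

    The ordering itself is an assumption for this sketch; unknown contig
    names sort last, alphabetically.
    """
    stripped = name[3:] if name.lower().startswith("chr") else name
    if stripped.isdigit():
        return (0, int(stripped), "")
    special = {"X": 23, "Y": 24, "MT": 25, "M": 25}
    if stripped.upper() in special:
        return (0, special[stripped.upper()], "")
    return (1, 0, stripped)


def order_breakends(chrom_a, pos_a, chrom_b, pos_b):
    """Return query parameters with the lower-ordered breakend first.

    With a fixed order, a fusion is stored and searched one way only,
    so no separate mateStart parameter is needed.
    """
    first = (chrom_sort_key(chrom_a), pos_a)
    second = (chrom_sort_key(chrom_b), pos_b)
    if second < first:
        chrom_a, pos_a, chrom_b, pos_b = chrom_b, pos_b, chrom_a, pos_a
    return {
        "referenceName": chrom_a,
        "start": pos_a,
        "mateName": chrom_b,
        "end": pos_b,
    }


# Illustrative positions only: the breakend on chromosome 22 is given
# first but ends up as the mate after ordering.
params = order_breakends("22", 23266044, "8", 128900000)
print(params)
```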

@blankdots

blankdots commented Apr 16, 2019

@mbaudis Could you provide any example queries (e.g. POST or GET) and responses (JSON response) on how this functionality can be utilised? I could not find any in the issues or in the API specs.

I would also like to validate some assumptions:

  • if alternateBases or variantType can be used with mateName (seems like no);
  • if a query specifies variantType=BND is mateName required or not (seems like no).

@mbaudis
Member

mbaudis commented Apr 17, 2019

@teemukataja See the example above, which corresponds to an imprecise fusion event (e.g. a MYC-IGL translocation, variant Burkitt lymphoma). A precise query (which doesn't make much sense, since breakpoints rarely recur at a specific position) would be:

?referenceName=8&start=1289234404&mateName=22&end=23266044&variantType=BND
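For anyone wanting to build such a query string programmatically, a minimal sketch using only the Python standard library is given here. The parameter names and values are copied from the query above; the endpoint URL is deliberately left out, since it depends on the Beacon instance.

```python
from urllib.parse import urlencode

# Parameters of the precise BND query, in the same order as above.
precise_query = {
    "referenceName": "8",
    "start": 1289234404,
    "mateName": "22",
    "end": 23266044,
    "variantType": "BND",
}

# urlencode keeps the dict's insertion order and percent-encodes values
# where needed (nothing needs encoding in this example).
query_string = "?" + urlencode(precise_query)
print(query_string)
```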

This would correspond to 2 lines in VCF, where the corresponding mate would be represented in the ALT and INFO fields:

#CHROM POS ID REF ALT QUAL FILTER INFO
8 1289234404 bnd_A C C]22:23266044] 6 PASS SVTYPE=BND;MATEID=bnd_B
22 23266044 bnd_B A [8:1289234404[A 6 PASS SVTYPE=BND;MATEID=bnd_A
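To show how the mate information sits in the ALT field, here is a rough sketch that extracts the mate chromosome and position from a breakend ALT string. The function name `parse_bnd_mate` is hypothetical, and the pattern only covers the simple bracket forms shown above. It assumes plain contig names without colons and ignores the bracket orientation.

```python
import re

# Breakend ALT: optional bases, a bracket, "chrom:pos", the same kind of
# bracket again, optional bases (e.g. "C]22:23266044]" or "[8:123[A").
BND_ALT = re.compile(
    r"(?P<prefix>[ACGTNacgtn]*)"
    r"(?P<bracket>[\[\]])"
    r"(?P<chrom>[^:\[\]]+):(?P<pos>\d+)"
    r"(?P=bracket)"
    r"(?P<suffix>[ACGTNacgtn]*)"
)


def parse_bnd_mate(alt):
    """Return (mate_chromosome, mate_position), or None if not a breakend ALT."""
    match = BND_ALT.fullmatch(alt)
    if match is None:
        return None
    return match.group("chrom"), int(match.group("pos"))


# The two ALT values from the VCF lines above.
print(parse_bnd_mate("C]22:23266044]"))
print(parse_bnd_mate("[8:1289234404[A"))
```

The returned chromosome and position are what `mateName` and `end` would carry in the query model discussed in this thread.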

The VCF contains additional information about the directionality of the fusion which we don't consider right now (not really important for query models but could be specified later on).

The following would be a typical variation of the query, in which we look for a fusion between canonical breakpoint regions using range matches (same genes):

?referenceName=8&startMin=128400000&startMax=129400000&mateName=22&endMin=23250000&endMax=23280000&variantType=BND
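On the server side, a range match like this could look roughly as follows. This is a sketch under stated assumptions, not how any Beacon implementation actually does it: the variant record layout, the function name `matches_fusion_query` and the three example records (including their positions) are made up for illustration.

```python
def matches_fusion_query(variant, query):
    """True if both breakends of the variant fall into the queried ranges."""
    return (
        variant["variantType"] == query["variantType"]
        and variant["referenceName"] == query["referenceName"]
        and variant["mateName"] == query["mateName"]
        and query["startMin"] <= variant["start"] <= query["startMax"]
        and query["endMin"] <= variant["end"] <= query["endMax"]
    )


# Range query from above.
query = {
    "referenceName": "8", "startMin": 128400000, "startMax": 129400000,
    "mateName": "22", "endMin": 23250000, "endMax": 23280000,
    "variantType": "BND",
}

# Hypothetical stored variants; ids and positions are invented.
variants = [
    {"id": "hypothetical_1", "variantType": "BND", "referenceName": "8",
     "start": 128750000, "mateName": "22", "end": 23265000},
    {"id": "hypothetical_2", "variantType": "BND", "referenceName": "8",
     "start": 128750000, "mateName": "14", "end": 105860000},  # other partner
    {"id": "hypothetical_3", "variantType": "BND", "referenceName": "8",
     "start": 120000000, "mateName": "22", "end": 23265000},  # start out of range
]

matched = [v["id"] for v in variants if matches_fusion_query(v, query)]
print(matched)
```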

Current Beacon responses would be just standard. Since in example 2 multiple fusion events could be matched, we could deliver the different matched variants (in some TBD format) in the response (either through handover or in the response message - other discussion).

@mbaudis
Member

mbaudis commented Apr 17, 2019

@teemukataja For BND variant queries w/o a mateName, all types of variants representing a structural sequence disruption could be queried. In our Beacon+ instance, we just match e.g. on the start and end positions of CNV events; obviously BND; possibly INS ...
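One possible reading of this matching rule is sketched below: a BND query without a mateName matches any variant with a boundary (start or end) inside the queried region. This is an interpretation for illustration only, not the actual Beacon+ code. The record layout, the function name `disrupts_region` and the example variants are assumptions.

```python
def disrupts_region(variant, reference_name, range_min, range_max):
    """True if a boundary of the variant lies inside the queried region.

    The end position only counts when it is on the queried chromosome,
    i.e. when the variant has no mate on another chromosome.
    """
    if variant["referenceName"] != reference_name:
        return False
    boundaries = [variant["start"]]
    same_chromosome = variant.get("mateName") in (None, reference_name)
    if variant.get("end") is not None and same_chromosome:
        boundaries.append(variant["end"])
    return any(range_min <= pos <= range_max for pos in boundaries)


# Hypothetical variants; types and positions are invented for the example.
examples = [
    {"id": "del_ends_inside", "variantType": "DEL", "referenceName": "8",
     "start": 128300000, "end": 128500000},
    {"id": "dup_spans_region", "variantType": "DUP", "referenceName": "8",
     "start": 127000000, "end": 130000000},
    {"id": "bnd_to_chr22", "variantType": "BND", "referenceName": "8",
     "start": 128750000, "mateName": "22", "end": 23265000},
]

hits = [v["id"] for v in examples
        if disrupts_region(v, "8", 128400000, 129400000)]
print(hits)
```

Note that the duplication spanning the whole region is not a hit here, because neither of its boundaries falls inside the region.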


3 participants