Future InChIKey should have the "standard/nonstandard" identifier moved into a different hyphen-delimited part #191
Artoria2e5
started this conversation in
Ideas
Replies: 0 comments
The current InChIKey puts the "standard/nonstandard" (S/N) letter in the second "word", after the hash for everything else and before the version letter. This makes resolving very fragile.
Take morphine as an example:

InChI=1S/C17H19NO3/c1-18-7-6-17-10-3-5-13(20)16(17)21-15-12(19)4-2-9(14(15)17)8-11(10)18/h2-5,10-11,13,16,19-20H,6-8H2,1H3/t10-,11+,13-,16-,17-/m0/s1
BQJCRHHNABKAKU-KBQPJGBKSA-N

Any "standard InChI" would meet all requirements for being a "nonstandard" (or rather, "not claiming to be standard") InChI. So the following is equivalent:

InChI=1/C17H19NO3/c1-18-7-6-17-10-3-5-13(20)16(17)21-15-12(19)4-2-9(14(15)17)8-11(10)18/h2-5,10-11,13,16,19-20H,6-8H2,1H3/t10-,11+,13-,16-,17-/m0/s1
BQJCRHHNABKAKU-KBQPJGBKNA-N

That's a trivially reversible one-letter change, yet the InChI resolution services from UniChem to NCI/CADD just cannot see through it. If you are even braver and use a search engine, then, welp, the best that their word-splitting algorithm can do is give you a match for BQJCRHHNABKAKU. Good enough for sleepy things like morphine, terrible for invigorating 2-bornanols, where several stereoisomers share that first block.

So what's the lesson here? Well, resolvers should really just match the BQJCRHHNABKAKU-KBQPJGBK part. Given that the universe of possible chemical connectivities is much greater than the universe of possible stereochemistry-and-whatevers, the odds of getting a collision this way are even lower than the odds of getting a collision on the first part.

Of course this won't help search engines, which is why I hope a future version can move the hyphens around a bit. RInChI (the non-web versions) has the right idea about this. It puts the standardness and version identifier "SA" at the start, as an independent element and not down the middle.
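For resolver authors, the "just match the hash part" idea could be sketched roughly as below. This is a minimal illustration, not how UniChem or NCI/CADD actually work. The function names (`split_inchikey`, `match_key`, `same_structure`) are made up for this post, and the regex assumes the usual 14 + 8 + flag + version + protonation layout of an InChIKey.

```python
import re

# Assumed layout: 14 letters, hyphen, 8 letters + flag letter + version letter,
# hyphen, 1 protonation letter. Names below are hypothetical, for illustration.
_INCHIKEY_RE = re.compile(r"^([A-Z]{14})-([A-Z]{8})([A-Z])([A-Z])-([A-Z])$")


def split_inchikey(key: str) -> dict:
    """Split an InChIKey into its fields; raise ValueError if it does not fit."""
    m = _INCHIKEY_RE.match(key.strip().upper())
    if m is None:
        raise ValueError(f"not an InChIKey: {key!r}")
    connectivity, other_layers, flag, version, protonation = m.groups()
    return {
        "connectivity": connectivity,  # first block, e.g. BQJCRHHNABKAKU
        "other_layers": other_layers,  # first 8 letters of the second block
        "flag": flag,                  # S = standard, N = nonstandard
        "version": version,            # A for InChI version 1
        "protonation": protonation,    # last block
    }


def match_key(key: str) -> str:
    """Return only the two hash parts, dropping flag, version and protonation.

    This follows the suggestion in the post; a real resolver may want to keep
    the protonation letter as well.
    """
    parts = split_inchikey(key)
    return parts["connectivity"] + "-" + parts["other_layers"]


def same_structure(key_a: str, key_b: str) -> bool:
    """Compare two InChIKeys while ignoring the standard/nonstandard letter."""
    return match_key(key_a) == match_key(key_b)


if __name__ == "__main__":
    standard = "BQJCRHHNABKAKU-KBQPJGBKSA-N"
    nonstandard = "BQJCRHHNABKAKU-KBQPJGBKNA-N"
    print(match_key(standard))
    print(same_structure(standard, nonstandard))
```

With the two morphine keys above, both reduce to the same 14 + 8 letter string, so a lookup table keyed on `match_key` would find the record regardless of which flag letter the user pasted.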