@@ -16,6 +16,207 @@ chemistry in Wikidata.
1616
1717For up-to-date information about properties used for chemicals: <https://www.wikidata.org/wiki/Wikidata:WikiProject_Chemistry/Properties#Up-to-date_list_of_properties_about_chemical>
1818
19+ ## Single value
20+
21+ Many external identifiers are expected to be found on only a single Wikidata item. This is particularly
22+ the case when the external database uses the InChIKey as its uniqueness criterion, as Wikidata does.
23+
24+ All Wikidata properties have discussion pages that list constraint violations, and most of these
25+ have matching SPARQL queries. For example, the <a name="tp4">ChEMBL ID</a> (P592) has the following SPARQL
26+ query to find ChEMBL identifiers used on more than one Wikidata item:
27+
28+ **SPARQL** [sparql/P592UniqueValue.rq](sparql/P592UniqueValue.code.html) ([run](https://query.wikidata.org/embed.html#%23%20Unique%20value%20constraint%20report%20for%20P592%3A%20report%20by%20value%0A%0ASELECT%0A%20%20%20%20%3Fvalue%20%28SAMPLE%28%3Fct%29%20AS%20%3Fct%29%0A%20%20%20%20%28GROUP_CONCAT%28DISTINCT%28STRAFTER%28STR%28%3Fitem%29%2C%20%22%2Fentity%2F%22%29%29%3B%20separator%3D%22%2C%20%22%29%20AS%20%3Fitems%29%0A%20%20%20%20%28GROUP_CONCAT%28DISTINCT%28%3FitemLabel%29%3B%20separator%3D%22%2C%20%22%29%20AS%20%3Flabels%29%0AWHERE%0A%7B%0A%20%20%09%7B%20%09SELECT%20%3Fvalue%20%28COUNT%28DISTINCT%20%3Fitem%29%20as%20%3Fct%29%0A%20%20%09%09WHERE%0A%20%20%09%09%7B%0A%20%20%09%09%09%3Fitem%20wdt%3AP592%20%3Fvalue%0A%09%09%7D%0A%20%20%20%20%09GROUP%20BY%20%3Fvalue%20HAVING%20%28%3Fct%3E1%29%0A%20%20%20%20%09ORDER%20BY%20DESC%28%3Fct%29%0A%20%20%20%20%09LIMIT%20100%0A%09%7D%0A%20%20%09%3Fitem%20wdt%3AP592%20%3Fvalue%20.%0A%09SERVICE%20wikibase%3Alabel%20%7B%0A%20%20%20%20%09bd%3AserviceParam%20wikibase%3Alanguage%20%22en%2C%20mul%22%20.%0A%20%20%20%20%09%3Fitem%20rdfs%3Alabel%20%3FitemLabel%20.%0A%20%20%09%7D%0A%7D%0AGROUP%20BY%20%3Fvalue%0AORDER%20BY%20DESC%28%3Fct%29%0A), 
[edit](https://query.wikidata.org/#%23%20Unique%20value%20constraint%20report%20for%20P592%3A%20report%20by%20value%0A%0ASELECT%0A%20%20%20%20%3Fvalue%20%28SAMPLE%28%3Fct%29%20AS%20%3Fct%29%0A%20%20%20%20%28GROUP_CONCAT%28DISTINCT%28STRAFTER%28STR%28%3Fitem%29%2C%20%22%2Fentity%2F%22%29%29%3B%20separator%3D%22%2C%20%22%29%20AS%20%3Fitems%29%0A%20%20%20%20%28GROUP_CONCAT%28DISTINCT%28%3FitemLabel%29%3B%20separator%3D%22%2C%20%22%29%20AS%20%3Flabels%29%0AWHERE%0A%7B%0A%20%20%09%7B%20%09SELECT%20%3Fvalue%20%28COUNT%28DISTINCT%20%3Fitem%29%20as%20%3Fct%29%0A%20%20%09%09WHERE%0A%20%20%09%09%7B%0A%20%20%09%09%09%3Fitem%20wdt%3AP592%20%3Fvalue%0A%09%09%7D%0A%20%20%20%20%09GROUP%20BY%20%3Fvalue%20HAVING%20%28%3Fct%3E1%29%0A%20%20%20%20%09ORDER%20BY%20DESC%28%3Fct%29%0A%20%20%20%20%09LIMIT%20100%0A%09%7D%0A%20%20%09%3Fitem%20wdt%3AP592%20%3Fvalue%20.%0A%09SERVICE%20wikibase%3Alabel%20%7B%0A%20%20%20%20%09bd%3AserviceParam%20wikibase%3Alanguage%20%22en%2C%20mul%22%20.%0A%20%20%20%20%09%3Fitem%20rdfs%3Alabel%20%3FitemLabel%20.%0A%20%20%09%7D%0A%7D%0AGROUP%20BY%20%3Fvalue%0AORDER%20BY%20DESC%28%3Fct%29%0A))
29+
30+ ``` sparql
31+ SELECT
32+   ?value (SAMPLE(?ct) AS ?ct)
33+   (GROUP_CONCAT(DISTINCT(STRAFTER(STR(?item), "/entity/")); separator=", ") AS ?items)
34+   (GROUP_CONCAT(DISTINCT(?itemLabel); separator=", ") AS ?labels)
35+ WHERE
36+ {
37+   { SELECT ?value (COUNT(DISTINCT ?item) as ?ct)
38+     WHERE
39+     {
40+       ?item wdt:P592 ?value
41+     }
42+     GROUP BY ?value HAVING (?ct>1)
43+     ORDER BY DESC(?ct)
44+     LIMIT 100
45+   }
46+   ?item wdt:P592 ?value .
47+   SERVICE wikibase:label {
48+     bd:serviceParam wikibase:language "en, mul" .
49+     ?item rdfs:label ?itemLabel .
50+   }
51+ }
52+ GROUP BY ?value
53+ ORDER BY DESC(?ct)
54+ ```
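
The run and edit links above appear to carry the query text percent-encoded after the `#` of the Wikidata Query Service URL. A minimal Python sketch of building such links, assuming that encoding; the helper name `query_links` is hypothetical and not part of this repository:

```python
from urllib.parse import quote

# Base URLs as they appear in the "run" and "edit" links of this section.
RUN_BASE = "https://query.wikidata.org/embed.html#"
EDIT_BASE = "https://query.wikidata.org/#"


def query_links(sparql: str) -> dict:
    """Return 'run' and 'edit' links for a SPARQL query string (hypothetical helper)."""
    # safe="" also escapes "/" and ":", which matches the encoded links above.
    encoded = quote(sparql, safe="")
    return {"run": RUN_BASE + encoded, "edit": EDIT_BASE + encoded}


if __name__ == "__main__":
    example = "SELECT ?item WHERE { ?item wdt:P592 ?value } LIMIT 5"
    links = query_links(example)
    print(links["run"])
    print(links["edit"])
```

Generating both links from the same query string keeps them in sync with the query file they point to.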
55+
56+ While the SPARQL query above returns a few false positives caused by tautomerism, it provides a useful
57+ list to check regularly:
58+
59+ <table >
60+ <tr >
61+ <td><b>value</b></td>
62+ <td><b>ct</b></td>
63+ <td><b>items</b></td>
64+ <td><b>labels</b></td>
65+ </tr >
66+ <tr >
67+ <td>CHEMBL521177</td>
68+ <td>2</td>
69+ <td>Q6469057, Q105287434</td>
70+ <td>lactucin, 4-epi-lactucin</td>
71+ </tr >
72+ <tr >
73+ <td>CHEMBL1206440</td>
74+ <td>2</td>
75+ <td>Q2130929, Q27124801</td>
76+ <td>cyclamic acid, cyclamate</td>
77+ </tr >
78+ <tr >
79+ <td>CHEMBL2303614</td>
80+ <td>2</td>
81+ <td>Q408014, Q74511001</td>
82+ <td>chondroitin sulfate, (2S,3S,4S,5R,6S)-6-[[(2R,3R,4S,5R,6S)-3-Acetamido-2,6-bis(hydroxymethyl)-5-(sulfomethyl)oxan-4-yl]methoxymethyl]-4,5-dihydroxy-3-(hydroxymethyl)oxane-2-carboxylic acid</td>
83+ </tr >
84+ <tr >
85+ <td>CHEMBL1433</td>
86+ <td>2</td>
87+ <td>Q422442, Q82982262</td>
88+ <td>doxycycline, doxycycline tautomer</td>
89+ </tr >
90+ <tr >
91+ <td>CHEMBL91</td>
92+ <td>2</td>
93+ <td>Q410534, Q75163056</td>
94+ <td>miconazole, rac-miconazole</td>
95+ </tr >
96+ <tr >
97+ <td>CHEMBL1201341</td>
98+ <td>2</td>
99+ <td>Q4352952, Q27077150</td>
100+ <td>echothiophate</td>
101+ </tr >
102+ <tr >
103+ <td>CHEMBL1201668</td>
104+ <td>2</td>
105+ <td>Q6997373, Q66360952</td>
106+ <td>nesiritide, brain natriuretic peptide</td>
107+ </tr >
108+ <tr >
109+ <td>CHEMBL2110884</td>
110+ <td>2</td>
111+ <td>Q27284343, Q76005793</td>
112+ <td>cetocycline, cetocycline tautomer</td>
113+ </tr >
114+ <tr><td colspan="2"><a href="sparql/P592UniqueValue.code.html">sparql/P592UniqueValue.rq</a></td></tr>
115+ </table >
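
The grouping that the query above performs (collect the items per identifier, keep identifiers with more than one item) can be sketched locally in Python. This is only an illustration on in-memory pairs, not how the constraint report is produced. The function name `shared_values` is hypothetical, and `Q0` and `CHEMBL0` are made-up placeholders:

```python
from collections import defaultdict


def shared_values(pairs):
    """Map each identifier used on more than one item to its sorted list of items."""
    items_by_value = defaultdict(set)
    for item, value in pairs:
        items_by_value[value].add(item)
    # Keep only identifiers that occur on at least two distinct items.
    return {
        value: sorted(items)
        for value, items in items_by_value.items()
        if len(items) > 1
    }


if __name__ == "__main__":
    pairs = [
        # Two rows taken from the table above.
        ("Q410534", "CHEMBL91"),
        ("Q75163056", "CHEMBL91"),
        ("Q422442", "CHEMBL1433"),
        ("Q82982262", "CHEMBL1433"),
        # Hypothetical item with an identifier that is not shared.
        ("Q0", "CHEMBL0"),
    ]
    for value, items in shared_values(pairs).items():
        print(value, len(items), ", ".join(items))
```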
116+
117+ ## Unique value
118+
119+ Similarly, we can use SPARQL to find Wikidata items with more than one ChEMBL identifier:
120+
121+ **SPARQL** [sparql/P592SingleValue.rq](sparql/P592SingleValue.code.html) ([run](https://query.wikidata.org/embed.html#SELECT%20DISTINCT%20%3FitemLabel%20%3FitemLabelURL%20%3Fcount%20%3Fsample1%20%3Fsample2%20%3Fexception%0AWITH%20%7B%0A%09SELECT%20%3Fformatter%20WHERE%20%7B%0A%09%09OPTIONAL%20%7B%20wd%3AP592%20wdt%3AP1630%20%3Fformatter%20%7D%0A%09%7D%20LIMIT%201%0A%7D%20AS%20%25formatter%0AWHERE%0A%7B%0A%09%7B%0A%09%09SELECT%20%3Fitem%20%28COUNT%28%3Fvalue%29%20AS%20%3Fcount%29%20%28MIN%28%3Fvalue%29%20AS%20%3Fsample1%29%20%28MAX%28%3Fvalue%29%20AS%20%3Fsample2%29%20%7B%0A%09%09%09%3Fitem%20p%3AP592%20%5B%20ps%3AP592%20%3Fval%3B%20wikibase%3Arank%20%3Frank%20%5D%20.%0A%09%09%09FILTER%28%20%3Frank%20%21%3D%20wikibase%3ADeprecatedRank%20%29%20.%0A%09%09%09INCLUDE%20%25formatter%20.%0A%09%09%09BIND%28%20IF%28%20BOUND%28%20%3Fformatter%20%29%2C%20URI%28%20REPLACE%28%20%3Fformatter%2C%20%27%5C%5C%241%27%2C%20%3Fval%20%29%20%29%2C%20%3Fval%20%29%20AS%20%3Fvalue%20%29%20.%0A%09%09%7D%20GROUP%20BY%20%3Fitem%20HAVING%20%28%20%3Fcount%20%3E%201%20%29%20LIMIT%20100%0A%09%7D%20.%0A%09OPTIONAL%20%7B%0A%09%09wd%3AP592%20p%3AP2302%20%5B%20ps%3AP2302%20wd%3AQ19474404%3B%20pq%3AP2303%20%3Fexc%20%5D%20.%0A%09%09FILTER%28%20%3Fexc%20%3D%20%3Fitem%20%29%20.%0A%09%7D%20.%0A%09BIND%28%20BOUND%28%20%3Fexc%20%29%20AS%20%3Fexception%20%29%20.%0A%20%20%20%20BIND%20%28%3Fitem%20AS%20%3FitemLabelURL%29%0A%09SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%2Cmul%22%20%7D%20.%0A%7D%0AORDER%20BY%20DESC%28%3Fcount%29%0A), 
[edit](https://query.wikidata.org/#SELECT%20DISTINCT%20%3FitemLabel%20%3FitemLabelURL%20%3Fcount%20%3Fsample1%20%3Fsample2%20%3Fexception%0AWITH%20%7B%0A%09SELECT%20%3Fformatter%20WHERE%20%7B%0A%09%09OPTIONAL%20%7B%20wd%3AP592%20wdt%3AP1630%20%3Fformatter%20%7D%0A%09%7D%20LIMIT%201%0A%7D%20AS%20%25formatter%0AWHERE%0A%7B%0A%09%7B%0A%09%09SELECT%20%3Fitem%20%28COUNT%28%3Fvalue%29%20AS%20%3Fcount%29%20%28MIN%28%3Fvalue%29%20AS%20%3Fsample1%29%20%28MAX%28%3Fvalue%29%20AS%20%3Fsample2%29%20%7B%0A%09%09%09%3Fitem%20p%3AP592%20%5B%20ps%3AP592%20%3Fval%3B%20wikibase%3Arank%20%3Frank%20%5D%20.%0A%09%09%09FILTER%28%20%3Frank%20%21%3D%20wikibase%3ADeprecatedRank%20%29%20.%0A%09%09%09INCLUDE%20%25formatter%20.%0A%09%09%09BIND%28%20IF%28%20BOUND%28%20%3Fformatter%20%29%2C%20URI%28%20REPLACE%28%20%3Fformatter%2C%20%27%5C%5C%241%27%2C%20%3Fval%20%29%20%29%2C%20%3Fval%20%29%20AS%20%3Fvalue%20%29%20.%0A%09%09%7D%20GROUP%20BY%20%3Fitem%20HAVING%20%28%20%3Fcount%20%3E%201%20%29%20LIMIT%20100%0A%09%7D%20.%0A%09OPTIONAL%20%7B%0A%09%09wd%3AP592%20p%3AP2302%20%5B%20ps%3AP2302%20wd%3AQ19474404%3B%20pq%3AP2303%20%3Fexc%20%5D%20.%0A%09%09FILTER%28%20%3Fexc%20%3D%20%3Fitem%20%29%20.%0A%09%7D%20.%0A%09BIND%28%20BOUND%28%20%3Fexc%20%29%20AS%20%3Fexception%20%29%20.%0A%20%20%20%20BIND%20%28%3Fitem%20AS%20%3FitemLabelURL%29%0A%09SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%2Cmul%22%20%7D%20.%0A%7D%0AORDER%20BY%20DESC%28%3Fcount%29%0A))
122+
123+ ``` sparql
124+ SELECT DISTINCT ?itemLabel ?itemLabelURL ?count ?sample1 ?sample2 ?exception
125+ WITH {
126+   SELECT ?formatter WHERE {
127+     OPTIONAL { wd:P592 wdt:P1630 ?formatter }
128+   } LIMIT 1
129+ } AS %formatter
130+ WHERE
131+ {
132+   {
133+     SELECT ?item (COUNT(?value) AS ?count) (MIN(?value) AS ?sample1) (MAX(?value) AS ?sample2) {
134+       ?item p:P592 [ ps:P592 ?val; wikibase:rank ?rank ] .
135+       FILTER( ?rank != wikibase:DeprecatedRank ) .
136+       INCLUDE %formatter .
137+       BIND( IF( BOUND( ?formatter ), URI( REPLACE( ?formatter, '\\$1', ?val ) ), ?val ) AS ?value ) .
138+     } GROUP BY ?item HAVING ( ?count > 1 ) LIMIT 100
139+   } .
140+   OPTIONAL {
141+     wd:P592 p:P2302 [ ps:P2302 wd:Q19474404; pq:P2303 ?exc ] .
142+     FILTER( ?exc = ?item ) .
143+   } .
144+   BIND( BOUND( ?exc ) AS ?exception ) .
145+   BIND (?item AS ?itemLabelURL)
146+   SERVICE wikibase:label { bd:serviceParam wikibase:language "en,mul" } .
147+ }
148+ ORDER BY DESC(?count)
149+ ```
150+
151+ This returns the following list of Wikidata items with more than one ChEMBL ID:
152+
153+ <table >
154+ <tr >
155+ <td><b>itemLabelURL</b></td>
156+ <td><b>count</b></td>
157+ <td><b>sample1</b></td>
158+ <td><b>sample2</b></td>
159+ <td><b>exception</b></td>
160+ </tr >
161+ <tr >
162+ <td>http://www.wikidata.org/entity/Q425293</td>
163+ <td>2</td>
164+ <td>https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL3039598/</td>
165+ <td>https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL3306578/</td>
166+ <td>false</td>
167+ </tr >
168+ <tr >
169+ <td>http://www.wikidata.org/entity/Q4008670</td>
170+ <td>2</td>
171+ <td>https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL2103975/</td>
172+ <td>https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL264186/</td>
173+ <td>false</td>
174+ </tr >
175+ <tr >
176+ <td>http://www.wikidata.org/entity/Q417219</td>
177+ <td>2</td>
178+ <td>https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL2079587/</td>
179+ <td>https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL3182301/</td>
180+ <td>false</td>
181+ </tr >
182+ <tr >
183+ <td>http://www.wikidata.org/entity/Q75830</td>
184+ <td>2</td>
185+ <td>https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL18041/</td>
186+ <td>https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL28992/</td>
187+ <td>false</td>
188+ </tr >
189+ <tr >
190+ <td>http://www.wikidata.org/entity/Q422301</td>
191+ <td>2</td>
192+ <td>https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL1201488/</td>
193+ <td>https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL3039582/</td>
194+ <td>false</td>
195+ </tr >
196+ <tr >
197+ <td>http://www.wikidata.org/entity/Q420532</td>
198+ <td>2</td>
199+ <td>https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL3039593/</td>
200+ <td>https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL553025/</td>
201+ <td>false</td>
202+ </tr >
203+ <tr >
204+ <td>http://www.wikidata.org/entity/Q7784695</td>
205+ <td>2</td>
206+ <td>https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL10247/</td>
207+ <td>https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL298827/</td>
208+ <td>false</td>
209+ </tr >
210+ <tr >
211+ <td>http://www.wikidata.org/entity/Q417227</td>
212+ <td>2</td>
213+ <td>https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL1286/</td>
214+ <td>https://www.ebi.ac.uk/chembl/compound_report_card/CHEMBL150361/</td>
215+ <td>false</td>
216+ </tr >
217+ <tr><td colspan="2"><a href="sparql/P592SingleValue.code.html">sparql/P592SingleValue.rq</a></td></tr>
218+ </table >
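
The core of the query above (skip deprecated statements, then keep items with more than one remaining identifier) can likewise be sketched in Python. This is a simplified illustration that leaves out the formatter URL and the exception lookup. The function name `items_with_multiple_values`, the rank strings, and the `Q0` statements are assumptions made for the example:

```python
from collections import defaultdict


def items_with_multiple_values(statements):
    """Return {item: sorted values} for items with more than one non-deprecated value."""
    values_by_item = defaultdict(set)
    for item, value, rank in statements:
        # Mirrors FILTER( ?rank != wikibase:DeprecatedRank ) in the query.
        if rank == "deprecated":
            continue
        values_by_item[item].add(value)
    return {
        item: sorted(values)
        for item, values in values_by_item.items()
        if len(values) > 1
    }


if __name__ == "__main__":
    statements = [
        # Identifiers from one row of the table above; the ranks are assumed.
        ("Q75830", "CHEMBL18041", "normal"),
        ("Q75830", "CHEMBL28992", "normal"),
        # Hypothetical item: its second identifier is deprecated, so it is not reported.
        ("Q0", "CHEMBL0", "normal"),
        ("Q0", "CHEMBL00", "deprecated"),
    ]
    for item, values in items_with_multiple_values(statements).items():
        print(item, len(values), ", ".join(values))
```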
219+
19220## References
20221
21222