I finished setting this up.
The authority file is at project-documentation/DHARMA_glyphTaxonomy.xml.
The corresponding display is at https://dharmalekha.info/glyphs. The contents of the taxonomy were extracted from the data file I have used so far. If you scroll down to the bottom of the Web page, you will see usage examples (and test data).
There is also a test file in test/DHARMA_INSTestGaiji.xml. The corresponding display is at https://dharmalekha.info/texts/INSTestGaiji. You can use it to see how the display behaves depending on the encoding of <g>.
The remaining problems are:
(1) I have not understood the conventions regarding the contents of <g> and the mapping property from the taxonomy. As far as I'm concerned, what matters is that (a) the contents of <g> are what should be displayed when the edition is rendered; (b) if the <g> is empty, the value of the corresponding mapping is displayed instead. This is how <g> is supposed to behave in TEI.
(2) I have set things up in such a way that both the old encoding and the new one remain valid, to ensure a smooth transition. But the use of <g type="pc"> creates a conflict: a symbol named pc is heavily used in Tamil inscriptions. I suggest we use <g type="punctuation"> to avoid the issue.
(3) I am very wary of applying global modifications to existing files, because (a) the potential for errors is massive; (b) people have many files that are not public yet and that the global update would ignore; (c) I can't preserve the formatting of the original files. This is why I have set things up in such a way that the old encoding and the new one can coexist. Backwards compatibility should be a primary goal at this stage.
(4) A less important point: the prefix "tax:" doesn't say anything about the type of data that is being referred to ("taxonomy" of what?), so "sym:", "gly:", or something similar would be a better fit.
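
To make the display rule from point (1) concrete, here is a minimal sketch in Python. It is not the actual rendering code. The MAPPING table and the render_g function are hypothetical names, the mapping values are invented for illustration, and the lookup by the type attribute is an assumption about how a <g> is tied to its taxonomy entry.

```python
import xml.etree.ElementTree as ET

# Hypothetical excerpt of the taxonomy: glyph name -> "mapping" value.
# The real values live in DHARMA_glyphTaxonomy.xml; these are made up.
MAPPING = {"ddanda": "॥", "danda": "।"}


def render_g(g, mapping=MAPPING):
    """Return the text to display for a <g> element.

    Rule (a): if <g> has contents, display them as-is.
    Rule (b): if <g> is empty, display the taxonomy mapping instead.
    Assumption: the glyph name is carried by the "type" attribute.
    """
    content = "".join(g.itertext()).strip()
    if content:
        return content
    # Unknown or missing glyph names fall back to an empty string.
    return mapping.get(g.get("type", ""), "")


filled = ET.fromstring('<g type="ddanda">||</g>')
empty = ET.fromstring('<g type="ddanda"/>')

print(render_g(filled))  # contents of <g> win
print(render_g(empty))   # empty <g>: the mapping is used
```

With this rule, encoders who care about a particular rendering put it inside <g>, and everyone else gets the default from the taxonomy.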