Description
Here is a list of things that @Yuying-Jin and I have decided are best handled in post-processing of collation files with XSLT.
-
Solitary witness with only one token of content:
XPath: `//app[count(rdgGrp) = 1][rdgGrp[not(contains(@n, ','))]][count(descendant::rdg) < 3]`
-
Solitary witness holding meaningful content: If it starts a new sentence, move it down. Check whether this witness in a preceding rdgGrp ends with a period and the other witnesses that follow start with a capital letter(?)
- Or reconsider: move all of these down.
As of 2022-10-11, all solitary witnesses are now moved down.
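To make the first XPath above (solitary witness with a single token) easier to check by hand, here is a minimal Python sketch of the same three predicates. It is only an illustration and not the project's XSLT: it assumes the collation file has no namespace, and the `id` attributes and sample readings are invented for the demonstration.

```python
import xml.etree.ElementTree as ET

def solitary_single_token_apps(root):
    """Return <app> elements matching the one-token solitary-witness pattern."""
    hits = []
    for app in root.iter("app"):
        groups = app.findall("rdgGrp")       # child rdgGrp elements only
        if len(groups) != 1:                 # count(rdgGrp) = 1
            continue
        if "," in groups[0].get("n", ""):    # a comma in @n means several tokens
            continue
        if len(list(app.iter("rdg"))) < 3:   # count(descendant::rdg) < 3
            hits.append(app)
    return hits

# Hypothetical sample; the id attributes exist only for this demonstration.
SAMPLE = """<root>
<app id="a1"><rdgGrp n="['the']"><rdg wit="f1831">the </rdg></rdgGrp></app>
<app id="a2"><rdgGrp n="['with', 'my']"><rdg wit="f1831">with my </rdg></rdgGrp></app>
<app id="a3"><rdgGrp n="['a']"><rdg wit="f1818">a </rdg></rdgGrp><rdgGrp n="['an']"><rdg wit="f1831">an </rdg></rdgGrp></app>
</root>"""

matches = solitary_single_token_apps(ET.fromstring(SAMPLE))
print([app.get("id") for app in matches])
```

Only the first sample app satisfies all three conditions: the second has a comma in `@n`, and the third has two `rdgGrp` children.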
-
In the process of consolidating solitary witnesses, deal with this:
Let's try creating a conditional processing rule in the template rule on app with @mode="restructure":
IF the $norm param contains only [''] (string-length() = 4), do NOT create a new rdgGrp; simply move the $loner param into the existing structure.
Example of the problem: these do not need to be two separate rdgGrp elements:
```
<app><rdgGrp n="['with', 'my', 'aunt', 'and', 'my']">
<rdg wit="f1818">with my aunt and my </rdg>
<rdg wit="f1823">with my aunt and my </rdg>
<rdg wit="fThomas">with my aunt and my </rdg>
<rdg wit="fMS">with my aunt & my </rdg>
</rdgGrp>
<rdgGrp n="['', 'with', 'my', 'aunt', 'and', 'my']">
<rdg wit="fMS"><sga-add eID="c56-0104__main__d5e21929"/> with my aunt & my </rdg>
</rdgGrp></app>
```
2022-10-18 Likely solved with 8925989
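The conditional rule proposed above can be sketched in Python as follows. This is not the change made in 8925989, only an illustration of the stated condition. The function name `place_loner` is hypothetical, and the choice to append the loner to the last existing rdgGrp is an assumption about what "the existing structure" means.

```python
import xml.etree.ElementTree as ET

EMPTY_NORM = "['']"  # the four-character value of $norm described above

def place_loner(app, loner, norm):
    """Attach a solitary-witness rdg to app; add a new rdgGrp only if norm has tokens."""
    existing = app.findall("rdgGrp")
    if norm == EMPTY_NORM and existing:
        # Assumption: reuse the last rdgGrp already present in the app.
        existing[-1].append(loner)
    else:
        group = ET.SubElement(app, "rdgGrp", {"n": norm})
        group.append(loner)
    return app

# Hypothetical usage with invented readings.
app = ET.fromstring(
    '<app><rdgGrp n="[\'with\']"><rdg wit="f1818">with </rdg></rdgGrp></app>'
)
loner = ET.Element("rdg", {"wit": "fMS"})
loner.text = "with "
place_loner(app, loner, "['']")
print(len(app.findall("rdgGrp")), len(list(app.iter("rdg"))))
```

Note that the sample in the issue has an `@n` that begins with an empty token but also contains real tokens, so it would need a broader test than this exact-match check.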
-
Ampersand and other special characters generated by nodeToXML() output of longTokens, adds, dels (inlineVariationEvent): `&amp;` or `&quot;`
- This turned out to be a serious problem that might have distorted the collation and its handling of normalized `&` to `and`.
Repaired in 0b02d1c
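As a small illustration of why leftover entity references can interfere with normalizing the ampersand, here is a Python sketch. It is not the repair made in 0b02d1c. The function `normalize_reading` and the toy replacement of `&` with `and` are assumptions made for the demonstration, since the real normalization lives in the collation code.

```python
from xml.sax.saxutils import unescape

def normalize_reading(text):
    """Decode entity references, then apply a toy '&' -> 'and' normalization."""
    # &amp;, &lt; and &gt; are decoded by default; &quot; is added explicitly.
    plain = unescape(text, {"&quot;": '"'})
    return plain.replace("&", "and")

escaped = "with my aunt &amp; my "
# Normalizing without decoding first leaves a mangled entity in the token.
print(repr(escaped.replace("&", "and")))
# Decoding first lets the ampersand normalize cleanly.
print(repr(normalize_reading(escaped)))
```

The point of the comparison is the ordering: entity references need to be resolved before any rule that treats `&` as a word.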
-
For inlineVariation events where we have constructed "long tokens": we now output these as a complete unit from start to end in only one `<rdgGrp>` and `<rdg>`. This means that sometimes the other witnesses split around them awkwardly. We should smooth out the awkwardness with this algorithm:
- If the contents of a "long token" `contains()` a string from the very next `<app>` element (`following::app[1]`), then move the contents of that very next `<app>` up into the current app of the long token, and remove the very next `<app>`.
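The smoothing algorithm above could look roughly like this in Python. It is a sketch under stated assumptions, not the project's XSLT: apps are taken to be siblings under one parent, "a string from the very next app" is read as the trimmed text of any of its `rdg` elements, and long-token apps are identified by a caller-supplied predicate. The `type="longToken"` and `id` attributes in the sample are hypothetical.

```python
import xml.etree.ElementTree as ET

def text_of(elem):
    """Concatenate all text inside an element."""
    return "".join(elem.itertext())

def absorb_following_apps(parent, is_long_token_app):
    """Merge the next <app> into a long-token app when the long token contains its text."""
    apps = parent.findall("app")
    i = 0
    while i < len(apps) - 1:
        current, following = apps[i], apps[i + 1]
        if is_long_token_app(current):
            long_text = text_of(current)
            strings = [text_of(rdg).strip() for rdg in following.iter("rdg")]
            if any(s and s in long_text for s in strings):
                for child in list(following):   # move the rdgGrp children up
                    current.append(child)
                parent.remove(following)        # drop the emptied app
                del apps[i + 1]
        i += 1
    return parent

# Hypothetical sample with invented readings.
SAMPLE = """<root>
<app id="a1" type="longToken"><rdgGrp n="['with my aunt and my cousin']"><rdg wit="fMS">with my aunt and my cousin </rdg></rdgGrp></app>
<app id="a2"><rdgGrp n="['and', 'my']"><rdg wit="f1818">and my </rdg></rdgGrp></app>
<app id="a3"><rdgGrp n="['then']"><rdg wit="f1818">then </rdg></rdgGrp></app>
</root>"""

root = absorb_following_apps(
    ET.fromstring(SAMPLE), lambda app: app.get("type") == "longToken"
)
print([app.get("id") for app in root.findall("app")])
```

This version merges at most one following app per long token in a single pass. Whether merging should cascade further is left open here.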
-
If an app shows one rdgGrp with all witnesses unified on a paragraph marker, but one witness (most likely f1831) actually has not been running in alignment for a long time, move the paragraph marker to the preceding or following app containing that witness. XPath:
`//app[count(rdgGrp) = 1][count(.//rdg) = 5][preceding-sibling::app[1][count(.//rdg) < 5]][matches(rdgGrp/@n, '^\[.<p') and matches(rdgGrp/@n, '>.\]$')]`
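For checking which apps this XPath would select, here is a minimal Python sketch of the same predicates. It is an illustration only: it assumes no namespace and five witnesses in total, and the helper `make_app`, the `id` attributes and the sample readings are invented for the demonstration. It finds the candidate apps but does not move the paragraph marker.

```python
import re
import xml.etree.ElementTree as ET

def rdg_count(app):
    return len(list(app.iter("rdg")))

def misplaced_paragraph_apps(root, full=5):
    """Find unified paragraph-marker apps whose preceding sibling app lacks a witness."""
    hits = []
    for parent in root.iter():
        previous = None                      # nearest preceding sibling app
        for child in parent:
            if child.tag != "app":
                continue
            groups = child.findall("rdgGrp")
            n = groups[0].get("n", "") if groups else ""
            if (len(groups) == 1 and rdg_count(child) == full
                    and previous is not None and rdg_count(previous) < full
                    and re.search(r"^\[.<p", n) and re.search(r">.\]$", n)):
                hits.append(child)
            previous = child
    return hits

WITS = ["f1818", "f1823", "fThomas", "f1831", "fMS"]

def make_app(app_id, n, wits, text):
    """Build a hypothetical one-group app as an XML string."""
    rdgs = "".join(f'<rdg wit="{w}">{text}</rdg>' for w in wits)
    return f'<app id="{app_id}"><rdgGrp n="{n}">{rdgs}</rdgGrp></app>'

P = "['&lt;p/&gt;']"  # escaped form of ['<p/>'] for use inside an attribute
SAMPLE = ("<root>"
          + make_app("b1", "['creature']", [w for w in WITS if w != "f1831"], "creature ")
          + make_app("b2", P, WITS, "&lt;p/&gt;")
          + make_app("b3", P, WITS, "&lt;p/&gt;")
          + "</root>")

print([app.get("id") for app in misplaced_paragraph_apps(ET.fromstring(SAMPLE))])
```

In the sample, only the second app is selected: it unifies all five witnesses on a paragraph marker while the app just before it is missing f1831. The third app is skipped because its preceding sibling already has all five witnesses.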