Commit 0376f9a

janiemi and Aleksi Sahala (asahala) authored
Configuration: Add Cuneiform corpora Achemenet and BALT (#14)
* test
* add defs for oracc 2021
* debugging oracc
* achement beta
* achemenet fix
* Achemenet (beta 2.0): Add YOS, Jursa, CT, Strassmaier
  (Committed by Jyrki Niemi on behalf of Aleksi Sahala.)
* Achemenet: No space between property name and ":"
* Achemenet: Convert definitions to Korp 9
* Achemenet: Move corpus defs after Oracc 2021
  Achemenet:
  - Move corpus definitions after those of Oracc 2021, as they use the attribute lists for Oracc 2021.
* Achemenet: Move to Cuneiform folder
* Achemenet: Add beta status
* Achemenet: Rename some subcorpora
  Achemenet:
  - Rename subcorpora: YOS -> YOS 7, Jursa -> Bel-remanni, CT -> CT 55, and similarly the corpus ids.
* Achemenet: Comment out metadata links leading to Oracc info
* Achemenet: Add suffix "(Achemenet)" to subcorpus titles
* attrlist.oracc2021.url: Label oracc_url -> link_to_original
  app/modes/other_languages_mode.js:
  - attrlist.oracc2021.url: Change the label to "link_to_original", so that it is logical also in Achemenet.
* corpora-en.json: Add translation for oracc_pos_subcatetory
* Achemenet: Add corpus alias "achemenet"
* Achemenet: Add correct URNs (commented-out)
  Achemenet: Add correct URNs for location, metadata and info page (resource group page), with the latter two commented out, waiting for the actual pages to be created.
* Achemenet: Update corpus description, licence
  app/modes/other_languages_mode.js
  - Achemenet:
    - Add a longer corpus description.
    - Uncomment metadata URN.
    - Update licence info to use CC BY 4.0 and to link to corpus-specific licence URN.
    - Adjust IPR holder name.
    - Reformat "contents" to list one corpus per line.
* Achemenet: Slightly update subcorpus titles, descriptions
* Achemenet: Update structural attributes
  app/modes/other_languages_mode.js:
  - Achemenet: Update structural attributes: add those not in Oracc 2021 (text_cdilink, text_date, text_archive, text_id, sentence_translation, sentence_id) and remove those in Oracc 2021 but not in Achemenet (text_photo, text_copy, text_accessionno, text_excavation).
  app/translations/corpora-en.json, app/translations/corpora-fi.json:
  - Add translations for "cdli_link" and "archive".
* Achemenet: Remove text_collection, sentence_translation
  app/modes/other_languages_mode.js:
  - Achemenet: Remove structural attributes text_collection (in Oracc 2021 but not in Achemenet) and sentence_translation (was removed from the data).
* Achemenet: text_cdlilink: Hide in extended, statistics
* Achemenet: Move link to original to structural attribute
  app/modes/other_languages_mode.js:
  - Achemenet: Replace positional attribute url with structural attribute text_url ("link to original").
* Achemenet: Add structural attribute text_empty
* Achemenet: Description: Add line breaks, link to ANEE
* Other languages: BALT: Add preliminary corpus configuration
* BALT, Achemenet: Fix to remove attrs from BALT, not Achemenet
  app/modes/other_languages_mode.js:
  - Fix to remove structural attributes text_url and text_empty from BALT, not Achemenet, as was incorrectly done.
* Achemenet: Extend (sub)corpus descriptions
  app/modes/other_languages_mode.js:
  - Achemenet: Modify as suggested by Tero Alstola:
    - Description: Add link to a publication with more information.
    - Subcorpus descriptions: Add information on the original publications.
    - Edit the title of achemenet_belremanni.
    - Order subcorpora alphabetically.
* BALT: Extend (sub)corpus descriptions
  app/modes/other_languages_mode.js:
  - BALT: Modify as suggested by Tero Alstola:
    - Description: Add link to a publication with more information.
    - Add University of Helsinki as the IPR holder.
    - Subcorpus descriptions: Add information on the original publications.
    - Modify the titles of subcorpora to refer to the original publication.
* Achemenet, BALT: Order attributes
  app/modes/other_languages_mode.js:
  - Achemenet and BALT: Use the same order for attributes (both positional and structural).
* BALT: Rename corpus ids of two subcorpora

---------

Co-authored-by: Aleksi Sahala <[email protected]>
Co-authored-by: Aleksi Sahala <[email protected]>
Co-authored-by: Aleksi Sahala <[email protected]>
1 parent 3f22d2a commit 0376f9a

3 files changed, +275 -1 lines changed
app/modes/other_languages_mode.js

+266 -1
@@ -535,6 +535,57 @@ settings.corpusAliases.oracc_2019_05
     = settings.corpusAliases["oracc-2019-05"];


+settings.corporafolders.cuneiform.achemenet = {
+    title: "Achemenet",
+    description: "Achemenet Babylonian texts – Kielipankki version 2020-12, Korp<br/><br/><a href=\"http://www.achemenet.com/\" target=\"_blank\">The Achemenet project</a> provides transliterations and translations of documents written in the Achaemenid Persian Empire (550–330 BCE).<br/>The Korp version of Achemenet contains the Babylonian cuneiform texts available in Achemenet in December 2020.<br/>The texts have been automatically lemmatized at the <a href=\"https://www.helsinki.fi/en/researchgroups/ancient-near-eastern-empires\">Centre of Excellence in Ancient Near Eastern Empires</a> (University of Helsinki), funded by the Research Council of Finland.<br/><br/>More information about the corpus is available at <a href=\"https://doi.org/10.5281/zenodo.14223709\" target=\"_blank\">https://doi.org/10.5281/zenodo.14223709</a>.",
+    contents: [
+        "achemenet_ct55",
+        "achemenet_belremanni",
+        "achemenet_murashu",
+        "achemenet_strassmaier",
+        "achemenet_yos7",
+    ],
+    info: {
+        metadata_urn: "urn:nbn:fi:lb-2023062101",
+        urn: "urn:nbn:fi:lb-2023062102",
+        licence: {
+            name: "CC BY 4.0",
+            description: "Creative Commons Attribution 4.0 International",
+            urn: "urn:nbn:fi:lb-2025020601",
+        },
+        iprholder: {
+            name: "Achemenet-CNRS",
+            url: "http://www.achemenet.com/",
+        },
+        // TODO: Uncomment when https://www.kielipankki.fi/corpora/achemenet/
+        // is available
+        // infopage_urn: "urn:nbn:fi:lb-2023062103",
+        status: "beta",
+    }
+};
+
+
+settings.corporafolders.cuneiform.balt = {
+    title: "BALT",
+    description: "BALT: Babylonian Administrative and Legal Texts – Kielipankki version 2025-02, Korp<br/><br/>The corpus contains Babylonian cuneiform texts from the Neo-Babylonian, Persian, and Hellenistic periods (c. 626–93 BCE).<br/>More than half of the transliterated texts are legacy data of the late János Everling. The other texts have been transliterated and translated by Johannes Hackl, Bojana Janković, Michael Jursa, Yuval Levavi, Martina Schmidl and Caroline Waerzeggers.<br/>The texts have been automatically lemmatized at the <a href=\"https://www.helsinki.fi/en/researchgroups/ancient-near-eastern-empires\">Centre of Excellence in Ancient Near Eastern Empires</a> (University of Helsinki), funded by the Research Council of Finland.<br/><br/>More information about the corpus is available at <a href=\"https://doi.org/10.5281/zenodo.14186072\" target=\"_blank\">https://doi.org/10.5281/zenodo.14186072</a>.",
+    // contents added later with funcs.addCorpusSettings
+    info: {
+        metadata_urn: "urn:nbn:fi:lb-2025022201",
+        urn: "urn:nbn:fi:lb-2025022609",
+        licence: {
+            name: "CC BY 4.0",
+            description: "Creative Commons Attribution 4.0 International",
+            urn: "urn:nbn:fi:lb-2025022202",
+        },
+        iprholder: {
+            name: "University of Helsinki",
+            url: "https://www.helsinki.fi/",
+        },
+        status: "beta",
+    },
+};
+
+
 /* Helsinki Corpus */

 sattrlist.hc = {
@@ -1365,7 +1416,7 @@ attrlist.oracc2021 = {
         extendedComponent: "structServiceSelect",
     },
     url: {
-        label: "oracc_url",
+        label: "link_to_original",
         type: "url",
         hideExtended: true,
         hideStatistics: true,
@@ -1947,6 +1998,220 @@ settings.corpora.oracc_saao = {
 };


+// Attribute orders in the sidebar for Cuneiform corpora (some
+// attributes are present only in some corpora)
+
+// Positional attributes
+let cuneiformAttrOrder = [
+    "lemma",
+    "normname",
+    "transcription",
+    "pos",
+    "oraccpos",
+    "msd",
+    "translation",
+    "sense",
+    "lang",
+    "autolemma",
+    "autopos",
+    "url",
+];
+
+// Structural attributes
+let cuneiformStructAttrOrder = [
+    "text_cdlinumber",
+    "text_cdlilink",
+    "text_primarypub",
+    "text_url",
+    "text_collection",
+    "text_museumno",
+    "text_accessionno",
+    "text_period",
+    "text_date",
+    "text_datebce",
+    "text_provenience",
+    "text_archive",
+    "text_genre",
+    "text_subgenre",
+    "text_language",
+    "text_empty",
+];
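The two arrays above only list attribute names; how `funcs.setAttrOrder` applies them is not part of this diff. The following is a minimal sketch of one way such a helper could behave, assuming it puts the listed attributes first, skips names a corpus does not have, and keeps any unlisted attributes at the end. `orderAttrs` and the sample data are hypothetical; the real helper may instead mutate its argument or set an explicit order property.

```javascript
// Hypothetical stand-in for funcs.setAttrOrder; not the Korp implementation.
function orderAttrs(attrs, order) {
    const ordered = {};
    // First the attributes named in `order`, skipping ones this corpus lacks
    for (const name of order) {
        if (name in attrs) {
            ordered[name] = attrs[name];
        }
    }
    // Then any attributes not mentioned in `order`, in their original order
    for (const name of Object.keys(attrs)) {
        if (!(name in ordered)) {
            ordered[name] = attrs[name];
        }
    }
    return ordered;
}

// Illustrative data only: "extra" is not in the order list, "normname" is
// in the order list but missing from the corpus.
const exampleAttrs = { pos: {}, extra: {}, lemma: {} };
const exampleOrder = ["lemma", "normname", "pos"];
const orderedNames = Object.keys(orderAttrs(exampleAttrs, exampleOrder));
```

This relies on JavaScript objects keeping insertion order for non-numeric string keys, which is why a single shared order list can serve corpora that each have only a subset of the attributes.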
+
+
+// ACHEMENET
+
+// No additional positional attributes in Achemenet compared with Oracc 2021
+attrlist.achemenet = $.extend(true, {}, attrlist.oracc2021);
+// Remove Oracc 2021 positional attributes not in Achemenet
+delete attrlist.achemenet.url;
+
+funcs.setAttrOrder(attrlist.achemenet, cuneiformAttrOrder);
+
+// Add structural attributes not in Oracc 2021
+sattrlist.achemenet = $.extend(
+    true, {}, sattrlist.oracc2021,
+    {
+        text_cdlilink: {
+            label: "cdli_link",
+            type: "url",
+            urlOpts: {
+                // inLinkSection: true,
+                hideUrl: true,
+                newWindow: true,
+            },
+            hideExtended: true,
+            hideStatistics: true,
+        },
+        text_url: {
+            label: "link_to_original",
+            type: "url",
+            urlOpts: {
+                // inLinkSection: true,
+                hideUrl: true,
+                newWindow: true,
+            },
+            hideExtended: true,
+            hideStatistics: true,
+        },
+        text_date: sattrs.date,
+        text_archive: {
+            label: "archive",
+            extendedComponent: "structServiceSelect",
+        },
+        text_empty: funcs.makeBoolAttr("text_is_empty"),
+        text_id: {
+            label: "text_id",
+            displayType: "hidden",
+        },
+        sentence_id: sattrs.sentence_id_hidden,
+    }
+);
+// Remove Oracc 2021 structural attributes not in Achemenet
+for (let attr of [
+    "text_photo",
+    "text_copy",
+    "text_accessionno",
+    "text_excavation",
+    "text_collection",
+]) {
+    delete sattrlist.achemenet[attr];
+}
+
+funcs.setAttrOrder(sattrlist.achemenet, cuneiformStructAttrOrder);
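The Achemenet attribute lists above follow a copy, extend, delete pattern: deep-copy the Oracc 2021 list, add the attributes Achemenet has in addition, then delete the ones it lacks. A minimal jQuery-free sketch of the same idea is shown here, assuming the definitions are plain data. `deriveAttrs` and the `text_photo` label are hypothetical and only serve the illustration. Unlike `$.extend(true, ...)`, this sketch replaces an existing attribute wholesale instead of merging it recursively.

```javascript
// Hypothetical helper, not part of Korp: derive a new attribute list from a
// base list without modifying the base.
function deriveAttrs(base, additions, removals) {
    // A JSON round trip gives a deep copy of plain data, so nested objects
    // are not shared with the base (it would drop functions, if any)
    const derived = Object.assign(
        JSON.parse(JSON.stringify(base)),
        JSON.parse(JSON.stringify(additions))
    );
    for (const name of removals) {
        delete derived[name];
    }
    return derived;
}

// Toy stand-in for sattrlist.oracc2021 (definitions are illustrative)
const baseStruct = {
    text_period: { label: "oracc_period" },
    text_photo: { label: "photo" },
};
const derivedStruct = deriveAttrs(
    baseStruct,
    { text_archive: { label: "archive" } },
    ["text_photo"]
);
// Changing the derived list must not leak back into the base list
derivedStruct.text_period.label = "changed";
```

The deep copy is what makes the later `delete` calls safe: with a shallow assignment, removing or editing an attribute for one corpus family would also change it for Oracc 2021.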
+
+
+settings.corpora.achemenet_murashu = {
+    id: "achemenet_murashu",
+    title: "Murašû archive (Achemenet)",
+    description: "Murašû archive (Achemenet)<br/><br/>The texts are originally published in BE 8/1 (Clay 1908), BE 9 (Hilprecht and Clay 1898), BE 10 (Clay 1904), CTMMA III (Spar and von Dassow 2000), Entrepreneurs and Empire (Stolper 1985), Istanbul Murašû Texts (Donbaz and Stolper 1997), Joannès 1987, PBS 2/1 (Clay 1912), Stolper 2001, Stolper 2015, TuM 2/3 (Krückmann 1933), and UCP 9/3 (Lutz 1928). For full bibliographical references, see <a href=\"https://doi.org/10.5281/zenodo.14223709\" target=\"_blank\">https://doi.org/10.5281/zenodo.14223709</a>.",
+    context: context.sp,
+    within: within.sp,
+    attributes: attrlist.achemenet,
+    structAttributes: sattrlist.achemenet
+};
+
+settings.corpora.achemenet_yos7 = {
+    id: "achemenet_yos7",
+    title: "YOS 7 (Achemenet)",
+    description: "YOS 7 (Achemenet)<br/><br/>The texts are originally published in Tremayne, Arch. 1925. Records from Erech: Time of Cyrus and Cambyses (538–521 B. C.). Yale Oriental Series, Babylonian Texts 7. New Haven: Yale University Press; London: Milford, Oxford University Press.",
+    context: context.sp,
+    within: within.sp,
+    attributes: attrlist.achemenet,
+    structAttributes: sattrlist.achemenet
+};
+
+settings.corpora.achemenet_belremanni = {
+    id: "achemenet_belremanni",
+    title: "Jursa, Bēl-rēmanni (Achemenet)",
+    description: "Jursa, Das Archiv des Bēl-rēmanni (Achemenet)<br/><br/>The texts are originally published in Jursa, Michael. 1999. Das Archiv des Bēl-rēmanni. Uitgaven van het Nederlands Historisch-Archaeologisch Instituut te Istanbul 86. Istanbul: Nederlands Historisch-Archaeologisch Instituut.",
+    context: context.sp,
+    within: within.sp,
+    attributes: attrlist.achemenet,
+    structAttributes: sattrlist.achemenet
+};
+
+settings.corpora.achemenet_ct55 = {
+    id: "achemenet_ct55",
+    title: "CT 55 (Achemenet)",
+    description: "CT 55 (Achemenet)<br/><br/>The texts are originally published in Pinches, T. G. 1982. Neo-Babylonian and Achaemenid Economic Texts. Cuneiform Texts from Babylonian Tablets in the British Museum 55. London: Trustees of the British Museum.",
+    context: context.sp,
+    within: within.sp,
+    attributes: attrlist.achemenet,
+    structAttributes: sattrlist.achemenet
+};
+
+settings.corpora.achemenet_strassmaier = {
+    id: "achemenet_strassmaier",
+    title: "Strassmaier (Achemenet)",
+    description: "Strassmaier (Cyr, Camb, Dar) (Achemenet)<br/><br/>The texts were originally published by J. N. Strassmaier in Inschriften von Nabonidus, König von Babylon (Leipzig, 1889); Inschriften von Nabuchodonosor, König von Babylon (Leipzig 1889); Inschriften von Cambyses, König von Babylon (Leipzig, 1890); Inschriften von Cyrus, König von Babylon (Leipzig, 1890); and Inschriften von Darius, König von Babylon (Leipzig, 1892–1897).",
+    context: context.sp,
+    within: within.sp,
+    attributes: attrlist.achemenet,
+    structAttributes: sattrlist.achemenet
+};
+
+funcs.addCorpusAliases("achemenet_.*", ["achemenet"]);
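The alias call above pairs a corpus id pattern with a list of alias names. Its implementation is not shown in this commit; the sketch below illustrates one plausible reading, assuming the pattern is a regular expression matched against whole corpus ids and that an alias expands to a comma-separated list of the matching ids. `addAliases`, `aliasMap`, and the sample id list are hypothetical.

```javascript
// Hypothetical sketch; the real funcs.addCorpusAliases may work differently.
function addAliases(aliasMap, corpusIds, idPattern, aliases) {
    // Anchor the pattern so that it must match the whole corpus id
    const re = new RegExp("^(?:" + idPattern + ")$");
    const matching = corpusIds.filter((id) => re.test(id));
    for (const alias of aliases) {
        aliasMap[alias] = matching.join(",");
    }
    return aliasMap;
}

// A few corpus ids taken from this diff
const sampleCorpusIds = ["achemenet_ct55", "achemenet_yos7", "oracc_saao"];
const aliasMap = addAliases({}, sampleCorpusIds, "achemenet_.*", ["achemenet"]);
```

Under these assumptions, a single short alias selects every Achemenet subcorpus without listing the ids by hand, and newly added `achemenet_*` corpora are picked up automatically.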
+
+
+// BALT
+
+// Same positional attributes as in Achemenet
+attrlist.balt = attrlist.achemenet;
+
+// Add structural attributes not in Achemenet
+sattrlist.balt = $.extend(
+    true, {}, sattrlist.achemenet,
+    {
+        text_collection: sattrlist.oracc2021.text_collection,
+        text_accessionno: sattrlist.oracc2021.text_accessionno,
+    }
+);
+// Remove Achemenet structural attributes not in BALT
+for (let attr of [
+    "text_url",
+    "text_empty",
+]) {
+    delete sattrlist.balt[attr];
+}
+
+funcs.setAttrOrder(sattrlist.balt, cuneiformStructAttrOrder);
+
+
+settings.templ.balt = {
+    id: "balt_{}",
+    title: "{} (BALT)",
+    description: "{}<br/>(BALT: Babylonian Administrative and Legal Texts – Kielipankki version 2025-02, Korp)<br/><br/>{}",
+    context: context.sp,
+    within: within.sp,
+    attributes: attrlist.balt,
+    structAttributes: sattrlist.balt,
+};
+
+funcs.addCorpusSettings(
+    settings.templ.balt,
+    [
+        ["everling", "Everling",
+         ["Everling (AnOr 8, CT 49, GCCI 1 & 2, Nbk, TuM 2/3, UCP 9/1 & 9/3, VS 3, YOS 17)",
+          "The texts have been transliterated by János Everling. The corpus includes texts published in AnOr 8 (Pohl 1933), CT 49 (Kennedy 1968), GCCI 1-2 (Dougherty 1923, 1933), Nbk (Strassmaier 1889), TuM 2/3 (Krückmann 1933), UCP 9/1 (Lutz 1927), UCP 9/3 (Lutz 1928), UCP 9/12 (Lutz 1931), VS 3 (Ungnad 1907), and YOS 17 (Weisberg and Dougherty 1980). For full bibliographical references, see <a href=\"https://doi.org/10.5281/zenodo.14186072\">https://doi.org/10.5281/zenodo.14186072</a>."]],
+        ["hackl_briefdossier", "Hackl et al., Briefdossier des Šumu-ukīn",
+         ["Hackl, Jankovic & Jursa, Briefdossier des Šumu-ukīn (KASKAL 8)",
+          "The texts are originally published in Hackl, Johannes, Bojana Janković, and Michael Jursa. 2011. “Das Briefdossier des Šumu-ukīn.” KASKAL 8: 177–221."]],
+        ["hackl_privatbriefe", "Hackl et al., Spätbabylonische Privatbriefe",
+         ["Hackl, Jursa & Schmidl, Spätbabylonische Privatbriefe (AOAT 414/1)",
+          "The texts are originally published in Hackl, Johannes, Michael Jursa, and Martina Schmidl. 2014. Spätbabylonische Privatbriefe. With contributions by Klaus Wagensonner. Alter Orient und Altes Testament 414/1. Münster: Ugarit-Verlag."]],
+        ["levavi", "Levavi, Administrative Epistolography",
+         ["Levavi, Administrative Epistolography (Dubsar 3)",
+          "The texts are originally published in Levavi, Yuval. 2018. Administrative Epistolography in the Formative Phase of the Neo-Babylonian Empire. Dubsar 3. Münster: Zaphon."]],
+        ["waerzeggers", "Waerzeggers, Marduk-rēmanni",
+         ["Waerzeggers, Marduk-rēmanni (OLA 233)",
+          "The texts are originally published in Waerzeggers, Caroline. 2014. Marduk-rēmanni: Local Networks and Imperial Politics in Achaemenid Babylonia. Orientalia Lovaniensia Analecta 233. Leuven: Peeters."]],
+    ],
+    settings.corporafolders.cuneiform.balt
+);
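The BALT subcorpora are generated from one template whose `id`, `title`, and `description` contain `{}` placeholders, and each entry supplies an id suffix, a title, and two description parts. The implementation of `funcs.addCorpusSettings` is not included in the diff, so the following is only a sketch of the apparent idea, assuming placeholders are filled left to right and the generated ids are appended to the folder's `contents`. `fillPlaceholders`, `expandTemplate`, and the shortened sample strings are hypothetical.

```javascript
// Hypothetical sketch of template expansion; not the actual
// funcs.addCorpusSettings.
function fillPlaceholders(text, values) {
    // Replace each "{}" with the next value, left to right; placeholders
    // without a corresponding value become empty strings
    let i = 0;
    return text.replace(/\{\}/g, () => values[i++] ?? "");
}

function expandTemplate(template, entries, folder) {
    const corpora = {};
    folder.contents = folder.contents || [];
    for (const [idSuffix, title, descriptionParts] of entries) {
        const id = fillPlaceholders(template.id, [idSuffix]);
        corpora[id] = {
            ...template,
            id,
            title: fillPlaceholders(template.title, [title]),
            description: fillPlaceholders(template.description, descriptionParts),
        };
        // Register the generated corpus in its folder
        folder.contents.push(id);
    }
    return corpora;
}

// Shortened sample data modelled on the BALT template in the diff
const sampleTemplate = {
    id: "balt_{}",
    title: "{} (BALT)",
    description: "{}<br/>(BALT)<br/><br/>{}",
};
const sampleFolder = { title: "BALT" };
const expanded = expandTemplate(
    sampleTemplate,
    [["levavi", "Levavi, Administrative Epistolography",
      ["Levavi, Administrative Epistolography (Dubsar 3)", "Published in Levavi 2018."]]],
    sampleFolder
);
```

This would explain why the BALT folder definition carries the comment "contents added later with funcs.addCorpusSettings" instead of an explicit `contents` list.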
+
+funcs.addCorpusAliases("balt_.*", ["balt"]);
+
+
 settings.corpora.ethesis_ru = {
     title: "E-thesis (русский)",
     description: "The University of Helsinki’s Russian E-thesis, Korp Version<br/>Corpus of theses and dissertations (2005–2016)",

app/translations/corpora-en.json

+5
@@ -707,11 +707,16 @@
     "oracc_subgenre": "subgenre",
     "oracc_period": "period",
     "oracc_standardized": "standardized",
+    "oracc_pos_subcategory": "part of speech subcategory",
     "oracc_textlang": "text languages",
     "oracc_lang": "language/dialect",

     "text__geo_provenience": "find location",

+    "cdli_link": "CDLI link",
+    "archive": "archive",
+    "text_is_empty": "text is empty",
+
     "main_section": "main section",
     "sections": "sections",
     "datetime_published": "publication date and time",

app/translations/corpora-fi.json

+4
@@ -892,6 +892,10 @@

     "text__geo_provenience": "löytöpaikka",

+    "cdli_link": "CDLI-linkki",
+    "archive": "arkisto",
+    "text_is_empty": "teksti on tyhjä",
+
     "main_section": "pääosasto",
     "sections": "osastot",
     "datetime_published": "julkaisuajankohta",

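The `label` values in the attribute definitions (`cdli_link`, `archive`, `text_is_empty`) are keys into the translation files changed above. How Korp resolves them at runtime is not shown in this commit; the sketch below illustrates a plain key lookup, assuming a fallback to English and then to the raw key. `translateLabel` and the `translations` object are hypothetical, with the strings copied from the two JSON hunks.

```javascript
// Excerpts of the two translation files touched by this commit
const translations = {
    en: {
        cdli_link: "CDLI link",
        archive: "archive",
        text_is_empty: "text is empty",
    },
    fi: {
        cdli_link: "CDLI-linkki",
        archive: "arkisto",
        text_is_empty: "teksti on tyhjä",
    },
};

// Hypothetical lookup: try the requested language, then English, then
// return the key itself so that a missing translation stays visible
function translateLabel(key, lang) {
    const table = translations[lang] || {};
    return table[key] ?? translations.en[key] ?? key;
}

const finnishArchive = translateLabel("archive", "fi");
const fallbackLink = translateLabel("cdli_link", "sv");
const missingKey = translateLabel("unknown_key", "fi");
```

Under this reading, a label key added to an attribute definition without a matching entry in each translation file would surface as the raw key in the sidebar, which is why the commit adds the three keys to both files.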