Configuration: Add Cuneiform corpora Achemenet and BALT #14

Merged
merged 36 commits into config/master from config/achemenet
Mar 14, 2025

Conversation

janiemi

@janiemi janiemi commented Mar 10, 2025

Aleksi Sahala and others added 30 commits June 22, 2021 12:53
Merge Achemenet definitions from old (5.0.10-based) branch
corp-achemenet into config/achemenet based on the current config/master.
The configuration is yet to be converted to Korp 9.

Technically, corp-achemenet was first merged into config/achemenet with
strategy "ours" and the merge commit was then amended with the relevant
changes, to avoid major merge conflicts.

* origin/corp-achemenet:
  achemenet fix
  achement beta
(Committed by Jyrki Niemi on behalf of Aleksi Sahala.)
Achemenet:
- Move corpus definitions after those of Oracc 2021, as they use the
  attribute lists for Oracc 2021.
Achemenet:
- Rename subcorpora: YOS -> YOS 7, Jursa -> Bel-remanni, CT -> CT 55,
  and similarly the corpus ids.
app/modes/other_languages_mode.js:
- attrlist.oracc2021.url: Change the label to "link_to_original", so
  that it is logical also in Achemenet.
Achemenet: Add correct URNs for location, metadata and info
page (resource group page), with the latter two commented out, waiting
for the actual pages to be created.
Update config/achemenet with changes made to config/master.

* config/master: (74 commits)
  Syntax fix
  Remove defunct links to the korp.csc.fi domain
  Point links away from defunct Annlab
  Rename korpBackendServer to kielipankkiBaseAddress
  KP-9173 Actually set the value of korpBackendURL!
  KP-9173 Use the backend of this particular instance
  KP-9048 Update reittidemo video path
  Rename ignore_between_tokens_cqp -> ignoreBetweenTokensCQP
  Oracc 2021: Description: Add link to the list of texts
  ScotsCorr: Fix label for text_scripttype2
  ScotsCorr: Widen token box with word list and dict links
  ScotsCorr word list: Add English and Swedish translations
  ScotsCorr word list: Adjust list style
  ScotsCorr word list: Scroll words only, not heading
  ScotsCorr: Add word list modal styles
  custom/extended.js: scotscorrWord: Fix selecting words
  custom/extended.js: scotscorrWord: Define func s.inputChange
  custom/extended.js: scotscorrWord: var -> let/const
  custom/extended.js: scotscorrWord: Use template literals
  custom/extended.js: scotscorrWord: Fix subtree icons
  ...
app/modes/other_languages_mode.js
- Achemenet:
  - Add a longer corpus description.
  - Uncomment metadata URN.
  - Update licence info to use CC BY 4.0 and to link to corpus-specific
    licence URN.
  - Adjust IPR holder name.
  - Reformat "contents" to list one corpus per line.
app/modes/other_languages_mode.js:
- Achemenet: Update structural attributes: add those not in Oracc 2021
  (text_cdilink, text_date, text_archive, text_id, sentence_translation,
  sentence_id) and remove those in Oracc 2021 but not in Achemenet
  (text_photo, text_copy, text_accessionno, text_excavation).

app/translations/corpora-en.json,
app/translations/corpora-fi.json:
- Add translations for "cdli_link" and "archive".
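The attribute change above amounts to deriving the Achemenet structural attribute list from the Oracc 2021 one by adding some attributes and removing others. A minimal sketch of that idea follows, assuming attribute lists are plain objects keyed by attribute name. The helper `deriveAttrs`, the variable names, the shared attribute `text_provenance` and all labels other than "cdli_link" and "archive" are hypothetical and are not taken from the actual configuration.

```javascript
// Hypothetical stand-in for the Oracc 2021 structural attribute list.
// The real definitions live in app/modes/other_languages_mode.js.
const oracc2021StructAttrs = {
    text_provenance: { label: "provenance" }, // hypothetical shared attribute
    text_photo: { label: "photo" },
    text_copy: { label: "copy" },
    text_accessionno: { label: "accessionno" },
    text_excavation: { label: "excavation" },
};

// Hypothetical helper: copy `base`, add the attributes in `add`, drop
// those named in `remove`. The base list is left unmodified.
function deriveAttrs(base, { add = {}, remove = [] } = {}) {
    const result = { ...base, ...add };
    for (const name of remove) {
        delete result[name];
    }
    return result;
}

const achemenetStructAttrs = deriveAttrs(oracc2021StructAttrs, {
    add: {
        text_cdilink: { label: "cdli_link" },
        text_date: { label: "date" },
        text_archive: { label: "archive" },
        text_id: { label: "id" },
        sentence_translation: { label: "translation" },
        sentence_id: { label: "sentence_id" },
    },
    remove: ["text_photo", "text_copy", "text_accessionno", "text_excavation"],
});

console.log(Object.keys(achemenetStructAttrs));
```

Copying the shared list instead of editing it in place keeps the Oracc 2021 definitions untouched, which matters because the Achemenet definitions are placed after them and reuse them.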
app/modes/other_languages_mode.js:
- Achemenet: Remove structural attributes text_collection (in Oracc 2021
  but not in Achemenet) and sentence_translation (was removed from the
  data).
app/modes/other_languages_mode.js:
- Achemenet: Replace positional attribute url with structural attribute
  text_url ("link to original").
janiemi added 6 commits March 3, 2025 10:19
app/modes/other_languages_mode.js:
- Fix to remove structural attributes text_url and text_empty from BALT,
  not Achemenet, as was incorrectly done.
Update config/achemenet with changes made to config/master, to be used
in the configuration of Achemenet and/or BALT.

* config/master:
  modes/common.js: funcs.addCorpusSettings: Support multiple {}
  modes/common.js: funcs.addCorpusSettings: Modernize code
  modes/common.js: funcs.setAttrOrder: Test if attr exists
  modes/common.js: funcs.setAttrOrder: Modernize code
app/modes/other_languages_mode.js:
- Achemenet: Modify as suggested by Tero Alstola:
  - Description: Add link to a publication with more information.
  - Subcorpus descriptions: Add information on the original
    publications.
  - Edit the title of achemenet_belremanni.
  - Order subcorpora alphabetically.
app/modes/other_languages_mode.js:
- BALT: Modify as suggested by Tero Alstola:
  - Description: Add link to a publication with more information.
  - Add University of Helsinki as the IPR holder.
  - Subcorpus descriptions: Add information on the original
    publications.
  - Modify the titles of subcorpora to refer to the original
    publication.
app/modes/other_languages_mode.js:
- Achemenet and BALT: Use the same order for attributes (both positional
  and structural).
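One way to give two corpus families the same attribute order is to sort each attribute list against a single shared order. The sketch below illustrates that approach only; it is not the `funcs.setAttrOrder` implementation from `modes/common.js`, whose signature is not shown here. The helper `orderAttrs`, the order list and the sample attribute sets are hypothetical.

```javascript
// Own-property check, so that names such as "toString" are not
// mistaken for attributes.
const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

// Hypothetical helper: return a copy of `attrs` whose keys follow
// `order`. Names in `order` that are missing from `attrs` are skipped;
// attributes not listed in `order` are appended in their original order.
function orderAttrs(attrs, order) {
    const ordered = {};
    for (const name of order) {
        if (has(attrs, name)) {
            ordered[name] = attrs[name];
        }
    }
    for (const [name, def] of Object.entries(attrs)) {
        if (!has(ordered, name)) {
            ordered[name] = def;
        }
    }
    return ordered;
}

// Illustrative data only: the real attribute sets differ.
const sharedOrder = ["text_id", "text_date", "text_archive"];
const achemenetAttrs = { text_archive: {}, text_id: {}, text_cdilink: {}, text_date: {} };
const baltAttrs = { text_date: {}, text_id: {} };

console.log(Object.keys(orderAttrs(achemenetAttrs, sharedOrder)));
console.log(Object.keys(orderAttrs(baltAttrs, sharedOrder)));
```

Skipping names that a corpus does not define lets one order list serve both corpora even when their attribute sets differ.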
Collaborator

@mmatthiesencsc mmatthiesencsc left a comment

Jyrki, sorry for being slow with this. I tried "ana" and noticed that the CDLI links do not work. Not really our problem, but if the server is down, maybe we should not provide them?
Not much to add, approving.

https://www.kielipankki.fi/staging/korp/jn/achemenet/?mode=other_languages#?corpus=achemenet_ct55,achemenet_belremanni,achemenet_murashu,achemenet_strassmaier,achemenet_yos7,balt_everling,balt_hackl_briefdossier,balt_hackl_privatbriefe,balt_levavi,balt_waerzeggers&lang=en&cqp=%5B%5D&search=word%7Ca&page=0
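The staging link above keeps the mode in the query string and the search state (selected corpora, language, query) in the URL fragment. As a rough sketch of how that state can be read back, for example to check which subcorpora a review link covers, the snippet below uses the standard `URL` and `URLSearchParams` APIs. The function name `parseKorpUrl` is hypothetical, and the parameter layout is inferred from this one link, not from Korp documentation.

```javascript
const reviewUrl =
    "https://www.kielipankki.fi/staging/korp/jn/achemenet/?mode=other_languages" +
    "#?corpus=achemenet_ct55,achemenet_belremanni,achemenet_murashu," +
    "achemenet_strassmaier,achemenet_yos7,balt_everling," +
    "balt_hackl_briefdossier,balt_hackl_privatbriefe,balt_levavi," +
    "balt_waerzeggers&lang=en&cqp=%5B%5D&search=word%7Ca&page=0";

// Hypothetical helper: split a link of this shape into its parts.
function parseKorpUrl(href) {
    const url = new URL(href);
    // The fragment looks like "#?corpus=...&lang=en"; strip the leading
    // "#?" so the rest can be parsed like a query string.
    const hashParams = new URLSearchParams(url.hash.replace(/^#\??/, ""));
    return {
        mode: url.searchParams.get("mode"),
        corpora: (hashParams.get("corpus") ?? "").split(",").filter(Boolean),
        lang: hashParams.get("lang"),
        cqp: hashParams.get("cqp"),
        search: hashParams.get("search"),
    };
}

const state = parseKorpUrl(reviewUrl);
console.log(state.mode, state.corpora.length, state.search);
```

For this link the parsed state lists five `achemenet_` and five `balt_` subcorpora in the `other_languages` mode.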

@mmatthiesencsc
Collaborator

CDLI started working again.

@mmatthiesencsc mmatthiesencsc merged commit 0376f9a into config/master Mar 14, 2025
1 check passed
@mmatthiesencsc mmatthiesencsc deleted the config/achemenet branch March 14, 2025 16:53
3 participants