Draft
176 commits
10a9e58
Make logger use stderr (#5474)
mbollmann Jul 18, 2025
1479c56
Merge branch 'master' into python-dev
mbollmann Aug 9, 2025
ee78650
Add ORCID field to Person
mbollmann Jul 18, 2025
309ad65
Add new fields to NameSpec and Person, add check for verified IDs
mbollmann Aug 3, 2025
5b9ca84
Removed outdated special case when slugifying
mbollmann Aug 3, 2025
aa2411a
Switch from name_variants.yaml to people.yaml & new name resolution l…
mbollmann Aug 3, 2025
248009d
Transition test data & fix tests outside of personindex_test.py
mbollmann Aug 3, 2025
3471119
Remove tests for get_or_create_person, fix remaining ones
mbollmann Aug 3, 2025
7ee6ac8
Refactor get_or_create_person to resolve_namespec, refactor exceptions
mbollmann Aug 3, 2025
983bbc8
Refactor exceptions (again), add checks for ORCID on NameSpecification
mbollmann Aug 3, 2025
44ae702
Add ORCID validation (incl. checksum)
mbollmann Aug 3, 2025
f287f16
Add integration test for PersonIndex, currently expected to fail
mbollmann Aug 3, 2025
2c11c80
Bump Codecov action to v5
mbollmann Aug 8, 2025
a582685
Add by_orcid, rename name_to_ids to by_name
mbollmann Aug 8, 2025
e5511f4
Disallow person IDs starting with numbers
mbollmann Aug 8, 2025
5fe6470
Add tests for name resolution logic
mbollmann Aug 9, 2025
bfc8bbc
Increase test coverage, fix small bug (checked for wrong exception)
mbollmann Aug 9, 2025
4d7b39b
Add function & tests for ingestion logic
mbollmann Aug 9, 2025
59ef07b
Update CHANGELOG
mbollmann Aug 9, 2025
5f20fc2
Refactor Person.names to store if NameLink was EXPLICIT or INFERRED
mbollmann Aug 9, 2025
fb64a67
Add save functionality for people.yaml
mbollmann Aug 9, 2025
e65733f
Let changes to Person automatically update PersonIndex
mbollmann Aug 10, 2025
6f6bbe2
Add Person.make_explicit + more people.yaml saving tests
mbollmann Aug 10, 2025
e5330c8
Refactor PersonIndex tests & add check for duplicate ORCIDs
mbollmann Aug 10, 2025
cbecb5a
Move PersonIndex fields behind getters that auto-load data
mbollmann Aug 10, 2025
157e313
Add Person.update_id
mbollmann Aug 10, 2025
43575a4
Automatically call ingest_namespec() on create_ functions
mbollmann Aug 10, 2025
61cedf7
Add PersonIndex.create_person()
mbollmann Aug 10, 2025
7ac5e12
Update documentation (WIP)
mbollmann Aug 10, 2025
10a317f
Change slugs_to_verified_ids to contain sets (fixes bug with IDs bein…
mbollmann Aug 11, 2025
7a95834
Implement caching for PersonIndex
mbollmann Aug 13, 2025
e83b5f5
Revert "Implement caching for PersonIndex"
mbollmann Aug 13, 2025
3682487
Improve test coverage & make Person.set_canonical_name private
mbollmann Aug 18, 2025
0bf8332
Update documentation
mbollmann Aug 18, 2025
d4c7d10
Merge pull request #5472 from acl-org/python-author-refactor
mbollmann Aug 19, 2025
5feb4a7
Add MarkupText.from_ & add some string methods
mbollmann Aug 19, 2025
6249922
Use automatic converters for MarkupText attributes, change string met…
mbollmann Aug 19, 2025
66389b2
Update documentation for MarkupText
mbollmann Aug 19, 2025
89db51d
Add Anthology.save_all (partially functional)
mbollmann Aug 19, 2025
2a73ec5
Add hook that tracks modifications to collection objects
mbollmann Aug 19, 2025
dbe3727
Update documentation regarding save_all() functionality
mbollmann Aug 19, 2025
3c82443
Rename EventLinkingType -> EventLink, analogous to NameLink
mbollmann Aug 19, 2025
8681969
Update CHANGELOG
mbollmann Aug 19, 2025
aa4063d
Rename create_person -> create
mbollmann Aug 20, 2025
8455e16
Add VenueIndex.create
mbollmann Aug 20, 2025
8d75005
Fix backslash escape
mbollmann Aug 20, 2025
12d7d16
Fix a few documentation errors
mbollmann Aug 21, 2025
443ca0b
Add tests for VenueIndex/SIGIndex.save and a few pragmas
mbollmann Aug 21, 2025
22b8d6e
Add {Paper|Volume}.namespecs and Person.anthology_items()
mbollmann Aug 21, 2025
26047b7
Refactor update_id -> change_id, add merge_with_explicit
mbollmann Aug 21, 2025
8ff8694
Update CHANGELOG
mbollmann Aug 21, 2025
26a65a2
Fix tests
mbollmann Aug 21, 2025
0463601
Merge branch 'master' into python-dev
mbollmann Aug 21, 2025
fed153e
MarkupText: Make __contains__ check raw and XML text; add __len__
mbollmann Aug 21, 2025
e9eface
Add justfile in main Anthology folder
mbollmann Aug 21, 2025
3a5c31b
Merge branch 'master' into python-dev
mbollmann Sep 20, 2025
37b5b82
Merge branch 'master' into python-dev
mbollmann Sep 21, 2025
1176d76
Hotfix for empty MarkupText serialization
mbollmann Aug 21, 2025
d3620ff
Hotfix for empty MarkupText serialization test case
mbollmann Sep 24, 2025
0b35982
Update XML schema in library
mbollmann Sep 24, 2025
bcf36c6
Hotfix for empty, self-closing markup tags (re #6413)
mbollmann Nov 26, 2025
9711795
Merge branch 'master' into python-dev
mbollmann Nov 26, 2025
e7c2f7a
Fix logging (ensure all Rich output uses the same console)
mbollmann Aug 21, 2025
87e9a53
Update scipy to 1.15.3 (to see if it affects MacOS build failure)
mbollmann Nov 26, 2025
b96a117
Refactor "unverified/PID" to "PID/unverified"
mbollmann Nov 26, 2025
afe58d5
Add htaccess redirect to /unverified URLs
mbollmann Nov 26, 2025
0d17744
Make Hugo content template support /unverified person IDs
mbollmann Nov 26, 2025
d385081
Add script to transition to new author system
mbollmann Nov 28, 2025
1fa5e7f
Update transition script to go through ORCIDs recorded in the XML
mbollmann Nov 30, 2025
b86a31a
Merge branch 'python-dev' into python-api-improvements
mbollmann Nov 30, 2025
037372c
Backport unverified/ -> /unverified change
mbollmann Nov 30, 2025
9ed51ed
Change Poetry install in workflow to ensure same Python version as build
mbollmann Nov 27, 2025
3d1a16f
Fix missing keyword
mbollmann Nov 30, 2025
680bef0
Merge branch 'master' into python-api-improvements
mbollmann Nov 30, 2025
9913376
Merge pull request #5799 from acl-org/python-api-improvements
mbollmann Nov 30, 2025
a70ab4f
Merge branch 'python-dev' into master-new-author-system
mbollmann Nov 30, 2025
a6fc4b9
Fix bugs in transition script, check ORCIDs for validity
mbollmann Nov 30, 2025
a62dd41
Make ORCID validation error include the ORCID
mbollmann Nov 30, 2025
82cc5eb
Make sure ORCID suffixes are lower-cased
mbollmann Nov 30, 2025
5b60717
Make sure ORCID suffixes are lower-cased in person IDs
mbollmann Nov 30, 2025
ea5486a
Fix ORCID logic in transition script by recording ORCIDs for existing…
mbollmann Nov 30, 2025
47ae212
Merge branch 'master' into master-new-author-system
mbollmann Dec 14, 2025
a17fda7
Merge branch 'master' into python-dev
mbollmann Dec 14, 2025
a3de635
Add warnings to display invalid ORCIDs in transition script
mbollmann Dec 15, 2025
b4420ca
Enable people.yaml integration test
mbollmann Aug 21, 2025
5830b37
Fixes for new author system (#6809)
mjpost Dec 16, 2025
4638e6d
Transition data to new author format
mjpost Dec 16, 2025
e33b66c
Remove 􏰃
mjpost Dec 16, 2025
cf919ac
Revert "Transition data to new author format"
mbollmann Dec 16, 2025
91d14b1
Fix ORCID assignment bug in transition script logic
mbollmann Dec 16, 2025
d7256ce
Split Jiatong Li with two ORCID iDs
mjpost Dec 16, 2025
4483617
Merge branch 'master-new-author-system-data-transition' into master-n…
mjpost Dec 16, 2025
146b618
Resolve Li Zhang ORCID iD
mjpost Dec 16, 2025
4893884
Merge pull request #6822 from acl-org/master-new-author-system-more-f…
mjpost Dec 16, 2025
e1c9939
Export orcid in hugo data
mjpost Sep 19, 2025
2c68ced
Move ORCID icon
mjpost Sep 19, 2025
1308c3d
Remove ORCID itself
mjpost Sep 19, 2025
d650dee
Add question mark for unverified authors
mjpost Sep 19, 2025
a95873c
Add verification page stub
mjpost Sep 19, 2025
5307d68
black
mjpost Sep 19, 2025
ed41c0d
Ensmallen; use fontawesome for both
mjpost Sep 19, 2025
67f4f8f
Bump fontawesome version 5.7.2 -> 5.11.0
mjpost Sep 19, 2025
50bfb33
Icon sizing
mjpost Sep 19, 2025
07ce0b3
Update checksum
mjpost Sep 19, 2025
54ba165
Use fontawesome kit
mjpost Sep 19, 2025
ac57e3b
black
mjpost Sep 19, 2025
2dbb39b
Contra all docs everywhere, you need "fab"
mjpost Sep 19, 2025
2eb7946
Switch back to stylesheet
mjpost Sep 19, 2025
c37d7b1
fa-solid -> fas
mjpost Sep 19, 2025
675b6b7
Relativize link and lighten question mark
mjpost Sep 20, 2025
9083f2b
Set opacity for unverified people
mjpost Sep 20, 2025
c516654
reference syntax
mjpost Sep 20, 2025
b46429a
fontawesome hash
nschneid Sep 20, 2025
0ec620c
Transition to new author format
mjpost Dec 16, 2025
57e9cfb
Merge branch 'master-new-author-system' into master-new-author-system-ui
mbollmann Dec 16, 2025
52f6d40
Update bulk metadata form
mjpost Dec 17, 2025
a4d04a8
Fix people pages slug
mbollmann Dec 24, 2025
d41b8b5
Attempt to fix .htaccess rewrite
mbollmann Dec 24, 2025
e311b02
Merge branch 'master-new-author-system' into master-new-author-system-ui
mbollmann Dec 24, 2025
aa9c267
Revert "Transition to new author format"
mjpost Dec 26, 2025
555010b
Merge remote-tracking branch 'origin/master' into master-new-author-s…
mjpost Dec 26, 2025
bcc1a33
Refactor PersonIndex.build() to detect duplicate persons on same item
mbollmann Dec 26, 2025
fbd7cb7
Merge branch 'python-dev' into master-new-author-system
mbollmann Dec 26, 2025
7aa9d12
Merge remote-tracking branch 'origin/master' into master-new-author-s…
mjpost Dec 27, 2025
1a04d92
Transition to new author representation
mjpost Dec 27, 2025
3e21a91
Remove stray 'similar' line
mjpost Dec 27, 2025
d6026ae
Revert "Remove stray 'similar' line"
mjpost Dec 27, 2025
b789663
Revert "Transition to new author representation"
mjpost Dec 27, 2025
7dcbe62
Downgrade error to warning for duplicate unverified persons
mbollmann Dec 27, 2025
0a56f14
Merge branch 'python-dev' into master-new-author-system
mbollmann Dec 27, 2025
db804ee
Merge remote-tracking branch 'origin/master' into master-new-author-s…
mjpost Dec 28, 2025
31cb47f
Transition to new author representation
mjpost Dec 28, 2025
fe2cb90
First pass at documentation
mjpost Dec 29, 2025
4e5fb9b
Add note about name variants
mjpost Dec 29, 2025
589f149
Stub page for names
mjpost Dec 29, 2025
bdced7f
Update orcid.md
nschneid Dec 29, 2025
d0f9745
Merge pull request #6965 from acl-org/nschneid-orcid-info
mjpost Dec 30, 2025
d8ed9ee
black
mjpost Dec 30, 2025
85e6d15
Addressing weird hugo error
mjpost Dec 30, 2025
d6012da
names.yml: blank lines
nschneid Dec 30, 2025
c7f6db3
orcid.md: date
nschneid Dec 30, 2025
8b59e4c
Rename names.yml to names.md
nschneid Dec 30, 2025
bdaa993
Partial commit
mjpost Dec 31, 2025
98782b8
Add "Verifying your account" section"
mjpost Dec 31, 2025
636ea83
single.html: source line break led to space before comma
nschneid Jan 4, 2026
a73c095
orcid.md: no space in OpenReview
nschneid Jan 4, 2026
da1bb1b
verification.md: no space in OpenReview
nschneid Jan 4, 2026
22223c5
corrections.md: no space in OpenReview
nschneid Jan 4, 2026
b6012b6
Add three-way verification display
mjpost Jan 4, 2026
5d54bcd
with -> if
mjpost Jan 4, 2026
7fe19b9
Make icon a light green
mjpost Jan 4, 2026
955ebca
Fiddling with color, size, opacity
mjpost Jan 4, 2026
484114a
Refactor similar names mechanism (e.g. to correctly add /unverified IDs)
mbollmann Jan 6, 2026
4fc9af2
Merge branch 'python-dev' into master-new-author-system
mbollmann Jan 6, 2026
afcd7ca
Handle links to verified & unverified pages separately
mbollmann Jan 6, 2026
70f143f
Merge branch 'master-new-author-system' into master-new-author-system-ui
mbollmann Jan 6, 2026
8506353
Refactor verified check on author page template
mbollmann Jan 6, 2026
900e86e
Refactor and update icon styling
mbollmann Jan 6, 2026
3625997
Spacing: add some daylight between the name and the icon so it doesn'…
nschneid Jan 6, 2026
ce7741f
bootstrap class to set icon margin
nschneid Jan 6, 2026
d297b22
verification.md: a bit more detail on the icons and what verification…
nschneid Jan 7, 2026
10b99e0
header_navbar.html "GitHub" capitalization
nschneid Jan 7, 2026
5139fe4
verification.md: steps for verifying - we want an ORCID iD regardless…
nschneid Jan 7, 2026
c7709ef
orcid.md: OpenReview is one word
nschneid Jan 7, 2026
0fcdc4b
verification.md: fix icons
nschneid Jan 7, 2026
abde567
verification.md: rephrase part about unverified page for ambiguous names
nschneid Jan 7, 2026
893ca21
author page template: no checkmark icon for verification, just ORCID …
nschneid Jan 10, 2026
84063f9
No ? for verified accounts
mjpost Jan 10, 2026
a2c9fd9
Revert "No ? for verified accounts"
mjpost Jan 10, 2026
3d9bc64
Revert "author page template: no checkmark icon for verification, jus…
mjpost Jan 10, 2026
e3d7106
Make question mark green
mjpost Jan 10, 2026
20b83be
Restore paper verification notice in tooltip
mjpost Jan 10, 2026
cea36bd
ORCID iD isn't recorded
mjpost Jan 10, 2026
3f312cc
accidental </a>
nschneid Jan 10, 2026
d2a8772
for author's full display name, only include script variants if verified
nschneid Jan 10, 2026
728f349
Merge pull request #6824 from acl-org/master-new-author-system-ui
mjpost Jan 12, 2026
4 changes: 2 additions & 2 deletions .github/ISSUE_TEMPLATE/02-name-correction.yml
@@ -37,8 +37,8 @@ body:
     attributes:
       label: Author ORCID
       description: |
-        Provide a valid [ORCID](https://orcid.org) link here. This will be used to help match future papers to your Anthology ID.
-      placeholder: ex. https://orcid.org/my-orcid?orcid=0009-0003-8868-7504
+        Provide a valid [ORCID](https://orcid.org) link here, and [add your name variants to your ORCID profile](https://aclanthology.org/info/orcid_id). This will be used to match papers to your Anthology ID.
+      placeholder: ex. https://orcid.org/0009-0003-8868-7504
     validations:
       required: true
   - type: textarea
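
The form above asks for a "valid" ORCID link, and the commit list mentions ORCID validation including a checksum. As a rough illustration of what such a check involves, here is a minimal sketch that accepts either a bare ORCID iD or an `https://orcid.org/` link and verifies the final character with the ISO 7064 MOD 11-2 scheme that ORCID uses. The names `ORCID_RE`, `orcid_check_char`, and `parse_orcid` are hypothetical and are not taken from the `acl_anthology` library, whose actual validation code is not shown in this diff.

```python
import re
from typing import Optional

# Hypothetical pattern: an optional https://orcid.org/ prefix followed by
# four hyphen-separated groups; the last character may be a digit or "X".
ORCID_RE = re.compile(r"^(?:https?://orcid\.org/)?(\d{4}-\d{4}-\d{4}-\d{3}[\dX])$")


def orcid_check_char(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def parse_orcid(value: str) -> Optional[str]:
    """Return the bare ORCID iD if `value` is well-formed and its checksum matches."""
    match = ORCID_RE.match(value.strip())
    if match is None:
        return None
    orcid = match.group(1)
    digits = orcid.replace("-", "")
    if orcid_check_char(digits[:15]) != digits[15]:
        return None
    return orcid


print(parse_orcid("https://orcid.org/0009-0003-8868-7504"))
print(parse_orcid("https://orcid.org/my-orcid?orcid=0009-0003-8868-7504"))
```

Under this sketch the new placeholder parses cleanly, while the old `my-orcid?orcid=` form is rejected because it is not a plain profile link.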
4 changes: 2 additions & 2 deletions .github/ISSUE_TEMPLATE/99-bulk-metadata-correction.yml
@@ -7,11 +7,11 @@ body:
   - type: markdown
     attributes:
       value: >
-        **This form is not meant to be used manually.** Instead, it is activated by clicking the yellow "Fix data" button found on each paper page in the Anthology (e.g., https://aclanthology.org/K17-1003/). Clicking this button displays a UI tool for modifying the title, abstract, and author list. Submission of that form will automatically populate the title above and data block below.
+        **Please do not edit the JSON below.** This form is not meant to be used manually; instead, it is activated by clicking the yellow "Fix data" button found on each paper page in the Anthology (e.g., https://aclanthology.org/K17-1003/). Doing so displays a UI tool for modifying the title, abstract, and author list. Submission of that form will automatically populate the title above and data block below.
   - type: markdown
     attributes:
       value: >
-        Corrections will be processed in bulk on a weekly basis after verification by Anthology staff.
+        Corrections will be manually reviewed by Anthology staff and processed in bulk on a roughly weekly basis.
   - type: textarea
     id: data
     attributes:
2 changes: 1 addition & 1 deletion .github/workflows/code-quality.yml
@@ -74,6 +74,6 @@ jobs:

       # Coverage report
       - name: Upload coverage reports to Codecov
-        uses: codecov/codecov-action@v4
+        uses: codecov/codecov-action@v5
         with:
           token: ${{ secrets.CODECOV_TOKEN }}
6 changes: 5 additions & 1 deletion bin/create_extra_bib.py
@@ -36,6 +36,7 @@
 import msgspec
 from pathlib import Path
 import re
+from rich.console import Console
 from rich.progress import track
 import shutil
 import subprocess
@@ -48,6 +49,7 @@

 BIB2XML = None
 XML2END = None
+CONSOLE = Console(stderr=True)

 # Max shard size in MiB
 MAX_SHARD_MB = 49
@@ -89,6 +91,7 @@ def create_bibtex(builddir, clean=False) -> None:
             reverse=True,
         ),
         description="Create anthology.bib.gz... ",
+        console=CONSOLE,
     ):
         with open(volume_file, "r") as f:
             bibtex = f.read()
@@ -124,6 +127,7 @@ def create_bibtex(builddir, clean=False) -> None:
             reverse=True,
         ),
         description=" +abstracts.bib.gz... ",
+        console=CONSOLE,
     ):
         with open(collection_file, "rb") as f:
             data = msgspec.json.decode(f.read())
@@ -351,7 +355,7 @@ def batch_convert_to_mods_and_endf(bibtex, context):
     )

     log_level = log.DEBUG if args["--debug"] else log.INFO
-    tracker = setup_rich_logging(level=log_level)
+    tracker = setup_rich_logging(console=CONSOLE, level=log_level)

     max_workers = int(args["--max-workers"]) if args["--max-workers"] else None
     if (BIB2XML := shutil.which("bib2xml")) is None:
19 changes: 14 additions & 5 deletions bin/create_hugo_data.py
@@ -39,6 +39,7 @@
 import msgspec
 from omegaconf import OmegaConf
 import os
+from rich.console import Console
 from rich.progress import (
     Progress,
     TextColumn,
@@ -52,6 +53,7 @@
 from acl_anthology.collections.paper import PaperDeletionType
 from acl_anthology.collections.volume import VolumeType
 from acl_anthology.utils.logging import setup_rich_logging
+from acl_anthology.utils.ids import is_verified_person_id
 from acl_anthology.utils.text import (
     interpret_pages,
     month_str2num,
@@ -60,6 +62,7 @@


 BIBLIMIT = None
+CONSOLE = Console(stderr=True)
 ENCODER = msgspec.json.Encoder()
 SCRIPTDIR = os.path.dirname(os.path.realpath(__file__))

@@ -93,7 +96,7 @@ def make_progress():
         TaskProgressColumn(show_speed=True),
         TimeRemainingColumn(elapsed_when_finished=True),
     ]
-    return Progress(*columns)
+    return Progress(*columns, console=CONSOLE)


 @cache
@@ -392,13 +395,19 @@ def export_people(anthology, builddir, dryrun):
                     )
                 if n.script is not None:
                     diff_script_variants.append(n.as_full())
-            if diff_script_variants:
+            if diff_script_variants and is_verified_person_id(person_id):
                 data["full"] = f"{data['full']} ({', '.join(diff_script_variants)})"
             if person.comment is not None:
                 data["comment"] = person.comment
+            if person.orcid is not None:
+                data["orcid"] = person.orcid
             similar = anthology.people.similar.subset(person_id)
-            if len(similar) > 1:
-                data["similar"] = list(similar - {person_id})
+            similar.remove(person_id)
+            if similar_verified := [id_ for id_ in similar if is_verified_person_id(id_)]:
+                data["similar_verified"] = sorted(list(similar_verified))
+                similar.difference_update(similar_verified)
+            if similar:  # any remaining IDs are unverified
+                data["similar_unverified"] = sorted(list(similar))
             people[person_id] = data
             progress.update(task, advance=1)

@@ -567,7 +576,7 @@ def export_anthology(anthology, builddir, clean=False, dryrun=False):
     )

     log_level = log.DEBUG if args["--debug"] else log.INFO
-    tracker = setup_rich_logging(level=log_level)
+    tracker = setup_rich_logging(console=CONSOLE, level=log_level)

     if limit := args["--bib-limit"]:
         BIBLIMIT = int(limit)
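
To make the `export_people` change in this file easier to review, here is a small standalone sketch of the new "similar" handling: the person's own ID is dropped, and the remaining IDs are split into `similar_verified` and `similar_unverified`, each emitted only when non-empty. The stand-in `is_verified_person_id` below is an assumption, based only on the commit messages describing unverified IDs as `PID/unverified`; the real helper in `acl_anthology.utils.ids` may differ. `split_similar` and the example IDs are likewise hypothetical.

```python
from typing import Dict, Iterable, List


def is_verified_person_id(person_id: str) -> bool:
    # Hypothetical stand-in for acl_anthology.utils.ids.is_verified_person_id;
    # assumes unverified IDs carry a "/unverified" suffix.
    return not person_id.endswith("/unverified")


def split_similar(similar: Iterable[str], person_id: str) -> Dict[str, List[str]]:
    """Partition IDs similar to `person_id` into verified and unverified lists."""
    others = set(similar) - {person_id}
    verified = sorted(id_ for id_ in others if is_verified_person_id(id_))
    unverified = sorted(others - set(verified))
    data: Dict[str, List[str]] = {}
    if verified:
        data["similar_verified"] = verified
    if unverified:  # any remaining IDs are unverified
        data["similar_unverified"] = unverified
    return data


example = split_similar(
    {"jane-doe", "jane-doe-cmu", "jane-doe/unverified"}, "jane-doe"
)
print(example)
```

One difference from the diff is deliberate: the sketch uses set difference rather than `set.remove`, so it does not raise if `person_id` is absent from the input.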