Added i18n component and related scripts #1082
base: master
…plits the text into manageable chunk
…xt handling and error reporting
…toring segment handling to use arrays for better management
…ling in Promise.all
…ity and consistency
… merging logic and remove unnecessary debug logs
…ager version in package.json
…lations inside SCHEME and SCHEMEINLINE tags
…ded try catch when parsing xml to json to report failed parsing possibly attributed to unsound xml structure.
- Created a new XML file for references (97references97.xml) containing a comprehensive list of references used in the SICP JS project.
- Added a new XML file for the index preface (98indexpreface98.xml) to provide context and formatting for the index section.
- Introduced a new XML file for the making section (99making99.xml) detailing the background, interactive features, and development history of the SICP JS project.
- Updated subsection2.xml to close the previously open SUBSECTION tag and include a comment for clarity.
…LINE tags in XML parsing
Breaking changes: the xml directory is now divided into per-language folders, currently en and cn, to store translated content. The same applies to the json folder after running "yarn json". The frontend needs to be changed accordingly to fetch JSON files from the corresponding URL.
LGTM!
I did not look at the contents of the zh_CN folder. I assume they are generated by AI? Or was there manual editing? If the former, please add a gitattributes file so that reviewers (and GitHub) know it's machine generated.
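For reference, a minimal sketch of such a file. The path pattern is an assumption (it supposes the machine-translated XML lives under xml/zh_CN/) and is not taken from this PR. GitHub's Linguist recognizes the linguist-generated attribute; files marked with it are collapsed by default in diffs and excluded from language statistics.

```gitattributes
# Assumed location of the machine-translated XML; adjust the pattern as needed.
xml/zh_CN/** linguist-generated=true
```

If the translations end up on a separate branch instead, the same file would need to be committed on that branch for GitHub to apply it there.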
- name: Clone translated_xmls
  run: |
    git clone -b translated_xmls https://github.com/source-academy/sicp.git translated_xmls
    mv translated_xmls/* xml/
    rm -r translated_xmls
Why are we cloning from a specific branch instead of merging it to the default branch?
Could you explain what these new workflows are for?
This workflow triggers translation of the English XMLs that have changed. It is not set to run on pushes yet because the AI translation is not very stable.
The translate-everything workflow is for translating all English XMLs.
echo API_KEY=${{ secrets.OPENAI_KEY }} >> .env
# echo API_KEY=${{ secrets.OPENAI_KEY2 }} >> .env
If they are no longer used, the secret should be deleted.
At the start of the project, Yi Hao and I were each given an API key, so there are two secrets. One is now used as a backup.
echo API_KEY=${{ secrets.OPENAI_KEY }} >> .env
# echo API_KEY=${{ secrets.OPENAI_KEY2 }} >> .env
Ditto.
All the AI-translated XMLs will go into the translated_xmls branch, so I will delete the zh_CN folder. Should I add a gitattributes file to the translated_xmls branch to mark them as generated by AI?
How is the zh_CN folder different from the translated_xmls branch? Why not just merge the translations directly to master instead of using a separate branch?
They are the same. The translated_xmls branch was created separately because the translation workflows (translate-changed, translate-everything) use a deploy action, which pushes files to a branch, to push the AI-generated translations to GitHub.
I see, noted. What is the rationale for running it in a workflow, as opposed to running it locally?
sicp.sourceacademy.org is a static site, so the translated XMLs need to be deployed too, which is why we store them on GitHub. This is not really about the translation workflows: the translator can be run locally and the generated content then pushed manually to the translated_xmls branch.
@yihao03 I still don't understand why you removed the Just run the command locally and push it to master branch? In my view, if it's going to be deployed, then it should be in the master branch, not some other branch.
github_token: ${{ secrets.GITHUB_TOKEN }}
publish_dir: ./translation_output
force_orphan: false # leave the possibility for direct modification on translated xmls
If you do this, it will overwrite the deployment, no?
Are these files autogenerated? What is the ai_files folder used for?
Hi @RichDom2185, I have been out of the loop for a while, but I believe ai_files holds files for the AI to refer to. It hasn't been properly implemented or put to use yet; the plan is for it to have a folder structure that mirrors the xml/en directory, where each file contains specific terms/instructions for translating the corresponding source file.
Based on our discussions with Prof @martin-henz, we agreed that users should not edit AI-generated output directly to correct mistakes, as the changes would be overwritten when we regenerate the output using AI. If the translations need any amendment or improvement, it should be done by editing the prompt given to the AI model, which could be read from these files.
Hi, this was a design decision made by @coder114514 as he was in charge of the deployment workflow, while I mainly worked on the translation logic. However, I do agree with you that
run yarn trans to start translation