diff --git a/docs/building-extensions/languages/_assets/home-dashboard-gaelic.png b/docs/building-extensions/languages/_assets/home-dashboard-gaelic.png new file mode 100644 index 00000000..db8e5422 Binary files /dev/null and b/docs/building-extensions/languages/_assets/home-dashboard-gaelic.png differ diff --git a/docs/building-extensions/languages/_assets/jed-check-gaelic.png b/docs/building-extensions/languages/_assets/jed-check-gaelic.png new file mode 100644 index 00000000..3ece7376 Binary files /dev/null and b/docs/building-extensions/languages/_assets/jed-check-gaelic.png differ diff --git a/docs/building-extensions/languages/index.md b/docs/building-extensions/languages/index.md new file mode 100644 index 00000000..4e4cbfb6 --- /dev/null +++ b/docs/building-extensions/languages/index.md @@ -0,0 +1,29 @@ +--- +sidebar_position: 3 +title: Languages +--- + +Languages +========= + +## Language Packs + +Joomla implements multilingual functionality using language packs, the default being British English. Each language pack consists of a number of .ini files for each **core** extension and each client (admin, api or site). There is information on the structure and use of language files in the [Multilingual](../../general-concepts/multilingual/) section of General Concepts in this manual. This section is more concerned with language packs as extensions. + +## Language Pack Documentation + +There are many parts to the Joomla language system, each needing its own specific documentation. For example: + +- Official Language Packs created by translation of the English originals require management code that is not part of the Joomla CMS. +- The actual translation of language packs using Crowdin requires Translator documentation. +- The use of multiple languages in a Joomla site using Associations or Overrides requires User documentation. +- The CMS language debugging feature requires Developer documentation. 
+- Third party components may offer several languages for which developers need some guidance on best practices. +- There may be circumstances where a Language Pack produced by a third party is not an ***Official*** language pack. An example is described for [Scottish Gaelic](../languages/language-extension-example/) in this section of the Manual. + +## References + +- [Making a Language Pack for Joomla](https://docs.joomla.org/J3.x:Making_a_Language_Pack_for_Joomla) + * This article is a little out of date and uses French as an example of how to build a language pack for Joomla 3. +- [Joomla's L10N-hearted](https://magazine.joomla.org/all-issues/august/joomla-s-l10n-hearted) + * This article describes how translation is accomplished using Crowdin. diff --git a/docs/building-extensions/languages/language-extension-example.md b/docs/building-extensions/languages/language-extension-example.md new file mode 100644 index 00000000..c4f2a4a0 --- /dev/null +++ b/docs/building-extensions/languages/language-extension-example.md @@ -0,0 +1,423 @@ +--- +sidebar_position: 5 +title: Joomla Language Extension Example +--- + +Joomla Language Extension Example +================================= + +## Introduction + +Official Joomla language extensions are normally installed via the System → Install → Languages route. However, there may be occasions when it is necessary to install a language extension via the Install → Extensions → Upload & Install route. This example is for Scottish Gaelic with all of the English to Gaelic translation obtained using openai.com at a cost of just under $5. It is an unofficial language extension because the translations still need to be verified by Gaelic speakers, which may be unlikely as there are only 60,000 of them in total. Creation of the extension was inspired by the coincidence of an enquiry in the Forum and a personal visit to the ruins of Carnasserie Castle, where the very first printed document in Scottish Gaelic was produced in 1567. 
+ +## Repository File Structure + +The following structure includes a build.xml file, used to build the package using phing, and a .gitignore file, neither of which are present in the GitHub [repository](https://github.com/ceford/cefjdemos-pkg-gd-gb). The .ini files are translations of the original English .ini files. The method of translation is covered in a separate article. + +```sh +cefjdemos-pkg-gd-gb + gd-GB + admin_gd-GB + 454 *.ini files + install.xml + langmetadata.xml + localise.php + api_gd-GB + 2 *.ini files + install.xml + langmetadata.xml + site_gd-GB + 69 *.ini files + install.xml + langmetadata.xml + localise.php + admin_gd-GB.zip + api_gd-GB.zip + pkg_gd-GB.xml + script.php + site_gd-GB.zip + .gitignore + build.xml + LICENSE + pkg_gd-GB.zip + README.md +``` + +The pkg_gd-GB.zip file contains the three client zip files, the script.php file and the pkg_gd-GB.xml file but not the contents of each client folder as they are in the individual zips. + +## The pkg_gd-GB.xml File + +Note that **gd-GB** is the ISO code for Scottish Gaelic. Most of the fields in the pkg_gd-GB.xml file are self-explanatory. It is possible to create separate Site and Administrator language extensions. However, the Administrator install.xml and langmetadata.xml files are required for language administration and the localise.php file is required for use by some plugins. + +```xml + + + Scottish Gaelic Language Pack + gd-GB + 5.3.1.1 + 2025-06-18 + Clifford E Ford + cliff@ford.myzen.co.uk + https://github.com/ceford/cefjdemos-pkg-gd-gb + (C) 2025 Clifford E Ford. All rights reserved. 
+ GNU General Public License version 2 or later; see LICENSE.txt + https://github.com/ceford/cefjdemos-pkg-gd-gb + Clifford E Ford + https://github.com/ceford/cefjdemos-pkg-gd-gb + + true + script.php + + site_gd-GB.zip + admin_gd-GB.zip + api_gd-GB.zip + + + https://github.com/ceford/cefjdemos-pkg-gd-gb/raw/main/pkg_gd-GB.zip + + +``` +The extension version is usually the same as the Joomla version for which it was created. An optional extra parameter may be used for updates, for example 5.3.1.1. When creating a third party extension take care not to copy any Official Joomla! elements. The JED Checker will flag some as invalid. + +## The script.php file + +This file is used to perform additional changes during extension install, update or uninstall. It is stored in the administrator/manifests/packages/gd-GB folder. + +```php +minimumJoomla = '5.0'; + $this->minimumPhp = '8.1.0'; + + $this->deleteFiles = [ + // Previous available version was for 2.5 - assume already removed + // Old files from Joomla 3 language packs - assume already removed + // Old files from Joomla 4 language packs - assume already removed + // Old files from Joomla 5 language packs (Only relevant for Joomla 6, should then be included in the deletion array with the 6.0-dev branch once created) + // '/administrator/language/gd-GB/plg_captcha_recaptcha_invisible.ini', + // '/administrator/language/gd-GB/plg_captcha_recaptcha_invisible.sys.ini', + ]; + } + + /** + * Function to perform changes during postflight + * + * @param string $type The action being performed + * @param ComponentAdapter $parent The class calling this method + * + * @return void + * + * @since 4.0.0v1 + */ + public function postflight($type, $parent) + { + $this->removeFiles(); + } +} +``` + +## Administrator + +The admin folder contains a large number of individual `.ini` files and three others: `install.xml`, `langmetadata.xml` and `localise.php`. 
+ +### install.xml + +This file is used for installation and removal of the language extension. + +```xml + + + Scottish Gaelic + gd-GB + 5.3.1.1 + 2025-06-18 + Clifford E Ford + cliff@ford.myzen.co.uk + https://github.com/ceford/cefjdemos-pkg-gd-gb + (C) 2025 Clifford E Ford. All rights reserved. + GNU General Public License version 2 or later; see LICENSE.txt + + + + / + langmetadata.xml + install.xml + + +``` + +### langmetadata.xml + +This file is used for language management purposes. + +```xml + + + Scottish Gaelic + gd-GB + 5.3.1.1 + 2025-06-18 + Clifford E Ford + cliff@ford.myzen.co.uk + https://github.com/ceford/cefjdemos-pkg-gd-gb + (C) 2025 Clifford E Ford. All rights reserved. + GNU General Public License version 2 or later; see LICENSE.txt + + + Scottish Gaelic + Gàidhlig na h-Alba + gd-GB + 0 + gd_GB.utf8, gd_GB.UTF-8, gd_GB, gd, gla, gd-GB, scottish gaelic, gaelic, scots gaelic, scotland, uk, united kingdom + 1 + 0,6 + gregorian + + + +``` + +#### Notes + +- The `<name>` tag should be in English. +- The `<nativeName>` tag should be in the extension language. +- The `<locale>` tag is used for sorting purposes. It should include: + - Standard POSIX-style locale codes (e.g., gd_GB.utf8) + - Alternate capitalizations or encodings (gd_GB.UTF-8, gd_GB) + - ISO language and country codes (gd, gla, gd-GB) + - Human-readable names and aliases (scottish gaelic, scots gaelic, gaelic, etc.) + - Country/region-related terms (scotland, uk if applicable) +- The `<firstDay>` tag is used to specify the first day of the week in that language. 0 is Sunday, 1 is Monday, etc. +- The `<weekEnd>` tag is used to define the days considered to be the weekend, which are often greyed out. It uses the same day numbering: 0,6 is Sunday & Saturday, 5 would be Friday. +- The `<calendar>` tag uses *gregorian* by default. Other calendars may be available for some languages. + +### localise.php + +This file is used to cope with language peculiarities. 
+ +```php + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +``` + +The build process is then called from a VSCode tasks file. + +### .vscode/tasks.json + +```json +{ + // See https://go.microsoft.com/fwlink/?LinkId=733558 + // for the documentation about the tasks.json format + "version": "2.0.0", + "tasks": [ + { + "label": "Build pkg_gd-GB", + "type": "shell", + "command": "php ~/bin/phing-latest.phar", + "windows": { + "command": "php ~/bin/phing-latest.phar" + }, + "group": "build", + "presentation": { + "reveal": "always", + "panel": "shared" + } + } + ] +} +``` + +## JED Checker + +Although this extension is not destined for the Joomla Extensions Directory, the JED Checker is an invaluable development tool. This is what it reports: + +![JED Checker Screenshot](_assets/jed-check-gaelic.png) + +The XML Manifest errors appear to be a JED Checker bug that has been reported. There are no other problems. diff --git a/docs/building-extensions/languages/translate-openai.md b/docs/building-extensions/languages/translate-openai.md new file mode 100644 index 00000000..9673da4b --- /dev/null +++ b/docs/building-extensions/languages/translate-openai.md @@ -0,0 +1,322 @@ +--- +sidebar_position: 10 +title: Openai .ini File Translation Example +--- + +Openai .ini File Translation Example +==================================== + +## Introduction + +This article presents an example of a method to translate a language pack using openai.com. It seemed necessary to use this method to fulfil a requirement for a language pack in Scottish Gaelic without the availability of Gaelic speaking translators. + +## Preparation + +The default English language pack contains .ini files for three clients: admin (454 files), api (2 files) and site (69 files). 
Lists of the .ini files were made in three .txt files, as in this short example for the list of admin .ini files: + +``` +com_actionlogs.ini +com_actionlogs.sys.ini +com_admin.ini +com_admin.sys.ini +com_ajax.ini +... +plg_workflow_notification.sys.ini +plg_workflow_publishing.ini +plg_workflow_publishing.sys.ini +tpl_atum.ini +tpl_atum.sys.ini +``` + +The PHP script used for translation is shown in full below. It was intended initially for one-time use only so was not *polished* for public eyes. It is run from the command line: + +``` +php initrans.php +``` + +Two parameters are hard-coded: + +- A personal api key on line 13 or thereabout. +- The client list to process (admin, api or site) right at the end of the file. + +The script processes each English .ini file in turn, breaks it into short sections of up to 25 lines and sends each section to openai.com for translation. The translated sections are then reassembled and output to a new language .ini file. + +The script outputs the name of each file being processed. If something goes wrong there is a message to that effect. The longer lists of .ini files did run into completion problems, probably related to openai usage rates. In those cases the solution is to delete the translated files with problems and run the script again. It skips the .ini files that have already been translated. + +Lots of improvements are possible: + +- Although each line in an .ini file is split into key and value, they are not used. The whole line is submitted for translation with appropriate instructions. +- Some experimentation with the openai message parameters may improve performance. + +## The initrans.php file + +```php +base . $source . 
$line); + echo "Processing {$source}{$line}\n"; + $inilines = explode(PHP_EOL, $inifile); + $inicount = 0; + $batch = []; + foreach ($inilines as $iniline) { + $test = preg_match($pattern, $iniline, $matches); + + if (!empty($test)) { + // The key is in $matches[1] and the value in $matches[2] + $keys[$inicount] = $matches[1]; + + // Add the whole line to the batch + $batch[] = $matches[0]; + $inicount += 1; + // If the batch is a multiple of 25 send it for translation. + if ($inicount % 25 === 0) { + file_put_contents($sink . $line, $this->translateme($batch), FILE_APPEND); + $batch = []; + } + } else { + // Output any pending batch translations. + if (!empty($batch)) { + file_put_contents($sink . $line, $this->translateme($batch), FILE_APPEND); + } + + // Output the line unchanged + file_put_contents($sink . $line, "{$iniline}\n", FILE_APPEND); + + $batch = []; + } + } + + // Translate any lines still in the batch; + if (!empty($batch)) { + file_put_contents($sink . $line, $this->translateme($batch), FILE_APPEND); + } + $count += 1; + } + + echo "Total = {$count}\n\n"; + } + + /** + * Prepare a batch of lines for translation + * + * @param array $batch The array of lines so far. + */ + protected function translateme($batch) { + $text = implode("\n", $batch); + + // submit a batch of lines to openai.com for translation. + $translation = $this->getTranslation('Scottish Gaelic', $text); + + return "{$translation}\n"; + } + + /** + * Compose the message to be sent to openai.com + * + * @param string $language_name The name of the destination language in English + * @param string $paragraphBuffer The text to be translated. + * + * @return string The translated text or the original text with comments. 
+ */ + protected function getTranslation($language_name, $paragraphBuffer) { + $instruction = "Please translate the following ini file text from English to {$language_name}"; + if ($language_name == 'German') { + $instruction .= ' Please use the word Beiträge rather than Artikel. '; + } + + $messages = [ + [ + "role" => "system", + "content" => "You are a translator who translates text from English to {$language_name}. " . + "Provide only the translated text, without any comments or explanations. " . + "The text is in ini file format with a key followed by the value to be translated in double quotes. " . + "The translated value must be on one line." + ], + [ + 'role' => 'user', + 'content' => $instruction . ": \n" . + $paragraphBuffer, + ], + ]; + + $return = $this->chat($messages); + if (empty($return['choices'])) { + // Find out what is going on! + //var_dump($return, $messages); + echo "Untranslated text: " . substr($paragraphBuffer, 0, 64) . "\n"; + return "\n{$paragraphBuffer}\n\n"; + } else { + return $return['choices'][0]['message']['content']; + } + } + + /** + * Set the openai parameters and create a message: https://platform.openai.com/docs/api-reference/chat/create + * + * @param array $messages (each item must have "role" and "content" elements, this is the whole conversation) + * @param int $maxTokens maximum tokens for the response in ChatGPT (1000 is the limit for gpt-3.5-turbo) + * @param string $model valid options are "gpt-3.5-turbo", "gpt-4", and in the future probably "gpt-5" + * @param int $responseVariants how many responses to come up with (normally we just want one) + * @param float $frequencyPenalty between -2.0 and 2.0, penalize new tokens based on their existing frequency in the answer + * @param float $presencePenalty between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the conversation so far, increasing the AI's chances to write on new topics. 
+ * @param float $temperature default is 1, between 0 and 2, higher value makes the model more random in its discussion (going on tangents). + * @param string $user if you have distinct app users, you can send a user ID here, and OpenAI will look to prevent common abuses or attacks + */ + protected function chat( + $messages = [], + $maxTokens=2000, + $model='gpt-4o', + $responseVariants=1, + $frequencyPenalty=0, + $presencePenalty=0, + $temperature=1, + $user='') { + + //create message to post + // Note: $maxTokens is accepted but is not added to the request. + $message = new stdClass(); + $message -> messages = $messages; + $message -> model = $model; + $message -> n = $responseVariants; + $message -> frequency_penalty = $frequencyPenalty; + $message -> presence_penalty = $presencePenalty; + $message -> temperature = $temperature; + + if($user) { + $message -> user = $user; + } + + $result = self::_sendMessage('/chat/completions', data: json_encode($message)); + + return $result; + } + + /** + * Send the request message to openai. + * + * @param string $endpoint Endpoint obtained from the openai url + * @param string $data The json encoded data to be sent. + * @param string $method Defaults to post. + * + * @return array The decoded response, or an empty array if it could not be decoded. 
+ */ + private static function _sendMessage($endpoint, $data = '', $method = 'post') { + $apiEndpoint = self::$open_ai_url.$endpoint; + + $curl = curl_init(); + + if($method == 'post') { + $params = array( + CURLOPT_URL => $apiEndpoint, + // Verify the TLS certificate and host name so the API key is only sent to the genuine endpoint. + CURLOPT_SSL_VERIFYHOST => 2, + CURLOPT_SSL_VERIFYPEER => true, + CURLOPT_RETURNTRANSFER => true, + CURLOPT_MAXREDIRS => 10, + CURLOPT_TIMEOUT => 90, + CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1, + CURLOPT_CUSTOMREQUEST => "POST", + CURLOPT_NOBODY => false, + CURLOPT_HTTPHEADER => array( + "content-type: application/json", + "accept: application/json", + "authorization: Bearer ".self::$open_ai_key + ) + ); + curl_setopt_array($curl, $params); + curl_setopt($curl, CURLOPT_POSTFIELDS, $data); + } else if($method == 'get') { + $params = array( + CURLOPT_URL => $apiEndpoint . ($data!=''?('?'.$data):''), + // Verify the TLS certificate and host name so the API key is only sent to the genuine endpoint. + CURLOPT_SSL_VERIFYHOST => 2, + CURLOPT_SSL_VERIFYPEER => true, + CURLOPT_RETURNTRANSFER => true, + CURLOPT_MAXREDIRS => 10, + CURLOPT_TIMEOUT => 90, + CURLOPT_HTTP_VERSION => CURL_HTTP_VERSION_1_1, + CURLOPT_CUSTOMREQUEST => "GET", + CURLOPT_NOBODY => false, + CURLOPT_HTTPHEADER => array( + "content-type: application/json", + "accept: application/json", + "authorization: Bearer ".self::$open_ai_key + ) + ); + curl_setopt_array($curl, $params); + } + + $response = curl_exec($curl); + + curl_close($curl); + + $data = json_decode($response, true); + if(!is_array($data)) return array(); + + return $data; + } +} + +$client = new ChatGPTIniTranslate; +$client->go('api'); + +```
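
One of the improvements listed in the translation article is that the key and value of each .ini line are split but never used separately. The following is a minimal sketch of how that could look, not the method used by `initrans.php`. The function name `translateIniLines`, the regular expression for keys and the `$translate` callable are all assumptions made for illustration. The callable stands in for the request to openai.com and is expected to return one translated string per value it receives.

```php
<?php

/**
 * Translate only the values of .ini lines, in batches, leaving keys,
 * comments and blank lines untouched.
 *
 * @param string[] $lines     Lines of an .ini file without line endings.
 * @param callable $translate Hypothetical stand-in for the API call: receives a
 *                            list of strings and returns a list of the same length.
 * @param int      $batchSize Maximum number of values per call (must be 1 or more).
 *
 * @return string[] The lines with translated values.
 */
function translateIniLines(array $lines, callable $translate, int $batchSize = 25): array
{
    // Assumed key format: upper-case letters, digits, underscores, dots and hyphens.
    $pattern = '/^([A-Z0-9_.-]+)\s*=\s*"(.*)"\s*$/';
    $keys    = [];
    $values  = [];

    // Record the key and value of every translatable line by its line index.
    foreach ($lines as $index => $line) {
        if (preg_match($pattern, $line, $matches) === 1) {
            $keys[$index]   = $matches[1];
            $values[$index] = $matches[2];
        }
    }

    $output = $lines;

    // preserve_keys = true keeps the original line indexes in each chunk.
    foreach (array_chunk($values, $batchSize, true) as $chunk) {
        $translated = $translate(array_values($chunk));

        // Keep the English text if the reply cannot be matched line for line.
        if (count($translated) !== count($chunk)) {
            continue;
        }

        foreach (array_keys($chunk) as $position => $index) {
            $output[$index] = $keys[$index] . '="' . $translated[$position] . '"';
        }
    }

    return $output;
}

// Usage with a stub translator that upper-cases text instead of calling an API.
$lines = [
    '; Example comment line',
    'COM_EXAMPLE_TITLE="Title"',
    '',
    'COM_EXAMPLE_SAVE="Save"',
];

$result = translateIniLines($lines, fn (array $values): array => array_map('strtoupper', $values));

echo implode(PHP_EOL, $result), PHP_EOL;
```

Because only the values are passed to the callable, the keys are written back exactly as they were read, and a batch whose reply has the wrong number of entries is left in English rather than being misaligned.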