This week I reached an important milestone in the major rewrite that I have been working on for about nine months, alongside the daily work of fixing bugs, coding small enhancements and reviewing patches. In current master, LibreOffice is finally able to transparently handle arbitrary (if valid) BCP 47 language tags and fully supports the fo:script and *:rfc-language-tag attributes defined in ODF 1.2.
So what does this mean? It means that you'll be able to get your language in.
It means that languages and writing scripts that were already supported, but only through a kludge that squeezed them into ISO 639 language codes and ISO 3166 country codes, are finally supported using the proper language tags registered with IANA. For example:
- ca-ES-valencia Catalan Valencian
- The Valencian variant of Catalan previously used the ca-XV kludge, where XV is an ISO 3166 code reserved for private use, which meant it could be used for UI translation purposes but not for document content. This is now stored in ODF as the attribute style:rfc-language-tag='ca-ES-valencia'.
- sr-Latn Serbian Latin
- Previously the deprecated sh kludge was used to differentiate between Serbian Latin and Serbian Cyrillic (sr). Serbian Latin in Serbia, sr-Latn-RS, is now stored in ODF as the attributes fo:language='sr' fo:script='Latn' fo:country='RS'.
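The split between the two storage forms in the examples above can be sketched in a few lines of Python. This is only an illustration and not LibreOffice's actual code: the function name `odf_language_attributes` is hypothetical, and the pattern is a deliberate simplification that treats only language[-Script][-COUNTRY] tags with a two-letter country as expressible through the fo:* attributes. It is not a full BCP 47 parser, and a real producer may write additional attributes.

```python
import re

# Simplified assumption: only language[-Script][-COUNTRY] fits the fo:* attributes.
# Anything else (variants, extensions, private use) goes to style:rfc-language-tag.
_SIMPLE_TAG = re.compile(
    r"^(?P<language>[A-Za-z]{2,3})"
    r"(?:-(?P<script>[A-Za-z]{4}))?"
    r"(?:-(?P<country>[A-Za-z]{2}))?$"
)


def odf_language_attributes(tag: str) -> dict:
    """Return a dict of ODF attribute names to values for a language tag (sketch)."""
    match = _SIMPLE_TAG.match(tag)
    if match is None:
        # The tag cannot be expressed by language, script and country alone.
        return {"style:rfc-language-tag": tag}

    attrs = {"fo:language": match.group("language").lower()}
    if match.group("script"):
        # Conventional casing for script codes: first letter upper case.
        attrs["fo:script"] = match.group("script").title()
    if match.group("country"):
        attrs["fo:country"] = match.group("country").upper()
    return attrs


if __name__ == "__main__":
    for example in ("sr-Latn-RS", "ca-ES-valencia"):
        print(example, odf_language_attributes(example))
```

With this sketch, sr-Latn-RS yields the three fo:* attributes, while ca-ES-valencia falls through to the single style:rfc-language-tag attribute, matching the two examples above.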
It also means that the tag en-GB-oed can be supported, and in fact already is: the corresponding entry has been added to the language list. This is English with Oxford English Dictionary spelling, which is mandatory for UN documents and, it seems, also used for EU documents. LibreOffice will be the first free office suite to support spell-checkers with Oxford English Dictionary spelling along with en-GB and en-US spelling at the same time.
Transparently handling arbitrary tags means that LibreOffice copes with a document containing language attribution it does not specifically know, i.e. a tag that has no entry in the language list. When you position the cursor on such text or select it, the language tag itself is shown in the status bar and in the language list of the character attribution, so you will not see Unknown or, even worse, nothing or the system locale's language. If a dictionary is installed that handles such a tag, it can be used for spell-checking. Transparently of course also means that the tag is written back to ODF when the document is saved, so the attribution is not lost.
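As a rough illustration of that fallback behaviour, here is a minimal Python sketch. Everything in it is hypothetical: the names `KNOWN_LANGUAGE_NAMES`, `status_bar_label` and `find_dictionary`, the display names, and the dictionary file name are made up for the example, and matching dictionaries by exact tag is an assumption rather than a description of how LibreOffice selects them.

```python
# Hypothetical language list: tags that have a named entry.
KNOWN_LANGUAGE_NAMES = {
    "en-US": "English (USA)",
    "en-GB": "English (UK)",
    "de-DE": "German (Germany)",
}


def status_bar_label(tag: str) -> str:
    """Show the list entry if there is one, otherwise the raw tag itself."""
    return KNOWN_LANGUAGE_NAMES.get(tag, tag)


def find_dictionary(tag: str, installed: dict):
    """Return the dictionary installed for exactly this tag, or None.

    Exact matching is an assumption made for this sketch.
    """
    return installed.get(tag)


if __name__ == "__main__":
    # Hypothetical installed dictionaries, keyed by language tag.
    installed = {"de-DE-1901": "de_DE_1901.dic"}
    for tag in ("en-US", "de-DE-1901"):
        print(tag, "->", status_bar_label(tag), "|", find_dictionary(tag, installed))
```

The point of the sketch is that an unknown tag is never replaced by a placeholder: the tag string is carried through unchanged, which is also what allows it to be written back on save.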
The following screenshot shows an example of a document that uses the tag de-DE-1901 to designate German, German variant, traditional orthography:
I'm extremely glad to have this step ready just in time, and of course I'll talk about it at the LibreOffice Conference 2013 in Milano. To get all the details, please join me and attend Getting you language in on Thursday, 26 September at 15:30 in Sala Alfa.
If you are interested in the technical details of BCP 47 language tags I recommend my bookmarks as a starting point.