[GNC-dev] How to manage documentation translations

Geert Janssens geert.gnucash at kobaltwit.be
Tue Sep 11 16:45:47 EDT 2018

Somewhere in the long thread about future documentation directions the issue 
of translating documentation was raised. Rightfully so, because this is 
currently a very challenging task.

The initial translation, while a huge job, is relatively straightforward: one 
takes the English docbook files and translates each paragraph and header, one 
by one. In the end one ends up with the same docbook files in one's own 
language.

Documentation updates on the other hand are more challenging. Whenever the 
English document is changed the only clue as to what has changed is in the git 
logs. Git, while very powerful, is not very translator-friendly. It's 
targeting a completely different use case.

For translations other tools exist. The best-known to us is gettext. This 
works by creating a message catalog of all translatable strings with tooling 
to help in translating them. Whenever an original string changes it is 
relatively easy for the translator to figure out which string has changed and 
make the necessary changes in the translation as well.
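
As a rough illustration of why a message catalog makes changes easier to spot 
than a git diff, the sketch below compares the strings extracted from two 
revisions of a document and pairs each new or altered string with the closest 
old one. This is not how gettext itself is implemented; the function name, the 
cutoff value and the sample strings are all made up for the example.

```python
import difflib


def find_changed_strings(old_strings, new_strings, cutoff=0.8):
    """List every string in new_strings that has no verbatim match in
    old_strings, together with the most similar old string (or None)."""
    old_set = set(old_strings)
    changes = []
    for new in new_strings:
        if new in old_set:
            continue  # unchanged: the existing translation still applies
        close = difflib.get_close_matches(new, old_strings, n=1, cutoff=cutoff)
        changes.append((new, close[0] if close else None))
    return changes


# Hypothetical strings extracted from two revisions of the English guide.
old = [
    "Open the account register.",
    "Click Save to store your changes.",
]
new = [
    "Open the account register.",
    "Click Save to store your changes!",
    "A brand new paragraph about budgets.",
]

for new_text, old_text in find_changed_strings(old, new):
    if old_text is None:
        print("NEW:    ", new_text)
    else:
        print("CHANGED:", old_text, "->", new_text)
```

The translator only has to look at the strings this reports, and for an 
altered string the old version (and hence its old translation) is right there 
for comparison.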

These two methods behave very differently in case of partially translated 
documentation:

* With our old method only documentation that has been translated is available 
in that language. Untranslated parts are not available at all.
* With the gettext method the full documentation is always presented to the 
user. For parts that have been translated, this translation will be shown. For 
parts that have not been translated, the original English text will be shown.

Also, in the old method the translated documentation doesn't change unless a 
translator makes modifications. In the gettext method the translation may 
change whenever the English original changes. Even a simple change of 
punctuation would hide a translation from the end user (at least that's what 
happened in our current Italian translation, which is based on the gettext 
method).
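
To make the punctuation problem concrete, here is a minimal sketch of an 
exact-match lookup with fallback to English. It only models the behaviour 
described above with a plain dictionary; it is not the real gettext machinery, 
and the function name and the sample Italian string are invented for the 
example.

```python
def render_with_fallback(paragraphs, catalog):
    """Return the translation for each paragraph when the catalog holds an
    exact match for the English text, and the English text otherwise."""
    return [catalog.get(text, text) for text in paragraphs]


# Hypothetical catalog with a single translated paragraph.
catalog = {
    "Click Save to store your changes.":
        "Fai clic su Salva per memorizzare le modifiche.",
}

before = ["Click Save to store your changes."]
after = ["Click Save to store your changes!"]  # only the punctuation changed

print(render_with_fallback(before, catalog))
# ['Fai clic su Salva per memorizzare le modifiche.']
print(render_with_fallback(after, catalog))
# ['Click Save to store your changes!']
```

Because the lookup key is the English text itself, the smallest edit to the 
original turns a translated paragraph back into an English one until the 
translator revisits it.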

I honestly don't know which end result is preferred by non-English speaking 
end users. Perhaps we should poll for this in our non-English mailing lists.

That's what we have now and I think we should be able to do better, both for 
our translators and for our end users.

In the light of the upcoming major rework of the guide, the old method 
(translating the docbook files directly) will end in a lot of translator 
frustration. Translators will be required to interpret git logs and diffs to 
learn what has happened. As said, git is not a very good tool for 
non-developers to deal with.

So I'm inclined to look for improvements in the gettext method.

For the direct issues mentioned above:
1. Losing translations on something as simple as a punctuation change
We could avoid this by not running gettext extraction automatically. In a way 
that would make the gettext method a hybrid between the two methods. The 
workflow would become:
a. a translator runs gettext
b. the translator looks for new/changed text and updates the translation.
c. this will be used from now on
d. until the translator reruns gettext.
=> technical note, this really means we should copy/cache the original English 
documentation for each language we support. This copy/cache should only be 
updated on request of the translator.
The advantages of this approach are
* all of gettext tooling is available to support the translator
* the translated documentation the end user sees will always be what the 
translator intended and never change automatically behind the translator's 
back
The disadvantages are
* if the translation was not complete, there will be English parts in there 
* translator needs to be aware of the requirement to rerun gettext (or more 
precisely to update the copy/cache)
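
The copy/cache idea from the technical note above could look roughly like the 
sketch below: each language gets its own snapshot of the English sources, the 
build for that language always reads from the snapshot, and only an explicit 
request from the translator refreshes it. The directory layout, the file name 
and both function names are assumptions for illustration; nothing like this 
exists in our build system today.

```python
import shutil
import tempfile
from pathlib import Path


def refresh_snapshot(english_dir: Path, snapshot_root: Path, lang: str) -> Path:
    """Steps a/d: replace the cached English sources for `lang` with the
    current ones. Only run when the translator asks for it."""
    target = snapshot_root / lang
    if target.exists():
        shutil.rmtree(target)  # drop the stale copy, including removed files
    shutil.copytree(english_dir, target)
    return target


def source_for_build(english_dir: Path, snapshot_root: Path, lang: str) -> Path:
    """Step c: build `lang` from its snapshot; fall back to the live English
    sources if no snapshot has been taken yet."""
    target = snapshot_root / lang
    return target if target.exists() else english_dir


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    english = Path(tmp) / "guide" / "C"
    english.mkdir(parents=True)
    chapter = english / "ch_basics.xml"  # hypothetical file name
    chapter.write_text("<para>Click Save to store your changes.</para>")

    snapshots = Path(tmp) / "snapshots"
    refresh_snapshot(english, snapshots, "it")

    # The English original changes later on...
    chapter.write_text("<para>Click Save to store your changes!</para>")

    # ...but the Italian build still sees the text the translator worked from.
    used = source_for_build(english, snapshots, "it") / "ch_basics.xml"
    print(used.read_text())
```

The cost is one extra copy of the English sources per supported language, 
which is the price for translations that never change without the translator 
asking for it.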

Another approach would be to tweak gettext behaviour to not hide slightly 
altered (fuzzy) translations.
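
A minimal sketch of that second approach, assuming a catalog where an entry 
whose English text changed slightly keeps its old translation but is flagged 
as fuzzy. The Entry class, the function name and the sample data are 
hypothetical. For what it's worth, GNU msgfmt skips fuzzy entries by default 
and has a --use-fuzzy option to include them; whether our documentation 
toolchain can pass that through is something I have not checked.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    msgid: str           # current English text
    msgstr: str          # translation (possibly of an older English text)
    fuzzy: bool = False  # True when the English text changed since translation


def render_catalog(paragraphs, entries, use_fuzzy=False):
    """Translate paragraphs from the catalog. Fuzzy entries are only used
    when use_fuzzy is True; otherwise the English text is shown."""
    by_id = {entry.msgid: entry for entry in entries}
    output = []
    for text in paragraphs:
        entry = by_id.get(text)
        if entry and entry.msgstr and (use_fuzzy or not entry.fuzzy):
            output.append(entry.msgstr)
        else:
            output.append(text)
    return output


# Hypothetical entry: the English text gained a "!" after it was translated.
entries = [
    Entry(
        msgid="Click Save to store your changes!",
        msgstr="Fai clic su Salva per memorizzare le modifiche.",
        fuzzy=True,
    ),
]
doc = ["Click Save to store your changes!"]

print(render_catalog(doc, entries))
# ['Click Save to store your changes!']
print(render_catalog(doc, entries, use_fuzzy=True))
# ['Fai clic su Salva per memorizzare le modifiche.']
```

The risk is that a fuzzy match can also pair a translation with an English 
text whose meaning did change, so the reader may see a translation that is 
out of date.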

2. The presence of English text in partially translated documentation
We could again wrap around gettext and filter out untranslated parts. That may 
give odd results though so I wouldn't recommend it.

Or we could leverage automated translations, for example via Google Translate. 
We would have to add a note that the translation may be less reliable in that 
case. But perhaps a poor translation is better than no translation at all?

That's as far as I got for now. More input and other ideas are welcome.


