HUGE compiling speedup

Sun Aug 17 19:28:58 CDT 2003

Christian Stimming wrote:
> On Sonntag, 17. August 2003 18:40, Jon Lapham wrote:
> 
>>Could someone tell me again why we don't run "msgmerge" on the .po files
>>in CVS and save the merged files?
> 
> Because the .po files are the sole responsibility of the translators. *They* 
> should run "msgmerge" every once in a while. We shouldn't.

Well, I agree with you in theory, but in practice we are getting 
hammered out here in the real world.  When I compile, I issue a "make" 
followed by a "make install".  This runs msgmerge twice on every file, 
for almost 40 minutes of wall time in total (see below). 
That does not count the po->gmo conversion; it is msgmerge alone.

40 minutes!  That is absolutely crazy.

So, maybe the solution is to ask the maintainers of each po file to run 
msgmerge themselves and send the resultant merged po file in?
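
A one-off merge of that kind could be scripted roughly as below.  This 
is only a sketch: it assumes GNU msgmerge's "--update" option (which 
rewrites the .po file in place) and a po/ directory holding gnucash.pot. 
The helper names update_command and merge_all are made up for 
illustration.

```python
import subprocess
from pathlib import Path


def update_command(po_file, pot_file):
    """Build an in-place merge command (assumes GNU msgmerge --update)."""
    return ["msgmerge", "--update", str(po_file), str(pot_file)]


def merge_all(po_dir, pot_name="gnucash.pot", run=subprocess.run):
    """Merge every .po file in po_dir against the template, one by one.

    `run` is injectable so the loop can be exercised without gettext
    installed; by default it is subprocess.run.
    """
    po_dir = Path(po_dir)
    pot = po_dir / pot_name
    merged = []
    for po in sorted(po_dir.glob("*.po")):
        run(update_command(po, pot), check=True)
        merged.append(po.name)
    return merged


if __name__ == "__main__":
    # Hypothetical layout: run from the top of the source tree.
    for name in merge_all("po"):
        print("merged", name)
```

The merged files would then be committed, so that later builds find 
nothing left to merge.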

>>On my PIII800 a gazillion minutes of compiling GnuCash is spent running
>>msgmerge, every friggin' time I recompile.    

Just for the sake of putting some real numbers out there, these are the 
times to run "msgmerge -o BLAH.po POFILE.po gnucash.pot", sorted from 
longest to shortest run time.  Total time was 1092s, more than 18 
minutes, on a PIII 800 MHz computer with tons of RAM.

da.po           115 seconds
es_NI.po        101 seconds
sv.po           97 seconds
hu.po           94 seconds
ja.po           86 seconds
ru.po           67 seconds
es.po           51 seconds
pt.po           50 seconds
pl.po           45 seconds
nl.po           41 seconds
it.po           41 seconds
zh_TW.po        39 seconds
en_GB.po        39 seconds
zh_CN.po        38 seconds
cs.po           38 seconds
nb.po           36 seconds
sk.po           36 seconds
fr.po           28 seconds
uk.po           26 seconds
el.po           16 seconds
ta.po           5 seconds
pt_BR.po        2 seconds
de.po           1 seconds
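
For anyone who wants to repeat the measurement, here is a minimal 
sketch of a timing loop.  It is not the script used for the numbers 
above; the function names are invented for illustration, and only the 
msgmerge command line is taken from this message.

```python
import subprocess
import sys
import time


def merge_command(po_file, pot_file="gnucash.pot", out_file="BLAH.po"):
    """Build the msgmerge invocation quoted in this message."""
    return ["msgmerge", "-o", out_file, po_file, pot_file]


def time_command(argv):
    """Run argv and return the elapsed wall-clock time in seconds."""
    start = time.monotonic()
    subprocess.run(argv, check=True, stdout=subprocess.DEVNULL)
    return time.monotonic() - start


def report(timings):
    """Format (name, seconds) pairs, slowest first, with a total line."""
    ordered = sorted(timings, key=lambda t: t[1], reverse=True)
    lines = [f"{name:<15} {secs:.0f} seconds" for name, secs in ordered]
    total = sum(secs for _, secs in timings)
    lines.append(f"{'overall':<15} {total:.0f} seconds")
    return "\n".join(lines)


if __name__ == "__main__":
    # Usage (hypothetical script name): python time_merge.py *.po
    results = [(po, time_command(merge_command(po))) for po in sys.argv[1:]]
    print(report(results))
```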

Putting these times on the translation % done table:

Language     Done %    Trans  Untrans    Fuzzy  Missing Merge(s)
----------  -------  -------  -------  -------  ------- --------
de.po         100.0     2559        0        0        0      1
pt_BR.po      100.0     2559        0        0        0      2
en_GB.po       94.8     2425       69        0       65     39
nl.po          94.8     2425       68        0       66     41
it.po          94.4     2415       73        0       71     41
zh_TW.po       94.1     2407       68        0       84     39
cs.po          93.3     2388      106        0       65     38
ru.po          84.7     2167      116        0      276     67
fr.po          82.9     2122      437        0        0     28
pt.po          77.1     1974      113        0      472     50
es.po          77.0     1970      464        0      125     51
sk.po          73.2     1872      586        0      101     36
sv.po          52.4     1341      468        0      750     97
hu.po          52.2     1337      470        0      752     94
da.po          52.2     1335      470        0      754    115
el.po          50.4     1289      972        0      298     16
es_NI.po       49.4     1265      510        0      784    101
ja.po          46.5     1191      612        0      756     86
pl.po          22.4      574     1411        0      574     45
zh_CN.po       21.8      559     1432        0      568     38
uk.po          20.7      530     1363        0      666     26
nb.po          14.4      369     1552        0      638     36
ta.po          13.4      344     2151        0       64      5

So, as you can see, the biggest consumers of msgmerge time are neither 
the most translated nor the least translated files.  It is actually the 
"half" translated ones.  I guess this is due to the "fuzzy translation" 
guessing logic.
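
To check that impression against the table, the merge times can be 
averaged per completion band.  The sketch below copies the "Done %" and 
"Merge(s)" columns from the table above; the band boundaries (40, 60, 
90) are my own arbitrary choice, not anything gettext defines.

```python
from statistics import mean

# (file, done %, merge seconds), copied from the table above.
ROWS = [
    ("de.po", 100.0, 1), ("pt_BR.po", 100.0, 2), ("en_GB.po", 94.8, 39),
    ("nl.po", 94.8, 41), ("it.po", 94.4, 41), ("zh_TW.po", 94.1, 39),
    ("cs.po", 93.3, 38), ("ru.po", 84.7, 67), ("fr.po", 82.9, 28),
    ("pt.po", 77.1, 50), ("es.po", 77.0, 51), ("sk.po", 73.2, 36),
    ("sv.po", 52.4, 97), ("hu.po", 52.2, 94), ("da.po", 52.2, 115),
    ("el.po", 50.4, 16), ("es_NI.po", 49.4, 101), ("ja.po", 46.5, 86),
    ("pl.po", 22.4, 45), ("zh_CN.po", 21.8, 38), ("uk.po", 20.7, 26),
    ("nb.po", 14.4, 36), ("ta.po", 13.4, 5),
]


def band(done_pct):
    """Assign a completion band; the cut-offs are illustrative only."""
    if done_pct >= 90:
        return "90-100%"
    if done_pct >= 60:
        return "60-90%"
    if done_pct >= 40:
        return "40-60%"
    return "0-40%"


def mean_merge_by_band(rows):
    """Return the mean merge time in seconds for each completion band."""
    groups = {}
    for _, done, secs in rows:
        groups.setdefault(band(done), []).append(secs)
    return {name: mean(values) for name, values in groups.items()}


if __name__ == "__main__":
    for name, secs in mean_merge_by_band(ROWS).items():
        print(f"{name:<8} {secs:6.1f} s")
```

With these bands, the 40-60% group has by far the highest average 
merge time.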

> That's this stupid gettext/intltool/whatever version. Even running "msgmerge" 
> on our own doesn't help, because the recompiling of the .po catalogs will 
> happen after every single change to any C code file. 

Why wouldn't it help?  All the numbers shown in the above table would 
drop down to 1-2s per .po file for the merge.  Yes, we would still have 
to compile the .gmo, but that is a separate issue.

These are the compile times: (msgfmt -c -o BLAH.mo POFILE.po)

lapham at bilbo > ./time_compile.pl *.po
cs.po           0 seconds
da.po           1 seconds
de.po           0 seconds
el.po           1 seconds
en_GB.po        0 seconds
es_NI.po        0 seconds
es.po           1 seconds
fr.po           0 seconds
hu.po           1 seconds
it.po           0 seconds
ja.po           1 seconds
nb.po           0 seconds
nl.po           0 seconds
pl.po           1 seconds
pt_BR.po        0 seconds
pt.po           1 seconds
ru.po           0 seconds
sk.po           0 seconds
sv.po           1 seconds
ta.po           0 seconds
uk.po           0 seconds
zh_CN.po        1 seconds
zh_TW.po        0 seconds
overall         9 seconds

So, the actual compiling of the .mo file is very, very fast.

> We had some discussion 
> about this earlier. I ended up editing the po/Makefile.in.in file and 
> commenting out the line where the INTLTOOL_UPDATE command is called. I think 
> it was this one (commenting out is achieved by adding the 'echo' in front):
> 
> $(POFILES): $(srcdir)/$(DOMAIN).pot
> 	@lang=`echo $@ | sed -e 's,.*/,,' -e 's/\.po$$//'`; \
> 	test "$(srcdir)" = . && cdcmd="" || cdcmd="cd $(srcdir) && "; \
> 	echo "$${cdcmd}$(INTLTOOL_UPDATE) $${lang}.po"; 
> 
> This manual workaround has to be redone every time ./autogen.sh is called, 
> i.e., on the stable branch probably seldom.

-- 
-**-*-*---*-*---*-*---*-----*-*-----*---*-*---*-----*-----*-*-----*---
  Jon Lapham  <lapham at extracta.com.br>          Rio de Janeiro, Brasil
  Work: Extracta Moléculas Naturais SA     http://www.extracta.com.br/
  Web: http://www.jandr.org/
***-*--*----*-------*------------*--------------------*---------------