Cyrillic in reports

Reinke Bonte reinke.bonte@web.de
Thu, 5 Dec 2002 23:32:46 +0900


I had a quick look through the sources. I'm not familiar with libxml,
but I think there is a mistake in the gnucash functions which create the
DOM trees. They all pass the raw string over to libxml, but if I
understand the libxml documentation correctly, you should only pass
xmlChar* strings, which must be UTF-8. My understanding is that you
should first use xmlCharEncInFunc before you add anything, and when you
read a file use xmlCharEncOutFunc, in order to ensure smooth operation
in different locales.
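
To make the conversion step concrete, here is a minimal sketch of turning
a locale-encoded string into UTF-8 before it is handed to libxml. It is
only an illustration under assumptions: it uses POSIX iconv() instead of
the libxml encoding handlers named above, and to_utf8 is a hypothetical
helper, not a function from the gnucash sources or from libxml.

```c
#include <iconv.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of gnucash or libxml): convert a string
 * encoded in `from_charset` into a newly allocated UTF-8 string.
 * Returns NULL on failure; the caller must free() the result. */
char *to_utf8(const char *input, const char *from_charset)
{
    iconv_t cd = iconv_open("UTF-8", from_charset);
    if (cd == (iconv_t)-1)
        return NULL;                    /* unknown or unsupported charset */

    size_t in_left = strlen(input);
    /* Four output bytes per input byte is a generous upper bound. */
    size_t out_size = in_left * 4 + 1;
    char *output = malloc(out_size);
    if (output == NULL) {
        iconv_close(cd);
        return NULL;
    }

    char *in_ptr = (char *)input;       /* iconv() takes a non-const pointer */
    char *out_ptr = output;
    size_t out_left = out_size - 1;     /* keep room for the terminator */

    size_t rc = iconv(cd, &in_ptr, &in_left, &out_ptr, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1) {
        free(output);                   /* invalid or incomplete input */
        return NULL;
    }

    *out_ptr = '\0';
    return output;
}
```

A caller could obtain the source charset from nl_langinfo(CODESET), cast
the converted string to xmlChar* for a call such as xmlNewTextChild, and
free it afterwards. Whether that is the right place to do it in gnucash
is exactly what needs checking.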

Our problem is probably here:
gnucash/src/backend/file/gnc-transaction-xml-v2.c line 179/180

But the same happens in many other places.

Could someone who actually knows libxml please have a quick look to
check whether I am wrong?

Thank you


Reinke



On Thu, 5 Dec 2002 13:52:29 +0800
Sergei Dolmatov <sergei@dolmatov.dsb.ru> wrote:

> On Thu, Dec 05, 2002 at 02:24:44PM +0900, Reinke Bonte wrote:
> > I am not sure whether this is a GtkHTML bug. I have the same
> > problem with Japanese characters in reports. There might be
> > something wrong with my locale setting, but in my case gnucash
> > writes those strange entities into the account (.xac) file. 
> > 
> > Sergei, can you check your account file to see whether you have the
> > same problem there? In my case (eucJP encoding), preliminary tests
> > suggest that the entities in the account file are actual Unicode
> > encodings; however, they are not Unicode encodings of the Japanese
> > characters, but Unicode encodings of the ISO-8859 representation of
> > the eucJP encoding of the Japanese characters.
> 
> Yes, in .xac I have these entities too. And they aren't Unicode for
> Cyrillic either.
> 
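
The double encoding described in the quoted message can be reproduced
with a few lines of C. This is only a sketch of the suspected effect, not
code from gnucash: latin1_bytes_to_utf8 is a hypothetical helper that
treats every input byte as an ISO-8859-1 character and encodes it as
UTF-8, which appears to be what happens when raw locale bytes are taken
for single-byte characters.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: encode each input byte as if it were one
 * ISO-8859-1 character (U+0000..U+00FF) and emit UTF-8.
 * The caller must free() the result. */
char *latin1_bytes_to_utf8(const char *input)
{
    size_t len = strlen(input);
    char *output = malloc(len * 2 + 1);  /* each byte yields at most 2 bytes */
    if (output == NULL)
        return NULL;

    char *out = output;
    for (size_t i = 0; i < len; i++) {
        unsigned char byte = (unsigned char)input[i];
        if (byte < 0x80) {
            *out++ = (char)byte;                    /* ASCII stays as-is */
        } else {
            *out++ = (char)(0xC0 | (byte >> 6));    /* two-byte lead byte */
            *out++ = (char)(0x80 | (byte & 0x3F));  /* continuation byte */
        }
    }
    *out = '\0';
    return output;
}
```

For example, the hiragana letter A is the two bytes A4 A2 in eucJP. Run
through this helper they become C2 A4 C2 A2, that is U+00A4 and U+00A2,
two unrelated Latin-1 symbols, instead of E3 81 82 (U+3042), which would
be the correct UTF-8 for that character.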

-- 
"Fun is a medicinal bath. The pleasure industry never fails
to prescribe it. It makes laughter the instrument of the
fraud practised on happiness."