One reason why gnucash will not start because of corrupt XML file

Derek Atkins warlord at MIT.EDU
Wed Aug 28 10:10:25 EDT 2013


Hi Keith,

Keith <tunerchip at gmail.com> writes:

> Not sure if this is the correct place to report this, but it may
> save someone a lot of time.
> I download CSV files from PayPal and import them into GnuCash.
> This works a treat. However, after importing several files and closing
> GnuCash, it would not start again and reported a parse error in the
> XML file loaded at startup.
> When I closed it down it was working fine, showing all the imports OK.
>
> Basically what was happening: I had sold some items to China, and
> the CSV files downloaded from PayPal contained some unreadable
> Chinese characters. These unreadable characters were then written into
> the XML file when I imported them. When the program was closed and
> then restarted, it hit a parse error in the XML file; the unreadable
> characters must be control characters in the XML.
> The solution, until a developer puts a filter into the import, is to
> open the imported CSV in Notepad++ and replace the unreadable
> characters using the Notepad++ TextFX plugin.
> This replaces all unreadable characters except CR and LF.

Ahh, that's definitely an interesting problem.  Yes, GnuCash (and the
XML parser) expect the data to be UTF-8.  The GnuCash UI enforces this,
because you can only enter UTF-8 into the UI.  However, clearly the
importer(s) do not properly sanitize the input data, meaning you could
import non-UTF-8 data which will get saved out and then, as you see, won't
reload.
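
To see whether a data file really has this problem, something along the
lines of the Python sketch below could be used.  This is only an
illustration, not GnuCash code: the function name check_xml_safe is made
up here, and it assumes the file contents have already been read as raw
bytes (and decompressed first, if the file was saved compressed).  It
reports bytes that are not valid UTF-8 and characters that XML 1.0 does
not allow.

```python
import re

# Characters XML 1.0 forbids: C0 controls other than tab, LF and CR,
# plus the non-characters U+FFFE and U+FFFF.
_XML_ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def check_xml_safe(data: bytes) -> list:
    """Return a list of human-readable problems found in data (empty if none)."""
    problems = []
    try:
        text = data.decode("utf-8")
    except UnicodeDecodeError as exc:
        # Only the first invalid sequence is reported here.
        problems.append(
            f"invalid UTF-8 at byte {exc.start}: {data[exc.start:exc.end]!r}"
        )
        # Decode leniently so the control-character scan can still run.
        text = data.decode("utf-8", errors="replace")
    for match in _XML_ILLEGAL.finditer(text):
        problems.append(
            f"illegal XML character U+{ord(match.group()):04X} "
            f"at char {match.start()}"
        )
    return problems
```

An empty list means the data is valid UTF-8 and free of characters the
XML parser would reject; anything else points at the offending position.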

Can you please file a bug report in Bugzilla about this?  Specify which
importer(s) you used.  The fix will be to make sure the data is UTF-8 on
import (possibly by asking for the imported encoding/charset, or just
dropping non-UTF-8 characters).
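
As a rough sketch of those two options (again in Python, purely for
illustration and not the actual importer code; to_clean_utf8 and its
source_encoding parameter are hypothetical names), the cleanup could look
something like this.  It can also serve as a workaround for cleaning a
CSV file before importing it.

```python
import re
from typing import Optional

# Characters XML 1.0 forbids: C0 controls other than tab, LF and CR,
# plus the non-characters U+FFFE and U+FFFF.
_XML_ILLEGAL = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")


def to_clean_utf8(raw: bytes, source_encoding: Optional[str] = None) -> str:
    """Turn imported bytes into text that is safe to write into an XML file.

    If source_encoding is given (option 1: ask for the charset), the bytes
    are decoded with it and a wrong guess raises UnicodeDecodeError.
    Otherwise (option 2) the bytes are treated as UTF-8 and any invalid
    sequences are silently dropped.
    """
    if source_encoding is not None:
        text = raw.decode(source_encoding)
    else:
        text = raw.decode("utf-8", errors="ignore")
    # Either way, strip characters the XML parser would reject on reload.
    return _XML_ILLEGAL.sub("", text)
```

For example, to_clean_utf8(data, "gb18030") would convert a file that was
exported in a Chinese charset, while to_clean_utf8(data) simply drops
whatever is not valid UTF-8.  Dropping is lossy, so asking for the
encoding is the safer choice when it is known.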

Thanks!

-derek

-- 
       Derek Atkins, SB '93 MIT EE, SM '95 MIT Media Laboratory
       Member, MIT Student Information Processing Board  (SIPB)
       URL: http://web.mit.edu/warlord/    PP-ASEL-IA     N1NWH
       warlord at MIT.EDU                        PGP key available

