problem with qif-parse.scm in trunk

Charles Day cedayiv at gmail.com
Wed Sep 24 10:06:47 EDT 2008


On Tue, Sep 23, 2008 at 7:07 PM, David Reiser <dbreiser at earthlink.net> wrote:

>
> On Sep 23, 2008, at 9:50 AM, Charles Day wrote:
>
> On Mon, Sep 22, 2008 at 8:38 PM, David Reiser <dbreiser at earthlink.net> wrote:
>
>>
>> On Sep 22, 2008, at 12:42 PM, Charles Day wrote:
>>
>> On Mon, Sep 22, 2008 at 9:41 AM, Charles Day <cedayiv at gmail.com> wrote:
>>
>>> On Mon, Sep 22, 2008 at 8:13 AM, Derek Atkins <warlord at mit.edu> wrote:
>>>
>>>> Quoting David Reiser <dbreiser at earthlink.net>:
>>>>
>>>> >
>>>> > On Sep 22, 2008, at 10:17 AM, Derek Atkins wrote:
>>>> >
>>>> >> David Reiser <dbreiser at earthlink.net> writes:
>>>> >>
>>>> >>> As it exists currently, qif-parse.scm does not work, even with the
>>>> >>> escaped version of a3. However, if I change \xa3 to \\xa3, gnucash
>>>> >>> will run. That looks like an escaping/quoting inconsistency among
>>>> >>> systems. Is that any easier to solve than the base encoding problem?
>>>> >>
>>>> >> If you change it to \\xa3 then does it properly deal with the £ in
>>>> >> the QIF?
>>>> >>
>>>> >>>
>>>> >> -derek
>>>> >
>>>> >
>>>> > I don't know. Is there a sample qif file I can test? What will I be
>>>> > looking for?
>>>>
>>>> See http://bugzilla.gnome.org/show_bug.cgi?id=141003
>>>>
>>>
>>> Doubling up the backslashes should break the fix, as then the backslash
>>> loses its special meaning in the regular expression.
>>>
>>
>> Sorry, wrong attachment on the previous message. I've now attached the
>> correct one.  -Charles
>>
>>
>>>
>>> Please see the simple QIF file attached. It contains the British Pound
>>> symbol in ISO 8859-1 (0xA3). This is what the QIF importer needs to be able
>>> to handle. Here is the output of 'od':
>>> $ od -c 141003a.qif
>>> 0000000   !   A   c   c   o   u   n   t  \n   N   M   y       C   r   e
>>> 0000020   d   i   t       C   a   r   d  \n   T   C   C   a   r   d  \n
>>> 0000040   ^  \n   !   T   y   p   e   :   C   C   a   r   d  \n   D   2
>>> 0000060   2   /   0   9   /   2   0   0   8  \n   P   T   e   s   t
>>> 0000100   p   a   y   e   e  \n   T 243   3   8   .   4   6  \n   ^  \n
>>> 0000120
>>>
>>> -Charles
>>>
>> Correct. Doubling the backslash breaks the fix. With LANG=en_US.UTF-8, my
>> system even complains "some characters have been discarded" during the
>> import when I make the .scm file UTF-8 while the qif is latin-1. (Though I
>> haven't changed the other two files in the changeset...).
>>
>> I need that LANG setting because that's the only way to get gtkprint to
>> use US-letter paper parameters while printing checks. (Unless someone wants
>> to add a gnumeric-style default page setup to gnucash.)
>>
>
> If you change that LANG setting, does it affect the QIF import behavior?
> Regarding page setup, have you tried out Mike Alexander's patch for bug
> 531871?
>
>
> Hmm. Turns out it was LC_MESSAGES that I was setting to force gtkprint to
> use US Letter as the default paper. Mike's patch does indeed solve that
> problem. I'd vote to move that patch to trunk sooner rather than later.
>
> Turning off the setting of LC_MESSAGES and running:
> LANG=C /opt/gnucash-svn/bin/gnucash
> does allow gnucash to start. But I still get the error message in the qif
> dialog "Some characters have been discarded" when importing 141003a.qif
>

The "Some characters have been discarded" message means that the default
attempt to convert the string to UTF-8 according to locale has failed and
the pound symbol was just deleted instead. If you continue then the import
should succeed.
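
As a rough illustration of that failure mode (a Python sketch, not GnuCash's actual Scheme or C code; the helper name `to_utf8_lossy` is hypothetical, and it assumes the locale's charset is UTF-8), the amount line in 141003a.qif holds the single byte 0xA3, which is not valid UTF-8 on its own. A conversion that drops whatever it cannot decode silently loses the symbol:

```python
# Bytes of the amount line from the od dump: "T", 0xA3 (octal 243), "38.46".
raw_line = b"T\xa338.46"


def to_utf8_lossy(data: bytes, charset: str = "utf-8") -> str:
    """Hypothetical helper: decode with the assumed locale charset,
    discarding any bytes that are invalid in that charset."""
    return data.decode(charset, errors="ignore")


# The lone 0xA3 byte is not valid UTF-8, so it is discarded.
lossy = to_utf8_lossy(raw_line)

# Decoding with the file's real encoding (ISO 8859-1) keeps the pound symbol.
correct = raw_line.decode("iso-8859-1")
```

Here `lossy` no longer contains the pound symbol, while `correct` does, which is consistent with the import continuing after the warning.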

I have just tested with that test QIF file myself, and the pound symbols are
getting converted to UTF-8 and then failing the regex match. So there is a
bug here, and I am bewildered as to how I missed it, since I tested the code
with a large file chock full of pound symbols. I must have somehow fooled
myself with a different file. Anyway, I will dive in and report back here.


> Since I'm not setting $LANG anywhere in .profile or .bashrc, I'd guess some
> setting in the Mac realm is doing it for me. Defeating that kind of setting
> is not something I think I should do in generic packaging, however.
>
> Dave
> --
> David Reiser
> dbreiser at earthlink.net
>

-Charles

