Gnucash 2.6.x History with Chinese Full Path

John Ralls jralls at ceridwen.us
Wed Nov 5 00:39:57 EST 2014


> On Nov 4, 2014, at 7:04 PM, chihchieh.sun at innolux.com wrote:
> 
> 
> 
> On Nov 4, 2014, at 7:17 AM, Chenxiong Qi <qcxhome at gmail.com> wrote:
> 
>> On Tue, Nov 4, 2014 at 10:43 PM, John Ralls <jralls at ceridwen.us> wrote:
>>> 
>>>> On Nov 4, 2014, at 1:15 AM, chihchieh.sun at innolux.com wrote:
>>>> 
>>>> Sorry, I made a mistake: I was not converting ANSI to UTF-8.
>>>> 
>>>> I used Notepad++ to create a new file encoded in ANSI.
>>>> 
>>>> Then I re-encoded the file in UTF-8, as below:
>>>> 
>>>> Original : C:\Documents and Settings\chihchieh.sun\桌面\gnucash\資產負
>>>>> 表.gnucash
>>>> ANSI :     C:\Documents and Settings\chihchieh.sun\獢��竰\gnucash\鞈��峯鞎
> 甎鍦
>>>> 銵?gnucash
>>>> UTF-8 :    C:\Documents and Settings\chihchieh.sun\桌面\gnucash\資產負
> 債衿
>>>> gnucash.
>>> 
>>> This is bug https://bugzilla.gnome.org/show_bug.cgi?id=737089.
>>> 
>>> Regards,
>>> John Ralls
>>> 
>> 
>> I installed 2.6.4-2 on Windows 7 Professional Edition, Simplified
>> Chinese. Sorry, I don't have a copy of the Traditional Chinese edition.
>> When I create, save as, and open files whose name contains traditional
>> Chinese characters, or that sit in a directory whose name contains
>> traditional Chinese characters, or both, the problem does not happen.
>> 
>> By the way, in Report the bug above does happen.
> 
>> Interesting. Sun Chihchieh, what version of Windows are you using?
> 
>> Regards,
>> John Ralls
> 
> OS: Microsoft Windows XP Professional, Version 2002, SP3
> 
> Hi,
> 
> I tried the paths below; the first three are OK.
> 
> OK
> C:\Documents and Settings\chihchieh.sun\桌面\gnucash\測試賬本.gnucash
> C:\Documents and Settings\chihchieh.sun\桌面\gnucash\測試賬本1234.gnucash
> C:\Documents and Settings\chihchieh.sun\桌面\gnucash\資產負債.gnucash
> 
> NG
> C:\Documents and Settings\chihchieh.sun\桌面\gnucash\資產負債表.gnucash
> 
> I found that the character "表" affects the function.
> 
> However, the paths shown in the GSettings history still look
> strange, as shown in the picture.
> 
> Is that normal?
> 
> C:\Documents and Settings\chihchieh.sun\桌面\gnucash\測試賬本.gnucash
> => C:\Documents and Settings\chihchieh.sun\獢\gnucash\皜祈岫鞈祆.gnucash
> 
> C:\Documents and Settings\chihchieh.sun\桌面\gnucash\測試賬本1234.gnucash
> => C:\Documents and Settings\chihchieh.sun\獢\gnucash\皜祈岫鞈祆
> 1234.gnucash
> 
> C:\Documents and Settings\chihchieh.sun\桌面\gnucash\資產負債.gnucash
> => C:\Documents and Settings\chihchieh.sun\獢\gnucash\鞈鞎.gnucash
> 
> C:\Documents and Settings\chihchieh.sun\桌面\gnucash\資產負債表.gnucash
> => C:\Documents and Settings\chihchieh.sun\獢\gnucash\鞈鞎銵?gnucash
> 

I can't think of any reason that U+8868 (表) would change the way the filename is handled. Does it make any difference if you change its position in the string?

Look at the number of characters: 桌面 becomes 獢, 測試賬本 becomes 皜祈岫鞈祆, and 資產負債表 becomes 鞈鞎銵: 2->3, 4->6, and 5->7. The stored representation is UTF-8, which encodes each of these characters (a single 2-byte UTF-16 code unit) as 3 bytes. Windows doesn't do UTF-8 unless explicitly told to, so it's interpreting, e.g., the 12 bytes of the middle string as 6 2-byte characters. What doesn't make sense to me is which encoding it's using to interpret them. If it were reading them as UTF-16, the 2-byte units that line up with the beginning of a UTF-8 triple would fall in Unicode's Private Use Area and wouldn't render. That's clearly not happening. It isn't reading them as CP936 or CP950 either.
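
The byte arithmetic above can be checked with a short Python sketch. This is illustrative only and is not GnuCash or Windows code: the helper names `utf8_pairs` and `reinterpret` and the list of candidate codecs are assumptions made for this example, and the decodings it prints are guesses to compare against the screenshot, not a diagnosis.

```python
# Illustrative only: count the UTF-8 bytes behind each name from the thread
# and try a few candidate decodings. Helper names are made up for this sketch.

SAMPLES = ["桌面", "測試賬本", "資產負債表"]
CANDIDATE_CODECS = ["cp950", "cp936", "utf-16-be"]  # guesses to try


def utf8_pairs(text):
    """Return (characters, UTF-8 bytes, whole 2-byte pairs, leftover bytes)."""
    raw = text.encode("utf-8")
    return len(text), len(raw), len(raw) // 2, len(raw) % 2


def reinterpret(text, codec):
    """Decode the UTF-8 bytes of `text` as `codec`, marking undecodable bytes."""
    return text.encode("utf-8").decode(codec, errors="replace")


for name in SAMPLES:
    chars, nbytes, pairs, leftover = utf8_pairs(name)
    print(f"{ascii(name)}: {chars} chars, {nbytes} bytes, "
          f"{pairs} pairs, {leftover} left over")
    for codec in CANDIDATE_CODECS:
        # ascii() keeps the output printable on consoles without CJK support
        print(f"  as {codec}: {ascii(reinterpret(name, codec))}")
```

The five-character name is the only one with an odd byte count (15 bytes, so 7 pairs plus one byte left over). Whether that leftover byte is related to the "?" that replaces the "." in the displayed path would need checking on the affected system.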

However, it doesn't matter as long as the values aren't changed by writing them to the registry and reading them back, because GSettings will understand them as UTF-8.
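
One way to reason about the "aren't changed by writing and reading back" condition is a small round-trip check. The sketch below is a hypothetical model, not the actual registry or GSettings code path: it assumes the stored value passes through a decode and re-encode step in some legacy codec, and the function name `survives_round_trip` is invented for this example.

```python
# Hypothetical model: UTF-8 bytes are decoded with a legacy codec when stored
# and re-encoded with the same codec when read back. This is NOT how the
# registry backend is known to work; it only illustrates the condition.

def survives_round_trip(text, storage_codec):
    """Return True if the UTF-8 bytes of `text` come back unchanged."""
    raw = text.encode("utf-8")
    stored = raw.decode(storage_codec, errors="replace")
    read_back = stored.encode(storage_codec, errors="replace")
    return read_back == raw


if __name__ == "__main__":
    name = "資產負債表.gnucash"
    for codec in ("latin-1", "cp950", "ascii"):
        print(codec, survives_round_trip(name, codec))
```

A codec that maps every byte to a character (such as Latin-1) preserves the bytes, so the value would still be valid UTF-8 when read back. A codec that has to substitute a replacement character for any byte does not.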

Regards,
John Ralls




More information about the gnucash-devel mailing list