save corrupted gnucash file
Zhang Weiwu
zhangweiwu at realss.com
Thu Sep 23 22:00:25 EDT 2010
Hello. This story has to be told as a timeline. Skip the indented content
for fast reading, but if you decide to reply, please do read the
indented details. Thanks in advance!
2010-08-18: I Found Invoices With Duplicated IDs, And "Fixed" Them By
Manually Editing The GnuCash XML File.
Detail:
After using GnuCash for 2 years, all invoice numbers from 000000 to
000044 are taken, and I found two problems:
1. Invoices with duplicated IDs exist. There are two invoices
numbered 000012, two numbered 000020 and two numbered 000028. In
each case, one is posted and paid, the other is not.
2. A customer with a duplicated ID was found. One customer has two
records, both with ID 000011. One has many invoices associated
with it, the other has none.
It is not clear how automatically generated IDs can end up
duplicated, but the duplicates must have been created manually
rather than by GnuCash itself, because each duplicate always differs
in content from the other record with the same ID, and the
differences are of a kind that only a human would introduce (e.g.
"Service" and "Services").
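A minimal sketch of how such duplicates could be listed from the command line. It assumes an uncompressed GnuCash XML file in which every ID element sits on a single line, and it assumes the element names `invoice:id` and `cust:id`; the function name `list_duplicate_ids` is made up for illustration.

```shell
# Print every value of <TAG ...>value</TAG> that occurs more than once in FILE.
# Assumption: FILE is uncompressed XML and each ID element is on one line.
list_duplicate_ids() {
    file=$1
    tag=$2
    grep -o "<$tag[^>]*>[^<]*</$tag>" "$file" |   # keep only the ID elements
        sed -e 's|<[^>]*>||g' |                   # strip the tags, leave the value
        sort | uniq -d                            # print values seen at least twice
}
```

Called as `list_duplicate_ids books.xml invoice:id` or `list_duplicate_ids books.xml cust:id`, where `books.xml` is a placeholder file name. It only reports the repeated IDs; deciding which record of each pair is the real one still needs a human.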
I fixed it this way:
1. Emptied all entries on the duplicated, not-posted invoices.
2. Backed up the current GnuCash file.
3. Having worked with XML documents for several years, I edited the
GnuCash file directly, removing the duplicated not-posted
invoices and the duplicated customers that had no invoices, by
studying the XML structure and deleting the entries. I might
also have removed a few references to those entries.
4. Started GnuCash and checked that these duplicated invoices and
customers were really gone. But I did not check whether I had
lost other invoices too.
2010-09-21: More Than A Month Later I Found The Previous XML Fix Might
Be Wrong, But Is A Month Later Too Late To Repair?
I checked the backup I made on 2010-08-18 and found that invoices
000039 to 000044, present in that backup, are gone from my current
working file. It is hard to explain /why/ they are gone, but I might
have an answer for /when/. A check of the .xac files shows they were
already gone in the earliest .xac file, from 2010-08-23. The period
between 2010-08-18 and 2010-08-23 is unclear, because the files from
that period have been removed by GnuCash (GnuCash wipes historical
records more than a month old). Thus it is not clear whether the
invoices were accidentally deleted by a user or lost as a result of
my modification to the XML source. One thing is clear: if any user
operation was done on the GnuCash file between 2010-08-18 and
2010-08-23, it must be so insignificant that we can afford to lose
it, because all our customers were on holiday during that period.
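The comparison between the backup and the working file can be sketched as follows. This is only an illustration, assuming both files are uncompressed XML with each ID element on one line and that invoice IDs live in an `invoice:id` element; `missing_ids` is a hypothetical helper name.

```shell
# Print the values of TAG that are present in OLD but absent from NEW.
# Assumption: both files are uncompressed XML, one ID element per line.
missing_ids() {
    old=$1
    new=$2
    tag=$3
    old_ids=$(mktemp)
    new_ids=$(mktemp)
    grep -o "<$tag[^>]*>[^<]*</$tag>" "$old" | sort -u > "$old_ids"
    grep -o "<$tag[^>]*>[^<]*</$tag>" "$new" | sort -u > "$new_ids"
    # comm -23 keeps lines unique to the first (sorted) input.
    comm -23 "$old_ids" "$new_ids" | sed -e 's|<[^>]*>||g'
    rm -f "$old_ids" "$new_ids"
}
```

One caveat: an ID that was lost and later handed out again to a new invoice looks "present" in both files, so a comparison by ID under-reports the loss. Comparing on a per-record unique identifier instead (assumed here to be an `invoice:guid` element) would avoid that.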
During the last month, one Bill was created and paid, and one
Invoice was created and posted but not paid. Their IDs are 000039
and 000040, overlapping with the lost invoices.
End of story. What follows are my attempts to solve this problem.
I can think of two approaches:
1. Recover the lost invoices, i.e. move the lost invoices from the
backup of 2010-08-18 into the current working file.
2. Replay the changes, i.e. take the backup of 2010-08-18 and replay
all changes from the log files onto it.
I believe recovering the invoices is a very difficult task, because of
the multiple complicated internal XML references to these paid invoices.
Replaying the changes sounds much easier, even counting the effort of
solving the duplicated invoices/customers problem again.
I tried to replay all changes and failed.
Detail:
To replay all changes, I first needed to collect them. I did it with this:
$ head -n 2 rss_gnucash_20100923210659.log > /tmp/merged.log
$ for i in rss_gnucash_2010082714*.log rss_gnucash_201009[01]*.log; do
>   tail -n +2 "$i" | sed -e 's/000039/000045/' -e 's/000040/000046/' >> /tmp/merged.log
> done
Note that the 'sed' statement shifts the numbers of the invoice and
bill created in the last month. I have double-checked with diff that
'sed' did not make any unintended replacement in this case.
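A check of that kind could look like the sketch below, which prints only the lines that the two substitutions would change. The function name `preview_shift` is hypothetical; the sed expressions are the ones from the command above.

```shell
# Show the lines of LOG that the ID shift would alter, in diff format
# ("<" is the original line, ">" is the line after substitution).
# Like diff itself, the function returns 1 when something would change.
preview_shift() {
    log=$1
    sed -e 's/000039/000045/' -e 's/000040/000046/' "$log" | diff "$log" -
}
```

Two things are worth looking for in the output: the patterns are not anchored, so any longer string that happens to contain 000039 or 000040 (part of a GUID or an amount, for example) would also be rewritten, and without the `g` flag only the first match on each line is replaced.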
Then I loaded GnuCash with the backup of 2010-08-18, asked it to import
merged.log, and saved. Now:
1. In the saved GnuCash XML file, the lost invoices are still
there, and the bill and invoice newly created in the last month
are also in the XML file.
2. If I load GnuCash with this file and search for all invoices, the
newly created bill and invoice do not appear in the list, so
they are somehow hidden. I checked the transactions of the bill
and found 3 transactions associated with it.
So the replay was not successful.
What do you suggest I do from here? Try a different way to replay the
logs? Manually re-enter everything from the last month by reading the log
files? Or move the lost invoices over from the backup?
I will go for manual re-entry if no good way out exists. It might take 2
working days to re-enter everything from the last month.
Thanks in advance!