File signatures??

matt at considine.net matt at considine.net
Thu Jun 29 08:52:03 EDT 2017


Hi Maf.

Thanks for the header string from a custom report - that is a promising 
search string.

I was in fact referring to the Chart of Accounts and agree that if I can 
find the data file I should be fine.

The problem is that the recovery operation (using Testdisk/Photorec) 
results in files and file fragments that may or may not be correctly 
identified by file extensions.  And in the vast majority of cases the 
filename is of the form 'f765987234' plus an extension.  If the filetype 
couldn't be determined from a list of 400 or so file signatures that 
Photorec has then it gets a '.txt' extension.

And in all cases the file folder structure is lost, replaced instead by 
folders that have 500 or so recovered files and file fragments.

PDFs, JPGs, etc seem to be largely recovered, with the identification 
and sorting challenge still ahead.  Problematic are files that Photorec 
doesn't recognize ahead of time or which have no real "signature".  And 
things like an encrypted database of passwords will probably only be 
found by dumb luck.

So if I can identify the data file, you're right, I should be fine.  
It's finding it that is the problem, assuming it's not corrupted.  That 
has led me to focus on a "plan b" which would be a process for finding 
backups of transactions, plus something describing the Chart of 
Accounts that I could ideally import, plus any customized reports or 
styling.  Which is why I'm hunting for strings that could be considered 
"signatures."
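
A minimal Python sketch of a header check, assuming the first lines of
the files look like the "head" output Maf posted (quoted below): an XML
declaration followed by "<gnc-v2" for an uncompressed data file, and a
row of semicolons followed by "saved report" or "gnc:report" for the
saved-reports file.  The function names and the returned labels are
made up for illustration, and the check is only a heuristic; a fragment
that lost its first block will come back as "unknown".

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def classify_header(head: bytes) -> str:
    """Guess the file type from its leading bytes (heuristic only)."""
    text = head.lstrip()
    # Uncompressed data file: XML declaration followed by the <gnc-v2 root.
    if text.startswith(b"<?xml") and b"<gnc-v2" in text:
        return "gnucash-xml"
    # Saved reports: a row of semicolons, then Scheme mentioning reports.
    if text.startswith(b";;;;") and (
        b"saved report" in text or b"gnc:report" in text
    ):
        return "saved-reports"
    return "unknown"


def classify_file(path, nbytes: int = 4096) -> str:
    """Classify one recovered file, looking inside gzip data if needed."""
    with open(path, "rb") as fh:
        head = fh.read(nbytes)
    if not head.startswith(GZIP_MAGIC):
        return classify_header(head)
    try:
        with gzip.open(path, "rb") as gz:
            inner = gz.read(nbytes)
    except (OSError, EOFError, zlib.error):
        return "gzip:unreadable"  # truncated or damaged fragment
    return "gzip:" + classify_header(inner)
```

Calling classify_file() on each file in a recovery folder and keeping
everything that is not "unknown" should give a short list to inspect by
hand.  Only the first 4096 bytes are read, so it stays cheap even on
large files.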

In going through the forum, I've identified
   'trans_guid', 'acc_guid', 'split_guid', 'PluginPageAccountTree', 
'<act:id', '<trn:id', '<split:id', 'gnc:account version', 
'gnc:transaction version', '<trn:split>'
as likely-unique "markers" for gnucash-related files, showing up for 
example in some .xml.gz files that I have yet to explore.
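
A rough sketch of how the marker search could be automated in Python,
assuming the recovered files sit in ordinary folders.  The marker list
is the one above; the function names are hypothetical.  Files that
start with the gzip magic bytes are decompressed first (as far as the
data allows), everything else is searched as raw bytes.  Each file is
read fully into memory, so very large files would need a chunked
variant.

```python
import zlib
from pathlib import Path

# Marker strings collected from the forum (see the list above).
MARKERS = [
    b"trans_guid", b"acc_guid", b"split_guid", b"PluginPageAccountTree",
    b"<act:id", b"<trn:id", b"<split:id", b"gnc:account version",
    b"gnc:transaction version", b"<trn:split>",
]


def read_payload(path) -> bytes:
    """Return file contents, gunzipped when the gzip magic is present."""
    data = Path(path).read_bytes()
    if data[:2] == b"\x1f\x8b":
        # 16 + MAX_WBITS tells zlib to expect a gzip header.
        inflater = zlib.decompressobj(16 + zlib.MAX_WBITS)
        try:
            return inflater.decompress(data)
        except zlib.error:
            return data  # damaged stream: fall back to the raw bytes
    return data


def count_markers(data: bytes) -> dict:
    """Map each marker that occurs in data to its number of occurrences."""
    return {m.decode(): data.count(m) for m in MARKERS if m in data}


def scan_tree(root):
    """Yield (path, hits) for every file under root with at least one marker."""
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            hits = count_markers(read_payload(path))
            if hits:
                yield path, hits
```

Looping over scan_tree() for one recovery folder and printing each path
with its hit counts would show which fragments look like transaction
logs ('trans_guid', 'split_guid') and which look like pieces of the XML
data file ('<trn:id', '<act:id').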

So I'm hopeful that I can identify a minimum number of files that I can 
manually import or stitch together to get back up and running.

When I do so, I'm happy to detail the process, assuming it's not 
considered too far off topic.

Thanks for your help and everyone's patience.
Matt

On 2017-06-29 07:13, Maf. King wrote:
> Hi Matt.
> 
> What do you mean by category structure?  If you're talking about the
> Chart of Accounts (ie the "tree"), that is stored in the data file.
> Recover it (or one of the auto-backup copies in the same directory) and
> I'd expect all to be well.
> 
> Custom reports are in ~/.gnucash/saved-reports-2.4 - this head extract
> from mine suggests that a sequence of semicolons might be a good search
> string?  or "gnc:report" etc.
> 
> <SNIP>
> maf at janus:~/.gnucash> head saved-reports-2.4
> ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;
> ;; Options for saved report "VAT Box 1 - Last Qtr", based on template
> "2fe3b9833af044abb929a88d5a59620f"
> (let ()
>  (define (options-gen)
>   (let ((options (gnc:report-template-new-options/report-guid
> "2fe3b9833af044abb929a88d5a59620f" "Transaction Report")))
> 
> ; Section: General
> 
> (let ((option (gnc:lookup-option options
> 
> <SNAP>
> 
> Maf.
> 
> 
> On Thursday, 29 June 2017 11:38:28 BST MattC wrote:
>> Thank you.  That is exactly what I'm looking for.  Fwiw, it appears that
>> searching for "trans_guid" identifies log files/fragments.  So I'm hoping
>> that piecing those together might be a successful "plan b".  I think that
>> leaves the category structure to try to figure out, as well as - though
>> less importantly - customized reports.
>> 
>> I'll report back how it goes, but if anyone has suggestions for other
>> files to search for, I'd appreciate the input.
>> 
>> Matt
>> 
>> 
>> -------- Original message --------
>> From: "Maf. King" <maf at chilwell.net>
>> Date: 06/29/2017  4:31 AM  (GMT-05:00)
>> To: gnucash-user at gnucash.org
>> Cc: matt at considine.net
>> Subject: Re: File signatures??
>> 
>> On Wednesday, 28 June 2017 14:40:09 BST matt at considine.net wrote:
>> > Hi,
>> >
>> > I think this is the right venue to ask this question.  If not, I can
>> > hopefully get a pointer to where else to turn.
>> >
>> > I need to figure out what - if any - file signatures could be used to
>> > identify gnucash data files.  The need arises from a harddisk crash and
>> > recovery effort, the result of which was a *lot* of files and file
>> > fragments recovered, but at the expense of the harddisk's directory
>> > structure and filenames (for the most part).  The harddisk in question
>> > has terabytes of data on it, so going through the disk manually is not
>> > practical.
>> >
>> > On this disk were the data files for a non-profit which had a somewhat
>> > customized account tree structure.  What I am trying to figure out is if
>> > there are any unique headers to a minimum number of files that could be
>> > used to recreate the transactions and other data in gnucash?  If there
>> > are keywords or byte strings I can use, then I can use disk search tools
>> > to look for the files and fragments that are relevant and try to stitch
>> > things back together.
>> >
>> > FWIW, I believe the account data was stored as XML rather than in a
>> > database.  And the version of gnucash I was using was whatever version
>> > was stable at the beginning of this calendar year.
>> >
>> > Thanks in advance for any help or pointers.
>> >
>> > Matt
>> >
>> > PS I already understand the wisdom of having some backup elsewhere, so I
>> > can forgo that pointer.  The problem in this case was that this unit was
>> > also the backup.
>> 
>> Hi Matt
>> 
>> from my system:
>> 
>> ~> head myfile.gnucash
>> <?xml version="1.0" encoding="utf-8" ?>
>> <gnc-v2
>>      xmlns:gnc="http://www.gnucash.org/XML/gnc"
>>      xmlns:act="http://www.gnucash.org/XML/act"
>>      xmlns:book="http://www.gnucash.org/XML/book"
>>      xmlns:cd="http://www.gnucash.org/XML/cd"
>>      xmlns:cmdty="http://www.gnucash.org/XML/cmdty"
>>      xmlns:price="http://www.gnucash.org/XML/price"
>>      xmlns:slot="http://www.gnucash.org/XML/slot"
>>      xmlns:split="http://www.gnucash.org/XML/split"
>> 
>> Now, this is an uncompressed gc file from v 2.6.16, but dating back
>> years; a file created this year has exactly the same first few lines
>> though.
>> 
>> If your file was saved with compression turned on, then your task is
>> probably harder - look for gz compressed files.
>> 
>> HTH,
>> Maf.