[GNC] Fwd: Smaller backup files: Time division, Since or Incremental?

Stephen M. Butler Stephen.M.Butler51 at gmail.com
Tue Dec 22 19:08:13 EST 2020


I am addressing the backup issue from the point of view of a former 
Oracle Database Administrator and retired IT Manager.

1.  Multiple generations of backup for a file on the same media do you 
no good.  You need three to four generations of independent backups.  
That way, if you lose or destroy the most recent media, you can work 
back and recover your data from an earlier point in time.

2.  Backup files without the roll-forward logs do you little good.  You 
might recover the grandfather file but without the intervening 
transaction logs you can't get forward to the parent (much less the 
son/daughter).

3.  Depending on more than four generations of backup files indicates a 
poorly designed backup strategy.  Have you tried rolling forward a 
database (set of files) with 30 generations of logs?  Yes, I know most 
software is designed for a full backup once a month with 30 days of 
increments -- but even there the file you really need is on the most 
recent incremental and there are ways to get to it quickly.
         Note:  Yes, I had to roll forward a database from a month-old 
backup and I did have all 30 days of logs.  Not fun!  Stressful even!  
Not a quick job.  I sat the sys-admin down and we had a new strategy in 
place the next day.

4.  Once a particular file is on three or four independent backup media 
(think generations), ask yourself why you still need it on your main 
hard drive -- especially if it is only a backup file.

If you are planning for that one-in-a-million-years event, forget it.  
The rest of your business won't survive it and you won't need that data 
file anyway!

Yes, disk is cheap and having multiple sets of backups on independent 
disks (used to be tape) is great -- but a daily/weekly/monthly cleanup 
script to remove outdated material is also very cheap and very effective 
at keeping storage needs at a lower level.
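
As a rough illustration of such a cleanup script, here is a minimal 
Python sketch.  It assumes the backups follow the naming shown in the 
listing quoted below (master file name, a 14-digit timestamp, then 
".gnucash").  The function names, the default of four generations, and 
the dry-run flag are my own choices for illustration, not anything 
GnuCash provides.  Run it only after the files are safely on your 
independent media.

```python
import re
from pathlib import Path

# Matches names like "book.gnucash.20201218145806.gnucash":
# the master file name followed by a 14-digit timestamp.
BACKUP_RE = re.compile(r"^(?P<base>.+\.gnucash)\.(?P<stamp>\d{14})\.gnucash$")


def select_outdated(names, keep=4):
    """Return the backup names to delete, keeping the `keep` newest per master file."""
    groups = {}
    for name in names:
        m = BACKUP_RE.match(name)
        if m:  # the master file has no timestamp, so it is never selected
            groups.setdefault(m.group("base"), []).append((m.group("stamp"), name))
    outdated = []
    for stamped in groups.values():
        stamped.sort(reverse=True)  # newest timestamp first
        outdated.extend(name for _, name in stamped[keep:])
    return sorted(outdated)


def prune(directory, keep=4, dry_run=True):
    """Delete (or, with dry_run, only report) outdated backups in `directory`."""
    directory = Path(directory)
    names = [p.name for p in directory.iterdir() if p.is_file()]
    doomed = select_outdated(names, keep)
    for name in doomed:
        if dry_run:
            print("would remove", name)
        else:
            (directory / name).unlink()
    return doomed
```

Calling prune("GnuCash") only lists what would go; pass dry_run=False 
from a daily or weekly scheduled job once you trust the selection.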




On 12/22/20 1:15 PM, Geoff wrote:
> Hi David
>
> You said that:
> "One thing that complicates my gnucash life is the size and number of 
> backup files."
>
> Depending on your risk appetite, you could simplify your backup regime 
> by only backing up the master file itself:
>
> -rw-rw-r-- 1 dgp dgp 3433048 Dec 22 08:23 
> GnuCash/ubuntu-DGPickett.gnucash
>
> Or, that file plus the transaction log files (which are much much 
> smaller as they only contain the transactions you have entered in a 
> session).
>
> Regards
>
> Geoff
> =====
>
> On 23/12/2020 12:51 am, David G. Pickett wrote:
>> I am sure my daily stock quotes do not help (also very historical), 
>> but this just reflects our quiet retired life as a middle class 
>> couple, not in the 1%:
>>
>> dgp at dgp-p6803w:~
>> $ new GnuCash/*gnucash
>> -rw-rw-r-- 1 dgp dgp 3430699 Dec 18 14:52 
>> GnuCash/ubuntu-DGPickett.gnucash.20201218145806.gnucash
>> -rw-rw-r-- 1 dgp dgp 3430685 Dec 18 14:58 
>> GnuCash/ubuntu-DGPickett.gnucash.20201218150354.gnucash
>> -rw-rw-r-- 1 dgp dgp 3430760 Dec 18 15:03 
>> GnuCash/ubuntu-DGPickett.gnucash.20201219000126.gnucash
>> -rw-rw-r-- 1 dgp dgp 3431599 Dec 19 00:01 
>> GnuCash/ubuntu-DGPickett.gnucash.20201220000125.gnucash
>> -rw-rw-r-- 1 dgp dgp 3431594 Dec 20 00:01 
>> GnuCash/ubuntu-DGPickett.gnucash.20201221000126.gnucash
>> -rw-rw-r-- 1 dgp dgp 3431594 Dec 21 00:01 
>> GnuCash/ubuntu-DGPickett.gnucash.20201221121236.gnucash
>> -rw-rw-r-- 1 dgp dgp 3431870 Dec 21 12:12 
>> GnuCash/ubuntu-DGPickett.gnucash.20201221122809.gnucash
>> -rw-rw-r-- 1 dgp dgp 3432037 Dec 21 12:28 
>> GnuCash/ubuntu-DGPickett.gnucash.20201222000126.gnucash
>> -rw-rw-r-- 1 dgp dgp 3432967 Dec 22 00:01 
>> GnuCash/ubuntu-DGPickett.gnucash.20201222082339.gnucash
>> -rw-rw-r-- 1 dgp dgp 3433048 Dec 22 08:23 
>> GnuCash/ubuntu-DGPickett.gnucash
>> dgp at dgp-p6803w:~
>> $ file GnuCash/*gnucash
>> GnuCash/ubuntu-DGPickett.gnucash:                        gzip 
>> compressed data, from Unix, original size modulo 2^32 54774628
>> GnuCash/ubuntu-DGPickett.gnucash.20201001000150.gnucash: gzip 
>> compressed data, from Unix, original size modulo 2^32 53227358
>> GnuCash/ubuntu-DGPickett.gnucash.20201001103626.gnucash: gzip 
>> compressed data, from Unix, original size modulo 2^32 53242986
>> GnuCash/ubuntu-DGPickett.gnucash.20201001105516.gnucash: gzip 
>> compressed data, from Unix, original size modulo 2^32 53247987
>> GnuCash/ubuntu-DGPickett.gnucash.20201001110638.gnucash: gzip 
>> compressed data, from Unix, original size modulo 2^32 53249627
>>
>>
>> -----Original Message-----
>> From: Geoff <cleanoutmyshed at gmail.com>
>> To: David G. Pickett <dgpickett at aol.com>; gnucash-user at gnucash.org 
>> <gnucash-user at gnucash.org>
>> Sent: Mon, Dec 21, 2020 5:51 pm
>> Subject: Re: [GNC] Fwd: Smaller backup files: Time division, Since or 
>> Incremental?
>>
>> Not to my knowledge.
>>
>> Out of interest, how big is your XML file currently, and is it
>> compressed or uncompressed?
>>
>> Edit / Preferences / General /  Compress files.
>>
>> Thanks
>>
>> Geoff
>> =====
>>
>> On 22/12/2020 3:28 am, David G. Pickett via gnucash-user wrote:
>>  > Are there options to not replicate the full file?
>>  >
>>  >
>>  >    - Loading the data from multiple files would not take any
>>  >      appreciable additional time.
>>  >    - Loading the recent data first might save time, as really old
>>  >      data is off screen in most accounts.
>>  >    - The algorithm could even take into account keeping deeper
>>  >      time images of low churn accounts so the first page could be
>>  >      populated and the rest installed in the background.
>>  >
>>  >
>>  > -----Original Message-----
>>  > From: Geoff <cleanoutmyshed at gmail.com>
>>  > To: David G. Pickett <DavidGPickett at comcast.net>;
>>  >     gnucash-user at gnucash.org
>>  > Sent: Mon, Dec 21, 2020 4:30 am
>>  > Subject: Re: [GNC] Fwd: Smaller backup files: Time division, Since
>>  >     or Incremental?
>>  >
>>  > Have you considered only backing up the log files then? They are your
>>  > incrementals...
>>  >
>>  > https://www.gnucash.org/docs/v4/C/gnucash-guide/basics-backup1.html
>>  >
>>  > Geoff
>>  > =====
>>  >
>>  > On 21/12/2020 1:28 pm, David G. Pickett wrote:
>>  >>
>>  >>
>>  >>
>>  >> -------- Forwarded Message --------
>>  >> Subject:  Smaller backup files: Time division, Since or Incremental?
>>  >> Date:     Sat, 19 Dec 2020 14:40:54 -0500
>>  >> From:     David G. Pickett <DavidGPickett at comcast.net>
>>  >> To:       gnucash-devel at gnucash.org
>>  >>
>>  >>
>>  >>
>>  >> One thing that complicates my gnucash life is the size and number of
>>  >> backup files.
>>  >>
>>  >>     * It'd be nice, since most of the data is very historical, if
>>  >>       the data was divided by time into multiple files, more finely
>>  >>       in rear time.  Even if old files occasionally get updated by
>>  >>       new work, they would mostly be static.
>>  >>     * Another traditional way to keep backup sizes down is the
>>  >>       Since-the-last-full and the Incremental since the last
>>  >>       incremental.
>>  >>     * Since files are in xml, if they are line divided by
>>  >>       transaction or entry, text tools like good old sccs can
>>  >>       discern differences.
>>  >>
>>  >> _______________________________________________
>>  >> gnucash-user mailing list
>>  >> gnucash-user at gnucash.org <mailto:gnucash-user at gnucash.org>
>>  >> To update your subscription preferences or to unsubscribe:
>>  >> https://lists.gnucash.org/mailman/listinfo/gnucash-user 
>> <https://lists.gnucash.org/mailman/listinfo/gnucash-user>
>>  >> If you are using Nabble or Gmane, please see
>>  >> https://wiki.gnucash.org/wiki/Mailing_Lists 
>> <https://wiki.gnucash.org/wiki/Mailing_Lists>for more information.
>>  >> -----
>>  >> Please remember to CC this list on all your replies.
>>  >> You can do this by using Reply-To-List or Reply-All.
>>
>>  >


-- 
Stephen M Butler, PMP, PSM
Stephen.M.Butler51 at gmail.com
kg7je at arrl.net
253-350-0166
-------------------------------------------
GnuPG Fingerprint:  8A25 9726 D439 758D D846 E5D4 282A 5477 0385 81D8


