Search and Replace
Lincoln A Baxter
lab at lincolnbaxter.com
Mon Jan 4 20:56:13 EST 2016
On Sun, 2016-01-03 at 20:16 -0500, R. Victor Klassen wrote:
> I would like to elevate the question to something more along the line
> of a feature request.
>
> The scenario is as follows:
>
> During the high season, we have 5-6 invoices per week, largely
> containing the same items, but different quantities, and sometimes
> different prices.
> Occasionally auto-fill doesn’t happen - for what reason, I don’t
> know. It is usually in such an instance that an item may have the
> wrong account associated with it.
> But this means that accounts can from time to time get wrong.
>
> And then for the next - oh, 4-6 weeks, this is applied to every
> invoice with that item, thanks to autofill. I notice when one of the
> income accounts is surprisingly low or high (usually low, as I’m less
> likely to be suspicious if it is high).
>
> Here is where a search and replace would be wonderful. Find all
> occurrences of XXX in the description field of all invoices between
> date D1M1YYY1 and date D2M2YYY2 and change the account to AAA if it
> is not already. Possibly with a confirm requested on each
> occurrence.
OK, so this looks exactly like a problem I just solved last week! (with
a perl script).
In my case, I had used the ability to delete accounts and move all of their transactions to another account (higher in the hierarchy) to remove accounts that never had more than 2 or 3 transactions per year. But several months later, I decided I had gone too far and wanted to back it out, at least part way, because it turned out I had included in the consolidation payroll splits that happen twice a month, and that created too much noise for how I want to account for the other transactions.
So I needed to bulk move all of the splits referencing one account to a different account, using a regular expression match on the transaction description.
The script I wrote is attached, along with its text documentation. It does not look at transaction date ranges, but I think that would be useful to add. I'm planning to do some other transaction date processing (to divide up my GC file, which has YEARS of transactions in it, into files with older (archived) data and a current file with at least the most recent year), so adding date range searching is something I'd consider for this script. If others think they would use it, I'll see if I can do that sooner rather than later. I think the capability would be useful, and it would more narrowly target the moves to make, so it would add a degree of safety.
Lincoln
-------------- next part --------------
A non-text attachment was scrubbed...
Name: gc_move_splits.pl
Type: application/x-perl
Size: 15614 bytes
Desc: not available
URL: <http://lists.gnucash.org/pipermail/gnucash-user/attachments/20160104/3c207e98/attachment.pl>
-------------- next part --------------
NAME
gc_move_splits.pl
COPYRIGHT
Copyright (C) 2016 Lincoln A Baxter
This program is free software: you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation, either version 3 of the License, or (at your
option) any later version.
This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General
Public License for more details.
Please see the GNU General Public License at
<http://www.gnu.org/licenses/>
ABSTRACT
Move transaction splits from one account in the GC file to another.
Useful if you have consolidated accounts, and then discovered you want
to separate out some of the transactions into different accounts.
SYNOPSIS
gc_move_splits.pl [options] GC-file.xml [modified-GC-file.xml]
options:
--help print this synopsis help text
--man print the full man page (this file's POD)
--verbose increased verbosity (mainly for debugging)
--fromAcctSplit=AccountPath path of account the split should be moved FROM
--toAcctSplit=AccountPath path of account the split should be moved TO
--dumpAccount=file|- create a dump of account paths and guids
--description=regex Only move splits in transactions whose description matches regex;
--description may be repeated to specify more than one regex
--pathSeparator=char GnuCash account path separator (default=:)
The second file argument is optional. With no destination output
file, gc_move_splits.pl runs in analytical/trial mode, and reports all
actions taken and produces all specified analysis outputs.
The source input file is never modified.
If your gnucash data file is compressed you must uncompress it first (on
unix based OSes) as follows:
cat gnucash_data_file | gunzip > uncompressed_gnucash_file.xml
Or you can just uncheck the "compressed file" option in GnuCash and
save.
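For platforms without gunzip, the same step can be sketched in Python using only the standard library. This is an illustration, not part of gc_move_splits.pl: the function name uncompress_if_needed is hypothetical, and the sketch assumes a compressed GnuCash file is an ordinary gzip stream, which is what the gunzip command above implies.

```python
import gzip
import shutil


def uncompress_if_needed(src, dst):
    """Copy src to dst, gunzipping on the way if src is gzip-compressed.

    Hypothetical helper. Returns True if the input was compressed.
    """
    # Every gzip stream starts with the two magic bytes 0x1f 0x8b.
    with open(src, "rb") as probe:
        is_gzip = probe.read(2) == b"\x1f\x8b"

    # Pick the matching opener; the source file is only ever read.
    opener = gzip.open if is_gzip else open
    with opener(src, "rb") as fin, open(dst, "wb") as fout:
        shutil.copyfileobj(fin, fout)
    return is_gzip
```

Like the script itself, this writes a new file and leaves the original untouched.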
DISCUSSION
The script reads an uncompressed "version 2" GnuCash XML datafile and
writes a new file with the modified splits.
Splits to be moved are found by:
1. Getting the GUID of the account from which the split is to be removed
2. Getting the GUID of the account into which the split should be moved
The script then traverses all transactions in the file:
1. finding transactions with descriptions matching the input regex
2. finding the split that references the GUID to be removed
3. replacing that GUID with the GUID of the destination account
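The traversal steps above can be sketched as follows. This is a Python illustration using the standard library's ElementTree, not the attached Perl/XML::LibXML code. The namespace URIs and element names (gnc:transaction, trn:description, trn:splits, trn:split, split:account) are assumptions based on the usual layout of a version 2 GnuCash XML file, the function name move_splits is hypothetical, and the two GUIDs are taken as already looked up.

```python
import re
import xml.etree.ElementTree as ET

# Assumed namespace URIs for a GnuCash v2 XML file.
NS = {
    "gnc": "http://www.gnucash.org/XML/gnc",
    "trn": "http://www.gnucash.org/XML/trn",
    "split": "http://www.gnucash.org/XML/split",
}


def move_splits(root, from_guid, to_guid, patterns):
    """Retarget splits from from_guid to to_guid (hypothetical helper).

    Only transactions whose description matches at least one of the
    given regex patterns are touched. Returns the number of splits moved.
    """
    regexes = [re.compile(p) for p in patterns]
    moved = 0
    for trn in root.iter("{%s}transaction" % NS["gnc"]):
        desc = trn.findtext("trn:description", default="", namespaces=NS)
        # Step 1: skip transactions whose description does not match.
        if not any(r.search(desc) for r in regexes):
            continue
        # Step 2: find splits that reference the account to be removed.
        for acct in trn.iterfind("trn:splits/trn:split/split:account", NS):
            if acct.text == from_guid:
                # Step 3: point the split at the destination account.
                acct.text = to_guid
                moved += 1
    return moved
```

A caller would parse the file with ET.parse(path), pass tree.getroot() to move_splits, and write the result to a new file with tree.write(new_path), so the input is never overwritten.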
The script reads the GnuCash XML file *as XML*, not as text. It uses
the XML::LibXML CPAN Perl module to read, traverse, modify, and output a
new XML file. It is not dependent on the formatting of the GnuCash XML,
unlike most Perl scripts this author has seen on the GnuCash users email
list.
Because this script does not treat the GnuCash file as text, it is not
subject to the breakage that would occur if the formatting of the XML
data were to change.
Instead, this script reads the GnuCash data into a DOM structure, and
then manipulates that structure.
The script does not modify the input GnuCash XML datafile. To save the
changes, the user must specify an output data filename. The results of
the script's operations are written to this file, which should then be
opened and checked in GnuCash before replacing the original file.
To print a usage synopsis: gc_move_splits.pl --help
To print the synopsis, plus option descriptions: gc_move_splits.pl
--help --verbose
To print the entire man page: gc_move_splits.pl --man
ENVIRONMENT
Because gc_move_splits.pl reads the input XML file *as XML* using the
perl CPAN XML::LibXML module, using an XPath expression to find the
bayes matching data slots in each account, the script requires that your
perl environment have the CPAN XML::LibXML module installed.
The command
perl -c -MXML::LibXML </dev/null
will report an error if the module is not installed in your environment.
Of course the script will report this also, because if it is not
present, the script will not compile.
Unix/Linux environments
Most Linux distributions make this available via their standard
package managers. On Debian-based distributions this can be installed
with the following command:
sudo apt-get install libxml-libxml-perl
On Unix/Linux environments this script should be made executable with
the chmod command
chmod +x gc_move_splits.pl
Windows environments
XML::LibXML is also available in ActivePerl and in the Cygwin
environment.
This author has not tested this script on windows, but knows of no
reason why it will not work, once the required environment is installed.
On windows, the easiest way to run the script would be by using perl
from a cmd prompt:
perl gc_move_splits.pl
Macintosh environments
This author is not familiar with the OS X environment, but knows it is
based on BSD Unix. The script should run from a terminal prompt once
the requisite XML::LibXML module is installed in the Perl environment.
Patches to these instructions are welcome.
OPTIONS
All options may be abbreviated as long as the option is distinct from
all other options.
--description='regex'
This regular expression is used for identifying transaction splits to be
moved or retargeted to another account.
This switch may be specified multiple times to specify multiple Regexes:
--descri='^Check' --descr='United Way'
The first will find "Check" at the beginning of a transaction
description. The second will find "United Way" anywhere in the
transaction description.
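The difference between the anchored and unanchored patterns can be checked with a short Python sketch (Python's re module treats these two simple patterns the same way Perl does). The helper name description_matches is hypothetical, and the sketch assumes that a transaction qualifies when any one of the supplied patterns matches, which the text above does not state explicitly.

```python
import re

# The two example patterns from the option description above.
patterns = [re.compile(p) for p in (r"^Check", r"United Way")]


def description_matches(description):
    """True if any pattern matches (assumed semantics, hypothetical helper)."""
    return any(p.search(description) for p in patterns)
```

With these patterns, "Check 1042" and "Donation to United Way" match, while "Deposit Check" does not, because "^Check" only matches at the start of the description.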
--verbose
Print very verbose output to STDOUT. Used for debugging. Don't bother.
Besides, this is also not very well implemented in this script at this
time.
--help (or -h)
Print help text.
--path(Separator)=:
Specifies the character used as the path separator in your GC datafile.
The default value is a colon (:)
--man
Print the full man page documentation
--fromAcctSplit=
Full path of account we want to move the split FROM.
--toAcctSplit=
Full path of account we want to move the split TO.
--description
Specifies the regex that will be looked for in the transaction
description. May be repeated.
--dumpAcct=file|-
Dump the full paths for all the accounts in the GC file to either the
file specified or to STDOUT, if "-" is used.
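How full account paths relate to GUIDs can be sketched in Python with the standard library. This is an illustration, not the script's own code: it assumes each gnc:account element carries act:name, act:id, act:type and (except for the root) act:parent children, as in a typical version 2 GnuCash XML file, and that the root account is left out of the path. The function name account_paths is hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs for a GnuCash v2 XML file.
NS = {
    "gnc": "http://www.gnucash.org/XML/gnc",
    "act": "http://www.gnucash.org/XML/act",
}


def account_paths(root, sep=":"):
    """Return {full_path: guid} for every non-root account (hypothetical)."""
    info = {}
    for acct in root.iter("{%s}account" % NS["gnc"]):
        guid = acct.findtext("act:id", namespaces=NS)
        info[guid] = (
            acct.findtext("act:name", namespaces=NS),
            acct.findtext("act:parent", namespaces=NS),  # None for the root
            acct.findtext("act:type", namespaces=NS),
        )

    def path_of(guid):
        # Walk up the parent chain, stopping at the root account.
        parts = []
        while guid in info:
            name, parent, acct_type = info[guid]
            if acct_type == "ROOT":
                break
            parts.append(name)
            guid = parent
        return sep.join(reversed(parts))

    return {path_of(g): g for g, (_, _, t) in info.items() if t != "ROOT"}
```

Printing the resulting dictionary gives a listing of account paths and GUIDs similar in spirit to what the dump option describes, and looking up a path in it yields the GUID needed for moving splits.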
AUTHOR
Lincoln A. Baxter email: my initials (all three) (at) lincolnbaxter (dot)
com