[GNC] A Milk Run

David Cousens davidcousens at bigpond.com
Fri Apr 24 20:59:15 EDT 2020


flywire,
I have copied this to the gnucash-user mailing list (see 5 below)

1.  To get the documentation looking good and available in multiple formats you end up having to use a fairly
sophisticated product, and consequently there is a learning curve. I didn't find DocBook all that hard to come to grips
with, and I hadn't used it before starting to work on the documentation a couple of years ago. I have by no means fully
explored its capabilities.  Everyone will want to do it in the system they are already most familiar with, but in a
situation like GnuCash, with a largely part-time volunteer team, I think it is better to settle on a system (the best
choice at the time) and stick with it as long as possible, until changes in requirements and/or technology force you to
reassess.  I also have a small book publishing enterprise where I use Scribus to format the books (poetry) for
printing and simultaneous EPUB and PDF creation, and it works well for that operation. GnuCash is a different
requirement, where there is a need for interaction between the code development and documentation communities.
2.  Both the Help manual and the Tutorial and Concepts Guide have grown over time, and somewhat haphazardly at times.
About 18 months ago, within the group of general contributors, there was a push to make the Help manual more of a
technical reference, i.e. more of a GUI description, and to orient the Guide more towards illustrating how to
use GnuCash to achieve particular accounting objectives (Personal Finances, Business Features). The importing section
was originally in the Guide because it was possibly less core to the use of GnuCash than it has now become with online
and electronic access, and it was developed as an adjunct to manual entry of transactions.  Again, the volunteer team is
fairly small, so progress is not rapid. Keeping up with the changes in the code can be a major challenge. I also do some
programming and have an accounting background, so I can poke around in the code to work out what is actually happening,
which helps a bit with documenting.
3. Certainly happy to have a good look at it and incorporate it into the guide. 
4. The transaction matching has two components. The first is the avoidance of duplicate transactions. The second is the
assignment of the second account for the transaction, which is not normally specified explicitly in a bank statement
record.  The Bayesian approach in GnuCash works well on this second problem. However, you need to understand how it
works to optimize it. One of the reasons I started working on the documentation of the importing was that the
documentation was pretty poor and I didn't understand what was happening myself. You cannot import your file with a
thousand transactions in one hit and expect GnuCash to correctly assign accounts.

The account assignment is done by tokenizing information in the description, date and amount fields of the transaction
and constructing a table of the frequency of occurrence of the tokens for each account that has been assigned as the
second account.  When a transaction is imported, its tokenized information is compared with the frequency table, a
score of the matches of the tokens with each possible account assignment is calculated, and the account with the
highest score is selected and presented as the assigned account.  You can manually override that automatic assignment
in the matcher window.  When all of the transactions displayed have had the correct accounts assigned to them and you
click the OK button on the matcher window, the token data is written to the frequency table in the data file, and only
at that point. If you have never imported data, that table is totally empty.  Note: The frequency table contains no
information derived from transactions which may have already been recorded manually in GnuCash without using the import
matcher.

The best strategy is to import data in small batches at first, making sure that you manually assign the correct
accounts in each case before hitting OK to actually import the data. It is only after OK is clicked that the frequency
table in the data file is updated. If you import data with incorrect account assignments, or leave transactions which
are assigned to the Imbalance accounts in the import, you are training the system to assign the wrong accounts.  You
should notice after a few successful imports that GnuCash's guesses at the account improve and most transactions are
assigned to the accounts you want.  At this point you can start increasing the size of the batches of data
you import. Splitting a CSV file up is fairly easy in a text editor.
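
For anyone who would rather not split the file by hand, the batching can be sketched in a few lines of Python. This is
only a minimal sketch: it assumes the bank export has a single header row, and the function name split_csv, the batch
file naming and the example file names are all made up for illustration. Nothing here is part of GnuCash.

```python
import csv
from pathlib import Path


def split_csv(source, batch_size, out_dir):
    """Split a CSV export into numbered files of at most batch_size data rows.

    Assumes the first row is a header; it is repeated at the top of every batch.
    Returns the list of batch file paths in order.
    """
    source = Path(source)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    with source.open(newline="") as f:
        reader = csv.reader(f)
        header = next(reader, None)
        if header is None:  # empty file: nothing to split
            return []
        rows = list(reader)

    paths = []
    for start in range(0, len(rows), batch_size):
        number = start // batch_size + 1
        path = out_dir / f"{source.stem}_batch{number:03d}.csv"
        with path.open("w", newline="") as out:
            writer = csv.writer(out)
            writer.writerow(header)
            writer.writerows(rows[start:start + batch_size])
        paths.append(path)
    return paths
```

Calling split_csv("statement.csv", 50, "batches") would write statement_batch001.csv, statement_batch002.csv and so on
into a batches directory, each of which can then be imported and checked in the matcher before moving on to the next.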

If you have not been completing the imports as described, or have been correcting the account assignments in GnuCash
after importing, your data file is going to contain frequency table information which will misdirect the account
assignment.  Tools->Import Map Editor allows editing of the stored tokens. Any associations with Imbalance accounts
should be deleted. This is a relatively new feature and is on my list of future documentation projects. Use with
caution. I improved the matching performance considerably by editing out data for files which were being assigned
incorrectly fairly frequently.  The matcher is never going to work perfectly unless the imported data explicitly
specifies the second account for the transaction.  In this case GnuCash also constructs a map of accounts specified in
a Transfer account field to specific accounts in the GnuCash internal account hierarchy.
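
To make the effect of bad training data concrete, here is a toy model of a token frequency table in Python. It is a
sketch under loud assumptions, not GnuCash's code: the class ToyMatcher and its methods are invented for illustration,
only the description is tokenized, and the score is a plain sum of token counts rather than the Bayesian calculation
GnuCash actually performs. The account and payee names are made up as well.

```python
from collections import defaultdict


def tokenize(description):
    """Split a description into lower-case word tokens (a simplification)."""
    return description.lower().split()


class ToyMatcher:
    """Toy token -> account frequency table. Illustrative only, not GnuCash code."""

    def __init__(self):
        # table[token][account] = number of confirmed imports with that pairing
        self.table = defaultdict(lambda: defaultdict(int))

    def train(self, description, account):
        """Record a confirmed assignment (what clicking OK does in this model)."""
        for token in tokenize(description):
            self.table[token][account] += 1

    def guess(self, description):
        """Return the account with the highest summed token count, or None."""
        scores = defaultdict(int)
        for token in tokenize(description):
            for account, count in self.table.get(token, {}).items():
                scores[account] += count
        if not scores:
            return None
        return max(scores, key=scores.get)

    def forget_accounts(self, prefix):
        """Delete every stored association whose account name starts with prefix."""
        for token in list(self.table):
            for account in list(self.table[token]):
                if account.startswith(prefix):
                    del self.table[token][account]
            if not self.table[token]:
                del self.table[token]
```

In this model, training "CORNER GROCER" twice against Imbalance-AUD and once against Expenses:Groceries makes
guess("CORNER STORE") return Imbalance-AUD. After forget_accounts("Imbalance"), the same call returns
Expenses:Groceries, which is the kind of improvement that deleting the Imbalance associations is meant to achieve.
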
5. Are you on the GnuCash mailing lists (https://lists.gnucash.org/mailman/listinfo)? The gnucash-user list and the
various language lists are generally the most useful for new users.  The gnucash-devel list is mainly for the
developers/documenters. Click on the link for a list and the next page will be a signup/registration page. There is
also information about using the lists in the GnuCash wiki pages: https://wiki.gnucash.org/wiki/Mailing_Lists.  The
mailing lists can also be accessed through various web interfaces described in the wiki page, which require a separate
registration. Some of the users are old-style purists who regard web pages as some sort of arcane magic.  Some of the
developers also use IRC for discussion while working on the code and do sometimes field user questions there, although
most support is via the mailing lists as they have a searchable archive.
Cheers David
On Sat, 2020-04-25 at 08:57 +1000, flywire wrote:
> I wanted to run a few things past you.
> What are your thoughts on Bug 722016 - We should change the Documentation file format?
> Re Help Manual and Concepts Guide. You know about documentation systems. Electronic systems make some things a lot
> harder and the tables of commands, eg Table 4.1. Account Tree - File Menu - Access to file, just kills that document
> for me. That content seems more like reference material.
> If you can handle Importing data example #132 use case I'd be pleased for someone to pick it up. I've documented my
> thoughts fairly well.
> I have a bigger issue. I have a few thousand transactions in a bank statement csv. How do I get through the
> transaction matching process?
> How to communicate. Support is: Bug Lists, IRC (Does anyone use IRC anymore?). There is no forum for discussion or
> even https://discordapp.com / https://money.stackexchange.com and Issues aren't enabled in
> https://github.com/Gnucash/gnucash-docs. There's a generation issue here, your grandkids have a different view of how
> communication works.
> Stay safe
-- 
Dr David R Cousens
B.Sc, M.Prof. Acc., Ph.D., G.C.Ed

