[GNC-dev] Normalizing live data, a suggestion for discussion

David Carlson david.carlson.417 at gmail.com
Sat Feb 2 10:40:34 EST 2019


On Sat, Feb 2, 2019, 9:25 AM Geert Janssens <geert.gnucash at kobaltwit.be>
wrote:

> On Saturday 2 February 2019 10:19:02 CET, Wm via gnucash-devel wrote:
> > On 02/02/2019 00:16, David Cousens wrote:
> > > As well as the account names you might also want to munge data in the
> > > description/memo fields. This can contain identifying information for
> > > customers/vendors.
> >
> > How about we just zap the stuff in description/memo fields by default?
> > They're not mathematically significant and rarely cause double entry
> > problems unless someone introduces unusual UI stuff in which case they
> > should be able to provide an example.
> >
> > > Also possible any data relating to the owner of the file
> > > which is stored in the file/database.
> >
> > Does your file/database have an obvious owner?  Mine doesn't apart from
> > the name of the file which is the first and obvious thing to change
> > before you send it off for someone else to look at.
> >
> > If you mean bits of text in reports they wouldn't be included in an
> > SQLite file.
> >
> > If you mean bits of text in outbound documents I think we've already
> > zapped them.
> >
> > Have I missed your point?
> >
>
> Yes, if you use business features, you may have entered business-identifying
> data in File->Properties. I think that's what David is referring to.
> Similarly, there may be customer and vendor data (names, addresses) in the
> book that should equally be obfuscated. Just random data is fine.
>
> Continuing in that vein, if you have bills and invoices, aside from
> randomizing the transaction's split amounts and values you'll also have to
> do the same for invoice entries. And to make the book useful for detecting
> business data bugs, this should happen in such a way that invoice tax and
> discount amounts remain consistent after multiplying with random numbers
> *and* that the invoice totals continue to match the business transaction
> amounts in AR/AP accounts.
>
> And to make that one level more complicated, after that the payment
> transactions *also* have to continue to match the new randomized invoice
> amount (if the invoice was paid in full).
>
> It doesn't end there: payments can be split over multiple invoices, so
> again, when one randomizes invoice amounts, care must be taken to adjust
> the payments in proportion to the invoice amount change, or fully paid
> invoices can suddenly become partially paid or overpaid.
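
To make the proportionality point concrete, here is a minimal sketch of
the idea, not a GnuCash script. It assumes amounts are held as integer
cents, that each payment is a simple mapping from invoice id to the
amount applied to that invoice, and that every invoice is multiplied by
its own random whole-number factor. The names scale_invoices and
outstanding are hypothetical, and tax, discounts, rounding, SXs and
budgets are all ignored.

```python
import random


def scale_invoices(invoices, payments, rng):
    """Scale each invoice by its own random integer factor (hypothetical helper).

    invoices: {invoice_id: total_in_cents}
    payments: list of {invoice_id: cents_applied}, one dict per payment.
    Each part of a payment is scaled by the factor of the invoice it was
    applied to, so the paid/unpaid status of every invoice is preserved.
    """
    factors = {inv_id: rng.randint(2, 9) for inv_id in invoices}
    new_invoices = {
        inv_id: total * factors[inv_id] for inv_id, total in invoices.items()
    }
    new_payments = [
        {inv_id: applied * factors[inv_id] for inv_id, applied in payment.items()}
        for payment in payments
    ]
    return new_invoices, new_payments


def outstanding(invoices, payments):
    """Return {invoice_id: cents still owed} after applying all payments."""
    paid = {inv_id: 0 for inv_id in invoices}
    for payment in payments:
        for inv_id, applied in payment.items():
            paid[inv_id] += applied
    return {inv_id: invoices[inv_id] - paid[inv_id] for inv_id in invoices}


if __name__ == "__main__":
    invoices = {"A": 10000, "B": 25000, "C": 4000}
    # The first payment is split over invoices A and B.
    payments = [{"A": 10000, "B": 5000}, {"B": 20000}, {"C": 1500}]
    new_invoices, new_payments = scale_invoices(
        invoices, payments, random.Random()
    )
    # A and B stay fully paid; C stays partially paid.
    print(outstanding(new_invoices, new_payments))
```

Whole-number factors keep the arithmetic exact in this toy case. With
fractional factors, or with tax and discount lines that were rounded in
the original book, the rounding would have to be reconciled as well,
which is where the complexity starts to pile up.
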
>
> While this is probably all possible, I believe the resulting script will be
> so complex that it will become a source of bugs in itself, which would
> divert developer time to debugging and maintaining this script rather than
> working on the actual reported bug for which a sample data file was
> requested in the first place...
>
> For a book with only transactions and no business data at all, it sounded
> like a useful tool.
>
> Oh and we haven't mentioned SXs and budgets yet...
>
> As for Colin's question: on Windows and macOS, SQLite is supported out of
> the box. On Linux it may require the additional installation of a libdbi
> driver. Most distros I know have packages for this driver, but they may not
> be installed by default.
>
> Geert


Wouldn't it be simpler to create a library of template files, each designed
to exercise various features, from which a user could pick one to illustrate
his concern?

This would bypass the need to figure out how to sanitize every possible user
file.

If the user wants, he could still build his own example file as some users
do now.

David Carlson
