[GNC-dev] Normalizing live data

Hendrik Boom hendrik at topoi.pooq.com
Sat Feb 2 12:44:52 EST 2019


On Sat, Feb 02, 2019 at 04:30:30PM +0100, Geert Janssens wrote:
> Op zaterdag 2 februari 2019 14:31:43 CET schreef Hendrik Boom:
> > > On 2/1/19 5:36 AM, Wm via gnucash-devel wrote:
> > > > [2] as long as the transaction stream balances the actual numbers
> > > > don't matter (there will be occasions where the numbers are important
> > > > but these tend to be number extremes related to commodities rather
> > > > than anyone using gnc to do a Mr Putin vs Mr Trump sports bet).  In
> > > > most cases multiplying any matching numbers by the same semi-random
> > > > should produce a good file for examination so long as it is done
> > > > consistently [4]
> > 
> > If the numbers in the file are integers times some account or
> > currency-dependent unit, then just calculating the greatest common
> > divisor of all the obfuscated numbers will give a good guess as to the
> > semirandom multiplier.
> 
> Do you think that still is possible if a different random number was used for 
> each transaction ? (That's how I understood Wm's suggestion)
> 
> Each transaction will have its own random number. So for transaction A all 
> splits may have been multiplied with 450, for Transaction B all numbers may 
> have been multiplied by 500. 

That might work.  That way each transaction balances, but the account 
balances will be nonsense.
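
A minimal sketch of that scheme in Python, only to make the idea concrete. 
The data layout (a transaction as a list of (account, amount) pairs, with 
amounts in integer cents and debits positive) and the name obfuscate are 
assumptions for illustration, not anything GnuCash provides.

```python
import random

def obfuscate(transactions, rng=random):
    """Scale every split of a transaction by one multiplier per transaction.

    Hypothetical layout: each transaction is a list of (account, amount)
    pairs, amounts in integer cents, debits positive, credits negative.
    """
    result = []
    for splits in transactions:
        m = rng.randint(2, 999)  # a fresh multiplier for this transaction
        result.append([(account, amount * m) for account, amount in splits])
    return result

transactions = [
    [("Assets:Bank", -1073), ("Expenses:Groceries", 1000), ("Expenses:Tax", 73)],
    [("Assets:Bank", 250000), ("Income:Salary", -250000)],
]
obfuscated = obfuscate(transactions)

# Each transaction still sums to zero, because all of its splits were
# scaled by the same factor.  Sums per account no longer mean anything.
for splits in obfuscated:
    print(sum(amount for _, amount in splits))
```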

Still, taking the gcd of the splits within a transaction gives you a lower 
bound on the transaction values.  And if you, say, split off sales tax into 
a separate split, your lower bound will often be the actual value.
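
To illustrate, here is a rough Python sketch of the gcd argument for a single 
transaction.  It assumes the original amounts were integers (say, cents) and 
that every split was multiplied by the same positive integer; the function 
name lower_bounds and the sample figures are made up for the example.

```python
from functools import reduce
from math import gcd

def lower_bounds(obfuscated_splits):
    """Return (g, bounds) for the obfuscated splits of one transaction.

    g is the gcd of the obfuscated amounts.  The secret multiplier divides g,
    so amount // g is a lower bound (in magnitude) on the original amount.
    """
    g = reduce(gcd, (abs(a) for a in obfuscated_splits), 0)
    if g == 0:  # all splits are zero, nothing to recover
        return 0, list(obfuscated_splits)
    return g, [a // g for a in obfuscated_splits]

# Original splits -1073, 1000, 73 (a purchase with tax split off), times 450.
print(lower_bounds([-482850, 450000, 32850]))   # (450, [-1073, 1000, 73])

# Original splits 250000, -250000, times 500: only the ratio 1 : -1 survives.
print(lower_bounds([125000000, -125000000]))    # (125000000, [1, -1])
```

In the first case the original amounts have no common factor, so the gcd is 
exactly the multiplier and the amounts are recovered in full.  In the second 
the bound is useless, which is what you would expect for any two-split 
transaction.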

And it's likely that one could also identify income and expense accounts as 
such by the pattern of debits vs credits.
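
A crude Python sketch of that kind of guess, assuming debits are stored as 
positive amounts and credits as negative ones, and that account names have 
been replaced by opaque labels.  Multiplying by a positive number keeps the 
sign, so the pattern survives the obfuscation.  The function name, the 
labels and the three categories are hypothetical.

```python
from collections import Counter

def guess_account_kinds(transactions):
    """Classify accounts by whether they only see debits, only credits, or both."""
    debits, credits = Counter(), Counter()
    for splits in transactions:
        for account, amount in splits:
            if amount > 0:
                debits[account] += 1
            elif amount < 0:
                credits[account] += 1
    kinds = {}
    for account in set(debits) | set(credits):
        d, c = debits[account], credits[account]
        if c == 0:
            kinds[account] = "expense-like"   # only ever debited
        elif d == 0:
            kinds[account] = "income-like"    # only ever credited
        else:
            kinds[account] = "mixed"          # e.g. a bank account
    return kinds

obfuscated = [
    [("A1", -482850), ("A2", 450000), ("A3", 32850)],
    [("A1", 125000000), ("A4", -125000000)],
]
kinds = guess_account_kinds(obfuscated)
for account in sorted(kinds):
    print(account, kinds[account])
```

With a real file there would be many more transactions per account, so the 
guess would be correspondingly more reliable.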

-- hendrik

> 
> Regards,
> 
> Geert
> 
> 

