Regular expressions in GnuCash

Mike or Penny Novack mpnovack at mtdata.com
Sun Sep 25 09:48:09 EDT 2016


<< and let me start by apologizing about being in too much of a hurry -- 
though the issue was more failure of analysis than failure to test>>

The PROBLEM was to determine something based on the EXPRESSION of
numbers. Suggesting that there was something wrong with them BEING just
the expression of numbers rather than numbers (a field of numeric type)
is missing the point. That they ARE just the expression of numbers
instead of numbers is part of the given problem, and that is why we have
been discussing string processing tools like regex. And I have to
disagree that we would be helped in general by asking for a change to
numeric. It might be a report we were trying to analyze, and that
certainly would be in "display characters" instead of numeric fields.

As a string processing problem, I was trying to show that there were two
parts, most easily handled separately. First, find/identify just what
needs to be compared/range checked; then do the compare/range check
itself. My mistake was not seeing that a SIMPLE collating sequence check
was enough once the string was determined to be four digits << that
being a consequence of the collating sequence of almost all display
character sets: the representation of the digits is contiguous and in
the same order as their numeric value --- offhand I do not know of any
display set that does not have this property >>
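
A minimal sketch of that two-part split, in Python rather than shell.
The function name, the sample text and the range bounds are all invented
for illustration; nothing here comes from the original thread. Step 1
isolates runs of exactly four digits, step 2 range checks them as plain
strings, relying only on the collating sequence.

```python
import re

# Step 1 pattern: exactly four ASCII digits, not part of a longer digit run.
FOUR_DIGITS = re.compile(r"(?<![0-9])[0-9]{4}(?![0-9])")

def four_digit_codes_in_range(text, low, high):
    """Return the four-digit strings in text with low <= code <= high.

    The comparison is done on strings, not numbers. That agrees with
    numeric order because every operand has the same length and the
    digits 0-9 are contiguous and ascending in the character set.
    """
    codes = FOUR_DIGITS.findall(text)                       # step 1: identify
    return [code for code in codes if low <= code <= high]  # step 2: compare

sample = "codes 1200, 4999, 5000, 12345 and 0999"
print(four_digit_codes_in_range(sample, "1000", "4999"))
```

Note that low and high must themselves be four-digit strings for the
string comparison to be meaningful.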

I do NOT think that gnucash users should be having to do string
processing. Even most professional programmers don't have a lot of
experience with that. Which is why I was calling the discussion a
digression. But the 'nix shells plus standard library of utilities DO
constitute a powerful string processing language << which is probably
why specific string processing languages like SNOBOL went extinct >>. I
used to have a "standard task" of the string problem type which I would
try whenever I was learning a new language or shell, having first chosen
that problem* while learning CLIST and then done it for every
shell/language after. The solution using bash + standard utilities was
an order or two of magnitude shorter than anything previous. As few
keystrokes as lines in some of the previous solutions!

If people DO want to use regexes (and similar tools) they should
practice with some simpler, but still interesting, problems of the
string type. For example, search a text for an occurrence of a "number"
equal in value to a given number (but you don't know in advance the
REPRESENTATION of the number in the text: does it have lead zeros? a
currency symbol in front? commas in it? etc.)
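
One possible way to attack that practice problem, sketched in Python. It
ASSUMES conventions the problem statement leaves open: "," groups
thousands, "." is the decimal point, and the only currency symbols are
$, £ and €. The function name is hypothetical. Again there are two
steps: a regex finds anything that looks like a number, then each
candidate is normalized and compared by value.

```python
import re
from decimal import Decimal

# Assumed representation: optional currency symbol, digits with optional
# comma grouping, optional decimal part. Other locales need another pattern.
CANDIDATE = re.compile(
    r"(?:[$£€]\s*)?([0-9]+(?:,[0-9]+)*(?:\.[0-9]+)?)"
)

def find_equal_numbers(text, target):
    """Return the substrings of text whose numeric value equals target."""
    wanted = Decimal(target)
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = match.group(1).replace(",", "")  # drop grouping commas
        if Decimal(digits) == wanted:             # compare by value
            hits.append(match.group(0))
    return hits

text = "Paid $1,234.50 on Monday, 001234.5 later, and 1234.05 never."
print(find_equal_numbers(text, "1234.5"))
```

Decimal is used instead of float so that "1,234.50" and "001234.5"
compare equal exactly, with no binary rounding involved.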

Michael D Novack


* For those curious, the "problem" is a "program" that will prompt for
text input, then report whether that text is a palindrome according to
the usual rules for TEXT palindromes, then offer the option of quitting
or entering another text. This is significantly harder than for
MATHEMATICAL palindromes << the usual rules for text palindromes ignore
differences in punctuation, case, spaces, etc. --- thus "Madam, I'm
Adam." is considered a palindrome by text palindrome rules but not by
mathematical rules, since it isn't exactly the same back to front >>

It is one of the things I lost in the 2006 house fire so I can't show
you, but my bash script solving this was five lines (maybe 150 - 200
keystrokes). For comparison, the solution in C was over 100 lines. The
tools of the 'nix standard library include some powerful string
processing ones.
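
To make the text palindrome rules concrete, here is a short sketch in
Python. It is NOT a reconstruction of the lost bash script, only an
illustration of the rules stated above. The function names are
hypothetical, and "an empty line quits" is an assumed way of offering
the quit option.

```python
def is_text_palindrome(text):
    """True if text reads the same both ways under TEXT palindrome rules.

    Punctuation, spaces and case are ignored: only letters and digits
    are kept, lowercased, before comparing with the reverse.
    """
    kept = [ch.lower() for ch in text if ch.isalnum()]
    return kept == kept[::-1]

def prompt_loop():
    """Prompt for texts until an empty line is entered (assumed quit rule)."""
    while True:
        text = input("Text (empty line to quit): ")
        if not text:
            break
        verdict = "is" if is_text_palindrome(text) else "is not"
        print(f"{text!r} {verdict} a palindrome")
```

Calling prompt_loop() starts the interactive version. With these rules
is_text_palindrome("Madam, I'm Adam.") is true, while the mathematical
test (the raw string equal to its own reverse) is false.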



