Regular expressions in GnuCash
Mike or Penny Novack
mpnovack at mtdata.com
Sun Sep 25 09:48:09 EDT 2016
<< and let me start by apologizing about being in too much of a hurry --
though the issue was more failure of analysis than failure to test>>
The PROBLEM was to determine something based on the EXPRESSION of
numbers. Suggesting that there was something wrong with them BEING just
the expression of numbers rather than numbers (a field of numeric type)
is missing the point. That they ARE just the expression of numbers
instead of numbers is part of the given problem, and that is why we have
been discussing string processing tools like regex. And I have to disagree
that we would be helped in general by asking for a change to numeric. It
might be a report we were trying to analyze, and that certainly would be
in "display characters" instead of numeric fields.
As a string processing problem, I was trying to show that there were two
parts, most easily divided up: first to find/identify just what needed
to be compared/range checked, and then the compare/range check itself.
My mistake was not seeing that a SIMPLE collating sequence check
was enough once the text was determined to be four digits << that latter
being a consequence of the collating sequence of almost all display
character sets (the representation of the digits is contiguous and in the
same order as their numeric value --- offhand I do not know of any display
set that does not have this property) >>.
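A minimal sketch of that two-part split, written in Python rather than the shell tools under discussion. The bounds "1000" and "1999" and the function name are hypothetical, chosen only for illustration. It assumes ASCII digits and bounds that are themselves four digits long, which is what lets a plain string comparison stand in for a numeric one.

```python
import re

# Hypothetical bounds, for illustration only. Both are four digits long,
# so comparing them as strings gives the same answer as comparing values.
LOW, HIGH = "1000", "1999"

# Part 1: find runs of exactly four ASCII digits that are not part of a
# longer run of digits.
FOUR_DIGITS = re.compile(r"(?<![0-9])[0-9]{4}(?![0-9])")


def four_digit_tokens_in_range(text, low=LOW, high=HIGH):
    """Return the four-digit tokens in text lying between low and high."""
    tokens = FOUR_DIGITS.findall(text)
    # Part 2: the range check is a simple collating sequence comparison;
    # no conversion to a numeric type is needed.
    return [t for t in tokens if low <= t <= high]
```

The comparison `low <= t <= high` works on the strings themselves, because the digits are contiguous and in numeric order in the character set.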
I do NOT think that gnucash users should be having to do string
processing. Even most professional programmers don't have a lot of
experience with that. Which is why I was calling the discussion a
digression. But the 'nix shells plus standard library of utilities DO
constitute a powerful string processing language << which is probably
why specific string processing languages like SNOBOL went extinct >>. I
used to have a "standard task" of the string problem type which I would
try whenever I was learning a new language or shell, having first chosen
that problem* while learning CLIST and then done it for every
shell/language after. The solution using bash + standard utilities was an
order or two of magnitude shorter than anything previous. As few
keystrokes as lines in some of the previous solutions!
If people DO want to use regexes (and similar tools) they should
practice with some simpler, but still interesting, problems of the string
type. For example, search a text for an occurrence of a "number" equal
in value to a given number (but you don't know in advance the
REPRESENTATION of the number in the text: does it have leading zeros? a
currency symbol in front? commas in it? etc.).
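One possible approach to that practice problem, sketched in Python. The function name and the pattern are assumptions made for illustration. It treats commas only as digit-group separators and "." as the decimal point, ignores any currency symbol or sign in front, and would misread a comma-separated list such as "1,2,3" as a single number.

```python
import re
from decimal import Decimal

# Digits, optionally grouped by commas, with an optional decimal part.
# A currency symbol in front is simply not part of the match.
NUMBER_LIKE = re.compile(r"[0-9]+(?:,[0-9]+)*(?:\.[0-9]+)?")


def find_equal_numbers(text, target):
    """Return the number-like substrings of text equal in value to target."""
    wanted = Decimal(target)
    hits = []
    for match in NUMBER_LIKE.finditer(text):
        # Normalize the representation: drop the commas; Decimal itself
        # takes care of leading zeros and trailing decimal zeros.
        candidate = Decimal(match.group().replace(",", ""))
        if candidate == wanted:
            hits.append(match.group())
    return hits
```

For instance, `find_equal_numbers("paid $1,234.00 and 001234", "1234")` would report both representations as matches.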
Michael D Novack
* For those curious, the "problem" is a "program" that will prompt for
text input, then report whether that text is a palindrome according to
the usual rules for TEXT palindromes, then offer the option of quitting
or entering another text. This is significantly harder than for
MATHEMATICAL palindromes << the usual rules for text palindromes ignore
differences in punctuation, case, spaces, etc. --- thus "Madam, I'm
Adam." is considered a palindrome by text palindrome rules but not by
mathematical rules, since it isn't exactly the same back to front >>.
The code is one of the things I lost in the 2006 house fire, so I can't
show you, but my bash script solving this was five lines (maybe 150 - 200
keystrokes). For comparison, the solution in C was over 100 lines. The
tools of the 'nix standard library include some powerful string
processing ones.
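Since the original script is lost, here is a rough sketch of the palindrome task in Python. It is not a reconstruction of the bash solution, and the function names and prompts are made up for illustration. It assumes "the usual rules" mean keeping only letters and digits and ignoring case.

```python
def is_text_palindrome(text):
    """Text rules: ignore punctuation, spaces, and case."""
    letters = [ch.lower() for ch in text if ch.isalnum()]
    return letters == letters[::-1]


def is_exact_palindrome(text):
    """Mathematical rules: exactly the same back to front."""
    return text == text[::-1]


def main():
    # Prompt for a text, report, then offer to quit or go again.
    while True:
        text = input("Text: ")
        if is_text_palindrome(text):
            print("palindrome")
        else:
            print("not a palindrome")
        if input("Another? (y/n) ").strip().lower() != "y":
            break
```

Calling `main()` starts the prompt loop. The two test functions can also be used on their own, which shows the difference between the two sets of rules for "Madam, I'm Adam."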