Logic ideas - 5 levels.
Neil Williams
linux at codehelp.co.uk
Sun Sep 18 08:55:53 EDT 2005
On Saturday 17 September 2005 11:47 pm, Josh Sled wrote:
(oops, this is a really long one!)
:-)
> Data-type constraints are
> generally above-and-beyond those provided by programming languages, such
> as "positive integer", "bounded integer", "length-limited string",
> "patterned string", &c.
(This will be easier once the gnc-backend-file is replaced - some of these
bounds can be set in the XML if schemas are used properly. QSF does this to
ensure that all GUID strings are true hexadecimals and integer fields do not
contain string characters. That provides an entry-point validation at the XML
level.)
Data-type constraints on an individual parameter, independent of any other
parameter, should always be handled solely within the object - the (static)
param_getfcn and param_setfcn handlers must enforce their own logic to form
the lowest level of validation. If a parameter accepts a string but cannot
deal with a string that contains certain characters, the specific object
handler code for that parameter needs to perform that check.
e.g. If Account->Name - for whatever reason - was not able to cope with
&%£$*() characters, the xaccAccountSetName code should refuse to set such
characters.
That's the lowest level and it's the one we already have - albeit not always
fully utilised.
These two levels are the simplest to implement in all UIs because one can be
set by the backend (using run-time schema validation) and the other by the
object code itself - which has to be shared between all UIs anyway.
So these are the levels that I see:
1. Data Source constraints: Includes XML schema / DTD validation of incoming
data e.g. no GUID read from the XML should fail to verify as hexadecimal - it
may fail as a GUID (out of range etc.), but not as hexadecimal. Other
backends can provide different constraints according to their strengths -
maybe we should seek that each backend implements a minimum standard of data
constraint.
2. Discrete parameter logic: Each object parameter must enforce those rules
that are necessary for itself. This includes only those rules that can be
enforced with reference ONLY to the current parameter in the current entity.
This is present but needs to be encouraged, widened and supported more
cleanly.
3. Entity logic: Each entity needs a new mechanism to ensure its own validity
by validating its parameters *with reference to* its other parameters.
These rules are the lowest level of what I am proposing as the 'new' logic.
This is where an entity can verify that it has X parameter OR Y parameter and
fail if the two are incompatible. This is present in some entities (typically
those that support clones) but not in a form that can be accessed in a
standard manner. i.e. there's no typedef or designated callback / foreach
support.
4. Collection logic: Some objects may need to verify their own place within
the QofCollection - typically these would be hierarchical objects like
Account. This would implement "all accounts must have a unique name".
5. Book logic: Rules that determine how objects of different types validate
their own data and references to other objects. This is where "all splits
must reference a known account" would be implemented.
Clearly, the results of the unique account name rule and the splits-reference
rule need to be handled differently - one is a syntax rule, the other is an
assertion.
However, the collection and book logic can support both kinds of rules. If
there is an assertion that fits into the collection logic, that's fine. If
there's a syntax rule for the book, that's good too.
I'm thinking of a cascade implementation - rules in the higher levels are only
executed if lower level rules report success.
Also, each user operation decides which levels of logic need to be checked. In
a dialog, losing the focus on one input box can utilise different logic from
clicking a radio button in the same dialog. Enabling "OK" would normally
require most if not all levels to be checked, but results should be cached so
that each rule is only executed once for each relevant data change.
e.g. the handler for the radio button would query the logic library for the
parameter relating to the button, passing the value for low-level checks.
Optionally, it could also request a check on another parameter or another
rule for the current object.
Then a tag in the instance would indicate rules remaining to be checked and
the dialog control function would enable "OK" if this returned zero.
The object declares which rules are essential and which are optional
(assertions and syntax respectively). The UI determines how and when those
rules are checked and in which sequence. The UI can also upgrade rules -
handle any syntax/optional rule as if it was essential if appropriate - it
would not be able to downgrade a rule deemed essential by the object.
> (There is also is more than one level of integrity, here. For instance,
> gnucash would function perfectly well if no Account had a name or
> description...
That, in my plan, would be 3: Entity logic. Dictated by the QofObject
definition, it consists of parameter handlers and a new function that
inspects the entity as a unit, relating parameters to each other, within the
limits of a single instance.
> the types, guids and parent-guids are all that are
> strictly required. But in any practical user interface, every account
> should have a name that is non-null and unique. I guess this primarily
> extends to user-facing identifiers like names, but I think it's worth
> distinguishing assertion-level constraints like "all splits must
> reference an account" and practical constraints like "all accounts much
> have a unique name".)
That can be done according to *how* the entity complains about an invalid
value. In the entity logic, an assertion failure could choose to free the
entire entity (i.e. refuse to set). An "interface" failure (such as the
account name uniqueness) would be a simple complaint along the lines of "try
again" but leaving other parameters unchanged.
> The "high" logic, I believe, has two parts: the data-types and functions
> which define the semantics of the application, and the user interface
> which defines the "syntax" of an application, if you will. These things
> are often very closely related, which is harder to generically abstract.
Can we implement those parts as different *methods* rather than different
models? i.e. The same logic check can validate the semantics and the syntax
if that is appropriate for that particular check. The handler reports back a
more severe failure (including freeing the entity) for the semantics and a
moderate failure (i.e. try again) on a syntax failure.
> The best that people have seem to come up with so far is the
> Model-View-Controller architecture.
>
> The mortgage loan druid is a decent example: I took care to seperate the
> GUI/druid controller from the loan-parameters/options and processing
> model... and even still the model-only code tacitly assumes a druid-like
> interface. Some other GUI could -- if the appropriate piece was to move
> from the druid into the engine -- use that same model to re-present
> similar functionality.
I'll have to look at that.
> This has pretty low priority on my todo list right now; I'm more than
> happy to review proposals, designs and code, though.
Same here. Besides, this kind of thing suits a slower, more considered,
development with lots of proposals and designs tossed around.
> I think the validation is hard enough and somewhat valuable, but the
> *real* value is saving any work that would be re-implementation of the
> application logic.
Agreed. The lower levels (1 + 2 above), are small increments from where we are
now. Level 3 is implemented in a patchy way for certain elements of certain
objects. Levels 4 and 5 are implemented in two ways: the gnc-backend-file
refuses to save/load data that doesn't fit the implicit assumptions of that
backend and the UI refuses to display data that doesn't fit its own versions
of those assumptions.
I believe we can save a lot of work by putting those assumptions in one place
so that each backend, each UI and all other components can use the same
rules.
Partial books would implement partial rules - there's no point in rejecting a
QSF book that fails collection or book logic (4+5) because that is expressly
the point of a partial book. However, QSF should never fail data constraint,
parameter or entity logic (1,2+3). Higher level logic would be implemented
during or immediately after the book merge when the QSF is loaded back into
the main data set - just as I currently implement some Account hierarchy
handling code in the merge druid that only exists in the GnuCash codebase.
In CashUtil, this could be implemented by changing to the QSF backend if the
higher-level rules fail their checks. This would automatically require those
checks to succeed when the data is later merged back into the main data set.
> For example, I can easily see a set of rules declared that enable
> runtime input validation for SXes, thus refactoring some code out of the
> current SX editor and as such re-usable by a CLI, but I think there's a
> fair amount of code that is not readily factored out, and I'm not sure
> how to deal with. I can imagine that most of the non-UI logic can get
> factored out into a better Model, and the CLI View/Controller could call
> it as well as the GUI View/Controller, but I think there's a large
> amount of reimplementation just in the view/controler side, too. :(
I'd hope that the model and the view / controller would all be single units
that take their parameters from the QofObject - I'd rather not have a
specific model for SX and a specific model for Account. Instead, a single
model that can load the rules for SX or Account as required.
I'd like as many of the rules as possible to be declared by the object so that
new objects are easy to plugin. So the sched-xaction object would define a
set of rules that express what is currently implicit in the SX editor.
> (I use SXes as my examples here due to familiarity, but I don't think
> there's anything in particular that biases them as an example.)
:-)
> Of course, some of the above costs assume feature parity between the GUI
> and CLI versions of gnucash... what's your goal for CashUtil? Is it a
> CLI interface to all of gnucash, or a basic access to a subset of
> gnucash?
I do want CashUtil to be a full CLI interface for all GnuCash data that can be
represented in a CLI.
The elements that I feel are outside CashUtil are:
1. Reports.
2. Budgets.
1. Reports cannot be handled within the CLI (it's a waste of code IMHO), they
are far better handled by using the CLI to parse SQL to select the data
required for the report and some other (scripting) tool to format the QSF
output into a highly customisable report. e.g. By generating the data
*behind* the existing reports as XML from external SQL statements (in a .sql
file), CashUtil can provide every user with every report they could ever want
in any format they want and printable in whatever format they can prepare. No
more concerns about why X report won't print Y data or on Z media. True,
these reports would not then be within the scope of the GnuCash GUI but it's
a small price to pay for the freedom to create truly customised reports - and
if changes are made to the QSF as a result of the report, the data can always
be merged back in.
The difficulties with doing this with the current gnc-backend-file are that
the XML is too specific to gnucash, it can be impossible to isolate certain
instances (due to implicit AccountGroup logic etc.) and it's difficult to
parse. By having a QSF file that only contains the data specific to the
intended report, users can use PHP, Perl, Python, whatever, to format that
data in whatever way they can because the XML structure is always the same.
It also removes the need to set particular financial year-end dates - users
can prepare their own QSF from and to any particular date, for any selection
of data.
e.g. I've just done my tax return and the current GnuCash reports are simply
not adequate. Using the QSF method, I would export the data between 6th April
2004 and 5th April 2005, use SQL to include details for some transfers that
are currently missed, summarise certain details that are too verbose and
expand others that are too opaque. Then use perl (in my case) to parse the
QSF XML and produce a full calculation of my tax return figures. Perl could
even calculate things like my capital allowance and business:private usage
percentages as well as estimate my payments on account. The final step is to
wrap the commands in a bash file and the entire process is automated! (Plus I
don't have to print out any GnuCash reports or load up OOoCalc!)
Additionally, QSF will remain available for data import/export/mining no
matter which backend is in use for the main data.
2. Budgets: I'm not sure how to proceed with these. The budgets in G2 are not
currently available to QOF and, until that changes, it's hard to see how the
budgets can be queried in CashUtil to produce the kind of data that is
available for external reports. Equally, I haven't yet had time to look at
how budgets could be represented to QOF.
Elements that could be supported but are not yet:
1. Finance::Quote could be implemented by CashUtil but the code to handle this
has not even been considered yet.
2. Probably a few others that have slipped off my radar.
If we want CashUtil around the time of G2, some features will have to be
tagged as TODO and implemented later.
--
Neil Williams
=============
http://www.data-freedom.org/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/