Dereconciling account using XSLT

Neil Williams linux at codehelp.co.uk
Sun Apr 10 06:03:19 EDT 2005


On Sunday 10 April 2005 7:05 am, Ben Pracht wrote:
> Note to developers:

Don't forget that the current XML backend is to be replaced. It's not going to 
be changed now.

> It would have helped to have had:
> *  A DTD for this file, locally available *without* getting the
> developers source copy, and without downloading off the web.

Where did you find one in the end?

> *  Namespace prefixes defined somewhere in the data file.

There's no point doing that now; better to ditch the entire format and only 
retain the ability to read old files. Therefore, changes to the code that 
implements the current XML format are all but pointless - the whole point is 
to ensure complete compatibility with any file previously written by GnuCash 
- that includes any flaws in the XML structure of those files.
:-)

> *  For that matter, cutting down on the namespaces to something like
> say, one, would be really nice.

That's what has been done with the QSF XML definition - for exactly these 
reasons. Incidentally, QSF doesn't use a DTD, it uses a Schema (actually 2, 
one for data, one for maps). Those are to be installed along with the 
application when the G2 port is released. They'll be installed in the logical 
place: /usr/share/xml/

> *  Not used XML ever, but instead used an updatable database with
> documentation  on the tables, names, etc.

The "database" idea has been explored before. GnuCash doesn't have a database, 
but it is going to use SQLite as a backend in future. I'm working on a 
simple translation to and from SQL for QSF data too. As for the documentation 
on table names and fields, see the QOF object definitions in the source, at 
the end of Account.c, Transaction.c, gncInvoice.c and others.

I dare say those could be documented explicitly in their own section of the 
doxygen output, I'll consider that before G2 is released.

> Ideally, the names in the DB 
> would be easy enough to understand that documentation wouldn't be
> needed.

How does this grab you?

<?xml version="1.0"?>
<qof-qsf xmlns="http://qof.sourceforge.net/">
  <book count="1">
    <book-guid>8c4b034ad399fbdbf5ca72781c5fd44c</book-guid>
    <object type="gncCustomer" count="33">
      <string type="id">000005</string>
      <string type="notes"/>
      <string type="name">Pharmacy Plus</string>
      <guid type="guid">1dec596d093153495a232d7f18874b5a</guid>
      <boolean type="active">true</boolean>
      <boolean type="tax table override">false</boolean>
      <numeric type="amount of discount">0/1</numeric>
      <numeric type="amount of credit">0/1</numeric>
    </object>

Remember that this is a data-centric format, not an application-centric 
format. What matters is that the data describes itself fully, including 
exactly how the data itself should be handled, by expressing the data type. 
Then any application can read the file and know how to process the data 
without breaking the types.

e.g. The gncCustomer ID is a string and must be handled as a string - handling 
it as an integer would lose the leading zeroes. That may be a bad example 
(GnuCash can probably handle that) but you see the idea.
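
To make the leading-zero point concrete, here is a minimal Python sketch. It 
is not part of QOF or GnuCash: the helper name customer_id is made up for 
illustration, and the sample is a cut-down, closed-off copy of the fragment 
above.

```python
import xml.etree.ElementTree as ET

# Default namespace taken from the fragment above.
QSF_NS = "{http://qof.sourceforge.net/}"

# Cut-down copy of the example, closed off so that it parses on its own.
SAMPLE = """<?xml version="1.0"?>
<qof-qsf xmlns="http://qof.sourceforge.net/">
  <book count="1">
    <object type="gncCustomer" count="33">
      <string type="id">000005</string>
      <string type="name">Pharmacy Plus</string>
    </object>
  </book>
</qof-qsf>"""


def customer_id(xml_text):
    """Return the first <string type="id"> value, untouched (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    for element in root.iter(QSF_NS + "string"):
        if element.get("type") == "id":
            return element.text
    return None


cid = customer_id(SAMPLE)
print(cid)       # handled as the declared type: 000005
print(int(cid))  # forced into an integer: 5, the leading zeroes are gone
```

A reader that honours the declared type never has to guess whether 000005 
is a number.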

The same tags will be used by, e.g. pilot-link, for its data - all that 
changes are the identifiers, selection of data types and the contents.

e.g. the differences between a gncCustomer and a pilotToDo are:
1. Different object type names.
2. a customer has 3 strings, 1 guid, 2 booleans and 2 numerics,
	a ToDo has 3 strings, 1 guid, 1 date and 3 integer values.

That's why the tag is string and the type is "notes", not 
<notes type="string"> - another application won't know anything about the 
notes tag, it will know how to deal with a string tag. So the format 
restricts the tag names to the kinds of data that can be handled, allowing 
all applications to understand all data without having to understand an 
infinite variety of tag names.
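
As a rough sketch of what that buys a reader (an illustration only, not the 
QOF implementation), one Python converter table keyed on the tag name can 
load both object kinds without knowing either. Assumptions: only the four 
tags shown in the fragment above are handled, any other tag is skipped, and 
the pilotToDo field names in the sample are invented.

```python
from fractions import Fraction
import xml.etree.ElementTree as ET

QSF_NS = "{http://qof.sourceforge.net/}"

# One converter per tag name. Only tags that appear in the fragment above
# are listed; the tag names QSF uses for dates and integers are not shown
# there, so they are left out rather than guessed.
CONVERTERS = {
    "string": lambda text: text or "",
    "guid": lambda text: text,
    "boolean": lambda text: text == "true",
    "numeric": Fraction,  # "0/1" becomes an exact rational
}

# The gncCustomer values come from the example; the pilotToDo object and
# its field names are hypothetical.
SAMPLE = """<?xml version="1.0"?>
<qof-qsf xmlns="http://qof.sourceforge.net/">
  <book count="1">
    <object type="gncCustomer" count="33">
      <string type="id">000005</string>
      <string type="notes"/>
      <guid type="guid">1dec596d093153495a232d7f18874b5a</guid>
      <boolean type="active">true</boolean>
      <numeric type="amount of discount">0/1</numeric>
    </object>
    <object type="pilotToDo" count="1">
      <string type="description">Post the invoices</string>
      <guid type="guid">8c4b034ad399fbdbf5ca72781c5fd44c</guid>
    </object>
  </book>
</qof-qsf>"""


def read_objects(xml_text):
    """Return a list of (object type, {field name: typed value}) pairs."""
    root = ET.fromstring(xml_text)
    objects = []
    for obj in root.iter(QSF_NS + "object"):
        fields = {}
        for child in obj:
            tag = child.tag.split("}")[-1]  # drop the namespace part
            convert = CONVERTERS.get(tag)
            if convert is None:
                continue  # a data type this sketch does not cover
            fields[child.get("type")] = convert(child.text)
        objects.append((obj.get("type"), fields))
    return objects


for kind, fields in read_objects(SAMPLE):
    print(kind, fields)
```

Nothing in read_objects mentions customers or ToDo items; supporting a new 
object type needs no new code, only data.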

> My XML file is 6.5mb, but I can compress it down to something 
> substantially smaller than 1.44mb.

GnuCash can already compress the data file, but shrinking the file on disk 
without removing any of the data it holds is not going to help.

What WILL help is year closing. Once that is implemented, users would never 
need such large files - only the current year and possibly one previous year 
would be necessary. How much of that file relates to transactions PRIOR to 
2003?
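
One rough way to answer that yourself is the Python sketch below. It assumes 
that each transaction in the current XML file carries a <trn:date-posted> 
element wrapping a <ts:date> that starts with a four-digit year; check those 
names against your own file first. It uses a plain text scan rather than an 
XML parser, so it does not depend on namespace declarations being present. 
All function names are made up for illustration.

```python
import gzip
import re
from collections import Counter

# Assumed layout: <trn:date-posted><ts:date>YYYY-MM-DD ...</ts:date>
POSTED = re.compile(r"<trn:date-posted>\s*<ts:date>(\d{4})-")


def read_data_file(path):
    """Read a data file as text, whether or not it is gzip-compressed."""
    with open(path, "rb") as handle:
        magic = handle.read(2)
    opener = gzip.open if magic == b"\x1f\x8b" else open
    with opener(path, "rt", encoding="utf-8") as handle:
        return handle.read()


def years_posted(xml_text):
    """Count transactions per posting year."""
    return Counter(int(year) for year in POSTED.findall(xml_text))


def share_before(counts, year):
    """Fraction of transactions posted before the given year."""
    total = sum(counts.values())
    older = sum(n for y, n in counts.items() if y < year)
    return older / total if total else 0.0


# Tiny made-up sample: one transaction in 2002, one in 2004.
SAMPLE = """
<gnc:transaction version="2.0.0">
  <trn:date-posted>
    <ts:date>2002-06-01 00:00:00 -0400</ts:date>
  </trn:date-posted>
</gnc:transaction>
<gnc:transaction version="2.0.0">
  <trn:date-posted>
    <ts:date>2004-01-15 00:00:00 -0500</ts:date>
  </trn:date-posted>
</gnc:transaction>
"""

counts = years_posted(SAMPLE)
print(sorted(counts.items()))
print(share_before(counts, 2003))
```

For a real file, replace SAMPLE with read_data_file() called on the path to 
your data file.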


-- 

Neil Williams
=============
http://www.dcglug.org.uk/
http://www.nosoftwarepatents.com/
http://www.linux.codehelp.co.uk/
