Creating Gnucash invoices with XML
linux at codehelp.co.uk
Mon Apr 5 18:05:46 EDT 2004
On Monday 05 April 2004 8:45, Derek Atkins wrote:
> Neil Williams <linux at codehelp.co.uk> writes:
> > My first problem is the repeat within an invoice:
> > The first section needs to deal with the common stuff, customer, start
> > date, job ID. That's then fixed for the rest of that invoice. However,
> > the subsequent data needs to repeat - more than one item per invoice.
> > This would be easier under XML but in CSV, it breaks the standard unless
> > duplicated.
> Yes, this is an issue.. How does IIF deal with it?
IIF? Intuit Interchange Format? Not used it. Sorry.
> > Why was XML deprecated? It would solve these problems.
> Because XML is a HORRIBLE data format. It's a great _INTERCHANGE_
> format. So we want to fix that and relegate XML to what it's good at,
> exchanging data. For storing data we should use a real database.
Totally agree on not using XML as a database. It was never suitable. I use
MySQL for most projects.
I think XML is a superb format for data exchange. I use XML as a data exchange
format in other projects, although not yet in C. I'm also more used to
strictly pre-defined formats for exchange, rather than writing an entire
parser. This will work to our advantage, see later.
> If you wanted to write an importer that took a snippet of a GnuCash
> XML file and imported it, that would work too. Indeed, importing an
Kind of the reverse - take an XML file from other applications and import it.
Pre-defined XML formats that are easy to create and convert. XML easily
accommodates mixing contents too, so that one XML import can include invoices
and payments, accounts and style-sheet settings. Everything that was stored
by Gnucash in XML as a data storage format is already available for data
export, import, merge and exchange. That makes importing data from a PDA etc.
a much smaller job.
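To make the mixed-content idea concrete, here is a minimal sketch of dispatching one import file that carries several record kinds. The wrapper element and field names are my own illustrations, not the actual GnuCash schema:

```python
import xml.etree.ElementTree as ET

# Hypothetical mixed-content export; element names are illustrative,
# not the real GnuCash vocabulary.
doc = """\
<gnc-export>
  <invoice id="inv-001"><customer>ACME Ltd</customer></invoice>
  <payment id="pay-001"><amount>120.50</amount></payment>
  <account name="Sales" type="INCOME"/>
</gnc-export>"""

root = ET.fromstring(doc)

# Dispatch each child element to a handler by tag name, so one file
# can carry invoices, payments and accounts together.
handlers = {
    "invoice": lambda e: ("invoice", e.get("id")),
    "payment": lambda e: ("payment", e.get("id")),
    "account": lambda e: ("account", e.get("name")),
}
imported = [handlers[child.tag](child) for child in root]
print(imported)
```

Each handler would, in the real thing, hand the element off to the existing XML-to-data conversion code for that record type.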
> XML file would help in multiple ways. For example, imagine being able
> to re-run the Hierarchy Druid in order to add new sets-of-accounts to
> your data file! An XML importer would make this much easier.
If I'm correct in how Gnucash used XML for data storage, then you've already
done all that work. In order for Gnucash to save and reopen XML storage
files, XML definitions for every saved component must exist, even if not
explicitly. Not only that, but mechanisms already exist to convert the
incoming XML data (from the data file) into live Gnucash data (for display
and manipulation in the GUI).
XML can sort this out itself and, TBH, it is a job best left to XML to
accomplish. I've tried others and failed. CSV is probably the most awkward
for repeated use. The work with XML is front-loaded - design enough data
formats in the beginning and enforce these as the required formats for later.
Most of this work is already done because of the previous XML methods. The
previous (understandably deprecated) storage XML definitions can simply be
recast into XML export/import definitions. That leaves the data storage for
any mechanism you'd like.
Have I got this right?
Currently you have Gnucash this way:
Start app -> open previous data file -> read XML -> populate data structures
in RAM -> display GUI -> manipulate data in GList etc -> write to XML on save
-> exit.
What I propose would be:
Start app -> open previous data file/source -> read a format to be decided,
perhaps SQL -> populate data structures using new mapping -> display GUI ->
manipulate data in GList as before -> write/flush to new format on save or
just on each operation -> exit.
Now, when I want to export data, I re-use the old function calls to save a
portion of the data file into the old XML format. When I import an XML file,
I re-use the old file-open calls to import the data. All that's needed is a
wrapper to cope with partial data and clashes with existing data.
Crucially, we'd need to retain all the existing XML <-> GList mappings but
instead of loading them every time, they would be pressed into service upon
an export or import only. This provides the importer with a ready conduit to
all existing Gnucash data structures - meaning that absolutely anything
already in Gnucash can be imported and exported.
There must be some level of XML parsing already being performed within Gnucash
file operations. File->Open and File->Save etc.
This would simply be downgraded to import-export.
> Similarly, it would be useful for combining multiple data files.
Yes, precisely because the definitions for every component of every data file
must already have been defined for the save and open routines to work. Even
if the definitions are not explicit, it's not hard to derive a DTD from the
code that already reads and writes them.
> The downside is the challenge in mapping the GUIDs of an imported data
> to an existing data. How do you know if an account is the same? Or
> an invoice? or a customer? It's a huge can of worms to build an XML
> importer (which is why it hasn't been done, yet ;)
Not necessarily. In the help file that talked about XSLT, there was a whole
list of XSLT definitions for components. XML has the advantage over CSV that
these formats can be validated and are reliable. Therefore, an XML file that
claims to represent an invoice (from its choice of DTD) but actually contains
payment data can be rejected with a clear, informative error.
Duplications would be handled in exactly the same way as now - by having the
unique ID stored / retrieved from the XML, missing ID -> new record.
For each category of import, an XML structure needs to be defined and a DTD
created, hosted on gnucash.org. Every XML file that refers to that DTD can be
verified against it and data only accepted if the validation passes. That
should reduce the amount of data typing work required. Actually, to avoid
problems when gnucash.org is unreachable, the DTDs would have to be shipped
in the package. (Thinking on the fly here, you know!)
In a real sense, most of the definition work has already been done. That's why
I was keen on XSLT/XML in my initial query. Converting a binary storage
program into an XML export program is a major undertaking, but gnucash was
already using XML for data storage, so (unless I'm in for a shock), the data
typing and conversion must already be in code?
> The choice is yours, tho.
If you agree (and if my assumptions about Gnucash file operations above are
correct) I'd recommend dumping CSV as a data import mechanism and using XML
instead. No need for XSLT: by defining the formats, third-party applications
can write native Gnucash XML documents ready for import and expect valid XML
export documents in the same format. (Native as in 'old version native'.)
> What needs to be done:
> * column mapping
done in XML
> * field parsing (we already have a bunch of generic parsers)
already implemented in Open/Save; it only needs customising to accept nothing
but validated data.
> * user verification
OK, maybe once the XML is parsed, a dialog box showing (some of) the content?
Changing the column mapping à la CSV isn't possible with XML; a mismatch
would indicate a corrupted import file and require separate corrective
measures.
> * transaction matching
I'll need help with that. The existing procedures are presumably not
anticipating a merge with existing data but are set to be read into an
otherwise empty memory allocation.
Is it acceptable to have a very simple rule?
Is there a unique ID specified?
If yes, update the data behind that UID.
If no, insert as new data.
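The rule above is simple enough to sketch in a few lines. Records are keyed by their unique ID (GnuCash uses GUIDs); field names here are placeholders:

```python
import uuid

# Existing data, keyed by unique ID (GUID in GnuCash terms).
existing = {"guid-1": {"customer": "ACME Ltd", "total": 100}}

def merge(record, store):
    """Proposed rule: known ID -> update that record; no ID -> insert new."""
    rid = record.get("id")
    if rid and rid in store:
        store[rid].update(record["data"])        # update the data behind that UID
    else:
        new_id = rid or str(uuid.uuid4())        # mint an ID for new data
        store[new_id] = dict(record["data"])     # insert as new
    return store

merge({"id": "guid-1", "data": {"total": 150}}, existing)      # update path
merge({"id": None, "data": {"customer": "New Co"}}, existing)  # insert path
print(existing["guid-1"]["total"])  # -> 150
print(len(existing))                # -> 2
```

The hard cases the simple rule doesn't cover (two files that describe the same account under different GUIDs) are exactly Derek's can of worms, and would need a matching heuristic on top.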
> * data-checking
To a large extent, covered within XML in terms of the wrong data in the wrong
field. Still some work to be done to check data types though - XML parsed
character data covers at least 4 different C data types! How does Gnucash
currently deal with a corrupt XML data file?
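Since every PCDATA field arrives as a string, the importer needs a per-field coercion step to recover the distinct underlying types. A sketch, with the field-to-type mapping as an assumption for illustration:

```python
from datetime import date
from decimal import Decimal

def coerce(field, text):
    # Per-field target types are illustrative assumptions: an XML
    # parser hands everything over as character data (strings).
    converters = {
        "quantity": int,
        "price": Decimal,          # exact decimal, not float, for money
        "posted": date.fromisoformat,
        "customer": str,
    }
    return converters[field](text)

print(coerce("quantity", "3"))         # 3
print(coerce("price", "9.99"))         # 9.99
print(coerce("posted", "2004-04-05"))  # 2004-04-05
```

A conversion failure here (e.g. a non-numeric quantity) is exactly the kind of corruption that DTD validation alone cannot catch, so it needs its own rejection path.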
> * data-insertion
Hoping to re-use existing conduits.