QSF XML file backend for gnucash-gnome2-dev branch

Neil Williams linux at codehelp.co.uk
Tue Jan 25 18:29:34 EST 2005


On Tuesday 25 January 2005 10:35 pm, Josh Sled wrote:
> I wanted to echo Derek's comments: nice job!  :)

Thanks.

> On Tue, 2005-01-25 at 11:38, Neil Williams wrote:
> > I've got code in the patch just sent to gnucash-patches to correctly
> > distinguish a QSF XML file from v1, v2 or the older binary GnuCash text
> > formats and it loads using the QofBackendProvider mechanism rather than
> > as a GnuCash module.
>
> A question came up on the IRC channel regarding this process... Can you
> elaborate on how it's done?  

The QOF Doxygen output (extended to show all the source) should help:
http://code.neil.williamsleesmill.me.uk/doxygen/qsf-backend_8c-source.html

Look at lines 373 onwards.

Then look at:
http://code.neil.williamsleesmill.me.uk/doxygen/qofsession_8c-source.html
Lines 368 onwards.

> Is the namespace [or lack thereof, in the 
> older XML case] of the file being opened used to branch on which backend
> to use, or some other process?

The existing methods are still used to identify the existing data file types. 
If those all fail, I then parse the document against the QSF schemas and 
proceed from there. That happens in gnc-backend-file.c, lines 390 onwards:

static QofBookFileType
gnc_file_be_determine_file_type(const char *path)
{
    if(gnc_is_xml_data_file_v2(path)) {
        return GNC_BOOK_XML2_FILE;
    } else if(gnc_is_xml_data_file(path)) {
        return GNC_BOOK_XML1_FILE;
    } else if(is_gzipped_file(path)) {
        return GNC_BOOK_XML2_FILE;
    } else if(is_our_qsf_object(path)) {
        return QSF_GNC_OBJECT;  /**< QSF object file using only GnuCash QOF objects */
    } else if(is_qsf_object(path)) {
        return QSF_OBJECT;      /**< QSF object file that needs a QSF map */
    } else if(is_qsf_map(path)) {
        return QSF_MAP;         /**< QSF map file */
    } else {
        return GNC_BOOK_BIN_FILE;
    }
}
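
The point of that chain is the ordering: the existing detectors run first, the
QSF checks run only if they all fail, and the binary format is the fallback.
A minimal sketch of the same first-match-wins idea as a probe table follows.
Every name in it is hypothetical and is not part of GnuCash or QOF, and the
stand-in probes only look for a marker string in a buffer, where the real
checks parse and validate the file. The two QSF markers are the namespace URIs
discussed in this thread; the "<gnc-v2" and "<gnc>" markers are assumptions
made purely for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical file kinds for this sketch; not the real QofBookFileType. */
typedef enum {
    SKETCH_XML2,
    SKETCH_XML1,
    SKETCH_QSF_OBJECT,
    SKETCH_QSF_MAP,
    SKETCH_BINARY
} sketch_file_type;

/* A probe inspects the start of a document and says whether it matches. */
typedef int (*sketch_probe)(const char *head);

/* Stand-in probes: substring checks only, assumed markers (see lead-in). */
static int probe_xml2(const char *head)
{
    return strstr(head, "<gnc-v2") != NULL;
}

static int probe_xml1(const char *head)
{
    return strstr(head, "<gnc>") != NULL;
}

static int probe_qsf_object(const char *head)
{
    return strstr(head, "urn:qof-qsf-container") != NULL;
}

static int probe_qsf_map(const char *head)
{
    return strstr(head, "urn:qof-qsf-map") != NULL;
}

/* Order matters: existing formats first, QSF afterwards. */
static const struct {
    sketch_probe probe;
    sketch_file_type type;
} sketch_probes[] = {
    { probe_xml2,       SKETCH_XML2 },
    { probe_xml1,       SKETCH_XML1 },
    { probe_qsf_object, SKETCH_QSF_OBJECT },
    { probe_qsf_map,    SKETCH_QSF_MAP },
};

/* Return the type of the first probe that matches, else the binary fallback. */
sketch_file_type sketch_determine_file_type(const char *head)
{
    size_t i;

    assert(head != NULL);
    for (i = 0; i < sizeof sketch_probes / sizeof sketch_probes[0]; i++) {
        if (sketch_probes[i].probe(head))
            return sketch_probes[i].type;
    }
    return SKETCH_BINARY;
}
```

A table like this makes it easy to see (and change) the order in which formats
are tried, but it behaves the same as the if/else chain above.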


> > The patch includes the QSF schema for object and map files, it includes
> > code to install the schema in a qsf sub-directory of $prefix/share/ and
> > it also includes a test QSF map that will be the next phase of
> > development to map pilot-link QOF objects into GnuCash objects.
>
> One problem I wanted to call out was the use of the URIs
> "urn:qof-qsf-map" and "urn:qof-qsf-container" as the XML namespaces.

I know. I'd appreciate any tips you have on registering a namespace, once a 
suitable static URI is available - it's not yet.

> Not only are they not registered URN namespaces, but best practice is
> to use a namespace-name which is generally resolvable [i.e., an
> http-scheme URI], and which resolves into a useful document [i.e., is a
> URL].

During development, I didn't have a static location, so I used URN. 

> I'd recommend something date-stamped under http://qof.sourceforge.net/
> ... maybe:
>
>     xmlns:qsfMap="http://qof.sourceforge.net/ns/qof/2005/01/qsf/map#"
>    
>     xmlns:qsfContainer="http://qof.sourceforge.net/ns/qof/2005/01/qsf/container#"
>

I agree - this is a QOF backend and it will be used by pilot-link and 
hopefully by other applications too - without GnuCash being installed. So it 
would need to be somewhere on qof.sourceforge.net, and only Linas has access. 
I can't contact him; he hasn't been receiving email for a while.

It still needs to be installed with QOF for offline purposes, as expected.

> ... each of which is probably simply an HTTP redirect [303] back to the
> documentation.

The documentation isn't on SourceForge either; it's on my own server. I 
wouldn't want that to be the static URI, since SourceForge is more likely to 
be sticking around!

> > The backend can cope with partial QofBooks and is designed to handle
> > routines that create a second QofSession, populate that session with data
> > for export - any QOF compatible objects - and save the session to export
> > as QSF XML. There is absolutely no requirement for AccountGroup or any
> > other specific object to be present, the backend will write out just
> > invoices or just accounts, just customers or any mix.
>
> A better way to say this might be: "it is the caller's responsibility to
> ensure the consistency, validity and coherence of the sub-set of data
> being serialized".

Quite - QSF will never write an invalid XML file, nor accept one for input. I 
was just waffling, as usual.

(That's why the documentation on http://code.neil.williamsleesmill.me.uk/ is 
getting so large!)

> > QSF increases the libxml2 requirement to 2.6.0 for the gnome2 branch to
> > support the schema validation which identifies QSF files relative to
> > existing files.
>
> Is this the only reason for 2.6.0?

No - but I'd have to spend a while investigating exactly which calls are 
supported prior to 2.5.2.

> Can we make schematic validation optional?

No. It's fundamental to how QSF accepts and parses the data. It's not just 
schema validation that needs 2.5.2, there are some xml constructs that came 
in at the same time.

Besides, I thought you were in favour of valid XML?

I'd hate to write a DTD for this; the objects may be possible, but the maps 
are quite something else.

> Can we get rid of it entirely?

Nope.

> > QSF uses UTC time throughout and uses this time format string:
> > #define QSF_XSD_TIME   "%Y-%m-%dT%H:%M:%SZ"
> > The datestring must be timezone independent and include all specified
> > fields.
>
> To clarify:
>
> * MUST the timezone be in UTC

It must be UTC (so that the timezone does not matter) AND it must use the Z - 
currently.

> , or MAY it be in some non-UTC timezone, 

No. Although the code will handle it, the XML will not validate. IIRC, the 
-0500 format won't validate, it would need to be -05:00. Try it with xmllint 
and the files from:
http://code.neil.williamsleesmill.me.uk/qsf.html#AEN607

> as per [1]? 
> * If it MUST be, do you intend to fully support xsd:dateTime, or no?

I do fully support xsd:dateTime - the long form with a Time value. However, 
producing xsd:dateTime without using Z is a hassle.
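
To illustrate the hassle: a numeric offset in the -0500 form (which is what
strftime's %z typically produces) has to be rewritten to -05:00 before it can
pass as xsd:dateTime. A minimal sketch of that rewrite is below. The function
name is hypothetical and this is not code from the QSF backend; it only shows
the extra string handling that the Z form avoids.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Rewrite a numeric offset "+hhmm" or "-hhmm" as "+hh:mm" / "-hh:mm".
 * Returns 0 on success, -1 if the input is not in the expected form or
 * the output buffer is too small (7 bytes are needed, including the NUL).
 * Hypothetical helper for illustration only.
 */
int sketch_offset_to_xsd(const char *offset, char *out, size_t len)
{
    int i;

    assert(offset != NULL && out != NULL);

    if (strlen(offset) != 5 || (offset[0] != '+' && offset[0] != '-'))
        return -1;
    for (i = 1; i < 5; i++) {
        if (offset[i] < '0' || offset[i] > '9')
            return -1;
    }
    if (len < 7)
        return -1;

    out[0] = offset[0];   /* sign */
    out[1] = offset[1];   /* hours */
    out[2] = offset[2];
    out[3] = ':';         /* the separator xsd:dateTime requires */
    out[4] = offset[3];   /* minutes */
    out[5] = offset[4];
    out[6] = '\0';
    return 0;
}
```

Given "-0500" this writes "-05:00"; input that already contains the colon is
rejected rather than passed through.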

> [1] http://www.w3.org/TR/xmlschema-2/#dateTime

It's a nice idea but when you get up close with xmllint, you can't format the 
timezone to validate without using regexps. It's easier to use UTC and Z.
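
By contrast, the UTC form needs nothing beyond gmtime() and strftime() with
the QSF_XSD_TIME format string quoted above. A minimal sketch, assuming a
plain time_t as input; the function name is hypothetical and the backend's
own code may well differ.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Format string quoted earlier in this thread. */
#define QSF_XSD_TIME "%Y-%m-%dT%H:%M:%SZ"

/*
 * Write t into buf as a UTC timestamp with a trailing Z.
 * Returns the number of characters written (excluding the NUL),
 * or 0 if the conversion fails or buf is too small.
 * Note: gmtime() uses static storage, so this sketch is not thread-safe.
 */
size_t sketch_format_utc(time_t t, char *buf, size_t len)
{
    struct tm *utc;

    assert(buf != NULL);

    utc = gmtime(&t);   /* break the time down in UTC, not local time */
    if (utc == NULL)
        return 0;
    return strftime(buf, len, QSF_XSD_TIME, utc);
}
```

For a time_t of 0 on a system with the usual Unix epoch, this yields
1970-01-01T00:00:00Z, whatever the local timezone of the machine.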

> > as well as QOF_TYPE_DEBCRED as a QOF_TYPE_BOOLEAN currently. The
> > parameter will be converted back during the creation of the QofObject
> > concerned.
>
> Hmm.  I'm assuming debcred is actually an enumerated value; is the
> intent to support that, or just to use boolean?

DEBCRED is handled as a boolean elsewhere . . . .

As long as it is an enum with only two possible values, that's fine.

Derek?

-- 

Neil Williams
=============
http://www.dcglug.org.uk/
http://www.nosoftwarepatents.com/
http://sourceforge.net/projects/isbnsearch/
http://www.neil.williamsleesmill.me.uk/
http://www.biglumber.com/x/web?qs=0x8801094A28BCB3E3
