IRC discussion on i18n, xml/utf8, and 1.8->2.0 data migration issues

Chris Shoemaker c.shoemaker at cox.net
Fri Feb 3 14:02:29 EST 2006


On Fri, Feb 03, 2006 at 12:27:59PM -0500, Derek Atkins wrote:
> Quoting Chris Shoemaker <c.shoemaker at cox.net>:
> 
> >>You can read them, but then what?  You would need to /remember/ them
> >>somehow so you could put them back into the data file.  Otherwise
> >>you just corrupt the data going back.
> >
> >There's no way I'd expect the old version to _preserve_ new data-types
> >added by newer versions.  But IMO, _ignoring_ new data-types and
> >remaining usable is better than just bombing.
> 
> I completely disagree.  As soon as you save off the data file you've
> now lost data.  I do agree that it shouldn't "bomb", but I do not believe
> that it should read the data.  IMNSHO It should say "you need a newer
> version of GnuCash to read this file" and just refuse to open it.

I think it should warn: "This file contains some data that was
introduced by a newer version of GnuCash, but which may safely be
ignored by this version of GnuCash, without affecting the integrity of
the remaining data.  However, that ignored data WILL BE ERASED from
any file that you save with this version of GnuCash.  Do you want to
continue?"  [ Note: I call this the "first warning" later on. ] 

Yes, this means distinguishing between changes that really are
backward-compatible, like "this is a new feature", from those which
aren't, like "this new field means the transaction is actually VOID."
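
As a rough sketch of how a loader might draw that line, assuming a
hypothetical compat="ignorable" attribute that a newer version would
put on elements an older version may safely skip (neither the attribute
nor the tag names below are part of the real GnuCash file format):

```python
import xml.etree.ElementTree as ET

# Tags this hypothetical old version understands (illustrative names only).
KNOWN_TAGS = {"account", "transaction"}


class NeedsNewerVersion(Exception):
    """Raised when an unknown element is not marked as safe to ignore."""


def load_book(xml_text):
    """Return (kept_elements, ignored_tags) for the top-level elements.

    Unknown elements marked compat="ignorable" (an assumed convention)
    are skipped and reported, so the caller can show the "first warning".
    Any other unknown element makes the loader refuse the file.
    """
    root = ET.fromstring(xml_text)
    kept, ignored = [], []
    for child in root:
        if child.tag in KNOWN_TAGS:
            kept.append(child)
        elif child.get("compat") == "ignorable":
            ignored.append(child.tag)
        else:
            raise NeedsNewerVersion(child.tag)
    return kept, ignored
```

If the ignored list comes back non-empty, the caller would show the
warning before allowing a save.  This only looks at top-level elements;
a new field nested inside a transaction would need the same check one
level down.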

> >Sure, if you make non-backward compatible changes to the file format,
> >you need to reflect that in the format version.
> 
> ANY unrecognized option is a non-backwards compatible change to the file
> format.

This is currently true, but if we purposely design a change that is
semantically backward-compatible, like budgets are, why should we
_guarantee_ that the change is syntactically backward-INcompatible?

> And losing that data is even worse.  

The user should be allowed to make an informed choice between losing
certain incompatible data and accessing their data with their older
software.  I make this decision every time I save an OpenOffice
document, and I _appreciate_ the freedom to do so.

> Imagine if 1.6 did this.
> Now a user of 1.8 who has all their business and SX data in the file
> loads it into 1.6 which promptly ignores the business and SX data but
> doesn't tell the user.  User now saves file and LOSES all their business
> and SX data.  User now complains to us.

I agree there's a problem with this scenario, but I think the problem
is abuse of expectations, NOT losing the business and SX data.  This
cuts both ways: User upgrades GnuCash, tries it out for some time,
checks out the new features, decides that the one _dropped_
feature is a must-have, tries to go back to their old program only to
find out their only option is to lose all their newly entered
transactions.  Again there's data-loss, but the data-loss was not the
real problem.  The real problem was that the new version didn't import
his data with a warning that said, "WARNING: if you use any feature in
this version that isn't in an older version, your ENTIRE data file
will be UNREADABLE by that old version."

The way I see it, we _need_ one warning or the other, just to be fair
to the users.  I don't know which one we want for 1.8->2.0, but I know
that, in general, I want the _option_ of using the "first warning".

> >Of course.  That's WAY too hard.  But that's not what I was
> >suggesting.  Currently, it's _impossible_ to change the format
> >_at_all_ without breaking old versions.  I'm just suggesting we should
> >at least make it _possible_ to make backward compatible changes.
> 
> That's not completely true.  This is what the KVP frames are for.
> If you put new data into the KVP frame then previous versions can
> read it just fine.  But that makes it much harder on, say, DB schemas.

Ok, that's true.  But I don't think we need to constrain
backward-compatible format changes to that syntax.
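
For illustration, the KVP idea can be sketched as a free-form key/value
bag carried by each object.  This is a toy model, not GnuCash's actual
KVP API; the Account class, the record layout, and the key name are all
hypothetical, and it assumes the old version writes back keys it does
not interpret:

```python
class Account:
    """Toy object with fixed fields plus a free-form key/value bag."""

    def __init__(self, name, kvp=None):
        self.name = name
        # Keys this version does not understand are kept untouched.
        self.kvp = dict(kvp or {})


def from_record(record):
    """Build an Account from a dict as it might be read from a file."""
    return Account(record["name"], record.get("kvp"))


def to_record(account):
    """Serialize an Account back to a dict, including unknown keys."""
    return {"name": account.name, "kvp": dict(account.kvp)}
```

Because the old code never has to parse the new key, data stored this
way survives a load and save in the old version.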

> >Budgets is a perfect example.  There's _no_ good reason why a
> >pre-budgets version of gnucash shouldn't be able to open a data-file
> >from a version of gnucash that supports budgets.  Of course it can't
> >preserve budget data, but it should at least work with the data it
> >knows about.
> 
> I disagree.  See above.  If you DON'T have budget data, or any
> new-in-new-version features in your data file, then yes, it should be
> backwards compatible.
> But as soon as you get a new-in-new-version feature the file should NOT
> be readable by previous versions due to the data loss issue.
> 
> Solve the data loss issue and I'll change my opinion, but not before.

IMO, "solving" the data-loss issue means giving the user a choice:
"Yes, you may use this file with your old program, OR you may keep your
budget in this file, but not both.  Which do you want?"

<snip>

> >Well, the 1.8->2.0 is a bit of a separate issue, since the fix is
> >completely different.  I'm still not quite clear on whether the
> >encoding issue is solvable.  But personally, I'd feel a bit
> >irresponsible if we broke backward-compatibility for no other reason
> >than introducing budgets.  It just feels... rude.
> 
> We broke it for Business and SX in 1.8.  It was broken from 1.4->1.6
> for other reasons that I don't recall...  But it's broken ONLY IF
> YOU USE THE NEW FEATURE.  Just running the program doesn't break
> compatibility.  (xml encoding issues aside).
> 
> Please stop thinking like a developer.  Think like a user, a dumb
> user, a dumb user who knows NOTHING about computers or programming
> or the issues of when data loss can occur.  They care more about data
> integrity than compatibility.  

I really think I am thinking like a user.  But I don't think they're
so dumb as to be unable to choose between retaining the data related
to new features and using their old, familiar program to access their
file.  Apparently, lots of other document-format designers agree,
since this is a pretty standard option.

> Go ahead, ask on -user.  Ask this question
> and see what response you get:
> 
>   Would you rather be able to always read a data file created in a new
>   version of gnucash using an older version of gnucash, where saving that
>   data file will lose data when you re-open it in the new version, 

um, this doesn't represent my opinion.  Perhaps a better question would be:

  Do you have any desire to open files with a version of GnuCash older
than the version that saved the file?  If so, would you prefer that, 

  a) You must explicitly save your file from the new version as a
backward-compatible datafile, expunging any data related to new
features, OR

  b) You are warned by the old version of GnuCash that data related to
new features will be lost when you save with the old version, OR

  c) Sorry, if you've used any new feature, you just can't go back.

Neither a) nor b) is that hard to do (disregarding potential encoding
issues).  And, I actually believe that some users of old versions would
avoid upgrading if the only option is c).  
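
A minimal sketch of option a), assuming the new version knows which
top-level tags belong to new features (the tag set and function name
here are hypothetical, not the real GnuCash schema or API):

```python
import xml.etree.ElementTree as ET

# Tags introduced by the newer version (illustrative only).
NEW_FEATURE_TAGS = {"budget"}


def export_compatible(xml_text):
    """Return (xml_without_new_features, dropped_tags).

    The dropped list lets the caller tell the user exactly what the
    backward-compatible copy will not contain.
    """
    root = ET.fromstring(xml_text)
    dropped = []
    # Iterate over a copy so removing children is safe.
    for child in list(root):
        if child.tag in NEW_FEATURE_TAGS:
            root.remove(child)
            dropped.append(child.tag)
    return ET.tostring(root, encoding="unicode"), dropped
```

Option b) is the mirror image: the old version does the skipping at
load time and warns before saving.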

To be quite honest, the _only_ reason _I_ would upgrade is that I know
I can hand-edit the xml to retain backward compatibility.  It's
_enough_ of a perceived risk to be keeping other people's books in
non-commercial accounting software that I don't need any _additional_
risk due to a one-way, no-turning-back upgrade path.

-chris

