[GNC-dev] Git branches
john
jralls at ceridwen.us
Wed Nov 16 00:08:15 EST 2022
I didn't completely follow all of dymitruk's essay either, but it seems clear to me that he's working in a much larger team than we are. His suggestion for handling merge conflicts was a shared git rerere cache; I understand the principle but I'm not completely clear about the implementation. I had the impression that release branches were like feature branches, used once by the release team and discarded, and that his team uses Atlassian's Jira to keep track of which branches are merged into each release. He didn't go into a lot of detail about how to do it, and absent us adopting Jira I don't think that would matter much. Anyway, I didn't mean to suggest that we should take up that whole rather complicated process; I just thought it a useful outline of the single-branch strategy.
Not only do we not do semantic versioning, I don't think we even know how. https://en.wikipedia.org/wiki/Software_versioning#Semantic_versioning has a brief description that says that you bump the major number if you remove or change some existing API, the minor number if you add API, and the patch number (which we got rid of with 3.0) for all other changes. That doesn't make any sense at all for GnuCash the application; it's for libraries. For GnuCash right now it would really just be for the bindings, and maybe only the Guile bindings.
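For reference, the bump rule described there is small enough to sketch in a few lines of Python. The names (Change, bump) are made up for illustration; nothing like this exists in GnuCash:

```python
from enum import Enum


class Change(Enum):
    BREAKING = "removed or changed existing API"
    ADDITION = "added new API"
    OTHER = "any other change"


def bump(version, change):
    """Return the next (major, minor, patch) tuple for a given kind of change."""
    major, minor, patch = version
    if change is Change.BREAKING:
        # Incompatible API change: bump major, reset the lower components.
        return (major + 1, 0, 0)
    if change is Change.ADDITION:
        # Backward-compatible API addition: bump minor, reset patch.
        return (major, minor + 1, 0)
    # Everything else (bug fixes and the like) only bumps patch.
    return (major, minor, patch + 1)
```

Note that the rule only looks at what happened to the API, which is why it fits a library or a set of bindings much better than an application.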
GLib has some nice macros, based somewhat on Apple's Availability Macros, in https://gitlab.gnome.org/GNOME/glib/-/blob/main/glib/gversionmacros.h.in that can be used in function declarations to emit or suppress deprecation warnings based on a target version of any GLib-based library. We *could* use those directly or we could pinch them and adapt them to however we want to manage deprecations. GLib and several other GNOME libraries also have deprecated directories, e.g. https://gitlab.gnome.org/GNOME/glib/-/tree/main/glib/deprecated that they use when they deprecate whole classes.
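GLib does this with C preprocessor macros at compile time. Purely to illustrate the idea of gating a warning on a target version, here is a rough runtime analogue in Python. It is not GLib's mechanism, and every name in it (TARGET, deprecated_in, old_lookup, new_lookup) is hypothetical:

```python
import functools
import warnings

# Hypothetical: the API version the calling code says it is written against.
TARGET = {"version": (5, 0)}


def deprecated_in(version, replacement):
    """Mark a function as deprecated starting with the given (major, minor)."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Warn only when the caller targets a version in which the
            # function is already deprecated; older targets stay quiet.
            if TARGET["version"] >= version:
                warnings.warn(
                    f"{func.__name__} is deprecated since "
                    f"{version[0]}.{version[1]}; use {replacement} instead",
                    DeprecationWarning,
                    stacklevel=2,
                )
            return func(*args, **kwargs)
        return wrapper
    return decorate


@deprecated_in((4, 0), replacement="new_lookup")
def old_lookup(name):
    return name.upper()
```

A consumer that still targets 3.x would set TARGET["version"] to (3, 0) and see no warning from old_lookup, while one targeting 5.0 would.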
The SQL backend has that per-table version check and built-in update queries to update a database when the version changes, but since it doesn't write out a new DB on every save it has a much greater need for that feature than does the XML backend. There's also GNUCASH_RESAVE_VERSION that's available to indicate that the database needs to be purged and rewritten from memory. Fortunately we haven't needed to change it. The XML backend does have versions on each top-level element. They're all 2.0.0, suggesting that either we haven't been very good about updating them when we change something or that the schema is a lot more stable than we think it is. From a design standpoint I'm not sure that versioning every entry is all that useful considering that everything is written out fresh with every save.
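To make the per-table check concrete, here is a minimal sketch of the general pattern using Python's sqlite3 module. It is not the actual backend code: the table layout, the column names, and the single upgrade step are all assumptions made up for the example.

```python
import sqlite3

# Hypothetical: the table versions this build of the program expects.
CURRENT = {"accounts": 2}

# Hypothetical upgrade steps, keyed by (table, version being upgraded from).
UPGRADES = {
    ("accounts", 1): ["ALTER TABLE accounts ADD COLUMN hidden INTEGER DEFAULT 0"],
}


def make_old_db():
    """Create an in-memory database still at version 1 of the accounts table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE versions (table_name TEXT PRIMARY KEY, table_version INTEGER)"
    )
    conn.execute("CREATE TABLE accounts (guid TEXT PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO versions VALUES ('accounts', 1)")
    conn.commit()
    return conn


def upgrade(conn):
    """Bring every known table up to the expected version, one step at a time."""
    for table, wanted in CURRENT.items():
        row = conn.execute(
            "SELECT table_version FROM versions WHERE table_name = ?", (table,)
        ).fetchone()
        if row is None:
            raise RuntimeError(f"no version recorded for table {table}")
        have = row[0]
        if have > wanted:
            # Written by a newer program than this one: refuse to touch it.
            raise RuntimeError(f"{table} is at version {have}, newer than {wanted}")
        while have < wanted:
            for statement in UPGRADES[(table, have)]:
                conn.execute(statement)
            have += 1
            conn.execute(
                "UPDATE versions SET table_version = ? WHERE table_name = ?",
                (have, table),
            )
    conn.commit()
```

Calling upgrade(make_old_db()) adds the hidden column and records version 2. Because the steps are chained one version at a time, a database that skipped several releases would still walk through every intermediate step.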
Regards,
John Ralls
> On Nov 15, 2022, at 9:25 AM, Geert Janssens <geert.gnucash at kobaltwit.be> wrote:
>
> On Monday 14 November 2022 19:59:24 CET john wrote:
> > I guess we could do that as long as we continue the no-backports policy, but
> > it's something you argued against when we started using git-flow a few
> > years ago.
> >
>
> I don't have a clear memory of what I argued against way back then. It doesn't matter much. In reality we have continued to avoid backporting anyway, which is just fine for the small team that we are.
>
> > But what about the opposite approach, having only one permanent branch and
> > no major releases? Instead of 5.0 next spring we'll release 2023.1 and the
> > spring after that 2024.1, with .2 in June, .3 in September, and .4 in
> > December every year? Major changes, like c++options, get merged when ready;
> > we might do a beta release (e.g. 2023.2beta) a month before a release with
> > a major change to get better user testing. We'd have to work out policies
> > for API and schema changes because it would blow up the file upgrade path
> > for users who've skipped some releases. There's a very dense exposition on
> > this pattern at http://dymitruk.com/blog/2012/02/05/branch-per-feature/.
>
> It's actually a branch and release pattern I had been considering but was hesitant to bring up as perhaps too radical. Since you now bring it up for consideration, let's evaluate it after all.
>
> 1. I like the idea of only a single release branch and all development happening on feature or bugfix branches that get merged into this release branch when ready.
>
> 2. I also like the idea of dropping the distinction between a stable and a development series. It would bring improvements to users much faster in general: a change would be released when ready, not queued for the next major release (which could be 2 years away in the worst case).
> It's a bit like what fast-moving projects such as web browsers currently do.
>
> 3. Year-based release numbering is also very clear, and it always gives a reasonable indication of how old a given version of gnucash is.
>
>
> On the flip side
> 1. This does do away with semantic versioning completely. But that's the whole point of having only one release branch. Each release can be a mix of bugfixes and new features.
> 2. I imagine this only works well if newly added code (features or bugfixes alike) is well tested, implying having tests written for it. And that the existing code base is well tested as well. While slowly improving, the gnucash code is still not very well covered.
>
> I also read through the dymitruk article you linked to. There are a few other elements that are not fully clear to me yet:
>
> * he talks about an integration branch. Is that a branch that people continue to merge their new work in, and that just serves
> a. to discover and resolve merge conflicts early on and
> b. to run an integration test suite on
> Will this branch ever be cleaned up, or will merge upon merge just be added to it? I have no clear example of how such a branch is really used.
>
> * there's a separate release branch, which can be reset from time to time if bad features are to be skipped for the most recent release. Resetting a branch seems to conflict with distributed repositories in my mind. But perhaps this is not a problem if it's commonly known that this is a resettable branch. And no devs except for the release manager should really check out this branch, and then even only while preparing for releases? It's a bit vague to me.
>
> * handling merge conflicts and sharing the resolutions seem to be an important part of the solution. Otherwise these conflicts continue to trip up different devs. There was a suggestion as to how to do this, but nothing concrete. Something to figure out as well.
>
>
>
> As for the API and schema changes, that would indeed require some reconsideration.
>
> I have a few first thoughts, but nothing well structured:
>
> * For API the important change to keep in mind is deprecation. New API won't be an issue. Do we support function signature changes or should a new function be defined in that case ?
> Current policy is that we deprecate in a stable series and remove in a future major release. As our current schedule is a two-year cycle for major releases, we could make the policy "a deprecated feature/function will be kept around for 2 years, after which it will be definitely removed". Other durations can be chosen as well, as long as it's clear. So consumers of the api could at most jump two years ahead from the version they currently use with a guarantee their own code continues to work. At that point they should do the work of updating their code to cope with deprecated api.
>
> * Alternatively we could maintain a list of deprecated symbols and write a small tool around it that consumers of our API can run to test if their own code still depends on these deprecated symbols. Or not really remove the deprecated symbols but move them to a separate source file that prints out error messages when this removed symbol is still used. These messages could indicate in which version of gnucash the symbol was removed to help users go back to a version that still works with their code. I have not thought this through very deeply to imagine whether this is even feasible.
>
> * It could also be a mixture of both, keep a removal list for a reasonable amount of time to guide users to compatible older releases, but then finally drop the removed symbol after all. I think much of this could be mostly automated, similarly to how I wrote xml files to keep track of deprecated GSettings schemas.
>
> * As for schema changes again I see two possibilities.
>
> 1. The first is to extend slightly on how we do things now, namely if the user's data contains bits that require adjustment, just do them the first time this data is loaded or effectively used. This has slowly become a spaghetti throughout the engine and from time to time we drop bits of this to keep the code manageable (and hence older data can't be upgraded any more). If we start to effectively use the version component of our schemas (both xml and dbi backends have this), we could also require a minimum version of a schema for a certain version of gnucash. Each time we drop a bit of conversion code, we can bump this minimum required schema version (and conveniently guide the user to the last version of gnucash that wrote this version of the schema). So each version of gnucash would have a minimum and a maximum schema version it supports and these versions are updated as we add or remove schema related code.
>
> 2. The other approach is to implement a separate bit of code that's solely responsible for tracking schema changes (akin to the idea of having a scrubbing function specifically for this reason). Whenever a data file is to be opened, this code can check the schema version and do all the data transformations necessary to bring it to the current schema. The advantage of this would be that we unclutter the rest of the code. The schema changes could also be in separate subfunctions, eg one to do the changes from schema version 1 to 2, a second to do the changes from schema version 2 to 3 and so on. This list could become very long over time, yet stay very clear as each step is nicely isolated and easily readable. For that matter these steps could be in appropriate data transformation languages (xslt or an sql data modelling language) rather than code or even a mixture of both depending on what we need. It would allow us to convert very old data files (well, starting from where we implement this at least) over time without having bits of data conversion code all over the engine code.
> With such a piece of code in place that only grows over time with schema changes we can easily support converting older data files to the most recent schema over very long periods of time.
>
> All of this also comes with extra work of course, which takes developer time away from other interesting improvements. So it's worth evaluating whether this alternative versioning scheme brings enough benefits to be worth the effort.
>
> Geert