[GNC-dev] Save as Postgres
jralls at ceridwen.us
Fri Sep 4 14:34:55 EDT 2020
> On Sep 4, 2020, at 9:43 AM, Greg Ingram <ingram at symsys.com> wrote:
> I'm a long-time user and recently started to lurk on the -devel list. This problem may belong on -user but it seems like a problem for developers rather than my fellow users.
> I have a set of book in a SQLite3 file and I'm trying to save it in PostgreSQL. It's still currently running. Here's a couple of lines from top:
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 42022 ingram 20 0 1600828 796256 93648 R 100.0 9.9 125:28.52 gnucash
> It keeps one CPU close to pegged. I can't tell what it's doing. The Save As dialog is still visible with the Save As button looking like it's been pressed. I don't see any relevant activity in system logs or GnuCash logs. As far as I can tell, it's never connected to the PostgreSQL server. It hasn't created any tables. I created the database after I aborted a previous try where the database didn't exist yet.
> I'm using GnuCash 3.8, Build ID: 3.8b+(2019-12-29), as distributed with Kubuntu/Focal. I've also used a 3.8 from a flatpak on a machine with an older version of Kubuntu.
> It's a pretty big data set: 112M SQLIte3 file with roughly 10K accounts, 50K transactions, and 160K splits.
> Is there hope it'll start writing to the database?
> I searched for information related to what I'm seeing and most of what I found seemed to be about GnuCash 2.8 and earlier. There was some discussion about revamping GnuCash to take better advantage of SQL and that, at that time, it was still reading the entire database into memory.
> Are things different now? Is there a performance gain to be had with a SQL back end? I switched from XML to SQLite3 because it seemed like the program was bogging down. And that's why I'm looking to try PostgreSQL now.
> I've run into what may be a similar problem where I can no longer import transactions. Or I wasn't willing to wait long enough. Something I read back then suggested that some part of matching transactions to accounts involved a sort of exponential growth in the work it was doing. That's probably not clear but whatever it was led me to conclude that I had too much data for the program to handle. I work around the issue by importing into an almost empty set of accounts and the cut and paste into my official books.
There never was a GnuCash 2.8. Late in the development cycle for what would have been 2.8 we had to change the GUI from Gtk2 to Gtk3 and decided to release the result as GnuCash 3. That has a SQL backend rewritten in C++ but the underlying design is the same and didn't change the way GnuCash interacts with the database. You'll probably find references to a Postgres backend from GnuCash 2.2. That was replaced by the SQL/DBI backend in 2.4 so you should disregard anything you find about that, but anything about 2.4 or later is still useful.
The first thing to do is to make sure that you can connect to your Postgres server and create a database. I suggest you do that with File>New and selecting the Postgres backend so that it's not conflated with the high overhead of copying a large database. Once you're sure that works *then* migrate your SQLite3 database.
The SQL backend is written to create the database for you, it won't work to create an empty database and use it. Consequently the user you connect with must have CREATE* privs on the server.
There is only one performance gain with the SQL backend regardless of SQL engine. Of the three SQLite3 has the least overhead so unless the MySQL/Maria or PG server is running on hardware optimized for it and connected by a very fast network you'll get the best performance with SQLite3. That aside, the SQL backends load the whole database into memory and all GnuCash work is done on those in-memory objects. The only difference is that when you change something the SQL backends commit it immediately to the database; the XML backend waits and writes everything back out either when you tell it to save or periodically if you have autosave turned on. Since GnuCash is mostly single-threaded and has no object locking XML saves block the UI so the one performance gain SQL affords is that you won't have to wait for autosaves to complete every n minutes.
The matching algorithm for imports is pretty slow. Jean Laroche made some nice improvements for 4.2 that we'll release on the 27th. Dunno if it will be enough, but switching backends isn't going to help that at all.
More information about the gnucash-devel