[GNC] [MAINT] Unplanned hardware outage for code.gnucash.org

Derek Atkins derek at ihtfp.com
Tue Jan 18 07:42:02 EST 2022


Good morning,

At around 11:30pm US/EST last night, the hardware hosting code.gnucash.org
crashed.  I noticed the outage this morning at around 6:30 and
power-cycled the hardware around 7am.  The system rebooted and the VMs
(including code) were back in operation by around 7:20am.

Looking at the host logs, it looks like the host system was still
alive but started having "sanlock" issues around 00:33 this morning, then
a watchdog error at 00:34, and an ATA error immediately after.  At this
point I started getting VDSM errors, another ATA error 10 seconds later,
and then VDSM execution errors (most likely all due to the ATA issues). 
About 30 seconds later the VDSM service exited into "failed state", and
the log abruptly ends shortly thereafter at 00:34:55.
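
A minimal sketch of how an ordering like the one above could be pulled
out of a saved log, assuming the log is available as plain text lines.
The function name first_seen and the match patterns are illustrative
assumptions, not the exact strings the host logged.

```python
import re

# Subsystems named in the message. The patterns are assumptions about
# how they might appear in a log, not the host's actual log strings.
PATTERNS = {
    "sanlock": re.compile(r"sanlock", re.IGNORECASE),
    "watchdog": re.compile(r"watchdog", re.IGNORECASE),
    "ata": re.compile(r"\bata\d+", re.IGNORECASE),
    "vdsm": re.compile(r"vdsm", re.IGNORECASE),
}


def first_seen(lines):
    """Map each subsystem to the first line mentioning it.

    Keys are inserted in order of first appearance, so iterating the
    result gives the order in which the subsystems showed up in the log.
    """
    seen = {}
    for line in lines:
        for name, pattern in PATTERNS.items():
            if name not in seen and pattern.search(line):
                seen[name] = line.rstrip("\n")
    return seen
```

Fed the previous boot's log (for example the output of
"journalctl -b -1", assuming the journal is persistent on that host),
iterating over first_seen(lines).items() would list each subsystem next
to its first log line.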

According to the reboot log, ATA4 is a 2TB SSD.  It is unclear if the
issue is the drive itself or the ATA driver.

At this point, the hardware is up and running, but clearly there is "an
issue".  I'm getting on a plane in 3h30m so hopefully the system will
remain stable until I return Thursday.

-derek

-- 
       Derek Atkins                 617-623-3745
       derek at ihtfp.com             www.ihtfp.com
       Computer and Internet Security Consultant


