[Development] The CI is down atm
Tony Sarajärvi
tony.sarajarvi at qt.io
Sat Mar 17 12:14:09 CET 2018
(top posting, thank you outlook)
NFS is the backbone of how our CI works. I've been drawing a diagram of how our infra is set up so that everyone can understand the bits and pieces of it, but I've been so busy lately that it hasn't progressed. I began working on it during the slower period at Christmas.
Basically we have 20 hosts that need access to the same data. For this we have a Dell Compellent storage system in the background, and OpenNebula stores its qcow2 images there. These qcow2 images are shared over NFS, and all 20 hosts read them. If the hosts constantly read that data over the local network, the NFS server would not cope. That's why every host has an NFS cache, so that the data it needs is mostly available locally whenever it's needed.
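To put rough numbers on why the per-host caches matter, here is a toy model in Python. It is only a sketch under made-up assumptions: the image name base.qcow2, the boot count and the block count are hypothetical, and the classes do not correspond to anything in the real setup. It just counts how many reads reach the shared server with and without a cache on each host.

```python
# Toy model, not the real CI infrastructure: all names and numbers are
# hypothetical and only illustrate the effect of a per-host cache.

class NfsServer:
    """Stands in for the shared NFS export holding the qcow2 images."""

    def __init__(self):
        self.reads = 0  # number of reads that actually reached the server

    def read_block(self, image, block):
        self.reads += 1
        return f"{image}:{block}"


class CachingHost:
    """A host that keeps every block it has fetched in a local cache."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def read_block(self, image, block):
        key = (image, block)
        if key not in self.cache:
            # Cache miss: fetch from the server once and remember the result.
            self.cache[key] = self.server.read_block(image, block)
        return self.cache[key]


def simulate(hosts=20, boots=50, blocks_per_boot=100, cached=True):
    """Return how many reads hit the server for the given workload."""
    server = NfsServer()
    # Without a cache, each host talks to the server directly.
    readers = [CachingHost(server) if cached else server for _ in range(hosts)]
    for reader in readers:
        for _ in range(boots):
            for block in range(blocks_per_boot):
                reader.read_block("base.qcow2", block)
    return server.reads


if __name__ == "__main__":
    print("server reads without cache:", simulate(cached=False))
    print("server reads with cache:   ", simulate(cached=True))
```

In this model the uncached hosts hit the server for every block on every boot, while the cached hosts fetch each block only once per host.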
It would be a dramatic change in the infra setup if we distributed all these qcow2 images to the hosts beforehand, and we don't have the storage space on the hosts to store the images either ☹
-Tony
-----Original Message-----
From: Allan Jensen
Sent: Saturday 17 March 2018 11:25
To: Tony Sarajärvi <tony.sarajarvi at qt.io>
Cc: development at qt-project.org
Subject: Re: The CI is down atm
On Saturday, 17 March 2018 07:42:30 CET Tony Sarajärvi wrote:
> Ok I brought it up again. Hosts seem to be dying left and right. The
> current MAAS-provided Ubuntu 16.04 LTS seems to have at least 3
> different symptoms of how it crashes.
>
>
>
> kernel BUG at /build/linux-Fk60NP/linux-4.10.0/include/linux/swapops.h:129!
>
> kernel BUG at /build/linux-Fk60NP/linux-4.10.0/fs/fscache/operation.c:494!
>
> or
>
> kernel BUG at /build/linux-Fk60NP/linux-4.10.0/fs/fscache/operation.c:68!
>
>
>
> + possible memory corruption on multiple hosts causing crashes. These are
> being checked by memtest, but it hasn’t found anything wrong.
>
Googling around a bit, these errors seem to be pretty common and have been happening across many different distros for years. The general conclusion seems to be that fscache is just not very stable.
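For what it's worth, a quick way to see which source files the crashes point at is to tally the "kernel BUG at" lines from the hosts' logs. Below is a minimal Python sketch; the regular expressions assume paths shaped like the three quoted above, and the helper name bug_sites is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as "kernel BUG at /build/.../fs/fscache/operation.c:494!"
BUG_RE = re.compile(r"kernel BUG at (\S+):(\d+)!")
# Strips the build prefix up to and including the "linux-<version>/" directory.
# This layout is an assumption based on the paths quoted in this thread.
SRC_ROOT_RE = re.compile(r"/linux-\d[\d.]*/(.+)$")


def bug_sites(log_text):
    """Count kernel BUG reports per source file, relative to the kernel tree."""
    counts = Counter()
    for path, _line in BUG_RE.findall(log_text):
        match = SRC_ROOT_RE.search(path)
        # Fall back to the full path if the build prefix looks different.
        counts[match.group(1) if match else path] += 1
    return counts


if __name__ == "__main__":
    sample = """
    kernel BUG at /build/linux-Fk60NP/linux-4.10.0/include/linux/swapops.h:129!
    kernel BUG at /build/linux-Fk60NP/linux-4.10.0/fs/fscache/operation.c:494!
    kernel BUG at /build/linux-Fk60NP/linux-4.10.0/fs/fscache/operation.c:68!
    """
    for source_file, count in bug_sites(sample).most_common():
        print(count, source_file)
```

Run over the three lines quoted above, two of the three reports land in fs/fscache/operation.c.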
Can't you somehow avoid NFS instead?
'Allan