[Development] The CI is down atm

Jędrzej Nowacki jedrzej.nowacki at qt.io
Mon Mar 19 08:58:06 CET 2018


Oh, thanks for sharing it. For the first time we have something that really 
points to fscache, or at least has it in the logs. I suggest replacing NFS +
fscache with a distributed file system with _good_ local caching (could be
cephfs?). The current setup only requires that the used part of the images is
cached locally and that all images are permanently stored in one location. As a
bonus we would migrate from a star setup to a mesh, which would be good
too. That cannot be hard to achieve ;-)

Cheers,  
 Jędrek


On Saturday, 17 March 2018 07:42:30 CET Tony Sarajärvi wrote:
> Ok I brought it up again. Hosts seem to be dying left and right. The current
> MAAS-provided Ubuntu 16.04 LTS seems to have at least 3 different symptoms
> of how it crashes:
> kernel BUG at /build/linux-Fk60NP/linux-4.10.0/include/linux/swapops.h:129!
> kernel BUG at /build/linux-Fk60NP/linux-4.10.0/fs/fscache/operation.c:494!
> or
> kernel BUG at /build/linux-Fk60NP/linux-4.10.0/fs/fscache/operation.c:68!
>  
> + possible memory corruption on multiple hosts causing crashes. These are
> being checked with memtest, but it hasn’t found anything wrong.
> So, coin is building again, but more crashes are due. I should swap to 17.04
> or 17.10 even though they have their own share of problems. They _might_
> have gotten fixed during this period when we went back to 16.04. And as
> 16.04 went from bad to worse recently, the 17.xx are surely more stable
> right now. 
> Fingers crossed!
>  
> -Tony
>  
> Oh, and capacity is heavily reduced until the crashed servers are deployed
> back. Naturally, a long queue has already built up. ☹
> We aren’t monitoring 24/7, so please inform qt.ci at qt.io if you suspect
> the CI has crashed. Thank you! 
> From:  Tony Sarajärvi
> Sent:  Saturday, 17 March 2018 8:03
> To:  development at qt-project.org
> Subject:  The CI is down atm
>  
> Hi
>  
> The CI is down atm and I can’t even log in to the server right now. I’m
> trying to fix this somehow right away. 
> -Tony




