[Development] CI stabilization status
Sami Nurmenniemi
sami.nurmenniemi at qt.io
Wed May 2 11:28:32 CEST 2018
Hi all,
Additional effort was put into stabilizing the CI at the beginning of February. Highlights of the improvements so far are:
* Top flaky test cases have been fixed or blacklisted
* This was initially the number one reason for randomly failing integrations
* Mostly done for qtbase
* JIRA tickets created for the blacklisted tests
* Coin improvements
* Secondary Coin instance running for testing qt5 builds against the Coin master branch
* Garbage collection of old build artifacts for reducing disk usage on Coin host
* The last cancelled qt5 builds are protected
* Agent heartbeat added, builds no longer fail if VM or KVM host crashes
* Refactoring asynchronous code from Python threads to asyncio
* Reduced Gerrit connection attempts
* Improved robustness of CI for CI tests
* Fixed deadlocks causing integrations to hang
* Infra
* More KVM host machines deployed (previously 20, now 33 hosts)
* Network bandwidth on KVM host machines increased from 1 Gbit/s to 10 Gbit/s
* Provisioning
* Changes specific to one OS no longer require provisioning all of them
* OS package repositories are mirrored to reduce external network usage
* Automatic updates on guest VMs disabled to reduce variance of external components
* CI system monitoring page added to Grafana
* Simple list of red/green hearts (you have to be logged in to see these)
* Detects problems before they start affecting the CI stability
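For readers unfamiliar with the agent heartbeat idea mentioned above, here is a minimal sketch in Python with asyncio. It is not Coin's actual code: the names HeartbeatMonitor, beat, dead_agents, watch and the on_dead callback are all hypothetical, and the timeout values are placeholders. The assumption is that each agent reports in periodically and that the build of an agent that stops reporting can be rescheduled instead of being marked as failed.

```python
import asyncio
import time
from typing import Callable, Dict, List, Optional


class HeartbeatMonitor:
    """Hypothetical helper: remembers when each agent last reported in."""

    def __init__(self, timeout_s: float) -> None:
        self.timeout_s = timeout_s
        self.last_seen: Dict[str, float] = {}

    def beat(self, agent_id: str, now: Optional[float] = None) -> None:
        # Record a heartbeat; `now` can be injected to make tests deterministic.
        self.last_seen[agent_id] = time.monotonic() if now is None else now

    def dead_agents(self, now: Optional[float] = None) -> List[str]:
        # Agents whose last heartbeat is older than the timeout, sorted by id.
        now = time.monotonic() if now is None else now
        return sorted(
            agent_id
            for agent_id, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )


async def watch(
    monitor: HeartbeatMonitor,
    on_dead: Callable[[str], None],
    interval_s: float,
) -> None:
    # Periodically check for silent agents and hand each one to `on_dead`
    # exactly once (for example, to reschedule its build on another host).
    while True:
        await asyncio.sleep(interval_s)
        for agent_id in monitor.dead_agents():
            monitor.last_seen.pop(agent_id, None)
            on_dead(agent_id)
```

In such a setup, watch() would run as one asyncio task next to the code that receives heartbeats and calls beat(); no extra thread is needed for the periodic check.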
Problems we are still facing:
* KVM hosts crashing (kernel bug with NFS cache)
* Configure step hanging during build, mostly happens on macOS
* Random flaky test failures in components other than qtbase
* Some leftovers to iron out from the Coin refactoring
Best Regards,
Sami