[Development] Continuous Integration (CI) meetings
Frederik Gladhorn
frederik.gladhorn at digia.com
Tue Oct 15 15:05:22 CEST 2013
Hi all,
I know many of you are interested in knowing how the CI is doing. We have been
looking for a while at how to improve our communication in this area and of
course how to get the whole process running more smoothly for everyone.
From now on we'll have weekly meetings, similar to the release team's. Each
Tuesday at 13:00 CEST (
http://www.timeanddate.com/worldclock/fixedtime.html?msg=Qt+QA+meeting&iso=20131022T13&p1=187&ah=1
)
Since we just had a first attempt at getting this going, here is a little
summary and the IRC log below.
* We have openSUSE machines integrated in the CI (already announced on this
list); OpenSSL was missing but will be on them from now on.
* Android: tests not yet running, needs some help from Android team
(androiddeployqt missing). Basic infrastructure in place.
* Some issues with V4 on ARM are still being fixed, should be done today
though.
* QQuick2 test flakiness:
* lots of timing dependent tests
* maybe use cpu time (bogomips) for QTRY_VERIFY and friends
* maybe check the qml engine for running animations (would need new api)
* network test server: we'd like to update it but nobody is actively working
on it right now (the server based on Ubuntu 12.04 is almost working; some tests
still fail).
* long term we would like to have more reliability by running defined
snapshots of VMs for the tests; currently the test machines simply keep
running.
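The "cpu time for QTRY_VERIFY" idea above could be sketched roughly as follows. This is purely a hypothetical illustration (tryVerifyCpuTime is not a Qt Test API): it budgets the retry loop in process CPU time via std::clock, so a loaded CI machine that starves the test process does not eat the budget. Note the caveat raised later in the log: time spent sleeping or waiting on events consumes no CPU time, so a genuinely stuck test could wait a long while in wall-clock terms.

```cpp
#include <ctime>
#include <functional>

// Hypothetical sketch: retry a condition until it holds, but measure the
// timeout budget in *process CPU time* (std::clock) instead of wall time.
// Caveat: sleeping/blocked waiting consumes no CPU time, so this only
// bounds work actually done by the process, not elapsed real time.
bool tryVerifyCpuTime(const std::function<bool()> &condition,
                      double cpuSecondsBudget)
{
    const std::clock_t start = std::clock();
    while (!condition()) {
        const double used =
            static_cast<double>(std::clock() - start) / CLOCKS_PER_SEC;
        if (used > cpuSecondsBudget)
            return false;  // CPU-time budget exhausted, condition never held
    }
    return true;
}
```

A real integration would additionally pump the event loop between checks, as the QTRY_* macros do.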
Cheers,
Frederik
<tosaraja> ok, without further delaying this for no reason, let's begin this
:) So hello everybody
<lars> fregl: here as well
<olhirvon> tosaraja is going to lead the discussion
<fregl> great :) hi lars
<ablasche> hi
<pejarven> o/
<tosaraja> I didn't prepare much of an agenda, mainly just thought of
some things we could discuss. For starters I could tell you about the
current activities in the CI, what we are doing and what problems we might
have. And then if you have any questions for us or want to discuss something,
please do tell whenever you feel like it
<fregl> I guess these meetings are still new, so we'll find the best structure
over time, but for now I'd say we can quickly go through current issues
<fregl> one current thing that is interesting to me: sifalt, you found we don't
have openssl everywhere?
<tosaraja> SuSE's got their OpenSSL development library just 20 minutes ago
https://codereview.qt-project.org/#change,68203
<fregl> tosaraja: manually or using puppet?
<tosaraja> fregl: using puppet...so you can add 15 minutes to that ;)
<sifalt> fregl: yes, openSUSE and the wince70 embedded env
<tosaraja> the embedded is still unsolved
<fregl> ok, that is less bad than I thought :)
<fregl> does it block tests on wince or do we simply not run them?
<tosaraja> Which test is run for openssl? How did you notice it was missing
from suse?
<fregl> nierob_: ^
<sahumada> fregl: we dont run tests for wince
<fregl> I guess enginio fails without openssl, otherwise we probably just skip
the tests when running make check
<fregl> sahumada: ok, then I think this is not an urgent issue
<nierob_> by default if Qt is compiled without openssl the tests are not
executed
<nierob_> fregl: they are ifdef'ed
<fregl> fkleint: since we have lars here, maybe we can talk about the quick2
tests? you have tried to stabilize them and did not get too far, right?
<fregl> tosaraja: what do you have on the agenda? we just talked on Friday so
I don't have much else right now. any general status update?
<lars> fregl: I'm working on that right now (in a way). I'm going through our
tests and checking them against GC corruptions (found a few already).
<lars> that will hopefully help, but we'll only see over time
<fkleint> fregl: Erik mainly tried (see mail)
<fkleint> fregl: he found that he had to insert arbitrary sleeps
<fkleint> but that is not satisfactory
<tosaraja> fregl: not much really. After we get the current discussions out of
the way, i was going to tell you about the blockers in other areas
<fregl> so one point Erik writes in the mail is that make check does not skip
insignificant tests - any comments on that? tosaraja sahumada?
<fkleint> lars: We are facing the problem that the stuff shows quite a lot of
non-deterministic behaviour due to the multithreading and the different render
loops
<sahumada> fregl: dont know such an email :)
<fkleint> lars: Talked to Squish folks at DevDays and they are facing the
same issue
<fkleint> lars: You basically have to wait for animations to finish, etc., and
frames to be synced before proceeding with the test
<fregl> sahumada: forwarded
<fkleint> lars: that is quite hard on machines under load
<fkleint> basically increase sleep until it passes ;-)
<fregl> I think in a way we keep on hitting the same old issue: timers make
tests really non-deterministic and machines run sometimes under load
<fkleint> I wonder if we could have an API to check whether animations are
stopped and synced
<tosaraja> fregl: sahumada: I haven't looked at the script running tests, but
i would imagine that make check doesn't read the files having the
insignificant flags, but our perl scripts do. We would have to transfer the
logic from perl -> make to enable that. I might be totally wrong here, but i
suspect this is how it works
<fkleint> QTRY_VERIFY(QuickEngine.idle() ) or similar
<lars> fkleint: fregl: yes, that's one big issue. best option is probably to
talk to gunnar about it.
<fkleint> hm,ok
<lars> fkleint: we fixed some issues by speeding up animations and using
proper waitForWindowShown etc...
<lars> fkleint: did them when gunnar was here 2 weeks ago
<lars> I think the listview test is a lot better now for example.
<fkleint> yep, but CI machines can be under load & really slow
<fregl> another idea that janarve had was to make tests work with cpu time
instead of wall time or such
<fkleint> but that might differ across platforms
<janarve> yes
<janarve> actually per-process time
<fregl> maybe someone can tell if that would be a sensible way of going about
it. otherwise we can make it an action point to check with gunnar about
animations
<lars> fregl: the problem is that we're using QTRY_COMPARE quite a few places,
and that'll time out. so we'd need a different way that tells us that
animations are done.
<lars> but we really shouldn't even get close to 5 secs with any of our
animations. slows down tests a lot as well
<nierob_> I thought that animations are using vsync which is the real time. So
using process time will not help
<janarve> lars: I agree. I think measuring process-time would solve that
<nierob_> but it would help in the network tests
<janarve> nierob_: no, vsync is just for drawing. It still uses a timer to
measure how far the animation should go
<nierob_> ok
<janarve> Btw, do we have evidence that QTRY_COMPARE is really a big problem?
<fkleint> there is an "overload" where you can specify the timeout?
<janarve> yes, but its still not bullet proof
<fkleint> or another macro, QTRY_COMPARE_TIMEOUT or so
<nierob_> fkleint: yes, but the problem remains
<fkleint> ok, so, it should be possible to specify process time..there
<fkleint> hm
<janarve> If we measured process-time, we could have
QTRY_COMPARE_BOGOMIPS(how_many_bogomips_until_timeout)
<fkleint> Anyone up for maintaining QTestlib btw?
<janarve> the problem is that that parameter would be *very* large and not
very good to guess
<fkleint> then we need to add another set of macros..
<fregl> I don't think anyone looks at qtestlib at the moment
<fkleint> that macro-riddled design is a bit suboptimal..
<nierob_> janarve: QtestLib could query the cpu before running the tests and
get bogomips stats
<nierob_> janarve: then it could estimate waiting time
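nierob_'s suggestion (query the CPU's bogomips figure once, then scale timeouts from it) could start with something like the sketch below. parseBogomips is a hypothetical helper, not existing QTestLib code; in practice the text would be read from /proc/cpuinfo on Linux, and the field is not present on all platforms.

```cpp
#include <sstream>
#include <string>

// Hypothetical sketch: extract the "bogomips" value from /proc/cpuinfo-style
// text, so a test harness could scale its timeouts to machine speed.
// Returns -1.0 if the field is not present.
double parseBogomips(const std::string &cpuinfo)
{
    std::istringstream in(cpuinfo);
    std::string line;
    while (std::getline(in, line)) {
        const std::string::size_type colon = line.find(':');
        if (colon == std::string::npos)
            continue;  // not a "key : value" line
        if (line.compare(0, 8, "bogomips") == 0)
            return std::stod(line.substr(colon + 1));
    }
    return -1.0;  // field not found
}
```

The open question from the discussion remains: even with such a number, choosing the scaling factor per test is guesswork.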
<janarve> fkleint: well, all comparing in testlib are based on macros
<janarve> nierob_: so what number would you give to the macro?
<fregl> ok, janarve, fkleint, nierob_ do you think we can start a task force
to try implementing this? figuring out sane numbers can be done when we have a
proof of concept I would say
<nierob_> janarve: it would be nice to accept time, which could be recomputed
to bogomips
<fregl> we don't need to solve it right here and now
<janarve> I am interested
<janarve> fregl: ^
<ossi|tt> i don't think using cpu time is any good. it doesn't buy anything if
some events get lost, etc.
<nierob_> me too
<fregl> ossi|tt: how do events get lost?
<fkleint> but coming back to Quick2, would some API like QmlEngine::idle()
help?
<fkleint> (also thinking Squish)
<fregl> nierob_: janarve: great, you have an AP. I guess it might be a
combination of both times.
<ossi|tt> fregl: typically they are not sent ;)
<fregl> ossi|tt: that sounds like a bug that is in need of fixing then
<janarve> I'll discuss with nierob_
<ossi|tt> fregl: yes. we are talking about autotest ;)
<fregl> fkleint: the problem is animations running indefinitely
<fkleint> yes, except those basically
<fregl> ossi|tt: yes, so if events get lost, we better not fix the test but
the code that loses them
<ossi|tt> fregl: the point is that relying on cpu time will simply make some
failing tests wait forever
<fkleint> foreach (QAnimation *a) if (a->isRunning() && !isInDefinitely)
return false
<ossi|tt> fregl: just don't rely on anything that relates to the program doing
something particular
<fregl> ok, so we need to talk to gunnar about that, is he on irc atm?
<fkleint> Hm. can't see him
<ossi|tt> fregl: wall time is the best you can get. make a long enough
timeout, and make sure that the good case will *not* need much time.
<sletta> fregl: here
<fkleint> ossi|tt: Enter the multithreaded world of Quick2 ;-) this is not
widgets
<fregl> sletta: fkleint had some good question relating to animations in quick
<ossi|tt> fkleint: i'm not sure what this has to do with anything
<fregl> sletta: basically we were discussing flaky tests - some depend on
animations finishing
<tosaraja> Can systems where CPU speed and BUS speeds are adjustable according
to load mess up things like calculating CPU time?
<fregl> sletta: is there a way to check if there are running animations (apart
from those running forever)
<fkleint> sletta: backlog at http://paste.kde.org/p7fe90954
<ossi|tt> tosaraja: the kernel is supposed to consider that when giving back
cpu time. but anyway, as i said, i don't think cpu time is a good idea.
<tosaraja> ossi|tt: right
<ossi|tt> tosaraja: it may be a good idea to use resource-limited cgroups to
contain amok-running tests, though.
<fregl> ossi|tt: I think we can let nierob_ and janarve try to prototype
something and if that fails it fails.
<tosaraja> ossi|tt: are you volunteering to investigate this? ;)
<fkleint> Quick2 tests rather tend to not run amok, but just fail..
<ossi|tt> tosaraja: not really. but fundamentally it isn't hard ^^
<ossi|tt> fregl: it has already failed, because logic says that it's not going
to be reliable by design
<fregl> ossi|tt: I guess that's linux only though. and I'd like to move on to
the qquick2 problems
<ossi|tt> fregl: yes, it is
<fregl> ok, lets take the cpu time after this meeting.
<sletta> fregl: we don't have an API to query for running animations or to
distinguish between indefinitely running animations and animations which have a
definite but unknown finish time
<nierob_> sletta: could we have it?
<sletta> how would that help?
<fkleint> QTRY_VERIFY(QmlEngine::idle())
<fkleint> then proceed
<fkleint> with testing
<tosaraja> ok, if you pick this up after the meeting or have another thread
here on the side (think we can manage that) , I'd have another for you:
compiling V4 to ARM. It seems like Blackberry is stumbling upon the same
problem as we now have
<sletta> that still excludes metainvokes, timers and other async behavior
which is heavily used throughout Qt. I'm not sure that will fix anything
<fkleint> OK, so, everyone can sleep over it and maybe develop some ideas
<fregl> lars: v4 and arm is being fixed if I understand correctly?
<tosaraja> it was already discussed on the release mailing list shortly. Do we
have anyone doing the implementation to ARM? (currently it's only for THUMB)
<fregl> tosaraja: actually I think Simon is working on that right now
<sletta> Why does QTRY_VERIFY time out in the first place... If it ends up
hanging, the test will be killed anyway
<tosaraja> fregl: great! :)
<lars> fregl: yes, tronical has a fix he's now cleaning up
<lars> fregl: arm is actually mostly working, android was the issue :)
<fregl> tosaraja: so that one should be there in a day or two
<lars> fregl: erikv is also working on some arm related issues
<tosaraja> fregl: I'll start a build now and then...then :)
<lars> tosaraja: fregl: we hopefully have it all fixed tonight...
<tosaraja> Then we have Android testing as an issue
<fregl> tosaraja: so let's see if it works tomorrow, the fix needs to go
through integration first I guess ;)
<tosaraja> sifalt was trying with the latest changes, but was missing
something. Do we have Simo here?
<lars> fregl: we've had few issues with CI on declarative lately, so I'm
positive
<sifalt> tosaraja: I cherry-picked eskil's changes from gerrit, but it looked
like I was still missing something
<olhirvon> sifalt: ^^
<fregl> sifalt: maybe best talk to eskil directly
<sifalt> it was nagging about missing androiddeployqt
<tosaraja> getting those androids running tests shouldn't be that hard. We
already have a line of 10 tablets waiting to be tested on. When we get the
first machine set up and it manages to test correctly, we can just clone it
and we will have 10 machines connected to 10 tablets running the tests
<tosaraja> meaning, we can have several submodules verified on android.
<tosaraja> ios on simulator is something I have on my todo list, and will
start working on that probably already this week
<fregl> tosaraja: are the machines physical?
<tosaraja> fregl: no, 10 virtual machines connected to 10 tablets. I will
hard-code the different IP addresses of the tablets into environment variables
<fregl> right, so the VMs can access usb, sounds good.
<tosaraja> fregl: until we come up with how to easily create a pool of devices
and maintain a state machine on which is occupied and which is free, i thought
this would be the easy way out
<fregl> tosaraja: sounds sensible to me
<tosaraja> fregl: the connection is done over IP, not USB
<fregl> ok
<tosaraja> fregl: the tablet's ADB server is set to listen to IP
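For reference, the ADB-over-IP setup tosaraja describes is typically configured like this (the IP address and port below are examples, not the lab's actual configuration):

```shell
# One-time, while the tablet is still attached over USB:
# switch its adbd to listen on TCP instead of USB.
adb tcpip 5555

# From the CI VM, no USB access needed:
adb connect 192.168.1.42:5555   # example tablet address
adb devices                     # the tablet should now be listed by IP:port
```

After that, each VM can be pointed at "its" tablet purely by IP, which is what makes the hard-coded-environment-variable approach workable.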
<fregl> well, seems like you have that under control
<tosaraja> then... the new network test server
<tosaraja> there hasn't been any progress on it lately.
<nierob_> btw. digia used to sell a solution for multi-device mobile testing /
code execution so maybe we have such a thing ready
<tosaraja> afaik, peter-h fixed tests to work on the new server, but as far as
the remainder goes, I don't think we have gotten anywhere
<tosaraja> we still have a few tests failing preventing us from upgrading
<tosaraja> With all these other things on my table, I haven't had time to do
anything about this. And I reckon that neither has anyone else ;)
<tosaraja> But to more promising news... having CI create VMs on demand... and
always using a clean clone is progressing.
<tosaraja> Qt has finally gotten its own vSphere to build on. We have the
enterprise version of Jenkins installed as a trial
<tosaraja> Now I can start playing around with it and see how it connects and
how it works. We should also have the APIs for vSphere available now if we
wanted to create the modules for Jenkins ourselves.
<tosaraja> Do we have anyone here that has done that previously? Developed new
plugins for Jenkins, I mean
<tosaraja> If we did that work ourselves, we could still continue using the
free version of Jenkins
<fregl> that means we can basically start snapshotting CI machines and rolling
back to old states. Maybe we should consider moving in that direction and
depending less on puppet.
<tosaraja> Yes, we could scrap puppet after that
<fregl> I think as a first step that snapshotting sounds more important than
also starting VMs on demand
<tosaraja> and each time we did a modification to the template, we could run a
whole Qt build and run the tests on it to see that we don't break anything
<fregl> yes
<fregl> we seem to have the capacity to run all those machines now, so that
should not change
<tosaraja> fregl: snapshotting from the current situation is a bad idea, since
the machines are already out of sync and not clones. We would need to recreate
every machine first, then create a base snapshot and continue from there
<fregl> if the puppet (or manual) update is run on only one machine at a time,
as described above, we can then start using that to test everything and, if it
works, indeed use it as the new template for the other VMs
<tosaraja> fregl: and having the on demand would do that for us
<fregl> yes, of course start the snapshots from a clean slate
<fregl> tosaraja: but how do you teach jenkins that it can just create
machines? or do we pretend to jenkins that we still have a limited number of
machines running somehow?
<tosaraja> fregl: the first step Jenkins would take is to create a new node for
itself, and as that node starts it would create a VM that connects to it
<tosaraja> fregl: at least that's how i pictured myself doing it
<fregl> tosaraja: I don't understand. how does it create a node? is a node not
a running machine already?
<tosaraja> fregl: I don't really know how Jenkins manages to create a new node
on the fly, and how it would start working on it as it would be offline until
something connected to it.... it might be that some master node would have to
take care of the initial setup
<tosaraja> fregl: ^ still to test and figure out ;)
<fregl> tosaraja: so I think this is rather hard, that is why I would start
with a pool of machines first. These can then be reverted back to a snapshot
after the test run.
<sifalt> tosaraja: fregl: I don't think it is a problem. We already have a
situation where the master thinks there are more nodes than there actually
are
<tosaraja> fregl: how would we manage to fit puppet into all that? If we had a
snapshot and needed to run puppet on it, we would need to update the snapshot
<fregl> sifalt: ok, if that can be done, that's really great.
<tosaraja> ok...but it's past 3 o'clock here and the time is up for this
meeting
<fregl> tosaraja: I would imagine: run the snapshot, let puppet run, take new
snapshot
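fregl's revert/update/re-snapshot loop could look roughly like the following, sketched here with the govc vSphere CLI purely for illustration (govc postdates this discussion, and the VM and snapshot names are hypothetical):

```shell
VM="ci-linux-template"                      # hypothetical template VM name

# Start from the known-good state.
govc snapshot.revert -vm "$VM" clean-base

# (boot the VM and let a puppet run bring it up to date,
#  then run a full Qt build + tests against it as a sanity check)

# If everything passed, record the updated state as the new baseline.
govc snapshot.create -vm "$VM" "clean-base-$(date +%Y%m%d)"
```

The point of the workflow, independent of tooling, is that every configuration change goes through a tested snapshot before any CI machine sees it.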
<fregl> ok, let's just meet again next Tuesday.
<fregl> can someone write a short summary? we can attach the backlog.
<tosaraja> obviously nothing much changes regarding the usage of this
channel... whenever you need us or each other, we will be here. I'll just be
heading home now. Thanks and bye :)
<fregl> tosaraja: yes. and it's good to have a bit of focused time for this :)
<tosaraja> indeed
<olhirvon> This seems to be useful :) Thanks everyone for active
participation.
--
Best regards,
Frederik Gladhorn
Senior Software Engineer - Digia, Qt
Visit us on: http://qt.digia.com