Hi,
I have lava-master and lava-slave v2018.1 installed, and a qemu device added. Test jobs can be scheduled. I then followed https://validation.linaro.org/static/docs/v2/pipeline-server.html#using-zmq-... to enable ZMQ authentication.
The certificates were generated correctly, and the public certificates were copied to the master and the slave respectively. I am using the following configs:

lava-master
```
MASTER_SOCKET="--master-socket tcp://*:5556"
LOGLEVEL="DEBUG"
ENCRYPT="--encrypt"
MASTER_CERT="--master-cert /etc/lava-dispatcher/certificates.d/master.key_secret"
SLAVES_CERTS="--slaves-certs /etc/lava-dispatcher/certificates.d/"
```
lava-slave
```
MASTER_URL="tcp://192.168.11.214:5556"
LOGGER_URL="tcp://192.168.11.214:5555"
HOSTNAME="--hostname lava-slave1"
LOGLEVEL="DEBUG"
ENCRYPT="--encrypt"
MASTER_CERT="--master-cert /etc/lava-dispatcher/certificates.d/master.key"
SLAVE_CERT="--slave-cert /etc/lava-dispatcher/certificates.d/slave1.key_secret"
```
After restarting lava-master and lava-slave, I see the following logs. It seems the connection was established, but lava-logs went offline.

lava-master
```
2018-01-30 11:05:50,260 DEBUG lava-slave1 => PING(20)
2018-01-30 11:05:52,086 DEBUG lava-master => PING(20)
2018-01-30 11:06:08,728 DEBUG lava-logs => PING(20)
2018-01-30 11:06:10,261 INFO scheduling health checks:
2018-01-30 11:06:10,270 DEBUG -> disabled on: lxc, qemu
2018-01-30 11:06:10,271 INFO scheduling jobs:
2018-01-30 11:06:10,272 DEBUG - lxc
2018-01-30 11:06:10,292 DEBUG - qemu
2018-01-30 11:06:10,332 DEBUG lava-slave1 => PING(20)
2018-01-30 11:06:12,115 DEBUG lava-master => PING(20)
2018-01-30 11:06:20,252 INFO [POLL] Received a signal, leaving
2018-01-30 11:06:20,254 INFO [CLOSE] Closing the controler socket and dropping messages
2018-01-30 11:06:21,203 INFO [INIT] Dropping privileges
2018-01-30 11:06:21,204 DEBUG Switching to (lavaserver(114), lavaserver(119))
2018-01-30 11:06:21,204 INFO [INIT] Marking all workers as offline
2018-01-30 11:06:21,209 INFO [INIT] Starting encryption
2018-01-30 11:06:21,211 DEBUG [INIT] Opening master certificate: /etc/lava-dispatcher/certificates.d/master.key_secret
2018-01-30 11:06:21,238 DEBUG [INIT] Using slaves certificates from: /etc/lava-dispatcher/certificates.d/
2018-01-30 11:06:21,245 INFO [INIT] LAVA master has started.
2018-01-30 11:06:21,246 INFO [INIT] Using protocol version 2
2018-01-30 11:06:41,247 WARNING lava-logs is offline: can't schedule jobs
2018-01-30 11:07:01,255 WARNING lava-logs is offline: can't schedule jobs
2018-01-30 11:07:04,433 INFO lava-slave1 => HELLO
2018-01-30 11:07:04,433 WARNING New dispatcher <lava-slave1>
2018-01-30 11:07:09,450 DEBUG lava-slave1 => PING(20)
2018-01-30 11:07:21,260 WARNING lava-logs is offline: can't schedule jobs
2018-01-30 11:07:29,477 DEBUG lava-slave1 => PING(20)
2018-01-30 11:07:41,265 WARNING lava-logs is offline: can't schedule jobs
```
lava-slave
```
2018-01-30 11:06:10,283 DEBUG PING => master (last message 20s ago)
2018-01-30 11:06:10,335 DEBUG master => PONG(20)
2018-01-30 11:06:30,356 DEBUG PING => master (last message 20s ago)
2018-01-30 11:07:04,379 INFO [INIT] LAVA slave has started.
2018-01-30 11:07:04,380 INFO [INIT] Using protocol version 2
2018-01-30 11:07:04,390 INFO [INIT] Starting encryption
2018-01-30 11:07:04,390 DEBUG Opening slave certificate: /etc/lava-dispatcher/certificates.d/slave1.key_secret
2018-01-30 11:07:04,413 DEBUG Opening master certificate: /etc/lava-dispatcher/certificates.d/master.key
2018-01-30 11:07:04,414 INFO [INIT] Connecting to master as <lava-slave1>
2018-01-30 11:07:04,415 INFO [INIT] Greeting the master => 'HELLO'
2018-01-30 11:07:04,440 INFO [INIT] Connection with master established
2018-01-30 11:07:04,442 INFO Master is ONLINE
2018-01-30 11:07:04,443 INFO Waiting for instructions
2018-01-30 11:07:09,450 DEBUG PING => master (last message 5s ago)
2018-01-30 11:07:09,455 DEBUG master => PONG(20)
```
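To make the gap in the lava-master log easier to spot, here is a minimal sketch (assuming Python 3 and the log line format shown in this post) that records when each peer last sent a PING or HELLO to the master. The names `last_seen`, `PEER_RE` and `SAMPLE` are hypothetical and not part of LAVA; `SAMPLE` is just a few lines copied from the master log, and the restart time is taken from the `[INIT]` lines.

```python
import re
from datetime import datetime

# Matches master log lines such as:
#   2018-01-30 11:06:08,728 DEBUG lava-logs => PING(20)
#   2018-01-30 11:07:04,433 INFO lava-slave1 => HELLO
PEER_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} \w+ (\S+) => (?:PING|HELLO)"
)


def last_seen(log_text):
    """Return {peer_name: datetime of its last PING/HELLO} (hypothetical helper)."""
    seen = {}
    for line in log_text.splitlines():
        match = PEER_RE.match(line.strip())
        if match:
            stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            seen[match.group(2)] = stamp  # later lines overwrite earlier ones
    return seen


# A few lines copied from the lava-master log in this post.
SAMPLE = """\
2018-01-30 11:06:08,728 DEBUG lava-logs => PING(20)
2018-01-30 11:06:12,115 DEBUG lava-master => PING(20)
2018-01-30 11:06:21,245 INFO [INIT] LAVA master has started.
2018-01-30 11:07:04,433 INFO lava-slave1 => HELLO
2018-01-30 11:07:29,477 DEBUG lava-slave1 => PING(20)
"""

if __name__ == "__main__":
    # Restart time taken from the "[INIT]" lines in the master log.
    restart = datetime(2018, 1, 30, 11, 6, 21)
    for name, when in sorted(last_seen(SAMPLE).items()):
        status = "seen after restart" if when > restart else "silent since restart"
        print(f"{name}: last message {when:%H:%M:%S} ({status})")
```

Run against the full master log, this should show that only lava-slave1 sends anything after encryption is enabled, while lava-logs and lava-master stay silent.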
From the Django admin console, I see that lava-slave1 is still online, but both the lava-master and lava-logs workers went offline, and test jobs are no longer being scheduled. Has anyone seen or hit this issue before? Any advice or suggestions would be appreciated.
Thanks, Chase