Overview

Request 673429 accepted

- Added Restart and RestartSec to restart
spark master and spark worker (bsc#1091479)


Ashwin Agate's avatar

Also added Restart to the spark-worker service file: if spark-master is down, there is a small chance that the worker also gets into a bad state or dies. We haven't seen evidence of this happening yet, but I think the change will make the setup more bulletproof.


Thomas Bechtold's avatar

Can you describe *why* it is useful to wait 15 seconds before the service is restarted? That sounds wrong.


Ashwin Agate's avatar

We have seen spark master shut down when it has trouble talking to zookeeper, which may be due to network issues or to zookeeper not being functional. The 15 seconds just give these problems some time to be resolved; otherwise spark might be restarted immediately and, if zookeeper still has problems, die again. This prevents continuous restarts.
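
For readers unfamiliar with the two directives discussed here, the following is a minimal sketch of what such a `[Service]` section could look like. Only the use of `Restart` and `RestartSec` and the 15-second delay come from this request; the `Restart=on-failure` value is an assumption, since the request does not say which restart policy was chosen.

```ini
# Sketch of the relevant part of a spark-master (or spark-worker) unit file.
# The Restart= value below is an assumption, not taken from the request.
[Service]
# Restart the service when it exits abnormally.
Restart=on-failure
# Wait 15 seconds before each restart attempt, so that zookeeper or the
# network has some time to recover before spark tries to reconnect.
RestartSec=15
```

Without an explicit `RestartSec`, systemd waits only 100 ms before restarting a service, so a master that dies because zookeeper is unreachable would likely be restarted into the same failure several times in quick succession.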

Request History
Ashwin Agate's avatar

aagate created request

- Added Restart and RestartSec to restart
spark master and spark worker (bsc#1091479)


Joseph Davis's avatar

joadavis accepted request

Definitely a needed improvement.
