This request supersedes:
request 672910
Can you describe *why* it is useful to wait 15 seconds before the service is restarted? That sounds wrong.
author
We have seen the spark master shut down when it has trouble talking to zookeeper, maybe due to network-related issues or because zookeeper is not functional. The 15 seconds just gives these problems some time to resolve. Otherwise spark might be restarted right away, and if zookeeper is still having problems, it might die again. The delay simply prevents continuous restarts.
Also added Restart to the spark-worker service file, since if spark-master is down there is a small possibility of the worker also getting into a weird state or dying. We haven't seen evidence of this happening yet, but I think adding the change will make it more bulletproof.
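For reference, the restart behaviour discussed here could look roughly like the fragment below in a systemd service file. This is only a sketch: the thread does not say which value `Restart=` is set to, so `on-failure` is an assumption, and the actual spark-master and spark-worker unit files may differ.

```ini
[Service]
# Assumption: restart only when the process exits abnormally.
# The request only says "Restart" was added, not which policy.
Restart=on-failure
# Wait 15 seconds before restarting, to give zookeeper or the
# network some time to recover and avoid continuous restarts.
RestartSec=15
```

A bare number in `RestartSec=` is interpreted as seconds. The same two settings would apply to the spark-worker service file if it is meant to recover in the same way.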