In some cases, running `gce --quit-soon <name>` does not delete the VM after the current firework completes. I would expect a worker to check the quit metadata after each firework completes, but it looks like `rapidfire` can launch many fireworks before returning to the metadata check here:
```python
rocket_launcher.rapidfire(
    self.launchpad, self.fireworker, strm_lvl=self.strm_lvl,
    max_loops=1, sleep_time=self.sleep_secs)

# Idle to the max.
idled = self.sleep_secs  # rapidfire() just slept once
while not self.launchpad.run_exists(self.fireworker):  # none ready to run
    future_work = self.launchpad.future_run_exists(self.fireworker)  # any ready or waiting?
    if idled >= (self.idle_for_waiters if future_work else self.idle_for_rockets):
        return 'idle'

    req = gcp.instance_attribute('quit')
    if req == 'soon' or req == 'when-idle':
        return '"quit={}" request'.format(req)

    FW_CONSOLE_LOGGER.debug(
        'Sleeping for %s secs waiting for launchable rockets',
        self.sleep_secs)
    time.sleep(self.sleep_secs)
    idled += self.sleep_secs

req = gcp.instance_attribute('quit')
if req == 'soon':
    return '"quit={}" request'.format(req)
```
I think `nlaunches=1` should be passed to `rapidfire` so that it exits after launching a single firework, which would let us check the quit metadata between fireworks. As far as I can tell, `rapidfire` launches every waiting rocket it can, because it appears to skip the loop check whenever more fireworks are ready:
https://github.com/materialsproject/fireworks/blob/6cb2a66d35239611ec2a1ccb807be38976198a0b/fireworks/core/rocket_launcher.py#L107-L126
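To illustrate what I mean, here is a toy model of that loop structure. It is a simplified stand-in based on my reading of the linked lines, not the actual FireWorks code; `toy_rapidfire` and the `ready` queue are made-up names, and the real function does much more (launch directories, sleeping, timeouts).

```python
from collections import deque


def toy_rapidfire(ready, nlaunches=0, max_loops=-1):
    """Toy model of a rapidfire-style loop (NOT the real FireWorks code).

    `ready` is a deque of pending job names. Returns the jobs launched
    before control goes back to the caller.
    """
    launched = []
    num_loops = 0
    while num_loops != max_loops:
        # Inner loop: keep pulling work for as long as anything is ready.
        while ready:
            launched.append(ready.popleft())
            if nlaunches > 0 and len(launched) == nlaunches:
                break
        # Stop when the queue is drained (nlaunches == 0) or the quota is met.
        if nlaunches == 0 and not ready:
            break
        if len(launched) == nlaunches:
            break
        num_loops += 1  # a real launcher would sleep here before re-checking
    return launched


# With max_loops=1 alone, every ready job runs before the caller gets control.
drained = toy_rapidfire(deque(["fw1", "fw2", "fw3"]), max_loops=1)
# With nlaunches=1, control returns after a single job.
single = toy_rapidfire(deque(["fw1", "fw2", "fw3"]), nlaunches=1, max_loops=1)
print(drained, single)
```

In this model `max_loops=1` only limits the outer loop, so it never gets a chance to interrupt the inner loop while work is queued. If the real code behaves the same way, that would explain why the quit request is noticed late.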
Is the expectation to check the metadata after each firework, or to let `rapidfire` launch as many as it wants before checking?
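If it is the former, the worker loop could look roughly like the sketch below. This is only an illustration of the control flow, not the current borealis code: `run_until_quit`, `launch_one`, and `get_quit_attribute` are hypothetical names, standing in for a single-firework launch (for example `rapidfire` with `nlaunches=1`) and for `gcp.instance_attribute('quit')`.

```python
def run_until_quit(launch_one, get_quit_attribute, max_launches=100):
    """Launch one job at a time, checking the quit request between jobs.

    launch_one() returns True if a job ran, False if nothing was ready.
    get_quit_attribute() returns the current value of the 'quit' metadata.
    """
    completed = 0
    while completed < max_launches:
        if not launch_one():
            return completed, 'idle'
        completed += 1
        # Check the metadata after every job, not after the queue drains.
        req = get_quit_attribute()
        if req == 'soon':
            return completed, '"quit={}" request'.format(req)
    return completed, 'max launches reached'


# Simulated run: three jobs are ready and quit=soon is already set.
queue = ['fw1', 'fw2', 'fw3']


def fake_launch_one():
    """Pretend to run the next queued job; report whether one ran."""
    if not queue:
        return False
    queue.pop(0)
    return True


result = run_until_quit(fake_launch_one, lambda: 'soon')
print(result)
```

In the simulated run the worker stops after one job and leaves the other two queued for another worker, which is the behavior I expected from `--quit-soon`.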
The borealis snippet quoted above is from `borealis/borealis/fireworker.py`, lines 185 to 208 in d24b972.