Services
app.services.base
Service
Bases: UUIDAuditBase
close_tunnel(session: AsyncSession) -> None
async
get_job(provider: str = None) -> Job
Fetch the Slurm job backing the service.
open_tunnel(session: AsyncSession) -> None
async
Create an SSH tunnel to connect to the service. Assumes the service is attached to the session.
After creation of the tunnel, the remote port is updated and recorded in the database.
refresh(session: AsyncSession, config: State)
async
Update the service status. Assumes running in an attached state.
Determines the service status by pinging the service and then checking the Slurm job state if the ping is unsuccessful. Updates the service database and returns the status.
The status returned depends on the starting status because services in a "STARTING" status cannot transition to an "UNHEALTHY" status. The status life-cycle is as follows:
Slurm job submitted -> SUBMITTED
Slurm job switches to pending -> PENDING
Slurm job switches to running -> STARTING
API ping successful -> HEALTHY
API ping unsuccessful -> STARTING
API ping unsuccessful and time limit exceeded -> TIMEOUT
Slurm job switches to failed -> FAILED
A service that successfully starts will be in a HEALTHY status. The status remains HEALTHY as long as subsequent updates ping successfully. Unsuccessful pings will transition the service status to FAILED if the Slurm job has failed; TIMEOUT if the Slurm job times out; and UNHEALTHY otherwise.
An UNHEALTHY service becomes HEALTHY if the update pings successfully. Otherwise, the service status changes to FAILED if the Slurm job has failed or TIMEOUT if the Slurm job times out.
Services that enter a terminal status (FAILED, TIMEOUT or STOPPED) cannot be re-started.
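The life-cycle described above can be read as a small state machine. The following is a minimal sketch of that reading, not the actual implementation of refresh: the Status enum, the next_status function, the simplified job_state labels ("PENDING", "RUNNING", "FAILED") and the single timed_out flag are all hypothetical names introduced for illustration.

```python
from enum import Enum


class Status(str, Enum):
    # Status names taken from the life-cycle text; the enum itself is hypothetical.
    SUBMITTED = "SUBMITTED"
    PENDING = "PENDING"
    STARTING = "STARTING"
    HEALTHY = "HEALTHY"
    UNHEALTHY = "UNHEALTHY"
    FAILED = "FAILED"
    TIMEOUT = "TIMEOUT"
    STOPPED = "STOPPED"


TERMINAL = {Status.FAILED, Status.TIMEOUT, Status.STOPPED}


def next_status(current, job_state, ping_ok, timed_out=False):
    """Return the status after one update (illustrative only).

    job_state is a simplified Slurm job label: "PENDING", "RUNNING" or "FAILED".
    timed_out stands in for both an exceeded time limit and a Slurm job timeout.
    """
    if current in TERMINAL:
        return current  # terminal services cannot be restarted
    if ping_ok:
        return Status.HEALTHY  # a successful ping always means HEALTHY
    # The ping failed, so fall back to the Slurm job state.
    if job_state == "FAILED":
        return Status.FAILED
    if timed_out:
        return Status.TIMEOUT
    if job_state == "PENDING":
        return Status.PENDING
    # The job is running but the API does not respond.
    if current in (Status.HEALTHY, Status.UNHEALTHY):
        return Status.UNHEALTHY
    return Status.STARTING  # a starting service never becomes UNHEALTHY
```

For example, next_status(Status.STARTING, "RUNNING", ping_ok=False) stays at STARTING, while the same inputs with a current status of HEALTHY give UNHEALTHY, which is the asymmetry the text describes.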
start(session: AsyncSession, config: State, container_options: dict, job_options: dict)
async
Start the service with provided Slurm job and container options. Assumes running in attached state.
Submits a Slurm job request, creates a new database entry and waits for the service to start.
Parameters:
- container_options (dict) – A dict containing container options (see ContainerConfig).
- job_options (dict) – A dict containing job options (see JobConfig).
Returns:
- None.
stop(session: AsyncSession, config: State, delay: int = 0, timeout: bool = False, failed: bool = False)
async
Stop the service after delay seconds. Assumes running in attached state.
The default terminal state is STOPPED, which indicates that the service was stopped normally. Use the failed or timeout flags to indicate that the service stopped due to a Slurm job failure or timeout, respectively.
This process updates the database after stopping the service.
Parameters:
- delay (int, default: 0) – The number of seconds to wait before stopping the service.
- timeout (bool, default: False) – A flag indicating the service timed out.
- failed (bool, default: False) – A flag indicating the service Slurm job failed.
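The way the flags map to a terminal state can be sketched as follows. This is an illustration under stated assumptions, not the body of stop: terminal_status and stop_after are hypothetical helpers, and the precedence of failed over timeout when both flags are set is an assumption the documentation does not state.

```python
import asyncio


def terminal_status(timeout=False, failed=False):
    """Pick the terminal state implied by the flags (hypothetical helper)."""
    if failed:
        return "FAILED"  # assumed to take precedence if both flags are set
    if timeout:
        return "TIMEOUT"
    return "STOPPED"  # default: the service was stopped normally


async def stop_after(delay=0, timeout=False, failed=False):
    """Wait for delay seconds, then report the terminal state (sketch only).

    The real method would also cancel the Slurm job and update the database.
    """
    await asyncio.sleep(delay)
    return terminal_status(timeout=timeout, failed=failed)
```

Calling asyncio.run(stop_after(delay=0, timeout=True)) in this sketch yields "TIMEOUT"; with no flags it yields "STOPPED".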