API Reference

blackfish.client

Blackfish client for the programmatic interface.

Blackfish

Programmatic interface for managing Blackfish ML inference services.

This client provides both synchronous and asynchronous APIs for creating, managing, and monitoring ML inference services. All async methods are prefixed with 'async_' (e.g., async_launch_service, async_list_services).

Examples:

Synchronous usage:

>>> bf = Blackfish()
>>> service = bf.launch_service(
...     name="my-llm",
...     image="text_generation",
...     model="meta-llama/Llama-3.3-70B-Instruct",
...     profile_name="default"
... )
>>> print(service.status)

Asynchronous usage:

>>> import asyncio
>>> async def main():
...     bf = Blackfish()
...     service = await bf.async_launch_service(
...         name="my-llm",
...         image="text_generation",
...         model="meta-llama/Llama-3.3-70B-Instruct",
...         profile_name="default"
...     )
...     print(service.status)
>>> asyncio.run(main())
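The pairing of sync methods with `async_`-prefixed counterparts can be pictured with a small stand-in class. This is only a sketch, assuming each sync method runs its async counterpart to completion with `asyncio.run`; `TinyClient` and its method bodies are hypothetical and are not Blackfish's actual implementation.

```python
import asyncio


class TinyClient:
    """Hypothetical stand-in client; not the real Blackfish class."""

    async def async_launch_service(self, name: str) -> dict:
        # Pretend to do some asynchronous work (e.g. a database call).
        await asyncio.sleep(0)
        return {"name": name, "status": "submitted"}

    def launch_service(self, name: str) -> dict:
        # Sync wrapper: run the coroutine to completion on a fresh event loop.
        # Assumption: no event loop is already running in this thread.
        return asyncio.run(self.async_launch_service(name))


client = TinyClient()

# Synchronous call: no await needed.
sync_result = client.launch_service("my-llm")


async def main() -> dict:
    # Asynchronous call: awaited inside a coroutine.
    return await client.async_launch_service("my-llm")


async_result = asyncio.run(main())
```

Both call styles return the same value in this sketch; the sync form is convenient in scripts, while the async form fits code that already runs inside an event loop.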

home_dir: str property

Get the Blackfish home directory.

__aenter__() -> Self async

Async context manager entry.

__aexit__(exc_type: type[BaseException] | None, exc_val: BaseException | None, exc_tb: Any) -> None async

Async context manager exit.

__enter__() -> Self

Sync context manager entry.

__exit__(exc_type: type[BaseException] | None, exc_val: BaseException | None, exc_tb: Any) -> None

Sync context manager exit.
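The four context-manager methods above allow the client to be used with `with` and `async with`. The sketch below shows the protocol on a hypothetical `TinyClient`; it assumes that leaving the block calls `close()`, which this reference does not state explicitly.

```python
import asyncio


class TinyClient:
    """Hypothetical stand-in; the real client's exit behavior may differ."""

    def __init__(self) -> None:
        self.closed = False

    def close(self) -> None:
        self.closed = True

    def __enter__(self) -> "TinyClient":
        return self

    def __exit__(self, exc_type, exc_val, exc_tb) -> None:
        # Assumption: exiting the block releases resources via close().
        self.close()

    async def __aenter__(self) -> "TinyClient":
        return self

    async def __aexit__(self, exc_type, exc_val, exc_tb) -> None:
        self.close()


with TinyClient() as bf:
    open_inside = not bf.closed  # still open while the block is active
closed_after_sync = bf.closed


async def main() -> bool:
    async with TinyClient() as abf:
        pass
    return abf.closed


closed_after_async = asyncio.run(main())
```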

__init__(home_dir: Optional[str] = None, host: Optional[str] = None, port: Optional[int] = None, debug: Optional[bool] = None, auth_token: Optional[str] = None, config: Optional[BlackfishConfig] = None)

Initialize the Blackfish client.

You can either pass a complete BlackfishConfig object, or pass individual configuration parameters. Individual parameters will override values from a provided config object.

Parameters:

  • home_dir (Optional[str], default: None ) –

    Path to Blackfish home directory (default: ~/.blackfish)

  • host (Optional[str], default: None ) –

    API host (default: localhost)

  • port (Optional[int], default: None ) –

    API port (default: 8000)

  • debug (Optional[bool], default: None ) –

    Debug mode (default: True)

  • auth_token (Optional[str], default: None ) –

    Authentication token (optional)

  • config (Optional[BlackfishConfig], default: None ) –

    Optional BlackfishConfig instance for advanced configuration. Individual parameters will override config values if provided.

Examples:

Simple usage:

>>> bf = Blackfish(home_dir="~/.blackfish", debug=True)

Advanced usage with full config:

>>> config = BlackfishConfig(home_dir="~/.blackfish", port=9000)
>>> bf = Blackfish(config=config)

Mixed usage (config + overrides):

>>> config = BlackfishConfig(...)
>>> bf = Blackfish(config=config, port=9000)  # Override just the port
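The override rule (individual parameters win over values in a provided config) can be sketched as follows. `TinyConfig` and `resolve` are hypothetical names used for illustration; the defaults mirror the ones documented above, and treating `None` as "not provided" is an assumption based on the `None` defaults in the signature.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class TinyConfig:
    """Hypothetical stand-in for BlackfishConfig with the documented defaults."""

    home_dir: str = "~/.blackfish"
    host: str = "localhost"
    port: int = 8000


def resolve(config: Optional[TinyConfig] = None, **overrides) -> TinyConfig:
    # Start from the provided config, or from the defaults if none is given.
    base = config if config is not None else TinyConfig()
    # Only explicitly provided (non-None) arguments replace config values.
    provided = {key: value for key, value in overrides.items() if value is not None}
    return replace(base, **provided)


cfg = TinyConfig(home_dir="/tmp/bf", port=9000)
resolved = resolve(cfg, host=None, port=9100)  # override just the port
```

Here `resolved` keeps `home_dir` and `host` from `cfg` and takes only `port` from the explicit argument.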

async_delete_service(service_id: str) -> bool async

Delete a service from the database (async).

Note: This only deletes the database record. Stop the service first with stop_service() (or async_stop_service() in async code).

Parameters:

  • service_id (str) –

    UUID of the service

Returns:

  • bool

    True if deleted, False if not found

async_get_service(service_id: str) -> Optional[ManagedService] async

Get a service by ID (async).

Parameters:

  • service_id (str) –

    UUID of the service

Returns:

  • Optional[ManagedService]

    ManagedService instance or None if not found

async_launch_service(name: str, image: str, model: str, profile_name: str = 'default', container_config: Optional[dict[str, Any]] = None, job_config: Optional[dict[str, Any]] = None, mount: Optional[str] = None, grace_period: int = 180, auto_cleanup: bool = True, **kwargs: dict[str, Any]) -> ManagedService async

Create and start a new service (async).

Parameters:

  • name (str) –

    Service name

  • image (str) –

    Service image type (e.g., "text_generation", "speech_recognition")

  • model (str) –

    Model repository ID (e.g., "meta-llama/Llama-3.3-70B-Instruct")

  • profile_name (str, default: 'default' ) –

    Name of the profile to use

  • container_config (Optional[dict[str, Any]], default: None ) –

    Container configuration options. If 'model_dir' and 'revision' are not provided, they will be automatically determined by searching for the model in the profile's cache directories and selecting the latest revision.

  • job_config (Optional[dict[str, Any]], default: None ) –

    Job configuration options (Slurm settings, etc.)

  • mount (Optional[str], default: None ) –

    Optional directory to mount

  • grace_period (int, default: 180 ) –

    Time in seconds to wait before marking unhealthy

  • auto_cleanup (bool, default: True ) –

    If True, automatically stop and delete this service when the Python script exits (default: True)

  • **kwargs (dict[str, Any], default: {} ) –

    Additional service-specific parameters

Returns:

  • ManagedService

    The created service instance wrapped for easy access

Raises:

  • ValueError

    If the profile is not found, the model is not available, or model files cannot be located.
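The `auto_cleanup` behavior (stop and delete the service when the script exits) resembles a standard Python pattern built on `atexit`. Whether Blackfish uses `atexit` internally is not stated in this reference, so the sketch below is an assumption; `TinyService`, `launch`, and `cleanup` are hypothetical names.

```python
import atexit

events: list[str] = []


class TinyService:
    """Hypothetical stand-in for a launched service."""

    def __init__(self, name: str) -> None:
        self.name = name

    def stop(self) -> "TinyService":
        events.append(f"stop {self.name}")
        return self

    def delete(self) -> bool:
        events.append(f"delete {self.name}")
        return True


def cleanup(service: TinyService) -> None:
    # Stop first, then delete the record.
    service.stop().delete()


def launch(name: str, auto_cleanup: bool = True) -> TinyService:
    service = TinyService(name)
    if auto_cleanup:
        # Runs cleanup(service) when the interpreter exits normally.
        atexit.register(cleanup, service)
    return service


svc = launch("my-llm")
```

With `auto_cleanup=False`, nothing is registered and the service would outlive the script in this sketch.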

async_list_services(image: Optional[str] = None, model: Optional[str] = None, status: Optional[ServiceStatus] = None, name: Optional[str] = None, profile: Optional[str] = None) -> list[ManagedService] async

List services with optional filters (async).

Parameters:

  • image (Optional[str], default: None ) –

    Filter by image type

  • model (Optional[str], default: None ) –

    Filter by model

  • status (Optional[ServiceStatus], default: None ) –

    Filter by status

  • name (Optional[str], default: None ) –

    Filter by name

  • profile (Optional[str], default: None ) –

    Filter by profile

Returns:

  • list[ManagedService]

    List of services matching the given filters
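One way to read the optional filters: any filter left at `None` is ignored, and the remaining ones must all match. The sketch below illustrates that reading on in-memory records. `Row` and `list_rows` are hypothetical, status is modeled as a plain string rather than `ServiceStatus`, and exact-match AND semantics is an assumption rather than documented behavior.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Row:
    """Hypothetical service record carrying the filterable fields."""

    name: str
    image: str
    model: str
    status: str
    profile: str


def list_rows(rows: list, **filters: Optional[str]) -> list:
    # Ignore filters left at None; require equality on the rest.
    active = {key: value for key, value in filters.items() if value is not None}
    return [
        row
        for row in rows
        if all(getattr(row, key) == value for key, value in active.items())
    ]


rows = [
    Row("a", "text_generation", "m1", "healthy", "default"),
    Row("b", "speech_recognition", "m2", "stopped", "default"),
    Row("c", "text_generation", "m1", "stopped", "hpc"),
]

everything = list_rows(rows)
stopped_llms = [r.name for r in list_rows(rows, image="text_generation", status="stopped")]
```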

async_stop_service(service_id: str, timeout: bool = False, failed: bool = False) -> Optional[ManagedService] async

Stop a service (async).

Parameters:

  • service_id (str) –

    UUID of the service

  • timeout (bool, default: False ) –

    Mark as timed out

  • failed (bool, default: False ) –

    Mark as failed

Returns:

  • Optional[ManagedService]

    Updated managed service instance or None if not found

async_wait_for_service(service_id: str, target_status: ServiceStatus = ServiceStatus.HEALTHY, timeout: float = 300, poll_interval: float = 5) -> Optional[ManagedService] async

Wait for a service to reach a target status (async).

Parameters:

  • service_id (str) –

    UUID of the service

  • target_status (ServiceStatus, default: HEALTHY ) –

    Status to wait for (default: HEALTHY)

  • timeout (float, default: 300 ) –

    Maximum time to wait in seconds (default: 300)

  • poll_interval (float, default: 5 ) –

    Time between status checks in seconds (default: 5)

Returns:

  • Optional[ManagedService]

    ManagedService instance if target status reached, None if timeout or service failed

Examples:

>>> service = await bf.async_launch_service(...)
>>> service = await bf.async_wait_for_service(str(service.id))
>>> if service and service.status == ServiceStatus.HEALTHY:
...     print(f"Service ready on port {service.port}")
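The interaction of `timeout` and `poll_interval` can be illustrated with a generic polling loop. This is a sketch of the general technique, not Blackfish's code: `wait_for_status`, the `get_status` callback, and the status strings are hypothetical stand-ins for the real service lookup and `ServiceStatus` values.

```python
import asyncio
import time
from typing import Awaitable, Callable, Optional


async def wait_for_status(
    get_status: Callable[[], Awaitable[str]],
    target: str = "healthy",
    timeout: float = 300,
    poll_interval: float = 5,
    failed_states: tuple = ("failed", "timeout"),
) -> Optional[str]:
    deadline = time.monotonic() + timeout
    while True:
        status = await get_status()
        if status == target:
            return status  # target reached
        if status in failed_states:
            return None  # service failed; stop waiting
        if time.monotonic() >= deadline:
            return None  # gave up after `timeout` seconds
        await asyncio.sleep(poll_interval)


statuses = iter(["submitted", "starting", "healthy"])


async def fake_status() -> str:
    # Hypothetical status source that becomes healthy on the third check.
    return next(statuses)


result = asyncio.run(wait_for_status(fake_status, timeout=1, poll_interval=0.01))
```

A shorter `poll_interval` detects readiness sooner at the cost of more status checks; `timeout` bounds the total wait.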

close() -> None

Close the database connection (sync wrapper).

delete_service(service_id: str) -> bool

Delete a service from the database (sync wrapper).

See async_delete_service for details.

get_service(service_id: str) -> Optional[ManagedService]

Get a service by ID (sync wrapper).

See async_get_service for details.

launch_service(name: str, image: str, model: str, profile_name: str = 'default', container_config: Optional[dict[str, Any]] = None, job_config: Optional[dict[str, Any]] = None, mount: Optional[str] = None, grace_period: int = 180, auto_cleanup: bool = True, **kwargs: dict[str, Any]) -> ManagedService

Create and start a new service (sync wrapper).

See async_launch_service for details.

list_services(image: Optional[str] = None, model: Optional[str] = None, status: Optional[ServiceStatus] = None, name: Optional[str] = None, profile: Optional[str] = None) -> list[ManagedService]

List services with optional filters (sync wrapper).

See async_list_services for details.

stop_service(service_id: str, timeout: bool = False, failed: bool = False) -> Optional[ManagedService]

Stop a service (sync wrapper).

See async_stop_service for details.

wait_for_service(service_id: str, target_status: ServiceStatus = ServiceStatus.HEALTHY, timeout: float = 300, poll_interval: float = 5) -> Optional[ManagedService]

Wait for a service to reach a target status (sync wrapper).

See async_wait_for_service for details.

blackfish.service

ManagedService wrapper class for the Blackfish programmatic interface.

ManagedService

Wrapper around Service that provides convenient access to service methods.

This class wraps a Service object and provides easy-to-use methods that don't require passing session and state objects. All operations are delegated to the parent Blackfish client.

__getattr__(name: str) -> Any

Delegate attribute access to the underlying service.
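Attribute delegation through `__getattr__` is a standard Python wrapper technique. The sketch below shows it on hypothetical classes (`TinyService`, `TinyManaged`); the private attribute names `_service` and `_client` are assumptions and may not match the real ManagedService.

```python
from typing import Any


class TinyService:
    """Hypothetical underlying record."""

    def __init__(self, name: str, port: int) -> None:
        self.name = name
        self.port = port


class TinyManaged:
    """Hypothetical wrapper that forwards unknown attributes to the wrapped object."""

    def __init__(self, service: TinyService, client: Any) -> None:
        self._service = service
        self._client = client

    def __getattr__(self, name: str) -> Any:
        # Called only when normal lookup fails, so _service and _client
        # (set in __init__) are found without reaching this method.
        return getattr(self._service, name)

    def __repr__(self) -> str:
        return f"TinyManaged(name={self._service.name!r})"


managed = TinyManaged(TinyService("my-llm", 8080), client=None)
port = managed.port  # resolved on the wrapped TinyService
```

Attributes missing from both the wrapper and the wrapped object still raise `AttributeError`, because the inner `getattr` call has no default.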

__init__(service: Service, client: Blackfish)

Initialize a managed service.

Parameters:

  • service (Service) –

    The underlying Service object

  • client (Blackfish) –

    The Blackfish client managing this service

__repr__() -> str

Return string representation.

async_close_tunnel() -> Self async

Close the SSH tunnel for this service (async).

This is useful when a service didn't properly release its port. Finds and kills SSH processes associated with the service's port.

Returns:

  • Self

    Self for method chaining

async_delete() -> bool async

Delete the service from the database (async).

Returns:

  • bool

    True if deleted successfully

async_refresh() -> Self async

Refresh the service status (async).

Returns:

  • Self

    Self for method chaining

async_stop(timeout: bool = False, failed: bool = False) -> Self async

Stop the service (async).

Parameters:

  • timeout (bool, default: False ) –

    Mark as timed out

  • failed (bool, default: False ) –

    Mark as failed

Returns:

  • Self

    Self for method chaining

async_wait(timeout: float = 300, poll_interval: float = 10) -> Self async

Wait for the service to be healthy (async).

Parameters:

  • timeout (float, default: 300 ) –

    Maximum time to wait in seconds (default: 300)

  • poll_interval (float, default: 10 ) –

    Time between status checks in seconds (default: 10)

Returns:

  • Self

    Self (for method chaining), or None if service not found

Examples:

>>> service = await bf.async_launch_service(...)
>>> service = await service.async_wait()
>>> if service and service.status == ServiceStatus.HEALTHY:
...     print(f"Service ready on port {service.port}")

close_tunnel() -> Self

Close the SSH tunnel for this service (sync).

This is useful when a service didn't properly release its port. Finds and kills SSH processes associated with the service's port.

Returns:

  • Self

    Self for method chaining

delete() -> bool

Delete the service from the database (sync).

Returns:

  • bool

    True if deleted successfully

refresh() -> Self

Refresh the service status (sync).

Returns:

  • Self

    Self for method chaining

stop(timeout: bool = False, failed: bool = False) -> Self

Stop the service (sync).

Parameters:

  • timeout (bool, default: False ) –

    Mark as timed out

  • failed (bool, default: False ) –

    Mark as failed

Returns:

  • Self

    Self for method chaining

wait(timeout: float = 300, poll_interval: float = 10) -> Self

Wait for the service to be healthy (sync).

Parameters:

  • timeout (float, default: 300 ) –

    Maximum time to wait in seconds (default: 300)

  • poll_interval (float, default: 10 ) –

    Time between status checks in seconds (default: 10)

Returns:

  • Self

    Self (for method chaining), or None if service not found

blackfish.utils

Utility functions for the Blackfish programmatic interface.

set_logging_level(level: str = 'WARNING') -> None

Set the global Blackfish logging level.

This function controls the logging level for all Blackfish operations. By default, the programmatic interface uses WARNING level to reduce verbose output. Use this function to change the logging level globally.

Parameters:

  • level (str, default: 'WARNING' ) –

    Logging level string. Must be one of: "DEBUG", "INFO", "WARNING", "ERROR", or "CRITICAL". Case-insensitive.

Examples:

>>> import blackfish
>>> blackfish.set_logging_level("INFO")  # Show info logs
>>> blackfish.set_logging_level("DEBUG")  # Show all logs including debug
>>> blackfish.set_logging_level("WARNING")  # Only warnings and errors (default)
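A function with this contract can be approximated with the standard `logging` module. The sketch below is not the library's implementation: the helper name `set_level`, the logger name `"blackfish"`, and raising `ValueError` on an invalid level are all assumptions.

```python
import logging

_VALID_LEVELS = ("DEBUG", "INFO", "WARNING", "ERROR", "CRITICAL")


def set_level(level: str = "WARNING", logger_name: str = "blackfish") -> None:
    # Accept any casing, as the documented function is case-insensitive.
    normalized = level.upper()
    if normalized not in _VALID_LEVELS:
        # Assumption: invalid names are rejected with ValueError.
        raise ValueError(f"Invalid logging level: {level!r}")
    # Child loggers such as "blackfish.client" inherit this level by default.
    logging.getLogger(logger_name).setLevel(getattr(logging, normalized))


set_level("info")
current = logging.getLogger("blackfish").level
```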