ComputeSession Functions

class ai.backend.client.func.session.ComputeSession(name, owner_access_key=None)

Provides various interactions with compute sessions in Backend.AI.

The term ‘kernel’ is deprecated in favor of ‘compute session’. However, for historical reasons and to avoid confusion with client sessions, the naming of this API function class is kept backward compatible.

For multi-container sessions, all methods take effect on the master container only, except the destroy() and restart() methods. It is therefore the user’s responsibility to distribute uploaded files to the other containers, using either explicit copies or virtual folders that are mounted in all containers belonging to the same compute session.

classmethod await paginated_list(status=None, access_key=None, *, fields=(default FieldSpec objects, listed below), page_offset=0, page_size=20, filter=None, order=None)

The default value of fields is a tuple of FieldSpec objects for the following fields (humanized names in parentheses): id (Session ID, alt name session_id), image (Image), type (Type), status (Status), status_info (Status Info), status_changed (Last Updated), result (Result), abusing_reports (Abusing Reports).

Fetches the list of sessions.

Parameters:

status (str) – Fetches sessions in a specific status (PENDING, SCHEDULED, PULLING, PREPARING, RUNNING, RESTARTING, RUNNING_DEGRADED, TERMINATING, TERMINATED, ERROR, CANCELLED).
fields (Sequence[FieldSpec]) – Additional per-session query fields to fetch.

Return type:

PaginatedResult[dict]
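
The page_offset and page_size parameters lend themselves to a simple collection loop. The following is a minimal sketch, not the library's own code: fetch_page is a hypothetical stand-in for a call such as paginated_list(page_offset=..., page_size=...), and it assumes (unconfirmed by the text above) that each result exposes items and total_count attributes and that page_offset counts items rather than pages.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, List


@dataclass
class Page:
    """Hypothetical stand-in for PaginatedResult (attribute names are assumed)."""
    total_count: int
    items: List[dict]


async def fetch_all(
    fetch_page: Callable[[int, int], Awaitable[Page]],
    page_size: int = 20,
) -> List[dict]:
    """Collect every item by advancing the offset until total_count is reached."""
    collected: List[dict] = []
    offset = 0
    while True:
        page = await fetch_page(offset, page_size)
        collected.extend(page.items)
        offset += len(page.items)
        # Stop when everything is fetched or the server returns an empty page.
        if not page.items or offset >= page.total_count:
            return collected


async def fake_fetch_page(page_offset: int, page_size: int) -> Page:
    """In-memory replacement for the real call, used only for this demo."""
    rows = [{"id": f"sess-{i}", "status": "RUNNING"} for i in range(45)]
    return Page(total_count=len(rows), items=rows[page_offset:page_offset + page_size])


sessions = asyncio.run(fetch_all(fake_fetch_page))
print(len(sessions))
```

With a real client, fake_fetch_page would be replaced by a small wrapper around paginated_list that forwards the two paging arguments.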

classmethod await get_or_create(image, *, name=None, type_='interactive', starts_at=None, enqueue_only=False, max_wait=0, no_reuse=False, dependencies=None, callback_url=None, mounts=None, mount_map=None, mount_options=None, envs=None, startup_command=None, resources=None, resource_opts=None, cluster_size=1, cluster_mode=ClusterMode.SINGLE_NODE, domain_name=None, group_name=None, bootstrap_script=None, tag=None, architecture='x86_64', scaling_group=None, owner_access_key=None, preopen_ports=None, assign_agent=None)

Gets or creates a compute session. If name is None, it creates a new compute session as long as the server has enough resources and your API key has remaining quota. If name is a valid string and there is an existing compute session with the same name and the same image, it returns the ComputeSession instance representing the existing session.
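
The reuse rules can be illustrated with a toy in-memory model. This is only a sketch of the behavior described in this entry (including the no_reuse parameter), not the server's logic; ToyServer and ToySession are hypothetical names, and the handling of a matching name with a different image is an assumption because the text does not specify it.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ToySession:
    session_id: int
    name: Optional[str]
    image: str
    created: bool  # True when this call created the session


class ToyServer:
    """In-memory model of the documented rules; not the real server logic."""

    def __init__(self) -> None:
        self._by_name: Dict[str, ToySession] = {}
        self._ids = itertools.count(1)

    def get_or_create(
        self, image: str, name: Optional[str] = None, no_reuse: bool = False
    ) -> ToySession:
        existing = self._by_name.get(name) if name is not None else None
        if existing is not None:
            if existing.image != image:
                # Not covered by the documentation; this toy model rejects it.
                raise ValueError("name already used with a different image")
            if no_reuse:
                raise RuntimeError("session already exists and no_reuse is set")
            # Same name and same image: hand back the existing session.
            return ToySession(existing.session_id, name, image, created=False)
        # No name, or no match: always create a new session.
        session = ToySession(next(self._ids), name, image, created=True)
        if name is not None:
            self._by_name[name] = session
        return session


server = ToyServer()
first = server.get_or_create("python:3.6-ubuntu", name="demo")
second = server.get_or_create("python:3.6-ubuntu", name="demo")
anonymous = server.get_or_create("python:3.6-ubuntu")
print(first.created, second.created, anonymous.created)
```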

Parameters:

image (str) – The image name and tag for the compute session. Example: python:3.6-ubuntu. Check out the full list of available images in your server using (TODO: new API).
name (str) –
A client-side (user-defined) identifier to distinguish the session among currently running sessions. It may be used to seamlessly reuse the session already created.

Changed in version 19.12.0: Renamed from clientSessionToken.
type_ –
Either "interactive" (default) or "batch".

Added in version 19.09.0.
enqueue_only (bool) –
Enqueues the session creation request and returns immediately, without waiting for the session to start. (default: False, to preserve the legacy behavior)

Added in version 19.09.0.
max_wait (int) –
The time to wait for session startup. If the cluster resources are fully utilized, this waiting time can be arbitrarily long due to job queueing. If the timeout is reached, the returned status field becomes "TIMEOUT". Even in this case, the session may still start later.

Added in version 19.09.0.
no_reuse (bool) –
Raises an explicit error if a session with the same image and the same name already exists, instead of returning its information.

Added in version 19.09.0.
mounts (List[str]) – The list of vfolder names that belong to the current API access key.
mount_map (Mapping[str, str]) – A mapping of custom mount paths for vfolders. Each key is a vfolder name and each value is its custom mount path. Default mounts and relative paths are placed under /home/work. To mount elsewhere, specify an absolute path. The target mount path of a vfolder must not overlap with Linux system folders. vfolders whose names start with a dot (.) are not affected.
mount_options (Optional[Mapping[str, Mapping[str, str]]]) – A mapping that contains extra options for vfolders.
envs (Mapping[str, str]) – The environment variables, which always bypass the jail policy.
resources (Mapping[str, str | int]) – The resource specification. (TODO: details)
cluster_size (int) –
The number of containers in this compute session. Must be at least 1.

Added in version 19.09.0.

Changed in version 20.09.0.
cluster_mode (ClusterMode) –
Sets the clustering mode, which determines whether the multiple containers of the new session are spawned on distributed nodes or on a single node.

Added in version 20.09.0.
tag (str) – An optional string to annotate extra information.
owner_access_key – An optional access key that owns the created session. (Only available to administrators)

Return type:

ComputeSession

Returns:

The ComputeSession instance.
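
The mount_map rules above (defaults and relative paths under /home/work, absolute paths taken as given) can be previewed on the client side before a session is created. The helper below is a sketch under those stated rules only; planned_mount_paths is a hypothetical name, the vfolder names are made up, and the system-folder check and the dot-prefix exception are not modeled.

```python
from pathlib import PurePosixPath
from typing import Dict, List, Mapping

HOME = PurePosixPath("/home/work")


def planned_mount_paths(mounts: List[str], mount_map: Mapping[str, str]) -> Dict[str, str]:
    """Predict where each vfolder would appear inside the container."""
    planned: Dict[str, str] = {}
    for vfolder in mounts:
        # Without a custom path, the vfolder name itself is used under /home/work.
        target = PurePosixPath(mount_map.get(vfolder, vfolder))
        # Joining with an absolute path keeps the absolute path unchanged.
        planned[vfolder] = str(HOME / target)
    return planned


paths = planned_mount_paths(
    mounts=["dataset", "models", "scratch"],
    mount_map={"models": "/opt/models", "scratch": "tmp/scratch"},
)
print(paths)
```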

classmethod await create_from_template(template_id, *, name=Undefined.TOKEN, type_=Undefined.TOKEN, starts_at=None, enqueue_only=Undefined.TOKEN, max_wait=Undefined.TOKEN, dependencies=None, callback_url=Undefined.TOKEN, no_reuse=Undefined.TOKEN, image=Undefined.TOKEN, mounts=Undefined.TOKEN, mount_map=Undefined.TOKEN, envs=Undefined.TOKEN, startup_command=Undefined.TOKEN, resources=Undefined.TOKEN, resource_opts=Undefined.TOKEN, cluster_size=Undefined.TOKEN, cluster_mode=Undefined.TOKEN, domain_name=Undefined.TOKEN, group_name=Undefined.TOKEN, bootstrap_script=Undefined.TOKEN, tag=Undefined.TOKEN, scaling_group=Undefined.TOKEN, owner_access_key=Undefined.TOKEN)

Gets or creates a compute session from a template. Any other parameters provided override the corresponding values in the template, including vfolder mounts (they replace the template's mounts and are not appended to them). If name is None, it creates a new compute session as long as the server has enough resources and your API key has remaining quota. If name is a valid string and there is an existing compute session with the same name and the same image, it returns the ComputeSession instance representing the existing session.

Parameters:

template_id (str) – The task template to apply to the compute session.
image (str | Undefined) – The image name and tag for the compute session. Example: python:3.6-ubuntu. Check out the full list of available images in your server using (TODO: new API).
name (str | Undefined) –
A client-side (user-defined) identifier to distinguish the session among currently running sessions. It may be used to seamlessly reuse the session already created.

Changed in version 19.12.0: Renamed from clientSessionToken.
type_ –
Either "interactive" (default) or "batch".

Added in version 19.09.0.
enqueue_only (bool | Undefined) –
Enqueues the session creation request and returns immediately, without waiting for the session to start. (default: False, to preserve the legacy behavior)

Added in version 19.09.0.
max_wait (int | Undefined) –
The time to wait for session startup. If the cluster resources are fully utilized, this waiting time can be arbitrarily long due to job queueing. If the timeout is reached, the returned status field becomes "TIMEOUT". Even in this case, the session may still start later.

Added in version 19.09.0.
no_reuse (bool | Undefined) –
Raises an explicit error if a session with the same image and the same name already exists, instead of returning its information.

Added in version 19.09.0.
mounts (Union[List[str], Undefined]) – The list of vfolder names that belong to the current API access key.
mount_map (Union[Mapping[str, str], Undefined]) – A mapping of custom mount paths for vfolders. Each key is a vfolder name and each value is its custom mount path. Default mounts and relative paths are placed under /home/work. To mount elsewhere, specify an absolute path. The target mount path of a vfolder must not overlap with Linux system folders. vfolders whose names start with a dot (.) are not affected.
envs (Union[Mapping[str, str], Undefined]) – The environment variables, which always bypass the jail policy.
resources (Union[Mapping[str, str | int], Undefined]) – The resource specification. (TODO: details)
cluster_size (int | Undefined) –
The number of containers in this compute session. Must be at least 1.

Added in version 19.09.0.

Changed in version 20.09.0.
cluster_mode (ClusterMode | Undefined) –
Sets the clustering mode, which determines whether the multiple containers of the new session are spawned on distributed nodes or on a single node.

Added in version 20.09.0.
tag (str | Undefined) – An optional string to annotate extra information.
owner_access_key – An optional access key that owns the created session. (Only available to administrators)

Return type:

ComputeSession

Returns:

The ComputeSession instance.
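
The override rule for template-based creation can be sketched with a local sentinel. The snippet below is illustrative only: Undefined here is a stand-in defined in the example (mirroring the Undefined.TOKEN default shown in the signature), apply_overrides is a hypothetical helper, and the template keys are invented for the demonstration.

```python
import enum
from typing import Any, Dict


class Undefined(enum.Enum):
    """Local stand-in for the Undefined.TOKEN sentinel shown in the signature."""
    TOKEN = enum.auto()


undefined = Undefined.TOKEN


def apply_overrides(template: Dict[str, Any], **params: Any) -> Dict[str, Any]:
    """Return the effective settings: explicitly passed values replace template values."""
    effective = dict(template)
    for key, value in params.items():
        if value is not undefined:
            # Whole-value replacement: a list such as mounts is not merged.
            effective[key] = value
    return effective


template = {"image": "python:3.6-ubuntu", "mounts": ["dataset", "models"], "cluster_size": 1}
effective = apply_overrides(template, mounts=["scratch"], cluster_size=undefined, tag="exp-1")
print(effective)
```

Using a sentinel rather than None lets a caller distinguish "not specified, keep the template value" from an explicit None.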

await destroy(*, forced=False, recursive=False): Destroys the compute session. Since the server literally kills the container(s), all ongoing executions are forcibly interrupted.

await restart(): Restarts the compute session. The server force-destroys the currently running container(s), but keeps their temporary scratch directories intact.

await rename(new_id): Renames the session ID of a running compute session.

await commit(): Commits a running session to a tar file on the agent host.

await export_to_image(new_image_name): Commits a running session to a new image and then uploads it to the designated container registry. Requires a Backend.AI server set up for the per-user image commit feature (24.03).

await interrupt(): Tries to interrupt the current ongoing code execution. This may fail without any explicit errors depending on the code being executed.

await complete(code, opts=None)

Gets the auto-completion candidates from the given code string, as if a user has pressed the tab key just after the code in IDEs.

Depending on the language of the compute session, this feature may not be supported. Unsupported sessions return an empty list.

Parameters:

code (str) – An (incomplete) code text.
opts (dict) – Additional information about the current cursor position, such as row, col, line and the remainder text.

Return type:

Iterable[str]

Returns:

An ordered list of strings.
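
One way an editor integration might derive code and opts from a text buffer and a cursor offset is sketched below. The key names row, col, and line follow the parameter description above; the key "post" for the remainder text and the 0-based indexing are assumptions made for this example, and cursor_opts is a hypothetical helper.

```python
from typing import Dict, Union


def cursor_opts(text: str, offset: int) -> Dict[str, Union[int, str]]:
    """Split editor text at a cursor offset into the pieces described for opts."""
    before, after = text[:offset], text[offset:]
    row = before.count("\n")                      # 0-based line index (assumed convention)
    col = len(before) - (before.rfind("\n") + 1)  # 0-based column within that line
    line = text.split("\n")[row]                  # full text of the cursor's line
    # "post" is an assumed key name for the remainder text after the cursor.
    return {"row": row, "col": col, "line": line, "post": after}


text = "import os\nos.pa\nprint('done')"
offset = text.index("os.pa") + len("os.pa")
code = text[:offset]  # the (incomplete) code up to the cursor
opts = cursor_opts(text, offset)
print(code, opts)
```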

await get_info(): Retrieves brief information about the compute session.

await get_logs(): Retrieves the console log of the compute session container.

await get_dependency_graph(): Retrieves the root node of the dependency graph of the compute session.

await get_status_history(): Retrieves the status transition history of the compute session.

await execute(run_id=None, code=None, mode='query', opts=None)

Executes a code snippet directly in the compute session or sends a set of build/clean/execute commands to the compute session.

For more details about using this API, please refer to the official API documentation.

Parameters:

run_id (str) – A unique identifier for a particular run loop. In the first call, it may be None so that the server auto-assigns one. Subsequent calls must use the returned runId value to request continuation or to send user inputs.
code (str) – A code snippet as string. In the continuation requests, it must be an empty string. When sending user inputs, this is where the user input string is stored.
mode (str) – A constant string which is one of "query", "batch", "continue", and "user-input".
opts (dict) – A dict for specifying additional options. Mainly used in the batch mode to specify build/clean/execution commands. See the API object reference for details.

Returns:

An execution result object
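
The run_id, code, and mode rules above imply a small client-side run loop. The following sketch drives such a loop against a scripted stand-in instead of a real session. The result keys "status" and "console", and the status values "continued", "waiting-input", and "finished", are assumptions that should be checked against the official API documentation; only "runId" and the mode strings come from the text above. FakeKernel and run_to_completion are hypothetical names.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List, Optional

ExecuteFn = Callable[..., Awaitable[Dict[str, Any]]]


async def run_to_completion(execute: ExecuteFn, code: str, answers: List[str]) -> List[str]:
    """Drive one run loop until the (assumed) "finished" status is reported."""
    outputs: List[str] = []
    pending_inputs = list(answers)
    # First call: run_id is None so that the server assigns one.
    result = await execute(run_id=None, code=code, mode="query")
    while True:
        outputs.extend(result.get("console", []))
        status = result["status"]
        if status == "finished":
            return outputs
        run_id = result["runId"]
        if status == "waiting-input":
            # The user's input travels in the code argument.
            result = await execute(run_id=run_id, code=pending_inputs.pop(0), mode="user-input")
        else:
            # Continuation requests carry an empty code string.
            result = await execute(run_id=run_id, code="", mode="continue")


class FakeKernel:
    """Scripted stand-in that replays a fixed sequence of results."""

    def __init__(self) -> None:
        self.calls: List[Dict[str, Any]] = []
        self._script = [
            {"runId": "r1", "status": "continued", "console": ["step 1"]},
            {"runId": "r1", "status": "waiting-input", "console": ["name? "]},
            {"runId": "r1", "status": "finished", "console": ["hello Ada"]},
        ]

    async def execute(
        self, run_id: Optional[str] = None, code: Optional[str] = None, mode: str = "query"
    ) -> Dict[str, Any]:
        self.calls.append({"run_id": run_id, "code": code, "mode": mode})
        return self._script[len(self.calls) - 1]


kernel = FakeKernel()
outputs = asyncio.run(run_to_completion(kernel.execute, "name = input('name? ')", ["Ada"]))
print(outputs)
```

With a real compute session, kernel.execute would be replaced by the session's own execute method, and the console entries would have whatever shape the server actually returns.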

await upload(files, basedir=None, show_progress=False)

Uploads the given list of files to the compute session. You may refer to them in the batch-mode execution or from the code executed in the server afterwards.

Parameters:

files (Sequence[str | Path]) –
The list of file paths on the client side. If the paths include directories, their location in the compute session is calculated from the path relative to basedir, and all intermediate parent directories are created automatically if they do not exist.

For example, if a file path is /home/user/test/data.txt (or test/data.txt) where basedir is /home/user (or the current working directory is /home/user), the uploaded file is located at /home/work/test/data.txt in the compute session container.
basedir (Union[str, Path, None]) – The directory prefix where the files reside. The default value is the current working directory.
show_progress (bool) – Displays a progress bar during uploads.
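
The path calculation described for files and basedir can be reproduced locally to predict where an upload will land. This is a sketch of the documented rule using POSIX-style paths, not the client's implementation; remote_upload_path is a hypothetical helper, and how the real client treats a file outside basedir is not covered here (the sketch simply raises ValueError).

```python
import os
from pathlib import PurePosixPath
from typing import Optional


def remote_upload_path(file: str, basedir: Optional[str] = None) -> str:
    """Predict the in-container location of an uploaded file."""
    # The default basedir is the current working directory, as documented.
    base = PurePosixPath(basedir if basedir is not None else os.getcwd())
    # Relative inputs are interpreted against basedir; absolute inputs are kept.
    absolute = base / PurePosixPath(file)
    relative = absolute.relative_to(base)  # ValueError if file lies outside basedir
    return str(PurePosixPath("/home/work") / relative)


from_absolute = remote_upload_path("/home/user/test/data.txt", basedir="/home/user")
from_relative = remote_upload_path("test/data.txt", basedir="/home/user")
print(from_absolute, from_relative)
```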

await download(files, dest='.', show_progress=False)

Downloads the given list of files from the compute session.

Parameters:

files (Sequence[str | Path]) – The list of file paths in the compute session. Relative paths are resolved from /home/work in the compute session container.
dest (str | Path) – The destination directory on the client side.
show_progress (bool) – Displays a progress bar during downloads.

await list_files(path='.')

Gets the list of files in the given path inside the compute session container.

Parameters:

path (str | Path) – The directory path in the compute session.

await get_abusing_report(): Retrieves the abusing reports of the session’s sibling kernels.

await start_service(app, *, port=Undefined.TOKEN, envs=Undefined.TOKEN, arguments=Undefined.TOKEN, login_session_token=Undefined.TOKEN)

Starts an application in the Backend.AI session and returns the credentials for accessing the AppProxy endpoint.

Return type:

Mapping[str, Any]

listen_events(scope='*')

Opens a stream of the kernel lifecycle events. Only the master kernel of each session is monitored.

Return type:

SSEContextManager

Returns:

A StreamEvents object.

stream_events(scope='*')

Opens a stream of the kernel lifecycle events. Only the master kernel of each session is monitored.

Return type:

SSEContextManager

Returns:

A StreamEvents object.

stream_pty()

Opens a pseudo-terminal of the kernel (if supported) streamed via websockets.

Return type:

WebSocketContextManager

Returns:

A StreamPty object.

stream_execute(code='', *, mode='query', opts=None)

Executes a code snippet in the streaming mode. Since the returned websocket represents a run loop, there is no need to specify run_id explicitly.

Return type:

WebSocketContextManager

class ai.backend.client.func.session.StreamPty(session, underlying_response, **kwargs): A subclass of WebSocketResponse that provides additional functions to control the terminal.