Cluster Networking
Single-node cluster session
If a session is created with multiple containers with a single-node option, all containers are created in a single agent. The containers share a private bridge network in addition to the default network, so that they could interact with each other privately. There are no firewall restrictions in this private bridge network.
Multi-node cluster session
For even larger-scale computation, you may create a multi-node cluster session that spans across multiple agents. In this case, the manager auto-configures a private overlay network, so that the containers could interact with each other. There are no firewall restrictions in this private overlay network.
Detection of clustered setups
There is a concept called cluster role.
The current version of Backend.AI creates homogeneous cluster sessions by replicating the same resource configuration and the same container image,
but we have plans to add heterogeneous cluster sessions that have different resource and image configurations for each cluster role.
For instance, a Hadoop cluster may have two types of containers: name nodes and data nodes, where they could be mapped to main
and sub
cluster roles.
All interactive apps are executed only in the main1
container which is always present in both cluster and non-cluster sessions.
It is the user application’s responsibility to connect with and utilize other containers in a cluster session.
To ease the process, Backend.AI injects the following environment variables into the containers and sets up a random-generated SSH keypairs between the containers so that each container ssh into others without additional prompts.:
Environment Variable |
Meaning |
Examples |
---|---|---|
|
The number of containers in this cluster session. |
|
|
A comma-separated list of container hostnames in this cluster session. |
|
|
A comma-separated key:value pairs of cluster roles and the replica counts for each role. |
|
|
The container hostname of the current container. |
|
|
The one-based index of the current container from the containers sharing the same cluster role. |
|
|
The name of the current container’s cluster role. |
|
|
The zero-based global index of the current container within the entire cluster session. |
|