Install Agent

This guide assumes that your system has an admin user named devops with sudo privileges and that your Backend.AI manager is already set up and running.

Guide variables

⚠️ Prepare values for the following variables before you start, and substitute them wherever the placeholders appear in this guide.

Name        Meaning
{NS}        The etcd namespace (any unique string, such as a domain name)
{ETCDADDR}  The etcd cluster address ({ETCDHOST}:{ETCDPORT}; localhost:2379 for a development setup)
{ENDPOINT}  The DNS hostname of the API server (depending on your environment, either a publicly registered domain or a local private domain)
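
If you keep the scripts from this guide as templates, the substitution can be scripted rather than done by hand. The following is a minimal sketch, not part of Backend.AI: the function name fill_guide_vars and the example namespace are hypothetical, and it only handles the {NS} and {ETCDADDR} placeholders.

```shell
# Hypothetical helper: fill the {NS} and {ETCDADDR} placeholders.
# Reads a template on stdin and writes the filled-in text to stdout.
# Usage: fill_guide_vars <namespace> <etcd-addr> < template > output
fill_guide_vars() {
    local ns="$1" etcd_addr="$2"
    # '|' is the sed delimiter, so the values must not contain '|', '&' or '\'.
    sed -e "s|{NS}|${ns}|g" -e "s|{ETCDADDR}|${etcd_addr}|g"
}
```

For example, `fill_guide_vars my.example.ns localhost:2379 < run-agent.sh.tmpl > run-agent.sh` would produce a script with both values filled in, where run-agent.sh.tmpl is an assumed file name.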

Optional variables

Name         Meaning
{SSLCERT}    The path to your SSL certificate (bundled with CA chain certificates)
{SSLPKEY}    The path to your SSL private key
{S3AKEY}     The access key for AWS S3 or compatible services [1]
{S3SKEY}     The secret key for AWS S3 or compatible services
{DDAPIKEY}   The Datadog API key
{DDAPPKEY}   The Datadog application key
{SENTRYURL}  The private Sentry report URL
[1] AWS S3 is used to store the output files generated by the user code in the kernels’ /home/work/.output directory. If the keys are not specified, Backend.AI skips uploading the generated files.

Install dependencies for daemonization

Ubuntu

$ sudo apt-get -y update
$ sudo apt-get -y dist-upgrade
$ sudo apt-get install -y ca-certificates git-core supervisor

Here are some optional but useful packages:

$ sudo apt-get install -y vim tmux htop

CentOS / RHEL

(TODO)

Prepare CUDA (if available)

Check out the [[Install CUDA]] guide.

Prepare Python 3.6+

Check out [[Install Python via pyenv]] for instructions. Create a virtualenv named venv-agent.

(Linux only) To enable detailed resource statistics, grant the Python executable the CAP_SYS_ADMIN, CAP_SYS_PTRACE, and CAP_DAC_OVERRIDE capabilities.

$ sudo setcap cap_sys_ptrace,cap_sys_admin,cap_dac_override+eip "$(readlink -f $(pyenv which python))"
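
To confirm that the capabilities were applied, you can inspect the binary with getcap, which ships alongside setcap. The sketch below is an assumption-laden convenience, not part of Backend.AI: has_required_caps is a hypothetical name, and it only looks for the three capability names as substrings because the exact getcap output format differs between libcap versions.

```shell
# Hypothetical helper: succeed only if every capability named in this guide
# appears somewhere in the given getcap output line.
has_required_caps() {
    local line="$1" cap
    for cap in cap_sys_ptrace cap_sys_admin cap_dac_override; do
        case "$line" in
            *"$cap"*) ;;                                   # found, check the next one
            *) echo "missing: $cap" >&2; return 1 ;;       # report the first missing one
        esac
    done
}

# Example call (requires pyenv and getcap on the agent host):
# has_required_caps "$(getcap "$(readlink -f "$(pyenv which python)")")"
```

File capabilities are attached to the binary itself, so they need to be set again whenever the Python executable is replaced, for example after reinstalling that Python version.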

Install Backend.AI Agent as Package

$ pyenv shell venv-agent
$ pip install -U setuptools pip
$ pip install -U backend.ai-agent

Monitoring and Logging

Check out the [[Install Monitoring and Logging Tools]] guide.

Configure supervisord

$ sudo vi /etc/supervisor/conf.d/apps.conf

[program:backendai-agent]
user = devops
stopsignal = TERM
stopasgroup = true
command = /home/devops/run-agent.sh

$ vi /home/devops/init-venv.sh

#!/bin/bash
export PYENV_ROOT="$HOME/.pyenv"
export PATH="$PYENV_ROOT/bin:$PATH"
eval "$(pyenv init -)"
eval "$(pyenv virtualenv-init -)"
pyenv shell venv-agent

$ sudo mkdir -p /var/cache/scratches
$ sudo chown devops:devops /var/cache/scratches
$ vi /home/devops/run-agent.sh

#!/bin/bash
source /home/devops/init-venv.sh
umask 0002
export AWS_ACCESS_KEY_ID="{S3AKEY}"
export AWS_SECRET_ACCESS_KEY="{S3SKEY}"
export DATADOG_API_KEY={DDAPIKEY}
export DATADOG_APP_KEY={DDAPPKEY}
export RAVEN_URI="{SENTRYURL}"
exec python -m ai.backend.agent.server \
            --etcd-addr {ETCDADDR} \
            --namespace {NS} \
            --scratch-root=/var/cache/scratches

$ chmod +x /home/devops/run-agent.sh
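
Before starting the agent, it can help to check that no {NAME} placeholder was left unreplaced in the scripts above. The following is a minimal sketch with a hypothetical function name; it treats any brace-wrapped run of upper-case letters and digits as a placeholder and skips shell expansions such as ${HOME}.

```shell
# Hypothetical helper: list lines that still contain an unfilled {NAME}
# placeholder. Prints matches as "line-number:line" and returns 1 if any
# are found, 0 otherwise.
find_unfilled_placeholders() {
    local file="$1"
    # The leading (^|[^$]) excludes shell expansions such as ${HOME}.
    if grep -nE '(^|[^$])\{[A-Z0-9]+\}' "$file"; then
        return 1
    fi
    return 0
}

# Example call:
# find_unfilled_placeholders /home/devops/run-agent.sh
```

If you do not use one of the optional services, remove the corresponding export line rather than leaving its placeholder in place.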

Prepare Kernel Images

You need to pull the kernel container images before you can spawn compute sessions. The name and tag pairs of the images must also be listed in the backend.ai-manager/sample-configs/image-metadata.yml file that is imported into etcd.

Here are the pull commands for a few commonly used Python-based images:

$ docker pull lablup/kernel-python:3.6-debian
$ docker pull lablup/kernel-python-tensorflow:1.8-py36
$ docker pull lablup/kernel-python-tensorflow:1.8-py36-gpu
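
When you need more than a handful of images, the pulls can be wrapped in a small loop. This is only a sketch: pull_kernel_images and the DOCKER override variable are hypothetical names introduced here, not part of Backend.AI or Docker.

```shell
# Hypothetical helper: pull every image given as an argument, stopping at the
# first failure. Set DOCKER=echo for a dry run that only prints the commands.
pull_kernel_images() {
    for image in "$@"; do
        "${DOCKER:-docker}" pull "$image" || return 1
    done
}

# Example call with the images from this guide:
# pull_kernel_images \
#     lablup/kernel-python:3.6-debian \
#     lablup/kernel-python-tensorflow:1.8-py36 \
#     lablup/kernel-python-tensorflow:1.8-py36-gpu
```

Stopping at the first failure makes it obvious which image could not be pulled, so you can fix the name or tag before retrying.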

For the full list of publicly available kernels, check out the kernels repository.

Finally, Run!

$ sudo supervisorctl reread
$ sudo supervisorctl update
$ sudo supervisorctl status backendai-agent