Daily Development Workflows
About Pants
Since 22.09, we have migrated to Pants as our primary build system and dependency manager for the mono-repository of Python components.
Pants is a graph-based async-parallel task executor written in Rust and Python. It is tailored to building programs with explicit and auto-inferred dependency checks and aggressive caching.
Key concepts
The command pattern:
$ pants [GLOBAL_OPTS] GOAL [GOAL_OPTS] [TARGET ...]
Goal: an action to execute
You may think this as the root node of the task graph executed by Pants.
Target: objectives for the action, usually expressed as
path/to/dir:name
The targets are declared/defined by
path/to/dir/BUILD
files.
The global configuration is at
pants.toml
.Recommended reading: https://www.pantsbuild.org/docs/concepts
Inspecting build configurations
Display all targets
$ pants list ::
This list includes the full enumeration of individual targets auto-generated by collective targets (e.g.,
python_sources()
generates multiplepython_source()
targets by globbing thesources
pattern)
Display all dependencies of a specific target (i.e., all targets required to build this target)
$ pants dependencies --transitive src/ai/backend/common:src
Display all dependees of a specific target (i.e., all targets affected when this target is changed)
$ pants dependees --transitive src/ai/backend/common:src
Note
Pants statically analyzes the source files to enumerate all its imports
and determine the dependencies automatically. In most cases this works well,
but sometimes you may need to manually declare explicit dependencies in
BUILD
files.
Running lint and check
Run lint/check for all targets:
$ pants lint ::
$ pants check ::
To run lint/check for a specific target or a set of targets:
$ pants lint src/ai/backend/common:: tests/common::
$ pants check src/ai/backend/manager::
Currently running mypy with pants is slow because mypy cannot utilize its own cache as pants invokes mypy per file due to its own dependency management scheme.
(e.g., Checking all sources takes more than 1 minutes!)
This performance issue is being tracked by pantsbuild/pants#10864. For now, try using a
smaller target of files that you work on and use an option to select the
targets only changed (--changed-since
).
Running formatters
If you encounter failure from ruff
, you may run the following to automatically fix the import ordering issues.
$ pants fix ::
If you encounter failure from black
, you may run the following to automatically fix the code style issues.
$ pants fmt ::
Running unit tests
Here are various methods to run tests:
$ pants test ::
$ pants test tests/manager/test_scheduler.py::
$ pants test tests/manager/test_scheduler.py:: -- -k test_scheduler_configs
$ pants test tests/common:: # Run common/**/test_*.py
$ pants test tests/common:tests # Run common/test_*.py
$ pants test tests/common/redis:: # Run common/redis/**/test_*.py
$ pants test tests/common/redis:tests # Run common/redis/test_*.py
You may also try --changed-since
option like lint
and check
.
To specify extra environment variables for tests, use the --test-extra-env-vars
option:
$ pants test \
> --test-extra-env-vars=MYVARIABLE=MYVALUE \
> tests/common:tests
Running integration tests
$ ./backend.ai test run-cli user,admin
Building wheel packages
To build a specific package:
$ pants \
> --tag="wheel" \
> package \
> src/ai/backend/common:dist
$ ls -l dist/*.whl
If the package content varies by the target platform, use:
$ pants \
> --tag="wheel" \
> --tag="+platform-specific" \
> --platform-specific-resources-target=linux_arm64 \
> package \
> src/ai/backend/runner:dist
$ ls -l dist/*.whl
Using IDEs and editors
Pants has an export
goal to auto-generate a virtualenv that contains all
external dependencies installed in a single place.
This is very useful when you use IDEs and editors.
To (re-)generate the virtualenv(s), run:
$ pants export --resolve=RESOLVE_NAME # you may add multiple --resolve options
You may display the available resolve names by (the command works with Python 3.11 or later):
$ python -c 'import tomllib,pathlib;print("\n".join(tomllib.loads(pathlib.Path("pants.toml").read_text())["python"]["resolves"].keys()))'
Similarly, you can export all virtualenvs at once:
$ python -c 'import tomllib,pathlib;print("\n".join(tomllib.loads(pathlib.Path("pants.toml").read_text())["python"]["resolves"].keys()))' | sed 's/^/--resolve=/' | xargs ./pants export
Then configure your IDEs/editors to use
dist/export/python/virtualenvs/python-default/PYTHON_VERSION/bin/python
as the
interpreter for your code, where PYTHON_VERSION
is the interpreter version
specified in pants.toml
.
As of Pants 2.16, you must export the virtualenvs by the individual lockfiles
using the --resolve
option, as all tools are unified to use the same custom resolve subsystem of Pants and the ::
target no longer works properly, like:
$ pants export --resolve=python-default --resolve=mypy
To make LSP (language server protocol) services like PyLance to detect our source packages correctly,
you should also configure PYTHONPATH
to include the repository root’s src
directory and
plugins/*/
directories if you have added Backend.AI plugin checkouts.
For linters and formatters, configure the tool executable paths to indicate
dist/export/python/virtualenvs/RESOLVE_NAME/PYTHON_VERSION/bin/EXECUTABLE
.
For example, ruff’s executable path is
dist/export/python/virtualenvs/ruff/3.11.4/bin/ruff
.
Currently we have the following Python tools to configure in this way:
ruff
: Provides a fast linting (combining pylint, flake8, and isort) and formatting (auto-fix for some linting rules and isort)mypy
: Validates the type annotations and performs a static analysisblack
: Validates and reformats all Python codes by reconstructing it from AST, just likegofmt
.Tip
For a long list of arguments or list/tuple items, you could explicitly add a trailing comma to force Black to insert line-breaks after every item even when the line length does not exceed the limit (100 characters).
Tip
You may disable auto-formatting on a specific region of code using
# fmt: off
and# fmt: on
comments, though this is strongly discouraged except when manual formatting gives better readability, such as numpy matrix declarations.pytest
: The unit test runner framework.coverage-py
: Generates reports about which source lines were visited during execution of a pytest session.towncrier
: Generates the changelog from news fragments in thechanges
directory when making a new release.
VSCode
Install the following extensions:
Python (
ms-python.python
)Pylance (
ms-python.vscode-pylance
) (optional but recommended)Black (
ms-python.black-formatter
)Mypy (
ms-python.mypy-type-checker
)Ruff (
charliermarsh.ruff
)
Set the workspace settings for the Python extension for code navigation and auto-completion:
Setting ID |
Recommended value |
---|---|
|
true |
|
|
|
|
|
|
Set the following keys in the workspace settings to configure Python tools:
Setting ID |
Example value |
---|---|
|
|
|
|
|
|
|
|
Note
Changed in July 2023
After applying the VSCode Python Tool migration,
we no longer recommend to configure python.linting.*Path
and python.formatting.*Path
keys.
Vim/NeoVim
There are a large variety of plugins and usually heavy Vimmers should know what to do.
We recommend using ALE or CoC plugins to have automatic lint highlights, auto-formatting on save, and auto-completion support with code navigation via LSP backends.
Warning
Note that it is recommended to enable only one linter/formatter at a time (either ALE or CoC) with proper configurations, to avoid duplicate suggestions and error reports.
When using ALE, it is recommended to have a directory-local vimrc as follows.
First, add set exrc
in your user-level vimrc.
Then put the followings in .vimrc
(or .nvimrc
for NeoVim) in the build root directory:
let s:cwd = getcwd()
let g:ale_python_black_executable = s:cwd . '/dist/export/python/virtualenvs/black/3.11.4/bin/black' " requires absolute path
let g:ale_python_mypy_executable = s:cwd . '/dist/export/python/virtualenvs/mypy/3.11.4/bin/mypy'
let g:ale_python_ruff_executable = s:cwd . '/dist/export/python/virtualenvs/ruff/3.11.4/bin/ruff'
let g:ale_linters = { "python": ['ruff', 'black', 'mypy'] }
let g:ale_fixers = {'python': ['ruff', 'black']}
let g:ale_fix_on_save = 1
When using CoC, run :CocInstall coc-pyright @yaegassy/coc-ruff
and :CocLocalConfig
after opening a file
in the local working copy to initialize PyRight functionalities.
In the local configuration file (.vim/coc-settings.json
), you may put the linter/formatter configurations
just like VSCode (see the official reference):
To activate Ruff (a Python linter and fixer), run :CocCommand ruff.builtin.installServer
.
{
"coc.preferences.formatOnType": true,
"coc.preferences.formatOnSaveFiletypes": ["python"],
"coc.preferences.willSaveHandlerTimeout": 5000,
"ruff.enabled": true,
"ruff.autoFixOnSave": false, # Use code actions to fix individual errors
"ruff.useDetectRuffCommand": false,
"ruff.builtin.pythonPath": "dist/export/python/virtualenvs/ruff/3.11.4/bin/python",
"ruff.serverPath": "dist/export/python/virtualenvs/ruff/3.11.4/bin/ruff-lsp",
"python.pythonPath": "dist/export/python/virtualenvs/python-default/3.11.4/bin/python",
"python.formatting.provider": "black",
"python.formatting.blackPath": "dist/export/python/virtualenvs/black/3.11.4/bin/black",
"python.linting.mypyEnabled": true,
"python.linting.mypyPath": "dist/export/python/virtualenvs/mypy/3.11.4/bin/mypy",
}
Switching between branches
When each branch has different external package requirements, you should run pants export
before running codes after git switch
-ing between such branches.
Sometimes, you may experience bogus “glob” warning from pants because it sees a stale cache.
In that case, run pgrep pantsd | xargs kill
and it will be fine.
Running entrypoints
To run a Python program within the unified virtualenv, use the ./py
helper
script. It automatically passes additional arguments transparently to the
Python executable of the unified virtualenv.
./backend.ai
is an alias of ./py -m ai.backend.cli
.
Examples:
$ ./py -m ai.backend.storage.server
$ ./backend.ai mgr start-server
$ ./backend.ai ps
Working with plugins
To develop Backend.AI plugins together, the repository offers a special location
./plugins
where you can clone plugin repositories and a shortcut script
scripts/install-plugin.sh
that does this for you.
$ scripts/install-plugin.sh lablup/backend.ai-accelerator-cuda-mock
This is equivalent to:
$ git clone \
> https://github.com/lablup/backend.ai-accelerator-cuda-mock \
> plugins/backend.ai-accelerator-cuda-mock
These plugins are auto-detected by scanning setup.cfg
of plugin subdirectories
by the ai.backend.plugin.entrypoint
module, even without explicit editable installations.
Writing test cases
Mostly it is just same as before: use the standard pytest practices. Though, there are a few key differences:
Tests are executed in parallel in the unit of test modules.
Therefore, session-level fixtures may be executed multiple times during a single run of
pants test
.
Warning
If you interrupt (Ctrl+C, SIGINT) a run of pants test
, it will
immediately kill all pytest processes without fixture cleanup. This may
accumulate unused Docker containers in your system, so it is a good practice
to run docker ps -a
periodically and clean up dangling containers.
To interactively run tests, see Debugging test cases (or interactively running test cases).
Here are considerations for writing Pants-friendly tests:
Ensure that it runs in an isolated/mocked environment and minimize external dependency.
If required, use the environment variable
BACKEND_TEST_EXEC_SLOT
(an integer value) to uniquely define TCP port numbers and other resource identifiers to allow parallel execution. Refer the Pants docs.Use
ai.backend.testutils.bootstrap
to populate a single-node Redis/etcd/Postgres container as fixtures of your test cases. Import the fixture and use it like a plain pytest fixture.These fixtures create those containers with OS-assigned public port numbers and give you a tuple of container ID and a
ai.backend.common.types.HostPortPair
for use in test codes. In manager and agent tests, you could just referlocal_config
to get a pre-populated local configurations with those port numbers.In this case, you may encounter
flake8
complaining about unused imports and redefinition. Use# noqa: F401
and# noqa: F811
respectively for now.
Warning
About using /tmp in tests
If your Docker service is installed using Snap (e.g., Ubuntu 20.04 or
later), it cannot access the system /tmp
directory because Snap applies a
private “virtualized” tmp directory to the Docker service.
You should use other locations under the user’s home directory (or
preferably .tmp
in the working copy directory) to avoid mount failures
for the developers/users in such platforms.
It is okay to use the system /tmp
directory if they are not mounted inside
any containers.
Writing documentation
Create a new pyenv virtualenv based on Python 3.10.
$ pyenv virtualenv 3.10.9 venv-bai-docs
Activate the virtualenv and run:
$ pyenv activate venv-bai-docs $ pip install -U pip setuptools wheel $ pip install -U -r docs/requirements.txt
You can build the docs as follows:
$ cd docs $ pyenv activate venv-bai-docs $ make html
To locally serve the docs:
$ cd docs $ python -m http.server --directory=_build/html
(TODO: Use Pants’ own Sphinx support when pantsbuild/pants#15512 is released.)
Advanced Topics
Adding new external dependencies
Add the package version requirements to the unified requirements file (
./requirements.txt
).Update the
module_mapping
field in the root build configuration (./BUILD
) if the package name and its import name differs.Update the
type_stubs_module_mapping
field in the root build configuration if the package provides a type stubs package separately.Run:
$ pants generate-lockfiles $ pants export
Merging lockfile conflicts
When you work on a branch that adds a new external dependency and the main branch has also
another external dependency addition, merging the main branch into your branch is likely to
make a merge conflict on python.lock
file.
In this case, you can just do the followings since we can just regenerate the lockfile
after merging requirements.txt
and BUILD
files.
$ git merge main
... it says a conflict on python.lock ...
$ git checkout --theirs python.lock
$ pants generate-lockfiles --resolve=python-default
$ git add python.lock
$ git commit
Resetting Pants
If Pants behaves strangely, you could simply reset all its runtime-generated files by:
$ pgrep pantsd | xargs -r kill
$ rm -r /tmp/*-pants/ .pants.d .pids ~/.cache/pants
After this, re-running any Pants command will automatically reinitialize itself and all cached data as necessary.
Note that you may find out the concrete path inside /tmp
from .pants.rc
’s
local_execution_root_dir
option set by install-dev.sh
.
Warning
If you have run pants
or the installation script with sudo
, some of the above directories
may be owned by root and running pants
as the user privilege would not work.
In such cases, remove the directories with sudo
and retry.
Resolving missing directories error when running Pants
ValueError: Failed to create temporary directory for immutable inputs: No such file or directory (os error 2) at path "/tmp/bai-dev-PN4fpRLB2u2xL.j6-pants/immutable_inputsvIpaoN"
If you encounter errors like above when running daily Pants commands like lint
,
you may manually create the directory one step higher.
For the above example, run:
mkdir -p /tmp/bai-dev-PN4fpRLB2u2xL.j6-pants/
If this workaround does not work, backup your current working files and
reinstall by running scripts/delete-dev.sh
and scripts/install-dev.sh
serially.
Changing or updating the Python runtime for Pants
When you run scripts/install-dev.sh
, it automatically creates .pants.bootstrap
to explicitly set a specific pyenv Python version to run Pants.
If you have removed/upgraded this specific Python version from pyenv, you also need to
update .pants.bootstrap
accordingly.
Debugging test cases (or interactively running test cases)
When your tests hang, you can try adding the --debug
flag to the pants test
command:
$ pants test --debug ...
so that Pants runs the designated test targets serially and interactively.
This means that you can directly observe the console output and Ctrl+C to
gracefully shutdown the tests with fixture cleanup. You can also apply
additional pytest options such as --fulltrace
, -s
, etc. by passing them
after target arguments and --
when executing pants test
command.
Installing a subset of mono-repo packages in the editable mode for other projects
Sometimes, you need to editable-install a subset of packages into other project’s directories. For instance you could mount the client SDK and its internal dependencies for a Docker container for development.
In this case, we recommend to do it as follows:
Run the following command to build a wheel from the current mono-repo source:
$ pants --tag=wheel package src/ai/backend/client:dist
This will generate
dist/backend.ai_client-{VERSION}-py3-none-any.whl
.Run
pip install -U {MONOREPO_PATH}/dist/{WHEEL_FILE}
in the target environment.This will populate the package metadata and install its external dependencies. The target environment may be one of a separate virtualenv or a container being built. For container builds, you need to first
COPY
the wheel file and install it.Check the internal dependency directories to link by running the following command:
$ pants dependencies --transitive src/ai/backend/client:src \ > | grep src/ai/backend | grep -v ':version' | cut -d/ -f4 | uniq cli client plugin
Link these directories in the target environment.
For example, if it is a Docker container, you could add
-v {MONOREPO_PATH}/src/ai/backend/{COMPONENT}:/usr/local/lib/python3.10/site-packages/ai/backend/{COMPONENT}
to thedocker create
ordocker run
commands for all the component directories found in the previous step.If it is a local checkout with a pyenv-based virtualenv, you could replace
$(pyenv prefix)/lib/python3.10/site-packages/ai/backend/{COMPONENT}
directories with symbolic links to the mono-repo’s component source directories.
Boosting the performance of Pants commands
Since Pants uses temporary directories for aggressive caching, you could make
the .tmp
directory under the working copy root a tmpfs partition:
$ sudo mount -t tmpfs -o size=4G tmpfs .tmp
To make this persistent across reboots, add the following line to
/etc/fstab
:tmpfs /path/to/dir/.tmp tmpfs defaults,size=4G 0 0
The size should be more than 3GB. (Running
pants test ::
consumes about 2GB.)To change the size at runtime, you could simply remount it with a new size option:
$ sudo mount -t tmpfs -o remount,size=8G tmpfs .tmp
Making a new release
Update
./VERSION
file to set a new version number. (Remove the ending new line, e.g., usingset noeol
in Vim. This is also configured in./editorconfig
)Run
LOCKSET=tools/towncrier ./py -m towncrier
to auto-generate the changelog.You may append
--draft
to see a preview of the changelog update without actually modifying the filesystem.(WIP: lablup/backend.ai#427).
Make a new git commit with the commit message: “release: <version>”.
Make an annotated tag to the commit with the message: “Release v<version>” or “Pre-release v<version>” depending on the release version.
Push the commit and tag. The GitHub Actions workflow will build the packages and publish them to PyPI.
Backporting to legacy per-pkg repositories
Use
git diff
andgit apply
instead ofgit cherry-pick
.To perform a three-way merge for conflicts, add
-3
option to thegit apply
command.You may need to rewrite some codes as the package structure differs. (The new mono repository has more fine-grained first party packages divided from the
backend.ai-common
package.)
When referring the PR/issue numbers in the commit for per-pkg repositories, update them like
lablup/backend.ai#NNN
instead of#NNN
.