Buck: Skylark

Skylark

A bit of history

Historically, Buck relied on Python for describing build files and macros. This allowed Buck users to implement many missing features without having to modify Buck's core. While it worked fine for local builds and small repositories, at scale the ability to access the host environment and perform arbitrary actions without Buck's knowledge led to non-deterministic, hard-to-debug issues and slow parsing.

To address some of these issues, Buck introduced features like allow_unsafe_import, but we were ultimately unable to provide proper sandboxing for deterministic parsing and a new solution had to be put in place.

Present day

In order to tackle Python DSL parser limitations, Buck added polyglot language support and provided a built-in parser, Skylark, as an alternative to the Python DSL parser.

Enabling Skylark parser

In order to enable Skylark support for your project, please add

[parser]
  polyglot_parsing_enabled = true
  default_build_file_syntax = SKYLARK
to your .buckconfig file. This is recommended for new projects and will become the default in the future. If most of your build files or macros rely on Python DSL features and you're not ready to invest into migrating to Skylark, you can replace
default_build_file_syntax = SKYLARK
with
default_build_file_syntax = PYTHON_DSL
to use Python DSL parser by default. Since Skylark will soon be the default, it's highly recommended to start the migration. To make it easier, Buck gives you control over which parser to use for parsing individual build files. Adding
# BUILD FILE SYNTAX: SKYLARK
as the very first line of the build file will result in Buck using Skylark parser for parsing it. Similarly,
# BUILD FILE SYNTAX: PYTHON_DSL
will result in Python DSL parser being used.

It's best to enable the Skylark parser globally and add

# BUILD FILE SYNTAX: PYTHON_DSL
to all build files that rely on Python DSL features.

Note that all of the options above require polyglot parsing to be enabled:

[parser]
  polyglot_parsing_enabled = true
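
When planning a gradual migration, it can help to see which parser each build file requests. The following is a minimal sketch of such a check, written as plain Python tooling rather than anything Buck provides. The helper name declared_syntax is hypothetical, and the sketch assumes the marker must match the text shown above exactly:

```python
# Hypothetical helper, not part of Buck: reports which parser a build file
# requests through its first-line marker.
_MARKER_PREFIX = "# BUILD FILE SYNTAX: "
_KNOWN_SYNTAXES = ("SKYLARK", "PYTHON_DSL")

def declared_syntax(build_file_text):
    """Return the syntax named on the very first line, or None if absent."""
    first_line = build_file_text.split("\n", 1)[0].strip()
    if not first_line.startswith(_MARKER_PREFIX):
        return None
    syntax = first_line[len(_MARKER_PREFIX):].strip()
    # Unknown values are treated as "no marker" in this sketch.
    return syntax if syntax in _KNOWN_SYNTAXES else None
```

A file for which this returns None would be parsed with whatever default_build_file_syntax is configured.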

Migrating from Python to Skylark

The Skylark language was created specifically to address the issues described above, along with many others, which is why the Buck team decided to replace the Python DSL with Skylark as the language for build files and extension files. Unfortunately, the migration cannot be fully automated, so the sections below describe how to handle common tasks when migrating to Skylark.

include_defs

The include_defs function is not supported in Skylark because it pollutes the execution environment by default and makes automated refactoring much harder. To replace a usage of
include_defs("//tools/my_macro.bzl")
you should:
  • find all symbols defined in the my_macro.bzl file that are actually used by this file. Say, for example, it needs foo and bar.
  • replace the include_defs invocation with an equivalent load() function invocation that explicitly imports the needed symbols:
    load("//tools:my_macro.bzl", "foo", "bar")
Note that the load() function uses the build target pattern syntax, as if there were an
export_file(name="my_macro.bzl")
defined in a tools package build file. This means that instead of //package/extension.bzl syntax expected by include_defs(), a similar load() invocation would expect //package:extension.bzl.
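
The path-to-label conversion is mechanical, so it can be scripted when many files need updating. Below is a minimal sketch as plain Python tooling. The function name is hypothetical, and it assumes the directory containing the extension file is itself a package (that is, it has a build file); otherwise the label has to be adjusted by hand:

```python
def include_defs_path_to_load_label(path):
    """Convert '//package/extension.bzl' into '//package:extension.bzl'."""
    if not path.startswith("//"):
        # This sketch only handles root-relative paths.
        return None
    # Split on the last slash: everything before it is treated as the package.
    package, _, file_name = path[2:].rpartition("/")
    return "//" + package + ":" + file_name
```

For example, include_defs_path_to_load_label("//tools/my_macro.bzl") yields "//tools:my_macro.bzl". The list of symbols to import still has to be determined by reading the file that used include_defs.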

Environment variables

Environment variables are implicit and frequently result in non-reproducible builds because of environment variable differences across machines. They must be replaced with corresponding configuration variables. For example, instead of

my_var = py_sdk.os.environ.get('MY_VAR', 'foo')
you should use
my_var = read_config('my_project', 'my_var', 'foo')
in your build file or extension file. When calling buck, instead of passing
env MY_VAR='some_value' buck ...
you should pass a configuration flag
buck ... -c my_project.my_var=some_value

Note that while the Python DSL parser allows read_config() to be invoked during extension file evaluation, either directly or through a chain of other function invocations, the Skylark parser does not support this, so that it can track configuration option usage more precisely. Because of this, a top-level read_config() function invocation like

foo = read_config(...)
would either have to be performed in the build file directly or, preferably, moved into a well-named function within an extension file. If configuration options are used to instantiate expensive objects that should be created only once, consider replacing code like
FOO = expensive1() if read_config(...) else expensive2()
with something like
_EXPENSIVE1 = expensive1()
_EXPENSIVE2 = expensive2()

def foo():
  return _EXPENSIVE1 if read_config(...) else _EXPENSIVE2
While it can result in instantiation of an unnecessary expensive object, it may still be more efficient than instantiating one of the expensive objects during each foo invocation. Having said that, please start simple and optimize only if performance overhead becomes noticeable.
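
As a minimal sketch of the simpler case, the snippet below moves a top-level read_config() call into a well-named function. So that it runs as plain Python outside Buck, it defines a stand-in read_config() backed by a hypothetical dictionary; in a real extension file that stand-in would be dropped and Buck's built-in would be used. The names get_my_var and get_other_var are illustrative only:

```python
# Stand-in for Buck's built-in read_config(), only so this sketch runs as
# plain Python. The configuration values below are hypothetical.
_FAKE_CONFIG = {("my_project", "my_var"): "some_value"}

def read_config(section, field, default=None):
    return _FAKE_CONFIG.get((section, field), default)

# Instead of a top-level `MY_VAR = read_config(...)`, expose a function so
# the option is read when a build file calls it, not when the extension
# file itself is evaluated.
def get_my_var():
    return read_config("my_project", "my_var", "foo")

def get_other_var():
    # Falls back to the default because the option is not set.
    return read_config("my_project", "unset_var", "fallback")
```

Build files would then call get_my_var() wherever they previously referenced the top-level variable.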

isinstance

The isinstance function is not available in Skylark because Skylark does not support inheritance, but some of its usages can be replaced with the type function. For example,

isinstance(foo, str)
can be replaced with
type(foo) == type('')
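
If the same check appears in many macros, the type comparison can be wrapped in small helpers. This is a sketch with hypothetical helper names, and the code is also valid Python:

```python
def is_string(value):
    # Compare against the type of a literal instead of naming the class.
    return type(value) == type("")

def is_list(value):
    return type(value) == type([])
```

Because these compare exact types, they do not account for subclasses, which matches Skylark's lack of inheritance.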

get_base_path

The get_base_path() function is replaced with the equivalent, but more appropriately named, package_name(). Note that while in build files it's invoked as package_name(), in extension files it's invoked as native.package_name(), as are the rest of the built-in functions provided by Buck. It's fairly easy to write an alias if there is a strong desire to use the old name instead.

del

Usage of del arr[1] and del dictionary['key'] is not supported. Use arr.pop(1) and dictionary.pop('key'), respectively, instead.
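
A short illustration of the replacement, using made-up values. Unlike del, pop() also returns the removed element, which can simply be ignored:

```python
arr = ["a", "b", "c"]
dictionary = {"key": 1, "other": 2}

# Instead of: del arr[1]
removed_item = arr.pop(1)

# Instead of: del dictionary['key']
removed_value = dictionary.pop("key")
```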

class

Classes are not supported. You can replace them with structs and functions. In addition to language simplification, structs are more memory efficient. For example, a class like

class Foo:
  def __init__(self, foo, bar=None):
    ...
  def some_method(self, param):
    ...
  ...
foo = Foo('foo', bar='yo')
res = foo.some_method(some_param)
can be replaced with something like
def some_function(foo_instance, param):
  ...
foo = struct(foo='foo', bar='yo')
res = some_function(foo, some_param)
You can also track state in variables defined in the same extension file, but you cannot expose any mutators, since all variables are frozen once the extension file is evaluated. This is intentional and prevents race conditions, since build files as well as extension files must support efficient parallel evaluation. You can also use providers to create named struct factories:
def some_function(foo_instance, param):
  ...
Foo = provider()
foo = Foo(foo='foo', bar='yo')
res = some_function(foo, some_param)

import re

Regular expressions are not supported in Skylark because of their potentially unbounded runtime and resource usage, but they are often unnecessary and can be replaced with simple string manipulations. Patterns like re"//libraries/my_lib/.*" can be replaced with a call to startswith("//libraries/my_lib/"). Similarly, the endswith() method can be used to replace a pattern that starts with .*, and "some_text" in foo can replace re".*some_text.*".
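
The three replacements can be sketched as small predicate functions. The function names and the "_test" suffix are hypothetical, and the code is also valid Python:

```python
def is_in_my_lib(target):
    # Replaces the pattern "//libraries/my_lib/.*"
    return target.startswith("//libraries/my_lib/")

def is_test_target(target):
    # Replaces a pattern such as ".*_test"
    return target.endswith("_test")

def mentions_some_text(value):
    # Replaces the pattern ".*some_text.*"
    return "some_text" in value
```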

raise Exception

Raising and catching exceptions is not supported. Use the fail function instead. For example, instead of raise Exception("foo") or raise Exception("attribute_name: foo") you can use fail("foo") or fail("foo", "attribute_name") respectively to stop build/extension file evaluation and report an error. Since fail triggers a non-recoverable error and halts parsing, it cannot be used for control flow.

while loop

While loops are not supported due to unbounded runtime. Instead, use a for loop with a bounded range. Usage of while True: ... should be replaced with for _ in range(REASONABLE_LIMIT): followed by an extra check after the loop to make sure that the loop terminated before all iterations were exhausted.
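
A minimal sketch of this pattern follows. The function and the limit value are made up for illustration, and the code is written to run as plain Python. Where it returns None after exhausting the loop, a real extension file would more likely call fail(...):

```python
REASONABLE_LIMIT = 1000  # hypothetical upper bound on iterations

def count_halvings(n):
    """Count how many times n must be halved (integer division) to reach 1."""
    steps = 0
    finished = False
    for _ in range(REASONABLE_LIMIT):
        if n <= 1:
            finished = True
            break
        n = n // 2
        steps += 1
    # Extra check: did the loop stop on its own, or did it run out of iterations?
    if not finished:
        return None  # in an extension file, fail("iteration limit exceeded") instead
    return steps
```

The finished flag is what distinguishes normal termination from hitting the limit; without it, a truncated computation would silently return a partial result.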

python module

Python modules cannot be imported in Skylark. Many safe Python functions like os.path.basename or os.path.join can be replaced with paths.basename and paths.join from Skylib. In order to use it, clone it into some directory, configure it as a cell by adding

[repositories]
  bazel_skylib = path/to/skylib_checkout
to a .buckconfig file, and load the corresponding function:
load("@bazel_skylib//lib:paths.bzl", "paths")
An example from the Skylib website:
load("@bazel_skylib//:lib.bzl", "paths", "shell")

p = paths.basename("foo.bar")
s = shell.quote(p)

Skylint

Consider using the Skylint lint tool, which can catch and suggest fixes for some of the common issues. Unfortunately, since it was not designed to handle arbitrary Python files, it can crash. Some of the common reasons for it to crash are:
  • usage of nested functions. Nested functions should be moved to the top level.
  • usage of not foo in instead of foo not in. Use foo not in instead; it's recommended by flake8 anyway.
You can bisect the affected area of code by commenting out parts of the file and rerunning Skylint.

Testing your changes

Automated testing

Coming soon...

Manual testing

The easiest way to check if your changes affected build rules is by checking if target rule keys have changed. You can capture rule keys before making your change using
buck targets --show-rulekey > before
followed by the command below after applying your changes
buck targets --show-rulekey > after
Now that the before and after rule keys are captured,
diff before after
should be empty unless your changes affected the semantics of some macros or build definitions. In order to get more insight into what exactly has changed, you can use the
buck audit rules path/to/BUCK
command on individual build files to see how macros are expanded by Buck.
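
When the diff is large, it can be easier to list only the targets whose rule keys changed and then run buck audit rules on their build files. The following is a minimal sketch in plain Python, run outside Buck. It assumes each line of the captured output has the form "<target> <rulekey>", and the function names are hypothetical:

```python
def parse_rulekeys(output):
    """Map target -> rule key, assuming one '<target> <rulekey>' pair per line."""
    keys = {}
    for line in output.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or unexpected lines
            keys[parts[0]] = parts[1]
    return keys

def changed_targets(before_output, after_output):
    """Return sorted targets whose rule key was added, removed, or changed."""
    before = parse_rulekeys(before_output)
    after = parse_rulekeys(after_output)
    targets = sorted(set(before) | set(after))
    return [t for t in targets if before.get(t) != after.get(t)]
```

The two arguments would be the contents of the captured before and after outputs, read as strings.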