Guix package structure: build-system overview and build arguments

So far we've looked at how Guix packages handle source code and other inputs such as libraries. In this post we're going to focus on the build itself, which is defined through a build-system. Rather than having to write code (or a script) for each package specifying how to build the code, in Guix we specify the build-system, which standardises the process. Depending on the code we're dealing with we'll use a different build-system, and we can also make changes to how the build takes place. We'll also take a look at Guile Scheme's ability to pass code around within our package definitions. Sometimes we want that code evaluated and sometimes we don't, which means we have to learn how to quote and unquote code.

This post is part 8 in a series covering Guix Packaging - if you'd like to learn the other aspects of Guix packaging check it out!

The build process is driven by the build system, which provides standardised build tools and build phases. We can change how the build takes place by altering the arguments that are sent to the build utilities, or by changing the build phases. The build phases do things like unpack the source, run the compiler, move the built code into the Store and so forth.

We specify the build-system that's used as one of the fields in the package definition which:

"represents the build procedure of the package, as well as implicit dependencies of that build procedure"

There are build-systems for many of the major programming languages and package ecosystems - 47 of them in total! There are build-systems for languages like Python, Perl, Rust and Go; others for dealing with particular libraries like Qt, GLib/GTK and fonts; some for particular build tools like Maven, Meson and CMake; and finally some for more specialist requirements like Elm, Emacs, Rakudo and Zig. Many of the build systems have an equivalent guix import command so that a source package can easily be converted to a Guix package.

Many of the build-systems use the GNU build-system as their foundation, so we'll use this for our example. The GNU build-system is defined in guix/build-system/gnu.scm, with a supporting module in guix/build/gnu-build-system.scm. We already know that a build-system includes a standard set of build tools; for the GNU build system this fundamentally means running "./configure && make && make install".

Here's an edited version of the sdl-pango package (in gnu/packages/sdl.scm) to show it in action:

 1 (define-public sdl-pango
 2   (package
 3     (name "sdl-pango")
 4     (version "0.1.2")
 5     (source
 6      (origin
 7        (method url-fetch)
 8        (uri (string-append
 9              "mirror://sourceforge/sdlpango/SDL_Pango/" version "/"
10              "SDL_Pango-" version  ".tar.gz"))
11        (sha256
12         (base32 "197baw1dsg0p4pljs5k0fshbyki00r4l49m1drlpqw6ggawx6xbz"))
13        (patches (search-patches "sdl-pango-api_additions.patch"))))
14     (build-system gnu-build-system)
15     (arguments
16      `(#:configure-flags (list "--disable-static")
17        #:phases
18        (modify-phases %standard-phases
19          (add-after 'unpack 'autogen
20            ;; Force reconfiguration because the included libtool
21            ;; generates linking errors.
22            (lambda _ (invoke "autoreconf" "-vif"))))))
23     (native-inputs
24      (list autoconf automake libtool pkg-config))
25     (inputs
26      `(("fontconfig" ,fontconfig)
27        ("freetype" ,freetype)
28        ("glib" ,glib)
29        ("harfbuzz" ,harfbuzz)
30        ("pango" ,pango)
31        ("sdl" ,sdl)))
32     (home-page "https://sdlpango.sourceforge.net")
33     (synopsis "Pango SDL binding")
34     (description "This library is a wrapper around the Pango library.
35 It allows you to use TrueType fonts to render internationalized and
36 tagged text in SDL applications.")
37     (license license:lgpl2.1)))

Line 14: the package build-system is the gnu-build-system.

Lines 15-22: provide a set of arguments that go to the build-system - this is everything inside `( ... ). Notice that this is a list, and in this case it's a quasiquoted list. The backtick is a quasiquote and we'll learn about that later in this post.

Line 16: a keyword argument #:configure-flags with a list attached to it, containing a single string. Essentially, this ensures that the configure command runs with the string "--disable-static" as an argument.

Line 17: a build-system runs through a number of phases as it unpacks the source code, compiles it and finally installs it into the store. Providing the #:phases keyword argument specifies that one or more custom phases will be defined and run.

Lines 18-22: the %standard-phases are being altered by modify-phases to add a new phase called 'autogen after 'unpack. On line 22 a new anonymous function is created with (lambda _ ...), which accepts any arguments and ignores them; it runs a single command, autoreconf, with some arguments.
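
To make the idea of adding a phase concrete, here's a toy model in plain Guile. It assumes the phases are an association list of (name . procedure) pairs, which is how %standard-phases is structured, but the phase list, the placeholder strings and the add-phase-after helper are all hypothetical stand-ins - in a real package the work is done by modify-phases:

```scheme
;; Toy model only: a short, made-up phase list.  Real phases map names
;; to procedures; strings are used here as placeholders.
(define toy-phases
  '((unpack . "unpack-proc")
    (configure . "configure-proc")
    (build . "build-proc")
    (install . "install-proc")))

;; Hypothetical helper: insert (NEW-NAME . PROC) right after AFTER-NAME.
(define (add-phase-after phases after-name new-name proc)
  (cond ((null? phases) '())
        ((eq? (car (car phases)) after-name)
         (cons (car phases)
               (cons (cons new-name proc) (cdr phases))))
        (else
         (cons (car phases)
               (add-phase-after (cdr phases) after-name new-name proc)))))

(define new-phases
  (add-phase-after toy-phases 'unpack 'autogen "autogen-proc"))

(display (map car new-phases))
(newline)
;; prints (unpack autogen configure build install)
```

The point is simply that phases are ordered data, so "add after 'unpack" is a list operation that leaves the other phases untouched.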

With the broad brush strokes in mind, let's look at the arguments we can provide to the build process.

Arguments

During a particular build phase a build utility is run (e.g. the compiler) and we can provide arguments to it. There are 27 arguments that can be provided, but some of them are pretty esoteric for the average packaging scenario - so we're just going to focus on the following ones:

Argument               Build Phase   Notes
#:configure-flags      configure     Flags to send to the configure command
#:make-flags           build         Flags to send to GNU Make when compiling
#:parallel-build?      build         Whether the build runs in parallel
#:tests?               check         Whether tests run or not
#:test-target          check         Name of the test target GNU Make will run
#:parallel-tests?      check         Whether tests run in parallel
#:license-file-regexp  install       Regular expression to find the license

Configure-flags argument

#:configure-flags: this keyword argument affects the configure phase, where the configure command is run. Generally, configure is used to change build options, which are often covered in the project's README file. Here are some examples:

(build-system gnu-build-system)
(arguments
  '(#:configure-flags
    (list "--enable-audio=alsa,flac,jack,ao,vorbis,speex")))

This is from the timidity++ package in audio.scm. The arguments to the build are provided as a quoted list '( ... ). Each option is a keyword argument #:<whatever> with some value. In this case we have #:configure-flags, which is provided as a list of strings (list "--enable-audio=alsa ..."). The result is that when the configure command runs it will do so like this: ./configure --enable-audio=alsa,flac,jack,ao,vorbis,speex.

The second example is taken from the mcpp package in cpp.scm which uses a Gexp:

(build-system gnu-build-system)
(arguments
  (list #:configure-flags #~(list "--enable-mcpplib")))

It's exactly the same process, just a slightly different way of creating the list that's fed to the configure command. In this case we have a plain arguments field (arguments ...), which contains an expression to build a list (list ... ). That list holds keyword arguments, here just #:configure-flags, each followed by its value. To avoid needing to quote the list - either at this level or earlier - it uses a Gexp #~ which delays evaluation until the build code is running. This all leads to the final configure command being: ./configure --enable-mcpplib.

As we said earlier, many of the other build systems use the same build phases: often they inherit them from the GNU build-system and then customise them. Consequently, we can often use the same arguments when using different build systems. The wildmidi package uses the cmake-build-system but it still has a configure-flags argument:

(build-system cmake-build-system)
(arguments
  '(#:tests? #f
    #:configure-flags (list "-DWANT_ALSA=ON")))

It uses a quoted list for its arguments '( ... ), then each of the options is a keyword argument (#:<whatever>). In this case we have #:tests? #f to indicate that no tests are run, and #:configure-flags, which is provided as a list of strings (list "-DWANT_ALSA=ON"). With the cmake-build-system the configure phase runs cmake rather than a configure script, so the result is that cmake is invoked with -DWANT_ALSA=ON among its other arguments.

Make-flags argument

#:make-flags: changes how the build phase runs, where GNU Make's make command is executed. The default is to run plain make and we can provide additional options:

(build-system gnu-build-system)
(arguments
  (list #:make-flags #~'("GNUTERM=dumb")))

This is a shortened version of the gnuplot package in maths.scm. We have the arguments field, and a list is created for the keyword arguments. The Gexp is used with a quoted list to set the GNUTERM argument. In the end the make command will be: make GNUTERM=dumb.

The package hdf5 also in maths.scm shows the quasiquoted list style argument:

(build-system gnu-build-system)
(arguments
  `(#:configure-flags '("--enable-cxx")
    #:make-flags (list "CFLAGS=-fPIC"
                       "CXXFLAGS=-fPIC")))

I've slightly shortened it for clarity, but as we can see there's a quasiquoted arguments list, arguments `( ... ). This has two keyword arguments which both have lists associated with them. The #:configure-flags uses a quoted list to provide the string, whereas #:make-flags constructs a list using the list function. The final make command that's run is: make CFLAGS=-fPIC CXXFLAGS=-fPIC.

Probably the most common use for setting make-flags is that the source code is expecting to find "cc" rather than using gcc: to get around this we can send some options to the make command. Here's an example from the pounce package in messaging.scm:

(build-system gnu-build-system)
(arguments
  `(#:make-flags
    (list
      (string-append "CC=" ,(cc-for-target))
      (string-append "PREFIX=" %output))))

We have a quasiquoted list for the arguments, and then a list is constructed for #:make-flags. For the first one the Guile function string-append is called - as you might expect it joins two strings together. The ,(cc-for-target) part is an unquote - it calls the cc-for-target function (see guix/utils.scm), which returns a string naming the C compiler to use, and inserts that string in place of the function call. For a native build that's plain "gcc"; when cross-compiling it's prefixed with the target triplet. The result is a flag such as "CC=gcc". The second one does the same thing, joining two strings together, but it uses the special variable %output which is a string (see guix/gexp.scm). This gives us a final command that is something like: make CC=gcc PREFIX=/gnu/store/...
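
Here's a small sketch in plain Guile of when each part gets evaluated. The fake-cc-for-target function and the /tmp path are hypothetical stand-ins (the real cc-for-target and %output come from Guix), so treat this as an illustration of the quoting rather than of how Guix runs a build:

```scheme
;; Hypothetical stand-in for Guix's cc-for-target (native build case).
(define (fake-cc-for-target) "gcc")

;; The unquoted call runs now; everything else stays as unevaluated code.
(define flags-expr
  `(list (string-append "CC=" ,(fake-cc-for-target))
         (string-append "PREFIX=" %output)))

(write flags-expr)
(newline)
;; prints (list (string-append "CC=" "gcc") (string-append "PREFIX=" %output))

;; Later, "at build time", %output exists and the code can be evaluated.
;; The path below is a made-up placeholder, not a real store path.
(define %output "/tmp/example-output")
(define flags (eval flags-expr (interaction-environment)))

(write flags)
(newline)
;; prints ("CC=gcc" "PREFIX=/tmp/example-output")
```

Notice that the compiler name is already baked into the expression before %output even exists - that's the difference between the unquoted part and the rest of the list.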

Parallel-build? argument

#:parallel-build? is a simple boolean flag that controls whether Make runs with parallel jobs or not. When it's true, make is run with the -j option and a job count, for example make -j 2. We only really need to change this if a build is failing - for example the espeak-ng package in speech.scm:

(build-system gnu-build-system)
(arguments
  `(#:configure-flags '("--disable-static")
    ;; Building in parallel triggers a race condition in 1.49.2.
    #:parallel-build? #f))
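
As a rough sketch of how these two arguments shape the make command line, here's a plain Guile function. The make-command-line name and the jobs parameter are hypothetical - this is loosely modelled on what the build phase does, not copied from Guix, which works out the job count itself:

```scheme
;; Hypothetical helper, loosely modelled on what the build phase does.
;; JOBS stands in for the job count Guix would work out itself.
(define* (make-command-line #:key (make-flags '())
                            (parallel-build? #t)
                            (jobs 2))
  (append (list "make")
          (if parallel-build?
              (list "-j" (number->string jobs))
              '())
          make-flags))

(write (make-command-line #:make-flags '("CFLAGS=-fPIC")))
(newline)
;; prints ("make" "-j" "2" "CFLAGS=-fPIC")

(write (make-command-line #:parallel-build? #f))
(newline)
;; prints ("make")
```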

Tests? argument

#:tests? is another boolean keyword argument that changes whether the check build phase runs. Practically, this is altering whether make check is run or not:

(build-system gnu-build-system)
(arguments
  `(#:tests? #f  ;no check target
    #:make-flags
    (list
      (string-append "CC=" ,(cc-for-target))
      (string-append "PREFIX=" %output))))

This is a shortened example from the slscroll package in suckless.scm. The arguments are a quasiquoted list arguments `( ... ), containing two keyword arguments. Notice that #:make-flags is a list, whereas #:tests? is a boolean - it can be #f or #false.

The convention is to put #:tests? as the first argument, and to always provide a comment if they are switched off explaining why. As Guix only uses the tests provided by the upstream source it's strongly discouraged to turn off tests. If there are tests that don't run correctly it's better to alter the build phases to switch off those that are incorrect, rather than switching all tests off.

Test-target argument

The #:test-target argument alters the check phase by providing a target name to the make command. Rather than running make check it will run make <whatever> - this is used in situations where the test target in the Makefile is called something different.

(build-system gnu-build-system)
(arguments
  (list
    #:tests? #f  ;xmllint attempts to download DTD
    #:test-target "test"))

The solfege package in music.scm uses this flag to set the test target to test, so when the check build phase runs the command will be: make test.

The #:tests? and #:test-target arguments are used in lots of the other build-systems. This example is from the python-textdistance package in python-xyz.scm which uses the Python build-system:

(build-system python-build-system)
(arguments
  `(#:test-target "pytest"))

Parallel-tests? argument

#:parallel-tests? is the last of the test focused arguments. The same as for #:parallel-build? it defaults to true, and sets GNU Make's -j flag when the check build phase is running. Switch this off if there are tests that depend on previous tests.

(build-system gnu-build-system)
(arguments '(#:parallel-tests? #f))

This is the gperf package from gperf.scm. The arguments field is provided with a quoted list, with just the one keyword argument that's set to false.
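
To tie the three test arguments together, here's a plain Guile sketch of how they might combine into a command. The check-command-line function and its jobs parameter are hypothetical, and returning #f to mean "nothing is run" is just a convention for this example, not Guix's behaviour:

```scheme
;; Hypothetical helper: returns the command as a list of strings,
;; or #f when the tests are switched off and nothing would run.
(define* (check-command-line #:key (tests? #t)
                             (test-target "check")
                             (parallel-tests? #t)
                             (jobs 2))
  (if tests?
      (append (list "make" test-target)
              (if parallel-tests?
                  (list "-j" (number->string jobs))
                  '()))
      #f))

(write (check-command-line))
(newline)
;; prints ("make" "check" "-j" "2")

(write (check-command-line #:test-target "test" #:parallel-tests? #f))
(newline)
;; prints ("make" "test")

(write (check-command-line #:tests? #f))
(newline)
;; prints #f
```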

License-file-regexp argument

#:license-file-regexp determines what regular expression will be used to find a license file which is copied into the outputs. The default regexp can be found in the %license-file-regexp variable from guix/build/gnu-build-system.scm.

This is very rarely used; one example is the pngsuite package in image.scm:

(build-system gnu-build-system)
(arguments
  '(#:tests? #f  ; there is no test target
    #:license-file-regexp "PngSuite.LICENSE"))
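
To get a feel for what the regular expression does, here's a sketch using Guile's own regex support. The file names and the license-files helper are made up for illustration, and exactly how Guix applies the regexp to the source tree is not shown here:

```scheme
(use-modules (ice-9 regex))   ; provides string-match

;; Made-up file names standing in for the top of a source tree.
(define files '("README" "Makefile" "PngSuite.LICENSE" "image.png"))

;; Hypothetical helper: keep only the names the regexp matches.
(define (license-files regexp names)
  (filter (lambda (name) (string-match regexp name)) names))

(write (license-files "PngSuite.LICENSE" files))
(newline)
;; prints ("PngSuite.LICENSE")
```

Bear in mind that the dot in "PngSuite.LICENSE" is a regexp wildcard, so it matches any single character and not just a literal dot.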

Scheme's Quote and QuasiQuote

Having dealt with the useful arguments that we can provide to the build-system, let's take a moment to look at quotes and quasiquotes in Guile Scheme.

To experiment with this the easiest way is to start a Guix REPL. This loads up Guile Scheme into a REPL and also loads the Guix modules. From the command line do:

$ guix repl

This starts the REPL and gives us a prompt that looks like this:

scheme@(guix-user)>

We can enter any Guile code we want and when we hit return it's evaluated and the result returned - the famous Read-Eval-Print Loop. For example, we can do simple addition like this:

scheme@(guix-user)> (+ 1 1)
$1 = 2

To quit we do ,quit - putting a comma in front tells the REPL that it's a command for itself:

scheme@(guix-user)> ,quit

Alright, let's explore quote. As we know the list is the foundational building block that Scheme uses.

As we saw, Guile evaluates a list immediately, as it did above when we added the two numbers together. But what if we want to provide a list as an argument to a function? We don't want Guile to evaluate the list, we want it to pass it into the function. In this example, we have the display function which prints something - for example to print a string we do:

scheme@(guix-user)> (display "Hello World!\n")
Hello World!

When we try it using a list:

scheme@(guix-user)> (display ("Hello" "World!"))
Wrong type to apply: "Hello"

The problem is that Guile thinks we're providing a function called "Hello" which it's supposed to do something with - rather than a list that it should display. That's because a list can either be an expression or a data store:

;; list as an expression - function in first position and then some data to apply the function to
scheme@(guix-user)> (+ 1 1)

;; list as a data store
scheme@(guix-user)> (1 3 4)

Most of the time we want Guile to evaluate the list, and whether it's an expression or a data store doesn't matter. But when we're passing around a list as a data store we want to tell Guile not to evaluate the list. We can specifically tell it we want to provide a list to the display function by quoting, which is a single ' - a quote mark - placed in front of the list:

scheme@(guix-user)> (display '("Hello" "World!"))
(Hello World!)

It's telling Guile to treat it as a literal list and pass it to the function display as an argument. The alternative syntax for telling Guile that this is a literal list is the word quote:

scheme@(guix-user)> (display (quote ("Hello" "World!")))
(Hello World!)

There are lots of situations when defining a Guix package where we want to provide a literal list (not a function) in the package definition so we use a quote:

;; a quoted list
(arguments '(#:tests? #f))

If we didn't quote the list, then Guile would think #:tests? was a function and try to execute it.
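
We can poke at a quoted arguments list in plain Guile to see that it really is just data. This is only a sketch - the args variable is a made-up name, and the list is modelled on the examples in this post rather than taken from a particular package:

```scheme
;; A quoted arguments list, like the ones we give to the arguments field.
(define args '(#:tests? #f #:configure-flags (list "--disable-static")))

(write (length args))
(newline)
;; prints 4

(write (keyword? (car args)))
(newline)
;; prints #t

;; Nothing inside was evaluated: the flags are still the code
;; (list "--disable-static"), whose first element is the symbol list.
(write (list-ref args 3))
(newline)
;; prints (list "--disable-static")

(write (symbol? (car (list-ref args 3))))
(newline)
;; prints #t
```

So the quote holds back everything inside the list, including the inner (list ...) expression, until something later decides to evaluate it.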

The next question is: what if we want a list where some parts are literal and some parts are the results of being evaluated? For this we need a slightly different quote which is called a quasiquote. The quasiquote is the backtick, or the word quasiquote:

scheme@(guix-user)> `(1 2 3)
(1 2 3)

scheme@(guix-user)> (quasiquote (1 2 3))
(1 2 3)

The difference with a quasiquoted list is that we can unquote part of it, where normal Guile code will then run. For example, what if we wanted to provide the same list, but the last number has to be calculated:

scheme@(guix-user)> `(1 2 3 ,(+ 2 2))
(1 2 3 4)

The unquote is indicated with a comma or the word unquote, the expression is evaluated and the result inserted into the list. In this case, Guile sees the comma before the expression (+ 2 2) so evaluates it and then puts the result into the quoted list. Here's another example, using strings:

=> (define name "Bob")
=> `("John" "Jane" ,name)
("John" "Jane" "Bob")

As we can see the same effect takes place where the unquote temporarily switches Guile back into 'evaluate' mode so it evaluates the variable name and puts it into the list.
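
It's worth seeing what happens if we use a plain quote by mistake. In this sketch (the quasi and plain variable names are just for illustration) the unquote is only honoured inside the quasiquoted list:

```scheme
(define name "Bob")

(define quasi `("John" "Jane" ,name))   ; quasiquote: ,name is evaluated
(define plain '("John" "Jane" ,name))   ; plain quote: ,name stays as data

(write quasi)
(newline)
;; prints ("John" "Jane" "Bob")

;; With a plain quote the last element is the list (unquote name),
;; not the string "Bob".
(write (equal? plain (list "John" "Jane" (list 'unquote 'name))))
(newline)
;; prints #t
```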

We see unquote used in Guix where we want to insert some value from a function or variable that Guix has internally. Here's an example we looked at earlier:

;; insert the value (a string) by calling the cc-for-target function
;; add the two strings together
(string-append "CC=" ,(cc-for-target))

The last syntax to know is unquote splicing which inserts the elements of a list. The syntax is ,@ or unquote-splicing. Here's a simple example:

=> (define ages (list 34 89 16))

=> `(45 ,@ages 12)
(45 34 89 16 12)

As we can see half way through the quasiquoted list we have an unquote-splice which inserts the results of the ages list. The difference from a plain unquote is that an unquote splice only works with a list, and it inserts the elements of the list (not the list itself).

An example of using unquote-splicing from the julia package in julia.scm - I've shortened it up to make it easier to understand:

(arguments
  `(#:make-flags
    (list
      "VERBOSE=1"
      ;; splice in the extra flags on x86-64, or nothing otherwise
      ,@(if (target-x86-64?)
            `("USE_BLAS64=1"
              "LIBBLAS=-lopenblas64_")
            '()))))

We have the normal arguments list and it's a quasiquoted list using the backtick. The #:make-flags keyword argument is associated with a list, where the first item is a string ("VERBOSE=1"). For the next items in the list there's an unquote-splice - it checks if the current target is x86-64 and if it is it inserts the strings ("USE_BLAS64=1" and so on) as items. The final result is a list of make-flags strings.
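
The same pattern can be tried in plain Guile. In this sketch the make-flags-for function and its x86-64? parameter are hypothetical stand-ins for asking Guix about the target; note the empty list in the else branch, which gives the splice something to insert when the condition is false:

```scheme
;; Hypothetical stand-in: X86-64? replaces the call to target-x86-64?.
(define (make-flags-for x86-64?)
  `("VERBOSE=1"
    ,@(if x86-64?
          '("USE_BLAS64=1" "LIBBLAS=-lopenblas64_")
          '())))

(write (make-flags-for #t))
(newline)
;; prints ("VERBOSE=1" "USE_BLAS64=1" "LIBBLAS=-lopenblas64_")

(write (make-flags-for #f))
(newline)
;; prints ("VERBOSE=1")
```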

Remember that unquoting and unquote-splicing only work in quasiquoted lists. The alternative method in Guix is to use a Gexp, which we'll cover another time.

Build system summary

We've made solid headway understanding the basics of Guix's build-system concept, and the various arguments that can be provided to change how the build executes. As we move into more complicated situations we need to understand how to pass code around, and knowing how to quote and unquote code is key. Personally I find some of the complex chains around unquote-splicing into a quasiquoted list quite difficult to parse - I definitely have to use pen and paper!

Next time we'll look at build-phases which is the other way we can impact how a build-system runs a build - that's also going to necessitate learning about creating our own functions in Guile.


Posted in Tech Wednesday 24 April 2024
Tagged with tech ubuntu guix