A gccrs workflow
I recently had a discussion with Guillaume Gomez regarding contributing to GCC, and the process of submitting patches to various mailing lists.
As someone extremely involved in the development of gccrs
, and also extremely unfamiliar
with patch-based workflows up until a few months ago, Guillaume
pointed out that it might be interesting for me to lay out some tips and tricks for working
on GCC. This led to the heap of characters you’re about to read.
Building GCC
Building GCC is quite an involved process, with a lot of gotchas and documentation-reading required. I strongly recommend reading through the GCC wiki’s section on building before attempting this.
One of the most important note is the one about not building GCC directly within the root of the project. This will cause you pain. I’ll put together what my usual workflow is for building a clean gccrs
from a fresh git clone
, and we can have a more detailed look at some of the steps.
I have annotated the most important lines with a number, to which I’ll refer in the next sections.
$ pwd
/home/arthur/Git/gccrs
$ mkdir build
$ cd build
$ ../configure # ... # 1
$ cd ..
$ make -C build -j14 # 2
Configuring
My entire ./configure
invocation looks like this:
../configure CC='ccache clang' CXX='ccache clang++' CXXFLAGS='-O0 -g -fno-pie' CFLAGS='-O0 -g -fno-pie' LDFLAGS='-no-pie -fuse-ld=mold' --enable-multilib --enable-languages=rust --disable-bootstrap
It is the product of many, many hours building and re-building gccrs
, in the hopes that my distribution’s security measures will allow me to compile it and in the search of the shortest possible incremental build times.
As it stands, I was able to achieve the following timings on my machine (a 2022 Lenovo Laptop with an AMD Ryzen 7 PRO 5850U and 32GB of RAM). I have included the timing for compiling gccrs
without changing the compiler or linker for reference, as this is what I was using when getting started on the project.
I have timed an incremental build by simply running touch
on gcc/rust/checks/errors/rust-unsafe-checker.cc
or on gcc/rust/ast/rust-ast.h
, a very often included header.
Clean build | Incremental (.cc touch) | Incremental (.h touch) | |
---|---|---|---|
basic ./configure line | 7m00s | 13s | 74s |
cool ./configure line | 5m30s | 2s | 3s |
The ./configure
invocation can be broken down into a few parts.
CC='ccache clang' CXX='ccache clang++'
In order to achieve fast iteration on the project and short build times, I use the clang
compiler rather than gcc
to build the project. clang
is much, much faster than gcc
,
and I don’t have any particular opinion on C++ compilers, so it does just fine. As it stands,
clang
also has a lot more warnings enabled by default - so while our gcc
-based build
of gccrs
is warning free, the clang
-based one isn’t (or maybe it is, if you’re from the
future).
- The various
-fno-pie
and-no-pie
flags
These flags are simply to allow me to build the project on ArchLinux. There is also an
--enable-static
option to ./configure
, which I believe is supposed to achieve the same
outcome, but I could not get it to work. I haven’t looked into it more, and if it ain’t broke
don’t fix it. But this could probably be cleaned up and improved.
LDFLAGS='-no-pie -fuse-ld=mold'
You can see once again a flag related to Position Independant Executables (pie
), for the same reason stated above. The more interesting bit is the use of mold
to link GCC. I won’t benchmark the various linkers I have tried, but it does make a nice noticeable difference, as well as adds color to linker errors :D
--enable-languages=rust
and--enable-multilib
This flag only enables the compilation of the Rust frontend compiler (gccrs
and rust1
/crab1
) as well as the default C frontend. This way, you won’t end up compiling all default frontends for GCC, which include the C++ and Fortran compilers. Enabling multilib is useful for running tests in 32-bit mode.
--disable-bootstrap
For developing GCC, you do not need to make a full bootstrap build of the compiler each time. However, it is enabled by default, as most people who build GCC are simply interested in using the compiler, not modifying it. Make sure to disable bootstrapping unless you want to check you did not introduce any new unaccepted warnings or error. But for most intents in purposes, you do not need to do bootstrap builds to work on GCC.
However, you need to build the compiler often! To help check I did not break anything during various stages of my contributions, I will often keep multiple build directories with different ./configure
lines. My current list is as follows:
build
with the above-mentioned./configure
line
This is my most-used build directory. I will run make -C build -j14
every chance I get.
Having such a fast build time is very useful for checking other contributors’ pull-requests as
well, or making sure I don’t give someone a bogus C++ indication. As I mentioned above, this
compiles gccrs
using clang
, which notices a lot of warnings we haven’t fixed yet, so I’ll
usually ignore them. While this sounds extremely bad, we do have a CI workflow for checking new
warnings, so it’s almost impossible to merge new code in the project which would introduce warnings.
build-gcc
, which builds usingccache gcc
instead ofccache clang
Here, nothing changes apart from the compilers (gcc
instead of clang
). I still use ccache
and mold
to speed things up as much as possible.
../configure CC='ccache gcc' CXX='ccache g++' CXXFLAGS='-O0 -g -fno-pie' CFLAGS='-O0 -g -fno-pie' LDFLAGS='-no-pie -fuse-ld=mold' --enable-multilib --enable-languages=rust --disable-bootstrap
Building with gcc
from time to time allows me to make sure I haven’t committed too many crimes, and is the prefered building method for most of our contributors.
build-bootstrap
, which performs a full bootstrap build of the compiler
I will disable ccache
and mold
for bootstrapping the compiler as it is going to be a lengthy process anyway. For bootstrap builds, our favorite Frenchman Marc Poulhiès recommends against doing incremental builds, as it has caused him issues in the past. So make sure to remove the build directory before compilation.
../configure --enable-languages=rust
Using ccache
and clang
does not help a lot with bootstrapping, as we will be building g++
from scratch and using it to compile gccrs
. This is a very, very long process, and I don’t run bootstrap builds often, as they take around one hour to complete on my machine.
Compiling the project
Finally, compiling! It’s easier for me to compile gccrs
from the root of the project,
as I’ll often do multiple commands at once and am not very good at noticing when I change
directories. Since my machine has 16 threads, I use 14 of them when compiling anything, which
leaves me enough for other programs running on my computer without running out of cores or memory.
I often see first-time contributors running make
in single-threaded mode (a default make
invocation with no argument). Since GCC contains a lot of files, using multiple cores helps
a lot. However, using -j
without any number of jobs specified will allow make
to create as
many jobs as it wants. And since GCC contains a lot of files, this will cause your computer to suffer.
Editor support
Since GCC is hard, and programming overall is hard, and C++ in particular is impossible
to writehard, I find it extremely helpful to rely on IDE-like features when working on
gccrs
. I use helix as my text editor, which ships with support for the Language Server Protocol.
However, since this is C++ and we can’t have nice things, you first need to create a compilation database to allow your linter/analyzer/lsp-thingamajig to do its thing (in my case, clangd
).
To do this, you can usually choose between compiledb
and bear
.
I typically use bear
for no particular reason. bear
works by wrapping your
make
invocation and performing dark-magic to extract the various flags and options you give your compiler.
However, because it’s written in C++ and we can’t have nice things, you will have
to run bear
using only a single job in your make invocation. No worries, you only have to this once. Whenever I need to do this (very rarely, probably only when I get a new computer or accidentally deletes the compilation database), I usually start the command and go to lunch.
My bear command usually looks like this:
$ bear -- make -C build -j1
This will use the same options as specified in your ./configure
line and will allow you to get diagnostics and warnings according to your chosen compiler. But because this is C++ and you can’t have nice things, the LSP will still not work with our parser implementation. And to be fair I don’t really blame it since it’s 17 000 lines of templated C++ code. But still!
Git workflow
I used to have neovim
as my editor, which allowed me to use the fugitive plugin to do git operations directly within my editor. However, since helix
does not yet have support for plugins, I have switched to using git
directly on the command-line.
One very annoyingimportant GCC step is writing Changelogs for your commits. They look like this and have a very specific format you need to respect:
commit 7394a6893dd9fc2bc34822e002b53eb200ff51d5
Author: Arthur Cohen <arthur.cohen@embecosm.com>
Date: Wed Mar 1 11:03:24 2023 +0100
hir: Refactor ASTLoweringStmt to source file.
gcc/rust/ChangeLog:
* Make-lang.in: Add rust-ast-lower-stmt.o
* hir/rust-ast-lower-stmt.h: Move definitions to...
* hir/rust-ast-lower-stmt.cc: ...here.
Because they are difficult to write, there are multiple scripts to help GCC hackers with them. The most basic one (but not the easiest one to use) is contrib/mklog.py
. It takes the contents of a patch
as input and will spew out the appropriate Changelog skeleton in your terminal.
gcc/rust/ChangeLog:
* Make-lang.in:
* hir/rust-ast-lower-stmt.h:
* hir/rust-ast-lower-stmt.cc:
You can then copy this skeleton and fill it out when commiting your changes. Aaaaaaand because it’s 2023 and no one can agree on whether one should use spaces or tabs, you’re getting a CI failure on our “check-changelogs” step.
Thankfully there is an easier way to generate Changelog skeletons directly when making your commits. Enter contrib/gcc-git-customization.sh
. This nifty little script will help you setup some git
config variables such as your name, as well as setup branch names for specific GCC workflows, but most importantly for us it will add the “gcc-commit-mklog” git
subcommand.
So you’ll now be able to replace git commit
with git gcc-commit-mklog
and have Changelog skeletons directly embedded in your commits! Since this is just a wrapper around git commit
, you can use all of the other commit options such as --amend
, -s
, -m
… Often, I’ll create commits without a Changelog by accident and run git gcc-commit-mklog --amend
to fix them.
In case you’re still unsure about whether or not your Changelogs are correct (as I often am), you can use contrib/gcc-changelog/git_check_commit.py
. This script takes as input a commit or rev-list of multiple commits and will check each message’s format. I’ll often run the script before pushing my changes, as even with git gcc-commit-mklog
you might create lines that are too long or forget some files that were changed.
I usually run one of the following two commands
$ contrib/gcc-changelog/git_check_commit.py $(git log -1 --format=%h)
$ # or
$ contrib/gcc-changelog/git_check_commit.py upstream/master..HEAD
However, if you are contributing to gccrs
, we have GitHub actions in place which will check the format of your commits. So you do not need to worry about pushing commits with invalid Changelog entries! We also want to ensure that each commit builds and passes the testsuite, and are working towards adding more CI for this. In the meantime, if your PR contains multiple commits, or if you’re unsure about the state of each one since you’ve spent a lot of time splitting commits and rebasing them and squashing and fixup’ing and… well you can use the following command, which will build each commit and run our testsuite.
$ git rebase master -x 'make -C build -j4' \
-x 'make -C build check-rust' \
-x 'if grep \'unexpected\' build/gcc/testsuite/rust/rust.sum; then false; else true; fi'
Since the various check-*
recipes will not return a non-zero exit-code on unexpected failures, you need to add the extra grep
at the end. I think this can be simplified into ! grep '\unexpected\' build/gcc/testsuite/rust/rust.sum
but I wouldn’t trust anything I say regarding shell scripting.
There are probably tips that I missed, so feel free to get in touch to let me know about all of the good tricks :)
If you want even more compilation tips, or explanation about various options, you should have a look at gccrs
’ README, which contains a lot of extra info :) If you have any questions, you can also come chat with us on our Zulip!
type GitHub = "/CohenArthur";
type Twitter = "/CohenArthurDev";
type Mastodon = HachydermIO["@cohenarthur"];