The historic OpenSSL coding style was very unusual and idiosyncratic. Not only that, but it was inconsistently applied and was not formally defined anywhere. This made reading the code and maintaining it more difficult than it needed to be. As part of our roadmap document we decided to change that. Recently we published for the first time our new Coding Style. This is very much inspired by the Linux Kernel style and we used their Coding Style document as a starting point for developing ours.
Having developed our Coding Style, the next question was, how are we going to implement it? After looking around at different options we chose the GNU
indent tool for performing the task. For those interested, the specific indent options being used are:
-bap -bbo -br -brs -c33 -cd33 -ce -ci4 -cli0 -cp33 -d0 -di1 -hnl -i4 -il1 -ip0 -l78 -lp -nbad -nbc -ncdb -ncs -nfc1 -nfca -npcs -nprs -npsl -nsc -ppi1 -saf -sai -saw -sob -ss -ts0 -fc1 -fca -cdb -sc
However, it was a little more complicated than we first hoped. There were a number of problems that needed to be overcome.
indent needs to know about all types that are used within the code. Without this you get lines like this:
int BIO_read(BIO * b, void *out, int outl)
int BIO_read(BIO *b, void *out, int outl)
Note the extra space between the ‘*’ and the ‘b’. indent seems to treat the ‘*’ as a multiplication operator without knowledge of the types. Fortunately
indent has a mechanism for supplying type information - all the openssl types are listed in the
indent.pro file in the
util directory. This isn’t a complete solution however because
indent has problems with certain types that OpenSSL uses, e.g.
STACK_OF(X509). In order to workaround problems like this we have introduced a script,
openssl-format-source which is in the
util directory. This performs some up front manipulation of source files to get them into a form that
indent can handle; runs
indent; and then backs out the changes it initially made. For example it will convert all instances of
indent can deal with), and then converts them back afterwards. The script is invoked as follows:
util/openssl-format-source -v -c .
The above command assumes you are in the top level directory, and will apply the formatting rules to all source files. The ‘-v’ argument provides verbose output, and ‘-c’ applies comment formatting rules.
The next problem was that we wanted
indent to reformat our comments. However, there are some comments in the code which have some internal formatting which we would not want to lose. For example take this comment from “bio.h”:
/* Buffers are setup like this: * * <---------------------- size -----------------------> * +---------------------------------------------------+ * | consumed | remaining | free space | * +---------------------------------------------------+ * <-- off --><------- len -------> */
indent on the above, with our selected options, this is what you get:
/* * Buffers are setup like this: <---------------------- size * -----------------------> * +---------------------------------------------------+ | consumed | * remaining | free space | * +---------------------------------------------------+ <-- off * --><------- len -------> */
Not quite what we had in mind! The solution that we have applied is to add a ‘-’ symbol at the beginning of the comment, as shown below:
/*- * Buffers are setup like this: * * <---------------------- size -----------------------> * +---------------------------------------------------+ * | consumed | remaining | free space | * +---------------------------------------------------+ * <-- off --><------- len -------> */
indent not to reformat the comment, and it is left untouched in the final source. That solution in itself causes other problems though:
- All comments throughout the whole code have to be manually inspected to check whether we should preserve formatting or not
- When I said above that the ‘-’ symbol leaves a comment untouched - I mean completely untouched, so if the indenting of the surrounding source has changed then the comment will remain at its original indentation
The solution to the second problem above was to go through the entire codebase after the automated formatting had been run, and manually reindent the comments.
It seems in general
indent is not very good at handling comments. In particular comments that are embedded in a statement/expression like this:
point_add(nq, nq, nq, nq, nq, nq, 1 /* mixed */, tmp, tmp, tmp);
After reformatting this line looked like:
point_add(nq, nq, nq, nq, nq, nq, 1 /* mixed */ , tmp, tmp, tmp);
Similar problems can occur with comments on the right hand side of statements. The solution to most of these problems was, on the whole, to move the comment to somewhere else so that
indent didn’t get confused.
Sometimes rules are made to be broken. In a small number of instances adhering to the strict letter of the style doesn’t make sense. Take
crypto/aes/aes_core.c as an example. This has lots of lines that look similar to:
t0 = Te0[(s0 >> 24) ] ^ Te1[(s1 >> 16) & 0xff] ^ Te2[(s2 >> 8) & 0xff] ^ Te3[(s3 ) & 0xff] ^ rk;
Applying the strict formatting rules, this turns into:
t0 = Te0[(s0 >> 24)] ^ Te1[(s1 >> 16) & 0xff] ^ Te2[(s2 >> 8) & 0xff] ^ Te3[(s3) & 0xff] ^ rk;
Whilst strictly correct, this is a lot less readable. In the case of this file, and a small number of others, the answer was to manually apply the formatting rules, and to skip it completely from auto-formatting. The
openssl-format-source script has a list of files to skip and so does not apply the formatting rules to them.
The formatting work has been applied to all of our active branches, i.e. in git:
This has been done to ease mainteance work, and enable cherry-picking of changes between the different branches. In order to minimise the impact for everyone we have tried to keep all of the formatting commits together so that they can be easily identified. For the branches that were released at the time the work was being performed (i.e. versions 1.0.1, 1.0.0 and 0.9.8) we have timed the work so that the commits appeared immediately after a release. Version 1.0.2 was not released at the time (although it has now been). There was a period of about a week between the last releases of 1.0.1, 1.0.0 and 0.9.8 and the formatting commits being applied. During that time the public repository was frozen and no other non-formatting related commits were applied.
In order to help people identify the commits related to reformatting a number of tags have been applied to the repository.
On “master”, the last commit before the change freeze was this one:
This has been tagged as:
The last commit before the openssl-format-source script was run (but after manual prep work had been completed) was:
This has been tagged as:
The last commit after the automated script was run was:
This has been tagged as:
The last commit at the end of the change freeze was this one:
This has been tagged as:
It should be noted that on master only there were a couple of commits that went in some time in advance of the change freeze related to comment changes. This was done before we had decided on the strategy that we finally went with. The commits in question are:
These two commits will therefore fall outside the two tags mentioned above for the master branch only.
For the other branches tags have been similarly created for immediately before the reformatting work, and for the last commit at the end of the reformatting work. These tags are as follows:
There are a couple of “gotchas” about these tags to be aware of:
Firstly I stated above that the reformat work was the first thing done after the last releases for 1.0.1, 1.0.0 and 0.9.8 (i.e. releases 1.0.1l, 1.0.0q and 0.9.8ze). Technically speaking for 1.0.1 the 1.0.1l release is tagged at this point:
commit b83ceba7d51e846cf24433aa3c417bfd62b3ffa5 Author: Matt Caswell <firstname.lastname@example.org> AuthorDate: Thu Jan 15 14:45:15 2015 +0000 Prepare for 1.0.1l release Reviewed-by: Stephen Henson <email@example.com>
However, the release process uses an automated script which generated one additional commit after the above:
commit 3a9a0321638ae13957b66baae6d4955597fc128d Author: Matt Caswell <firstname.lastname@example.org> AuthorDate: Thu Jan 15 14:49:54 2015 +0000 Prepare for 1.0.1m-dev Reviewed-by: Stephen Henson <email@example.com>
So that means OpenSSL_1_0_1-pre-reformat is not the same tag as OpenSSL_1_0_1l…it’s the commit after it, i.e. there is one commit after the release but before the reformat. The situation is similar for 1.0.0 and 0.9.8.
Another gotcha is that immediately after the change freeze was lifted “business as usual” began and normal commits started to be applied. Fairly early on it was noticed that, due to some reformatting errors, we were failing to build across all branches on windows. Therefore some additional commits were applied across all branches. These commits are:
The tags I have identified above mark the end of the change freeze and therefore do not include the above commits (which occurred later). On the 1.0.1, 1.0.2 and master branches there are some intervening “business as usual” non reformat related commits between the above and the
It is not unreasonable to expect that there may be further fixes we need to apply, due to reformatting errors that we have not yet encountered.
If you maintain your own patches for OpenSSL releases then you may want to consider reformatting those patches. In the following instructions I will outline how this can be done. It’s a bit fiddly unfortunately, and gets more fiddly if you have a large number of commits to reformat.
I assume a few things. If any are false then you will have to adjust the instructions accordingly:
- You are using a branch in git to maintain your patches, and that you want to rebase your branch commits against the latest state of
- These instructions are for reformatting one commit only. If you have multiple commits then you will have to repeat for each one.
- If your patches are against a released version (as opposed to
master) then you will have to adjust accordingly.
- You are on a Linux/Unix system
Assuming you are on the branch with your commits that need reformatting, then first of all you need to rebase against the master-pre-auto-reformat tag:
git fetch git rebase master-pre-auto-reformat
Fix any conflicts that may occur in the normal way. Get a list of all the commit hashes that need to be reformatted:
git log --format=oneline master-pre-auto-reformat..HEAD
Create some temporary branches to work on:
git checkout master-post-auto-reformat git checkout -b tmp-dest git checkout master-pre-auto-reformat git checkout -b tmp-src
Now you are on tmp-src you need to cherry-pick the first commit onto it (replace
firsthash with the hash value of the first commit to be reformatted):
git cherry-pick firsthash
You will want to preserve the author and commit message details, so store them away:
git log -1 --format="%an <%ae>" >/tmp/tmp.author git log -1 --format="%B" >/tmp/tmp.commitmsg
Now reformat the code (you will need GNU
indent installed for this). This needs to be done twice (for some reason indent takes a little while to settle down):
util/openssl-format-source -v -c . util/openssl-format-source -v -c .
Now create a patch file for your first commit:
git diff tmp-dest --stdout >/tmp/tmp.patch
Reset the source branch to remove the formatting:
git reset --hard
Move to the target branch:
git checkout tmp-dest
Apply the patch:
patch -p1 </tmp/tmp.patch
Commit the patch:
git add . git commit --file=/tmp/tmp.commitmsg --author="`cat /tmp/tmp.author`"
If you have other commits to reformat then checkout the tmp-src branch again:
git checkout tmp-src
Now repeat the process - go back to the cherry-pick instruction above and repeat for the second commit.
Finally, once all commits have been done:
git checkout tmp-dest git rebase master
Fix any conflicts. You’re done.
The reformat work is now completed, aside from any support issues which occur as a result. We are very pleased with the results that we have achieved. I hope that this will make the OpenSSL source code much easier to work with in the future.