
OpenSSL coding style

This document describes the coding style for the OpenSSL project. It is derived from the Linux kernel coding style.

This guide is not distributed as part of OpenSSL itself. Since it is derived from the Linux Kernel Coding Style, it is distributed under the terms of the kernel license.

Coding style is all about readability and maintainability using commonly available tools. OpenSSL coding style is simple. Avoid tricky expressions.

Chapter 1: Indentation

Indentation is four space characters. Do not use the tab character.

Pre-processor directives use one space for indents:

    #if
    # define
    #else
    # define
    #endif
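As an illustration, nested directives are commonly written with one additional space per nesting level after the #. The sketch below assumes that reading of the rule; DEMO_FORCE_SMALL, DEMO_WORD_BITS and demo_word_bits() are hypothetical names, not OpenSSL symbols.

```c
#include <limits.h>

#if defined(DEMO_FORCE_SMALL)
# define DEMO_WORD_BITS 16
#else
# if UINT_MAX > 0xFFFF
#  define DEMO_WORD_BITS 32
# else
#  define DEMO_WORD_BITS 16
# endif
#endif

/* Returns the value selected by the pre-processor block above. */
int demo_word_bits(void)
{
    return DEMO_WORD_BITS;
}
```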

Chapter 2: Breaking long lines and strings

Don’t put multiple statements, or assignments, on a single line. This example is wrong:

    if (condition) do_this();
    do_something_everytime();

The limit on the length of lines is 80 columns. Statements longer than 80 columns must be broken into sensible chunks, unless exceeding 80 columns significantly increases readability and does not hide information. Descendants are always substantially shorter than the parent and are placed substantially to the right. The same applies to function headers with a long argument list. Never break user-visible strings, however, because that breaks the ability to grep for them.
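A small sketch of these rules, using hypothetical names: the header of demo_join() is broken after a comma with the continuation placed well to the right, while the user-visible message is left on one line, even though it exceeds 80 columns, so that it can still be found with grep.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: joins three strings into buf.
 * Returns 1 on success and 0 if buf is too small.
 */
int demo_join(char *buf, size_t buf_len, const char *prefix,
              const char *name, const char *suffix)
{
    size_t needed = strlen(prefix) + strlen(name) + strlen(suffix) + 1;

    if (needed > buf_len) {
        /* The message is kept whole rather than split across lines. */
        fprintf(stderr, "demo_join: output buffer is too small to hold the joined name\n");
        return 0;
    }
    strcpy(buf, prefix);
    strcat(buf, name);
    strcat(buf, suffix);
    return 1;
}
```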

Chapter 3: Placing Braces and Spaces

The other issue that always comes up in C styling is the placement of braces. Unlike the indent size, there are few technical reasons to choose one placement strategy over the other, but the preferred way, following Kernighan and Ritchie, is to put the opening brace last on the line, and the closing brace first:

    if (x is true) {
        we do y
    }

This applies to all non-function statement blocks (if, switch, for, while, do):

    switch (suffix) {
    case 'G':
    case 'g':
        mem <<= 30;
        break;
    case 'M':
    case 'm':
        mem <<= 20;
        break;
    case 'K':
    case 'k':
        mem <<= 10;
        /* fall through */
    default:
        break;
    }

Note, from the above example, that the way to indent a switch statement is to align the switch and its subordinate case labels in the same column instead of double-indenting the case bodies.

There is one special case, however. Functions have the opening brace at the beginning of the next line:

    int function(int x)
    {
        body of function
    }

Note that the closing brace is on a line of its own, except in the cases where it is followed by a continuation of the same statement, such as a while in a do-statement or an else in an if-statement, like this:

    do {
        ...
    } while (condition);

and

    if (x == y) {
        ...
    } else if (x > y) {
        ...
    } else {
        ...
    }

In addition to being consistent with K&R, note that this brace placement also minimizes the number of empty (or almost empty) lines. Since the supply of new-lines on your screen is not a renewable resource (think 25-line terminal screens here), you have more empty lines to put comments on.

Do not unnecessarily use braces around a single statement:

    if (condition)
        action();

and

    if (condition)
        do_this();
    else
        do_that();

If one of the branches is a compound statement, then use braces on both parts:

    if (condition) {
        do_this();
        do_that();
    } else {
        otherwise();
    }

Nested compound statements should often have braces for clarity, particularly to avoid the dangling-else problem:

    if (condition) {
        do_this();
        if (anothertest)
            do_that();
    } else {
        otherwise();
    }

Chapter 3.1: Spaces

OpenSSL style for use of spaces depends (mostly) on whether the name is a function or keyword. Use a space after most keywords:

    if, switch, case, for, do, while, return

Do not use a space after sizeof, typeof, alignof, or __attribute__. They look somewhat like functions and should have parentheses in OpenSSL, although they are not required by the language. For sizeof, use a variable when at all possible, to ensure that type changes are properly reflected:

    SOMETYPE *p = OPENSSL_malloc(sizeof(*p) * num_of_elements);

Do not add spaces around the inside of parenthesized expressions. This example is wrong:

    s = sizeof( struct file );

When declaring pointer data or a function that returns a pointer type, the asterisk goes next to the data or function name, and not the type:

    char *openssl_banner;
    unsigned long long memparse(char *ptr, char **retptr);
    char *match_strdup(substring_t *s);

Use one space on either side of binary and ternary operators, such as this partial list:

    =  +  -  <  >  *  /  %  |  &  ^  <=  >=  ==  !=  ?  :  +=

Put a space after commas and after semicolons in for statements, but not in for (;;).

Do not put a space after unary operators:

    &  *  +  -  ~  !  defined

Do not put a space before the postfix increment and decrement unary operators or after the prefix increment and decrement unary operators:

    foo++
    --bar

Do not put a space around the . and -> structure member operators:

    foo.bar
    foo->bar

Do not use multiple consecutive spaces except in comments, for indentation, and for multi-line alignment of definitions, e.g.:

    #define FOO_INVALID  -1   /* invalid or inconsistent arguments */
    #define FOO_INTERNAL 0    /* Internal error, most likely malloc */
    #define FOO_OK       1    /* success */
    #define FOO_GREAT    100  /* some specific outcome */

Do not leave trailing whitespace at the ends of lines. Some editors with smart indentation will insert whitespace at the beginning of new lines as appropriate, so you can start typing the next line of code right away. However, they may not remove that whitespace if you leave the line blank, and you end up with lines containing trailing, or nothing but, whitespace.

Git will warn you about patches that introduce trailing whitespace, and can optionally strip the trailing whitespace; however, if applying a series of patches, this may make later patches in the series fail by changing their context lines.

Avoid empty lines at the beginning or at the end of a file.

Avoid multiple empty lines in a row.

Chapter 4: Naming

C is a Spartan language, and so should your naming be.

Local variable names should be short, and to the point. If you have some random integer loop counter, it should probably be called i or j.

Avoid single-letter names when they can be visually confusing, such as I and O. Avoid other single-letter names unless they are telling in the given context. For instance, m for modulus and s for SSL pointers are fine.

Use simple variable names like tmp and name as long as they are non-ambiguous in the given context.

If you are afraid that someone might mix up your local variable names, perhaps the function is too long; see the chapter on functions.

Global variables (to be used only if you REALLY need them) need to have descriptive names, as do global functions. If you have a function that counts the number of active users, you should call it count_active_users() or similar; you should NOT call it cntusr().

For getter functions returning a pointer and functions setting a pointer given as a parameter, use names containing get0_ or get1_ (rather than get_), set0_ or set1_ (rather than set_), or push0_ or push1_ (rather than push_). The digit indicates whether the structure referred to by the pointer remains as it is (0) or is duplicated or up-ref’ed (1), in which case an additional free() will be needed.
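The difference can be sketched with a reference-counted structure. Everything below is hypothetical (DEMO_NAME, DEMO_CERT and the demo_ functions are not OpenSSL symbols), and plain malloc() and free() stand in for the OpenSSL allocation routines so that the example is self-contained.

```c
#include <stdlib.h>

typedef struct demo_name_st {
    int references;             /* number of owners of this object */
} DEMO_NAME;

typedef struct demo_cert_st {
    DEMO_NAME *subject;         /* owned by the certificate */
} DEMO_CERT;

DEMO_NAME *demo_name_new(void)
{
    DEMO_NAME *name = malloc(sizeof(*name));

    if (name != NULL)
        name->references = 1;
    return name;
}

/* Drops one reference and releases the object when none are left. */
void demo_name_free(DEMO_NAME *name)
{
    if (name == NULL)
        return;
    if (--name->references == 0)
        free(name);
}

/* get0: returns the internal pointer; the caller must not free it. */
DEMO_NAME *demo_cert_get0_subject(const DEMO_CERT *cert)
{
    return cert->subject;
}

/* get1: takes an extra reference; the caller must free the result. */
DEMO_NAME *demo_cert_get1_subject(const DEMO_CERT *cert)
{
    if (cert->subject != NULL)
        cert->subject->references++;
    return cert->subject;
}
```

A caller that obtained the pointer through the get1_ variant calls demo_name_free() on it when done; a caller that used the get0_ variant does not.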

Use a lowercase prefix such as ossl_ for internal symbols unless they are static (i.e., local to the source file).

Use an uppercase prefix such as EVP_ or OSSL_CMP_ for public (API) symbols.

Do not encode the type into a name (so-called Hungarian notation, e.g., int iAge).

Align names to terms and wording used in standards and RFCs.

Avoid mixed-case unless needed by other rules. Especially never use FirstCharacterUpperCase. For instance, use EVP_PKEY_do_something rather than EVP_DigestDoSomething.

Make sure that names do not contain spelling errors.

Chapter 5: Typedefs

OpenSSL uses typedefs extensively. For structures, they are all uppercase and are usually declared like this:

    typedef struct name_st NAME;

For examples, look in types.h, but note that there are many exceptions such as BN_CTX. Typedef’d enum is used much less often and there is no convention, so consider not using a typedef. When doing that, the enum name should be lowercase and the values (mostly) uppercase. Note that enum arguments to public functions are not permitted.

The ASN.1 structures are an exception to this. The rationale is that if a structure (and its fields) is already defined in a standard it’s more convenient to use a similar name. For example, in the CMS code, a CMS_ prefix is used so ContentInfo becomes CMS_ContentInfo, RecipientInfo becomes CMS_RecipientInfo etc. Some older code uses an all uppercase name instead. For example, RecipientInfo for the PKCS#7 code uses PKCS7_RECIP_INFO.

Be careful about common names which might cause conflicts. For example, Windows headers use X509 and X509_NAME. Consider using a prefix, as with CMS_ContentInfo, if the name is common or generic. Of course, you often don’t find out until the code is ported to other platforms.

A final word on structs. OpenSSL has historically made all struct definitions public; this has caused problems with maintaining binary compatibility and adding features. Our stated direction is to have structs be opaque and only expose pointers in the API. The actual struct definition should be defined in a local header file that is not exported.
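A minimal sketch of the opaque-struct approach, shown in a single file for brevity. The comments mark which parts would go in the exported header and which in the local one; DEMO_COUNTER and its functions are hypothetical, and malloc() and free() stand in for the OpenSSL allocation routines.

```c
#include <stdlib.h>

/* Would live in the exported header: only the opaque type is visible. */
typedef struct demo_counter_st DEMO_COUNTER;

DEMO_COUNTER *demo_counter_new(void);
void demo_counter_free(DEMO_COUNTER *ctr);
int demo_counter_next(DEMO_COUNTER *ctr);

/* Would live in a local header that is not exported. */
struct demo_counter_st {
    int value;                  /* last value handed out */
};

DEMO_COUNTER *demo_counter_new(void)
{
    DEMO_COUNTER *ctr = malloc(sizeof(*ctr));

    if (ctr != NULL)
        ctr->value = 0;
    return ctr;
}

void demo_counter_free(DEMO_COUNTER *ctr)
{
    free(ctr);
}

/* Returns the next value, or -1 if ctr is NULL. */
int demo_counter_next(DEMO_COUNTER *ctr)
{
    if (ctr == NULL)
        return -1;
    return ++ctr->value;
}
```

Because callers only ever hold a DEMO_COUNTER pointer, fields can later be added to the structure without changing the size or layout that applications were compiled against.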

Chapter 6: Functions

Ideally, functions should be short and sweet, and do just one thing well. A rule of thumb is that they should fit on one or two screenfuls of text (25 lines each, as we all know).

The maximum length of a function is often inversely proportional to the complexity and indentation level of that function. So, if you have a conceptually simple function that is just one long (but simple) switch statement, where you have to do lots of small things for a lot of different cases, it’s okay to have a longer function.

If you have a complex function, however, consider using helper functions with descriptive names. You can ask the compiler to in-line them if you think it’s performance-critical, and it will probably do a better job of it than you would have done.

Another measure of complexity is the number of local variables. If there are more than 5 to 10, consider splitting the function into smaller pieces. A human brain can generally easily keep track of about seven different things; anything more and it gets confused. Often things which are simple and clear now are much less obvious two weeks from now, or to someone else. An exception to this is the command-line applications which support many options.

In source files, separate functions with one blank line.

In function prototypes, include parameter names with their data types. Although this is not required by the C language, it is preferred in OpenSSL because it is a simple way to add valuable information for the reader. The name in the prototype declaration should match the name in the function definition.

Separate local variable declarations and subsequent statements by an empty line.

Do not mix local variable declarations and statements.

Chapter 6.1: Checking function arguments

A public function should verify that its arguments are sensible. This includes, but is not limited to, verifying that:

- non-optional pointer arguments are not NULL, and
- numeric arguments are within expected ranges.

Where an argument is not sensible, an error should be returned.
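A minimal sketch of such checks, assuming the common 1 for success and 0 for failure return convention. DEMO_MAX_LEVEL and demo_set_level() are hypothetical; a real function would typically also record an error before returning.

```c
#include <stddef.h>

#define DEMO_MAX_LEVEL 9

/* Stores level in *out. Returns 1 on success, 0 on a bad argument. */
int demo_set_level(int *out, int level)
{
    if (out == NULL)
        return 0;
    if (level < 0 || level > DEMO_MAX_LEVEL)
        return 0;

    *out = level;
    return 1;
}
```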

Chapter 6.2: Extending existing functions

From time to time it is necessary to extend an existing function. Typically this will mean adding additional arguments, but it may also include removal of some.

Where an extended function should be added, the original function should be kept and a new version created with the same name and an _ex suffix. For example, the RAND_bytes function has an extended form called RAND_bytes_ex.

Where an extended version of a function already exists and a second extended version needs to be created, it should have an _ex2 suffix, and so on for further extensions.

When an extended version of a function is created, the order of existing parameters from the original function should be retained. However, new parameters may be inserted at any point (they do not have to be at the end), and parameters that are no longer required may be removed.
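One way to apply this, sketched with hypothetical functions (this is not the RAND_bytes_ex signature): the extended form gains a parameter, and the original is kept as a thin wrapper so that existing callers continue to work unchanged.

```c
#include <stddef.h>
#include <string.h>

/* Extended form: value is the new parameter. */
int demo_fill_ex(unsigned char *buf, unsigned char value, size_t len)
{
    if (buf == NULL)
        return 0;
    memset(buf, value, len);
    return 1;
}

/* Original form: existing parameters keep their relative order. */
int demo_fill(unsigned char *buf, size_t len)
{
    return demo_fill_ex(buf, 0, len);
}
```

Here the new parameter is inserted between the two existing ones, which the rule permits because buf still precedes len.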

Chapter 7: Centralized exiting of functions

The goto statement comes in handy when a function exits from multiple locations and some common work such as cleanup has to be done. If there is no cleanup needed then just return directly. The rationale for this is as follows:

Unconditional statements are easier to understand and follow
It can reduce excessive control structures and nesting
It avoids errors caused by failing to update multiple exit points when the code is modified
It saves the compiler work to optimize redundant code away ;)

For example:

    int fun(int a)
    {
        int result = 0;
        char *buffer = OPENSSL_malloc(SIZE);

        if (buffer == NULL)
            return -1;

        if (condition1) {
            while (loop1) {
                ...
            }
            result = 1;
            goto out;
        }
        ...
    out:
        OPENSSL_free(buffer);
        return result;
    }
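A complete, compilable variant of the same pattern is sketched below. The function demo_is_palindrome() is hypothetical, and malloc() and free() stand in for OPENSSL_malloc() and OPENSSL_free() so that the example builds without OpenSSL.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Returns 1 if text reads the same in both directions, 0 if it does
 * not, and -1 on error.
 */
int demo_is_palindrome(const char *text)
{
    int ret = -1;
    size_t len;
    size_t i;
    char *reversed;

    if (text == NULL)
        return -1;              /* nothing to clean up yet */

    len = strlen(text);
    reversed = malloc(len + 1);
    if (reversed == NULL)
        return -1;

    if (len == 0) {
        ret = 1;
        goto out;
    }

    for (i = 0; i < len; i++)
        reversed[i] = text[len - 1 - i];
    reversed[len] = '\0';
    ret = strcmp(text, reversed) == 0;
out:
    free(reversed);
    return ret;
}
```

The early returns are used only before anything has been allocated; once the buffer exists, every path leaves through the single label that releases it.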

Chapter 8: Commenting

Use the classic /* ... */ comment markers. Don’t use // ... markers.

Place comments above or to the right of the code they refer to. Comments referring to the code line after should be indented equally to that code line.

Comments are good, but there is also a danger of over-commenting. NEVER try to explain HOW your code works in a comment. It is much better to write the code so that it is obvious, and it’s a waste of time to explain badly written code. You want your comments to tell WHAT your code does, not HOW.

The preferred style for long (multi-line) comments is:

    /*-
     * This is the preferred style for multi-line
     * comments in the OpenSSL source code.
     * Please use it consistently.
     *
     * Description:  A column of asterisks on the left side,
     * with beginning and ending almost-blank lines.
     */

Note the initial hyphen to prevent indent from modifying the comment. Use this if the comment has particular formatting that must be preserved.

It’s also important to comment data, whether they are basic types or derived types. To this end, use just one data declaration per line (no commas for multiple data declarations). This leaves you room for a small comment on each item, explaining its use.

Chapter 9: Macros and Enums

Names of macros defining constants and labels in enums are in uppercase:

    #define CONSTANT 0x12345

Enums are preferred when defining several related constants. Note, however, that enum arguments to public functions are not permitted.
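A short sketch of how this might look: the related constants are grouped in an enum with a lowercase name and uppercase labels, while the function that would be public takes a plain int. The names demo_mode and demo_mode_is_valid() are hypothetical.

```c
enum demo_mode {
    DEMO_MODE_OFF,
    DEMO_MODE_FAST,
    DEMO_MODE_SAFE
};

/* The parameter is an int rather than enum demo_mode. */
int demo_mode_is_valid(int mode)
{
    switch (mode) {
    case DEMO_MODE_OFF:
    case DEMO_MODE_FAST:
    case DEMO_MODE_SAFE:
        return 1;
    default:
        return 0;
    }
}
```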

Macro names should be in uppercase, but macros resembling functions may be written in lower case. Generally, inline functions are preferable to macros resembling functions.

Macros with multiple statements should be enclosed in a do - while block:

    #define macrofun(a, b, c)   \
        do {                    \
            if ((a) == 5)       \
                do_this(b, c);  \
        } while (0)
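The reason for the do - while wrapper is that the expanded macro then behaves as a single statement, so it can be followed by a semicolon and used in an unbraced if with an else. A small sketch, with hypothetical names:

```c
/* Swaps two int l-values; tmp_ is local to the macro's own block. */
#define demo_swap_int(a, b) \
    do {                    \
        int tmp_ = (a);     \
        (a) = (b);          \
        (b) = tmp_;         \
    } while (0)

/*
 * Ensures *x >= *y. Returns 1 if the values were swapped, 0 otherwise.
 * Without the do - while wrapper, the semicolon after the macro call
 * would end the if statement and the else would not compile.
 */
int demo_order_desc(int *x, int *y)
{
    if (*x < *y)
        demo_swap_int(*x, *y);
    else
        return 0;
    return 1;
}
```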

Do not write macros that affect control flow:

    #define FOO(x)                 \
        do {                       \
            if (blah(x) < 0)       \
                return -EBUGGERED; \
        } while (0)

Do not write macros that depend on having a local variable with a magic name:

    #define FOO(val) bar(index, val)

It is confusing to the reader and is prone to breakage from seemingly innocent changes.

Do not write macros that are l-values:

    FOO(x) = y

This will cause problems if, e.g., FOO becomes an inline function.

Be careful of precedence. Macros defining an expression must enclose the expression in parentheses unless the expression is a literal or a function application:

    #define SOME_LITERAL 0x4000
    #define CONSTEXP (SOME_LITERAL | 3)
    #define CONSTFUN foo(0, CONSTEXP)

Beware of similar issues with macros using parameters. Put parentheses around uses of macro arguments unless they are passed on as-is to a further macro or function. For example,

    #define MACRO(a, b) ((a) * func(a, b))
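To see why the parentheses matter, compare a macro that omits them with one that follows the rule. The macros and functions below are hypothetical and exist only to show the difference in evaluation.

```c
/* Wrong: neither the argument nor the whole expression is protected. */
#define DEMO_DOUBLE_BAD(x) x * 2

/* Right: the argument and the whole expression are parenthesized. */
#define DEMO_DOUBLE(x) ((x) * 2)

/* Expands to 1 + 2 * 2, which is 5 rather than the intended 6. */
int demo_double_bad(void)
{
    return DEMO_DOUBLE_BAD(1 + 2);
}

/* Expands to ((1 + 2) * 2), which is 6. */
int demo_double_good(void)
{
    return DEMO_DOUBLE(1 + 2);
}
```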

The GNU cpp manual deals with macros exhaustively.

Chapter 10: Allocating memory

OpenSSL provides many general purpose memory utilities, including, but not limited to: OPENSSL_malloc(), OPENSSL_zalloc(), OPENSSL_realloc(), OPENSSL_memdup(), OPENSSL_strdup() and OPENSSL_free(). Please refer to the API documentation for further information about them.

Chapter 11: Function return values and names

Functions can return values of many different kinds, and one of the most common is a value indicating whether the function succeeded or failed. Usually this is:

1: success
0: failure

Sometimes an additional value is used:

-1: something bad (e.g., internal error or memory allocation failure)

Other APIs use the following pattern:

>= 1: success, with value returning additional information
<= 0: failure with return value indicating why things failed

Sometimes a return value of -1 can mean “should retry” (e.g., BIO, SSL, et al).

Functions whose return value is the actual result of a computation, rather than an indication of whether the computation succeeded, are not subject to these rules. Generally they indicate failure by returning some out-of-range result. The simplest example is functions that return pointers; they return NULL to report failure.
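Because some functions can return a negative value on error, callers need to compare the result explicitly. The sketch below uses hypothetical functions that follow the 1, 0 and -1 convention described above.

```c
/*
 * Hypothetical check: 1 on success, 0 on failure (value too large),
 * -1 on an internal error (negative input).
 */
int demo_check_value(int value)
{
    if (value < 0)
        return -1;
    if (value > 100)
        return 0;
    return 1;
}

/* Wrong: -1 is non-zero, so an internal error is taken as success. */
int demo_accept_careless(int value)
{
    if (demo_check_value(value))
        return 1;
    return 0;
}

/* Better: only a positive return value counts as success. */
int demo_accept(int value)
{
    if (demo_check_value(value) <= 0)
        return 0;
    return 1;
}
```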

Chapter 12: Editor modelines

Some editors can interpret configuration information embedded in source files, indicated with special markers. For example, emacs interprets lines marked like this:

    -*- mode: c -*-

Or like this:

    /*
    Local Variables:
    compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c"
    End:
    */

Vim interprets markers that look like this:

    /* vim:set sw=8 noet */

Do not include any of these in source files. People have their own personal editor configurations, and your source files should not override them. This includes markers for indentation and mode configuration. People may use their own custom mode, or may have some other magic method for making indentation work correctly.

Chapter 13: Processor-specific code

In OpenSSL’s case the only reason to resort to processor-specific code is for performance. As it still exists in a general platform-independent algorithm context, it always has to be backed up by a neutral pure C implementation. This implies certain limitations. The most common way to resolve this conflict is to opt for short inline assembly function-like snippets, customarily implemented as macros, so that they can be easily interchanged with other platform-specific or neutral code. As with any macro, try to implement it as a single expression.

You may need to mark your asm statement as volatile, to prevent GCC from removing it if GCC doesn’t notice any side effects. You don’t always need to do so, though, and doing so unnecessarily can limit optimization.

When writing a single inline assembly statement containing multiple instructions, put each instruction on a separate line in a separate quoted string, and end each string except the last with \n\t to properly indent the next instruction in the assembly output:

        asm ("magic %reg1, #42\n\t"
             "more_magic %reg2, %reg3"
             : /* outputs */ : /* inputs */ : /* clobbers */);

Large, non-trivial assembly functions go in pure assembly modules, with corresponding C prototypes defined in C. The preferred way to implement this is so-called “perlasm”: instead of writing a real .s file, you write a perl script that generates one. This allows the use of symbolic names for variables (registers as well as locals allocated on the stack) that are independent of the specific assembler. It simplifies the implementation of recurring instruction sequences with regular permutation of inputs. By adhering to specific coding rules, perlasm is also used to support multiple ABIs and assemblers; see crypto/perlasm/x86_64-xlate.pl for an example.

Another option for processor-specific (primarily SIMD) capabilities is called compiler intrinsics. We avoid this, because it’s not very much less complicated than coding pure assembly, and it doesn’t provide the same performance guarantee across different micro-architectures. Nor is it portable enough to meet our multi-platform support goals.

Chapter 14: Portability

To maximise portability the version of C defined in ISO/IEC 9899:1990 should be used. This is more commonly referred to as C90. ISO/IEC 9899:1999 (also known as C99) is not supported on some platforms that OpenSSL is used on and therefore should be avoided.
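In practice this means avoiding constructs that were introduced in C99, such as // comments, declarations placed after statements, and loop variables declared inside the for statement. A small C90-compatible sketch, with a hypothetical function name:

```c
#include <stddef.h>

/* Sums count values; all declarations come before the first statement. */
int demo_sum(const int *values, size_t count)
{
    size_t i;                   /* declared here, not inside the for */
    int total = 0;

    for (i = 0; i < count; i++)
        total += values[i];
    return total;
}
```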

Chapter 15: Expressions

Avoid needless parentheses as far as reasonable. For example, do not write

    if ((p == NULL) && (!f(((2 * x) + y) == (z++))))

but

    if (p == NULL && !f(2 * x + y == z++))

For clarity, always put parentheses when mixing the logical && and || operators, mixing comparison operators like <= and ==, or mixing bitwise operators like & and |. For example,

    if ((a && b) || c)
    if ((a <= b) == ((c >= d) != (e < f)))
    x = (a & b) ^ (c | d)

Regarding parentheses in macro definitions see the chapter on macros.

In comparisons with constants (including NULL and other constant macros) place the constant on the right-hand side of the comparison operator. For example,

    while (i++ < 10 && p != NULL)

Do not use implicit checks for numbers (not) being 0 or pointers (not) being NULL. For example, do not write

    if (i)
    if (!(x & MASK))
    if (!strcmp(a, "FOO"))
    if (!(p = BN_new()))

but do this instead:

    if (i != 0)
    if ((x & MASK) == 0)
    if (strcmp(a, "FOO") == 0)
    if ((p = BN_new()) == NULL)

Boolean values shall be used directly as usual, e.g.,

    if (check(x) && !success(y))

Note: Many functions can return 0 or a negative value on error and the Boolean forms need to be used with care.

If you need to break an expression into multiple lines, make the line break before an operator, not after. It is preferred that such a line break is made before as low priority an operator as possible. Examples:

not this:

    if (somewhat_long_function_name(foo) == 1 && a_long_variable_name
        == 2)

but rather:

    if (somewhat_long_function_name(foo) == 1
        && a_long_variable_name == 2)

This is, however, still ok:

    if (this_thing->this_freakishly_super_long_name(somewhat_long_name, 3)
        == PRETTY_DARN_LONG_MACRO_NAME)

When appearing at the beginning of a line, operators can, but do not have to, get an extra indentation (+ 4 characters). For example,

    if (long_condition_expression_1
            && condition_expression_2) {
        statement_1;
        statement_2;
    }

Chapter 16: Asserts

We have three kinds of asserts. Their behaviour depends on whether the build is a debug or a release build:

Function        Failure (release)  Failure (debug)  Success (release)  Success (debug)
assert          not evaluated      abort            not evaluated      nothing
ossl_assert     returns 0          abort            returns 1          returns 1
OPENSSL_assert  abort              abort            nothing            nothing

Use OPENSSL_assert() only in the following cases:

In the libraries when the global state of the software is corrupted and there is no way to recover it
In applications, test programs and fuzzers

Use ossl_assert() in the libraries when the state can be recovered and an error can be returned. Example code:

    if (!ossl_assert(!should_not_happen)) {
        /* push internal error onto error stack */
        return BAD;
    }

Use assert() in libraries when no error can be returned.
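The ossl_assert row of the table can be modelled with a small stand-in. This is only a sketch of the documented behaviour, not the OpenSSL implementation: demo_assert, demo_die() and the DEMO_DEBUG_BUILD switch are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper that reports the failed expression and aborts. */
void demo_die(const char *expr, const char *file, int line)
{
    fprintf(stderr, "%s:%d: assertion failed: %s\n", file, line, expr);
    abort();
}

#ifdef DEMO_DEBUG_BUILD
/* Debug build: abort on failure, evaluate to 1 on success. */
# define demo_assert(x) \
    ((x) != 0 ? 1 : (demo_die(#x, __FILE__, __LINE__), 0))
#else
/* Release build: evaluate to 0 on failure and 1 on success. */
# define demo_assert(x) ((x) != 0)
#endif

/* Returns 1 and stores the quotient, or 0 if the state is unexpected. */
int demo_divide(int num, int den, int *out)
{
    if (!demo_assert(out != NULL && den != 0))
        return 0;
    *out = num / den;
    return 1;
}
```

Built without DEMO_DEBUG_BUILD, a bad call simply returns 0 so the caller can report an error; built with it, the same call aborts and points at the failing expression.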

Chapter 17: References

The C Programming Language, Second Edition by Brian W. Kernighan and Dennis M. Ritchie. Prentice Hall, Inc., 1988. ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback).

The Practice of Programming by Brian W. Kernighan and Rob Pike. Addison-Wesley, Inc., 1999. ISBN 0-201-61586-X.

GNU manuals - where in compliance with K&R and this text - for cpp, gcc, gcc internals and indent.

WG14 is the international standardization working group for the programming language C.