UC3M

Telematic/Audiovisual Syst./Communication Syst. Engineering

Systems Architecture

September 2017 - January 2018

Style guide for C programming in Systems Architecture.

The same C program can be written in several ways. All of them may compile correctly and produce an executable, but some are easier to understand than others when read by other people. A style guide is a document that explains how C code must be written. The style changes from one institution to another, but in industrial environments it is normal to require strict adherence to these rules. What follows is a list of the rules that you must observe in this course. Your code must comply with all of them, so the sooner you read them and take them into account, the more time you will save in producing code that is easy to read and maintain.

But why do these rules need to be observed when writing programs? The reasons are easy to understand in the context of industrial-size software projects. A few examples are shown next:

| Application                 | Description                                 | Lines of code                        |
|-----------------------------|---------------------------------------------|--------------------------------------|
| Windows XP Operating System | Complete operating system                   | 40 million                           |
| Linux Kernel                | Basic functionality of the operating system | 8.4 million                          |
| Subversion                  | Version control system                      | 417,000                              |
| Google Chrome               | Web browser                                 | 1.5 million (C++) and 1.4 million (C) |
| PHP                         | Scripting engine for dynamic web pages      | 800,000                              |
| The Gimp                    | Graphic editor                              | 675,000                              |
| VLC Media Player            | Multimedia player                           | 341,000 (C) and 93,000 (C++)         |

When a program is composed of several hundred thousand lines, or even millions, it is very important to write code that is easy to read. If not, a huge effort (and therefore money) is needed to make the slightest modification.

Keep in mind that code is normally written once but read dozens of times: to look for a problem, to understand it before changing it, or to write other modules that interact with the rest of the program. The unwritten rule followed in industry is that code is going to be constantly read by people who did not participate in its creation.

The rules that you must observe are described next. They are all numbered so that they can be referenced easily when the style of your code is reviewed.

  1. Variable Names

    1. Variable, function and file names must be short, descriptive and concrete.

      Good
      struct tcp_header header;
      bool is_enabled;
      
      int parse_xml_file(FILE * file);
      void init_user_interface(void);
      
      list.c
      xml_parser.c
      math.c
      Bad!
      struct tcp_header b;
      bool tmp;
      
      int open_xml_file_and_get_content(FILE * f);
      void ui(void);
      
      types.h
      utils.c
      code3.c
    2. Variable and function names must be written in lower case and, if composed of several words, the words must be separated by the symbol _ (underscore). Other styles exist, such as using capital letters to separate words (a style known as CamelCase). In this course we adopt separation by underscore. To illustrate why this scheme is preferred, read the following two sentences:

      IHadAGreenDollWithALargeTShirt
      
      I_had_a_green_doll_with_a_large_t_shirt

      Which of the two requires less effort to read? In any case, if the standard library of the programming language you are using is written in CamelCase style (as is the case for Java), then you must use that style for consistency. This is not the case for C, so we will use underscore separation.

    3. The macros and constants must be written in upper case to distinguish them from variables and functions.

    4. Constants and public enumerations must include a 3 or 4 character prefix to identify the module in which they are defined (a module is a set of data and functions contained in several files). This avoids conflicts between names in different modules. For example:

      #define LST_MAX_SIZE 32
      typedef enum
      {
          MSG_CONNECT,
          MSG_ACK,
          MSG_DATA,
          MSG_RELEASE
      } message_type_t;
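
The naming rules in this section can be put together in a small sketch. The module names (lst, msg) and the two functions below are hypothetical and only serve to show lower-case names with underscores, upper-case constants, and module prefixes working together:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical "lst" module: its constants carry the LST_ prefix. */
#define LST_MAX_SIZE 32

/* Hypothetical "msg" module: its enumeration values carry the MSG_ prefix. */
typedef enum
{
    MSG_CONNECT,
    MSG_ACK,
    MSG_DATA,
    MSG_RELEASE
} message_type_t;

/*
 * msg_type_name()
 * Returns a printable name for a message type, or "UNKNOWN" if the value
 * is not one of the MSG_ constants.
 */
const char *msg_type_name(message_type_t type)
{
    switch (type)
    {
        case MSG_CONNECT:
            return "CONNECT";
        case MSG_ACK:
            return "ACK";
        case MSG_DATA:
            return "DATA";
        case MSG_RELEASE:
            return "RELEASE";
    }

    return "UNKNOWN";
}

/*
 * lst_can_store()
 * Tells whether a list limited to LST_MAX_SIZE elements can hold
 * item_count elements.
 */
bool lst_can_store(size_t item_count)
{
    return item_count <= LST_MAX_SIZE;
}
```

Because every constant starts with the prefix of its module, a second module could define its own MAX_SIZE or ACK value without clashing with these.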
  2. Code format

    1. The code must be indented to reflect the logical structure of the program. Tabs must be used to indent, never spaces. The indentation adopted in the course is 4 spaces wide. We recommend that you configure your source code editor so that a tab is displayed as the equivalent of 4 spaces.

      The reason to use tabs instead of spaces is that each programmer can then view the code with the indentation they find most comfortable. You only have to configure the editor to display tabs with the desired width.

    2. The curly braces must be placed following the Allman style, also known as BSD style, that is, on the line following an if or a while (see the examples in the rest of this section).

    3. A white space must be inserted before and after operators such as comparison, assignment, etc. A white space must also be inserted between keywords (for, while, if, return, etc.) and the expressions that follow them.

    4. The content of a function must fit completely on one screen. There should be no need to scroll to see the complete code of a function, although some special cases can be considered. In any case, a function body must never exceed two screens.

    5. Lines must not exceed 80 characters in length. This policy simplifies the visualization of several files simultaneously on the screen. Two code fragments are shown, one correctly formatted and another one incorrectly formatted:

      Good
      int db_sync(void)
      {
          int i, retval = 0, result = 0;
      
          for (i = 0; i < P_SIZE; i++)
          {
              if (param_info[i].dirty && param_info[i].sync_cb)
              {
                  retval = param_info[i].sync_cb(i, param_db[i]);
                  result |= retval;
      
                  if (retval == 0)
                      param_info[i].dirty = false;
              }
              else
              {
                  LOG_WARNING("No callback for param %d", i);
              }
          }
      
          return result;
      }
      Bad!
      int db_sync(void)
      {
        int i, retval = 0, result = 0;
      
        for (i = 0; i < P_SIZE; i++){
        if (param_info[i].dirty && param_info[i].sync_cb) {
          retval = param_info[i].sync_cb(i, param_db[i]);
          result |= retval;
          if (retval == 0)
          param_info[i].dirty = false;
        } else {
        LOG_WARNING("No callback for param %d", i);
        }
        }
        return result;}
  3. Use of the pre-processor

    1. Macros must be used to define array sizes so that they are easy to read and modify. Macros are frequently used also for any other constant values in the code.

      #define TIMEOUT_SECS  120
      #define MAX_LINE_SIZE 80
      
      char input_line[MAX_LINE_SIZE];
      timer = set_timer(TIMEOUT_SECS);

      In the case of arrays, the reason is simple: C originally did not allow a variable to be used as the size of an array, and a const variable does not count as a constant expression in C. Thus, the usual way to give a name to an array size is to use the preprocessor.
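
A minimal sketch of this rule follows. It reuses MAX_LINE_SIZE and input_line from the fragment above; the functions store_line() and input_line_capacity() are hypothetical and were added only to show the macro being used both for the declaration and for the bounds check:

```c
#include <stddef.h>
#include <string.h>

#define MAX_LINE_SIZE 80

/* File-scope array: its size must be a constant expression, so the macro
 * is used. A "const int" variable is not a constant expression in
 * standard C, and a compiler may reject it in this position. */
static char input_line[MAX_LINE_SIZE];

/*
 * store_line()
 * Copies text into input_line, truncating it to MAX_LINE_SIZE - 1
 * characters. Returns the number of characters stored.
 */
size_t store_line(const char *text)
{
    size_t length = strlen(text);

    if (length >= MAX_LINE_SIZE)
    {
        length = MAX_LINE_SIZE - 1;
    }

    memcpy(input_line, text, length);
    input_line[length] = '\0';

    return length;
}

/*
 * input_line_capacity()
 * Returns the size in bytes of the input_line buffer.
 */
size_t input_line_capacity(void)
{
    return sizeof(input_line);
}
```

Changing the value in the #define line is enough to resize the buffer and to adjust the truncation limit at the same time.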

  4. Comments in the code

    1. All functions defined in a .c file, both public and private (static), must include a comment at the top explaining their purpose in a line or two. This comment may include some remarks specific to the execution of the function.

      /* 
       * db_sync()
       * Synchronizes the internal database with the firmware files by storing
       * modified parameters on permanent storage.
       *
       * Any parameter marked as "dirty" will be dumped by calling its associated
       * sync callback.
       */
      int db_sync(void)
      {
      ...
      }

      These comments are included so that a person exploring the code can understand its purpose without having to study it in detail. The brief comment at the top of the function must be complemented by a descriptive function name.

    2. Comments must be included in those code locations that implement non-trivial operations. It should rarely be necessary to comment a single line of code: if the code is cleanly written, a single line should be self-explanatory.

      Good
      /* Call synchronization callback for parameters
       * marked "dirty". Clear dirty flag if callback
       * succeeds. */
      if (param_info[i].dirty && param_info[i].sync_cb)
      {
          retval = param_info[i].sync_cb(i, param_db[i]);
          result |= retval;
      
          if (retval == 0)
          {
              param_info[i].dirty = false;
          }
      }
      Bad!
      if (param_info[i].dirty && param_info[i].sync_cb)
      {
        /* Call sync_cb */
        retval = param_info[i].sync_cb(i, param_db[i]);
        result |= retval;
      
        if (retval == 0)
        {
          /* Set dirty flag to false */
          param_info[i].dirty = false;
        }
      }
  5. Code organization

    1. C code is organized in files with extensions .c and .h. For each .c file there is usually a .h file with the same name (list.c, list.h). This pair of files is informally known as a module. The .c files are not merely files containing sets of functions. The key to organizing code in several files and avoiding cross-referencing problems (where the compiler complains that a symbol or definition located in a different file is not known) is to understand that what other programming languages call objects corresponds in C to modules (although with a much simpler structure, of course).

    2. Each module (or object, if you prefer) contains a set of functions whose prototypes make up the public interface defined in the .h file. The .c file contains the implementation of these functions and, in some cases, additional variables and functions that are only accessible from within the same .c file. The prototypes of the public functions (those that can be used from outside the module) are placed in the .h file so that other modules can use them by adding the corresponding #include directive at the top of their files. The remaining functions, the private ones, appear only inside the .c file and are defined with the static keyword so that they cannot be invoked from outside that file.

    3. Each .c file must have at the top a directive to include its corresponding .h file (for example, list.c must have #include "list.h"). This avoids inconsistencies between the definitions of public variables and functions in the .c file and their declarations in the .h file. If the corresponding .h is included in the .c file, the compiler can detect type conflicts.

    4. The .h files must contain only public definitions: types, constants, global variables and prototypes of functions to be used outside the module. Everything else must be included in the .c file.

    5. Every .h file must contain a guard to prevent multiple inclusions. A guard is implemented by enclosing the entire file content between #ifndef SYMBOL and #endif. The symbol name must be unique to this file (it is recommended to use the file name with underscores). The line with the #ifndef directive must be followed by a line with a #define directive that uses exactly the same symbol. The content of the .h file is inserted after this second line. The following is an example of a guard implemented in a file named list.h.

      #ifndef _LIST_H_
      #define _LIST_H_
      ...
      ... content of file list.h ...
      ...
      #endif /* _LIST_H_ */
    6. The .h files must contain only the minimum set of #include directives needed to compile on their own. The best way to test that a .h file compiles on its own is to write and compile a main like the following:

      #include "list.h"
      int main(void)
      {
          return 0;
      }
    7. All private functions and variables (those that must not be used from any other file) must be defined as static. All global variables in a module must be written at the top so that they can be seen at a glance; avoid spreading their definitions over the file. As a consequence, it is highly recommended to separate the definition of a data structure from the declaration of global variables of that type. The data types are defined first (in the .h file if they are public, in the .c file otherwise), followed by the declarations of the global variables that use those types.

      Good
      #include "param_db.h"
      
      struct param_entry
      {
          char param[PARAM_MAX_LEN];
          char value[VALUE_MAX_LEN];
          char default_value[VALUE_MAX_LEN];
      };
      
      struct param_list
      {
          struct param_entry param;
          struct list_node *next;
      };
      
      /* Global Variables */
      int error_code;
      static struct param_list *param_map;
      Bad!
      #include "param_db.h"
      
      struct param_entry
      {
          char param[PARAM_MAX_LEN];
          char value[VALUE_MAX_LEN];
          char default_value[VALUE_MAX_LEN];
      };
      
      static struct param_list
      {
          struct param_entry param;
          struct list_node *next;
      } *param_map;
      
      int error_code;
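
The organization rules of this section can be sketched with a small module. The module name (counter), its CNT_ prefix, and all of its functions are hypothetical. The two files are shown in a single listing so that it compiles as one unit; in a real project the first part would be counter.h and the second part counter.c, which would start with #include "counter.h":

```c
/* ---- counter.h: public interface only ---- */
#ifndef COUNTER_H_
#define COUNTER_H_

#define CNT_MAX_VALUE 100

/* Public prototypes: the only functions other modules may call. */
int counter_increment(void);
int counter_get(void);
void counter_reset(void);

#endif /* COUNTER_H_ */

/* ---- counter.c: implementation ---- */
/* #include "counter.h" goes here when the parts are separate files. */

/* Global variables of the module, grouped at the top. */
static int counter_value = 0;

/*
 * clamp_to_max()
 * Private helper that limits a value to CNT_MAX_VALUE. Being static, it
 * cannot be called from other .c files.
 */
static int clamp_to_max(int value)
{
    if (value > CNT_MAX_VALUE)
    {
        return CNT_MAX_VALUE;
    }

    return value;
}

/*
 * counter_increment()
 * Adds one to the counter, saturating at CNT_MAX_VALUE. Returns the new
 * value.
 */
int counter_increment(void)
{
    counter_value = clamp_to_max(counter_value + 1);

    return counter_value;
}

/*
 * counter_get()
 * Returns the current value of the counter.
 */
int counter_get(void)
{
    return counter_value;
}

/*
 * counter_reset()
 * Sets the counter back to zero.
 */
void counter_reset(void)
{
    counter_value = 0;
}
```

Other modules would see only the three prototypes and the CNT_MAX_VALUE constant. The variable counter_value and the helper clamp_to_max() stay private to the .c file, so they can be changed later without affecting any other file.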