UC3M

Telematic/Audiovisual Syst./Communication Syst. Engineering

Systems Architecture

September 2017 - January 2018

Chapter 2.  C Data Types

The data structures in the C programming language are simpler than those offered in Java because there is no notion of class or object. C offers basic data types and two constructions to create more complex data. The access control to data present in Java (private, public and protected methods and fields) does not exist in C. Variables are either global, local to a file or local to a block of code.

2.1.  Basic data types

C offers three basic data types:

  • Integers defined with the keyword int

  • Characters defined with the keyword char

  • Real or floating point numbers defined with the keywords float or double.

2.1.1.  Integers

Integers are defined with int, and two kinds of optional prefixes are allowed:

  • short and long: modify the size in bits of the integer. Thus, there exist three types of integers: int, short int (which can be abbreviated short) and long int (which can be abbreviated long).

    The C programming language does not define a fixed size for the basic data types. It only sets minimum ranges and guarantees that a short int is no larger than an int, which in turn is no larger than a long int. This feature of the language has made the creation of programs that are compatible with multiple platforms quite complex.

  • unsigned: defines a natural number (greater than or equal to zero).

Suggestion

In your development environment create a text file with the following structure (you may simply copy and paste the text in the following frame):

int main() 
{

}

Insert integer definitions in the main function to test all possible combinations (up to ten). To check that the program is syntactically correct, open a window with a command interpreter and, in the folder where you created the file, type the command gcc -Wall -o program file.c replacing file.c with the name of the file you created. If the command does not print any message, your program is correct (with -Wall the compiler may warn about unused variables; these warnings can be ignored for this exercise). You may see that the compiler generated an executable file named program; you may delete it.

Self-assessment questions

Answer the following questions (also check your answers by compiling the program):

  1. The following program compiles without errors:

    void main()
    {
      int i;
      long int j;
      long k;
      short int l;
      short m;
    }
    • True

    • False

  2. The following program produces an error when compiling because an integer variable is being assigned to an unsigned variable (natural).

    void main()
    {
      unsigned i;
      int j = 0;
    
      i = j;
    }
    • True

    • False

  3. The declaration unsigned short x is identical to short unsigned x.

    • True

    • False

    Write a brief program and check your answer with the compiler.

2.1.2.  Characters and strings

Variables of type character are declared as char. To refer to a character, the symbol must be surrounded by single quotes: 'M'. Characters are internally represented as numbers, and the C language allows arithmetic operations with them such as 'M' + 25.

Strings are represented as arrays of char. The library functions that manipulate strings all assume that the last byte of the string has the value zero. Strings are written in the program surrounded by double quotes, and the compiler appends the value zero at the end. An example with two definitions follows:

#define SIZE 6
char a = 'A';
char b[SIZE] = "hello";

Why does the second definition have a size of six characters when the string has only five?

Suggestion

Reuse the program from the previous section and add char and string definitions. For the latter, use different array sizes (too small and too large for the string). Also write arithmetic expressions over the characters. Remember that if the compiler does not emit any message, the program is correct.

Self-assessment questions

  1. Consider the following declaration:

    #define SIZE 6      
    char m[SIZE] = 'strag';

    It is incorrect because the string must be surrounded by double quotes.

    • True

    • False

    It is incorrect because the size must be 5 (it has 5 characters).

    • True

    • False

    If you think the answers are wrong, write the declaration in a program and compile it.

  2. A C program that prints the result of the expression 'M' + 25 is correct and prints an f.

    • True

    • False

    We recommend that you write such a program. To print the result, use printf("%d\n", 'M' + 25);

2.1.3.  Real numbers

Real numbers are defined with float or double. The difference is the precision used to represent the numbers internally. There are infinitely many real numbers, but the computer represents them with a finite number of bits. The more bits used, the better the precision. Real numbers declared as double typically occupy twice the size of those declared as float. As in the case of the integers, the size of these representations depends on the platform.

Some platforms offer an extra type of real numbers with a size larger than double, defined as long double. Typical sizes for the float, double and long double data types are 4, 8 and 12 bytes respectively. Some examples of real number definitions follow.

float a = 3.5;
double b = -5.4e-12;
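long double c = 3.54e320L;  /* the L suffix makes the constant a long double; without it the constant is a double and is out of range */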

Suggestion

Add to the program used in the previous sections floating point numbers. Try to define very large and very small numbers to see the representation capabilities of each of the three types. Compile to check if the definitions are correct.

2.1.4.  Arrays

C arrays are similar to arrays in Java. In a definition, the size surrounded by square brackets follows the array name. As in Java, array elements begin with index zero. Some examples of array definitions are the following:

#define SIZE_TABLE 100
#define SIZE_SHORT 5
#define SIZE_LONG 3
#define SIZE_NAME 10

int table[SIZE_TABLE];
short st[SIZE_SHORT] = { 1, 2, 3, 4, 5 };
long lt[SIZE_LONG] = { 20, 30, 40 };
char name[SIZE_NAME];

Array elements are accessed with the array name followed by the index in square brackets.

One of the differences between C and Java is that C performs no array bounds checking. If an array is accessed with an incorrect index in a Java program, an exception of type ArrayIndexOutOfBoundsException is produced. This check is never done in C (unless it is explicitly written in the program). If an array is accessed with an incorrect index, the data in an incorrect memory area is manipulated, and the program proceeds with the execution.

After this incorrect access, two things may happen. The first one is that the accessed memory location is outside the limits of the program. In this case the execution terminates abruptly and the command interpreter shows the message segmentation fault. The second possibility is that a memory location still inside the program data area is accessed and the program keeps executing. This situation will likely produce an error with symptoms that are difficult to relate to the incorrect access.

Multiple dimension arrays

C allows the definition of multiple dimension arrays by writing the sizes one after another, each surrounded by square brackets. The access is done by providing as many indexes as required, each of them surrounded by square brackets. As in the case of one-dimensional arrays, C performs no check on the indexes when accessing an element. Some examples of multiple dimension array definitions follow.

#define MATRIX_A 100
#define MATRIX_B 30
#define COMMON_SIZE 10

int matrix[MATRIX_A][MATRIX_B];
long squarematrix[COMMON_SIZE][COMMON_SIZE];
char soup[COMMON_SIZE][COMMON_SIZE];

Suggestion

Add to the previous program definitions and manipulations of arrays of the different basic data types. Check that they are syntactically correct with the compiler.

2.1.5.  Size of the basic data types

The size of the basic data types in C may vary from one platform to another. This language feature has been highly criticized because it may translate into compatibility problems (an application behaves differently when executed on different platforms).

As an example, the following table includes the sizes of the data types in the Linux/Intel i686 platform.

Table 2.1.  Size of C data types in the Linux/Intel i686 platform

Type                                              Size (bytes)
char, unsigned char                               1
short int, unsigned short int                     2
int, unsigned int, long int, unsigned long int    4
float                                             4
double                                            8
long double                                       12