UC3M

Telematic/Audiovisual Syst./Communication Syst. Engineering

Systems Architecture

September 2017 - January 2018

2.7.6.  Insertion sort and dicothomic search

Work Plan

SAUCEM S.L. is preparing an application to compete in the mobile world with goodreads y shelfari, social networks for book-lovers.

This kind of applications assign to users and authors an unique identifier, an integer. Besides, they identify the books by their ISBN, a 13-digits integer.

SAUCEM S.L. wants us to implement the search of a book by ISBN. However, the internal data structure of the application is not already defined, so SAUCEM S.L. wish a previous version: the search over an integer array. Besides, SAUCEM S.L. requests that we sort the array before performing the search.

Thus. first of all, we must sort the given array, and, then, search for a given element within that array.

The following steps will guide you towards your goal:

  1. Define in the file insertionSortandSearch.c, a size. In order to do so, use the #define directive, writing #define SIZE 5 before (and outside) of your main. Declare in your main an array of type int of size SIZE.

  2. We are going to use the function void random_filling(int *array, int size, int max, int min); to fill the array with random values. Download and include the file random_filling.h in your code. Besides, you must download the library random_filling.o for 32 or 64 bits random_filling_64.o.

    The function void random_filling(int *array, int nmemb, int max, int min) receives as input parameters the array, the number of elements of the array and its maximum and minimum values.

  3. Implement the function void print_array_integers(int *array, int nmemb) that prints an array of integers of size nmemb. Use this function in your main to check the filling of the array.

  4. As SAUCEM S.L. asks us to sort before searching we will implement a sorting algorithm. One of the easier ones: the insertion sort algorithm.

    This gif, taken from Wikipedia, shows the behaviour of the algorithm.

    The algorithm goes through the array from i equal to 1 to nmemb-1.

    It stores the element of the i position in a temporal variable tmp, leaving a hole.

    It compares all the elements from the position j=i-1 to 0 with the value stored in tmp.

    If tmp value is lower than array[j], the algorithm moves the hole to the j position, i.e., it copies array[j] in array[j+1].

    If tmp is greater than or equal to array[j], the algorithm copies tmp to the hole.

    Implement the algorithm in the function void insertion_sort(int *array, int nmemb).

    The algorithm must print messages (in this example, SIZE is equal to 3):

    $ ./insertSortandSearch 
    Filling with 34 to 3456
    
    Before sorting:
    Array = 580 2106 776 
    **********
     i=1
    Array = 580 2106 776 
    Ordering array[1]=2106
    array[0+1]=array[1]
    Array = 580 2106 776 
    **********
     i=2
    Array = 580 2106 776 
    Ordering array[2]=776
    j=1
    array[1+1] =array[1]
    Array = 580 2106 2106 
    array[0+1]=array[2]
    Array = 580 776 2106 
    
    After sorting:
    Array = 580 776 2106
  5. Delete the line where you have defined the value of SIZE.

    Now, compile the program in this way: gcc -Wall -g -o programa insertionSortandSearch.c random_filling.o -DSIZE=15 and execute it.

    Compile and execute the program for different values of SIZE, checking its behaviour.

  6. Now, we want to get rid of the comments and messages that the program prints. Instead of deleting or commenting them, we want to keep them for a future, just in case, but we do not want them to be printed everytime we execute the program.

    In the insertion_sort function, before each printf, set of printfs or call to the print_array_integers function write #ifdef INFO_SORT. Just after, #endif

    For example:

    #ifdef INFO_SORT
    printf ("**********\n i=%i\n",i);
    print_array_integers(array,size);
    #endif

    Compile with gcc -Wall -g -o programa insertSortandSearch.c random_filling.o -DINFO_SORT -DSIZE=5 Execute and check that the program still prints your messages.

    Compile again with gcc -Wall -g -o programa insertSortandSearch.c random_filling.o -DSIZE=10 Execute and check that the program does not print any of them

  7. Now, we want to search for a specific element within the sorted array. Besides, we also want to know how many iterations are needed to find such element.

    In this section, we will implement a brute force search algorithm. This algorithm goes through all the positions of the array until finding the desired number. When the algorithm finds the number, it stops and retrieves the index where the number is stored. The algorithm retrieves a -1 if it couldn't find the number.

    Implement int search_brute_force(int busco,int *array,int nmemb, int *numb_itero). The function receives as input parameters the number to find, the array, the number of elements of the array and a pointer where to store the number of iterations. It returns the index where it found the number or -1 if not.

    The function does not assume a pre-sorted array.

    To check its behaviour, define an integer in your main and call the search_brute_force and print where it found the number and the number of iterations needed to find it.

    $ ./insertSortandSearch 
    Filling with 2 to 2000
    Before sorting:
    Array = 321 1212 435 404 178 1605 1330 465 1445 893 1046 743 1271 974 97 169 729 566 1717 787 
    
    After sorting:
    Array = 97 169 178 321 404 435 465 566 729 743 787 893 974 1046 1212 1271 1330 1445 1605 1717 
    We want to look for a number: 890
    With Brute Force:
    No Found. Number of iterations 20
    
    $ ./insertSortandSearch 
    Filling with 2 to 2000
    Before sorting:
    Array = 321 1212 435 404 178 1605 1330 465 1445 893 1046 743 1271 974 97 169 729 566 1717 787 
    
    After sorting:
    Array = 97 169 178 321 404 435 465 566 729 743 787 893 974 1046 1212 1271 1330 1445 1605 1717 
    We want to look for a number: 1271
    With Brute Force:
    Found in index 15. Number of iterations 16
  8. Now, you must implement the binary search or dicothomic search algorithm.

    This algorithm assumes a pre-sorted array.

    The algorithm calculates the middle of the array. Then, it checks if the desired value is of that index. If not, checks if the desired value will in [initial, middle] or [middle, final]. In both cases, it updates the values of initial and final to keep up searching (in the first or in the second part of the array). This is repeated until the value is found.

    Implement the function int search_dichotomic_iterative(int busco,int *array,int ind_min,int ind_max, int *numb_itero).

    To better understand the algorithms, you should print the intermediate steps. Do not forget to use #ifdef INFO_SEARCH and #endif within the code and compile with -DINFO_SEARCH.

    The output with messages should be similar to this one:

    $ ./insertSortSearch 
    Filling with 2 to 8900
    Before sorting:
    Array = 1424 5390 1931 1795 788 
    
    After sorting:
    Array = 788 1424 1795 1931 5390 
    We want to look for a number: 1931
    
    ***************We are in Brute Force Algorithm***********
    In array[0]=788
    In array[1]=1424
    In array[2]=1795
    Found !! in i=3
    Number of iterations: 4
    
    ******************We are in Dichotomic Search Algorithm**********
    Searching between indexes array[0] = 788 and array[4] = 5390
    Comparing with array[2]=1795
    Searching between indexes array[3] = 1931 and array[4] = 5390
    Comparing with array[3]=1931
    Found !! in i=3
    Number of iterations: 2
    
    ******************Results*******************
    With Brute Force:
    Found in index 3. Number of iterations 4
    With Dichotomic Search:
    Found in index 3. Number of iterations 2
  9. We want to know what happens when we have huge size arrays, e.g. of 200000 or 300000 elements.

    As printing all the elements will consume a lot of time, use #ifdef IMPRESION in your code and compile with -DIMPRESION if you want them to be printed.

    First, we notice that filling the array takes a lot of time. If this is the case, compile with -DSIZE=20000

    $ ./insertSortSearch 
    Filling with 2 to 30000
    
    We want to look for a number: 5678
    
    With Brute Force:
    Found in index 3738. Number of iterations 3739
    
    With Dichotomic Search:
    Found in index 3739. Number of iterations 11
    
    $ ./insertSortDichSearch 
    Please, introduce a min: 2
    
    Please, introduce a max: 30000
    Filling with 2 to 30000
    
    We want to look for a number.
    Please, introduce a number: 5679
    
    With Brute Force:
    No Found. Number of iterations 20000
    
    With Dichotomic Search:
    No Found. Number of iterations 14

    You can notice a much lower number of iterations of the binary search algorithm compared with the brute force one. This is due to its lower complexity.

    The computational complexity measures the performance of an algorithm in the best, average and worse case.

    Regarding the sorting algorithms, we count the number of operations needed in the best, average and worse case in order to obtain a sorted array.

    In a file called cuestion.txt calculate your estimation of the complexity of the 3 algorithms you implemented in this exercise.

  10. ADDITIONAL ACTIVITY: implement the function void random_filling(int *array, int nmemb, int max, int min) that receives an array, its number of elements and the values maximum and minum.

    The function should use the int rand(void); function from the stdlib.h library. int rand(void); returns a random number between 0 and RAND_MAX.

    You should transform the random number from being in [0, RAND_MAX] to being in [min, max].

    Use the formula: min + rand()/RAND_MAX * (max-min). Be careful, because you can lose precision (or even the number) if the adecuate castings are not used.

    Use your function void random_filling instead of the given one and check its behaviour.