Notes 4

The order of algorithms

Much of a computers time is spent sorting and searching through data. Enormous amounts of research effort have been spent making this more efficient. We will discuss some sorting and searching algorithms and how to compare them.

The order of an algorithm is a measure of how efficient it is. It is not an equation that can be used to predict the running time of a program that implements this algorithm. It is similar to calculating limits in calculus. Only the highest order terms count.

A single statement is of order 1. This is written as O(1). This is known as 'big O' notation. This means that the time to execute this statement is constant. It may be a different constant from statement to statement and the actual time to run it will be different. But the time to run it is bounded by a constant and doesn't depend on the amount of data being processed.

A for loop is a different matter. For example,

for(int i=0; i<N; i++)
	x += i;   // this statement is O(1)
The loop runs N times and the body of it is O(1). So the order of the loop is O(N). This means that the time to execute the loop is proportional to the number of items. This tells us little about the actual number of instructions run or the time it takes. It tells is that the time increases linearly as N increases.

Now lets look at a nested loop.


for(int i=0; i<N; i++)
	for(int j=0; j<N; j++)
		x = j*i;   // this statement is O(1)
The inner loop runs N times for each execution of the outer loop. So it runs N*N times. Therefore the order of a nested loop is O(N2).

Now lets looks at part of a sort routine.

for(int i=N; i>0; i--)
	for(int j=0; j<i; j++)
		if(a[j] > a[j+1]) swap(a[j],a[j+1]);  // This is O(1)
The inner loop doesn't run to N, it runs to I. So it executed N-1,N-2,N-3,...,3,2,1 times. The sum of these is the total time the inner loop runs. This is the sum of an arithmetic series so the sum is N(N-1)/2. So the algorithm is of order O((N2-N)/2). But like limits, we can throw out the -N term because it is small compared to the N2. And dividing by 2 doesn't matter either. So the algorithm is of order O(N2). The actual running time is slightly faster than the earlier nested loop (because of the -N term) but we are interested in upper bounds. The order of an algorithm is not used to predict running time, but to compare the efficiencies of different algorithms to solve the same problem.

Sorting

Much of a computers time is spent sorting and searching through data. Enormous amounts of research effort have been spent making this more efficient. We will look at a couple of simple sorting algorithms. We will see others later in the term.

Selection Sort

This works by finding the smallest item in the unsorted part of the array and swapping it with the front of the unsorted part. It moves one item and continues. The order is ((N2 - N)/2). Reverse order is the worst case.

   //-----------------------------------------------------------------
   //  Sorts the specified array of integers using the selection
   //  sort algorithm.
   // from the chapter 6 examples
   //-----------------------------------------------------------------
   public static void selectionSort (int[] numbers)
   {
      int min, temp;

      for (int index = 0; index < numbers.length-1; index++)
      {
         min = index;
         for (int scan = index+1; scan < numbers.length; scan++)
            if (numbers[scan] < numbers[min])
               min = scan;

         // Swap the values
         temp = numbers[min];
         numbers[min] = numbers[index];
         numbers[index] = temp;
      }
   }

Insertion Sort

This one works by putting each element in the correct place. It uses a temp variable to hold the misplaced item. Order is (1/4(N2 + N -2)). Could be changed to include a binary search to locate insertion point. This makes a noted improvement in speed.

   //-----------------------------------------------------------------
   //  Sorts the specified array of integers using the insertion
   //  sort algorithm.
   // from the chapter 6 examples
   //-----------------------------------------------------------------
   public static void insertionSort (int[] numbers)
   {
      for (int index = 1; index < numbers.length; index++)
      {
         int key = numbers[index];
         int position = index;

         // shift larger values to the right
         while (position > 0 && numbers[position-1] > key)
         {
            numbers[position] = numbers[position-1];
            position--;
         }
            
         numbers[position] = key;
      }
   } 

Here is a link to some animated sorting algorithms.

Sorting objects in general

We can write sorting code that will sort arrays of objects rather then just numbers. For this to be possible, the class that describe the objects must obey some rules. Each class we want to sort must implement the Comparable interface. An interface is a class with methods that are not defined, only declared. ( more detail here) The Comparable interface has only one method, compareTo(). This method is used by sorting routines to compare two instances of the class to determine their order. The method must return an integer that has a value of 0 if the two objects are equal, a negative number if the executing object is less than the parameter, and a positive number if the executing object is greater than the parameter. Here is an example. Let's make a class that represents buildings. Further, we will decide that they way two buildings are compared is to compare the total square footage of the buildings.

public class Building implements Comparable
{
   private int numFloors;  // number of floors
   private double squareFeet; // total area
   private double perFloor;   // area per floor
   
   public Building(int nfloors,double sqft)
   {
       numFloors = nfloors;
	   perFloor = sqft;
	   squareFeet = (double) numFloors * perFloor;
   } // constructor
   public int compareTo(Object other)
   {
      if(this.squareFeet < ((Building)other).squareFeet)
	     return -1;
	  if(this.squareFeet > ((Building)other).squareFeet)
	     return 1; 
	  return 0;  // same
   } // compareTo
} // class building
We need another class that implements the interface to show the usefulness.

public class House implements Comparable
{
   private double squareFeet; // total area
   
   public House(double sqft)
   {
	   squareFeet = sqft;
   } // constructor
   public int compareTo(Object other)
   {
      if(this.squareFeet < ((House)other).squareFeet)
	     return -1;
	  if(this.squareFeet > ((House)other).squareFeet)
	     return 1; 
	  return 0;  // same
   } // compareTo
} // class house
And here is a driver for the test.

// BuildTest.java       Author: Kent Archie
//
//  test comparable example
//********************************************************************

import Building;
import House;

public class BuildTest
{
   
   //-----------------------------------------------------------------
   //  Creates and displays the Building example
   //-----------------------------------------------------------------
   public static void main (String[] args)
   {
      Building b1 = new Building(10,1234.5);
      Building b2 = new Building(15,1234.5);
      House h1 = new House(1234.5);
      House h2 = new House(6543.2);
      checkIt(b1,b2);
      checkIt(h2,h1);
 //     checkIt(b1,h2);
   } // main
   
   private static void checkIt(Comparable one, Comparable two)
   {
     if(one.compareTo(two) < 0)
         System.out.println("one is smaller");
      else
         if(one.compareTo(two) > 0)
            System.out.println("two is smaller");
         else
            System.out.println("one == two");
   } // checkIt
} // BuildTest
This shows the usefulness of interfaces. We can write one test that uses several different types of objects.

Searching

Here is some code that compares a linear and a binary search.


 * This demonstrates a simple binary search
 * 
 * @author Kent Archie
 * @version 1.0
 */
public class Bsearch
{
  //-----------------------------------------------------------------
   //  Creates and displays the Building example
   //-----------------------------------------------------------------
   public static void main (String[] args)
   {
        int[] data = {1,2,3,4,5,6,7,8,9,10};
        if(args.length < 1) {
            System.out.println("need a target value");
            System.exit(1);
        }
        bsearch(data,Integer.parseInt(args[0]));
        lsearch(data,Integer.parseInt(args[0]));
   } // main

   private static void bsearch(int [] d, int target)
   {
        int mid=0,bottom=0,top=d.length-1,numchecks=0;
        System.out.println("Starting binary search for " + target);
        while(bottom < top) {
            mid = (bottom+top)/2;
            if(d[mid] < target)
                bottom = mid + 1;
            else
                top = mid;
            numchecks++;
        } // while
        if(top < bottom) {  // target isn't in here
            System.out.println("target not found after " + numchecks + " checks");
            return;
        }
        if(d[bottom] == target) {  // found it
            System.out.println("Found " + target + " at " + bottom + 
                               " after " + (numchecks+1) + " checks");
            return;
        }
        System.out.println("target not found after " + numchecks + " checks");
   } // bsearch

    // this does a simple linear search of the data array.
   private static void lsearch(int[] d, int target)
   {
        int numchecks = 0;
        System.out.println("Starting linear search for " + target);
        for(int i=0; i< d.length; i++) {
            if(d[i] == target)
                System.out.println("Found " + target + " at " + i + 
                                    " after " + numchecks + " checks");
            else
                numchecks++;
        } // for
   } // lsearch
} // class Bsearch

Starting binary search for 5
Found 5 at 4 after 4 checks
Starting linear search for 5
Found 5 at 4 after 4 checks

Starting binary search for 8
Found 8 at 7 after 4 checks
Starting linear search for 8
Found 8 at 7 after 7 checks

Starting binary search for 10
Found 10 at 9 after 4 checks
Starting linear search for 10
Found 10 at 9 after 9 checks

Starting binary search for 1
Found 1 at 0 after 5 checks
Starting linear search for 1
Found 1 at 0 after 0 checks

The linear search looks though the array until it finds the target, then it prints a message and exits the loop. Notice how the number of checks increases as the further down the array the target is. In this case, we are going to look at three order calculations. We are interested in the best case, the worst case and the average case. The best case is the target is in a[0]. This is O(1). The worst case is the target is in a[N-1]; This is O(N); The average is O((N+1)/2). We don't simplify this to N since we are looking at the three cases.

Now lets look at a binary search. This search technique cuts the space it has to search in half each time. It is completely dependent on the assumption that the array is sorted before the search, Notice that the number of checks is the about the same regardless of the target value. Every time this is called, it searches half the remaining space. So the time to search looks like N. N/2, N/4,...,4,2,1. If we take the log base 2 of these numbers and add them, we get an arithmetic series like, log2, log2N-1, log2N-2, ..., log23, log22, log21. After a little algebra, we get an order of O(log2 + 1). This simplifies to O(log2). So this is pretty fast in comparison to the linear search. However, there is a price. You must either sort the data first, or keep it sorted as it is used. This must be added to calculate the total cost of the algorithm. If the data can be kept sorted, the total cost is still less that a linear search of unordered data. If there are 64 items, the linear search time is about 32 while the binary search time is 6.