Assignment 4

1. External and internal searching

ANSWER:

The searching problem falls naturally into two cases. If there are many records, perhaps each one quite large, then it will be necessary to store the records in files on disk or tape, external to the computer memory. This case is called external searching. In the other case, the records to be searched are stored entirely within the computer memory. This case is called internal searching. We will consider only internal searching this term.

We shall be concerned only with contiguous lists in this chapter. Hence we shall always be searching in a contiguous list that we generally call list. What we have called records will be C structures, and they will be the entries in the list. The C type of these records we shall name ListEntry, as we have for lists. One of the fields of each list entry will be denoted key and have a type called KeyType.

Examples of KeyType: float, int, char
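The declarations described above can be sketched as follows. This is only an illustration: the capacity MAXLIST, the field names count and entry, and the function SequentialSearch are assumptions made for the example and are not fixed by the text.

```c
#define MAXLIST 100             /* assumed capacity, not given in the text */

typedef int KeyType;            /* could equally be float or char */

typedef struct {
    KeyType key;                /* the field used for searching */
    /* the other fields of the record would go here */
} ListEntry;

typedef struct {
    int count;                  /* number of entries currently in the list */
    ListEntry entry[MAXLIST];   /* contiguous storage in memory */
} List;

/* Internal search in its simplest form: look at every entry in turn.
 * Returns the position of the first entry whose key equals target,
 * or -1 if no entry has that key. */
int SequentialSearch(const List *list, KeyType target)
{
    int i;

    for (i = 0; i < list->count; i++)
        if (list->entry[i].key == target)
            return i;
    return -1;
}
```

Because the whole list lives in an array in memory, this is an internal search; an external search would have to read the records from a file instead.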

2. Binary search

ANSWER:

Sequential search is easy to write and efficient for short lists, but a disaster for long ones. One of the best methods for a list with keys in order is to compare the target key with the one in the center of the list and then restrict our attention to only the first or the second half of the list, depending on whether the target key comes before or after the central one.

The method we are discussing is called binary search. This approach requires that the entries in the list be of a scalar or other type that can be regarded as having an order, and that the list already be completely in order.

Since binary search requires jumping back and forth from one end of the list to the middle, it requires an implementation of the list in which the random access is rapid, as it is within an array, but not within a simply linked list. Hence we shall study binary search only for the array implementation.

Definition:

An ordered list is a list in which each entry contains a key, such that the keys are in order. That is, if entry i comes before entry j in the list, then the key of entry i is less than or equal to the key of entry j.

Algorithm development:

Our binary search algorithm will use two indices, top and bottom, to enclose the part of the list in which we are looking for the target key.

The target key, provided it is present in the list, will be found between the indices bottom and top, inclusive. We establish the initial correctness of this statement by setting bottom to 0 and top to list.count-1, where list.count is the number of entries in the list.

To do binary search, we first calculate the index middle halfway between bottom and top by calculating middle=(top+bottom)/2. We then compare the target against the key at position middle, and change one of the indices top or bottom to reduce the list to either its bottom or top half. Next we note that the binary-search process should terminate when top==bottom, that is, when the remaining part of the list contains at most one item, provided that we have not terminated earlier by finding the target.
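The development above can be sketched in C as follows. This is one possible version, in which the loop always runs until top and bottom meet and equality is checked only once at the end. The type declarations repeat the assumed layout of List (MAXLIST, count, entry), and the return value -1 for "not found" is also an assumption.

```c
#define MAXLIST 100             /* assumed capacity */

typedef int KeyType;
typedef struct { KeyType key; } ListEntry;
typedef struct { int count; ListEntry entry[MAXLIST]; } List;

/* Search an ordered list for target.
 * Returns the position of an entry with that key, or -1 if none exists. */
int BinarySearch(const List *list, KeyType target)
{
    int bottom = 0;
    int top = list->count - 1;
    int middle;

    /* Invariant: if target is present, it lies between bottom and top. */
    while (top > bottom) {
        /* Same value as (top+bottom)/2, written to avoid integer overflow. */
        middle = bottom + (top - bottom) / 2;
        if (target > list->entry[middle].key)
            bottom = middle + 1;    /* keep only the top half */
        else
            top = middle;           /* keep the bottom half, middle included */
    }

    /* For an empty list top is -1 and bottom is 0, so the test fails. */
    if (top == bottom && list->entry[top].key == target)
        return top;
    return -1;
}
```

Each pass through the loop halves the part of the list still under consideration, so a list of n entries needs about log2(n) key comparisons.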

3. Binary sort tree

ANSWER:

A binary sort tree is either an empty tree or a binary tree in which every key in the left subtree (if it is not empty) is less than the key of the root, and every key in the right subtree (if it is not empty) is greater than the key of the root. The left subtree and the right subtree are themselves binary sort trees. Two examples are shown below.
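The definition can be made concrete with a small C sketch. The names TreeNode, InsertNode and TreeSearch are chosen for this illustration only, and ignoring duplicate keys is an assumption that follows from the strict "less than" and "greater than" in the definition.

```c
#include <stdlib.h>

typedef struct TreeNode {
    int key;
    struct TreeNode *left;      /* all keys here are less than key */
    struct TreeNode *right;     /* all keys here are greater than key */
} TreeNode;

/* Insert key so that the tree remains a binary sort tree.
 * Returns the (possibly new) root of the subtree. Duplicates are ignored. */
TreeNode *InsertNode(TreeNode *root, int key)
{
    if (root == NULL) {
        TreeNode *node = malloc(sizeof *node);
        if (node == NULL)
            return NULL;        /* out of memory: subtree stays empty */
        node->key = key;
        node->left = node->right = NULL;
        return node;
    }
    if (key < root->key)
        root->left = InsertNode(root->left, key);
    else if (key > root->key)
        root->right = InsertNode(root->right, key);
    return root;
}

/* Return the node holding key, or NULL if the key is not in the tree. */
TreeNode *TreeSearch(TreeNode *root, int key)
{
    while (root != NULL && root->key != key)
        root = (key < root->key) ? root->left : root->right;
    return root;
}
```

Searching follows a single path from the root, going left or right at each node, which is the tree counterpart of halving the list in binary search.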

4. Insertion Sort:

ANSWER:

The list can be divided into two parts. The first part is an ordered list and the second part is a disordered list. Initially, the first part has one entry and the second part has n-1 entries. At each step, the first entry of the second part is removed and inserted into its proper place in the first part, until the second part is empty.

For ordered lists, we shall often use three new operations that have no counterparts for other lists, since they use keys rather than positions to locate the entry. One operation retrieves an entry with a specified key from the ordered list. The second operation inserts a new entry into an ordered list by using the key in the new entry to determine where in the list to insert it. The third operation sorts a list which may not be ordered but which has keys in its entries, so that the resulting list is ordered.
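A minimal sketch of insertion sort on a contiguous list, assuming for simplicity that each entry is just an int key stored in a plain array. The variable name fu (first unsorted) is an assumption made for this example.

```c
/* Sort a[0..n-1] into nondecreasing order by insertion sort. */
void InsertionSort(int a[], int n)
{
    int fu;                     /* index of the first unsorted entry */

    /* a[0..fu-1] is the ordered part; it starts with one entry. */
    for (fu = 1; fu < n; fu++) {
        int current = a[fu];    /* entry to insert into the ordered part */
        int place = fu;

        /* Shift larger entries up one position at a time. */
        while (place > 0 && a[place - 1] > current) {
            a[place] = a[place - 1];
            place--;
        }
        a[place] = current;
    }
}
```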

5. Selection Sort:

ANSWER:

Insertion sort has one major disadvantage. Even after most entries have been sorted properly into the first part of the list, the insertion of a later entry may require that many of them be moved. All the moves made by insertion sort are moves of only one position at a time. Thus to move an entry 20 positions up the list requires 20 separate moves. If the entries are small, perhaps a key alone, or if the entries are in linked storage, then the many moves may not require excessive time. But if the entries are very large, such as personnel files or student transcripts (records containing hundreds of components), and these records must be kept in contiguous storage, then it would be far more efficient if an entry being moved could immediately be placed in its final position. Our next sorting method accomplishes this goal.

Selection Sort: The list can be divided into two parts. One is a sorted part and the other is an unsorted part. Initially, the sorted part is an empty list and the unsorted part contains all the entries of the list. At each step, one entry is selected from the unsorted part (for example, the one with the smallest key) and swapped directly into its final position, so that the sorted part grows by one entry.
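A minimal sketch of selection sort, again assuming that each entry is an int key in a plain array. This version selects the smallest remaining key and grows the sorted part from the front; selecting the largest key and growing the sorted part from the back works equally well.

```c
/* Sort a[0..n-1] into nondecreasing order by selection sort. */
void SelectionSort(int a[], int n)
{
    int sorted;                 /* a[0..sorted-1] is the sorted part */

    for (sorted = 0; sorted < n - 1; sorted++) {
        int min = sorted;       /* position of the smallest unsorted key */
        int i, tmp;

        for (i = sorted + 1; i < n; i++)
            if (a[i] < a[min])
                min = i;

        /* One swap places the selected entry in its final position. */
        tmp = a[sorted];
        a[sorted] = a[min];
        a[min] = tmp;
    }
}
```

Each pass moves at most two entries, so the total number of moves grows only linearly with n, although the number of comparisons is still roughly n*n/2.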

6. Divide-and-Conquer Sorting

ANSWER:

It is much easier to sort short lists than long ones. If the number of entries to be sorted doubles, then the work more than doubles (with insertion or selection sort it quadruples, roughly). Hence if we can find a way to divide the list into two roughly equal-sized lists and sort them separately, then we will save work. If, for example, you were working in a library and were given a thousand index cards to put in alphabetical order, then a good way would be to distribute them into piles according to the first letter and sort the piles separately. Here again we have an application of the idea of dividing a problem into smaller but similar subproblems, that is, of divide-and-conquer.

First, we note that comparisons by computer are usually two-way branches, so we shall divide the entries to sort into two lists at each stage of the process. What method, you may ask, should we use to sort the reduced lists? Since we have (temporarily) forgotten all the other methods we know, let us simply use the same method, divide-and-conquer, again, repeatedly subdividing the list. But we won’t keep going forever: Sorting a list with only one entry doesn’t take any work, even if we know no formal sorting methods.

7. Mergesort

ANSWER:

In the first method, we simply chop the list into two sublists of sizes as nearly equal as possible and then sort them separately. Afterward, we carefully merge the two sorted sublists into a single sorted list. Hence this method is called mergesort.
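A minimal sketch of mergesort on an int array. Using a temporary array for merging, and the helper names Merge and SortRange, are assumptions of this example; a linked-list version would merge without extra storage.

```c
#include <stdlib.h>
#include <string.h>

/* Merge the sorted runs a[low..mid] and a[mid+1..high] using tmp. */
static void Merge(int a[], int tmp[], int low, int mid, int high)
{
    int i = low, j = mid + 1, k = low;

    while (i <= mid && j <= high)
        tmp[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
    while (i <= mid)
        tmp[k++] = a[i++];
    while (j <= high)
        tmp[k++] = a[j++];
    memcpy(a + low, tmp + low, (size_t)(high - low + 1) * sizeof(int));
}

/* Chop a[low..high] in half, sort each half, then merge. */
static void SortRange(int a[], int tmp[], int low, int high)
{
    int mid;

    if (low >= high)
        return;                 /* zero or one entry: already sorted */
    mid = low + (high - low) / 2;
    SortRange(a, tmp, low, mid);
    SortRange(a, tmp, mid + 1, high);
    Merge(a, tmp, low, mid, high);
}

/* Sort a[0..n-1]. Returns 1 on success, 0 if memory could not be allocated. */
int MergeSort(int a[], int n)
{
    int *tmp;

    if (n < 2)
        return 1;
    tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL)
        return 0;
    SortRange(a, tmp, 0, n - 1);
    free(tmp);
    return 1;
}
```

Here dividing the list is trivial and all the real work happens in the merge step.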

8. Quicksort

ANSWER:

The second method does more work in the first step of partitioning the list into two sublists, and the final step of combining the sublists then becomes trivial. This method was invented and christened quicksort by C. A. R. Hoare. To partition the list, we first choose some key from the list for which, we hope, about half the keys will come before and half after. We shall use the name pivot for this selected key. We next partition the entries so that all those with keys less than the pivot come in one sublist, and all those with greater keys come in another. Finally, we sort the two reduced lists separately and put the sublists together, and the whole list will be in order.
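A minimal sketch of quicksort on an int array. Taking the middle entry as the pivot is only one possible choice and is an assumption of this example, as are the helper names Swap and Partition and the variable lastsmall.

```c
static void Swap(int *x, int *y)
{
    int tmp = *x;
    *x = *y;
    *y = tmp;
}

/* Partition a[low..high] around a pivot and return the pivot's final
 * position. Afterwards, entries before that position are less than the
 * pivot and entries after it are greater than or equal to the pivot. */
static int Partition(int a[], int low, int high)
{
    int i, pivot;
    int lastsmall = low;        /* last position known to hold a key < pivot */

    Swap(&a[low], &a[low + (high - low) / 2]);  /* middle entry as pivot */
    pivot = a[low];
    for (i = low + 1; i <= high; i++)
        if (a[i] < pivot)
            Swap(&a[++lastsmall], &a[i]);
    Swap(&a[low], &a[lastsmall]);               /* pivot into final place */
    return lastsmall;
}

/* Sort a[low..high] into nondecreasing order. */
void QuickSort(int a[], int low, int high)
{
    if (low < high) {
        int pivotpos = Partition(a, low, high);
        QuickSort(a, low, pivotpos - 1);
        QuickSort(a, pivotpos + 1, high);
    }
}
```

A whole array of n entries is sorted with QuickSort(a, 0, n - 1). Because the sublists are sorted in place on either side of the pivot, putting them together requires no further work.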

9. Heapsort

ANSWER:

Heapsort proceeds in two phases. First, we must arrange the entries in the list so that they satisfy the requirement for a heap (analogous to organizing a corporate hierarchy). Second, we repeatedly remove the top of the heap and promote another entry to take its place. For this second phase, we recall that the root of the tree (which is the first entry of the list as well as the top of the heap) has the largest key. This key belongs at the end of the list. We therefore move the first entry to the last position, replacing an entry that we call current.

We then decrease a counter lu (last unsorted) that keeps track of the size of the unsorted part of the list, thereby excluding the largest entry from further sorting. The entry current that has been removed from the last position, however, may not belong on the top of the heap, and therefore we must insert current into the proper position to restore the heap property before continuing to loop in the same way.
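The two phases can be sketched as follows for an int array stored with the root at index 0, so that the children of position k are at 2k+1 and 2k+2. The names lu and current come from the description above; the helper InsertHeap and its parameters are assumptions of this example.

```c
/* Insert current into the heap a[low..high], where position low is treated
 * as a hole and the entries below it already satisfy the heap property. */
static void InsertHeap(int a[], int current, int low, int high)
{
    int large = 2 * low + 1;    /* left child of low */

    while (large <= high) {
        if (large < high && a[large] < a[large + 1])
            large++;            /* the right child has the larger key */
        if (current >= a[large])
            break;              /* current belongs at position low */
        a[low] = a[large];      /* promote the larger child */
        low = large;
        large = 2 * low + 1;
    }
    a[low] = current;
}

/* Sort a[0..n-1] into nondecreasing order by heapsort. */
void HeapSort(int a[], int n)
{
    int low, lu;

    /* Phase 1: build the heap, starting from the last entry with a child. */
    for (low = n / 2 - 1; low >= 0; low--)
        InsertHeap(a, a[low], low, n - 1);

    /* Phase 2: move the largest key to the end, then restore the heap. */
    for (lu = n - 1; lu >= 1; lu--) {
        int current = a[lu];    /* entry displaced from the last position */
        a[lu] = a[0];           /* largest key goes to its final place */
        InsertHeap(a, current, 0, lu - 1);
    }
}
```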

10. Radix Sort

ANSWER:

The idea is to consider the key one part at a time and to divide the entries, not into two sublists, but into as many sublists as there are possibilities for the given part from the key. If our keys, for example, are words or other alphabetic strings, then we divide the list into 26 sublists at each stage. That is, we set up a table of 26 lists and distribute the entries into the lists according to one of the characters in the key.

A person sorting words by this method might first distribute the words into 26 lists according to the initial letter, then divide each of these sublists into further sublists according to the second letter, and so on. The following idea eliminates this multiplicity of sublists: Partition the items into the table of sublists first by the least significant position, not the most significant. After this first partition, the sublists from the table are put back together as a single list, in the order given by the character in the least significant position. The list is then partitioned into the table according to the second least significant position and recombined as one list. When, after repetition of these steps, the list has been partitioned by the most significant place and recombined, it will be completely sorted.
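A minimal sketch of this least-significant-position-first radix sort, under several assumptions that are not in the text: every key is exactly KEYLEN lowercase letters, and the 26 sublists are represented by counting how many keys fall in each one and copying the keys into a temporary array, rather than by building 26 separate lists. The distribution keeps keys with the same letter in their previous order, which is what makes the method work.

```c
#include <stdlib.h>
#include <string.h>

#define KEYLEN   3              /* assumed fixed key length */
#define NBUCKETS 26             /* one sublist per lowercase letter */

typedef char Key[KEYLEN + 1];   /* KEYLEN letters plus the terminating '\0' */

/* Sort keys[0..n-1] alphabetically.
 * Returns 1 on success, 0 if memory could not be allocated. */
int RadixSort(Key keys[], int n)
{
    Key *tmp;
    int pos, i;

    if (n < 2)
        return 1;
    tmp = malloc((size_t)n * sizeof *tmp);
    if (tmp == NULL)
        return 0;

    /* Work from the least significant position to the most significant. */
    for (pos = KEYLEN - 1; pos >= 0; pos--) {
        int start[NBUCKETS + 1] = { 0 };

        /* Count the size of each sublist. */
        for (i = 0; i < n; i++)
            start[keys[i][pos] - 'a' + 1]++;
        /* Turn the counts into the starting index of each sublist. */
        for (i = 0; i < NBUCKETS; i++)
            start[i + 1] += start[i];
        /* Distribute the keys, preserving order within each sublist. */
        for (i = 0; i < n; i++)
            memcpy(tmp[start[keys[i][pos] - 'a']++], keys[i], sizeof(Key));
        /* Recombine the sublists into a single list. */
        memcpy(keys, tmp, (size_t)n * sizeof(Key));
    }
    free(tmp);
    return 1;
}
```

No keys are ever compared with each other; the work is proportional to the number of keys times the key length.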

11. Hashing

ANSWER:

Sorting methods such as insertion sort or selection sort rely on two basic operations: movements and comparisons. They need these because there is no fixed relationship between a key and the position of its record. Hashing is a different approach to storing and retrieving records. In the simplest case, the key can be used directly as an index, as in array indexing, and then neither movements nor comparisons are needed. We can continue to exploit table lookup even in situations where the key is no longer an index that can be used directly as in array indexing. What we can do is to set up a one-to-one correspondence between the keys by which we wish to retrieve information and indices that we can use to access an array. The index function that we produce will be somewhat more complicated than those of previous sections, since it may need to convert the key from, say, alphabetic information to an integer, but in principle it can still be done.
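A minimal sketch of such an index function and table for alphabetic keys. Everything specific here is an assumption made for illustration: the table size HASHSIZE, the multiplier 31, the names HashTable, Hash, Insert and Retrieve, and in particular the handling of two keys that map to the same index. In practice the correspondence is rarely exactly one-to-one, so this sketch steps forward to the next free slot (linear probing) when a slot is already taken.

```c
#include <string.h>

#define HASHSIZE 97             /* assumed table size */
#define KEYMAX   16             /* assumed maximum key length, including '\0' */

typedef struct {
    char key[KEYMAX];           /* an empty string marks an unused slot */
    int value;
} HashEntry;

typedef struct {
    HashEntry slot[HASHSIZE];
} HashTable;

/* Mark every slot as unused. */
void CreateTable(HashTable *table)
{
    memset(table, 0, sizeof *table);
}

/* Index function: convert an alphabetic key to an index 0..HASHSIZE-1. */
int Hash(const char *key)
{
    unsigned h = 0;

    while (*key != '\0')
        h = h * 31 + (unsigned char)*key++;
    return (int)(h % HASHSIZE);
}

/* Store value under key. Returns 1 on success, 0 if the key is empty,
 * too long, or the table is full. */
int Insert(HashTable *table, const char *key, int value)
{
    int i, probe;

    if (key[0] == '\0' || strlen(key) >= KEYMAX)
        return 0;
    probe = Hash(key);
    for (i = 0; i < HASHSIZE; i++) {
        HashEntry *e = &table->slot[probe];
        if (e->key[0] == '\0' || strcmp(e->key, key) == 0) {
            strcpy(e->key, key);
            e->value = value;
            return 1;
        }
        probe = (probe + 1) % HASHSIZE;     /* slot taken: try the next one */
    }
    return 0;
}

/* Look up key. Returns 1 and sets *value if found, 0 otherwise. */
int Retrieve(const HashTable *table, const char *key, int *value)
{
    int i, probe = Hash(key);

    for (i = 0; i < HASHSIZE; i++) {
        const HashEntry *e = &table->slot[probe];
        if (e->key[0] == '\0')
            return 0;                       /* unused slot: key is absent */
        if (strcmp(e->key, key) == 0) {
            *value = e->value;
            return 1;
        }
        probe = (probe + 1) % HASHSIZE;
    }
    return 0;
}
```

When no two keys share an index, a retrieval goes straight from the key to the array slot without scanning the other records.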