CS50: Week 3, Sorting and Searching Algorithms

Since earlier lectures, we’ve been studying about reminiscence, easy methods to retailer information in it, the categories and sizes of knowledge, amongst others issues.

If we have to handle a considerable amount of information, it’s handy to make use of an array to retailer it, we already know easy methods to do it, so this week David Malan teaches easy methods to type and search values in arrays explaining Searching Algorithms and Sorting Algorithms.



Massive O and Massive Omega

To know the distinction between distinct algorithms, we should first have a notion about Massive O notation. In pc science, we will measure the effectivity of an algorithm when it comes to reminiscence consumption and working time. Big O notation is a particular notation that measures working time, it tells you how briskly an algorithm is, or how lengthy an algorithm takes to run given some dimension of enter. It doesn’t measure working time in seconds or milliseconds, as a substitute, it provides us an concept of what number of operations an algorithm performs in relation to the enter it receives.

Massive O describes the higher sure of the variety of operations, or what number of steps an algorithm may take within the worst case till it finishes, whereas Massive Omega notation (massive Ω), in counterpart, describes the decrease sure of the variety of operations of an algorithm, or how few steps it’d soak up one of the best case.

The picture above expresses the working time of three other ways of looking for an individual in a cellphone e-book:

  1. The purple line represents looking out on one web page at a time ranging from the start. It might take a number of time discover somebody as a result of the the looking out turns into proportionally longer to finish because the cellphone e-book grows. If it takes 1 second to verify 10 identify, it’s going to take 2 seconds to verify 20 names, 3 seconds to verify 30 names, 100 seconds to verify 1000 names, and so forth.

  2. The yellow line represents looking out on two pages at a time. It’s double sooner than the primary possibility however nonetheless takes a number of time to finish the search.

  3. The inexperienced line represents the final and most effective possibility, a divide and conquers strategy. Open the cellphone e-book within the center, if the goal identify shouldn’t be there, then protect the half the place the identify alphabetically should seem and discard the opposite half. Then repeat the method with the remaining half till the goal is discovered. It’s a lot sooner than the opposite two choices as a result of the time required to discover a identify goes up linearly whereas the scale of the cellphone e-book grows exponentially. So if it takes 1 second to verify 10 names, it’s going to take 2 seconds to verify 100 names, 3 seconds to verify 1000 names, and so forth.

    Within the chart above, if we zoomed out and altered the items on our axes, we’d see the purple and yellow traces find yourself very shut collectively:

Big O comparision

Since each the purple and yellow traces find yourself being very comparable as n turns into bigger and bigger, we will say they’ve the identical time complexity. We might describe them each as having massive O of n or on the order of n working time.

The inexperienced line, although, is basically completely different in its form, at the same time as n turns into very giant, so it takes massive O of log⁡ n steps 🤯 (later we’ll see why).

The most typical working occasions are:

  • O(n2) – Quadratic time
  • O(n * log n) – Linearithmic time
  • O(n) – Linear time
  • O(log n) – Logarithmic time
  • O(1) – Fixed time




Linear Search

Is the best and intuitive technique for locating a component inside an array. The steps required to carry out Linear Search could be:

  1. Begin from the leftmost component of the array.
  2. Verify if the goal worth matches the worth saved within the array.
  3. If matches, then the algorithm finishes efficiently.
  4. If it doesn’t match, repeat step 2 with the following component of the array.
  5. If the array finishes with no matches, then the goal shouldn’t be current within the array.

We would write the algorithm pseudocode as:

For every component from left to proper
    If quantity is at this place
        Return true
Return false
Enter fullscreen mode

Exit fullscreen mode

Within the subsequent instance we seek for the quantity 7:

checklist = [0, 2, 10, 4, 7, 9]

[0, 2, 10, 4, 7, 9] --> 0 shouldn't be equal to 7
 ^
[0, 2, 10, 4, 7, 9] --> 2 shouldn't be equal to 7
    ^
[0, 2, 10, 4, 7, 9] --> 10 shouldn't be equal to 7
       ^
[0, 2, 10, 4, 7, 9] --> 4 shouldn't be equal to 7
           ^
[0, 2, 10, 4, 7, 9] --> MATCH
              ^

The quantity 7 was discovered after 5 iterations
Enter fullscreen mode

Exit fullscreen mode

With n components, we’ll want to take a look at all n of them within the worst case (e.g. if we had been looking for the quantity 9 within the instance above), so the massive O working time of this algorithm is linear O(n).

The decrease sure, could be Ω(1), since we is perhaps fortunate and discover the quantity we’re in search of on the first place of the array (e.g. if we had been looking for the quantity 0 within the instance above). Ω(1) is also called fixed time, as a result of a relentless variety of steps is required, irrespective of how massive the issue is.

Linear Search is right as a result of it solves the looking out drawback, however it’s not very environment friendly.

Simply in case, that is my implementation of Linear Search in Python:

# Linear Search

def linear_search(n, arr):
    # Use the in operator to verify if the goal worth 
    # is on the checklist
    if n in arr:
        return True
    return False

if __name__ == '__main__':
    array = [53,9,6,23,8,5,4,1,88]
    # Ask the consumer for a goal worth
    quantity = int(enter('Que número buscas?: '))
    end result = linear_search(quantity, array)
    if end result:
        print('El número existe.')
    else:
        print('El número no existe.')
Enter fullscreen mode

Exit fullscreen mode




Binary Search

This technique makes use of an strategy known as divide and conquer, the place we divide the issue in half on every step. This algorithm works solely on sorted arrays. The steps required to carry out Linear Search could be:

  1. Get a component from the center of the array.
  2. If the component matches the goal worth, then the algorithm finishes efficiently.
  3. If the goal worth is smaller than the center component, slender the array to the left half.
  4. If the goal worth is larger than the center component, slender the array to the correct half.
  5. Repeat from step 1 with the narrowed array.
  6. If the array is empty, then the goal shouldn’t be current within the array.

We would write the algorithm pseudocode as:

If array is empty
    Return false
If quantity is on the center place
    Return true
Else if quantity < quantity on the center place
    Search left half
Else if quantity > quantity on the center place
    Search proper half
Enter fullscreen mode

Exit fullscreen mode

Within the subsequent instance we seek for the quantity 6:

checklist = [0, 1, 2, 3, 4, 5, 6]

[0, 1, 2, 3, 4, 5, 6] --> 3 is the center component. 6 is greater so we search on the correct half
          ^
[4, 5, 6] --> 5 is the center component. 6 is greater so we search on the correct half
    ^
[6] --> MATCH
 ^

The quantity 6 was discovered after 3 iterations
Enter fullscreen mode

Exit fullscreen mode

The higher sure for binary search is O(log⁡ n), often known as logarithmic time. When an algorithm has O(log n) working time, it implies that because the enter dimension grows, the variety of operations grows very slowly. Since in Binary Search we’re discarding the half of the weather on each step, we will say that n (the scale of the issue) is lowered exponentially, whereas the time required to seek out the goal goes up linearly.

The very best case situation for this algorithm is to seek out the goal precisely within the center, within the first iteration, which implies that the Massive Omega working time is within the order of Ω(1).

Though binary search is perhaps a lot sooner than linear search, it requires our array to be sorted first. If we’re planning to go looking our information many occasions, it is perhaps value taking the time to type it first, so we will use binary search.

To know in-depth how Binary Search works we should know first what recursion is. When you have a look at the pseudocode once more be aware that we’re utilizing the identical “search” algorithm for the left and proper half. That’s recursion, the power for a operate to name itself.

This looks as if a cyclical course of that may by no means finish, however we’re really altering the enter to the operate and dividing the issue in half every time, stopping as soon as there are not any extra components left. When the operate runs over components, that’s the situation, also called base case, which tells the recursive operate to cease, so it doesn’t name itself again and again perpetually.

Right here is my implementation of Binary Search written in Python:

# Binary Search

def binary_search(n, arr, l):
    # Base Case
    # If there's just one component then verify explcitly if matches the goal worth
    if l == 1:
        return True if arr[0] == n else False

    # Calculate the center of the checklist
    mid = l // 2

    # Verify if the center component matches the goal
    if arr[mid] == n:
        return True
    else:
        # If the goal is smaller, repeat the search on the left half 
        # in any other case, look in the correct half.
        if n < arr[mid]:
            return binary_search(n, arr[:mid], len(arr[:mid]))
        else:
            return binary_search(n, arr[mid:], len(arr[mid:]))

if __name__ == '__main__':
    array = [53,9,6,23,8,5,4,1,88]
    lengthy = len(array)
    # Ask the consumer for a goal worth
    quantity = int(enter('Que número buscas?: '))
    # Kind the array with sorted() operate since Binary Search works with sorted lists
    end result = binary_search(quantity, sorted(array), lengthy)
    if end result:
        print('El número existe.')
    else:
        print('El número no existe.')
Enter fullscreen mode

Exit fullscreen mode




Choice Kind

Choice Kind works by sorting arrays from left to proper. The steps required to type a listing with this algorithm could be:

  1. Begin from the leftmost component of the array
  2. Traverse the array to seek out the smallest component
  3. After traversing the array, swap the smallest component with the worth on the primary place of the array
  4. Repeat from step 2, now ranging from the second component, and swap the smallest worth with the worth on the second place of the array. Proceed repeating with the third, 4td, and so forth, till the array is sorted.

As a result of computer systems can solely have a look at one quantity at a time, we should iterate over all the array every time to seek out the smallest worth, with out considering already sorted components.

We would write the algorithm pseudocode as:

For i from 0 to n–1
    Discover smallest quantity between numbers[i] and numbers[n-1]
    Swap smallest quantity with numbers[i]
Enter fullscreen mode

Exit fullscreen mode

Within the subsequent instance we type the following checklist:

checklist = [3, 6, 5, 7, 8, 2]

[3, 6, 5, 7, 8, 2] --> 2 is the smallest component
                ^
[2, 6, 5, 7, 8, 3] --> Swap it with the component on the primary place
 ^              ^

[2, 6, 5, 7, 8, 3] --> Now 3 is the smallest, since 2 is already sorted
                ^
[2, 3, 5, 7, 8, 6] --> Swap it with the component within the second place
    ^           ^

[2, 3, 5, 7, 8, 6] --> Now 5 is the smallest quantity. No swap wanted as a result of is already sorted
       ^

[2, 3, 5, 7, 8, 6] --> Repeat the method
                ^
[2, 3, 5, 6, 8, 7]
          ^     ^

[2, 3, 5, 6, 8, 7]
                ^
[2, 3, 5, 6, 7, 8]
             ^  ^

After 5 iterations the  array is sorted
Enter fullscreen mode

Exit fullscreen mode

We are able to say that the algorithm has an higher sure for working time of O(n2), as a result of we might be iterating the array again and again, looking for the smallest quantity.

In one of the best case, the place the checklist is already sorted, our choice type algorithm will nonetheless have a look at all of the numbers and repeat the loop, so it has a decrease sure for working time of Ω(n2).

Right here is my implementation of Choice Kind written in Python:

# Choice Kind

def selection_sort(arr, l):
    # If the checklist has only one merchandise then it is already sorted
    if l == 1:
        return arr

    # Loop over all the checklist
    for i in vary(l):
        lowest_index = None
        # Loop over every component to seek out the smallest one
        for j in vary(i, l):
            if lowest_index is None or arr[j] < arr[lowest_index]:
                # Save the index of the smallest component
                lowest_index = j
        # Swap components
        # It is attainable to swap components like this in Python,
        # avoiding the usage of a short lived variable
        (arr[i], arr[lowest_index]) = (arr[lowest_index], arr[i])

    return arr

if __name__ == '__main__':
    array = [53,9,6,23,8,5,4,1,88]
    lengthy = len(array)
    print(selection_sort(array, lengthy))
Enter fullscreen mode

Exit fullscreen mode




Bubble Kind

This algorithm kinds components from proper to left utilizing comparability and swapping components. We examine contiguous pairs of components and swapping each if the smaller is on the correct, and repeating all the loop till no swaps had been wanted anymore.

We might describe its performance as follows:

  1. Take a look at the primary component of the array, which would be the “present worth”.
  2. Examine the present worth with the following component within the array.
  3. If the following component is smaller, swap each values. Save in a variable {that a} swap was carried out.
  4. Transfer to the following component of the array and make it the present worth.
  5. Repeat from step 2 till the final quantity within the checklist has been reached.
  6. If any values had been swapped, repeat once more from step 1, with out considering the final component of the array, which might be already sorted.
  7. If the top of the checklist is reached with none swaps being made, then the checklist is sorted.

The pseudocode for bubble type may appear to be:

Repeat n-1 occasions
    For i from 0 to n–2
        If numbers[i] and numbers[i+1] out of order
            Swap them
    If no swaps
        Stop
Enter fullscreen mode

Exit fullscreen mode

We’re going to type the identical checklist we sorted with Choice Kind to see the variations:

checklist = [3, 6, 2, 7, 5]

[3, 6, 2, 7, 5] --> examine 3 and 6, they're sorted
 ^  ^
[3, 6, 2, 7, 5] --> examine 6 and a pair of, swap them
    ^  ^
[3, 2, 6, 7, 5] --> examine 6 and seven, they're sorted
       ^  ^
[3, 2, 6, 7, 5] --> examine 7 and 5, swap them
          ^  ^
// Begin from the start once more

[3, 2, 6, 5, 7] --> examine 3 and a pair of, swap them
 ^  ^
[2, 3, 6, 5, 7] --> examine 3 and 6, they're sorted
    ^  ^
[2, 3, 6, 5, 7] --> examine 6 and 5, swap them
       ^  ^

// We do not examine 7 as a result of is already sorted. At this level
// we all know that the array is sorted, however nonetheless we have to loop 
// via the array another time to verify that no extra swaps 
// are wanted

[2, 3, 5, 6, 7] --> examine 2 and three, they're sorted
 ^  ^
[2, 3, 5, 6, 7] --> examine 3 and 5, they're sorted
    ^  ^

// At this level the algorithm finishes as a result of there have been no
// swaps, and as within the earlier iteration, we need not
// examine the quantity 6 as a result of is already sorted.


After 3 iterations the array is sorted
Enter fullscreen mode

Exit fullscreen mode

We are able to say that bubble type has an higher sure for working time of O(n2), as a result of as with Choice Kind, we should iterate over the array as soon as and as soon as, however the distinction is that the decrease sure could be Ω(n), as a result of if the array is sorted only a single iteration over the array is required to verify that there have been no swaps.

Right here is my implementation of Bubble Kind written in Python:

# Bubble Kind

def bubble_sort(arr, l):
    # If the checklist has only one merchandise then it is already sorted 
    if l == 1:
        return arr

    # Loop over all the checklist
    for i in vary(1, l):
        # Swaps counter
        swaps = 0

        # Loop over the checklist to match pairs of adjoining components
        for j in vary(l-i):
            if arr[j] > arr[j+1]:
                # Swap components
                (arr[j], arr[j+1]) = (arr[j+1], arr[j])
                swaps += 1

        # If there have been no swaps, then the array is sorted
        if swaps == 0:
            return arr

if __name__ == '__main__':
    array = [53,9,6,23,8,5,4,1,88]
    lengthy = len(array)
    print(bubble_sort(array, lengthy))
Enter fullscreen mode

Exit fullscreen mode




Merge Kind

This algorithm additionally makes use of the divide and conquer strategy, and the recursion approach, identical to Binary Search does. It divides the enter array into two halves, calls itself for the 2 halves till it get an array containing one component (which is taken into account sorted), after which it repeatedly merges every sorted halves.

We might describe the algorithm with the next steps

  1. Discover the center of the array.
  2. Divide the array from the center.
  3. Name Merge Kind for the left half of the array.
  4. Name Merge Kind for the correct half of the array.
  5. Merge the 2 sorted halves right into a single sorted array

As we’re utilizing recursion, every name to Merge Kind will carry out all of the steps (from 1 to five) for the sublist it receives. The bottom case, when Merge Kind stops calling itself, is when it receives an array with only one component since an array of 1 component is already sorted.

The pseudocode for bubble type may appear to be:

If just one quantity
  Stop
Else
    Kind left half of quantity
    Kind proper half of quantity
    Merge sorted halves
Enter fullscreen mode

Exit fullscreen mode

It might be obscure the algorithm simply by phrases, so an instance might assist to indicate how Merge Kind works.

Within the instance beneath, the purple arrows symbolize every name to Merge Kind, which divides the array it receives into two sublists; and the inexperienced arrows symbolize the merging operation.

Merge Sort example

The technique is to type every half, after which merge them. The merging operation works as follows:

Examine the leftmost component of each sublists and transfer the smallest worth to a brand new array. Repeat the method till each sublists are merged into one array.

If we need to merge the final two sublists of the instance above, the steps are the next:

// Examine the primary component of every sublist and duplicate the smallest worth 
// to a brand new array

[3, 27, 38, 43]  [9, 10, 82]  =>  [3]
 ^                ^
[27, 38, 43]     [9, 10, 82]  =>  [3, 9]
 ^                ^
[27, 38, 43]     [10, 82]     =>  [3, 9, 10]
 ^                ^
[27, 38, 43]     [82]         =>  [3, 9, 10, 27]
 ^                ^
[38, 43]         [82]         =>  [3, 9, 10, 27, 38]
 ^                ^
[43]             [82]         =>  [3, 9, 10, 27, 38, 43]
 ^                ^
[]               [82]         =>  [3, 9, 10, 27, 38, 43, 82]
                  ^
Enter fullscreen mode

Exit fullscreen mode

The base case for our recursive operate might be a listing containing only one component, as I stated earlier than, a listing of 1 component is taken into account sorted.

The time complexity of Merge Kind is O(n Log n), because the algorithm at all times divides the array into two halves, which have a complexity of O(log n); and the merge operation takes linear time, or O(n), since we solely want to take a look at every component as soon as.

The decrease sure of our merge type remains to be Ω(n log⁡ n), since we now have to do all of the work even when the checklist is sorted.

Right here is my implementation of Merge Kind written in Python:

# Merge Kind

def merge_sort(arr, l):
    # Base case
    if l == 1:
        return arr

    # Calculate the center of the checklist
    mid = l // 2

    # Name merge type for left and proper halves
    left = merge_sort(arr[:mid], len(arr[:mid]))
    proper = merge_sort(arr[mid:], len(arr[mid:]))

    # Calculate what number of components should be merged
    merge_q = len(left) + len(proper)
    # Empty checklist had been the sublists might be merged into
    sorted_array = []

    for i in vary(merge_q):
        # If each sublists have components to be sorted
        if len(left) > 0 and len(proper) > 0:
            # Append smallest worth to the sorted checklist
            if left[0] >= proper[0]:
                sorted_array.append(proper[0])
                del(proper[0])
            else:
                sorted_array.append(left[0])
                del(left[0])
        # If proper half is empty append left half to the sorted checklist
        elif len(left) > 0:
            sorted_array.append(left[0])
            del(left[0])
        # If left half is empty append proper half to the sorted checklist
        else:
            sorted_array.append(proper[0])
            del(proper[0])

    return sorted_array

if __name__ == '__main__':
    array = [53,9,6,23,8,5,4,1,88]
    size = len(array)
    print(merge_sort(array, size))
Enter fullscreen mode

Exit fullscreen mode

I hope you discovered attention-grabbing looking out and sorting algorithms, as I discovered it when taking the category. There are different algorithms, however this can be a good first strategy to this subject.

The subsequent lecture is about reminiscence: addresses, pointers, allocation, and different ideas to proceed studying how pc software program works.

I hope you discover attention-grabbing what I’m sharing about my studying path. When you’re taking CS50 additionally, go away me your opinion about it.

See you quickly! 😉


Thanks for studying. If you’re concerned with understanding extra about me, you may give me a observe or contact me on Instagram, Twitter, or LinkedIn 💗.



Add a Comment

Your email address will not be published. Required fields are marked *