W3 - Shorts 📚🔍- Learn to use Algorithms in C (Linear and Binary Search; Selection, Bubble and Merge Sort; Recursion)
Read more here
CS50 IDE: https://cs50.dev/ (Online VSCode)
CS50 Library Documentation: https://manual.cs50.io/#cs50.h
#CS50x #C #Computer_Science #VisualStudioCode #Algorithms
Table of Contents:
A) Linear Search Algorithm 🎦
Video: https://youtu.be/TwsgCHYmbbA
A.1) What is Linear Search?
Linear search is an algorithm used to find an element in an array.
- Algorithm Definition: A step-by-step set of instructions to complete a task.
- How it Works:
- Iterate through the array from left to right.
- At each step, check if the current element matches the target.
- Stop if:
- The element is found.
- The entire array is traversed without finding the element.
A.2) Pseudocode for Linear Search
- Start at the first element in the array.
- While not at the end of the array:
  a. If the current element equals the target, STOP (success).
  b. Otherwise, move to the next element.
- If the end of the array is reached without finding the target, STOP (failure).
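The pseudocode above translates almost line for line into Python (a minimal sketch; the function name and the `-1` "not found" convention are my own):

```python
def linear_search(array, target):
    """Return the index of target in array, or -1 if it is absent."""
    for i in range(len(array)):      # iterate through the array left to right
        if array[i] == target:       # current element matches the target
            return i                 # success: stop immediately
    return -1                        # traversed the whole array without a match
```

For example, `linear_search([5, 2, 1, 3, 6, 4], 3)` returns index `3`, while searching for a value not in the list returns `-1`.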
A.3) Example Walkthrough
Case 1: Element is in the Array
Array: [5, 2, 1, 3, 6, 4]
Target Value: 3
- Step 1: Compare the first element, 5, with 3. 5 ≠ 3, so move to the next element.
- Step 2: Compare the second element, 2, with 3. 2 ≠ 3, so move to the next element.
- Step 3: Compare the third element, 1, with 3. 1 ≠ 3, so move to the next element.
- Step 4: Compare the fourth element, 3, with 3. 3 = 3, so STOP. The target is found.
Case 2: Element is NOT in the Array
Array: [5, 2, 1, 3, 6, 4]
Target Value: 7
- Step 1: Compare each element in turn with 7: 5 ≠ 7, 2 ≠ 7, 1 ≠ 7, 3 ≠ 7, 6 ≠ 7, 4 ≠ 7.
- Step 2: Reach the end of the array without finding 7.
- Conclusion: The target is not in the array. Linear search has completed.
A.4) Best and Worst Case Scenarios
- Best Case:
  - The target element is the first element.
  - Steps Required: 1.
  - Complexity: O(1) (constant time).
- Worst Case:
  - The target element is the last element, or it is not in the array.
  - Steps Required: Look through all n elements.
  - Complexity: O(n) (linear time).
A.4.1) Complexity Analysis
- Best Case: O(1)
  - The target is found in the first position.
- Worst Case: O(n)
  - The target is the last element or not in the array.
- Average Case: O(n/2), which simplifies to O(n)
  - On average, half the array will be traversed before finding the target.
A.5) Key Points: Linear Search
- Advantages: Simple to implement, no need for sorting the array.
- Disadvantages: Inefficient for large arrays due to its linear complexity.
- Use Case: Suitable for small arrays or unsorted data where simplicity is more important than performance.
B) Binary Search Algorithm 🎦
Video: https://youtu.be/T98PIp4omUA
B.1) What is Binary Search?
Binary search is an efficient algorithm to find an element in a sorted array.
- Key Idea: Divide and Conquer
  - Reduce the search area by half at each step.
  - This makes it significantly faster than linear search.
- Condition: The array must be sorted before applying binary search. If not, the algorithm cannot guarantee correct results.
B.2) Pseudocode for Binary Search
- Define the start and end indices of the array:
  - Initially, start = 0 and end = n - 1.
- Repeat the following steps until the subarray has size 0:
  - Calculate the midpoint: mid = ⌊(start + end) / 2⌋.
  - Compare the midpoint value with the target:
    - If target == array[mid], STOP. The target is found.
    - If target < array[mid], search the left subarray: end = mid - 1.
    - If target > array[mid], search the right subarray: start = mid + 1.
- If the start index becomes greater than the end index, STOP. The target is not in the array.
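The pseudocode above can be sketched in Python in the same style as the recursion examples later in these notes (a minimal illustration; the function name and the `-1` return convention are my own):

```python
def binary_search(array, target):
    """Return the index of target in a sorted array, or -1 if absent."""
    start, end = 0, len(array) - 1
    while start <= end:              # the subarray still has size > 0
        mid = (start + end) // 2     # midpoint index, rounded down
        if array[mid] == target:
            return mid               # target found: stop
        elif target < array[mid]:
            end = mid - 1            # search the left subarray
        else:
            start = mid + 1          # search the right subarray
    return -1                        # start > end: target is not in the array
```

Note that the array must already be sorted, e.g. `binary_search([1, 2, 3, 4, 5, 6, 7], 5)` returns index `4`.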
B.3) Example Walkthrough
Case 1: Target Found in Array
Array: [1, 2, 3, 4, 5, 6, 7]
Target Value: 5
- Step 1:
  - Start: 0, End: 6.
  - Midpoint: $$ \text{mid} = \lfloor \frac{0 + 6}{2} \rfloor = 3 $$
  - Value at array[3] is 4. Since 5 > 4, update: Start = 4.
- Step 2:
  - Start: 4, End: 6.
  - Midpoint: $$ \text{mid} = \lfloor \frac{4 + 6}{2} \rfloor = 5 $$
  - Value at array[5] is 6. Since 5 < 6, update: End = 4.
- Step 3:
  - Start: 4, End: 4.
  - Midpoint: $$ \text{mid} = \lfloor \frac{4 + 4}{2} \rfloor = 4 $$
  - Value at array[4] is 5. Target found! STOP.
Case 2: Target Not Found in Array
Array: [1, 2, 3, 4, 5, 6, 7]
Target Value: 8
- Step 1:
  - Start: 0, End: 6.
  - Midpoint: $$ \text{mid} = \lfloor \frac{0 + 6}{2} \rfloor = 3 $$
  - Value at array[3] is 4. Since 8 > 4, update: Start = 4.
- Step 2:
  - Start: 4, End: 6.
  - Midpoint: $$ \text{mid} = \lfloor \frac{4 + 6}{2} \rfloor = 5 $$
  - Value at array[5] is 6. Since 8 > 6, update: Start = 6.
- Step 3:
  - Start: 6, End: 6.
  - Midpoint: $$ \text{mid} = \lfloor \frac{6 + 6}{2} \rfloor = 6 $$
  - Value at array[6] is 7. Since 8 > 7, update: Start = 7.
- Step 4:
  - Start: 7, End: 6.
  - The start index is now greater than the end index, so STOP.
  - Target is not in the array.
B.4) Best and Worst Case Scenarios
- Best Case:
  - The target is found at the first midpoint calculation.
  - Steps Required: 1.
  - Complexity: O(1).
- Worst Case:
  - The array is repeatedly halved until only one element remains.
  - Steps Required: log₂(n).
  - Complexity: O(log n).
- Average Case:
  - On average, the array is halved multiple times to locate the target.
  - Complexity: O(log n).
B.5) Key Points: Binary Search
- Advantages:
- Much faster than linear search for large arrays.
- Reduces the search space by half at each step.
- Disadvantages:
- Requires the array to be sorted beforehand.
- Sorting the array adds preprocessing cost.
- Use Case: Suitable for searching large datasets that are already sorted or where sorting is acceptable as a preprocessing step.
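As a practical aside not covered in the video: Python's standard library ships this algorithm in the `bisect` module, which returns the insertion point for a value in a sorted sequence:

```python
import bisect

sorted_data = [1, 2, 3, 4, 5, 6, 7]
i = bisect.bisect_left(sorted_data, 5)          # leftmost position where 5 belongs
found = i < len(sorted_data) and sorted_data[i] == 5  # check the value is really there
```

Here `i` is `4` and `found` is `True`; for a value not in the list, `found` would be `False`.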
______________________________________________________________
C) Selection Sort Algorithm🎦
Video: https://youtu.be/3hH8kTHFw2A
C.1) What is Selection Sort?
Selection Sort is a simple algorithm used to sort an array by repeatedly selecting the smallest unsorted element and placing it at the beginning of the array.
- Key Idea:
- Divide the array into a sorted part and an unsorted part.
- Iteratively move the smallest element from the unsorted part to the sorted part.
C.2) Pseudocode for Selection Sort
- Repeat until no unsorted elements remain:
- Search through the unsorted part of the array to find the smallest value.
- Swap the smallest value with the first element of the unsorted part.
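The two pseudocode steps above can be sketched in Python as follows (a minimal in-place version; the function name is my own):

```python
def selection_sort(array):
    """Sort array in place by repeatedly selecting the smallest unsorted value."""
    n = len(array)
    for i in range(n):                   # i marks the start of the unsorted part
        smallest = i
        for j in range(i + 1, n):        # scan the unsorted part for the minimum
            if array[j] < array[smallest]:
                smallest = j
        # Swap the smallest value with the first element of the unsorted part.
        array[i], array[smallest] = array[smallest], array[i]
    return array
```

For example, `selection_sort([5, 1, 4, 2, 6, 3])` yields `[1, 2, 3, 4, 5, 6]`.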
C.3) Step-by-Step Example
Unsorted Array: [5, 1, 4, 2, 6, 3]
Steps of Selection Sort:
- Iteration 1:
  - Find the smallest element in the array: 1.
  - Swap 1 with the first element (5): [1, 5, 4, 2, 6, 3]
  - Now, 1 is sorted.
- Iteration 2:
  - Find the smallest element in the unsorted part: 2.
  - Swap 2 with the first element of the unsorted part (5): [1, 2, 4, 5, 6, 3]
  - Now, 2 is sorted.
- Iteration 3:
  - Find the smallest element in the unsorted part: 3.
  - Swap 3 with the first element of the unsorted part (4): [1, 2, 3, 5, 6, 4]
  - Now, 3 is sorted.
- Iteration 4:
  - Find the smallest element in the unsorted part: 4.
  - Swap 4 with the first element of the unsorted part (5): [1, 2, 3, 4, 6, 5]
  - Now, 4 is sorted.
- Iteration 5:
  - Find the smallest element in the unsorted part: 5.
  - Swap 5 with the first element of the unsorted part (6): [1, 2, 3, 4, 5, 6]
  - Now, 5 is sorted.
- Iteration 6:
  - Only 6 remains. It is already sorted: [1, 2, 3, 4, 5, 6]
C.4) Complexity Analysis
Worst-Case Scenario
- For every element, we need to search the entire unsorted part to find the smallest value.
- Number of comparisons:
  - First pass: n - 1
  - Second pass: n - 2
  - Third pass: n - 3
  - ...
  - Total: (n - 1) + (n - 2) + ... + 1 = n(n - 1)/2.
- Time Complexity: O(n²).
Best-Case Scenario
- Even if the array is already sorted, the algorithm still performs the same number of comparisons.
- Time Complexity: O(n²).
Key Insight:
- Selection Sort has the same complexity for both the best and worst cases.
C.4.1) Advantages and Disadvantages (Selection Sort)
Advantages:
- Simple to implement.
- Useful for small datasets.
Disadvantages:
- Inefficient for large datasets.
- Always O(n²), regardless of the input.
C.5) Key Points: Selection Sort
- Selection Sort builds the sorted array incrementally by moving the smallest element to the sorted portion.
- It has a quadratic time complexity (O(n²)) in both best and worst cases.
- It's a good choice for small datasets but impractical for large ones.
D) Bubble Sort Algorithm 🎦
Video: https://youtu.be/RT-hUXUWQ2I
Overview: Bubble Sort
Bubble sort is a sorting algorithm that repeatedly steps through the array, compares adjacent elements, and swaps them if they are in the wrong order. The process repeats until no swaps are needed, indicating the array is sorted.
Pseudo-code: Algorithm Steps
- Initialize:
  - Set a `swap_counter` to a non-zero value (e.g., `-1` initially).
- Repeat Until:
  - The `swap_counter` becomes `0` (indicating no swaps are required).
- For Each Pass:
  - Reset the `swap_counter` to `0`.
  - Compare each pair of adjacent elements:
    - If they are out of order: Swap them and increment the `swap_counter` by `1`.
- Optimization:
  - After each pass, the largest unsorted element is bubbled to its correct position at the end of the array, so the sorted portion grows from right to left.
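The steps above can be sketched in Python (a minimal in-place version with the early-exit `swap_counter`; the function name is my own):

```python
def bubble_sort(array):
    """Sort array in place; stop as soon as a full pass makes no swaps."""
    swap_counter = -1                    # non-zero so the loop runs at least once
    while swap_counter != 0:
        swap_counter = 0                 # reset at the start of each pass
        for i in range(len(array) - 1):
            if array[i] > array[i + 1]:  # adjacent pair out of order
                array[i], array[i + 1] = array[i + 1], array[i]
                swap_counter += 1        # record that a swap happened
    return array
```

On an already-sorted input the `while` loop exits after a single pass, which is where the O(n) best case comes from.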
Step-by-Step Example
Initial Array
Unsorted array: [5, 2, 1, 3, 6, 4]
- First Pass:
  - Compare `5` and `2`, swap: `[2, 5, 1, 3, 6, 4]` (`swap_counter = 1`)
  - Compare `5` and `1`, swap: `[2, 1, 5, 3, 6, 4]` (`swap_counter = 2`)
  - Compare `5` and `3`, swap: `[2, 1, 3, 5, 6, 4]` (`swap_counter = 3`)
  - Compare `5` and `6`, no swap: `[2, 1, 3, 5, 6, 4]`
  - Compare `6` and `4`, swap: `[2, 1, 3, 5, 4, 6]` (`swap_counter = 4`)
  - Sorted portion: `[6]`.
- Subsequent Passes:
  - The process repeats for the unsorted portion until no swaps are required, at which point the array is fully sorted.
Complexity Analysis
Worst-Case Scenario
- Scenario: The array is in reverse order (e.g., `[6, 5, 4, 3, 2, 1]`).
- Steps:
  - Each of the n elements must be compared to, and potentially swapped with, each of the other elements.
- Time Complexity: O(n²).
Best-Case Scenario
- Scenario: The array is already sorted (e.g., `[1, 2, 3, 4, 5, 6]`).
- Steps:
  - A single pass through the array confirms no swaps are needed, and the algorithm terminates.
- Time Complexity: O(n).
Key Points:
- Worst Case: O(n²)
- Best Case: O(n)
- Bubble Sort is slightly better than Selection Sort in the best case, as it can terminate early if no swaps are needed.
- Bubble Sort is inefficient for large datasets but is easy to implement and understand.
Advantages and Disadvantages (Bubble Sort)
Advantages:
- Simple to implement.
- Can detect if the array is already sorted.
Disadvantages:
- Inefficient for large datasets.
- Always O(n²) in the average and worst cases.
Bubble Sort Summary
- Bubble sort is intuitive but impractical for large datasets due to its quadratic time complexity.
- It "bubbles" the largest unsorted elements to the end of the array in each pass, gradually building the sorted portion.
E) Understanding Recursion 🎦
Video: https://youtu.be/mz6tAJMVmfM
Recursion is a technique in programming where a function calls itself as part of its execution. It's particularly useful for solving problems that can be broken into smaller, similar subproblems.
Components of a Recursive Function
-
Base Case:
- A simple condition that stops the recursive process.
- Prevents infinite recursion by halting the function when a certain state is reached.
-
Recursive Case:
- The function calls itself with modified arguments to make progress toward the base case.
- Solves a smaller piece of the original problem.
Example: Factorial Function Using Recursion
The factorial of a positive integer n, written n!, is the product of all positive integers from 1 to n.
For example: 4! = 4 × 3 × 2 × 1 = 24.
Recursive Definition: n! = n × (n - 1)!, with base case 1! = 1.
Code Implementation:
def factorial(n):
if n == 1: # Base Case
return 1
return n * factorial(n - 1) # Recursive Case
Example: Fibonacci Sequence Using Recursion
The Fibonacci sequence is defined so that each number is the sum of the two preceding ones, starting from 0 and 1.
For example: 0, 1, 1, 2, 3, 5, 8, 13, ...
Recursive Definition: F(n) = F(n - 1) + F(n - 2), with base cases F(1) = 0 and F(2) = 1.
Code Implementation
def fibonacci(n):
if n == 1: # Base Case
return 0
if n == 2: # Base Case
return 1
return fibonacci(n - 1) + fibonacci(n - 2) # Recursive Case
Example: Collatz Conjecture Using Recursion
The Collatz conjecture concerns the sequence produced by repeatedly applying these rules to a positive integer n:
- If n = 1, stop.
- If n is even, the next number is n / 2.
- If n is odd, the next number is 3n + 1.
The conjecture speculates that this process will eventually reach 1 for every positive integer.
Recursive Definition:
The number of steps satisfies steps(1) = 0, steps(n) = 1 + steps(n / 2) for even n, and steps(n) = 1 + steps(3n + 1) for odd n.
Code Implementation
def collatz_steps(n):
if n == 1: # Base Case
return 0
if n % 2 == 0: # Recursive Case for even numbers
return 1 + collatz_steps(n // 2)
return 1 + collatz_steps(3 * n + 1) # Recursive Case for odd numbers
Comparison of Recursive vs Iterative Factorial
Recursive Factorial
def factorial_recursive(n):
if n == 1: # Base Case
return 1
return n * factorial_recursive(n - 1) # Recursive Case
Iterative Factorial
def factorial_iterative(n):
product = 1
for i in range(1, n + 1):
product *= i
return product
Both approaches produce the same result, but recursion often leads to more elegant, compact code.
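A quick sanity check (compact restatements of the two functions above, with an assertion of my own) confirms they agree:

```python
def factorial_recursive(n):
    # Base case 1! = 1; otherwise peel off one factor and recurse.
    return 1 if n == 1 else n * factorial_recursive(n - 1)

def factorial_iterative(n):
    # Accumulate the product 1 * 2 * ... * n in a loop.
    product = 1
    for i in range(1, n + 1):
        product *= i
    return product

# Both implementations return the same value for every tested input.
assert all(factorial_recursive(n) == factorial_iterative(n) for n in range(1, 10))
```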
Advantages and Disadvantages of Recursion
Advantages:
- Elegant and Simple Code: Reduces the need for loops or manual tracking.
- Natural for Tree/Divide-and-Conquer Problems: Useful for problems like Fibonacci, factorial, and tree traversals.
Disadvantages:
- Performance Overhead: Recursive calls require more memory due to the function call stack.
- Risk of Stack Overflow: Without a proper base case, recursion can result in infinite loops and crash programs.
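One common mitigation for the performance overhead, not covered in the video, is memoization: caching results of earlier recursive calls. Python's `functools.lru_cache` adds this to the naive Fibonacci with one decorator (a hedged sketch, not part of the original notes):

```python
from functools import lru_cache

@lru_cache(maxsize=None)                 # remember the result of each distinct call
def fibonacci(n):
    if n == 1:                           # Base Case
        return 0
    if n == 2:                           # Base Case
        return 1
    return fibonacci(n - 1) + fibonacci(n - 2)  # Recursive Case, now cached
```

With the cache, each `fibonacci(k)` is computed once, so `fibonacci(50)` returns instantly, whereas the uncached version makes an exponential number of repeated calls.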
Key Takeaways: Recursion
- Recursion is a powerful tool for solving problems that can be divided into smaller subproblems.
- Always define a base case to prevent infinite recursion.
- Recursive code is elegant but can be inefficient compared to iterative solutions for certain tasks.
F) Merge Sort Algorithm 🎦
Video: https://youtu.be/Ns7tGNbtvV4
Overview: Merge Sort Algorithm
Merge Sort is a divide-and-conquer algorithm that recursively splits an array into smaller subarrays, sorts them, and then merges them back together. It offers significant efficiency improvements over simpler algorithms like bubble sort and insertion sort.
Steps in Merge Sort
-
Split the Array
- Divide the array into two halves recursively until each subarray contains only one element. A single-element array is inherently sorted.
-
Sort the Subarrays
- Recursively sort the left and right halves.
-
Merge the Halves
- Compare elements from the two sorted subarrays and merge them into a single sorted array.
Pseudocode
function mergeSort(array):
if size of array <= 1:
return array # Base case: single-element arrays are sorted
mid = size of array / 2
left = mergeSort(array[0:mid]) # Sort the left half
right = mergeSort(array[mid:end]) # Sort the right half
return merge(left, right) # Merge the sorted halves
function merge(left, right):
result = []
while left and right:
if left[0] <= right[0]:
append left[0] to result and remove it from left
else:
append right[0] to result and remove it from right
append any remaining elements in left or right to result
return result
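The pseudocode above can be sketched in Python (a minimal version returning a new list; the function name and in-line merge are my own):

```python
def merge_sort(array):
    """Return a new sorted list using divide and conquer."""
    if len(array) <= 1:
        return array                     # base case: single-element arrays are sorted
    mid = len(array) // 2
    left = merge_sort(array[:mid])       # recursively sort the left half
    right = merge_sort(array[mid:])      # recursively sort the right half
    # Merge: repeatedly take the smaller front element of the two sorted halves.
    result, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            result.append(left[i])
            i += 1
        else:
            result.append(right[j])
            j += 1
    result.extend(left[i:])              # append any leftover elements
    result.extend(right[j:])
    return result
```

Using `<=` in the comparison keeps equal elements in their original relative order, which is what makes this a stable sort.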
Example Walkthrough
Input Array: [5, 2, 1, 3, 6, 4]
- Split the Array
  Recursively divide the array:
  - Split 1: [5, 2, 1] and [3, 6, 4]
  - Split 2 (left): [5] and [2, 1]
  - Split 3 (right): [3] and [6, 4]
- Sort Each Half
  - [5] is sorted.
  - Sort [2, 1]: split into [2] and [1], merge into [1, 2].
  - Merge [5] and [1, 2] into [1, 2, 5].
  - [3] is sorted.
  - Sort [6, 4]: split into [6] and [4], merge into [4, 6].
  - Merge [3] and [4, 6] into [3, 4, 6].
- Merge Final Halves
  Merge [1, 2, 5] and [3, 4, 6]:
  - Compare 1 and 3 → Add 1.
  - Compare 2 and 3 → Add 2.
  - Compare 5 and 3 → Add 3.
  - Compare 5 and 4 → Add 4.
  - Compare 5 and 6 → Add 5.
  - Add remaining 6.
Result: [1, 2, 3, 4, 5, 6]
Complexity Analysis
Worst-Case Scenario
- Steps: Divide the array into subarrays (log n levels of splitting) and merge them back (n comparisons per level).
- Time Complexity: O(n log n).
Best-Case Scenario
- Even if the array is already sorted, Merge Sort doesn't have a shortcut to detect it.
- Time Complexity: O(n log n).
Advantages and Disadvantages (Merge Sort)
Advantages
- Efficiency:
  - Faster than bubble sort, insertion sort, and selection sort for large datasets.
- Stable Sort:
  - Preserves the relative order of equal elements.
- Predictable Time Complexity:
  - Always O(n log n), regardless of input.
Disadvantages
- Space Complexity:
  - Requires extra space for merging subarrays (O(n)).
- Overhead for Small Arrays:
  - For small datasets, simpler algorithms like insertion sort may outperform merge sort due to recursion overhead.
Key Points: Merge Sort
- Merge Sort is a powerful divide-and-conquer sorting algorithm.
- Best suited for large datasets where stability and predictable performance are required.
- It leverages recursion to break down a sorting problem into manageable subproblems.