Week 1 - Writing Convincing Arguments With Proofs

#Math #Discrete_Math #Computer_Science

↩️ Go Back

Table of Contents:


A) Course Introduction (🎦 folder 1)

Welcome! 👋

[[Discrete mathematics]] is the language of computer science.


Read more here

One needs to be fluent in it in order to be able to work in fields like:

The specialization covers [[discrete mathematics]] ideas and concepts needed in various branches of computer science. To make the learning process more efficient and enjoyable, we use the following active learning components.

A.1) Using Python

We will also use [[Python]] code snippets to understand the discrete math ideas with applications.

Why Python?

  1. High-level language: the syntax is reader-friendly.

  2. Interactive mode: we can use the [[read-eval-print loop (REPL)]] for instant feedback from the machine.

  3. Libraries: we can generate a random sequence, or plot a graph, in just one line of code (see the snippet below).

Disadvantage: High-level = less flexible in performance tuning.
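
To illustrate point 3, here is a quick Python sketch (my own illustration, not from the course) that generates a random sequence and plots it, one line each:

```python
import random
import matplotlib.pyplot as plt  # assumes matplotlib is installed

sequence = random.sample(range(1, 101), 10)  # a random sequence in one line
plt.plot(sequence); plt.show()               # a plot in one line
```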


How to use Python?

  1. Locally:

  2. In the cloud:



Link: https://leanpub.com/discrete-math
Alternative Link here

There are:


B) Why Proofs? (🎦 folder 2)

The first part of this discrete math course is about [[proofs]] in general. But why do programmers need proofs? We are not mathematicians.

If you are writing a videogame or a search engine, you probably do not need to prove its correctness.

On the other hand, if you write an [[operating system]] or a [[cryptographic protocol]], you will probably need to prove that the [[operating system]] will not stop at some point, or that the [[cryptographic protocol]] is safe.

B.1) What is a proof?

A [[proof]] is an argument, a convincing argument.

To get a [[proof]] first we need to understand the problem.

The goal of this course is to understand, invent, explain and enjoy [[proofs]].

If you are given a puzzle, you should immediately stop and think about it before you look up the solution.

Did you know?

"God has a book that contains all the proofs."
—Paul Erdős

Paul Erdős was a renowned Hungarian mathematician known for his work in number theory, combinatorics, and many other fields, as well as for his prolific collaboration with other mathematicians. The reference to "God's book" is a famous metaphor he used to describe a hypothetical book in which the most elegant and beautiful mathematical proofs are written down.

Inspired by this concept, the publication "Proofs from THE BOOK" presents a collection of elegant proofs, showcasing the beauty of mathematics that Erdős so revered. (authors, Martin Aigner and Günter M. Ziegler)


B.2) Proof by Example

So, to understand proofs we will begin with an example.

Problem 1
"Can a chess board (8x8) be tiled (no overlaps or empty spaces) by domino tiles (1x2)?"

The Answer: YES! Let's prove it!

We can show an example of one solution as a proof:

This is known as a [[Proof by Example]].


Problem 2
Now, let's consider another problem:
"Can a chessboard (8x8) with a missing square in one of its corners be tiled (no overlaps or empty spaces by domino tiles (2x2)?"

Let's try to find an example:

Questions:

  1. Did we find a tiling, and thus prove that a solution exists?

    • Ans: No
  2. Did we prove that the solution does not exist?

    • Ans: No! We haven't tried all the possibilities.
  3. Can we prove that the solution either exists or does not exist? Can we do either of these 2 proofs with examples?

    • Ans: No. We cannot settle it with examples alone, but we will see the [[impossibility proof]] in the next section.

B.3) Impossibility Proof (part 1)

Let's go back to the problem 2.

Problem 2
"Can a chessboard (8x8) with a missing square in one of its corners be tiled (no overlaps or empty spaces by domino tiles (2x2)?"

The Answer: No. But why? Can we prove it?
↳ One cell will always remain uncovered, because the chessboard has an odd number of cells. There are (8×8) − 1 = 63 cells.

So, we can use 31 tiles to cover 62 cells, because:

# tiles = 62 cells ÷ 2 cells/tile = 31 tiles

but 1 cell will always remain uncovered.
We just proved that it is not possible. This is our [[impossibility proof]].


Problem 3
Now, let's consider another problem:
"Can a chessboard (8x8) with 2 missing cells at opposite corners be tiled (no overlaps or empty spaces) by domino tiles (2x2)?"

Let's use our argument again, but with a new problem (will it work?).

Now we have an even number of cells: there are (8×8) − 2 = 62 cells.
This means we could use 31 tiles to cover the board, right?

Question:

  1. Is this argument correct?
    • Ans: I don't know, let's find out with some examples

Example 1:

In this attempt we could not cover 2 cells.
Our domino tiles are also 2 cells each, but the position of the red cells is not appropriate for a domino tile.
So even with an "even" number of cells, we failed to tile the chessboard.

Maybe we did something wrong, let's try again.

Example 2:

Same problem...
The 2 remaining cells cannot be covered by the shape of 1 domino tile.

Question:
Should we try more possibilities or is it impossible to solve?
Ans: Let's find out in the next section.

B.4) Impossibility Proof (part 2 and conclusion)

Let's go back to the problem.

Problem 3
"Can a chessboard (8x8) with 2 missing cells at opposite corners be tiled (no overlaps or empty spaces) by domino tiles (2x2)?"

The Answer: No. But why? Can we prove it?

For our second [[impossibility proof]], we need a deeper understanding of the arrangement of the chessboard. Let's color the board as follows:

Now we can see that the 2 corners that we removed are both black.
This means we don't have the same number of black cells as we have white cells.
32 white and 30 black.
But why is this important?

Let's see how the domino tiles cover the board:

One domino tile always covers exactly 1 black cell and 1 white cell.
This means that in order to cover the board, not only do we need an even number of cells, but we also need the same number of black cells as white cells.
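
We can check this counting argument with a short Python sketch (my own illustration, not from the course):

```python
# 8x8 board with two opposite corners removed; color cells by coordinate parity.
board = {(r, c) for r in range(8) for c in range(8)} - {(0, 0), (7, 7)}
black = sum((r + c) % 2 == 0 for r, c in board)
white = len(board) - black
print(black, white)  # 30 32 -> each 1x2 domino covers one cell of each color,
                     # so at most 30 dominoes fit and 2 white cells stay uncovered
```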

B.5) Recap: Proofs and Theorems

So, in formal mathematics, we would have:

So, from our problem 3, we could state the following:

Theorem

A chess board (8x8) without 2 opposite corners cannot be tiled by (1x2) dominos.


And we can also show the [[proof]] of our [[theorem]]

Proof:

  • A regular chessboard has 4 black and 4 white cells in each row, so we know the total number of black and white cells in a regular chessboard: 32 black and 32 white.

  • Opposite corners are always of the same color. For example, let's say they are black.

  • In that case we would have a board with fewer black cells than white cells (30 black cells and 32 white cells).

  • A domino tile always covers 2 cells of different colors,

  • and because we cannot pair 2 white cells together, 2 white cells will remain uncovered.
    [[Quod Erat Demonstrandum (QED)]],
    which means "that is what we wanted to prove" (in Latin).

B.5.1) Conclusions: Lessons Learned

What did we learn about proofs with these examples?

  1. [[Proofs]] can be really convincing

  2. A [[theorem]] is a mathematical statement that can and must be proven to be true.

So, if we consider a new problem and we claim:
Problem 4
"we can tile a (8x8) board without 2 non-opposite corners (say, the left bottom and left top ones)" We must prove it!

Indeed, we already know that if the # of black cells is not equal to the # of white cells, the tiling does not exist. But having the same number of black and white cells does not imply that a solution exists. We have yet to prove that!

We can prove it with an example:


B.5.2) Bonus Problem!

Let's prove another theorem

Problem 5
"we can tile a (8x8) board without 2 arbitrary cells of different colors by (1x2) dominos"

To prove this we must consider the following "path" on our chessboard:

This "path"has the shape of a snake.

This snake is useful to make the following argument, but first let's make an example of a board without 2 arbitary cells of different color


B.6) Existence Proofs (🎦 folder 3)

B.6.1) One Example is Enough

Now we will learn one of the simplest kinds of proofs, which is called "[[The existence proof]]".

In order to provide the [[proof]], we only need one example.

In an "[[existence proof]]" we have to consider the "[[existential statement]]".

This statement claims that an object with given properties exists.

In mathematical notation, the "[[existential statement]]" is written as follows:

∃x : P(x)

So, we are saying: "there exists an example x where the proposed property P is satisfied, as P(x)".

And to prove this claim it's enough to show one example.

Note: it never hurts to check more examples, but one example is enough to provide a proof.

Let's see how it works on some simple geometric problems.

This problem is about cutting geometric shapes into congruent pieces (pieces of the same shape and size).

B.6.2) Another Puzzle: Splitting an Octagon

Let's solve another puzzle!

This problem is also about cutting geometric shapes into congruent pieces (pieces of the same shape and size).

But now we must split an irregular octagon (a polygon of eight angles and eight sides).

Let's try to find examples as [[proofs]],

B.6.3) Having Fun in Real Life: Tensegrities (Optional)

Let's see an example in Real Life. We will study a type of structure named a [[tensegrity]].

Can we find an example of a structure x with the property P(x), which states that the compressed members do not touch each other and are only connected by cables?

In other words, can we prove that ∃x : P(x)?
("there exists an x such that P(x)")

Ans: Yes, here is an example:

B.6.4) Know your rights: proofs do not require explanation

Let's return to the [[existential statements]] ∃x : P(x) and their [[proofs]].

We said that one example is all we need to provide a [[proof]]. But there is also another property of "[[existence proofs]]": it is not necessary to explain how we got the example; the only important part is that the example exists.

Let's solve another example to understand this concept:

Example 1: Can we find a 2-digit number that becomes 7 times smaller after the first digit is deleted?

Let's find out. Imagine we have a number with digits "ab"; then:
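
In decimal notation the number is 10a + b, and deleting the first digit leaves b. We need 10a + b = 7b, so 10a = 6b, that is, 5a = 3b. The only digits that work are a = 3 and b = 5.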

Conclusion: the number 35 becomes 7 times smaller after the first digit is deleted (35 = 7 × 5).

Example 2: Can we find an n-digit number that becomes 57 times smaller after the first digit is deleted?

Questions:

  1. How do we find a solution?

    • Ans: I don't know (yet), but here is a solution: 7125 = 57 × 125
  2. Is this enough to prove that such an n-digit number x exists and becomes 57 times smaller after the 1st digit is deleted, as established by P(x)?

    • Ans: Yes! ∃x : P(x); the example is 7125.
  3. Do we have the right to ask for a procedure?

    • Ans: No! The example is all we need.
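
If we do want to hunt for the example by machine, a brute-force Python sketch (my own illustration, not from the course) finds it quickly:

```python
# Search for x such that 57*x, with its first digit deleted, equals x.
for x in range(1, 10**6):
    s = str(57 * x)
    if int(s[1:]) == x:   # delete the first digit and compare
        print(57 * x, x)  # prints: 7125 125
        break
```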

Can we find a procedure for this example?

This problem does have a procedure and it is the following:

  1. Consider an n-digit number "a b ... z". This number must be 57 times bigger than the number without its 1st digit, so:

  2. Let x = "b ... z", which has k = n − 1 digits. Therefore, in decimal notation, we have:

     a·10^k + x = 57x

  3. We can move x to the right side and factorize: a·10^k = 56x = 7 · 8 · x. Since 7 does not divide 10^k, the digit a must be 7, which leaves 10^k = 8x.

  4. Since 8 = 2^3 must divide 10^k, the smallest possible value is k = 3.

  5. We solve for x: x = 10^3 / 8 = 125.

Conclusion: 7125 becomes 57 times smaller after the first digit is deleted:

7125 = 57 × 125

Multiplying by powers of 10 also works; for example:

71250 = 57 × 1250, and so on...


B.6.5) Nobody can win all the time: Non-existing examples

This section is a warning!

Sometimes, when you are trying to find an example for a [[proof]], this example might not exist!!! No matter how hard you try, sometimes there is no example that can help you prove something.

Let's see how this works with the following problem.

B.6.5.1) The Backpack Problem

Problem: Imagine you want to carry some weights with 2 identical backpacks.

You also want to split the weights so that you carry the same amount in each backpack.

How would you split the weights in the following cases?

Case 1: 3 weights: 1kg, 2kg, 3kg


One backpack carries 1kg + 2kg, and the other backpack carries 3kg.


Case 2: 6 weights: 1, 2, 3, 4, 5, 7
How can we split them into 2 groups of equal weight?

First, let's determine the total weight:

Total weight = 1 + 2 + 3 + 4 + 5 + 7 = 22 kg

We want to split 22 into 2 groups of equal weight

22/2 = 11 for each group

How can we get 2 groups of 11 each?
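
One possible split: 4 + 7 = 11 in one backpack, and 1 + 2 + 3 + 5 = 11 in the other.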

Note: There can be cases when the task is impossible; there can be obstacles that won't let us find a solution. Consider the following case.


Case 3: 6 weights: 1, 2, 3, 4, 5, 6
How can we split them into 2 groups of equal weight?

First, let's determine the total weight:

Total weight = 1 + 2 + 3 + 4 + 5 + 6 = 21 kg

We want to split 21 into 2 groups of equal weight
↳ 21/2 = 10.5 kg for each group, which is impossible!

We cannot get that number by adding our available weights.


Case 4: 6 weights: 2, 4, 6, 8, 10, 12
How can we split them into 2 groups of equal weight?

First, let's determine the total weight:

Total weight = 2 + 4 + 6 + 8 + 10 + 12 = 42 kg

We want to split 42 into 2 groups of equal weight
↳ 42/2 = 21 kg for each group

But we have a problem... 21 is odd, and all our weights are even (in fact, they are 2 times the weights of Case 3). We cannot get an odd number by adding even numbers. So finding an example, a solution, IS IMPOSSIBLE!!!

As we have seen in the previous cases, it seems that we can run into obstacles when we search for a proof of our "existential statement", like:

But are there other types of obstacles? And can we count the total number of obstacles this problem has?

Let's find out with the following case.


Case 5: 6 weights: 1, 2, 3, 4, 5, 17
How can we split them into 2 groups of equal weight?

First, let's determine the total weight:

Total weight = 1 + 2 + 3 + 4 + 5 + 17 = 32 kg

We want to split 32 into 2 groups of equal weight
↳ 32/2 = 16 kg for each group.

BUT! We have one weight of 17 kg, which alone already exceeds 16 kg, so there is no way to split the weights into 16 kg per backpack. IT IS IMPOSSIBLE TO SOLVE!!!

This is yet another type of obstacle...

Is this bad news?
Ans: Yes


B.6.5.2) Bad News for our Backpack Problem...

  1. Obstacles for our proof can be of different types.

  2. There is no complete list of obstacles; we don't know all the obstacles this problem has. Therefore, we cannot check whether the problem has a solution before we try to solve it.

  3. This is an [[NP-Complete Problem]], which means no efficient algorithm for solving it is known.

In our small examples we could easily show whether a solution exists or not. But if we were working with 10,000 weights, we don't know an efficient algorithm to check whether they can be split into 2 equal groups or not.
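
For small inputs, we can still check by brute force. Here is a Python sketch (my own illustration, not from the course):

```python
from itertools import combinations

def can_split(weights):
    # Try every subset: exponential time, hopeless for 10,000 weights.
    total = sum(weights)
    if total % 2:  # odd total: the Case 3 obstacle
        return False
    target = total // 2
    return any(sum(c) == target
               for r in range(1, len(weights))
               for c in combinations(weights, r))

print(can_split([1, 2, 3, 4, 5, 7]))   # True  (4 + 7 = 11)
print(can_split([1, 2, 3, 4, 5, 6]))   # False (total 21 is odd)
print(can_split([1, 2, 3, 4, 5, 17]))  # False (17 alone exceeds 16)
```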

In fact, we cannot even prove that such an algorithm doesn't exist. This is the P vs NP problem (check the "Extra Content" section for more info).

The Knapsack Problem

In algorithmics, the knapsack problem, commonly abbreviated KP (from "Knapsack Problem"), is a [[combinatorial optimization]] problem, i.e., one that looks for the best solution among a finite set of possible solutions. It models a situation analogous to filling a backpack that cannot hold more than a certain weight with all or part of a set of objects, each with a specific weight and value. The objects placed in the backpack must maximize the total value without exceeding the maximum weight.

The knapsack problem has been studied for more than a century, with early works dating as far back as 1897. The name "knapsack problem" dates back to the early works of the mathematician Tobias Dantzig (1884–1956), and refers to the commonplace problem of packing the most valuable or useful items without overloading the luggage.

We are now close to the boundary of modern computer science knowledge: we know about proofs and even about [[NP-Complete Problems]], which people have tried to solve, but with no success.

The following is the Take-home Message for this 1st week.

C) Take-home Message (💬 week1)

  1. The structure of the proof reflects the structure of the claim. It depends on what we want to prove.

  2. If the statement is "existential", then you need only one example to prove it:
     ∃x : P(x)

  3. There is no need to disclose the sources of the proof (how you found the example).

  4. Beware! Sometimes the claim you want to prove may be false. Don't spend the rest of your life looking for a solution that doesn't exist. In some cases, we can prove that it doesn't exist, but in other cases, sadly, we can't.


D) Extra Content: Big O (explained 🎦)

Video: https://youtu.be/kS_gr2_-ws8?si=Sp8pkVzo5h9rb4oT

#Math #Discrete_Math #Computer_Science #Algorithms #Complexity_Theory

↩️ Go Back

Table of Contents:


A) How do we know which code is "best"? (Reverse String Example)

What is Big O Notation? Let's begin with an example

Imagine we have multiple implementations of the same function in a computer program.

For example, Imagine we have 10 different variations which all work and can solve our problem.

How do we know which is "best"?

Let's use a concrete example: We are given the following task,
"Write a function that accepts a string input and returns a reversed copy of that string "
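
Before comparing variants, here is a minimal Python version of the task (a sketch of my own; the course's implementations are in JavaScript):

```python
def reverse_string(s):
    # Build a reversed copy; Python's slice syntax does it in one step.
    return s[::-1]
```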

Here are 10 implementations found in....

using JavaScript:

How do we know which one is "best"?

A.1) What does "best" even mean? Why should we care?

It would be nice to have a scale that could help us score our code, so we would know if our code is "Great!", "OK", or even "Awful". But how can we get such a scale?

This is where Big O Notation can help us!
It is going to help us talk about our code and compare our code.

But, why should we compare our code? Why should we even care?

Because...

  1. it is important to have a precise vocabulary to talk about how our code performs.

  2. it is useful for discussing trade-offs between different approaches and can help us make decisions.

  3. when our code slows down or crashes, identifying parts of the code that are inefficient can help us find bottlenecks and pain points in our applications

  4. it comes up in interviews...


B) Understanding "Time Complexity" (Total Sum Example)

Let's use a SIMPLE example: We are given the following task:
"Write a function that calculates the sum of all numbers from 1 up to (and including) some number n."

Here are 2 implementations using JavaScript:

Implementation 1

It uses a for loop, and it adds to the counting variable n times: each iteration executes one sum (total += i).
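
A Python sketch of this first approach (the course shows it in JavaScript):

```python
def add_up_to_v1(n):
    # One addition per loop iteration: about n operations.
    total = 0
    for i in range(1, n + 1):
        total += i
    return total
```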

Implementation 2

This second version is shorter, and it uses Gauss' formula. This function evaluates the formula n(n+1)/2 only once.
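
A Python sketch of the second approach:

```python
def add_up_to_v2(n):
    # Gauss' formula: a fixed number of operations, regardless of n.
    return n * (n + 1) // 2
```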

How do we know which one is better?
What does "better" even mean?

Is it the...

  1. faster one?

  2. less memory-intensive one? (the one that takes up less space?)

  3. more readable one?

  4. shorter one? (the one with fewer lines?)

Focusing on speed (how long the code takes to run) is known as "[[time complexity]]".

Note: focusing on memory is known as "[[space complexity]]".

About [[Complexity Theory]]

  • [[Time complexity]] is a function describing the amount of time an algorithm takes in terms of the amount of input to the algorithm.

  • [[Space complexity]] is a function describing the amount of memory (space) an algorithm takes in terms of the amount of input to the algorithm.


B.1) Doing some tests with Timers

So, for time complexity it seems that a good place to start would be using a timer.

For example:

Implementation 1 (using the for loop)
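
For example, a quick Python timing sketch (input size and names are my own illustration):

```python
import time

t0 = time.time()
add_up_to_v1(10_000_000)   # the loop-based version defined above
t1 = time.time()
print(f"Elapsed: {t1 - t0} seconds")
```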

We can repeat this process with the other implementation and compare them to see which one is faster,

Is this a good idea? Well, not really...

The problem of using timers or time itself is that:

  1. Different machines will record different times

  2. The same machine will record different times

  3. For fast algorithms, speed measurements may not be precise enough.

Let's see an example using timers, so that we see their limitations, and plot the time elapsed against the input size "n".

How to measure time?

We can use the following website: https://rithmschool.github.io/function-timer-demo/

1st Experiment:
Using a small input size (1-10)

We can see that with small values of n, the timer is unpredictable. This is the problem with timers. But now let's try using larger inputs, so that we give our machine some heavy work.

2nd Experiment:
Using a larger input size (1 - 1,000,000,000)

This is not at all a practical method to test our code, but now we can see a clear trend: our 2nd implementation is much faster than the 1st one when n is large.

That is the approach we are going to take in evaluating our code. We are going to think about how it performs as the input gets towards "infinity".

But, we are not going to make a graph and chart our code every single time we evaluate the performance of our code relative to other code.

So, instead of working with "time", we will work with the "number of operations" the computer has to perform. With this, we will be able to determine how our code performs just by looking at it without having to run our code and print a graph.

We will be able to roughly approximate what that chart will look like if we were to plot it.

Let's see how this works with our previous examples.

B.2) A better method, counting operations

Counting Operations - 2nd Implementation: So, we count the "number of operations" in a very simplified way.

Counting Operations - 1st Implementation: Now, if we analyze the 1st implementation, things are different.

But the exact number of operations doesn't really matter; the big-picture trend shows that as n grows (n → ∞), the "number of operations" is going to grow (at least) proportionally to n (because of the for loop).

Depending on what we count, the number of operations can be as low as 2n or as high as 5n+2.

But as we said, the exact number of operations doesn't really matter. When n is large, there is no significant difference between 2n, 5n + 2, or even just n: the graph will have the shape of a straight line.


C) So... What is Big O Notation?

[[Big O Notation]] (Landau's symbol) is used in computer science and [[mathematics]] to describe the asymptotic behavior of functions. Basically, it tells you how fast a function grows or declines.

Landau's symbol comes from the name of the German number theoretician Edmund Landau who invented the notation. The letter O is used because the rate of growth of a function is also called its order.

In other words, Big O Notation is a "fuzzy" way of "formalized counting".

We say "fuzzy" because we are not being very specific, we are just zooming way out and imagining the graph of "number of operations" as n gets larger.

It is a nice way of describing how good our code is,

  • is it "Good" (and fast)?
  • Or is it "Awful" (and slow)?

Except, instead of using "colors" to describe the performance of our code, with [[Big O]] we describe it with a mathematical definition:

C.1) Formal Definition: Big O Notation

Formal Definition - Big O Notation

f(x) is an element of O(g(x)) if and only if there exists a positive constant K (sometimes also written as M) and some x₀ such that, for all x greater than x₀, f(x) is at most K·g(x).

Or written in formal mathematics:
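
f(x) ∈ O(g(x)) ⟺ ∃ K > 0 and ∃ x₀ such that f(x) ≤ K·g(x) for all x > x₀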

I know, the definition can be confusing. But it is way clearer if you see the definition as a graph,

So, what this definition is telling us is that:

The number of simple operations a computer has to do in an algorithm, for example: f(n)=5n+2 is an element of O(g(n))

where g(n) can be:

  • linear g(n) = n
  • quadratic g(n) = n^2
  • or a constant g(n) = 1

if and only if, for all large enough n, f(n) ≤ K·g(n)

For example: 5n + 2 ≤ 7n, whenever n ≥ 1

What is this "5n + 2 ∈ O(n)" notation telling us?
It is telling us that the number of operations f(n) (the time it takes) is going to grow roughly in proportion with g(n) as n grows.



C.2) Applying Big O Notation to our Example (Total Sum)

Let's go back to our example

Fuzzy Counting - 2nd Implementation:

We saw there were 3 operations, regardless of the size of n (f(n) = 3).
But with "fuzzy" counting we just want the big picture, so we would call this O(1), also known as constant time.

This means that as n grows, the number of operations we have to do stays the same.

And if we were to plot that in a chart, it would look like a flat line: time doesn't increase if we increase the input n.

Now let's compare the 1st implementation.

Fuzzy Counting - 1st Implementation

In this case, we know that the number of operations is going to grow in proportion with n (f(n) = 5n + 2).

But we also know that f(n) is bounded by a multiple of g(n) = n (say, 7n).

So, the general trend is O(n) (also known as linear time).

This means that the time it takes for our function to get a result grows like a straight line as we increase n.

Note: we saw that with small values of n, the trend was not clear, but with larger values of n the trend is clear: it is a straight line.

C.3) Applying Big O Notation to more Examples

Let's review more examples:

Example 2: *"Write a function that takes a number "n" and prints all of the numbers from 0 until n and then prints them from n to 0 "

Here is our code:
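
A Python sketch of such a function (hypothetical name; the course uses JavaScript):

```python
def count_up_and_down(n):
    for i in range(n + 1):        # first loop: 0 up to n
        print(i)
    for i in range(n, -1, -1):    # second loop: n down to 0
        print(i)
```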

It has 2 separate for loops; we could count the operations as f(n) = 2n, but the general trend is just O(n). Therefore, it runs in linear time.


Example 3: "Write a function that prints all the pairs of numbers between 0 and n"

For example: (0,1), (0,2), (0,3), ..., (1,1), (1,2), (1,3), ..., (n,n)

Here is our code:
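
A Python sketch (hypothetical name; the course uses JavaScript):

```python
def print_all_pairs(n):
    for i in range(n):
        for j in range(n):   # nested loop: about n^2 print operations
            print(i, j)
```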

Now we have a nested for loop (not good for [[time complexity]]; we don't get fast code).

If we count operations, we now see that we have to do roughly n^2 operations (due to the nested for loop).

In this case we have O(n2) also known as quadratic time.

This means that the trend now has the shape of a parabola, which grows much faster than a straight line.


C.4) Comparing the Big O Time Complexity Functions

In fact, we can compare two of our functions in JavaScript. We can compare addUpTo(n) which runs in linear time and printAllPairs(n) which runs in quadratic time.

So, here is a chart that shows some of the most common [[Big O time complexities functions]]:

| Big O Notation | Time Complexity | Description |
| --- | --- | --- |
| O(1) | Constant time | The best, but very rare |
| O(log n) | Logarithmic time | Comes up often with [[data structures]] like [[trees]] and with [[sorting algorithms]] |
| O(n) | Linear time | Time is proportional to n |
| O(n^2) | Quadratic time | Very slow for large n (though sometimes faster for small n). Suggestion: just avoid it! |

C.5) Simplifying Big O Expressions (rules of thumb)

When determining the [[time complexity]] of an algorithm, there are some helpful rules of thumb for Big O expressions.

These rules of thumb are consequences of the definition of [[Big O notation]].

Rules of Thumb:

Let's see more Big O Shorthands:
Note: These rules won't always work, but are a helpful starting point.
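
For instance, two standard consequences of the definition: constants don't matter, so O(2n) simplifies to O(n) and O(500) to O(1); and smaller terms don't matter, so O(n^2 + 5n + 8) simplifies to O(n^2).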


D) Test what you learnt (more examples)

Let's consider some other examples...
What is the [[time complexity]] of the following functions?

Example 1: "write a function that shows in the console a list of numbers of at least 5 numbers from 1 to 5 and at most n numbers from 1 to n"

Here is our code:
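
A Python sketch (hypothetical name):

```python
def log_at_least_5(n):
    for i in range(1, max(5, n) + 1):   # loops max(5, n) times
        print(i)
```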

Example 2: "write a function that shows in the console a list of numbers of at most 5 numbers from 1 to 5 and at least n numbers from 1 to n"(where n5 )

Here is our code"
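
A Python sketch (hypothetical name):

```python
def log_at_most_5(n):
    for i in range(1, min(5, n) + 1):   # never more than 5 iterations
        print(i)
```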

Note: [[Space Complexity]]

So far, we've been focusing on [[time complexity]]: how can we analyze the runtime of an algorithm as the size of the inputs increases?

We can also use [[big O notation]] to analyze [[space complexity]]: how much additional memory do we need to allocate in order to run the code in our algorithm?


E) More Info: Time and Space Complexity in Data Structures and Algorithms

Did you know?

Just like we have [[Big O Notation]], we also have [[Big Omega Ω Notation]] and [[Big Theta Θ Notation]]. These are all notations used in [[computer science]] to describe the limiting behavior of functions, often to characterize the efficiency of algorithms.

In summary:

  • [[Big O]] is about the worst-case scenario.
  • [[Big Omega]] is about the best-case scenario.
  • [[Big Theta]] is when the worst-case and best-case are the same.

It's worth noting that in practice, [[Big O]] is the most commonly used notation because it provides an upper limit on performance, which is typically the most critical information when assessing the viability of an algorithm. However, knowing all three gives a complete picture of an algorithm's efficiency.

Read more here: Difference between Big O, Big Omega Ω and Big Theta Θ

Data Structures

Algorithms


↩️ Go Back


Z) 🗃️ Glossary

File Definition

E) Extra Content: P vs NP (explained 🎦)

Video: https://youtu.be/EHp4FPyajKQ?si=4yA88InvCRRUyJ2E

#Math #Discrete_Math #Computer_Science #Algorithms #Complexity_Theory

↩️ Go Back

Table of Contents:


A) Background: The Millennium Prize Problems

The P vs. NP problem has often been called "the biggest unsolved problem in Computer Science"

There is currently a $1 million prize waiting for the person who solves it.

B) But... what is the P vs. NP problem?

The P vs. NP problem can roughly be translated to:
"If a problem is easy to check the solution to, is it also easy to find the solution to?"

This question sounds crazy...

  • Why is it even a question in the 1st place?
  • What does this have to do with Computer Science?

B.1) Example: Let's play Scrabble! (finding Algorithmic Equivalence)

Let's start by playing a game!

The game is called "Number Scrabble" and it is a 2 player game (1 vs 1).

How to play:

  1. We place 9 numbers on the table (the numbers are in order).

  2. Then the 1st player picks a number.

  3. Then the 2nd player picks a number.

  4. The aim of the game is to pick 3 numbers that add up to 15. Whoever does this first, WINS!!

  5. If all numbers are picked and no player has exactly 15, the game is a draw.

Now, how would you go about winning this game?

Options:

  1. You can try some trial and error, and block the other player from getting to 15.

  2. Or, we can "reframe the game"....

Instead of lining up the numbers in a line, we can arrange them in a square (like a 3x3 matrix) in a VERY specific order...
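
That specific order is a 3x3 magic square, in which every row, column, and diagonal sums to 15. One standard arrangement is:

2 7 6
9 5 1
4 3 8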

Now, with the magic square, how would you try to win this game?

Options:

  1. We can now try to pick 3 numbers which form a winning triplet, and block the opponent from doing the same.

Now we are playing Tic-Tac-Toe!!!

When we arrange "Number Scrabble" in a magic square, we see it is really the same game as Tic-Tac-Toe.

A lot of games can be reduced to the exact same game. And a lot of problems in math and computer science can be reduced to the same problem.

But, why is this important?

If 2 problems can be reduced to the same problem, they can be solved by the same strategy.

Any strategy for Tic-Tac-Toe can also solve "Number Scrabble"


B.2) Computational Complexity and Complexity Classes in PSPACE

In Math and Computer Science these "strategies" are called "algorithms". And they are mechanical procedures (like a recipe).

You don't need to think when carrying it out, just follow each step until you've reached the solution.

Note: Depending on the problem, algorithms can be more or less complex

How complicated an algorithm a problem needs in order to be solved is one of the main areas of interest in [[computer science]].

It is so important, in fact, that problems are grouped into classes based on how complicated they are.

This area is called Computational Complexity.

And the groups of problems are called Complexity Classes

Each complexity class has its own name. The 2 we are interested in are the:


"P and NP"are just classes of problems based on how difficult they are to solve.

Note: [[P vs NP problem]]

The “Complexity Class P” contains all decision problems that can be solved by a deterministic Turing machine using a polynomial amount of computation time, also known as polynomial time.


The “Complexity Class NP” is the set of decision problems for which the problem instances (or solutions), where the answer is "yes", have proofs verifiable in polynomial time by a deterministic Turing machine, or alternatively the set of problems that can be solved in polynomial time by a nondeterministic Turing machine.


Decision Problem

A decision problem has only two possible outputs (yes or no) on any input.
Read more here


B.2.1) Complexity Class "P"

Let's see what these classes mean....

P class: It is the class of all problems which have an algorithm that can be computed in "polynomial time". Or, in [[Big O Notation]], when the "[[time complexity]]" is O(n^c), where c ≥ 1 is a constant and n is the input size.

Remember

P stands for "Polynomial Time"


P is the set of problems for which we can find an answer in a reasonable time, with a polynomial-time algorithm: O(n^c), c ≥ 1.

In other words...
The class P contains all problems a computer can solve in a reasonable amount of time.

Examples:


Keep in mind...

The list of complexity classes is huge!

But we are only discussing these 2 for now...



B.2.2) Complexity Class "NP"

NP class: it is the class of problems whose solutions can be verified in "polynomial time", but which may take an exponential number of steps to solve. Or, in [[Big O Notation]], the "[[time complexity]]" of solving them can be as large as O(2^n), where n is the input size.

Remember

NP stands for "Non - deterministic Polynomial Time"


NP is the set of problems for which we can check in a reasonable time whether an answer is correct or not.

In other words...
NP = problems for which we can verify that "something is a solution" in polynomial time

Examples:
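
(For instance: Sudoku, the backpack/knapsack problem, or Boolean satisfiability.)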


B.3) What is the important difference between P and NP?

Computers take an extremely long time to solve NP problems...

For example:

If a problem is in the class P, has 100 inputs, and its algorithm runs in time proportional to n^3 (polynomial time), the algorithm could solve the problem in, say, about 3 hours.


If the problem is in the NP class, with a completion time proportional to 2^n (exponential time), the algorithm will take roughly 300 quintillion years to solve. That is longer than the age of the universe.

Although NP problems take a very long time to calculate, they also take a relatively short time to verify.

Consider Sudoku!

Sudoku is an NP problem: it takes some effort and time to solve, but once it is solved, it is pretty quick to check that the solution is correct.

Did you know?

Even Candy Crush is an NP problem



C) Let's recap: Does P = NP?

Let's go back to our original question!

We said previously that the P vs. NP problem can roughly be translated to:
"If a problem is easy to check the solution to, is it also easy to find the solution to?"

In other words...
Does P = NP?

What are we really asking here?
We are asking whether the complexity classes P and NP are in fact the same class.

That is, can we perhaps solve these exponential problems in polynomial time?

Why is this even a question?
P and NP seem to be very different things, why would anyone think they are the same?

One reason could be: since P ≠ NP has not been proven, there is still a possibility that P = NP, and that is what we want to settle:

"Is P = NP?"
Note: most computer science researchers don't think P = NP.


C.1) What would happen if P = NP?

To understand why this is even a question, we should consider what would happen if P = NP.

Optimization problems like:

  1. Transport routing

  2. Production of goods

  3. Job scheduling

  4. Circuit and database design

These are all NP problems; everything would be more efficient if we could find the optimal solutions to all of them. Anything that is feasible to check would become feasible to solve.

Other NP problems, like "protein folding", would have "polynomial time" solutions, and we could cure cancer.

As MIT complexity researcher Scott Aaronson said:

But we would also have some bad news....

Encryption (for banks) and privacy all rely on P ≠ NP.

If P = NP, all our passwords and our information would be feasible for anyone to obtain.


C.2) Has an NP problem ever been solved in Polynomial Time?

Before we continue asking if P = NP:

Has an NP problem ever been solved in polynomial time?

Let's consider the following problem:
We are given the task to write an algorithm that can "sort" a list of numbers

C.2.1) Testing the "Bubble Sort Algorithm"

1st Attempt - A simple algorithm (Bubble sort algorithm).

This algorithm goes through an entire array and compares each neighboring number.

It puts the larger value on the right and the smaller value on the left.

The runtime of this algorithm is of the order of n^2.
So, if there are n = 100 numbers to be sorted, it will take a maximum of about 10,000 steps to sort the array. But can we do better?
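
A Python sketch of bubble sort (my own illustration, not from the video):

```python
def bubble_sort(arr):
    # Repeatedly swap neighbors so larger values "bubble" right: ~n^2 steps.
    a = list(arr)
    for i in range(len(a)):
        for j in range(len(a) - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
    return a
```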


C.2.2) Testing the "Merge Sort Algorithm"

2nd Attempt - Divide-and-conquer (Merge Sort algorithm)

This new (recursive) program takes an array, splits it in 2 down the middle, and keeps doing so until only single elements are left.

Then it takes 2 neighboring squares and sorts them with the lower number on the left and the higher number on the right.

It then looks at the neighboring pairs and compares the lower numbers with each other and sorts them.

Then it compares the higher numbers and sorts them.

It does this with all pairs...

And then it merges the pieces back into their previous subarrays.

And then we repeat the process, we sort and merge until we have a complete array sorted.

Think of it as if the 2 subarrays were "competing" to see which has the smaller value; the winner sends its number into the bigger, merged subarray.

This new algorithm has a runtime of the order of n·log(n), which means that if there are 100 inputs, it'll take around 100·log(100) = 200 steps to finish (using log base 10).

The Bubble sort algorithm required 10,000 steps and now we can do it with only 200.

This is a massive improvement!
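
A Python sketch of merge sort (my own illustration, not from the video):

```python
def merge_sort(arr):
    # Divide and conquer: split, sort each half recursively, then merge.
    if len(arr) <= 1:
        return arr
    mid = len(arr) // 2
    left, right = merge_sort(arr[:mid]), merge_sort(arr[mid:])
    merged = []
    while left and right:  # the two halves "compete" for the smaller value
        merged.append(left.pop(0) if left[0] <= right[0] else right.pop(0))
    return merged + left + right

print(merge_sort([5, 2, 8, 1, 9, 3]))  # [1, 2, 3, 5, 8, 9]
```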


C.2.3) Conclusion: From the NP realm to the P realm

So... what is the point of this example?
The point is that sometimes you can improve an algorithm so that it solves a problem faster.

The problem didn't change (we are still sorting numbers), but by being clever we found a faster way to do it.

The same happens in the case of SOME NP problems.

It was thought that they couldn't be solved in polynomial time, but then some clever person came along with an amazing algorithm and moved the problem from the NP realm to the P realm.

Wait...
So if it was shown that SOME NP problems are in fact in P, doesn't that prove that P = NP?

Ans: No, we need to consider NP Complete Problems (the hardest problems in NP).


C.3) Complexity Classes and NP Complete Problems

Let's review complexity classes:

NP problems can all be solved in exponential time or less.


Some NP problems (those in the P class) can be solved in polynomial time or less.
Therefore, the class P is contained within the class NP.


Some NP problems have been shown to be in the class P,

However, the hardest problems in NP (the NP complete problems), have never been shown to be in the class P.



C.3.1) So, P = NP (complete) is still unsolved...

So the question of whether P = NP still isn't solved, because the hard part of "NP" is really the NP-complete problems.

If just one of the NP-complete problems was shown to be in the class P, that would mean that all of the NP-complete problems would be in the class P (just like solving "Number Scrabble" also solves Tic-Tac-Toe).

And if some NP problems have been shown to be in class P, it is not too crazy to think that maybe just one NP-complete problem might be in the class P.

Maybe someday someone will come up with an amazing algorithm that reduces one NP-complete problem to a P problem.

This would show that all NP problems could be reduced to P problems and that would change the world...


↩️ Go Back


Z) 🗃️ Glossary

File Definition


Week 2 - How to find an Example

Magic Squares