Big O Notation by Colt Steele

Video: https://youtu.be/kS_gr2_-ws8?si=Sp8pkVzo5h9rb4oT

#Math #Discrete_Math #Computer_Science #Algorithms #Complexity_Theory

↩️ Go Back

Table of Contents:

  A) How do we know which code is "best"? (Reverse String Example)
  B) Understanding "Time Complexity" (Total Sum Example)
  C) So... What is Big O Notation?
  D) Test what you learnt (more examples)
  E) More Info: Time and Space Complexity in Data Structures and Algorithms
  Z) 🗃️ Glossary
A) How do we know which code is "best"? (Reverse String Example)

What is Big O Notation? Let's begin with an example

Imagine we have multiple implementations of the same function in a computer program.

For example, imagine we have 10 different variations that all work and can solve our problem.

How do we know which is "best"?

Let's use a concrete example: We are given the following task,
"Write a function that accepts a string input and returns a reversed copy of that string "

Here are 10 implementations found in ..., written in JavaScript:
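The original note embeds the implementations as an image, which is missing here. As a stand-in, here is a minimal sketch of two possible variants (illustrative names and code, not necessarily the ten from the note):

```js
// Variant 1: build the reversed string with a loop
function reverseLoop(str) {
  let reversed = "";
  for (let i = str.length - 1; i >= 0; i--) {
    reversed += str[i]; // append characters from last to first
  }
  return reversed;
}

// Variant 2: lean on built-in array methods
function reverseBuiltIn(str) {
  return str.split("").reverse().join("");
}

console.log(reverseLoop("awesome"));    // "emosewa"
console.log(reverseBuiltIn("awesome")); // "emosewa"
```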

How do we know which one is "best"?

A.1) What does "best" even mean? Why should we care?

It would be nice to have a scale that could help us score our code, so we would know if our code is "Great!", "OK", or even "Awful". But how can we get such a scale?

This is where Big O Notation can help us!
It is going to help us talk about and compare our code.

But, why should we compare our code? Why should we even care?

Because...

  1. it is important to have a precise vocabulary to talk about how our code performs.

  2. it is useful for discussing trade-offs between different approaches and can help us make decisions.

  3. when our code slows down or crashes, identifying the parts of the code that are inefficient can help us find bottlenecks and pain points in our applications.

  4. it comes up in interviews...


B) Understanding "Time Complexity" (Total Sum Example)

Let's use a SIMPLE example: We are given the following task:
"Write a function that calculates the sum of all numbers from 1 up to (and including) some number n."

Here are 2 implementations using JavaScript:

Implementation 1

It uses a for loop that runs n times. For each iteration it executes one addition (total += i).
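The embedded snippet is missing from these notes; here is a reconstruction consistent with the description (the name addUpTo(n) is taken from section C.4, where it is mentioned explicitly):

```js
function addUpTo(n) {
  let total = 0;
  for (let i = 1; i <= n; i++) {
    total += i; // one addition per iteration, repeated n times
  }
  return total;
}
```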

Implementation 2

This second version is shorter and uses Gauss's formula: it evaluates n(n+1)/2 only once.
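Likewise, a reconstruction of the formula-based version:

```js
function addUpTo(n) {
  return n * (n + 1) / 2; // Gauss's formula, evaluated once
}
```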

How do we know which one is better?
What does "better" even mean?

Is it the...

  1. faster one?

  2. less memory intensive? (the one that takes up less space?)

  3. more readable?

  4. shorter one? (the one with fewer lines?)

Big O focuses on speed, also known as "time complexity".

Note: focusing on memory is known as "space complexity".

  • Time complexity is a function describing the amount of time an algorithm takes in terms of the amount of input to the algorithm.

  • Space complexity is a function describing the amount of memory (space) an algorithm takes in terms of the amount of input to the algorithm.


B.1) Doing some tests with Timers

So, for time complexity it seems that a good place to start would be using a timer.

For example:

Implementation 1 (using the for loop)
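A minimal sketch of such a timing test (the exact code is assumed; performance.now() returns a high-resolution timestamp in milliseconds):

```js
let t1 = performance.now();
addUpTo(1000000000); // the for-loop version
let t2 = performance.now();
console.log(`Time elapsed: ${(t2 - t1) / 1000} seconds.`);
```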

We can repeat this process with the other implementation and compare the results to see which one is faster.

Is this a good idea? Well, not really...

The problem with using timers (and with measuring time in general) is that:

  1. Different machines will record different times

  2. The same machine will record different times

  3. For fast algorithms, speed measurements may not be precise enough.

Let's see an example using timers, so that we see their limitations, and plot the time elapsed against the input size "n".

How to measure time?

We can use the following website: https://rithmschool.github.io/function-timer-demo/

1st Experiment:
Using a small input size (1-10)

We can see that with small values of n, the timer is unpredictable. This is the problem with timers. But now let's try using larger inputs, so that we put our machine under some heavy load.

2nd Experiment:
Using a larger input size (1 - 1,000,000,000)

This is not at all a practical method to test our code, but now we can see a clear trend: our 2nd implementation is much faster than the 1st one as n gets large.

That is the approach we are going to take in evaluating our code. We are going to think about how it performs as the input gets towards "infinity".

But we are not going to make a graph and chart our code every single time we want to evaluate its performance relative to other code.

So, instead of working with "time", we will work with the "number of operations" the computer has to perform. With this, we will be able to determine how our code performs just by looking at it without having to run our code and print a graph.

We will be able to roughly approximate what that chart will look like if we were to plot it.

Let's see how this works with our previous examples.

B.2) A better method: counting operations

Counting Operations - 2nd Implementation: we count the "number of operations" the computer performs, using a very simplified way of counting.
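For instance, annotating the formula-based version (a sketch):

```js
function addUpTo(n) {
  return n * (n + 1) / 2; // 1 multiplication, 1 addition, 1 division
}
// 3 simple operations, regardless of how large n is
```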

Counting Operations - 1st Implementation: now, if we analyse the 1st implementation, things are different.
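Annotating the loop-based version (a sketch; the tally below is one reasonable way of counting):

```js
function addUpTo(n) {
  let total = 0;                 // 1 assignment
  for (let i = 1; i <= n; i++) { // 1 assignment, then n comparisons
                                 // and n increments (n additions + n assignments)
    total += i;                  // n additions and n assignments
  }
  return total;
}
// Counting everything: 5n + 2 operations; counting only the additions: about 2n
```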

But the exact number of operations doesn't really matter: the bigger-picture trend shows that as n grows, the "number of operations" grows (at least) proportionally to n (because of the for loop).

Depending on what we count, the number of operations can be as low as 2n or as high as 5n+2.

But as we said, the exact number of operations doesn't really matter. When n is large, there is no significant difference between 2n, 5n + 2, or even just n: the graph will have the shape of a straight line.


C) So... What is Big O Notation?

Big O notation (with a capital letter O, not a zero), also called Landau's symbol, is a symbolism used in complexity theory, computer science, and mathematics to describe the asymptotic behavior of functions. Basically, it tells you how fast a function grows or declines.

Landau's symbol comes from the name of the German number theoretician Edmund Landau who invented the notation. The letter O is used because the rate of growth of a function is also called its order.

In other words, Big O Notation is a "fuzzy", formalized way of counting.

We say "fuzzy" because we are not being very specific, we are just zooming way out and imagining the graph of "number of operations" as n gets larger.

It is a nice way of describing how good our code is, except that instead of using "colors" to describe the performance of our code, with Big O we describe it with a mathematical function, which has the following definition:

C.1) Formal Definition: Big O Notation


f(x) is an element of O(g(x)) if and only if there exists a positive constant K (sometimes also written as M) and some x₀ such that, for all x greater than x₀, f(x) ≤ K·g(x).

Or written in formal mathematics:
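$$f(x) \in O(g(x)) \iff \exists\, K > 0,\ \exists\, x_0 \ \text{ such that } \ f(x) \le K \cdot g(x) \ \text{ for all } \ x > x_0$$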

I know, the definition can be confusing. But it is way clearer if you see the definition as a graph,

So, what this definition is telling us is that:

The number of simple operations a computer has to do in an algorithm, for example f(n) = 5n + 2, is an element of O(g(n)),

where g(n) can be, for example, g(n) = n,

if and only if, in some interval where n is large, f(n) ≤ K·g(n).

For example: 5n + 2 ≤ 7n, whenever n ≥ 1.

What is this "5n + 2 ∈ O(n)" notation telling us?
It is telling us that the amount of operations f(n) (the time it takes) is going to grow roughly in proportion with g(n) as n grows.

Did you know?

Just like we have Big O Notation, we also have Big Omega Ω Notation and Big Theta Θ Notation. These are all notations used in computer science to describe the limiting behavior of functions, often to characterize the efficiency of algorithms.

In summary:

  • Big O is about the worst-case scenario.
  • Big Omega is about the best-case scenario.
  • Big Theta is when the worst-case and best-case are the same.

It's worth noting that in practice, Big O is the most commonly used notation because it provides an upper limit on performance, which is typically the most critical information when assessing the viability of an algorithm. However, knowing all three gives a complete picture of an algorithm's efficiency.

Read more here: Difference between Big O, Big Omega Ω and Big Theta Θ


C.2) Applying Big O Notation to our Example (Total Sum)

Let's go back to our example

Fuzzy Counting - 2nd Implementation:

We saw there were 3 operations, regardless of the size of n (f(n) = 3).
But with "fuzzy" counting we just want the big picture, so we would call this O(1), also known as constant time.

This means that as n grows, the number of operations we have to do stays the same.

And if we were to plot that in a chart, it would look like a flat line: time doesn't increase if we increase the input "n".

Now let's compare the 1st implementation.

Fuzzy Counting - 1st Implementation

In this case, we know that the number of operations is going to grow in proportion with n (f(n)=5n+2)

But we also know that f(n) is bounded by a multiple of g(n) (say, g(n)=7n).

So, the general trend is O(n) (also known as linear time).

This means that the time it takes for our function to get a result grows as a straight line as we increase n.

Note: we saw that with small values of n, the trend was not clear, but with larger values of n the trend is clear: it is a straight line.

C.3) Applying Big O Notation to more Examples

Let's review more examples:

Example 2: *"Write a function that takes a number "n" and prints all of the numbers from 0 until n and then prints them from n to 0 "

Here is our code:
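(The embedded snippet is missing; this is a reconstruction consistent with the description, with an illustrative function name.)

```js
function countUpAndDown(n) {
  for (let i = 0; i <= n; i++) {
    console.log(i); // first loop: counts up, roughly n iterations
  }
  for (let j = n; j >= 0; j--) {
    console.log(j); // second loop: counts down, another n iterations
  }
}
```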

It has 2 separate for loops. We could count the operations as f(n) = 2n, but the general trend is just O(n); therefore it has linear time.


Example 3: "Write a function that prints all the pairs of numbers between 0 and n"

For example: 0 1, 0 2, 0 3, ... , 1 1, 1 2, 1 3, ... , n n

Here is our code:
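(Again, the embedded snippet is missing; this reconstruction matches the description and the printAllPairs(n) name used in section C.4.)

```js
function printAllPairs(n) {
  for (let i = 0; i < n; i++) {   // outer loop: n iterations
    for (let j = 0; j < n; j++) { // inner loop: n iterations per outer iteration
      console.log(i, j);          // runs roughly n * n = n² times
    }
  }
}
```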

Now we have a nested for loop (not good for time complexity: nested loops don't make for fast code).

If we count operations, we now see that we have to do roughly n² operations (due to the nested for loop).

In this case we have O(n²), also known as quadratic time.

This means that the trend now has the shape of a parabola, which grows much faster than a straight line.


C.4) Comparing the Big O Time Complexity Functions

In fact, we can compare two of our functions in JavaScript. We can compare addUpTo(n) which runs in linear time and printAllPairs(n) which runs in quadratic time.
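For example, a quick-and-dirty comparison using the same timer approach as before (a sketch; timeIt is a hypothetical helper, not from the note):

```js
function timeIt(fn, n) {
  const t1 = performance.now();
  fn(n);
  const t2 = performance.now();
  console.log(`${fn.name}(${n}) took ${(t2 - t1) / 1000} seconds`);
}

timeIt(addUpTo, 1000);       // O(n): work grows linearly with n
timeIt(printAllPairs, 1000); // O(n²): 1,000,000 pairs; grows quadratically
```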

So, here is a chart that shows some of the most common Big O time complexity functions:

| Big O Notation | Time Complexity | Description |
| --- | --- | --- |
| O(1) | Constant Time | It is the best, but it is very rare |
| O(log n) | Logarithmic Time | Comes up often when dealing with data structures like trees and with sorting algorithms |
| O(n) | Linear Time | Time is proportional to n |
| O(n²) | Quadratic Time | Very slow with large values of n (but sometimes faster when dealing with small n). Suggestion: just avoid it! |

C.5) Simplifying Big O Expressions (rules of thumb)

When determining the time complexity of an algorithm, there are some helpful rules of thumb for Big O expressions.

These rules of thumb are consequences of the definition of Big O notation.

Rules of Thumb:

  1. Constants don't matter: O(2n) simplifies to O(n), O(500) simplifies to O(1), and O(13n²) simplifies to O(n²).

  2. Smaller terms don't matter: O(n + 10) simplifies to O(n), and O(n² + 5n + 8) simplifies to O(n²).

Let's see more Big O Shorthands:

  1. Arithmetic operations are constant.

  2. Variable assignment is constant.

  3. Accessing elements in an array (by index) or object (by key) is constant.

  4. In a loop, the complexity is the length of the loop times the complexity of whatever happens inside the loop.

Note: These rules won't always work, but they are a helpful starting point.


D) Test what you learnt (more examples)

Let's consider some other examples...
What is the time complexity of the following functions?

Example 1: "write a function that shows in the console a list of numbers of at least 5 numbers from 1 to 5 and at most n numbers from 1 to n"

Here is our code:
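(Reconstruction sketch; the function name logAtLeast5 is an assumption, not from the note.)

```js
function logAtLeast5(n) {
  // The loop runs max(5, n) times, which grows with n → O(n)
  for (let i = 1; i <= Math.max(5, n); i++) {
    console.log(i);
  }
}
```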

Example 2: "write a function that shows in the console a list of numbers of at most 5 numbers from 1 to 5 and at least n numbers from 1 to n"(where n5 )

Here is our code"

So far, we've been focusing on time complexity: how can we analyze the runtime of an algorithm as the size of the inputs increases?

We can also use big O notation to analyze space complexity: how much additional memory do we need to allocate in order to run the code in our algorithm?


E) More Info: Time and Space Complexity in Data Structures and Algorithms


Data Structures

Algorithms


↩️ Go Back


Z) 🗃️ Glossary

| File | Definition |
| --- | --- |