Welcome to CS50, Harvard University’s introduction to computer science and programming.
At its core, computer science is about problem-solving.
You take an input, which is the problem you wish to solve, and work toward a solution, or output. In between lies a “black box” of code — the magic that enables computers to process solutions.
Our goal is to empower you to control this process.
For many students, CS50 represents a shift from familiar subjects to a new way of thinking. Nearly two-thirds of students each year have no prior background in computer science, making it a welcoming course for beginners.
What matters most is your own progress, rather than comparisons to others.
Early in the course, you’ll recreate a portion of Super Mario Brothers using ASCII art, giving you a taste of foundational programming.
Later, you’ll build CS50 Finance, a web application that simulates real-time stock trading using virtual currency.
Over time, you’ll move from simple projects to fully functional applications, culminating in a final project of your own design.
What is Computer Science?
A simple way to understand data representation is to count the people in a room on your fingers. This unary notation is easy to grasp but limited. We humans are more used to counting in base 10 (using the digits 0, 1, 2, 3, 4, 5, 6, 7, 8, 9).
The unary numeral system is the simplest numeral system for representing natural numbers.
To represent a number N, a symbol representing 1 is repeated N times.
In the unary system, the number 0 (zero) is represented by the empty string, that is, the absence of a symbol.
Numbers 1, 2, 3, 4, 5, 6, ... are represented in unary as 1, 11, 111, 1111, 11111, 111111, ...
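The unary rule above can be sketched in a few lines of C. This is only an illustration: the function name to_unary and the caller-supplied buffer are assumptions made for the example, not something taken from the lecture.

```c
/* Writes n in unary (n copies of the symbol '1') into out and returns out.
   The caller must provide room for n + 1 characters.
   to_unary is a hypothetical helper name used only for this sketch. */
char *to_unary(unsigned int n, char *out)
{
    for (unsigned int i = 0; i < n; i++)
    {
        out[i] = '1';
    }
    out[n] = '\0'; /* for n == 0 this leaves the empty string */
    return out;
}
```

Calling to_unary(5, buffer) with a buffer of at least 6 characters fills it with 11111, and the string grows by one symbol for every extra unit, which is why unary scales so poorly.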
Computers use binary, which consists of only 0 and 1.
Unlike our base 10 system (from 0 to 9), where we construct numbers based on powers of ten, binary is based on powers of two, creating a simplified yet powerful framework for data representation.
The 0 and 1 of binary are known as bits (binary digits).
Computers communicate through binary because they operate with electricity: on (1) or off (0). This state can be represented by light bulbs or, more accurately, by millions of tiny transistors inside a computer, each capable of switching on or off.
Increasing Representation with Bits:
With multiple bits, we can represent increasingly complex values.
For example, three bits (each representing 0 or 1) enable us to count from 000 to 111, or from 0 to 7. Each additional bit doubles the number of values we can represent, enabling computers to process and store vast amounts of data.
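The doubling can be checked with a one-line calculation. The sketch below assumes a hypothetical helper named values_for_bits and only works for fewer than 64 bits, since it relies on a 64-bit integer.

```c
/* Number of distinct values that a given number of bits can represent (2^bits).
   values_for_bits is a hypothetical name; valid only for bits < 64. */
unsigned long long values_for_bits(unsigned int bits)
{
    /* Shifting 1 to the left by one position doubles it, so shifting
       by `bits` positions doubles it `bits` times. */
    return 1ULL << bits;
}
```

With 3 bits the function gives 8 values (0 through 7), and with 8 bits it gives 256.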
Computers rely on base 2, where each position corresponds to a power of two: ones, twos, fours, and so on. For example:
010 represents the decimal value 2.
111 represents the decimal value 7.
With this structure, we move beyond simple light bulb analogies to a scalable, efficient representation system.
Binary Number Example: Converting 110010 to Decimal
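Consider the binary number 110010. What decimal value does this represent? Here’s how to break it down, reading the digits from left to right:
1 × 32 = 32
1 × 16 = 16
0 × 8 = 0
0 × 4 = 0
1 × 2 = 2
0 × 1 = 0
Following this pattern, the binary 110010 translates to 32 + 16 + 2 = 50 in decimal.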
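The same conversion can be written as a short C function. This is a minimal sketch, not code from the lecture; the name binary_to_decimal and the choice to return -1 on invalid input are assumptions made for the example.

```c
/* Converts a string of '0' and '1' characters to its decimal value.
   Returns -1 if the string contains any other character.
   binary_to_decimal is a hypothetical name used for illustration. */
long binary_to_decimal(const char *bits)
{
    long value = 0;
    for (int i = 0; bits[i] != '\0'; i++)
    {
        if (bits[i] != '0' && bits[i] != '1')
        {
            return -1;
        }
        /* Moving one position to the right doubles the weight of
           everything read so far, then the new digit is added. */
        value = value * 2 + (bits[i] - '0');
    }
    return value;
}
```

Passing "110010" yields 50, matching the sum 32 + 16 + 2 worked out by hand.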
The Power of Binary in Computing:
Binary’s simplicity enables computers to handle complex calculations. Using patterns of on and off switches (bits), computers process data efficiently. The ability to represent numbers and data in binary underpins all computational processes, from simple calculations to complex algorithms.
Computers began as calculators, but they now handle much more complex tasks, including representing text.
How, then, could a computer represent letters of the alphabet using only switches (binary)?
For instance, the binary pattern 01000001 is 65 in decimal form. ASCII was created to represent letters and punctuation in English, with 65 representing A, 66 representing B, and so forth.
This standardized mapping allows computers to interpret binary patterns as letters or symbols depending on context.
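In C, a character is already stored as its numeric code, so the mapping can be inspected directly. The sketch below assumes an ASCII-compatible system and uses a hypothetical helper, byte_to_bits, to spell out the eight bits of a character.

```c
/* Writes the 8-bit pattern of c into out as '0'/'1' characters.
   out must have room for 9 characters (8 bits plus the terminator).
   byte_to_bits is a hypothetical name used for illustration. */
char *byte_to_bits(unsigned char c, char *out)
{
    for (int i = 0; i < 8; i++)
    {
        /* Test the bits from the 128s place down to the 1s place. */
        out[i] = (c & (1 << (7 - i))) ? '1' : '0';
    }
    out[8] = '\0';
    return out;
}
```

On such a system, byte_to_bits('A', buffer) produces 01000001, the pattern for 65, and 'B' produces 01000010, the pattern for 66.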
While ASCII represents English characters effectively, it lacks support for many global languages.
Each ASCII character is stored in 8 bits (1 byte), which allows at most 256 possible values (2^8 combinations); standard ASCII itself defines only 128 of them.
However, this range is insufficient for accented characters, non-Latin scripts such as those in Asian languages, or even emojis.
As computing has evolved, the need for larger character sets has become essential for representing diverse global languages.
Unicode was developed to address the limitations of ASCII, allowing computers to support many thousands of characters across various languages and symbols.
For example, while English characters fit comfortably within a single byte (256 values), languages with much larger character sets, such as Asian scripts, require far more.
In Unicode, additional bits (16, 24, or even 32) are often used, providing an expansive range to include emojis, special characters, and diverse alphabets.
Video: https://youtu.be/nhN8larXM2w
Emojis, which are commonly used in digital communication, may look like images, but they are technically characters represented by patterns of zeros and ones in Unicode.
Just like letters, each emoji has a unique binary pattern that corresponds to a specific decimal code.
For instance, the popular “face with tears of joy” emoji is represented by the decimal 128514.
Written in binary, that number is 11111011000000010.
This allows emojis to appear consistently across devices and platforms, transforming simple text into expressive symbols.
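How that number ends up as bytes depends on the encoding in use. The text does not name one, so the sketch below assumes UTF-8, a common Unicode encoding that spends between one and four bytes per character. The function name utf8_encode is hypothetical.

```c
/* Encodes a Unicode code point as UTF-8 into out (at most 4 bytes).
   Returns the number of bytes written, or 0 if cp is not encodable.
   utf8_encode is a hypothetical name; UTF-8 itself is an assumption here. */
int utf8_encode(unsigned long cp, unsigned char out[4])
{
    if (cp >= 0xD800 && cp <= 0xDFFF)
    {
        return 0; /* surrogate values are not valid characters */
    }
    if (cp < 0x80)
    {
        out[0] = (unsigned char) cp; /* plain ASCII: one byte */
        return 1;
    }
    if (cp < 0x800)
    {
        out[0] = (unsigned char) (0xC0 | (cp >> 6));
        out[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000)
    {
        out[0] = (unsigned char) (0xE0 | (cp >> 12));
        out[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF)
    {
        out[0] = (unsigned char) (0xF0 | (cp >> 18));
        out[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char) (0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

Under that assumption, the code point 128514 becomes the four bytes F0 9F 98 82 (in hexadecimal), while the letter A (65) still fits in a single byte.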
Color representation in computers follows a similar principle: colors, too, must be represented with numbers.
Since we need a standard way to define colors, we use systems like RGB (Red, Green, Blue).
The RGB model allows computers to create a wide range of colors by combining these three components.
For example, RGB (72, 73, 33) might represent a shade with:
72 units of red
73 units of green
33 units of blue
Each color in a pixel is expressed as 24 bits (8 bits each for the red, green, and blue components, so each value ranges from 0 to 255), which means every pixel on the screen is assigned a specific amount of red, green, and blue to create a precise color.
From a distance, this combination looks like a single yellow pixel.
An image is composed of many small pixels, each defined by an RGB value. When viewed at normal resolution, these pixels blend into a smooth image.
However, zooming in reveals each pixel as a distinct colored square, a process known as pixelation. This characteristic is especially noticeable when resizing lower-resolution images, where individual pixels become apparent.
Each pixel's RGB values require 24 bits (or 3 Bytes), and with large images containing thousands of pixels, file sizes quickly increase.
This is why high-resolution images are often measured in kilobytes (KB), megabytes (MB), or even gigabytes (GB).
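The arithmetic behind those file sizes can be sketched as follows. The example assumes an uncompressed image at exactly 3 bytes per pixel; real formats usually compress the data, and the names pixel and raw_image_bytes are hypothetical.

```c
/* One pixel: 8 bits each for red, green, and blue (24 bits in total). */
typedef struct
{
    unsigned char red;
    unsigned char green;
    unsigned char blue;
} pixel;

/* Bytes needed to store an uncompressed image at 3 bytes per pixel.
   raw_image_bytes is a hypothetical name used for illustration. */
unsigned long long raw_image_bytes(unsigned long long width,
                                   unsigned long long height)
{
    return width * height * 3ULL;
}
```

Under these assumptions, a 1920 by 1080 image needs 6,220,800 bytes, roughly 6 MB, before any compression is applied.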
A video is essentially a sequence of images (frames) displayed in rapid succession to create the illusion of movement.
Each frame, just like a still image, consists of pixels with their respective RGB values.
By changing the pixel colors frame-by-frame, a computer can display fluid motion. So, a video file is essentially a series of images, processed and displayed quickly enough to appear as continuous motion.
Computers represent audio using numbers to encode sound frequencies and duration.
For instance, a musical note can be represented by a specific number that denotes its pitch,
and another value can indicate duration.
This way, audio can be broken down into small data units, making it possible to store and play back complex sounds.
By standardizing these values, computers can produce sound files (digital formats for audio) that play consistently across devices.
Different file types, such as JPEG for images, MP3 for audio, or MP4 for video, each have unique encoding methods that organize bits in specific ways to optimize file storage and interpretation.
These formats are essentially agreements among developers on how to structure zeros and ones to ensure that computers can interpret data correctly.
In computer science, algorithms (the black box that solves our problem) provide step-by-step instructions for solving problems.
For computers, algorithms must be both correct and precise to ensure accurate results.
There is no room for ambiguity with machines.
Programming involves translating these algorithms into a language computers understand; we often develop them first in pseudocode, a structured outline in plain language.
In our everyday lives, searching for information, like a contact’s phone number, is a common task.
On digital devices, contacts are typically stored alphabetically.
Similarly, in the past, people relied on phone books, which organized names in alphabetical order.
Now, we’ll explore some algorithms by examining three different approaches to searching for a name in a phone book.
Suppose we want to look up a name in the phone book.
If I’m searching for “David,” I could start at the first page and move through each one until I find the name.
The pseudocode for this algorithm is simple: look at the current page; if the name is there, stop; otherwise turn to the next page and repeat until the book runs out.
This linear search is correct because it eventually finds the name, but it’s also slow and inefficient: if “David” is toward the end of the phone book, this method takes a long time.
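The page-by-page approach might be sketched in C like this. The phone book is modeled as an array of names, one per page, which is a simplification assumed for the example; linear_search is a hypothetical name.

```c
#include <string.h>

/* Checks every page in order, starting from the first.
   Returns the index of target in names[0..n-1], or -1 if it is absent.
   linear_search is a hypothetical name used for illustration. */
int linear_search(const char *names[], int n, const char *target)
{
    for (int page = 0; page < n; page++)
    {
        if (strcmp(names[page], target) == 0)
        {
            return page; /* found the name on this page */
        }
    }
    return -1; /* ran out of pages without finding it */
}
```

In the worst case the loop visits all n pages, which is exactly the cost the text describes.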
To make the search faster, we could increase our steps and check every second page instead of every single one.
The pseudocode is the same as before, except that each step turns two pages instead of one.
But this skipping strategy, while faster, introduces the risk of missing the name if it happens to fall on a skipped page.
To address this, we can add a backup step: if we land on a page that is alphabetically past the name, we go back one page and check it.
This makes the algorithm more efficient without losing accuracy, though it requires a little more effort to account for missed pages.
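A sketch of the two-pages-at-a-time strategy with its backup step is shown below. It assumes the names are sorted in the order strcmp uses (plain byte order), with one name per page; skip_search is a hypothetical name.

```c
#include <string.h>

/* Checks every second page of a sorted book, stepping back one page
   when it overshoots. Returns the index of target, or -1 if absent.
   skip_search is a hypothetical name used for illustration. */
int skip_search(const char *names[], int n, const char *target)
{
    for (int page = 0; page < n; page += 2)
    {
        int cmp = strcmp(names[page], target);
        if (cmp == 0)
        {
            return page;
        }
        if (cmp > 0)
        {
            /* Overshot: the only place left is the page we skipped. */
            if (page > 0 && strcmp(names[page - 1], target) == 0)
            {
                return page - 1;
            }
            return -1;
        }
    }
    /* Ran off the end: with an even page count the last page was skipped. */
    if (n > 0 && n % 2 == 0 && strcmp(names[n - 1], target) == 0)
    {
        return n - 1;
    }
    return -1;
}
```

The loop does about half as many comparisons as the page-by-page version, but the work still grows in direct proportion to the size of the book.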
However, this approach is still linear and relatively slow for large phone books.
A much more efficient way to search is the binary search algorithm, which divides the search space in half with each step.
The pseudocode for this algorithm would look like this:
Open to the middle of the book and look at the page.
If it’s the correct name, you’re done.
If the target name is alphabetically earlier, open to the middle of the left half of the book.
If the target name is alphabetically later, open to the middle of the right half.
Each time we repeat this process, we’re cutting the remaining search space in half.
In other words, we are making the problem smaller and easier with each step.
This “divide and conquer” method is highly efficient: with a 1,000-page book, binary search requires at most about 10 steps to find the target, no matter where it appears.
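The halving idea translates into C roughly as follows. As with any binary search, the sketch assumes the names are already sorted (here in strcmp order), with one name per page; binary_search is a hypothetical name.

```c
#include <string.h>

/* Repeatedly opens to the middle of the remaining pages and discards
   the half that cannot contain target. Returns its index, or -1.
   binary_search is a hypothetical name used for illustration. */
int binary_search(const char *names[], int n, const char *target)
{
    int low = 0;
    int high = n - 1;
    while (low <= high)
    {
        int middle = low + (high - low) / 2;
        int cmp = strcmp(target, names[middle]);
        if (cmp == 0)
        {
            return middle;     /* correct name: done */
        }
        else if (cmp < 0)
        {
            high = middle - 1; /* earlier: keep the left half */
        }
        else
        {
            low = middle + 1;  /* later: keep the right half */
        }
    }
    return -1; /* no pages left: the name is not in the book */
}
```

Each pass through the loop throws away half of the remaining pages, so even a very large book is exhausted after a small number of passes.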
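To understand the efficiency of these approaches, we can visualize them on a graph of the time to solve against the size of the problem: the first two algorithms appear as straight lines, the second rising half as steeply as the first, while binary search appears as a curve that flattens out as the book grows.
In computer science, we express these efficiencies mathematically: for a book of n pages, the first algorithm takes about n steps, the second about n/2 steps, and binary search about log2(n) steps.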
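To put rough numbers on the comparison, the sketch below counts worst-case steps for a book of n pages under each strategy: about n for page-by-page, about n/2 for two pages at a time (ignoring the single backup check), and the number of halvings for binary search. The function names are hypothetical.

```c
/* Worst-case page checks when reading every page. */
unsigned int steps_linear(unsigned int n)
{
    return n;
}

/* Approximate worst case when checking every second page
   (the one extra backup check is ignored). */
unsigned int steps_two_at_a_time(unsigned int n)
{
    return n / 2;
}

/* How many times n pages can be halved before none remain. */
unsigned int steps_binary(unsigned int n)
{
    unsigned int steps = 0;
    while (n > 0)
    {
        n /= 2;
        steps++;
    }
    return steps;
}
```

For a 1,000-page book these give 1,000, 500, and 10 steps respectively, which is why the third curve looks almost flat next to the other two.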
The power of binary search illustrates an essential computer science principle: algorithms should be both correct and efficient. With careful design, we create solutions that not only solve problems accurately but also use resources wisely.
We will go deeper into these topics in this lecture.
So let me propose that what I really just did verbally (the binary search) can be translated into pseudocode, which is like an algorithm implemented in English, or whatever your spoken or written language is.
But the key is that it's got to be correct, and ideally it had better be precise so that there's no ambiguity.
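We can then write this:
1. Pick up the phone book
2. Open to the middle of the phone book
3. Look at the page
4. If the name is on the page
5.     Call the person
6. Else if the name is earlier in the book
7.     Open to the middle of the left half of the book
8.     Go back to line 3
9. Else if the name is later in the book
10.     Open to the middle of the right half of the book
11.     Go back to line 3
12. Else
13.     Quit
Read it and try to understand how the problem is solved step by step.
Note: notice the final Else followed by Quit. It means that if the name does not exist in the phone book, we stop searching; without it, our logic could crash.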
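In the phone book example, we see several programming concepts at play: functions (actions such as “open” and “look”), conditionals (“if”, “else if”, “else”), Boolean expressions (questions with a yes or no answer, such as whether the name is on the page), and loops (going back to repeat a step).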
These foundational concepts help in designing precise and efficient algorithms, reducing the need for redundancy by enabling a loop to handle repeated actions.
We have many topics to cover in this programming course, and we will.
Once the algorithm is clearly laid out in pseudocode, we move to actual programming.
In CS50, we start with Scratch, a graphical programming language that visually represents core concepts.
After Scratch, we progress to a text-based language known as C.
Though C is powerful, its syntax can seem complex, with elements like #include, <>, {}, ;, and quotes.
A simple C program that prints “hello, world” may look complicated at first, but mastering syntax will enable us to command the computer with precision.
Despite its initial complexity, C serves as an ideal foundation for learning how computers interpret algorithms in code form.
In Scratch, when the code starts, it first runs at the yellow block.
In C, we have the main function, where our code starts running.
We can also define functions of our own: for example, a custom function meow() can handle the printf call.
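A custom function of this kind might look like the sketch below. Only meow() and printf come from the text; the second function, meow_times, is a hypothetical addition that shows a loop reusing the first one.

```c
#include <stdio.h>

/* A custom function that hides the printf call behind a friendlier name. */
void meow(void)
{
    printf("meow\n");
}

/* Calls meow() the requested number of times and returns how many
   calls were made. meow_times is a hypothetical name for this sketch. */
int meow_times(int times)
{
    int count = 0;
    for (int i = 0; i < times; i++)
    {
        meow();
        count++;
    }
    return count;
}
```

A main function would then only need a single line such as meow_times(3); to print meow three times, keeping the details of printing out of sight.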