W4 - 👨‍🏫 Lecture - Understanding Hexadecimal Memory Addresses, Pointers, and Strings in C

Video: https://youtu.be/F9-yqoS7b8w

#CS50x #C #Computer_Science



A) Intro - Removing Training Wheels: The CS50 Library

Welcome to week four of CS50!

For the past few weeks, we've been using the CS50 library as a form of "training wheels" for programming in C.


Read more at https://manual.cs50.io/#cs50.h


This library simplifies tasks by providing functions like get_string and get_int, and you’ve been using it by including cs50.h at the top of your programs.

Additionally, compiling directly with clang required linking the library yourself with -lcs50, but make has automated all of this for you.

Today, we’ll start taking those training wheels off, diving into the mechanics of computers and memory.

While C may seem complicated at first, there are only a few key concepts you need to understand to unlock far more sophisticated and exciting problems.

Let’s begin by "relearning how to count."


B) Counting in Hexadecimal

Think about your computer’s memory as a grid of bytes, each with its own address.

While humans typically count in decimal (base-10) and computers store everything in [[binary]] (base-2), computer scientists often use [[hexadecimal]] (base-16).

Hexadecimal allows us to represent values using 16 symbols: 0-9 followed by A-F.

For example, counting from 0 to 15 in hexadecimal goes 0, 1, 2, ..., 9, A, B, C, D, E, F.

Hexadecimal simplifies memory addressing because each digit represents four bits (a "nibble").

This means you can efficiently represent a byte (eight bits) with just two hexadecimal digits.

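For example, the byte 11111111 in binary is 255 in decimal and FF in hexadecimal.

Our grid of memory addresses can now be labeled in hexadecimal: 0x0, 0x1, ..., 0x9, 0xA, ..., 0xF, 0x10, and so on.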

Note: By convention, hexadecimal numbers are often prefixed with 0x to distinguish them from decimal numbers. For example, 0xFF clearly indicates a hexadecimal value.


B.1) Example: Hexadecimal in RGB Colors

If you've worked with graphics software or web design, you've likely encountered hexadecimal in color codes.

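For instance, #FF0000 is pure red, #00FF00 is pure green, and #0000FF is pure blue: each pair of hexadecimal digits is one byte of intensity.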

This system allows programmers and artists to represent colors efficiently using hexadecimal values for red, green, and blue intensities.


C) Computer Memory Addresses Explained

C.1) Exploring Memory with Code: using the "address-of" operator (&)

Let’s see how these concepts apply to your computer's memory.


Consider this simple C program:

#include <stdio.h>

int main(void) {
    int n = 50;
    printf("%i\n", n);
}

This code declares an integer variable n with a value of 50 and prints it.

When executed, this value is stored in a specific memory location.

To see where n resides in memory, we can modify the program to print its address:

#include <stdio.h>

int main(void) {
    int n = 50;
    printf("%p\n", &n);  // Print the address of n
}

Here, & is the [[address-of operator]], which tells C to retrieve the memory address of n.

When compiled and run, this program outputs something like 0x7ffd80792f7c.

This hexadecimal value is the memory address where n is stored.


C.2) Introducing Pointers - a variable that stores memory addresses ( * )

A [[pointer]] is a variable that stores the address of another variable.

In C, pointers are declared using the * symbol, like this:

int *p = &n;  // Pointer p stores the address of n

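Here’s what’s happening: int * declares p as a pointer to an int, and &n supplies the address of n, which p then stores.

We can even see the memory address again by printing the [[pointer]] itself with %p.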

To access the value stored at the address p points to, use the dereference operator (*):

printf("%i\n", *p);  // Prints the value at the address stored in p

For example,


C.3) Visualizing Pointers in the Memory Grid

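Imagine your computer's memory as a grid: n occupies a few bytes at some address, and p occupies its own bytes elsewhere, holding that address as its value.

Instead of focusing on raw addresses, we often abstract this relationship as p pointing to n.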

This abstraction simplifies thinking about pointers without worrying about specific addresses.


D) Strings Are Pointers

Strings in C are [[arrays]] of [[characters]] stored in contiguous memory, ending with a special \0 ([[null terminator (NUL)]]) to indicate the end.

For example, the string "HI!" is stored as an [[array]] of four characters: 'H', 'I', '!', and \0.

This is equivalent to having four [[bytes]] at contiguous [[memory addresses]], for example 0x123, 0x124, 0x125, and 0x126.

When you declare a string like this:

string s = "HI!";

Internally, s is just a [[pointer]] to the first character ('H') at address 0x123.

This means that the variable s holds the address 0x123,

and you can traverse the rest of the string by accessing subsequent memory locations.

For example:

All we need to be careful about is finding the [[null terminator (NUL)]], which tells us that the string has ended.


D.1) Example: Removing the Training Wheels and Creating Strings as Pointers

Until now, the CS50 library has abstracted [[strings]] for you by introducing the string data type.

However, in raw C, there’s no string type—it’s actually a char * (a pointer to a character).

For example:

char *s = "HI!";

The CS50 library uses typedef (typedef char *string;) to create string as a synonym for char *.

This simplification makes it easier to work with strings early in the course but isn’t necessary as you progress and understand pointers more deeply.

Key Takeaways:

  • Hexadecimal is a base-16 system used for memory addressing and color representation.
  • Pointers store addresses of variables and allow you to directly manipulate memory.
  • Strings in C are pointers to the first character of a null-terminated array of characters.

Understanding these building blocks will allow you to write more sophisticated programs and manipulate data at a lower level, giving you greater control over how your code interacts with the machine.


D.2) Using Pointer Arithmetic with Strings

Recap: Strings in C are technically [[pointers]] to the first character of a null-terminated sequence of characters.

For example, "HI!" is stored contiguously in memory, and the variable s holds the address of the first character ('H').

Using *s dereferences the pointer, allowing you to access the value ('H') stored at the address.

Pointer arithmetic lets you move through memory: s + 1 is the address of the second character, s + 2 the address of the third, and so on.

This mechanism underlies the common square bracket notation: s[1] is syntactic sugar for *(s + 1).

For example, you can traverse the string by directly manipulating the pointer:

char *s = "HI!";
printf("%c\n", *s);       // Outputs: H
printf("%c\n", *(s + 1)); // Outputs: I
printf("%c\n", *(s + 2)); // Outputs: !

This behavior highlights that strings in C are addresses, and the square bracket notation is simply a user-friendly abstraction.

Danger! - Segmentation Fault

  • Reading *(s + 3) is still safe: it yields the null terminator (\0). Accessing memory beyond the end of the array (e.g., *(s + 50000)) is undefined behavior and may cause a segmentation fault.
  • A segmentation fault occurs when you attempt to access memory outside the bounds allocated to your program.


D.2.1) Note: Comparing Strings (understanding the strcmp function)

Comparing strings with == doesn't check their content; it compares their memory addresses.

Two strings obtained separately (for example, from two calls to get_string) are stored at different addresses, so == reports them as different even when their characters match.

To compare the actual characters, we could write a loop that compares them one by one, but it is easier to #include <string.h> and use the strcmp function:

#include <string.h>

if (strcmp(s, t) == 0) {
    printf("Strings are the same\n");
} else {
    printf("Strings are different\n");
}

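strcmp compares strings lexicographically: it returns 0 if the strings are equal, a negative value if s comes before t, and a positive value if s comes after t.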


D.3) Copying Strings using malloc

Using = to assign one string to another copies the address, not the contents.

To copy the contents, we need the function malloc, which is declared in stdlib.h.

malloc allows us to allocate memory for the new string. Then, with a for loop, we copy each character of s into t (including the \0) and capitalize the first letter of t.

Note: to prevent bugs when the input has no characters, check the length with an if before capitalizing.

So, in general, to copy a string:

  1. Allocate memory for the new string using malloc.

  2. Check that you don't get a [[null pointer (NULL)]]

  3. Copy the characters, including the [[null terminator (NUL)]] (\0), using a loop or strcpy:

    char *t = malloc(strlen(s) + 1); // Allocate memory for the new string
    if (t == NULL)  // malloc returns NULL when the allocation fails
    {
        printf("Memory allocation failed\n");
        return 1;
    }
    strcpy(t, s); // Copy the string
    

    This creates an independent copy of the string in memory.

String Copy with strcpy

strcpy simplifies string copying:

char *t = malloc(strlen(s) + 1);
if (t != NULL) {
  strcpy(t, s);
}

strcpy automatically handles the loop and null terminator for you.

The Role of NULL

  • NULL (all caps) is a special pointer value that represents "no address."
  • It’s different from \0 (the null character) used to terminate strings.
  • Always check the result of malloc:
 if (t == NULL) {
     printf("Memory allocation failed\n");
     return 1;
 }

Summary: Memory Allocation with malloc

malloc dynamically allocates memory:

  • Syntax: void *malloc(size_t size)
  • size_t size: Number of bytes to allocate.
  • Returns the address of the allocated memory or NULL if the allocation fails.
  • Always add 1 to account for the null terminator when copying strings.

Example:

char *s = "HI!";
char *t = malloc(strlen(s) + 1);
if (t != NULL) {
  strcpy(t, s);
}

D.3.1) Releasing Memory with free after using malloc

When using malloc, you must release the memory with free to avoid memory leaks.

Failure to call free can cause your program to consume increasing amounts of memory over time, leading to crashes or performance issues.

Reminder!

Every time you use malloc, you must call free on that memory once you are done with it!


D.3.2) Note: CS50's get_string (it uses malloc)

Key Takeaways

  1. Strings are pointers to the first character in a null-terminated sequence.
  2. Use strcmp to compare strings by content, not addresses.
  3. Use malloc to allocate memory for string copies and free to release it.
  4. Understand the difference between \0 (null character) and NULL (null pointer).
  5. Avoid segmentation faults by accessing memory only within bounds, and avoid memory leaks by freeing what you allocate.

By mastering pointers, memory allocation, and string operations, you gain control over low-level memory management, enabling more efficient and powerful C programming.


E.1) What is Valgrind?

[[Valgrind]] is a powerful tool for detecting memory-related bugs in C programs.

It helps identify issues such as:

  1. Invalid memory accesses: Reading or writing memory that hasn't been allocated or is out of bounds.
  2. Memory leaks: Forgetting to free allocated memory, leading to gradual depletion of available memory.

To run Valgrind:

valgrind ./program_name

It outputs detailed information about memory usage, errors, and leaks.

While the output can seem overwhelming, focus on:

  1. Invalid reads/writes: Indicates out-of-bounds memory access.
  2. Memory leaks: Reports memory that was allocated but not freed.

E.1.1) Example: Troubleshooting with Valgrind

Code with Mistakes:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *s = malloc(3); // Mistake: Only 3 bytes allocated for 4 needed
    s[0] = 'H';
    s[1] = 'I';
    s[2] = '!'; 
    s[3] = '\0'; // Writing out of bounds
    printf("%s\n", s);
    // Missing free(s); // Memory is not freed
    return 0;
}

Errors in the Code:

  1. Allocating only 3 bytes when 4 are needed.
  2. Writing to the 4th byte without allocating enough memory.
  3. Not freeing memory allocated with malloc.

Notice the error is not obvious, since the program compiles and prints the expected output.

Run Valgrind (note: you can use help50 with [[valgrind]] to get some help).

The output looks overwhelming at first.

Pay attention to these lines:

==12345== Invalid write of size 1
==12345==    at 0x4005FD: main (memory.c:10)
==12345==  Address 0x5203040 is 0 bytes after a block of size 3 alloc'd
==12345== Invalid read of size 1
==12345==    at 0x40063B: main (memory.c:11)
==12345==  Address 0x5203043 is 0 bytes after a block of size 3 alloc'd
==12345== LEAK SUMMARY:
==12345==    definitely lost: 3 bytes in 1 blocks

Key Insights - Valgrind Report:

  1. Invalid Write: Writing to the 4th byte (s[3]) is not allowed because only 3 bytes were allocated.
  2. Invalid Read: printf reads the string up to its terminator, so it also reads the out-of-bounds byte at s[3].
  3. Memory Leak: malloc allocated 3 bytes, but free was not called.

Fixing the Issues:

  1. Allocate Enough Memory:

    • Correctly account for the null terminator (\0):

      char *s = malloc(4);
      
  2. Free Allocated Memory:

    • Call free(s) when the memory is no longer needed:

      free(s);
      

Corrected Code:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *s = malloc(4); // Allocate 4 bytes for "HI!\0"
    if (s == NULL) {
        printf("Memory allocation failed\n");
        return 1;
    }
    s[0] = 'H';
    s[1] = 'I';
    s[2] = '!';
    s[3] = '\0'; // Properly terminate the string
    printf("%s\n", s);
    free(s); // Free allocated memory
    return 0;
}

Valgrind confirms the fix: it no longer reports invalid writes, invalid reads, or leaked bytes.


E.1.2) Example: Proper Pointer Initialization - use sizeof() with malloc() and avoid Garbage Values

The program in this example has a flaw caused by improper pointer usage, specifically involving the variable y.

Code Walkthrough:

  1. Pointer Declarations:

    • int *x; and int *y; are declared, which are pointers to integers.
  2. Memory Allocation:

    • x = malloc(sizeof(int));
      • Dynamic memory is allocated for x, enough to store an int. At this point, x is a valid pointer.
  3. Assigning Values via Dereferencing:

    • *x = 42;
      • The value 42 is stored in the memory location allocated to x.
  4. Undefined Behavior - Dereferencing an Uninitialized Pointer (y):

    • *y = 13;
      • y has not been initialized or allocated memory. Attempting to dereference it (i.e., *y = 13) leads to undefined behavior. This could cause:
        • A segmentation fault.
        • Corruption of memory or crashing the program.

Garbage Values and Uninitialized Pointers

Memory Reuse and Garbage Values

How to Fix It: Properly allocate memory for y before dereferencing it.

Code:

#include <stdlib.h>

int main(void)
{
    int *x;
    int *y;

    x = malloc(sizeof(int));
    if (x == NULL) return 1; // Handle malloc failure

    y = malloc(sizeof(int)); // Allocate memory for y
    if (y == NULL)
    {
        free(x);  // Don't leak x if the second allocation fails
        return 1;
    }

    *x = 42;
    *y = 13;

    free(x); // Free memory to avoid leaks
    free(y);

    return 0;
}

Memory Management: Always free memory allocated with malloc to avoid memory leaks:

    free(x);  // Releases memory allocated to x.
    x = NULL; // Prevents dangling pointer issues.

> [!summary] Note: Takeaways from Binky
> - **Pointers need pointees**: A pointer must point to a valid memory location before being used.
> - **Pointers must point to valid memory before dereferencing**.
> - Using an uninitialized pointer (like `int *y`) leads to undefined behavior.
> - Allocate memory before dereferencing pointers:
>
>   ```c
>   int *x = malloc(sizeof(int)); // Allocate memory for an integer
>   *x = 42;                      // Assign a value
>   free(x);                      // Free memory
>   ```
>
> By understanding and applying these practices, you can effectively manage memory in C, reducing errors and improving program stability.

- - - -
#### E.1.3) Summary: Key Concepts and Best Practices for Using Pointers

1. **`malloc` and `free`:**
    - `malloc` allocates memory, and you must pair it with `free` to release the memory.
    - Example:
        ```c
        char *ptr = malloc(10); // Allocate 10 bytes
        free(ptr);              // Release memory
        ```


2. **Invalid Reads/Writes:**
    - Writing outside the bounds of allocated memory is an invalid write.
    - Reading unallocated memory is an invalid read.


3. **Memory Leaks:**
    - Forgetting to call `free` leads to memory leaks, where the program holds onto memory it no longer uses.

**Best Practices for Using Pointers and Memory:**

1. **Always check `malloc` return value**:
    ```c
    char *ptr = malloc(size);
    if (ptr == NULL) {
        // Handle memory allocation failure
    }
    ```
    
2. **Initialize pointers**:
    - Avoid garbage values by setting pointers to `NULL` or valid memory.
    - Example:
        
        ```c
        int *ptr = NULL; // Initialize pointer
        ```
        
3. **Avoid dangling pointers**:
    - After freeing memory, set the pointer to `NULL`:
        ```c
        free(ptr);
        ptr = NULL;
        ```
        
4. **Use tools like Valgrind**:
    - Regularly run Valgrind to check for memory leaks or invalid accesses:
        
        ```bash
        valgrind ./program_name
        ```
        


- - -
#### E.1.4) Example: Uninitialized Variables and Garbage Values with Arrays

- **Issue**: Using uninitialized variables (e.g., an array) leads to unpredictable values, known as _garbage values_.

**Example**: printing garbage values
```c
int scores[3];
for (int i = 0; i < 3; i++) {
    printf("%d\n", scores[i]); // Prints garbage values
}
```

![](https://i.imgur.com/sZeRmrI.png)

- **Reason**: Memory is not automatically cleared when allocated. It retains whatever was previously stored there.

**Fix**: Always initialize variables before use:
```c
int scores[3] = {0}; // All elements set to 0
```


F) Example: Swapping Values with Temporary Variables and Pointers

Example Using Pointers: pass addresses to the function, not just values

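#include <stdio.h>

void swap(int *a, int *b) {
    int temp = *a; // Remember the value a points to
    *a = *b;
    *b = temp;
}

int main(void) {
    int x = 1, y = 2;
    swap(&x, &y); // Pass addresses so swap can change x and y themselves
    printf("x: %d, y: %d\n", x, y); // Outputs: x: 2, y: 1
}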

G) Memory Layout: understanding Heap and Stack


G.1) Recursive Functions and Stack Overflows

Example:

void draw(int height) {
    if (height <= 0) return; // Base case; without it the recursion never stops and overflows the stack
    draw(height - 1); // Recursive call: draw the shorter pyramid first
    for (int i = 0; i < height; i++) {
        printf("#");
    }
    printf("\n");
}

G.2) Buffer Overflows

Example:

char buffer[5];
strcpy(buffer, "Hello, world!"); // Overflows buffer

G.3) Key Takeaways for Robust Memory Management

  1. Always initialize variables: Avoid garbage values.
  2. Manage memory explicitly: Free memory allocated with malloc.
  3. Use pointers carefully: Ensure they are initialized before dereferencing.
  4. Test thoroughly: Use tools like Valgrind to catch hidden bugs.
  5. Design cautiously:
    • Limit recursion depth to avoid stack overflow.
    • Validate inputs to prevent buffer overflows.
  6. Debug systematically:
    • Use printf for quick checks.
    • Leverage debuggers and tools like Valgrind for deeper issues.

By applying these principles, you can write safer, more efficient C programs while navigating the complexities of memory management.


H) Transitioning from Training Wheels (Advanced C Concepts)


H.1) Dangers of scanf

Example:

int x;
printf("x: ");
scanf("%d", &x);
printf("x: %d\n", x);

H.2) Handling Strings with scanf

Example:

char s[50]; // Allocates 50 bytes - USES Stack Memory
printf("Name: ");
scanf("%49s", s); // Reads at most 49 characters

Alternative: Dynamic memory allocation:

char *s = malloc(100); // Allocates memory dynamically - USES Heap Memory
if (s != NULL) {
    scanf("%99s", s);  // Reads at most 99 characters, leaving room for \0
    free(s);           // Free memory after use
}

I) File Manipulation with C

I.1) File Input/Output (Write and Read from Files) - FILE Data Type

Example: Writing to a File:

FILE *file = fopen("phonebook.csv", "a");
if (file == NULL) {
    printf("Error opening file.\n");
    return 1;
}
fprintf(file, "%s,%s\n", "David", "123-456-7890");
fclose(file);

Example: Reading from a File:

FILE *file = fopen("phonebook.csv", "r");
if (file == NULL) {
    printf("Error opening file.\n");
    return 1;
}
char name[50], phone[15];
while (fscanf(file, "%49[^,],%14s\n", name, phone) == 2) { // Stop unless both fields were read
    printf("Name: %s, Phone: %s\n", name, phone);
}
fclose(file);

I.2) Working with Images

I.2.1) File Formats and Byte-Level Manipulation (Identify JPEG Files)

Reading File Headers:

unsigned char buffer[3];
// Only inspect the bytes if all three were actually read
if (fread(buffer, sizeof(buffer), 1, file) == 1 &&
    buffer[0] == 0xFF && buffer[1] == 0xD8 && buffer[2] == 0xFF) {
    printf("This is a JPEG file.\n");
}

I.2.2) Example: Copying Files (cp Implementation)

Example:

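FILE *src = fopen("source.jpg", "rb"); // Binary mode: the data is not text
if (src == NULL) {
    return 1;
}
FILE *dest = fopen("copy.jpg", "wb");
if (dest == NULL) {
    fclose(src);
    return 1;
}
unsigned char buffer[512];
size_t bytesRead;
while ((bytesRead = fread(buffer, 1, sizeof(buffer), src)) > 0) {
    fwrite(buffer, 1, bytesRead, dest); // Write only the bytes actually read
}
fclose(src);
fclose(dest);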

I.2.3) Example: Manipulating Image Files

Example: Grayscale Filter:

for (int i = 0; i < height; i++) {
    for (int j = 0; j < width; j++) {
        RGBTRIPLE pixel = image[i][j];
        int grayscale = (pixel.rgbtRed + pixel.rgbtGreen + pixel.rgbtBlue) / 3;
        pixel.rgbtRed = pixel.rgbtGreen = pixel.rgbtBlue = grayscale;
        image[i][j] = pixel;
    }
}

I.3) Summary: Low Level C and File Manipulation

By mastering these topics, you gain a deeper understanding of how programs interact with memory, files, and data at the byte level. These skills are essential for writing efficient, low-level code.

