Advent of Code 2024 [Days 10 and 11]

Advent of Code 2024 [Days 10 and 11]#

Welcome back to my C++ learning journey through Advent of Code 2024! Continuing from Day 9, this part covers my solutions and reflections for Days 10 and 11. The goal remains the same: using these challenges to get better at C++.

This is my attempt to document my learning journey through Advent of Code 2024. Instead of cluttering with each post for each day’s challenge, I’ll be grouping my progress into consolidated posts. This page is the fourth part of my progress log covering Days 10 and 11.

Full solutions are available on GitHub: ABD-01/AoC2024.

Day 10: Hoof It - Finding Hiking Trails on a Topographic Map#

Solution Overview#

Given a topographic map where each position’s height ranges from 0 (lowest) to 9 (highest), the goal is to find a valid hiking trail. A valid trail is one where the height increases gradually (i.e., by 1 unit at each step) from a trailhead (0-height) to a 9-height position.

TopographicalMap.png

Part 1 : For each trailhead, find number of distinct 9-height position that can be reached. Perform a Depth First Search (DFS) from each trailhead. When reaching a 9-height position, increment the result count. To ensure distinct target counting, keep track of visited positions.

Part 2: For each trailhead, find the number of distinct trails leading to 9-height position. Same as before, perform a DFS. However, this time, multiple trails can lead to same target (i.e. 9-height), hence will not stop exploration even if position is already visited.

 76void DFS(const std::vector<std::vector<int>>& map, const std::pair<int, int>& idx, std::unordered_set<int>& visited, int& result)
 77{
 78    static std::pair<int, int> dirs[4] = {
 79        {-1, 0}, {0, 1}, {1, 0}, {0, -1}
 80    };
 81
 82    int curr = map[idx.first][idx.second];
 83
 84#if PART_2
 85    if(visited.find((idx.first*map[0].size() + idx.second)) != visited.end()) return; 
 86    visited.insert((idx.first*map[0].size() + idx.second));
 87#endif
 88
 89    if(curr == 9)
 90    {
 91        cout << "Found 9-height position" << endl;
 92        ++result;
 93        return;
 94    }
 95
 96    for(const auto& dir: dirs)
 97    {
 98        int i = idx.first+dir.first, j = idx.second+dir.second;
 99        if(i<0||i>map.size()-1||j<0||j>map[0].size()-1)
100        {
101            continue;
102        }
103        int temp = map[i][j];
104        if(temp-curr != 1)
105        {
106            continue;
107        }
108        DFS(map, {i,j}, visited, result);
109    }
110}
111
112int main(int argc, char* argv[])
113{
114    while(!toVisit.empty())
115    {
116        int result = 0;
117        std::unordered_set<int> visited = {};
118        DFS(map, toVisit.top(), visited, result);
119        toVisit.pop();
120        r += result;
121    }
122    cout << "Part " <<  (PART_2 ? 2 : 1) <<  ": " << r << endl;
123}

Optimizations#

Instead of using unordered_set<int> data structure for keeping record of visited, tried using different data structures.

Performance on Day 10 Part 2#

Data Structure

Time (µs)

vector<vector<bool>>

1044

set<int>

402

unordered_set<int>

314

vector<char>

134

vector<bool>

113

  1. vector<vector<bool>> performs poorly due to:

    • It has double indirection overhead.

    • Memory layout is non-contiguous, this means elements of different rows are not stored contiguously, leading to inefficient memory access.

  2. set<int> is relatively slower:

    • Uses balanced tree (typically Red-Black Tree) with \(O(\log N)\) insertion and lookups.

  3. unordered_set<int>:

    • hashing overhead

    • amortized insertion time is \(O(1)\), even though worst-case insertion (rehashing) is \(O(n)\).

    • unordered_set<int> is useful when grid is extremely sparse, meaning only a few cells need to be marked as visited. However, in dense grids, it adds hashing overhead.

  4. vector<char> and vector<bool> are fastest:

    • both use contiguous in memory, which minimizes cache misses and improves performance

    • vector<bool> packs multiple boolean values (up to 8) into a single byte.

39template<typename VisitedType>
40void DFS(const std::vector<std::vector<int>>& map, const std::pair<int, int>& idx, VisitedType& visited, int& result);
41
42template <typename VisitedType>
43bool isVisitedAndMark(VisitedType& visited, const std::pair<int, int>& idx, int width);
44
45// Specialization for std::unordered_set<int>
46template<>
47bool isVisitedAndMark<std::unordered_set<int>>(std::unordered_set<int>& visited, const std::pair<int, int>& idx, int width) {
48    int pos = idx.first * width + idx.second;
49    if (visited.find(pos) != visited.end()) return true;
50    visited.insert(pos);
51    return false;
52}
53
54// Specialization for std::vector<std::vector<bool>>
55template<>
56bool isVisitedAndMark(std::vector<std::vector<bool>>& visited, const std::pair<int, int>& idx, int) {
57    if (visited[idx.first][idx.second]) return true;
58    visited[idx.first][idx.second] = true;
59    return false;
60}

Concepts Learned#

vector<bool>#

std::vector<bool> is bit-packed, meaning it stores multiple bool values in a single byte, reducing memory footprint.

std::vector<bool> is a possibly space-efficient specialization of std::vector for the type bool.

Source: gcc/libstdc++-v3/include/bits/stl_bvector.h

Best for very large boolean datasets where memory efficiency is critical or need a dynamically resizable bit-array but don’t require direct bit manipulation.

Pros

Cons

Space-efficient (bit-packed), dynamic resizing.

no direct memory access.

Contiguous memory allows for efficient cache utilization.

performance issues due to bit manipulations. (Slower per-element modification)

1std::vector<bool> vec = {false, false, true, false};
2// std::vector<bool> does not store actual bools
3for(auto& i: vec) // compilation error: cannot bind `bool&` to `std::vector<bool>::reference
4{
5	std::cout << i << " "; 
6}
Alternatives#
  • std::bitset<N>: Faster but requires compile-time size.

  • std::vector<char>: Uses 1 byte per value, avoids bit-packing overhead, direct memory access.

Return Value Optimization#

# TODO

Example
1std::vector<std::pair<int, int>> my_function() {
2    std::vector<std::pair<int, int>> local_variable;
3    return std::move(local_variable);
4}
5// warning: moving a local object in a return statement prevents copy elision [-Wpessimizing-move]

Template Specialization#

In C++, function template specialization allows you to define custom implementations of a function template for specific types. This is useful when the default behavior is inefficient or incorrect for certain data structures.

1template <typename T>
2bool isVisitedAndMark(T& visited, const std::pair<int, int>& idx, int width);
3
4template <>  // explicit specialization for T = std::unordered_set<int>
5bool isVisitedAndMark<std::unordered_set<int>>(std::unordered_set<int>& visited,...) {
6    // Implementation for std::unordered_set<int>
7}

When specializing a function template, its template arguments can be omitted if template argument deduction can provide them from the function arguments:

—Explicit specializations of function templates - cppreference.com

std::find#

Example
 1#include <iostream>
 2#include <vector>
 3#include <algorithm>
 4
 5int main() {
 6    std::vector<int> nums = {10, 20, 30, 40, 50};
 7    std::vector<int>::iterator it = std::find(nums.begin(), nums.end(), 30);
 8
 9    if (it != nums.end())
10        std::cout << "Found at index: " << std::distance(nums.begin(), it) << std::endl;
11    else
12        std::cout << "Not found." << std::endl;
13}

Source: gcc/libstdc++-v3/include/bits/stl_algo.h

Stack in CPP#

# TODO

References and Resources#


Day 11: Plutonian Pebbles#

Solution Overview#

Given a list of pebbles, each with a numerical value. After each blink, the number of pebbles and their values change according to specific rules. The goal is to find the total number of pebbles after a set number of blinks.

The conditions are: $\( f(n+1) = \left\{ \begin{array}{ll} 1 & \text{if } n = 0 \\ \text{HIGH}(n), \text{LOW}(n) & \text{if numDigits(\)n\() is even} \\ n \times 2024 & \text{else} \end{array} \right. \)$

where \(\text{HIGH}(n)\) refers to the first half of digits., \(\text{LOW}(n)\) refers to the second half of digits.

  • Example: If \(n = 123456 ,\ \text{HIGH}(n) = 123, \ \text{LOW}(n) = 456\)

My solution was to use a recursive function where each call represents a single blink. It has a depth parameter that terminates the recursion when the desired number of blinks is achieved.

Part 1: 25 Blinks Part 2: 75 Blinks

134void blink(ull value, ull& result, int numBlinks,  int depth)
135{
136    if(depth > numBlinks - 1) // depth start with 0
137        return;
138    
139    if(value == 0) 
140    {
141        value = 1;
142        return blink(value, result, numBlinks, depth+1);
143    }
144    int n = numDigits(value);
145    if (!(n%2)) // n is even
146    {
147        ull splitValue = 0;
148        int lastDigit = 0;
149        int base10 = 1;
150        for(auto i = 0; i < n/2; ++i)
151        {
152            lastDigit = value % 10;
153            value /= 10;
154            splitValue += (lastDigit * base10);
155            base10 *= 10;
156        }
157        result++;
158        blink(value, result, numBlinks, depth+1);
159        blink(splitValue, result, numBlinks, depth+1);
160        return;
161    }
162    value = value * 2024;
163    return blink(value, result, numBlinks, depth+1);
164}

This approach works well for Part 1, but for Part 2, the sheer number of recursive calls makes the program infeasible. The exponential growth in the number of pebbles causes excessive memory usage and function calls, leading to stack overflow.

We cannot store the entire list of stones, however we can store the count of each different stone instead. I could make a map that store value as key, and it’s count as the value, but I was unsure of how many different stones are possible. To get a very rough upper bound assume, in the worst case each pebble split into 2 every blink, we will have \(2^{numBlinks}\) pebbles at the end. How many distinct?? I don’t know… Help me with that if you solved it that way. If you’ve computed this bound more rigorously, let me know in the comments!

Since, I was skeptical about the memory used to store each stone and it’s count. I used a different method, just caching the result for stones with values 0 to 9 for \(numBlinks\). {lineno-start=42}

1std::vector<std::vector<ull>> cache(10, std::vector<ull>(MAX_NUM_BLINKS, 0));
2// stores the resulting number of stones for values from 0 to 9
3// cache[v][b-1] represent number of pebbles after blinking b times starting with pebble of value v
121void fill_cache(int numBlinks)
122{
123    for(auto nb = 0; nb < numBlinks; ++nb)
124    {
125        for(ull i = 0; i < 10; ++i)
126        {
127            ull r = 1;
128            blink(i, r, nb+1);
129            cache[i][nb] = r;
130        }
131    }
132}

So before I start solving, I already know that cache[3][55] is what would happen if stone with value \(3\) is blinked \(56\) times. This will help short-circuit the entire blink calls for that value.

While pebbles may have values greater than 9, many will eventually be reduced to a single-digit value due to repeated splitting. Thus, caching results for numbers 0-9 is a memory-efficient approximation

This caching enabled finding the number of stones for large values. {lineno-start=141}

 1    if(value < 10)
 2    {
 3        if (cache[value][numBlinks - depth - 1] != 0)
 4        {
 5            numShortCircuited++;
 6            DEBUG("Short circuited (" << g_numShortCircuited << ")" << endl);
 7            DEBUG("Pebble " << value << " after " << numBlinks - depth << " blinks will be split into " << cache[value][numBlinks-1 - depth] << endl);
 8            result += (cache[value][numBlinks-1 - depth] - 1); 
 9            // that -1 is there because the current stone is being counted twice.
10            return; // no more recursive call again
11        }
12    }

Also, a bit about numDigits See file: Day11_Plutonian_Pebbles/numDigits.cpp {lineno-start=108}

 1int numDigits(unsigned long long i)
 2{
 3    int n = 1;
 4    if ( i >= 10000000000000000 ) { n += 16; i /= 10000000000000000; }
 5    if ( i >= 100000000         ) { n += 8; i /= 100000000; }
 6    if ( i >= 10000             ) { n += 4; i /= 10000; }
 7    if ( i >= 100               ) { n += 2; i /= 100; }
 8    if ( i >= 10                ) { n += 1; }
 9
10    return n;
11    // ref: https://stackoverflow.com/a/6655759
12}
Performance on Day 11 Part 2#

Approach

g++ (No -O3)

g++ -O3

Clang -O3

Speedup (g++ No -O3 → Clang -O3)

StackOverflow (Bitwise Check)

4,148,573 ns

2,298,124 ns

120 ns

~34,500x Faster

Logarithmic (log10)

14,081,701 ns

10,078,787 ns

40 ns

~350,000x Faster

Iterative Division (/ 10)

38,035,651 ns

11,254,840 ns

30 ns

~1.26M× Faster

String Conversion (std::to_string)

73,341,935 ns

34,854,592 ns

31,866,131 ns

~2.3x Faster

Builtin CLZ (__builtin_clzll)

2,909,022 ns

2,799,729 ns

30 ns

~72,000x Faster

Concepts Learned#

Constant Expression#

Introduced in C++11.

  • can be evaluated at compile time

  • give the compiler deep insight

  • constexpr is by design thread safe (A data race requires shared mutable state, something which is const is not mutable).

References and Resources#


This marks the fourth part of my Advent of Code 2024 journey. More updates soon. If you’re also learning C++ or participating in Advent of Code, I’d love to hear about your experiences! Share your thoughts, tips, or solutions in the comments below.

Comments