Advent of Code 2024 [Days 10 and 11]
Advent of Code 2024 [Days 10 and 11]#
Welcome back to my C++ learning journey through Advent of Code 2024! Continuing from Day 9, this part covers my solutions and reflections for Days 10 and 11. The goal remains the same: using these challenges to get better at C++.
This is my attempt to document my learning journey through Advent of Code 2024. Instead of cluttering with each post for each day’s challenge, I’ll be grouping my progress into consolidated posts. This page is the fourth part of my progress log covering Days 10 and 11.
Full solutions are available on GitHub: ABD-01/AoC2024.
Day 10: Hoof It - Finding Hiking Trails on a Topographic Map#
Solution Overview#
Given a topographic map where each position’s height ranges from 0
(lowest) to 9
(highest), the goal is to find a valid hiking trail. A valid trail is one where the height increases gradually (i.e., by 1 unit at each step) from a trailhead (0-height
) to a 9-height
position.
Part 1 : For each trailhead, find number of distinct 9-height
position that can be reached.
Perform a Depth First Search (DFS) from each trailhead. When reaching a 9-height
position, increment the result count. To ensure distinct target counting, keep track of visited positions.
Part 2: For each trailhead, find the number of distinct trails leading to 9-height
position.
Same as before, perform a DFS. However, this time, multiple trails can lead to same target (i.e. 9-height
), hence will not stop exploration even if position is already visited.
76void DFS(const std::vector<std::vector<int>>& map, const std::pair<int, int>& idx, std::unordered_set<int>& visited, int& result)
77{
78 static std::pair<int, int> dirs[4] = {
79 {-1, 0}, {0, 1}, {1, 0}, {0, -1}
80 };
81
82 int curr = map[idx.first][idx.second];
83
84#if PART_2
85 if(visited.find((idx.first*map[0].size() + idx.second)) != visited.end()) return;
86 visited.insert((idx.first*map[0].size() + idx.second));
87#endif
88
89 if(curr == 9)
90 {
91 cout << "Found 9-height position" << endl;
92 ++result;
93 return;
94 }
95
96 for(const auto& dir: dirs)
97 {
98 int i = idx.first+dir.first, j = idx.second+dir.second;
99 if(i<0||i>map.size()-1||j<0||j>map[0].size()-1)
100 {
101 continue;
102 }
103 int temp = map[i][j];
104 if(temp-curr != 1)
105 {
106 continue;
107 }
108 DFS(map, {i,j}, visited, result);
109 }
110}
111
112int main(int argc, char* argv[])
113{
114 while(!toVisit.empty())
115 {
116 int result = 0;
117 std::unordered_set<int> visited = {};
118 DFS(map, toVisit.top(), visited, result);
119 toVisit.pop();
120 r += result;
121 }
122 cout << "Part " << (PART_2 ? 2 : 1) << ": " << r << endl;
123}
Optimizations#
Instead of using unordered_set<int>
data structure for keeping record of visited, tried using different data structures.
Data Structure |
Time (µs) |
---|---|
1044 |
|
402 |
|
314 |
|
134 |
|
113 |
vector<vector<bool>>
performs poorly due to:It has double indirection overhead.
Memory layout is non-contiguous, this means elements of different rows are not stored contiguously, leading to inefficient memory access.
set<int>
is relatively slower:Uses balanced tree (typically Red-Black Tree) with \(O(\log N)\) insertion and lookups.
unordered_set<int>
:hashing overhead
amortized insertion time is \(O(1)\), even though worst-case insertion (rehashing) is \(O(n)\).
unordered_set<int>
is useful when grid is extremely sparse, meaning only a few cells need to be marked as visited. However, in dense grids, it adds hashing overhead.
vector<char>
andvector<bool>
are fastest:both use contiguous in memory, which minimizes cache misses and improves performance
vector<bool>
packs multiple boolean values (up to 8) into a single byte.
39template<typename VisitedType>
40void DFS(const std::vector<std::vector<int>>& map, const std::pair<int, int>& idx, VisitedType& visited, int& result);
41
42template <typename VisitedType>
43bool isVisitedAndMark(VisitedType& visited, const std::pair<int, int>& idx, int width);
44
45// Specialization for std::unordered_set<int>
46template<>
47bool isVisitedAndMark<std::unordered_set<int>>(std::unordered_set<int>& visited, const std::pair<int, int>& idx, int width) {
48 int pos = idx.first * width + idx.second;
49 if (visited.find(pos) != visited.end()) return true;
50 visited.insert(pos);
51 return false;
52}
53
54// Specialization for std::vector<std::vector<bool>>
55template<>
56bool isVisitedAndMark(std::vector<std::vector<bool>>& visited, const std::pair<int, int>& idx, int) {
57 if (visited[idx.first][idx.second]) return true;
58 visited[idx.first][idx.second] = true;
59 return false;
60}
Concepts Learned#
vector<bool>#
std::vector<bool>
is bit-packed, meaning it stores multiple bool
values in a single byte, reducing memory footprint.
std::vector<bool>
is a possibly space-efficient specialization of std::vector for the type bool.
Source: gcc/libstdc++-v3/include/bits/stl_bvector.h
Best for very large boolean datasets where memory efficiency is critical or need a dynamically resizable bit-array but don’t require direct bit manipulation.
Pros |
Cons |
---|---|
Space-efficient (bit-packed), dynamic resizing. |
no direct memory access. |
Contiguous memory allows for efficient cache utilization. |
performance issues due to bit manipulations. (Slower per-element modification) |
1std::vector<bool> vec = {false, false, true, false};
2// std::vector<bool> does not store actual bools
3for(auto& i: vec) // compilation error: cannot bind `bool&` to `std::vector<bool>::reference
4{
5 std::cout << i << " ";
6}
Alternatives#
std::bitset<N>
: Faster but requires compile-time size.std::vector<char>
: Uses 1 byte per value, avoids bit-packing overhead, direct memory access.
Return Value Optimization#
# TODO
Example
1std::vector<std::pair<int, int>> my_function() {
2 std::vector<std::pair<int, int>> local_variable;
3 return std::move(local_variable);
4}
5// warning: moving a local object in a return statement prevents copy elision [-Wpessimizing-move]
Template Specialization#
In C++, function template specialization allows you to define custom implementations of a function template for specific types. This is useful when the default behavior is inefficient or incorrect for certain data structures.
1template <typename T>
2bool isVisitedAndMark(T& visited, const std::pair<int, int>& idx, int width);
3
4template <> // explicit specialization for T = std::unordered_set<int>
5bool isVisitedAndMark<std::unordered_set<int>>(std::unordered_set<int>& visited,...) {
6 // Implementation for std::unordered_set<int>
7}
When specializing a function template, its template arguments can be omitted if template argument deduction can provide them from the function arguments:
—Explicit specializations of function templates - cppreference.com
std::find#
Example
1#include <iostream>
2#include <vector>
3#include <algorithm>
4
5int main() {
6 std::vector<int> nums = {10, 20, 30, 40, 50};
7 std::vector<int>::iterator it = std::find(nums.begin(), nums.end(), 30);
8
9 if (it != nums.end())
10 std::cout << "Found at index: " << std::distance(nums.begin(), it) << std::endl;
11 else
12 std::cout << "Not found." << std::endl;
13}
Stack in CPP#
# TODO
References and Resources#
c++ - Why isn’t vector<bool> a STL container? - Stack Overflow
c++ - What are copy elision and return value optimization? - Stack Overflow
Function Templates Partial Specialization in C++ - Fluent C++
Day 11: Plutonian Pebbles#
Solution Overview#
Given a list of pebbles, each with a numerical value. After each blink, the number of pebbles and their values change according to specific rules. The goal is to find the total number of pebbles after a set number of blinks.
The conditions are: $\( f(n+1) = \left\{ \begin{array}{ll} 1 & \text{if } n = 0 \\ \text{HIGH}(n), \text{LOW}(n) & \text{if numDigits(\)n\() is even} \\ n \times 2024 & \text{else} \end{array} \right. \)$
where \(\text{HIGH}(n)\) refers to the first half of digits., \(\text{LOW}(n)\) refers to the second half of digits.
Example: If \(n = 123456 ,\ \text{HIGH}(n) = 123, \ \text{LOW}(n) = 456\)
My solution was to use a recursive function where each call represents a single blink. It has a depth
parameter that terminates the recursion when the desired number of blinks is achieved.
Part 1: 25 Blinks Part 2: 75 Blinks
134void blink(ull value, ull& result, int numBlinks, int depth)
135{
136 if(depth > numBlinks - 1) // depth start with 0
137 return;
138
139 if(value == 0)
140 {
141 value = 1;
142 return blink(value, result, numBlinks, depth+1);
143 }
144 int n = numDigits(value);
145 if (!(n%2)) // n is even
146 {
147 ull splitValue = 0;
148 int lastDigit = 0;
149 int base10 = 1;
150 for(auto i = 0; i < n/2; ++i)
151 {
152 lastDigit = value % 10;
153 value /= 10;
154 splitValue += (lastDigit * base10);
155 base10 *= 10;
156 }
157 result++;
158 blink(value, result, numBlinks, depth+1);
159 blink(splitValue, result, numBlinks, depth+1);
160 return;
161 }
162 value = value * 2024;
163 return blink(value, result, numBlinks, depth+1);
164}
This approach works well for Part 1, but for Part 2, the sheer number of recursive calls makes the program infeasible. The exponential growth in the number of pebbles causes excessive memory usage and function calls, leading to stack overflow.
We cannot store the entire list of stones, however we can store the count of each different stone instead. I could make a map that store value as key, and it’s count as the value, but I was unsure of how many different stones are possible. To get a very rough upper bound assume, in the worst case each pebble split into 2 every blink, we will have \(2^{numBlinks}\) pebbles at the end. How many distinct?? I don’t know… Help me with that if you solved it that way. If you’ve computed this bound more rigorously, let me know in the comments!
Since, I was skeptical about the memory used to store each stone and it’s count. I used a different method, just caching the result for stones with values 0 to 9 for \(numBlinks\). {lineno-start=42}
1std::vector<std::vector<ull>> cache(10, std::vector<ull>(MAX_NUM_BLINKS, 0));
2// stores the resulting number of stones for values from 0 to 9
3// cache[v][b-1] represent number of pebbles after blinking b times starting with pebble of value v
121void fill_cache(int numBlinks)
122{
123 for(auto nb = 0; nb < numBlinks; ++nb)
124 {
125 for(ull i = 0; i < 10; ++i)
126 {
127 ull r = 1;
128 blink(i, r, nb+1);
129 cache[i][nb] = r;
130 }
131 }
132}
So before I start solving, I already know that cache[3][55]
is what would happen if stone with value \(3\) is blinked \(56\) times. This will help short-circuit the entire blink calls for that value.
While pebbles may have values greater than 9, many will eventually be reduced to a single-digit value due to repeated splitting. Thus, caching results for numbers 0-9
is a memory-efficient approximation
This caching enabled finding the number of stones for large values. {lineno-start=141}
1 if(value < 10)
2 {
3 if (cache[value][numBlinks - depth - 1] != 0)
4 {
5 numShortCircuited++;
6 DEBUG("Short circuited (" << g_numShortCircuited << ")" << endl);
7 DEBUG("Pebble " << value << " after " << numBlinks - depth << " blinks will be split into " << cache[value][numBlinks-1 - depth] << endl);
8 result += (cache[value][numBlinks-1 - depth] - 1);
9 // that -1 is there because the current stone is being counted twice.
10 return; // no more recursive call again
11 }
12 }
Also, a bit about numDigits
See file: Day11_Plutonian_Pebbles/numDigits.cpp
{lineno-start=108}
1int numDigits(unsigned long long i)
2{
3 int n = 1;
4 if ( i >= 10000000000000000 ) { n += 16; i /= 10000000000000000; }
5 if ( i >= 100000000 ) { n += 8; i /= 100000000; }
6 if ( i >= 10000 ) { n += 4; i /= 10000; }
7 if ( i >= 100 ) { n += 2; i /= 100; }
8 if ( i >= 10 ) { n += 1; }
9
10 return n;
11 // ref: https://stackoverflow.com/a/6655759
12}
Approach |
g++ (No -O3) |
g++ -O3 |
Clang -O3 |
Speedup (g++ No -O3 → Clang -O3) |
---|---|---|---|---|
StackOverflow (Bitwise Check) |
4,148,573 ns |
2,298,124 ns |
120 ns |
~34,500x Faster |
Logarithmic ( |
14,081,701 ns |
10,078,787 ns |
40 ns |
~350,000x Faster |
Iterative Division ( |
38,035,651 ns |
11,254,840 ns |
30 ns |
~1.26M× Faster |
String Conversion ( |
73,341,935 ns |
34,854,592 ns |
31,866,131 ns |
~2.3x Faster |
Builtin CLZ ( |
2,909,022 ns |
2,799,729 ns |
30 ns |
~72,000x Faster |
Concepts Learned#
Constant Expression#
Introduced in C++11.
can be evaluated at compile time
give the compiler deep insight
constexpr is by design thread safe (A data race requires shared mutable state, something which is
const
is not mutable).
References and Resources#
algorithm - Finding the number of digits of an integer - Stack Overflow
Back to Basics: const and constexpr - Rainer Grimm - CppCon 2021
This marks the fourth part of my Advent of Code 2024 journey. More updates soon. If you’re also learning C++ or participating in Advent of Code, I’d love to hear about your experiences! Share your thoughts, tips, or solutions in the comments below.
Comments