Learning CPP via Advent of Code 2024 [Day 9]

Learning CPP via Advent of Code 2024 [Days 9]#

Welcome back to my C++ learning journey through Advent of Code 2024! Continuing from Days 1 to 8, this part covers my solutions and reflections for Days 9. The goal remains the same: using these challenges to get better at C++.

This is my attempt to document my learning journey through Advent of Code 2024. Instead of cluttering with each post for each day’s challenge, I’ll be grouping my progress into consolidated posts. This page is the second part of my progress log covering Day 9.

Full solutions are available on GitHub: ABD-01/AoC2024

Day 9: Disk Fragmenter#

Solution Overview#

The disk map uses a dense format to represent the layout of files and free space on the disk. The digits alternate between indicating the length of a file and the length of free space.

—adventofcode.com/2024/day/9

Each file on the disk has an ID number, assigned sequentially based on its appearance, starting with ID 0. For example, the disk map 12345 consists of three files: a one-block file (ID 0), a three-block file (ID 1), and a five-block file (ID 2)

For example, in the disk map 24531

  • 2 represents a two-block file.

  • 4 represents four blocks of free space.

  • 5 represents a five-block file.

  • 3 represents three blocks of free space.

  • 1 represents a one-block file.” And it will look like

00....11111...2

The checksum for the disk map is sum of block’s position multiplied by file Id it contains. Given a disk map, rearrange it to make the disk compact.

Part 1:#

The arrangement to be done is by moving a single file block from the end of the disk to the available free space in the beginning of the disk. For instance disk map= 13212141 The movement would look like

0...11.22.3333.
03..11.22.333..
033.11.22.33...
033311.22.3....
033311322......

Logic:

  • Iterate over the disk with two pointers.

    • The first pointer, \(i\), starts at 0 and increments by 1, this will give the index of individual blocks (both files and spaces).

    • Second one, \(j\) start at the position of last file block, and decrements by 2 to be always pointing at file block.

  • If current position index is even (i.e. file block), do nothing.

  • If current position index is odd (i.e. free space), move files from block \(j\) into free space until:

    • free space at current position is used up, or

    • length of free space block is more than length of file block at \(j\), move \(j\) left to next file block and repeat.

 64void part1(std::vector<int> dm)
 65{
 66    Timer t;
 67
 68    long long unsigned int result = 0;
 69    int i = 0,j,id_first = 0, id_last;
 70    j = (dm.size() % 2) ? dm.size() - 1 : dm.size() - 2;
 71    id_last = j / 2;
 72    // An individual file/empty space is different from file/empty block. Files in same block have same id. 
 73    // i is the position of individual file/empty space
 74    // j is the position of file block starting from the end (decrements by 2 to keep pointing to file block)
 75    // id_first and id_last is the ids of files blocks. 
 76
 77    for(int pos = 0; pos < dm.size(); ++pos)
 78    {
 79        if( pos % 2 == 0) // even position i.e it is file block
 80        {
 81            for(int k = 0; k < dm[pos]; ++k)  // for files in block at pos
 82            {
 83                result += (id_first*i);
 84                ++i;
 85            }
 86            ++id_first;
 87        }
 88        else // odd position i.e. empty block
 89        {
 90            for(int k = 0; k < dm[pos]; ++k) // for empty spaces in block at pos
 91            {
 92                while(dm[j] == 0 && j > pos) //  shift j left until 
 93                {
 94                    j-=2;
 95                    --id_last;
 96                }
 97                if(j<pos) break;
 98
 99                --dm[j];
100                result += (id_last*i);
101                ++i;
102            }
103        }
104        if(j<pos) break;
105    }
106    cout << "Part 1: " << result << endl;
107}

Part 2:#

Rather than moving individual file blocks, move whole file instead to the first available free space. Using same example from Part 1, the movement would look like

0...11.22.3333.
0...11.22.3333.
022.11....3333.
022.11....3333.

The file with ID 3 could not be moved because there was no contiguous free space of length 4 available

I was stuck on this problem for a week, so first wrote the solution in python. Used a list of structure Block with two attributes block_size, file_id to represent the disk map. The original python code can be found at AoC2024/Day09_Disk_Fragmenter/main.py.

145std::vector<Block> moveBlocks(std::vector<Block>& dm, int j) {
146    std::vector<Block> tmp;
147    if (dm[j].file_id == -1)
148        return dm;  // Selected block is empty, return as is
149
150    for (size_t i = 0; i < dm.size(); i++) {
151        if (j < i) {
152            tmp.insert(tmp.end(), dm.begin() + i, dm.end());
153            break;
154        }
155
156        if (dm[i].file_id != -1) {
157            tmp.push_back(dm[i]);
158            continue;
159        }
160
161        if (dm[i].block_size < dm[j].block_size) {
162            tmp.push_back(dm[i]);
163            continue;
164        }
165
166        int diff = dm[i].block_size - dm[j].block_size;
167        if (diff > 0) {
168            tmp.emplace_back(dm[j].block_size, dm[j].file_id);
169            tmp.emplace_back(diff, -1);
170            dm[j].file_id = -1;
171        } else {
172            tmp.emplace_back(dm[j].block_size, dm[j].file_id);
173            dm[j].file_id = -1;
174        }
175
176        tmp.insert(tmp.end(), dm.begin() + i + 1, dm.end());
177        break;
178    }
179
180    return tmp;
181}
182
183void part2(std::vector<int> data) {
184    Timer t;
185    std::vector<Block> dm; /*populate the list*/
186    printBlocks(dm);
187
188    for (int j = dm.size() - 1; j >= 0; j--) {
189        dm = moveBlocks(dm, j);
190        printBlocks(dm);
191    }
192
193    long long unsigned int result = 0;
194    int pos = 0;
195    
196    for (const auto& b : dm) {
197        for (int i = 0; i < b.block_size; i++) {
198            if (b.file_id != -1)
199                result += (static_cast<unsigned long long>(b.file_id) * pos);
200            pos++;
201        }
202    }
203
204    std::cout << "Part 2: " << result << std::endl;
205}
Optimizations#

See my discussion on reddit related this section.

The obvious performance bottleneck in Python was the repeated creation of new copies of the disk map (dm = moveBlocks(dm, j)) in each iteration. Additionally, Python’s append and extend operations are expensive as well.

The solution was to update the disk map in place. This required to not use iterator on list that is changing it’s length inside the loop. Using u/TheZigerionScammer’s suggestion, I iterated backwards over file ids and selected index from list based on the id.

Apart from that porting the solution to C++, caused massive improvement as Python does not like loops.

162void moveBlocks(std::vector<Block>& dm, int j) {
163	...
164        int diff = dm[i].block_size - dm[j].block_size;
165        if (diff > 0) {
166            dm[i].block_size = diff;
167            dm.emplace(dm.begin()+i, dm[j].block_size, dm[j].file_id);
168            dm[j+1].file_id = -1; // to account for len increase
169        } else {
170            std::swap(dm[i], dm[j]);
171        }
172	...
173}
174void part2(std::vector<int> data) {
175	...
176    for (int id = id_last; id > -1; --id) {
177        j = dm.size() - 1;
178        while(dm[j].file_id != id) --j;
179
180        moveBlocks(dm, j);
181    }
182    ...
183}
Performance on Day 9 Part 2#

Solution

Time (in secs)

File

Original Code

7.714727

main.py

In Place Update
(u/TheZigerionScammer’s suggestion)

7.579476

main2.py

List of Lists of Block
(u/ndunnett’s suggestion)

12.329163

main3.py

Original Solution in C++

0.428756

main.cpp
(with macro USE_LEGACY)

In Place Update in C++

0.096095

main.cpp

Hardware Used: GitHub Codespaces VM: AMD EPYC 7763 (2 vCPUs @ 3.2GHz), 8GB RAM, 32GB

Found a more optimized solution on the internet, that solves this in \(O(N\log N)\).

Concepts Learned#

std::vector<T>::emplace#

std::vector::emplace inserts a new element into the vector at a specified position by constructing it in place, which can improve performance by avoiding unnecessary copies or moves.

Example
1int main() {
2    std::vector<std::pair<int, std::string>> vec;
3
4    vec.emplace(vec.begin(), 1, "one"); 
5
6    // Without emplace, we would need to create a temporary std::pair and use push/insert:
7    // vec.insert(vec.begin(), std::make_pair(1, "one"));
8}

Extending a vector in C++#

In C++, the equivalent of Python’s list1.extend(list2) can be achieved using std::vector<T>::insert. {lineno-start=241}

1tmp.insert(tmp.end(), dm.begin() + i, dm.end());
2// Appends elements from index 'i' to the end of 'dm' into 'tmp'.

Calling Python from C++ and Compiling with CMake#

 8find_package(Python3 COMPONENTS Interpreter Development)
 9
10if(NOT Python3_FOUND)
11	message(FATAL_ERROR "Python3 not found")
12endif()
13# set(Python3_ROOT_DIR "/path/to/Python/Python310")
14# set(Python3_INCLUDE_DIRS "${Python3_ROOT_DIR}/include/")
15# set(Python3_LIBRARIES "${Python3_ROOT_DIR}/libs/python310.lib")
16
17target_include_directories(day9 PRIVATE ${Python3_INCLUDE_DIRS})
18target_link_libraries(day9 PRIVATE ${Python3_LIBRARIES})
19
20if(NOT WIN32)  # Link pthread, dl, and util on non-Windows platforms
21    target_link_libraries(day9 PRIVATE pthread dl util)
22endif()

References and Resources#

Comments