Project 2020q3 (Cache Lab) - HackMD (2023)

---title: 2020q3 Project (cache-lab) tags: sysprog---# 2020q3: cache-lab contributed by < `TsundereChen` >> GitHub Repo: [TsundereChen/csapp-cache-lab](https://github .com/TsundereChen/csapp-cache-lab)> [README](https://csapp.cs.cmu.edu/3e/README-cachelab)## Files```- cahcelab.[c,h] Lab Assistant files - csim.c Cache Simulator - csim-ref Executable reference cache simulator - Driver.py Driver, run test-csim and test-trans - Makefile - test-csim Cache simulator test tool - test-trans.c Tool for test transposition function-tracegen.c helper for test-trans-trans.c Transpose function-traces/trace files used by test-csim.c```## Environment* Valgrind must be installed on the system## Part A - Downloading the program argument - [getopt.h](https://www.gnu.org/software/libc/manual/html_node/Using-Getopt.html) - Create arguments according to csim```clike=void parseArg( int argc , char * argv[]) { if (argc == 1) { printf("Usage: ./csim [-hv] -s-MI-B-T\n"); exit(1); }; int opt; while ((opt = getopt(argc, argv, "hvs:E:b:t:")) != -1) { switch (option) { case 'h': printf("Gebruik: ./csim [-hv] -s-MI-B-T\n"); printf("Options:\n"); printf(" -h Print this help message.\n"); printf(" -v Optional extended flag.\n"); printf(" -sNumber of index bits set.\n"); printf(" -EThe number of lines in the set.\n"); printf(" -bBlock offset bits.\n"); printf(" -tTrace file.\n"); printf("\n"); printf("Examples:\n"); printf(" linux> ./csim -s 4 -E 1 -b 4 -ttraces/yi.trace \ n"); printf(" linux> ./csim -v -s 8 -E 2 -b 4 -t traces/yi.trace\n"); output(1); case 'v': full = true; break; 's' case: setIndexBit = atoi(optarg); break; 'E' case: linePerSet = atoi(optarg); break; 'b' case: blockOffsetBit = atoi(optarg); gap; 't' case: traceFile = fopen(optarg, "r"); pause; default: printf("Usage: ./csim [-hv] -s-MI-B-T\n"); exit(1); } }}``### Cache-struct - The cache looks like this![cache_read_cs213](https://i.imgur.com/sJWRc45.png) - Below you will see find the structure line or cache - Part A of this exercise doesn't require us to modify the data in the cache, we just need to check that the correct bit is 1 and that the tag matches input``clike=typedef struct cacheData { int valid; int lru; uint64_t tag; } cacheData;```- Input - input files are stored in the `traces/` folder - all input data have the same format, e.g. ```clike= L 10.1 M 20 , 1 L 22.1 S 18.1

,``` - Four instructions - I (load instruction) - Not used in this lab - L (load) - M (change) - Load + save - Two steps - S (save) - Use `sscanf` to read input - We must extract this information from the `
` - Tag - Cacheset - Offset - Since we already know the size of the cache set and the size of the offset, we can use a bitwise operation to extract this information ``` clike=while(fgets(buffer, sizeof(buffer) , trace file ) != NULL ){ counter++; if(buffer[0] == 'I') continue; else{ sscanf(buffer, "%c%lx,%d", &operation, &address, &size); unsigned int tag = parseTag(setIndexBit, blockOffsetBit, address); unsigned int cacheset = parseCacheSet(setIndexBit, blockOffsetBit, address); unsigned int offset = parseOffset(setIndexBit, blockOffsetBit, address); switch(operation){ case 'L': checkCache(tag, cache set, offset, counter); pause; case 'M': checkCache(tag, cache set, offset, counter); checkCache(tag, cache set, offset, counter); pause; "S" case: checkCache(tag, cache set, offset, counter); pause; } } }```- Hit/Miss cache - If the correct bit is 1 and the tag matches, it is a match - If there is no match, it is a match. - We need to search the cache and see if there is an invalid cache - If so, we activate this cache and enter the tag information - If there is no match and all cache lines are valid - Then it's a ban, you need to find the least used cache cache, then replace it. - By counting the execution line, we can achieve LRU - The smaller the LRU, the less often it is used``` clike=void checkCache(int tag, int cacheSet, int offset, int counter) { for(int i = 0; i < linePerSet; i++){ if(cache[cacheSet][i].valid == 1 && cache[cacheSet][i].tag == tag){ cache[cacheSet][i].lru = counter; hits++; if(full) printf("HIT\n"); give; } } misses +; if(full) printf("ERROR\n"); int n_lru = 0; for(int i = 0; i < linePerSet; i++){ if(cache[cacheSet][i].valid == 0){ cache[cacheSet][i].valid = 1; cache[cacheSet][i].tag = tag; cache[cacheSet][i].lru = counter; give; } else if(cache[cacheSet][n_lru].lru > cache[cacheSet][i].lru) n_lru = i; } evictions++; if(verbose) printf("EXECTION!\n"); cache[cacheSet][n_lru].valid = 1; cache[cacheSet][n_lru].tag = tag; cache[cacheSet][n_lru].lru = counter; return;};```## Part B- You need to write a matrix transpose function - Three test cases - 32 x 32 - 64 x 64 - 61 x 67 - s = 5, E = 1, b = 5 - Cache size is 32*2 - Cache block size is 32 bytes - Can hold 8 integers at a time ### 32 x 32 - Instead of processing one integer at a time as the system is also fetching nearby data at the same time, it can store more numbers at once process integers, use cache and generate faster results. - Since the cache block size is 32 bytes and can hold 8 integers, we move 8 integers at a time. ```clike=// Case 1 for ( int i = 0; i < N; i += 8) { for ( int j = 0; j < M; j += 8) { for (int k = i; k < i + 8; k++) { int t0 = A[k][j]; int t1 = A[k][j+1]; int t2 = A[k][j+2]; int t3 = A[k][j+3]; int t4 = A[k][j+4]; int t5 = A[k][j+5]; int t6 = A[k][j+6]; int t7 = A[k][j+7]; B[j][k] = t0; B[j + 1] [k] = t1; B[j + 2] [k] = t2; B[j + 3] [k] = t3; B[j + 4][k] = t4; B[j + 5][k] = t5; B[j + 6] [k] = t6; B[j + 7] [k] = t7; } } }```:::warningWhy doesn't the following code use the cache?```clike=for (int i = 0; i < N; i += 8) { for (int j = 0; j < M ;j += 8) { for (int k = i; k < i + 8; k++) { B[j + 0][k] = A[k][j]; B[j + 1] [k] = A[k] [j+1]; B[j + 2] [k] = A[k] [j+2]; B[j + 3] [k] = A[k] [j+3]; B[j + 4][k] = A[k][j+4]; B[j + 5][k] = A[k][j+5]; B[j + 6] [k] = A[k] [j+6]; B[j + 7] [k] = A[k] [j+7]; } }}```:::### 64x64- See blog [深入理解计算机系统 (CS:APP) - 高速缓存实验 Cache Lab 解析](https://billc.io/2019/05/ csapp - cachelab /)- Process using 4 blocks.```clike=int t0, t1, t2, t3, t4, t5, t6, t7; for (int i = 0; i < N; i += 8) { for (int j = 0; j < M; j += 8) { for (int k = i; k < i + 4; k++) { t0 = A[k][j]; t1 = A[k][j+1]; t2 = A[k][j+2]; t3 = A[k][j+3]; t4 = A[k][j+4]; t5 = A[k][j+5]; t6 = A[k][j+6]; t7 = A[k][j+7]; B[j][k] = t0; B[j + 1] [k] = t1; B[j + 2] [k] = t2; B[j + 3] [k] = t3; B[j + 0][k + 4] = t7; B[j + 1] [k + 4] = t6; B[j + 2][k+4] = t5; B[j + 3] [k + 4] = t4; } for (int h = 0; h < 4; h++) { t0 = A[i + 4][j + 3 - h]; t1 = A[i + 5][j + 3 - h]; t2 = A[i + 6][j + 3 - h]; t3 = A[i + 7][j + 3 - h]; t4 = A[i + 4][j + 4 + h]; t5 = A[i + 5][j + 4 + h]; t6 = A[i + 6][j + 4 + h]; t7 = A[i + 7][j + 4 + h]; B[j + 4 + h] [i + 0] = B[j + 3 - h] [i + 4]; B[j + 4 + h] [i + 1] = B[j + 3 - h] [i + 5]; B[j + 4 + h] [i + 2] = B[j + 3 - h] [i + 6]; B[j + 4 + h] [i + 3] = B[j + 3 - h] [i + 7]; B[j + 3 - h][i + 4] = t0; B[j + 3 - h] [i + 5] = t1; B[j + 3 - h] [i + 6] = t2; B[j + 3 - h] [i + 7] = t3; B[j + 4 + h] [i + 4] = t4; B[j + 4 + h] [i + 5] = t5; B[j + 4 + h] [i + 6] = t6; B[j + 4 + h] [i + 7] = t7; } }}```### 61 x 67- Due to the method used in `32 x 32` and the change of steps, we can conclude that processing 16 integers at a time works best```clike= for ( int i = 0; i < N; i += 16) { for (int j = 0; j < M; j += 16) { for (int k = i; k < i + 16 && k < N ; k++ ) { for (int l = j; l#define MATRIX_SIZE 8void transpose(){ int A[MATRIX_SIZE][MATRIX_SIZE], B[MATRIX_SIZE][MATRIX_SIZE]; for(int i = 0; i < MATRIX_SIZE; i++){ for(int j = 0; j < MATRIX_SIZE; j++){ A[i][j] = (i+1) * (j+1); } }; for(int j = 0; j < ROZMIAR_MACIERZY; j++){ for(int i = 0; i < ROZMIAR_ MATRIX; i++){ int t0 = A[i][j]; B[j][i] = t0; } } return;}``` - Konfiguracja z suchymi pamięciami podręcznymi - 32 elementy, 4 elementy bezpośrednio w kaart gebracht Hitpercentage: 96,62% ![n8_direct_mapped](https://i.imgur.com/8BqKC6b.png) - 32-items Volledig associatief uit 4 worden Hitpercentage: 97,16% ![n8_fully_associative](https://i.imgur.com/gFDVjkb.png) - 32-invoer 4-woorden 2-weg set associatief Hitpercentage: 97, 08% ![n8_2-way_set_associative](https://i.imgur.com/DqeDLku.png) - Haal 4 variabelen tegelijk op```clike=void transpose(){ int A[MATRIX_SIZE][MATRIX_SIZE], B[ MATRIX_SIZE][MATRIX_SIZE]; for(int i = 0; i < MATRIX_SIZE; i++){ for(int j = 0; j < MATRIX_SIZE; j++){ A[i][j] = (i+1) * (j+1); } }; for(int j = 0; j < MATRIX_SIZE; j+=4){ for(int i = 0; i < MATRIX_SIZE; i++){ int t0 = A[i][j]; int t1 = A[i][j+1]; int t2 = A[i][j+2]; int t3 = A[i][j+3]; B[j][i] = t0; B[j+1][i] = t1; B[j+2][i] = t2; B[j+3][i] = t3; } } return;}``` - Konfiguracja z suchymi pamięciami podręcznymi - 32 elementy, 4 elementy bezpośrednio w kaart gebracht Hitpercentage: 93,81% ![n8_direct_mapped](https://i.imgur.com/3sD4ENr.png) - 32-items Volledig associatief uit 4 worden Hitpercentage: 96,54% ![n8_fully_associative](https://i.imgur.com/4nfIzTi.png) - 32-invoer 4-woorden 2-weg set associatief Hitpercentage: 96, % ![n8_direct_mapped](https:// /i.imgur.com/LSW5j6H.png) - 32-elementy, 4-woorden, łączny procent stowarzyszeń Hitpercentage: 95,34% ![n8_fully_associative](https://i.imgur.com /gE4OoW9.png) - 32-elementy, 4 słowa, 2-weg zestaw skojarzeń Procent trafień: 95,86% ![n8_2-way_set_associative](https://i.imgur.com/nfetRxO.png) | | Bezpośrednio w kaart gebracht | Stowarzyszenie Volledig | Stowarzyszenie 2-wegset || --------- | --------- | ------------------ | ---------------------------------- || 1 tegelijk | 96,62% | 97,16% | 97,08% || 4 tegelijk | 93,81% | 96,54% | 96,72% || 8 tegelijk | 94,2% | 95,34% | 95,86% || 4 tegelijk, drie voor lus | 95,56% | 96,08% | 96,42% |#### N = 32- Metoda fikcyjna - Konfiguracja z suchymi pamięciami podręcznymi - 32 elementy w 4 drewnianych bezpośrednio w kaart gebracht Hitpercentage: 93,81% ![n32_direct_mapped](https://i.imgur .com/fDf4oV0.png ) - 32 elementy, 4 drewniane, volledig associatief Hitpercentage: 95,55% ![n32_fully_associative](https://i.imgur.com/EVbWJoJ.png) - 32 elementy, 4 drewniane 2-weg set associatief Hitpercentage: 94,75 % ![n32_2-way_set_associative](https://i.imgur.com/t3z4gmq.png) - Haal 4 variabelen tegelijk op - Konfiguracja z suchymi pamięciami podręcznymi - 32 vermeldingen i 4 drewniane bezpośrednie w kaart gebracht Procent trafień: 89,87% ![n32_direct_mapped ](https://i.imgur.com/v9Q1jU4.png) - 32 vermeldingen, volledig associatief, 4 woorden Hitpercentage: 91,19% ![n32_fully_associative](https://i .imgur.com/WLJR5ps.png) — 32 elementy, 4 drewniane, 2 zestawy skojarzeń Hitpercentage: 91,19% ![n32_2-way_set_associative](https://i.imgur.com/MCUNEiO.png) — Posiada 8 zmiennych tegelijk op - Konfiguracja z suchymi pamięciami podręcznymi - 32 - 4-drewniane bezpośrednie w kaart gebracht Hitpercentage: 93,36% ![n32_direct_mapped](https://i.imgur.com/aXMvre0.png) - 32-inzendingen 4-woorden volledig associatief Hitpercentage: 96,32% ![n32_fully_associative]( https://i.imgur.com/DtSDnq6.png) - 32-items, 4 woorden, 2-weg set associatief Hitpercentage: 94,86% ! [n32_2-way_set_associative](https://i.imgur.com/agCjGC8. png) - 4 różne opcje tegelijk op, suche lus - Konfiguracja z suchymi pamięciami podręcznymi - 32 elementy i 4 drewniane bezpośrednie w kaart gebracht Hitpercentage: 86,15%! [n32_direct_mapped] (https://i.imgur.com/pixiqBd.png) - 32 vermeldingen, 4 woorden, volledig associatief Hitpercentage: 94,36%! [n32_fully_associative](https://i.imgur.com/T0ZjHk5.png) - 32 vermeldingen, 4 drewniane, 2-weg set associatief Hitpercentage: 94,03% ![n32_2-way_set_associative](https://i.imgur .com/wSePOLg.png) | | Bezpośrednio w kaart gebracht | Stowarzyszenie Volledig | Stowarzyszenie 2-wegset || --------- | --------- | ------------------ | ---------------------------------- || 1 tegelijk | 93,81% | 95,55% | 94,75% || 4 tegelijk | 89,87% | 91,19% | 91,19% || 8 tegelijk | 93,36% | 96,32% | 94,86% || 4 tegelijk, drie voor lus | 86,15% | 94,36% | 94,03% || 8 tegelijk, drie voor lus | 90,11% | 94,03% | 90,11% |#### N = 64| | Bezpośrednio w kaart gebracht | Stowarzyszenie Volledig | Stowarzyszenie 2-wegset || --------- | --------- | ------------------ | ---------------------------------- || 1 tegelijk | 90,88% | 92,17% | 92,13% || 4 tegelijk | 89,75% | 95,54% | 91,09% || 8 tegelijk | 89,65% | 95,44% | 90,78% || 4 tegelijk, drie voor lus | 88,57% | 94,22% | 89,94% || 8 tegelijk, drie voor lus | 90,90% | 94,58% | 92,05% |## Referentie- [CS:APP 3/e cachelab](https://hackmd.io/@pSnFKx_GTlmTWXn4A8lpKw/SkeA19jWN?type=view)- [CSAPP:Lab4-Cache Lab](https:// zhuanlan.zhihu.com/p/142942823)

Last Modified

Tsundere Chen Only human, nothing else

6033

published on HackMD

apply

z

By clicking below, you consent to ourTerms of Service.

Log in with Facebook Subscribe via Twitter Register via GitHub Sign in to Dropbox

New to HackMD?apply

References

Top Articles
Latest Posts
Article information

Author: Prof. Nancy Dach

Last Updated: 07/02/2023

Views: 6059

Rating: 4.7 / 5 (57 voted)

Reviews: 80% of readers found this page helpful

Author information

Name: Prof. Nancy Dach

Birthday: 1993-08-23

Address: 569 Waelchi Ports, South Blainebury, LA 11589

Phone: +9958996486049

Job: Sales Manager

Hobby: Web surfing, Scuba diving, Mountaineering, Writing, Sailing, Dance, Blacksmithing

Introduction: My name is Prof. Nancy Dach, I am a lively, joyous, courageous, lovely, tender, charming, open person who loves writing and wants to share my knowledge and understanding with you.