L1 Cache Sim: interlude

My last post was on simming an L1 cache. Being pleased with how it came out, I thought I’d say a little about its construction.

The cache implementation is:

interface CacheLine {
  tag: number | null; // main-memory address
  last_used: number;
}

type Program = () => Generator<number, undefined, undefined>;

export class Cache {
  sets: number;
  ways: number;
  data: CacheLine[][];

  hits: number;
  misses: number;
  evictions: number;

  id: number;

  generation: number;
  last_address: number | null;

  program: Program;

  constructor(sets: number, ways: number, program: Program) {
    this.sets = sets;
    this.ways = ways;
    this.data = [];
    this.hits = 0;
    this.misses = 0;
    this.evictions = 0;
    this.generation = 0;
    this.last_address = null;
    this.program = program;
    this.id = 0;

    for (let i = 0; i < sets; i++) {
      if (this.data[i] === undefined) this.data[i] = [];

      for (let j = 0; j < ways; j++) {
        this.data[i][j] = {
          tag: null,
          last_used: 0,
        };
      }
    }
  }

  reset(id: number) {
    this.id = id;
    this.hits = 0;
    this.misses = 0;
    this.evictions = 0;
    this.generation = 0;
    this.last_address = null;

    for (let i = 0; i < this.sets; i++) {
      if (this.data[i] === undefined) this.data[i] = [];

      for (let j = 0; j < this.ways; j++) {
        this.data[i][j] = {
          tag: null,
          last_used: 0,
        };
      }
    }
  }

  access(byte_address: number) {
    this.last_address = byte_address;
    this.generation += 1;

    const setIndex = this.getSetIndex(byte_address);
    const set = this.data[setIndex];

    const address = (byte_address >> 6) << 6;

    for (const way of set) {
      if (way.tag === address) {
        this.hits += 1;
        return;
      }
    }

    this.misses += 1;

    let least_recently_used: number = Number.MAX_SAFE_INTEGER;
    let least_recently_used_index: number = 0;
    for (const [i, way] of set.entries()) {
      if (way.tag === null) {
        this.data[setIndex][i] = {
          tag: address,
          last_used: this.generation,
        };
        return;
      } else {
        if (way.last_used < least_recently_used) {
          least_recently_used = way.last_used;
          least_recently_used_index = i;
        }
      }
    }

    this.evictions += 1;
    this.data[setIndex][least_recently_used_index] = {
      tag: address,
      last_used: this.generation,
    };
  }

  getSetIndex(address: number) {
    const set = (address >> 6) & 0b111111;

    return set;
  }
}

Let’s break it down. The definition of a cache line is:

interface CacheLine {
  tag: number | null; // main-memory address
  last_used: number;
}

The tag field tells us whether a cache line is unused (tag === null) or whether it is used in which case the tag is the main-memory address of the cache line.

In a given set the simulation implements a least-recently used (LRU) cache, so we record the generation the cache line / address was last accessed.

type Program = () => Generator<number, undefined, undefined>;

This is our “program” definition. Those familiar with javascript will recognise this as being the type of a javascript generator. I will come back to this later.

The cache is then defined with a javascript class:

export class Cache {
  sets: number;
  ways: number;
  data: CacheLine[][];

  hits: number;
  misses: number;
  evictions: number;

  id: number;

  generation: number;
  last_address: number | null;

  program: Program;

  constructor(sets: number, ways: number, program: Program) {
    this.sets = sets;
    this.ways = ways;
    this.data = [];
    this.hits = 0;
    this.misses = 0;
    this.evictions = 0;
    this.generation = 0;
    this.last_address = null;
    this.program = program;
    this.id = 0;

    for (let i = 0; i < sets; i++) {
      if (this.data[i] === undefined) this.data[i] = [];

      for (let j = 0; j < ways; j++) {
        this.data[i][j] = {
          tag: null,
          last_used: 0,
        };
      }
    }
  }

  reset(id: number) {
    this.id = id;
    this.hits = 0;
    this.misses = 0;
    this.evictions = 0;
    this.generation = 0;
    this.last_address = null;

    for (let i = 0; i < this.sets; i++) {
      if (this.data[i] === undefined) this.data[i] = [];

      for (let j = 0; j < this.ways; j++) {
        this.data[i][j] = {
          tag: null,
          last_used: 0,
        };
      }
    }
  }

  access(byte_address: number) {
    this.last_address = byte_address;
    this.generation += 1;

    const setIndex = this.getSetIndex(byte_address);
    const set = this.data[setIndex];

    const address = (byte_address >> 6) << 6;

    for (const way of set) {
      if (way.tag === address) {
        this.hits += 1;
        return;
      }
    }

    this.misses += 1;

    let least_recently_used: number = Number.MAX_SAFE_INTEGER;
    let least_recently_used_index: number = 0;
    for (const [i, way] of set.entries()) {
      if (way.tag === null) {
        this.data[setIndex][i] = {
          tag: address,
          last_used: this.generation,
        };
        return;
      } else {
        if (way.last_used < least_recently_used) {
          least_recently_used = way.last_used;
          least_recently_used_index = i;
        }
      }
    }

    this.evictions += 1;
    this.data[setIndex][least_recently_used_index] = {
      tag: address,
      last_used: this.generation,
    };
  }

  getSetIndex(address: number) {
    const set = (address >> 6) & 0b111111;

    return set;
  }
}

This most important part of this is probably the access method. Given a main-memory address this method looks up the set, measures a hit / miss and evicts where necessary.

So what does a program look like? As I said before, the program is represented with a javascript generator. An example program might look like this:

export function linearAccess(
  offset: number,
  n: number,
  stride: number,
  loops: number
) {
  return function* () {
    let j = 0;
    while (j < loops) {
      let i = 0;
      while (i < n) {
        yield offset + i * stride;
        i += 1;
      }
      j += 1;
    }
  };
}

And indeed that is the program used in the previous post: consecutive memory access modulo some stride. The “program” simply yields addresses.

This is basically equivalent to an actual program you might write with the arr living at some offset in memory:

let arr = [];

let j = 0;
while (j < loops) {
  let i = 0;
  while (i < n) {
    arr[i * stride];
    i += 1;
  }
  j += 1;
}

Aside: I wonder if there is some clever trick I could play with javascript proxies to write effectively that but get addresses?

Okay, cool. What I’ve not shown yet is how the access function of the Cache is invoked…how is our simulation actually run? I have wrapped the simulation in a CacheView.svelte component. I will only show the <script> part of the component…the rest is simply the HTML to draw the table and styling:

<script lang="ts">
	import { frame } from '$lib/frame';
	import type { Cache } from './cache';

	export let cache: Cache;

	let scroll: boolean = true;

	let run_id: number = 0;

	async function run() {
		const id = (run_id += 1);
		console.log(`Thread ${id}: starting`);
		cache.reset(id);
		for (const address of cache.program()) {
			await frame();

			if (run_id !== id || cache.id !== id) {
				console.log(`Thread ${id}: exiting`);
				return;
			}

			cache.access(address);
			cache.data = [...cache.data];
		}
	}

	function reset() {
		run_id = 0;
		cache.reset(0);
		cache.data = [...cache.data];
	}
</script>

A couple of immediate things to note:

We have a run function. Calling this starts the simulation.
The run function is async

For the dual purposes of not blocking the main thread and “animating” the simulation, the run function is async. Within the for loop the first thing we do is await frame(). What is frame? It’s simply a wrapper around requestAnimationFrame in a a way that is lovely to use from an async function:

export function frame(): Promise<number> {
  return new Promise((resolve) => requestAnimationFrame(resolve));
}

So the operation goes like this:

The user clicks the Run button which kicks off the run function.
The run function then iterates over the cache.program (again a generator) which yields main-memory addresses.
After waiting for an animation frame the address is passed to cache.access which invokes the logic of the cache.
The CacheView then redraws using the usual svelte reactivity:

...

	<div class="buttons">
		<button on:click={(ev) => run()}>Run</button>
		<button on:click={(ev) => reset()}>Reset</button>
	</div>
	<div>
		<div>Hits: {cache.hits}</div>
		<div>Misses: {cache.misses}</div>
		<div>
			Miss rate: {cache.misses + cache.hits === 0
				? '-'
				: ((cache.misses / (cache.misses + cache.hits)) * 100).toFixed(2)}%
		</div>
		<div>Evictions: {cache.evictions}</div>
	</div>

...

Pretty straightforward and very neat I think, hence I am writing about it.

There are a some changes I would like to make:

Given the await frame() after every memory address is yielded the simulation is stuck at 60-addresses per second (assuming your browser is drawing at 60 FPS). For a larger example this is probably a bit slow, so I want to have some multiplier (i.e. N number of accesses per frame).
It would be nice to step through a simulation rather than the continuous run until completion.
I think it could generally use some colour / animation.

With those changes in place I’ll be ready for part 2 of the series. Stay tuned.