Memory Mapping and Parallelizing Random Forests for Speed and Cache Efficiency



Memory mapping enhances decision tree implementations by enabling constant-time statistical inference, and it is particularly effective when the memory-mapped tables fit in processor cache. It is harder to apply to random forests (ensembles of many trees), because the table sizes can easily outstrip cache capacity. We argue that careful system design for parallel and cache efficiency can make memory mapping effective for random forests. Our preliminary results show that memory-mapped forests can reduce inference latency by a factor of up to 30×.
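As a rough illustration of the idea, the sketch below assumes that "memory mapping" means precomputing a tree's output for every combination of discretized feature values and storing the results in a flat table, so that inference becomes a single indexed read. The toy tree, the bin count, and the names `tree_predict`, `build_table`, and `lookup` are hypothetical and are not taken from the authors' system.

```python
from itertools import product

BINS = 4  # assumed number of quantization bins per feature


def tree_predict(x0, x1):
    """Hypothetical toy decision tree over two pre-quantized features (0..3)."""
    if x0 < 2:
        return 0 if x1 < 1 else 1
    return 1 if x1 < 3 else 2


def flat_index(combo, bins):
    """Map a tuple of bin ids to one offset in the flat table (row-major)."""
    index = 0
    for b in combo:
        index = index * bins + b
    return index


def build_table(predict, n_features, bins):
    """Evaluate the tree once per bin combination and store each answer."""
    table = [0] * (bins ** n_features)
    for combo in product(range(bins), repeat=n_features):
        table[flat_index(combo, bins)] = predict(*combo)
    return table


def lookup(table, combo, bins):
    """Inference as one table read; no tree traversal is needed."""
    return table[flat_index(combo, bins)]


TABLE = build_table(tree_predict, 2, BINS)
```

Under this assumption the table holds `BINS ** n_features` entries per tree, which suggests why the tables for a forest of many trees can quickly exceed cache capacity.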

Kyle C. Hale
Associate Professor of Computer Science

Hale’s research lies at the intersection of operating systems, HPC, parallel computing, and computer architecture.