Latency tuning guidelines
Many of our tutorials and applications depend on reliable, low-latency processing.
This is a rough checklist of steps for creating an image (e.g. baseline_1804_lowlatency) with low, deterministic latency.
Factors affecting this:
- maximum stable processor clock speed
  - we cannot depend on boost states (Intel Turbo Boost), as they can't be maintained under all load conditions or across many cores.
  - it is possible to fix a small number of cores at a high boost clock by disabling others, but this usually requires BIOS support
- number of CPUs
  - systems with multiple CPUs can introduce latency through communication between them. Single-CPU systems are much simpler to optimize
  - NUMA issues: with multiple CPUs, you must pay close attention to which memory is allocated to which CPU, as well as where PCIe devices are attached.
- Choosing a node to use:
  - based on the factors above, pick a node with a single CPU and the highest available clock speed
- install a low-latency kernel
apt install linux-lowlatency-hwe-18.04 linux-tools-lowlatency-hwe-18.04
- use tuned-adm
  - tuned is a performance optimization project that wraps many configuration methods
apt install tuned
- hwloc (to show PCI and NUMA layout)
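
The tuned and hwloc steps above can be sketched as two shell helpers. This is a minimal sketch under assumptions, not the procedure used to build the image: the function names `apply_tuned_profile` and `pci_numa_nodes` are hypothetical, and `latency-performance` is a profile that ships with tuned but is only assumed to be a reasonable starting point here. `pci_numa_nodes` reads the per-device `numa_node` attribute from sysfs, which complements the topology view that hwloc's `lstopo` gives.

```shell
#!/bin/sh

# Apply a tuned profile and print the active one (needs root).
# Assumption: "latency-performance" is a sensible default for this image;
# `tuned-adm list` shows the profiles actually available on the system.
apply_tuned_profile() {
    profile="${1:-latency-performance}"
    tuned-adm profile "$profile" && tuned-adm active
}

# Print "<pci address> <numa node>" for every PCI device under a sysfs root.
# A node of -1 means the kernel reports no NUMA affinity for that device.
pci_numa_nodes() {
    root="${1:-/sys}"
    for dev in "$root"/bus/pci/devices/*; do
        # Skip entries without a readable numa_node (including an unmatched glob).
        [ -r "$dev/numa_node" ] || continue
        printf '%s %s\n' "${dev##*/}" "$(cat "$dev/numa_node")"
    done
}
```

After installing tuned and hwloc, one might run `apply_tuned_profile` as root, inspect the topology with `lstopo`, and then call `pci_numa_nodes` to confirm that the device of interest sits on the same node as the cores and memory the application will use.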
Dell R740 results
|1||0||stress -c 48||3300|
|1||0||stress --matrix 48||3000|
- USRP 2974:
  - C0 2500 MHz all core
  - C0 2300 MHz stress --matrix 0
  - C0 2000 MHz prime95 AVX
- Dell R740xd, Xeon Gold:
  - 3300 MHz C0 all core
  - 3000 MHz C0 stress-ng --matrix 0
  - 2300 MHz C0 p95 AVX-512
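
The notes do not say how the clock speeds above were measured. One way to spot-check sustained clocks under load is to summarize the per-core `cpu MHz` lines that x86 Linux kernels expose in `/proc/cpuinfo`. The helper below is a hedged sketch: the name `cpu_mhz_summary` is hypothetical, and its readings are instantaneous samples, so they may not match figures taken with another tool.

```shell
#!/bin/sh

# Summarize the "cpu MHz" lines of a cpuinfo-style file (default: /proc/cpuinfo).
# Prints: cores=<n> min=<MHz> max=<MHz> avg=<MHz>
# Prints nothing if the file has no "cpu MHz" lines (e.g. on non-x86 systems).
cpu_mhz_summary() {
    awk -F: '
        /^cpu MHz/ {
            mhz = $2 + 0                      # numeric value after the colon
            if (n == 0 || mhz < min) min = mhz
            if (n == 0 || mhz > max) max = mhz
            sum += mhz
            n++
        }
        END {
            if (n > 0)
                printf "cores=%d min=%.0f max=%.0f avg=%.0f\n", n, min, max, sum / n
        }
    ' "${1:-/proc/cpuinfo}"
}
```

For example, start a load such as `stress-ng --matrix 0` in the background, wait for clocks to settle, and call `cpu_mhz_summary` a few times. The `min` value is the one that matters for deterministic latency, since it reflects the slowest core at that moment.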