The new TechEmpower benchmark results are finally out. Round 14 took almost six months to be ready, and while more frequent testing would be desirable, this gave us plenty of time to improve things.
Last round, a bug was crashing our threaded benchmarks (which wasted time on restarts). Besides fixing that bug, the most notable changes that improved our performance were:
- Use of a custom epoll event dispatcher instead of the default glib one (the tests without epoll in their name are using the default glib dispatcher)
- jemalloc, which brought the coolest boost: you just need to link to it (or use LD_PRELOAD) and memory allocation magically gets faster (around 20% for the tests)
- CPU affinity, which helps the scheduler keep each thread/process pinned to a CPU core
- SO_REUSEPORT, which on Linux helps to get connections evenly distributed among the threads/processes
- Several code optimizations thanks to perf
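
To give an idea of what the CPU affinity and SO_REUSEPORT items mean in practice, here is a minimal Linux-only sketch in Python. It is not Cutelyst's actual code (Cutelyst is C++), and the helper names `pin_to_core` and `reuseport_listener` are made up for illustration. The underlying system features are the same ones, though: `sched_setaffinity` to restrict where a worker runs, and the `SO_REUSEPORT` socket option set before `bind()`.

```python
import os
import socket


def pin_to_core(worker_index):
    """Pin the caller to one CPU core (Linux only).

    worker_index is a hypothetical per-worker number (0, 1, 2, ...);
    it is mapped onto the cores this process is currently allowed to use.
    Returns the resulting affinity set.
    """
    allowed = sorted(os.sched_getaffinity(0))
    core = allowed[worker_index % len(allowed)]
    os.sched_setaffinity(0, {core})
    return os.sched_getaffinity(0)


def reuseport_listener(host, port, backlog=128):
    """Create a listening TCP socket with SO_REUSEPORT enabled.

    Every worker opens its own socket on the same address, and the
    kernel spreads incoming connections across those sockets.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # The option must be set before bind(), on every socket sharing the port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    sock.bind((host, port))
    sock.listen(backlog)
    return sock
```

Each worker would call something like `pin_to_core(worker_id)` once at startup and then `reuseport_listener("0.0.0.0", 8080)` to get its own accept queue, instead of all workers competing on one shared listening socket. The port number here is only an example.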
I’m very glad with our results. Cutelyst managed to stay among the top performers, which really pays off all the hard work, and it also gives us good numbers to show to anyone interested.
Check it out: https://www.techempower.com/benchmarks/