A few performance facts about bgpgrep

If you are a performance junkie like me, the first question that probably pops into your mind after a major code rewrite is something like:

Is it faster than before?

Let's now satisfy this curiosity (of mine) with some benchmarking.

Benchmark environment

Processor: Intel® Core™ i7-8565U at 1.80GHz (4 physical cores, 8 logical cores with hyperthreading)
Cache layout:
- L1 data cache: 128 KiB (4 instances)
- L1 instruction cache: 128 KiB (4 instances)
- L2 cache: 1 MiB (4 instances)
- L3 cache: 8 MiB (1 instance)
Memory: 16 GB DDR4 RAM, in two 8 GB banks
Disk: SAMSUNG MZALQ512HALU-000L1 (NVMe SSD)
Kernel: Linux 5.10.62-1-lts SMP x86_64 GNU/Linux

To avoid adulterating the results, we also:

disable cron and any other background file indexing service;
force performance CPU profile, disabling powersave mode;
disable Linux address space layout randomization for the duration of our tests;
increase kernel performance events sample rate;
drop filesystem caches and clean any temporary file;
run benchmarks in console mode, outside any desktop environment.
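
As an illustration only, the kernel-side steps in the list above could be scripted along these lines. This is a minimal sketch, not the exact procedure used for these benchmarks: the procfs and sysfs paths are standard Linux knobs, while `tuning_plan`, `apply_plan` and the `PERF_SAMPLE_RATE` value are hypothetical names and numbers chosen for the example. Applying the plan requires root.

```python
import os
from pathlib import Path

# Hypothetical value: the post only says the sample rate was increased.
PERF_SAMPLE_RATE = 200_000


def tuning_plan(cpu_count):
    """Return (path, value) pairs for the kernel knobs mentioned in the list."""
    plan = [
        # 0 disables address space layout randomization.
        ("/proc/sys/kernel/randomize_va_space", "0"),
        # Upper bound for perf event sampling, in samples per second.
        ("/proc/sys/kernel/perf_event_max_sample_rate", str(PERF_SAMPLE_RATE)),
    ]
    for cpu in range(cpu_count):
        # Select the performance governor on every logical CPU.
        plan.append(
            (f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_governor",
             "performance")
        )
    # 3 drops the page cache plus dentries and inodes.
    plan.append(("/proc/sys/vm/drop_caches", "3"))
    return plan


def apply_plan(plan):
    """Write every setting; must run as root on Linux."""
    os.sync()  # flush dirty pages so dropping caches is effective
    for path, value in plan:
        Path(path).write_text(value + "\n")
```

Calling `apply_plan(tuning_plan(os.cpu_count()))` as root would apply the settings. None of them persist across a reboot, which suits a change meant to last only for the duration of the tests.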

Both bgpscanner and bgpgrep have been compiled in release mode with full optimizations, as documented in their official build instructions, using clang version 12.0.1. For reference, we also do a benchmark run with bgpdump, version 1.6.2, as available from the Arch User Repository (AUR).

Results are calculated by averaging five runs of each command, immediately after one warmup round. MRT data is decompressed upfront, to avoid accounting for decompression overhead, and the output is sent directly to /dev/null, to avoid any disk write overhead.
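
The measurement procedure just described can be sketched as a small harness. This is a hypothetical illustration, assuming wall-clock timing through Python's standard library; `benchmark` is a name made up for the example and is not part of any of the tools.

```python
import statistics
import subprocess
import time


def benchmark(cmd, runs=5, warmup=1):
    """Time `cmd` and report average, best and worst wall-clock seconds."""

    def timed_run():
        start = time.perf_counter()
        # Discard the output, like redirecting to /dev/null in the shell.
        subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
        return time.perf_counter() - start

    for _ in range(warmup):
        timed_run()  # warmup rounds are executed but not recorded

    samples = [timed_run() for _ in range(runs)]
    return {
        "average": statistics.mean(samples),
        "best": min(samples),
        "worst": max(samples),
    }
```

For instance, `benchmark(["bgpgrep", "sydney/2020-12/uncompressed.mrt"])` would produce the three timing columns of the tables below for one tool. The memory column is not covered by this sketch.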

Let the fun begin!

We take the data for the first benchmark from RouteViews' Sydney Route Collector, and pull the very first RIB of December 2020, along with any subsequent updates from the same month. This gives us 47.1 GB of uncompressed MRT data to work with.

We then run our benchmarks with the following commands:

bgpgrep sydney/2020-12/uncompressed.mrt >/dev/null
bgpscanner sydney/2020-12/uncompressed.mrt >/dev/null
bgpdump -mv sydney/2020-12/uncompressed.mrt >/dev/null

Tool	Average (sec)	Best (sec)	Worst (sec)	Memory (KiB)
bgpgrep	404.45	401.62	411.38	2076
bgpscanner	453.59	451.93	455.13	2448
bgpdump	2053.73	2037.19	2082.22	2316

bgpgrep takes about 11% less time than bgpscanner, which is good (bgpdump, for reference, takes roughly five times as long as bgpgrep). Since this benchmark operates mostly on MRT update dumps, let's try the same on a different dataset, mostly made of RIBs. We pull nine RIBs from the RIPE NCC RIS RRC00 Route Collector, and obtain 25.7 GB worth of uncompressed MRT data. This time the benchmark is limited to bgpgrep and bgpscanner.

Executed commands and results:

bgpgrep rrc00/2019-12/rib-uncompressed.mrt >/dev/null
bgpscanner rrc00/2019-12/rib-uncompressed.mrt >/dev/null

Tool	Average (sec)	Best (sec)	Worst (sec)	Memory (KiB)
bgpgrep	295.84	292.20	298.14	2112
bgpscanner	333.35	321.73	339.56	3016

The same trend is confirmed: bgpgrep is about 12% faster, which suggests that the advantage is not data dependent.

Running our benchmarks under average system load, though, might lead to an interesting surprise:

bgpgrep isolario/2021-07/rib-uncompressed.mrt >/dev/null
bgpscanner isolario/2021-07/rib-uncompressed.mrt >/dev/null

Tool	Average (sec)	Best (sec)	Worst (sec)	Memory (KiB)
bgpgrep	344.90	342.88	347.03	2260
bgpscanner	411.39	405.13	412.70	2436

These runs have been performed under a regular GNOME desktop session, with other applications running. We used 60.8 GB worth of MRT data from the Isolario project Dagobah Collector, from July 2021 (mostly RIBs). Strikingly, the performance gain now approaches 20%.
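
A percentage gain can be read in two ways: as the share of running time saved, or as the increase in processing speed. The sketch below computes both from the average timings in the three tables above. The helper names `time_saved_pct` and `speedup_pct` are made up for this example.

```python
def time_saved_pct(old, new):
    """Share of the old running time that the new tool saves, in percent."""
    return (old - new) / old * 100.0


def speedup_pct(old, new):
    """Increase in processing speed (data per second), in percent."""
    return (old / new - 1.0) * 100.0


# Average seconds from the tables: (bgpscanner, bgpgrep).
RESULTS = {
    "RouteViews Sydney": (453.59, 404.45),
    "RIPE RRC00": (333.35, 295.84),
    "Isolario Dagobah": (411.39, 344.90),
}

for name, (scanner, grep) in RESULTS.items():
    print(
        f"{name}: {time_saved_pct(scanner, grep):.1f}% less time, "
        f"{speedup_pct(scanner, grep):.1f}% higher speed"
    )
```

On these figures the time saved is roughly 10.8%, 11.3% and 16.2%, while the speed increase is roughly 12.1%, 12.7% and 19.3%. Whichever metric is preferred, the gap is visibly wider in the run under desktop load.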

The reason might be a smarter use of memory, and the reduced chance of page faults. You might have noticed from our results that bgpgrep's memory requirements are moderate compared to bgpscanner's. What's less evident is that bgpgrep also keeps its data structures compact and doesn't like moving them around much. This lessens the page pressure on the system (and makes the CPU cache happier). The net effects of this aren't evident in the benchmarking environment, since bgpgrep and bgpscanner, in turn, are the only resource intensive tasks on the system, and the initial warmup round contributes to their ideal performance. When more tasks are concurrently fighting over memory, and processes might get migrated to different cores for various reasons, invalidating their cache, the value of bgpgrep's approach becomes more prominent.
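
The page fault explanation is a hypothesis, and one way to probe it would be to compare the fault counters the kernel keeps for each tool's process. The following is a minimal sketch, assuming Linux and Python's standard `resource` module; `child_usage` is a hypothetical helper, not something shipped with either tool.

```python
import resource
import subprocess


def child_usage(cmd):
    """Run `cmd` and return the page faults and peak memory it caused."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    subprocess.run(cmd, stdout=subprocess.DEVNULL, check=True)
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        # Faults served without disk I/O (page already in memory).
        "minor_faults": after.ru_minflt - before.ru_minflt,
        # Faults that had to read the page from disk.
        "major_faults": after.ru_majflt - before.ru_majflt,
        # Peak resident set size of the largest child reaped so far,
        # in KiB on Linux.
        "max_rss_kib": after.ru_maxrss,
    }
```

Running it once with the bgpgrep command line and once with the bgpscanner one, both in the idle console and under a loaded desktop session, would show whether the fault counts actually diverge the way the explanation above suggests.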

Conclusion

bgpgrep seems to be a nice improvement over bgpscanner, and I am quite satisfied with the performance improvements, especially since they come with a more solid codebase.

In the next few weeks I intend to improve the filtering engine. In general I'd like to stop for a bit to polish the codebase to make it more mature, before moving on to implement more features.

If you haven't already, be sure to check out the Micro BGP Suite at our official Git Repository.

As always, happy hacking to you all!

Lorenzo Cogotti
