From 929f5869aa4c494840843bb14218d8ddc2d00b95 Mon Sep 17 00:00:00 2001
From: Lorenzo Cogotti
Date: Wed, 20 Oct 2021 03:51:35 +0200
Subject: [PATCH] [blog/bgpgrep-performance-facts] Add article with the first benchmarks from bgpgrep

---
 content/blog/bgpgrep-performance-facts.md | 135 ++++++++++++++++++++++
 1 file changed, 135 insertions(+)
 create mode 100644 content/blog/bgpgrep-performance-facts.md

diff --git a/content/blog/bgpgrep-performance-facts.md b/content/blog/bgpgrep-performance-facts.md
new file mode 100644
index 0000000..209fae2
--- /dev/null
+++ b/content/blog/bgpgrep-performance-facts.md
@@ -0,0 +1,135 @@
+---
+title: "A few performance facts about bgpgrep"
+mobile_menu_title: "bgpgrep performance facts"
+date: 2021-10-19
+description: "After a complete codebase rewrite, some tweaks, and a lot of new
+features, it is time to look back and enjoy some benchmarking between bgpgrep and bgpscanner.
+Let's see how much things have improved and why."
+series: [ "ubgpsuite - The Micro BGP Suite" ]
+categories: [ "benchmarks", "development" ]
+tags: [ "ubgpsuite", "bgpscanner", "bgpgrep", "Networking", "BGP", "benchmarks" ]
+news_keywords: [ "ubgpsuite", "bgpscanner", "bgpgrep", "benchmarks" ]
+---
+
+## A few performance facts about bgpgrep
+
+If you are a performance junkie like me, the first question
+that probably pops into your mind after a major code rewrite is something like:
+
+*Is it faster than before?*
+
+Let's now satisfy this curiosity (of mine) with some benchmarking.
+
+## Benchmark environment
+* Processor: Intel® Core™ i7-8565U at 1.80GHz (4 physical cores, 8 logical cores with hyperthreading)
+* Cache layout:
+  - L1 data cache: 128 KiB (4 instances)
+  - L1 instructions cache: 128 KiB (4 instances)
+  - L2 cache: 1 MiB (4 instances)
+  - L3 cache: 8 MiB (1 instance)
+* Memory: 16 GB DDR4 RAM, in two 8 GB banks
+* Hard disk: SAMSUNG MZALQ512HALU-000L1
+* Kernel: Linux 5.10.62-1-lts SMP x86_64 GNU/Linux
+
+To avoid adulterating the results, we also:
+- disable `cron` and any other background file indexing service;
+- force the performance CPU profile, disabling powersave mode;
+- disable [Linux address space layout randomization](https://en.wikipedia.org/wiki/Address_space_layout_randomization) for the duration of our tests;
+- increase the kernel performance events sample rate;
+- drop filesystem caches and clean up any temporary files;
+- run benchmarks in console mode, outside any desktop environment.
+
+Both `bgpscanner` and `bgpgrep` have been compiled in release mode with full optimizations,
+as documented in their official build instructions, using `clang` version 12.0.1.
+For reference, we also do a benchmark run with `bgpdump`, version 1.6.2, as available from
+the [Arch User Repository (AUR)](https://aur.archlinux.org/packages/bgpdump/).
+
+Results are calculated by averaging five runs of each command, immediately
+after one warmup round. MRT data is decompressed upfront, to avoid accounting for
+decompression overhead, and the output is sent directly to `/dev/null`,
+to avoid any disk write overhead.
+
+## The show's on
+We take the data for the first benchmark from
+RouteViews' [Sydney Route Collector](http://archive.routeviews.org/route-views.sydney/bgpdata),
+and pull the very first RIB of December 2020, along with any subsequent updates from the same month.
+This gives us 47.1 GB of uncompressed MRT data to work with.
+
+We then run our benchmarks with the following commands:
+```sh
+bgpgrep sydney/2020-12/uncompressed.mrt >/dev/null
+bgpscanner sydney/2020-12/uncompressed.mrt >/dev/null
+bgpdump -mv sydney/2020-12/uncompressed.mrt >/dev/null
+```
+
+|            | Average (sec) | Best (sec) | Worst (sec) | Memory (KiB) |
+|------------|---------------|------------|-------------|--------------|
+| bgpgrep    | 404.45        | 401.62     | 411.38      | 2076         |
+| bgpscanner | 453.59        | 451.93     | 455.13      | 2448         |
+| bgpdump    | 2053.73       | 2037.19    | 2082.22     | 2316         |
+
+`bgpgrep` is about 11% faster than `bgpscanner`, which is good.
+Since this benchmark operates mostly on MRT update dumps, let's try the same
+on a different dataset, mostly made of RIBs.
+We pull nine RIBs from the RIPE NCC RIS [RRC00 Route Collector](https://data.ris.ripe.net/rrc00/2019.12/),
+and obtain 25.7 GB worth of uncompressed MRT data.
+This time the benchmark is limited to `bgpgrep` and `bgpscanner`.
+
+Executed commands and results:
+```sh
+bgpgrep rrc00/2019-12/rib-uncompressed.mrt >/dev/null
+bgpscanner rrc00/2019-12/rib-uncompressed.mrt >/dev/null
+```
+
+|            | Average (sec) | Best (sec) | Worst (sec) | Memory (KiB) |
+|------------|---------------|------------|-------------|--------------|
+| bgpgrep    | 295.84        | 292.20     | 298.14      | 2112         |
+| bgpscanner | 333.35        | 321.73     | 339.56      | 3016         |
+
+The same trend is confirmed: `bgpgrep` is about 12% faster, indicating that
+the advantage is not data dependent.
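+
+As a rough check, the percentages can be recomputed from the average times in the two tables.
+The article does not say which definition of "faster" it uses, so both common ones are shown here
+as an assumption; the symbol $T$ (average wall-clock time of a tool) is introduced only for this sketch.
+Relative reduction in run time, for the Sydney and RRC00 datasets respectively:
+
+$$
+\frac{T_{\text{bgpscanner}} - T_{\text{bgpgrep}}}{T_{\text{bgpscanner}}}:
+\quad \frac{453.59 - 404.45}{453.59} \approx 10.8\%,
+\quad \frac{333.35 - 295.84}{333.35} \approx 11.3\%
+$$
+
+Relative gain in processing speed, for the same two datasets:
+
+$$
+\frac{T_{\text{bgpscanner}}}{T_{\text{bgpgrep}}} - 1:
+\quad \frac{453.59}{404.45} - 1 \approx 12.1\%,
+\quad \frac{333.35}{295.84} - 1 \approx 12.7\%
+$$
+
+Under either definition the two datasets land within about one percentage point of each other,
+which is consistent with the rounded figures quoted in the text.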
+
+However, running our benchmarks under average system load leads to an
+interesting surprise:
+```sh
+bgpgrep isolario/2021-07/rib-uncompressed.mrt >/dev/null
+bgpscanner isolario/2021-07/rib-uncompressed.mrt >/dev/null
+```
+
+|            | Average (sec) | Best (sec) | Worst (sec) | Memory (KiB) |
+|------------|---------------|------------|-------------|--------------|
+| bgpgrep    | 344.90        | 342.88     | 347.03      | 2260         |
+| bgpscanner | 411.39        | 405.13     | 412.70      | 2436         |
+
+These runs have been performed under a regular GNOME desktop session,
+with other applications running. We used 60.8 GB worth of MRT data from
+the Isolario project [Dagobah Collector](https://isolario.it/Isolario_MRT_data/Dagobah/),
+from July 2021 (mostly RIBs).
+It might strike us that the performance gain now approaches 20%.
+
+The reason might be a smarter use of memory, and the reduced chance of page faults.
+You might have noticed from our results that `bgpgrep`'s memory requirements are
+moderate compared to `bgpscanner`'s; what's less evident is that `bgpgrep` also keeps
+its data structures compact and doesn't like moving them around much.
+This lessens the page pressure on the system (and makes the CPU cache happier).
+The net effects of this aren't evident in the benchmarking environment,
+since `bgpgrep` and `bgpscanner`, in turn, are the only resource intensive
+tasks on the system.
+The initial warmup round contributes to their ideal performance.
+When more tasks are concurrently fighting over memory, and processes might get
+migrated to different cores for various reasons, invalidating their cache,
+the value of `bgpgrep`'s approach becomes more prominent.
+
+## Conclusion
+
+`bgpgrep` seems to be a nice improvement over `bgpscanner`, and I am
+quite satisfied with the performance improvements, especially since they come with
+a more solid codebase.
+
+In the next few weeks I intend to improve the filtering engine.
+In general, I'd like to pause for a bit and polish the codebase to make it more mature,
+before moving on to implement more features.
+
+As always, happy hacking to you all!
+
+Lorenzo Cogotti