[blog/bgpgrep-performance-facts] Add article with the first benchmarks from bgpgrep

4 years ago · 929f5869aa
parent c64976ab66
commit 929f5869aa
1 changed files with 135 additions and 0 deletions
--- a/content/blog/bgpgrep-performance-facts.md
+++ b/content/blog/bgpgrep-performance-facts.md
@@ -0,0 +1,135 @@
+---
+title: "A few performance facts about bgpgrep"
+mobile_menu_title: "bgpgrep performance facts"
+date: 2021-10-19
+description: "After a complete codebase rewrite, some tweaks, and a lot of new
+features, it is time to look back and enjoy some benchmarking between bgpgrep and bgpscanner.
+Let's see how much things have improved and why."
+series: [ "ubgpsuite - The Micro BGP Suite" ]
+categories: [ "benchmarks", "development" ]
+tags: [ "ubgpsuite", "bgpscanner", "bgpgrep", "Networking", "BGP", "benchmarks" ]
+news_keywords: [ "ubgpsuite", "bgpscanner", "bgpgrep", "benchmarks" ]
+---
+
+## A few performance facts about bgpgrep
+
+If you are a performance junkie like me, the first question
+that probably pops into your mind after a major code rewrite is something like:
+
+*Is it faster than before?*
+
+Let's now satisfy this curiosity (of mine) with some benchmarking.
+
+## Benchmark environment
+* Processor: Intel® Core™ i7-8565U at 1.80GHz (4 physical cores, 8 logical cores with hyperthreading)
+* Cache layout:
+  - L1 data cache: 128 KiB (4 instances)
+  - L1 instruction cache: 128 KiB (4 instances)
+  - L2 cache: 1 MiB (4 instances)
+  - L3 cache: 8 MiB (1 instance)
+* Memory: 16 GB DDR4 RAM, in two 8 GB banks
+* Hard disk: SAMSUNG MZALQ512HALU-000L1
+* Kernel: Linux 5.10.62-1-lts SMP x86_64 GNU/Linux
+
+To avoid adulterating the results, we also:
+- disable `cron` and any other background file indexing service;
+- force the performance CPU profile, disabling powersave mode;
+- disable [Linux address space layout randomization](https://en.wikipedia.org/wiki/Address_space_layout_randomization) for the duration of our tests;
+- increase the kernel performance events sample rate;
+- drop filesystem caches and remove any temporary files;
+- run benchmarks in console mode, outside any desktop environment.
+
+Both `bgpscanner` and `bgpgrep` have been compiled in release mode with full optimizations,
+as documented in their official build instructions, using `clang` version 12.0.1.
+For reference, we also do a benchmark run with `bgpdump`, version 1.6.2, as available from
+the [Arch User Repository (AUR)](https://aur.archlinux.org/packages/bgpdump/).
+
+Results are calculated by averaging five runs of each command, immediately
+after one warmup round. MRT data is decompressed upfront, to avoid accounting for
+decompression overhead, and the output is sent directly to `/dev/null`,
+to avoid any disk write overhead.
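+
+The measurement procedure can be approximated with a small shell helper.
+This is only a sketch, not the script actually used for these results:
+the `bench` function name is hypothetical, it assumes GNU `date` for sub-second
+timestamps, and it records wall-clock time only (the memory column is not covered).
+
+```sh
+# Hypothetical helper: bench RUNS COMMAND [ARGS...]
+# Runs COMMAND once as an untimed warmup, then RUNS more times,
+# and prints the average, best and worst wall-clock time in seconds.
+bench() {
+    runs=$1; shift
+    "$@" >/dev/null 2>&1              # warmup round, not timed
+    i=0
+    while [ "$i" -lt "$runs" ]; do
+        start=$(date +%s.%N)          # assumes GNU date (%N = nanoseconds)
+        "$@" >/dev/null 2>&1          # output discarded, as in the article
+        end=$(date +%s.%N)
+        echo "$start $end"            # one "start end" pair per timed run
+        i=$((i + 1))
+    done | awk '
+        { t = $2 - $1; sum += t
+          if (NR == 1 || t < best)  best  = t
+          if (NR == 1 || t > worst) worst = t }
+        END { if (NR > 0)
+                  printf "avg=%.2f best=%.2f worst=%.2f\n", sum / NR, best, worst }'
+}
+```
+
+Calling `bench 5 bgpgrep some-dump.mrt` would perform one warmup run followed by
+five timed ones, and print a single summary line.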
+
+## The show's on
+We take the data for the first benchmark from
+RouteViews' [Sydney Route Collector](http://archive.routeviews.org/route-views.sydney/bgpdata),
+and pull the very first RIB of December 2020, along with any subsequent updates from the same month.
+This gives us 47.1 GB of uncompressed MRT data to work with.
+
+We then run our benchmarks with the following commands:
+```sh
+bgpgrep sydney/2020-12/uncompressed.mrt >/dev/null
+bgpscanner sydney/2020-12/uncompressed.mrt >/dev/null
+bgpdump -mv sydney/2020-12/uncompressed.mrt >/dev/null
+```
+
+|            | Average (sec) | Best (sec) | Worst (sec) | Memory (KiB) |
+|------------|---------------|------------|-------------|--------------|
+| bgpgrep    | 404.45        | 401.62     | 411.38      | 2076         |
+| bgpscanner | 453.59        | 451.93     | 455.13      | 2448         |
+| bgpdump    | 2053.73       | 2037.19    | 2082.22     | 2316         |
+
+`bgpgrep` takes about 11% less time than `bgpscanner` (a speedup of roughly 1.12x), which is good.
+Since this benchmark operates mostly on MRT update dumps, let's try the same
+on a different dataset, mostly made of RIBs.
+We pull nine RIBs from RIPE NCC RIS [RRC00 Route Collector](https://data.ris.ripe.net/rrc00/2019.12/),
+and obtain 25.7 GB worth of uncompressed MRT data.
+This time the benchmark is limited to `bgpgrep` and `bgpscanner`.
+
+Executed commands and results:
+```sh
+bgpgrep rrc00/2019-12/rib-uncompressed.mrt >/dev/null
+bgpscanner rrc00/2019-12/rib-uncompressed.mrt >/dev/null
+```
+
+|            | Average (sec) | Best (sec) | Worst (sec) | Memory (KiB) |
+|------------|---------------|------------|-------------|--------------|
+| bgpgrep    | 295.84        | 292.20     | 298.14      | 2112         |
+| bgpscanner | 333.35        | 321.73     | 339.56      | 3016         |
+
+The same trend is confirmed: `bgpgrep` again takes about 11% less time (a speedup of roughly 1.13x),
+suggesting that the advantage is not data dependent.
+
+However, running our benchmarks under average system load leads to an
+interesting surprise:
+```sh
+bgpgrep isolario/2021-07/rib-uncompressed.mrt >/dev/null
+bgpscanner isolario/2021-07/rib-uncompressed.mrt >/dev/null
+```
+
+|            | Average (sec) | Best (sec) | Worst (sec) | Memory (KiB) |
+|------------|---------------|------------|-------------|--------------|
+| bgpgrep    | 344.90        | 342.88     | 347.03      | 2260         |
+| bgpscanner | 411.39        | 405.13     | 412.70      | 2436         |
+
+These runs have been performed under a regular GNOME desktop session,
+with other applications running. We used 60.8 GB worth of MRT data from
+the Isolario project [Dagobah Collector](https://isolario.it/Isolario_MRT_data/Dagobah/),
+from the month of July 2021 (mostly RIBs).
+It might strike us that `bgpgrep` now takes about 16% less time, a speedup approaching 1.2x.
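+
+The relative gains can be rederived from the "Average" column of the three tables.
+A minimal sketch, assuming a hypothetical `compare` helper that takes the slower
+time first and reports both the share of time saved and the speedup ratio:
+
+```sh
+# Hypothetical helper: compare BASELINE_SECONDS NEW_SECONDS
+# Prints the share of time saved and the speedup ratio.
+compare() {
+    awk -v base="$1" -v new="$2" 'BEGIN {
+        printf "time saved: %.1f%%, speedup: %.2fx\n",
+               (base - new) / base * 100, base / new
+    }'
+}
+
+compare 453.59 404.45   # Sydney dataset    -> time saved: 10.8%, speedup: 1.12x
+compare 333.35 295.84   # RRC00 dataset     -> time saved: 11.3%, speedup: 1.13x
+compare 411.39 344.90   # Dagobah, loaded   -> time saved: 16.2%, speedup: 1.19x
+```
+
+Whichever of the two metrics is preferred, the gap widens noticeably in the run
+performed under system load.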
+
+The reason might be a smarter use of memory and a reduced chance of page faults.
+You might have noticed from our results that `bgpgrep` needs less memory
+than `bgpscanner`. What's less evident is that `bgpgrep` also keeps
+its data structures compact and doesn't like moving them around much.
+This lessens the memory pressure on the system (and makes the CPU cache happier).
+The net effects of this aren't evident in the benchmarking environment,
+since `bgpgrep` and `bgpscanner`, in turn, are the only resource-intensive
+tasks on the system.
+The initial warmup round also contributes to their ideal performance.
+When more tasks are concurrently fighting over memory, and processes might get
+migrated to different cores for various reasons, invalidating their cache,
+the value of `bgpgrep`'s approach becomes more prominent.
+
+## Conclusion
+
+`bgpgrep` seems to be a nice improvement over `bgpscanner`, and I am
+quite satisfied with the performance gains, especially since they come with
+a more solid codebase.
+
+In the next few weeks I intend to improve the filtering engine.
+In general I'd like to stop for a bit to polish the codebase to make it more mature,
+before moving on to implement more features.
+
+As always, happy hacking to you all!
+
+Lorenzo Cogotti