[blog/bgpgrep-performance-facts] Add article with the first benchmarks from bgpgrep
parent
c64976ab66
commit
929f5869aa
@ -0,0 +1,135 @@
---
title: "Few performance facts about bgpgrep"
mobile_menu_title: "bgpgrep performance facts"
date: 2021-10-19
description: "After a complete codebase rewrite, some tweaks, and a lot of new
features, it is time to look back and enjoy some benchmarking between bgpgrep and bgpscanner.
Let's see how much things have improved and why."
series: [ "ubgpsuite - The Micro BGP Suite" ]
categories: [ "benchmarks", "development" ]
tags: [ "ubgpsuite", "bgpscanner", "bgpgrep", "Networking", "BGP", "benchmarks" ]
news_keywords: [ "ubgpsuite", "bgpscanner", "bgpgrep", "benchmarks" ]
---

## Few performance facts about bgpgrep

If you are a performance junkie like me, the first question
that probably pops into your mind after a major code rewrite is something like:

*Is it faster than before?*

Let's now satisfy this curiosity (of mine) with some benchmarking.

## Benchmark environment
* Processor: Intel® Core™ i7-8565U at 1.80 GHz (4 physical cores, 8 logical cores with Hyper-Threading)
* Cache layout:
  - L1 data cache: 128 KiB (4 instances)
  - L1 instruction cache: 128 KiB (4 instances)
  - L2 cache: 1 MiB (4 instances)
  - L3 cache: 8 MiB (1 instance)
* Memory: 16 GB DDR4 RAM, in two 8 GB banks
* Hard disk: SAMSUNG MZALQ512HALU-000L1
* Kernel: Linux 5.10.62-1-lts SMP x86_64 GNU/Linux

To avoid skewing the results, we also:
- disable `cron` and any background file indexing service;
- force the performance CPU profile, disabling powersave mode;
- disable [Linux address space layout randomization](https://en.wikipedia.org/wiki/Address_space_layout_randomization) for the duration of our tests;
- increase the kernel performance events sample rate;
- drop filesystem caches and clean up any temporary files;
- run benchmarks in console mode, outside any desktop environment.

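Some of the steps above map to well-known Linux procfs/sysfs knobs. The following is a minimal sketch of how they could be scripted, not the exact procedure used for these benchmarks: the `set_knob` and `prepare_benchmark` helpers, the `DRY_RUN` switch and the sample rate value are assumptions made for illustration. Stopping `cron` and leaving the desktop session are not covered, since they depend on the distribution.

```sh
#!/bin/sh
# Sketch of the preparation steps listed above, using standard Linux
# procfs/sysfs knobs. Dry run by default: set DRY_RUN=0 and run as root
# to actually apply the settings.
DRY_RUN=${DRY_RUN:-1}
# Placeholder value: the article does not state the sample rate used.
PERF_SAMPLE_RATE=${PERF_SAMPLE_RATE:-100000}

# set_knob VALUE FILE: write VALUE to FILE, or only report it in dry-run mode.
set_knob() {
    if [ "$DRY_RUN" = 1 ]; then
        printf 'would write %s to %s\n' "$1" "$2"
    else
        printf '%s\n' "$1" > "$2"
    fi
}

prepare_benchmark() {
    # Force the "performance" frequency governor on every CPU that has one.
    for gov in /sys/devices/system/cpu/cpu[0-9]*/cpufreq/scaling_governor; do
        if [ -e "$gov" ]; then
            set_knob performance "$gov"
        fi
    done
    # Disable address space layout randomization (the usual default is 2).
    set_knob 0 /proc/sys/kernel/randomize_va_space
    # Set the maximum perf events sample rate.
    set_knob "$PERF_SAMPLE_RATE" /proc/sys/kernel/perf_event_max_sample_rate
    # Flush dirty pages, then drop page cache, dentries and inodes.
    if [ "$DRY_RUN" != 1 ]; then
        sync
    fi
    set_knob 3 /proc/sys/vm/drop_caches
}

prepare_benchmark
```

In dry-run mode the script only prints what it would write, which makes it easy to review before touching a real system.
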
Both `bgpscanner` and `bgpgrep` have been compiled in release mode with full optimizations,
as documented in their official build instructions, using `clang` version 12.0.1.
For reference, we also run the benchmark with `bgpdump` version 1.6.2, as available from the
[Arch User Repository (AUR)](https://aur.archlinux.org/packages/bgpdump/).

Results are calculated by averaging five runs of each command, immediately
after one warmup round. MRT data is decompressed upfront, so that decompression
overhead is not accounted for, and the output is sent directly to `/dev/null`
to avoid any disk write overhead.

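The measurement procedure (one discarded warmup round, then five timed runs reduced to average, best and worst) can be sketched as a small shell function. This is only an illustration: the article does not say which tool collected the timings, so the `bench` helper is hypothetical, it relies on GNU `date` for the `%N` (nanoseconds) format, and it does not measure memory usage.

```sh
# bench RUNS CMD...: run CMD once as a discarded warmup, then RUNS timed
# runs. Prints "average best worst" wall-clock times in seconds.
# Assumes GNU date (for %N); output of CMD is sent to /dev/null.
bench() {
    runs=$1
    shift
    "$@" >/dev/null 2>&1    # warmup round, not measured
    i=0
    while [ "$i" -lt "$runs" ]; do
        start=$(date +%s.%N)
        "$@" >/dev/null 2>&1
        end=$(date +%s.%N)
        echo "$start $end"
        i=$((i + 1))
    done | awk '
        {
            t = $2 - $1
            sum += t
            if (NR == 1 || t < best) best = t
            if (NR == 1 || t > worst) worst = t
        }
        END { if (NR) printf "%.2f %.2f %.2f\n", sum / NR, best, worst }'
}

# Example (not executed here):
#   bench 5 bgpgrep sydney/2020-12/uncompressed.mrt
```
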
## The show's on
We take the data for the first benchmark from
RouteViews' [Sydney Route Collector](http://archive.routeviews.org/route-views.sydney/bgpdata),
and pull the very first RIB of December 2020, along with all subsequent updates from the same month.
This gives us 47.1 GB of uncompressed MRT data to work with.

We then run our benchmarks with the following commands:
```sh
bgpgrep sydney/2020-12/uncompressed.mrt >/dev/null
bgpscanner sydney/2020-12/uncompressed.mrt >/dev/null
bgpdump -mv sydney/2020-12/uncompressed.mrt >/dev/null
```

| Tool       | Average (sec) | Best (sec) | Worst (sec) | Memory (KiB) |
|------------|---------------|------------|-------------|--------------|
| bgpgrep    | 404.45        | 401.62     | 411.38      | 2076         |
| bgpscanner | 453.59        | 451.93     | 455.13      | 2448         |
| bgpdump    | 2053.73       | 2037.19    | 2082.22     | 2316         |

`bgpgrep` takes about 11% less time than `bgpscanner`, which is good.
Since this benchmark operates mostly on MRT update dumps, let's try the same
on a different dataset, mostly made of RIBs.
We pull nine RIBs from the RIPE NCC RIS [RRC00 Route Collector](https://data.ris.ripe.net/rrc00/2019.12/),
and obtain 25.7 GB worth of uncompressed MRT data.
This time the benchmark is limited to `bgpgrep` and `bgpscanner`.

Executed commands and results:
```sh
bgpgrep rrc00/2019-12/rib-uncompressed.mrt >/dev/null
bgpscanner rrc00/2019-12/rib-uncompressed.mrt >/dev/null
```

| Tool       | Average (sec) | Best (sec) | Worst (sec) | Memory (KiB) |
|------------|---------------|------------|-------------|--------------|
| bgpgrep    | 295.84        | 292.20     | 298.14      | 2112         |
| bgpscanner | 333.35        | 321.73     | 339.56      | 3016         |

The same trend is confirmed: `bgpgrep` is about 12% faster, indicating that
the advantage is not data dependent.

However, running our benchmarks under average system load reveals an
interesting surprise:
```sh
bgpgrep isolario/2021-07/rib-uncompressed.mrt >/dev/null
bgpscanner isolario/2021-07/rib-uncompressed.mrt >/dev/null
```

| Tool       | Average (sec) | Best (sec) | Worst (sec) | Memory (KiB) |
|------------|---------------|------------|-------------|--------------|
| bgpgrep    | 344.90        | 342.88     | 347.03      | 2260         |
| bgpscanner | 411.39        | 405.13     | 412.70      | 2436         |

These runs have been performed under a regular GNOME desktop session,
with other applications running. We used 60.8 GB worth of MRT data from
the Isolario project [Dagobah Collector](https://isolario.it/Isolario_MRT_data/Dagobah/),
from July 2021 (mostly RIBs).
Strikingly, the performance gain now approaches 20%
(344.90 seconds against 411.39 seconds on average).

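The percentages quoted for the three datasets depend on how "faster" is defined. As a quick cross-check, the sketch below recomputes them from the average timings in the tables, both as the share of run time saved and as the throughput gain. The `speedup` helper is a name made up for this example and is not part of ubgpsuite.

```sh
# speedup SLOW FAST: given two average run times in seconds, print
# "<run time saved in %> <throughput gain in %>".
speedup() {
    awk -v slow="$1" -v fast="$2" 'BEGIN {
        printf "%.1f %.1f\n", (slow - fast) / slow * 100, (slow / fast - 1) * 100
    }'
}

speedup 453.59 404.45   # RouteViews Sydney: prints "10.8 12.1"
speedup 333.35 295.84   # RIPE RRC00:        prints "11.3 12.7"
speedup 411.39 344.90   # Isolario Dagobah:  prints "16.2 19.3"
```

Whichever definition is preferred, the gap is similar on the first two datasets and visibly wider on the loaded desktop system.
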
The reason might be a smarter use of memory, and the reduced chance of page faults.
You might have noticed from our results that `bgpgrep` memory requirements are
moderate compared to `bgpscanner`. What's less evident is that `bgpgrep` also keeps
its data structures compact and doesn't like moving them around much.
This lessens the page pressure on the system (and makes the CPU cache happier).
The net effects of this aren't evident in the benchmarking environment,
since `bgpgrep` and `bgpscanner`, in turn, are the only resource intensive
tasks on the system.
The initial warmup round also contributes to their ideal performance.
When more tasks are concurrently fighting over memory, and processes might get
migrated to different cores for various reasons, invalidating their cache,
the value of `bgpgrep`'s approach becomes more prominent.

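The explanation above is a hypothesis, not something these benchmarks measured. One way to probe it would be to count page faults, cache misses and CPU migrations for each tool with Linux `perf`, as in the sketch below. It assumes `perf` is installed and allowed to read these counters; the `fault_stats` wrapper is a hypothetical helper, while the event names are standard `perf` event aliases.

```sh
# fault_stats CMD...: run CMD under "perf stat", counting events related
# to memory pressure and scheduling. The counters are printed by perf on
# stderr, while the output of CMD itself is discarded.
fault_stats() {
    if ! command -v perf >/dev/null 2>&1; then
        echo "perf not found" >&2
        return 127
    fi
    perf stat -e page-faults,cache-misses,cpu-migrations,context-switches \
        -- "$@" >/dev/null
}

# Example (not executed here):
#   fault_stats bgpgrep isolario/2021-07/rib-uncompressed.mrt
#   fault_stats bgpscanner isolario/2021-07/rib-uncompressed.mrt
```
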
## Conclusion

`bgpgrep` seems to be a nice improvement over `bgpscanner`, and I am
quite satisfied with the performance improvements, especially since they come with
a more solid codebase.

In the next few weeks I intend to improve the filtering engine.
In general, I'd like to pause for a bit and polish the codebase to make it more mature,
before moving on to implement more features.

As always, happy hacking to you all!

Lorenzo Cogotti