Hi,
We are running Zeek in cluster mode on a single machine using PF_RING (v8.2.0). The node.cfg looks like this:
[manager]
type=manager
host=localhost
[logger]
type=logger
host=localhost
[proxy-1]
type=proxy
host=localhost
[worker-1]
type=worker
host=localhost
interface=ens10
lb_method=pf_ring
lb_procs=5
We noticed a significant performance degradation (up to 90% packet loss) after upgrading from Zeek-5.2.2 to Zeek-6.0.0 on the same machine and configuration, to the point where we had to revert to Zeek-5.2.2.
For Zeek-5.2.2, the PF_RING info for a single worker process is:
Bound Device(s) : ens10
Active : 1
Breed : Standard
Appl. Name : zeek-ens10
Socket Mode : RX+TX
Capture Direction : RX+TX
Sampling Rate : 1
Filtering Sampling Rate: 0
IP Defragment : No
BPF Filtering : Enabled
Sw Filt Hash Rules : 0
Sw Filt WC Rules : 0
Sw Filt Hash Match : 0
Sw Filt Hash Miss : 0
Sw Filt Hash Filtered : 0
Hw Filt Rules : 0
Poll Pkt Watermark : 1
Num Poll Calls : 5
Poll Watermark Timeout : 0
Channel Id Mask : 0xFFFFFFFFFFFFFFFF
VLAN Id : 65535
Cluster Id : 21
Slot Version : 20 [8.2.0]
Min Num Slots : 7315
Bucket Len : 9216
Slot Len : 9272 [bucket+header]
Tot Memory : 67842048
Tot Packets : 126093
Tot Pkt Lost : 0
Tot Insert : 126093
Tot Read : 126068
Insert Offset : 24036888
Remove Offset : 24029152
Num Free Slots : 7290
TX: Send Ok : 0
TX: Send Errors : 0
Reflect: Fwd Ok : 0
Reflect: Fwd Errors : 0
The stats.log is nice and linear too:
1689673088.948389 logger 151 0 0 - - - 940 384 0 0 0 0 0 0 2048 89 0 0 0 0 0 0 0 0
1689673090.540163 manager 103 0 0 - - - 7019 6467 0 0 0 0 0 0 2137 63 0 0 0 0 0 0 0 0
1689673092.124542 proxy-1 91 0 0 - - - 176690 176134 0 0 0 0 0 0 1719 59 0 0 0 0 0 0 0 0
1689673093.844903 worker-1-5 290 2724366 2195088093 0 2724366 0.000606 1941787 1941785 2249 2700 150 31166 76081 723 499807 125434 12511 82 0 0 780853 475078 1078275 0
1689673093.890729 worker-1-4 287 2320015 1619416526 0 2320015 0.000097 1883343 1883342 2176 2669 143 31093 76157 707 499680 124347 12381 67 0 0 804117 463304 238523 0
1689673093.855832 worker-1-1 286 2069789 1443405113 0 2069789 0.000120 1880384 1880384 2193 2765 142 30845 76106 696 496321 124011 12250 86 0 0 701742 541965 16001 0
1689673093.925209 worker-1-2 283 1893755 1280572861 0 1893755 0.000076 1882802 1882800 2165 2713 147 30661 75975 704 519899 124042 12266 72 0 0 696494 541729 901994 0
1689673093.897475 worker-1-3 286 1840839 1242641106 0 1840839 0.000110 1964678 1964670 2219 2712 159 30599 76131 734 494667 124522 12365 82 0 0 814335 555651 457608 0
1689673388.948476 logger 151 0 0 - - - 363 363 0 0 0 0 0 0 2013 89 0 0 0 0 0 0 0 0
1689673390.540846 manager 103 0 0 - - - 3493 3491 0 0 0 0 0 0 2136 62 0 0 0 0 0 0 0 0
1689673392.124647 proxy-1 93 0 0 - - - 160647 160646 0 0 0 0 0 0 1715 61 0 0 0 0 0 0 0 0
1689673393.855856 worker-1-1 331 3362873 2815507136 0 3362873 0.000072 1837817 1837813 2098 2538 117 29002 75150 695 525583 227924 11623 85 0 0 517671 27176 32091 0
1689673393.844947 worker-1-5 338 1648761 1108523273 0 1648761 0.000116 1855376 1855380 2076 2559 137 28830 75933 729 532624 229876 11596 106 0 0 477789 277150 1146270 0
1689673393.925209 worker-1-2 329 1490401 965625528 0 1490401 0.000074 1826702 1826700 2048 2429 124 28740 74828 687 551752 227072 11769 89 0 0 521065 3630 988203 0
1689673393.890797 worker-1-4 336 1933832 1340839201 0 1933832 0.000111 1905318 1905316 2097 2626 115 28913 75073 715 527678 228638 12029 96 0 0 523117 110731 259305 0
1689673393.897583 worker-1-3 332 2053098 1460076779 0 2053098 0.000492 1899970 1899973 2095 2580 133 28805 75206 725 527748 228125 11584 67 0 0 640793 0 485294 0
1689673688.949326 logger 151 0 0 - - - 363 363 0 0 0 0 0 0 2014 90 0 0 0 0 0 0 0 0
1689673690.541238 manager 103 0 0 - - - 2171 2173 0 0 0 0 0 0 2137 62 0 0 0 0 0 0 0 0
1689673692.124714 proxy-1 94 0 0 - - - 156467 156469 0 0 0 0 0 0 1711 59 0 0 0 0 0 0 0 0
1689673693.845019 worker-1-5 348 2609874 2016559593 0 2609874 0.000095 2076648 2076642 1823 2436 89 27867 76349 543 519900 226112 11021 71 0 0 555726 337290 1060520 0
1689673693.855920 worker-1-1 340 1944933 1338137324 0 1944933 0.000103 2008271 2008272 1809 2419 82 28164 76435 516 516145 224930 11056 76 0 0 679186 38704 38687 0
1689673693.925291 worker-1-2 342 1465935 909106868 0 1465935 0.000105 2000249 2000250 1848 2490 86 28113 76396 527 541327 224869 11117 67 0 0 568298 336331 910187 0
1689673693.890857 worker-1-4 348 1836140 1257882500 0 1836140 0.000223 2038914 2038916 1858 2487 87 28131 76249 535 517739 225119 11355 118 0 0 626095 1933811 268142 0
1689673693.897670 worker-1-3 343 1671119 1010249583 0 1671119 0.000093 2091418 2091416 1927 2463 83 28185 76132 537 522347 225316 11002 59 0 0 594599 0 492906 0
1689673988.950265 logger 152 0 0 - - - 363 363 0 0 0 0 0 0 2014 91 0 0 0 0 0 0 0 0
1689673990.541520 manager 103 0 0 - - - 4115 4113 0 0 0 0 0 0 2138 64 0 0 0 0 0 0 0 0
1689673992.124785 proxy-1 95 0 0 - - - 153749 153748 0 0 0 0 0 0 1714 60 0 0 0 0 0 0 0 0
1689673993.845092 worker-1-5 348 1790141 1140473966 0 1790141 0.000121 1940427 1940427 1954 2584 152 28604 73976 804 500852 225623 10918 70 0 0 471313 27773 1124798 0
1689673993.856001 worker-1-1 340 3260145 2585018935 0 3260145 0.000090 1915993 1915993 1999 2696 153 28541 74335 783 502756 225331 11106 62 0 0 538834 27032 10122 0
1689673993.925308 worker-1-2 342 1582863 1056736314 0 1582863 0.000058 1897449 1897450 1999 2613 147 28724 74178 796 526744 225627 11114 87 0 0 462121 561302 901605 0
1689673993.890926 worker-1-4 348 2094632 1561020151 0 2094632 0.000146 2045062 2045059 1956 2576 157 28375 74017 779 503563 224704 11363 76 0 0 524205 180654 241585 0
1689673993.897735 worker-1-3 343 1607045 1007812267 0 1607045 0.000063 1973494 1973495 1899 2576 139 28227 73770 782 500908 224153 11053 70 0 0 449315 0 471086 0
After upgrading to Zeek-6.0.0 there is no appreciable CPU or memory increase (in fact, CPU usage decreases).
This is what the PF_RING info for a single worker process looks like with Zeek-6.0.0:
Bound Device(s) : ens10
Active : 1
Breed : Standard
Appl. Name : zeek-ens10
Socket Mode : RX+TX
Capture Direction : RX+TX
Sampling Rate : 1
Filtering Sampling Rate: 0
IP Defragment : No
BPF Filtering : Enabled
Sw Filt Hash Rules : 0
Sw Filt WC Rules : 0
Sw Filt Hash Match : 0
Sw Filt Hash Miss : 0
Sw Filt Hash Filtered : 0
Hw Filt Rules : 0
Poll Pkt Watermark : 1
Num Poll Calls : 665
Poll Watermark Timeout : 0
Channel Id Mask : 0xFFFFFFFFFFFFFFFF
VLAN Id : 65535
Cluster Id : 21
Slot Version : 20 [8.2.0]
Min Num Slots : 7315
Bucket Len : 9216
Slot Len : 9272 [bucket+header]
Tot Memory : 67842048
Tot Packets : 776413
Tot Pkt Lost : 660401
Tot Insert : 116024
Tot Read : 12736
Insert Offset : 9749592
Remove Offset : 9774208
Num Free Slots : 0
TX: Send Ok : 0
TX: Send Errors : 0
Reflect: Fwd Ok : 0
Reflect: Fwd Errors : 0
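To put a number on the dump above, here is a minimal Python sketch that parses the "Key : Value" lines and computes the fraction of packets lost and the fraction of inserted packets that the worker actually read. The function names are made up for illustration, and only the counters shown in this post are used.

```python
def parse_pf_ring_info(text):
    """Parse 'Key : Value' lines from a PF_RING ring info dump into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the last colon so keys such as 'TX: Send Ok' stay intact.
        key, sep, value = line.rpartition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def loss_summary(info):
    """Return (lost / total packets, read / inserted packets)."""
    total = int(info["Tot Packets"])
    lost = int(info["Tot Pkt Lost"])
    inserted = int(info["Tot Insert"])
    read = int(info["Tot Read"])
    lost_frac = lost / total if total else 0.0
    read_frac = read / inserted if inserted else 0.0
    return lost_frac, read_frac


# Counters copied from the Zeek-6.0.0 dump in this post.
zeek6_dump = """
Tot Packets : 776413
Tot Pkt Lost : 660401
Tot Insert : 116024
Tot Read : 12736
Num Free Slots : 0
"""

info = parse_pf_ring_info(zeek6_dump)
lost_frac, read_frac = loss_summary(info)
print(f"lost: {lost_frac:.1%}, read of inserted: {read_frac:.1%}")
```

For this ring that works out to roughly 85% of packets lost, with the worker having read only about 11% of what was inserted, which matches Num Free Slots being 0.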
The stats.log becomes non-linear and messy, with workers reporting late:
1689665990.583017 logger 325 0 0 - - - - 947 384 0 0 0 0 0 02046 87 0 0 0 0 0 0 0 0
1689665992.160743 manager 168 0 0 - - - - 6669 6109 0 0 0 0 0 02138 64 0 0 0 0 0 0 0 0
1689665993.820183 proxy-1 143 0 0 - - - - 80788 80225 0 0 0 0 0 01719 59 0 0 0 0 0 0 0 0
1689665995.749078 worker-1-5 334 1517951 909319007 0 1517951 0.000074 - 1806814 1806811 1799 2438 38 27536 82035 729 490907 127634 10567 71 0 0 381022 618268 1924534 0
1689665995.716962 worker-1-1 311 848554 487461472 1519209 2367763 0.374560 - 1153972 1153968 2152 2371 33 17641 46739 285 312799 81282 6048 76 0 0 849879 32792 1224808 0
1689665995.718581 worker-1-2 269 166082 92919503 4489850 4655932 50.207620 - 231399 231397 1918 1787 12 5354 10622 154 109471 25035 1036 10 0 0 886489 0 254857 0
1689666290.583110 logger 326 0 0 - - - - 363 363 0 0 0 0 0 02015 89 0 0 0 0 0 0 0 0
1689666292.161305 manager 168 0 0 - - - - 3173 3173 0 0 0 0 0 02138 65 0 0 0 0 0 0 0 0
1689666293.820261 proxy-1 144 0 0 - - - - 132879 132878 0 0 0 0 0 01715 61 0 0 0 0 0 0 0 0
1689666295.749121 worker-1-5 379 3194441 2610559636 0 3194441 0.000075 - 1738975 1738976 1869 2550 110 27118 73571 521 499412 229168 10568 63 0 0 317955 0 2085997 0
1689666295.718795 worker-1-2 339 1433823 899745950 934 1434757 0.000188 - 1683376 1683377 2168 2584 104 25326 66461 499 464683 127777 9676 71 0 0 613979 0 1961929 0
1689666295.720408 worker-1-1 358 1663067 1115119437 3807 1666874 0.000148 - 1811448 1811465 1916 2594 102 26664 73508 545 500881 183698 10731 90 0 0 265676 0 2047659 0
1689666590.583305 logger 326 0 0 - - - - 363 363 0 0 0 0 0 02015 91 0 0 0 0 0 0 0 0
1689665996.059727 worker-1-4 241 96332 78257454 18494939 18591271 596.212214 - 61073 61072715 20 0 1760 2877 49 44141 6515 239 3 0 0 482022 0 5140 0
1689666592.161775 manager 168 0 0 - - - - 1782 1783 0 0 0 0 0 02133 62 0 0 0 0 0 0 0 0
1689666593.820329 proxy-1 147 0 0 - - - - 126948 126948 0 0 0 0 0 01713 61 0 0 0 0 0 0 0 0
1689666595.749254 worker-1-5 381 6530084 5740179828 0 6530084 0.000070 - 1831942 1831940 1954 2658 85 26576 73868 756 500686 221413 10091 84 0 0 367133 0 2005657 0
1689666595.718838 worker-1-2 381 1830578 1283171424 252747 2083325 0.019678 - 1817988 1817987 2380 2606 84 26032 72557 756 511779 211629 10379 106 0 0 507179 0 1963502 0
1689666595.720606 worker-1-1 373 1671716 1176140418 18551 1690267 5.095542 - 1763935 1763918 1916 2624 80 26211 73952 770 499710 219954 9896 93 0 0 487499 4796 1773685 0
1689666296.412325 worker-1-4 241 31689 19420776 2124988 2156677 405.070991 - 33879 33997 701 68 31361 1494 24 40363 9504 93 5 0 0 571243 0 10598 0
1689666596.566161 worker-1-4 249 75704 44841174 996368 1072072 177.229402 - 96204 96086 1496 297 73235 4288 61 76415 15491 315 20 0 0 1075730 0 53963 0
1689666890.583509 logger 328 0 0 - - - - 363 363 0 0 0 0 0 02013 91 0 0 0 0 0 0 0 0
1689666892.162084 manager 168 0 0 - - - - 2979 2978 0 0 0 0 0 02138 63 0 0 0 0 0 0 0 0
1689666893.820429 proxy-1 147 0 0 - - - - 143140 143140 0 0 0 0 0 01713 61 0 0 0 0 0 0 0 0
1689666895.723584 worker-1-1 376 1481959 939371895 0 1481959 0.000488 - 1795831 1795832 1825 2646 33 27172 72046 644 511643 219479 10702 58 0 0 296163 66640 1626471 0
1689666895.749393 worker-1-5 384 5828942 5291618670 0 5828942 0.000083 - 1866109 1866109 1872 2708 35 27293 71715 637 511385 219957 10892 76 0 0 367351 66640 1702553 0
1689666895.718901 worker-1-2 389 2066399 1469797942 74890 2141289 4.142399 - 1965421 1965431 1997 2702 29 26938 71323 652 530447 217642 10286 64 0 0 380822 0 1965573 0
1689666896.582199 worker-1-4 296 358184 227000933 2210716 2568900 24.103091 - 396927 396926 3249 706 22 10728 18019 177 227485 48773 1634 94 0 0 2435022 66640 300480 0
1689665995.730098 worker-1-3 224 79777 78413709 22974980 23054757 933.746794 - 24602 24602406 6 0 801 1082 21 22461 2869 74 0 0 0 393285 0 6447 0
1689666296.300109 worker-1-3 226 27861 21002965 1112905 1140766 730.235311 - 24530 24703 571 35 51096 1198 17 29583 5728 66 5 0 0 512228 0 43848 0
1689666596.610617 worker-1-3 242 42103 20710360 673414 715517 488.876490 - 70446 70284 1406 413 18 2666 3365 86 64433 13224 240 18 0 0 949777 0 37330 0
1689666896.728680 worker-1-3 246 32230 15999183 319419 351649 223.234622 - 57683 57676 979 90 02026 2343 42 79123 14049 183 9 0 0 777349 2631 68473 0
1689667190.583757 logger 328 0 0 - - - - 363 363 0 0 0 0 0 02013 91 0 0 0 0 0 0 0 0
1689667192.162172 manager 168 0 0 - - - - 2660 2661 0 0 0 0 0 02138 64 0 0 0 0 0 0 0 0
1689667193.820507 proxy-1 147 0 0 - - - - 152504 152504 0 0 0 0 0 01713 61 0 0 0 0 0 0 0 0
1689667195.724268 worker-1-1 377 1314268 799209166 39488 1353756 0.000783 - 1741699 1741703 1875 2388 137 25022 73626 498 483656 217134 8727 80 0 0 342824 291098 1977426 0
1689667195.749673 worker-1-5 385 3136137 2672363215 0 3136137 0.000242 - 1769909 1769913 1805 2304 145 25522 73118 514 488066 216860 8931 69 0 0 364753 693059 2173149 0
1689667195.718903 worker-1-2 392 4210580 3691968830 465463 4676043 4.113310 - 1722161 1722150 2514 2325 139 24820 70433 492 497302 214421 7889 108 0 0 792600 0 2138074 0
1689667166.480769 worker-1-3 285 184686 103105728 510936 695622 35.534405 - 284808 313724 2491 70 82 7784 13572 189 158475 0 983 61 0 0 1703517 0 0 0
1689667192.996349 worker-1-4 354 2107692 1701573134 1898311 4006003 8.940853 - 1214532 1214550 3474 1045 107 21457 54765 459 514212 120782 5493 140 0 0 1478287 135880 1145634 0
1689667201.829916 worker-1-5 395 32787 22349091 0 32787 0.688453 - 140295 243137 1626 127 146 542 1472 4 10087 0 170 70 0 0 375742 0 0 0
1689667197.675986 worker-1-2 399 26080 21116474 0 26080 4.852965 - 118121 215936 2302 115 139 179 477 1 3630 0 57 85 0 0 773034 0 0 0
1689667201.824249 worker-1-1 388 20455 10610665 0 20455 0.744583 - 140617 243384 1633 127 136 521 1499 1 10184 0 159 76 0 0 364371 193789 0 0
1689667204.945037 proxy-1 147 0 0 - - - - 6744 6746 0 0 0 0 0 0107 0 0 0 0 0 0 0 0 0
1689667206.852718 manager 168 0 0 - - - - 79 83 0 0 0 0 0 0133 0 0 0 0 0 0 0 0 0
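For anyone who wants to reproduce the analysis, here is a rough Python sketch that computes the per-row drop rate and flags rows written out of order. It assumes the first eight columns of stats.log are ts, peer, mem, pkts_proc, bytes_recv, pkts_dropped, pkts_link and pkt_lag; the helper names and the 5-second slack are arbitrary choices for illustration. The sample rows are copied from the log above and truncated to those eight columns.

```python
def parse_stats_line(line):
    """Parse the leading columns of one stats.log row; '-' becomes None."""
    f = line.split()

    def num(value, cast):
        return None if value == "-" else cast(value)

    return {
        "ts": float(f[0]),
        "peer": f[1],
        "pkts_proc": num(f[3], int),
        "pkts_dropped": num(f[5], int),
        "pkts_link": num(f[6], int),
        "pkt_lag": num(f[7], float),
    }


def drop_rate(rec):
    """Dropped packets as a fraction of packets seen on the link."""
    if not rec["pkts_link"] or rec["pkts_dropped"] is None:
        return None  # logger/manager/proxy rows, or no traffic
    return rec["pkts_dropped"] / rec["pkts_link"]


def late_rows(records, slack=5.0):
    """Rows whose timestamp is more than `slack` seconds older than the
    newest timestamp already seen earlier in the file."""
    late, newest = [], float("-inf")
    for rec in records:
        if rec["ts"] < newest - slack:
            late.append(rec)
        newest = max(newest, rec["ts"])
    return late


sample = [
    "1689665995.749078 worker-1-5 334 1517951 909319007 0 1517951 0.000074",
    "1689665995.718581 worker-1-2 269 166082 92919503 4489850 4655932 50.207620",
    "1689666590.583305 logger 326 0 0 - - -",
    "1689665996.059727 worker-1-4 241 96332 78257454 18494939 18591271 596.212214",
]

records = [parse_stats_line(line) for line in sample]
for rec in records:
    rate = drop_rate(rec)
    shown = "n/a" if rate is None else f"{rate:.1%}"
    print(rec["peer"], "drop rate:", shown, "lag:", rec["pkt_lag"])

print("late:", [rec["peer"] for rec in late_rows(records)])
```

On these rows, worker-1-4 is flagged as late: its entry is stamped almost ten minutes earlier than the logger entry written just before it.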
There are no entries in reporter.log and no errors in stderr.log.
We use custom RPMs built from the official source tarballs, and the build sequence is the same for both Zeek-5.2.2 and 6.0.0:
./configure --prefix=%{_prefix} --binary-package --enable-static-broker --disable-broker-tests --disable-btest --disable-btest-pcaps --with-python=/usr/bin/python3 --enable-jemalloc --disable-cpp-tests --build-type=Release
make
make install
PF_RING also reports the correct number of rings (with the same cluster ID) in both cases.
The zeekctl.cfg is:
SendMail =
MailTo = root@localhost
MailConnectionSummary = 0
MinDiskSpace = 5
MailHostUpDown = 0
LogRotationInterval = 3600
LogExpireInterval = 550
StatsLogEnable = 1
StatsLogExpireInterval = 90
StatusCmdShowAll = 0
SitePolicyScripts = local.zeek
LogDir = /zeeklog/logs
SpoolDir = /zeeklog/spool
CfgDir = /opt/zeek/etc
PFRINGClusterID = 21
ZeekArgs = -f "<LONGISH BPF FILTER>"
Any help in resolving this issue is appreciated.
EDIT: This appears to be specific to live traffic. Timing tests on a 1.7 GB PCAP with around 2M packets show no appreciable difference between Zeek-5 and Zeek-6.