🐛 BUG: heavy CPU usage with use_system_route_table: yes #1322

Open
nikitos opened this issue Jan 30, 2025 · 3 comments
Labels
NeedsFix The path to resolution is known, but the work has not been done.

Comments

@nikitos

nikitos commented Jan 30, 2025

What version of nebula are you using? (nebula -version)

1.9.5

What operating system are you using?

Linux

Describe the Bug

After a bulk addition of about 10k routes to the routing table via OSPF, nebula prints
"Adding route" destination=<network>/24 via=10.8.0.3 several (30-40) times and then consumes 200% CPU. Routing itself seems to work fine.

Logs from affected hosts

"Adding route" destination=<network>/24 via=10.8.0.3
"Adding route" destination=<network>/24 via=10.8.0.3
"Adding route" destination=<network>/24 via=10.8.0.3
"Adding route" destination=<network>/24 via=10.8.0.3
"Adding route" destination=<network>/24 via=10.8.0.3
"Adding route" destination=<network>/24 via=10.8.0.3
"Adding route" destination=<network>/24 via=10.8.0.3

Config files from affected hosts

static_host_map:
  "10.8.0.1": ["<IP>:4242"]

lighthouse:
  am_lighthouse: false
  interval: 60
  hosts:
    - "10.8.0.1"
routines: 2

cipher: aes
listen:
  host: <IP>
  port: 4242
tun:
  drop_local_broadcast: false
  drop_multicast: false
  mtu: 1400
  tx_queue: 5000
  use_system_route_table: yes
  unsafe_routes:
    - route: 10.0.0.0/8
      via: 10.8.0.1
      metric: 100
      install: false
    - route: 0.0.0.0/0
      via: 10.8.0.3
      metric: 100
      install: false
firewall:
  outbound_action: drop
  inbound_action: drop

  conntrack:
    tcp_timeout: 12m
    udp_timeout: 3m
    default_timeout: 10m

  outbound:
    # Allow all outbound traffic from this node
    - port: any
      proto: any
      host: any

  inbound:
    # Allow icmp between any nebula hosts
    - port: any
      proto: any
      host: any

All other hosts have the same config.

@nikitos
Author

nikitos commented Jan 30, 2025

After enabling debug logging, I noticed that this message appears:

Ignoring route update, not a gateway route" route="{Ifindex: 0 Dst: <nil> Src: <nil> Gw: <nil> Flags: [] Table: 0 Realm: 0}"
Ignoring route update, not a gateway route" route="{Ifindex: 0 Dst: <nil> Src: <nil> Gw: <nil> Flags: [] Table: 0 Realm: 0}"
Ignoring route update, not a gateway route" route="{Ifindex: 0 Dst: <nil> Src: <nil> Gw: <nil> Flags: [] Table: 0 Realm: 0}"
Ignoring route update, not a gateway route" route="{Ifindex: 0 Dst: <nil> Src: <nil> Gw: <nil> Flags: [] Table: 0 Realm: 0}"
Ignoring route update, not a gateway route" route="{Ifindex: 0 Dst: <nil> Src: <nil> Gw: <nil> Flags: [] Table: 0 Realm: 0}"
Ignoring route update, not a gateway route" route="{Ifindex: 0 Dst: <nil> Src: <nil> Gw: <nil> Flags: [] Table: 0 Realm: 0}"
Ignoring route update, not a gateway route" route="{Ifindex: 0 Dst: <nil> Src: <nil> Gw: <nil> Flags: [] Table: 0 Realm: 0}"

@nikitos
Author

nikitos commented Jan 30, 2025

It looks like the problem is here:
vishvananda/netlink#551

@nikitos
Author

nikitos commented Feb 7, 2025

After some more investigation, I found that increasing the netlink receive buffer fixes the problem. nebula does not read routes fast enough (possibly because the route tree is cloned on every update; this needs more investigation). I will send a PR soon that adds netlink options.

@johnmaguire johnmaguire added the NeedsFix The path to resolution is known, but the work has not been done. label Mar 10, 2025