IPv6 next-hop problem when recursively resolving BGP route

Why can’t I SSH to my routers anymore???????

It all started when I upgraded some of my AS203528 routers to one of the latest VyOS nightly builds, “vyos-1.4-rolling-202306080317”.

fabrizzio@osr1j1:~$ ping osr2br2
PING osr2br2(dum0.OSR2BR2.compumundohipermegared.one (2a0e:8f02:21d0:ffff::15)) 56 data bytes
^C
--- osr2br2 ping statistics ---
7 packets transmitted, 0 received, 100% packet loss, time 6130ms

fabrizzio@osr1j1:~$ traceroute osr2br2
traceroute to osr2br2 (2a0e:8f02:21d0:ffff::15), 30 hops max, 80 byte packets
 1  _gateway (2a0e:8f02:21d1:120::1)  0.247 ms  0.222 ms  0.212 ms
 2  eth9.osr1cr5.compumundohipermegared.one (2a0e:8f02:21d1:feed:0:1:19:11)  0.423 ms  0.408 ms  0.383 ms
 3  eth2.osr1cr3.compumundohipermegared.one (2a0e:8f02:21d1:feed:0:1:5:11)  0.774 ms  0.751 ms  0.678 ms
 4  osr1fw2.compumundohipermegared.one (2a0e:8f02:21d1:ffff::42)  0.860 ms  0.836 ms  0.810 ms
 5  dum0.OSR1BR2.compumundohipermegared.one (2a0e:8f02:21d0:ffff::13)  1.107 ms  1.079 ms  1.054 ms
 6  * * *
 7  * * *
 8  * * *
 9  * * *
10  * * *
11  *^C

Just by the trace it looked very much like a problem on the reverse path. Luckily I was still able to connect via IPv4. Trying to ping my OSR1 jumphost from OSR2BR2:

fabrizzio@OSR2BR2:~$ ping 2a0e:8f02:21d1:120::11
PING 2a0e:8f02:21d1:120::11(2a0e:8f02:21d1:120::11) 56 data bytes
From 2a0e:8f02:21d0:feed:deed:0:21c:1002 icmp_seq=1 Destination unreachable: Address unreachable
From 2a0e:8f02:21d0:feed:deed:0:21c:1002 icmp_seq=2 Destination unreachable: Address unreachable
From 2a0e:8f02:21d0:feed:deed:0:21c:1002 icmp_seq=3 Destination unreachable: Address unreachable
From 2a0e:8f02:21d0:feed:deed:0:21c:1002 icmp_seq=4 Destination unreachable: Address unreachable
^C
--- 2a0e:8f02:21d1:120::11 ping statistics ---
5 packets transmitted, 0 received, +4 errors, 100% packet loss, time 4068ms

fabrizzio@OSR2BR2:~$ sh ipv6 route 2a0e:8f02:21d1:120::11
Routing entry for 2a0e:8f02:21d1::/48
  Known via "bgp", distance 200, metric 1000, best
  Last update 01:11:38 ago
    fc0e:8f02:21d0:ffff::12 (recursive), weight 1
  *   fe80::7c2a:81ff:fe87:f5a2, via br536, weight 1
    fc0e:8f02:21d0:ffff::13 (recursive), weight 1
  *   fe80::401d:82ff:fe26:9549, via br540, weight 1

Destination unreachable?

This was very odd to me. I could ping the other end of each tunnel (OSR2BR2 <> OSR1BR1 and OSR2BR2 <> OSR1BR2), and the full mesh of IS-IS adjacencies is up on OSR2BR2.

fabrizzio@OSR2BR2:~$ sh int bridge br536
br536: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1600 qdisc noqueue state UP group default qlen 1000
    link/ether c6:21:99:5a:13:d2 brd ff:ff:ff:ff:ff:ff
    inet6 2a0e:8f02:21d0:feed:deed:0:218:1002/126 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::ac30:5fff:fe1f:b9eb/64 scope link
       valid_lft forever preferred_lft forever
    Description: IPv6 Tunnel to OSR1BR1

    RX:    bytes  packets  errors  dropped  overrun       mcast
         4596611     3846       0        0        0        3825
    TX:    bytes  packets  errors  dropped  carrier  collisions
         4774008     4002       0        0        0           0
fabrizzio@OSR2BR2:~$ sh int bridge br540
br540: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1600 qdisc noqueue state UP group default qlen 1000
    link/ether 8e:50:3e:54:f7:01 brd ff:ff:ff:ff:ff:ff
    inet6 2a0e:8f02:21d0:feed:deed:0:21c:1002/126 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::f01e:3bff:fe29:3d8/64 scope link
       valid_lft forever preferred_lft forever
    Description: IPv6 Tunnel to OSR1BR2

    RX:    bytes  packets  errors  dropped  overrun       mcast
         4641220     3993       0        0        0        3884
    TX:    bytes  packets  errors  dropped  carrier  collisions
         4777203     3967       0        0        0           0
fabrizzio@OSR2BR2:~$ ping 2a0e:8f02:21d0:feed:deed:0:21c:1001
PING 2a0e:8f02:21d0:feed:deed:0:21c:1001(2a0e:8f02:21d0:feed:deed:0:21c:1001) 56 data bytes
64 bytes from 2a0e:8f02:21d0:feed:deed:0:21c:1001: icmp_seq=1 ttl=64 time=34.0 ms
64 bytes from 2a0e:8f02:21d0:feed:deed:0:21c:1001: icmp_seq=2 ttl=64 time=16.9 ms
^C
--- 2a0e:8f02:21d0:feed:deed:0:21c:1001 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 16.870/25.443/34.017/8.573 ms
fabrizzio@OSR2BR2:~$ sh isis neighbor
Area VyOS:
  System Id           Interface   L  State        Holdtime SNPA
 AMS1BR1             br533       2  Up            28       2020.2020.2020
 NYC1BR1             br534       2  Up            29       2020.2020.2020
 OSR1BR1             br536       2  Up            29       2020.2020.2020
 OSR1BR2             br540       2  Up            28       2020.2020.2020
 OSR2BR1             br548       2  Up            29       2020.2020.2020
 OSR1BR3             br596       2  Up            29       2020.2020.2020
 OSR2GLASS1          eth8        2  Up            27       2020.2020.2020

Oddly enough I can ping the loopbacks of OSR1BR1 & OSR1BR2 (both GUA and ULA):

fabrizzio@OSR2BR2:~$ ping 2a0e:8f02:21d0:ffff::12
PING 2a0e:8f02:21d0:ffff::12(2a0e:8f02:21d0:ffff::12) 56 data bytes
64 bytes from 2a0e:8f02:21d0:ffff::12: icmp_seq=1 ttl=64 time=32.2 ms
^C
--- 2a0e:8f02:21d0:ffff::12 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 32.191/32.191/32.191/0.000 ms

fabrizzio@OSR2BR2:~$ ping 2a0e:8f02:21d0:ffff::13
PING 2a0e:8f02:21d0:ffff::13(2a0e:8f02:21d0:ffff::13) 56 data bytes
64 bytes from 2a0e:8f02:21d0:ffff::13: icmp_seq=1 ttl=64 time=16.9 ms
^C
--- 2a0e:8f02:21d0:ffff::13 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 16.927/16.927/16.927/0.000 ms

fabrizzio@OSR2BR2:~$ ping fc0e:8f02:21d0:ffff::12
PING fc0e:8f02:21d0:ffff::12(fc0e:8f02:21d0:ffff::12) 56 data bytes
64 bytes from fc0e:8f02:21d0:ffff::12: icmp_seq=1 ttl=64 time=16.3 ms
^C
--- fc0e:8f02:21d0:ffff::12 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 16.340/16.340/16.340/0.000 ms

fabrizzio@OSR2BR2:~$ ping fc0e:8f02:21d0:ffff::13
PING fc0e:8f02:21d0:ffff::13(fc0e:8f02:21d0:ffff::13) 56 data bytes
64 bytes from fc0e:8f02:21d0:ffff::13: icmp_seq=1 ttl=64 time=16.2 ms
^C
--- fc0e:8f02:21d0:ffff::13 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 16.228/16.228/16.228/0.000 ms

Then by comparing the routes:

fabrizzio@OSR2BR2:~$ sh ipv6 route  2a0e:8f02:21d0:ffff::13
Routing entry for 2a0e:8f02:21d0:ffff::13/128
  Known via "isis", distance 115, metric 500, best
  Last update 01:59:55 ago
  * fe80::909f:cdff:fe62:592e, via br540, weight 1 <<<<< CORRECT LL address from OSR1BR2

Routing entry for 2a0e:8f02:21d0:ffff::13/128
  Known via "bgp", distance 200, metric 0
  Last update 02:00:04 ago
    fc0e:8f02:21d0:ffff::13 (recursive), weight 1
      fe80::401d:82ff:fe26:9549, via br540, weight 1  <<<<< where did this come from?
fabrizzio@OSR2BR2:~$ sh ipv6 route  2a0e:8f02:21d0:ffff::12
Routing entry for 2a0e:8f02:21d0:ffff::12/128
  Known via "bgp", distance 200, metric 0
  Last update 01:16:11 ago
    fc0e:8f02:21d0:ffff::12 (recursive), weight 1
      fe80::7c2a:81ff:fe87:f5a2, via br536, weight 1 <<<< where did this come from?

Routing entry for 2a0e:8f02:21d0:ffff::12/128
  Known via "isis", distance 115, metric 500, best
  Last update 01:16:11 ago
  * fe80::60fa:89ff:fe52:4194, via br536, weight 1 <<<<< CORRECT LL address from OSR1BR1

fabrizzio@OSR2BR2:~$ sh ipv6 route  fc0e:8f02:21d0:ffff::12
Routing entry for fc0e:8f02:21d0:ffff::12/128
  Known via "isis", distance 115, metric 510, best
  Last update 01:16:22 ago
  * fe80::60fa:89ff:fe52:4194, via br536, weight 1 <<<<< CORRECT LL address from OSR1BR1


fabrizzio@OSR2BR2:~$ sh ipv6 route 2a0e:8f02:21d1:120::11
Routing entry for 2a0e:8f02:21d1::/48
  Known via "bgp", distance 200, metric 1000, best
  Last update 01:19:09 ago
    fc0e:8f02:21d0:ffff::12 (recursive), weight 1
  *   fe80::7c2a:81ff:fe87:f5a2, via br536, weight 1 <<<< Both next hop LL IPs don't match what's on the other end of the tunnel
    fc0e:8f02:21d0:ffff::13 (recursive), weight 1
  *   fe80::401d:82ff:fe26:9549, via br540, weight 1 <<<< Both next hop LL IPs don't match what's on the other end of the tunnel.


Now here’s the issue. The IPv6 route to the remote router’s IPv6 ULA loopback (“fc0e:8f02:21d0:ffff::12”, which I am forcing BGP to use) has the correct link-local next hop for the other end of the tunnel. But when a BGP route such as “2a0e:8f02:21d0:ffff::12/128” or “2a0e:8f02:21d1::/48” is recursively resolved via “fc0e:8f02:21d0:ffff::12”, the next hop it ends up with is wrong. I tried manually changing the IPv6 link-local address on OSR1BR1: the IS-IS next hop seen on OSR2BR2 changed accordingly, but the recursive lookup stayed stuck on the old link-local address.

fabrizzio@OSR1BR1:~$ sh interfaces bridge br536
br536: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1600 qdisc noqueue state UP group default qlen 1000
    link/ether fe:45:5c:03:c6:ae brd ff:ff:ff:ff:ff:ff
    inet6 2a0e:8f02:21d0:feed:deed:0:218:1001/126 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::60fa:89ff:fe52:4194/64 scope link <<<<< CORRECT LL address from OSR1BR1
       valid_lft forever preferred_lft forever
    Description: IPv6 Tunnel to OSR2BR2

    RX:    bytes  packets  errors  dropped  overrun       mcast
         2739350     2303       0        0        0        2289
    TX:    bytes  packets  errors  dropped  carrier  collisions
         2781627     2287       0        0        0           0
fabrizzio@OSR1BR2:~$ sh interfaces bridge br540
br540: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1600 qdisc noqueue state UP group default qlen 1000
    link/ether 4a:11:ee:ae:f3:87 brd ff:ff:ff:ff:ff:ff
    inet6 2a0e:8f02:21d0:feed:deed:0:21c:1001/126 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::909f:cdff:fe62:592e/64 scope link <<<<< CORRECT LL address from OSR1BR2
       valid_lft forever preferred_lft forever
    Description: IPv6 Tunnel to OSR2BR2

    RX:    bytes  packets  errors  dropped  overrun       mcast
         4288601     3568       0        0        0        3552
    TX:    bytes  packets  errors  dropped  carrier  collisions
         4356271     3683       0        0        0           0
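
The mismatch between the IS-IS next hop and the recursively resolved BGP next hop can also be spotted mechanically. The following is a minimal sketch, not a VyOS or FRR tool: it assumes the `sh ipv6 route` text format shown in this post, and the function names (`link_local_next_hops`, `mismatched_next_hops`) are made up for illustration. The sample text is the output for “2a0e:8f02:21d0:ffff::12” from above, with my annotations removed.

```python
import re

# Matches lines such as "  * fe80::60fa:89ff:fe52:4194, via br536, weight 1"
NEXT_HOP = re.compile(r"(fe80:[0-9a-f:]+), via ([\w.-]+),")
PROTOCOL = re.compile(r'Known via "(\w+)"')


def link_local_next_hops(route_output):
    """Map protocol name -> set of (interface, link-local next hop) pairs."""
    hops = {}
    protocol = None
    for line in route_output.splitlines():
        found = PROTOCOL.search(line)
        if found:
            protocol = found.group(1)
            continue
        found = NEXT_HOP.search(line)
        if found and protocol is not None:
            hops.setdefault(protocol, set()).add((found.group(2), found.group(1)))
    return hops


def mismatched_next_hops(route_output, igp="isis", other="bgp"):
    """Return (interface, next hop) pairs used by `other` but not by the IGP."""
    hops = link_local_next_hops(route_output)
    return sorted(hops.get(other, set()) - hops.get(igp, set()))


SAMPLE = """\
Routing entry for 2a0e:8f02:21d0:ffff::12/128
  Known via "bgp", distance 200, metric 0
  Last update 01:16:11 ago
    fc0e:8f02:21d0:ffff::12 (recursive), weight 1
      fe80::7c2a:81ff:fe87:f5a2, via br536, weight 1

Routing entry for 2a0e:8f02:21d0:ffff::12/128
  Known via "isis", distance 115, metric 500, best
  Last update 01:16:11 ago
  * fe80::60fa:89ff:fe52:4194, via br536, weight 1
"""

if __name__ == "__main__":
    for interface, next_hop in mismatched_next_hops(SAMPLE):
        print(f"{interface}: BGP resolves to {next_hop}, which IS-IS does not use")
```

An empty result would mean every link-local next hop that BGP resolved to is also one IS-IS uses on the same interface. It only compares what is in the pasted text, so it says nothing about why the two diverge.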

I truly have no idea what might be going on here. This is just one example: the issue occurred on other routers too, and rebooting them turned into a nasty game of whack-a-mole, with the problem reappearing elsewhere. Clearing BGP neighbors didn’t fix it either. The thing is that the MAC addresses assigned to the tunnels change when a router reboots. So if you reboot router A, every other router (running the suspect software) with a tunnel towards A keeps using the old IPv6 link-local next hop of A’s tunnel endpoints.
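
To illustrate why a new MAC address means a new link-local next hop, here is a small sketch of the modified EUI-64 mapping between a MAC address and a link-local address. It assumes the tunnel interfaces derive their link-local addresses this way, which the “ff:fe” in the middle of the addresses above suggests but which I have not verified; the function names are hypothetical.

```python
import ipaddress


def mac_to_link_local(mac):
    """Build the modified EUI-64 link-local address for a 48-bit MAC."""
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    if len(octets) != 6:
        raise ValueError("expected a 48-bit MAC address")
    octets[0] ^= 0x02  # flip the universal/local bit
    interface_id = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    prefix = bytes.fromhex("fe80") + bytes(6)  # fe80::/64
    return str(ipaddress.IPv6Address(prefix + interface_id))


def link_local_to_mac(address):
    """Recover the MAC from a modified EUI-64 link-local address."""
    interface_id = ipaddress.IPv6Address(address).packed[8:]
    if interface_id[3:5] != b"\xff\xfe":
        raise ValueError("interface ID is not modified EUI-64")
    octets = bytearray(interface_id[:3] + interface_id[5:])
    octets[0] ^= 0x02  # undo the universal/local bit flip
    return ":".join(f"{octet:02x}" for octet in octets)


if __name__ == "__main__":
    # Stale next hop seen in the BGP-resolved route vs. the current IS-IS one
    for address in ("fe80::7c2a:81ff:fe87:f5a2", "fe80::60fa:89ff:fe52:4194"):
        print(address, "->", link_local_to_mac(address))
```

Under that assumption, the stale and current next hops map back to two different MAC addresses, which fits the idea that the far end generated a new one and only part of the routing stack noticed.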

I’ve rolled back to the known-good version “1.4-rolling-202210280218” for now. If I get some spare time I will lab this up and file a bug with VyOS. To be fair, I don’t know whether it’s a VyOS bug or an FRR bug: “1.4-rolling-202210280218” uses FRR 8.3.1, while the nightly I tried, “vyos-1.4-rolling-202306080317”, comes with FRR 8.5.1.

Hope this helps someone.

This post is licensed under CC BY 4.0 by the author.
