VyOS and BGP MED

Inefficient Routing

An important part of my (new) network is the integration with my old Linodes, on which I rely for the IPv4/v6 addressing for my authoritative nameservers, among other things.

Those Linodes connect to my current internal network through a variety of tunnels (a discussion for another day).

Figure: Current setup

The problem I have been facing is with the return traffic from my Linodes to my internal network. That (new) internal network spans two physical locations (OSR1 and OSR2), but it is completely integrated: a single private ASN and a single IGP, with both sites connected via GRETAP over AS203528.

I would like the routers at OSR1 and OSR2 to advertise the prefixes of the combined (OSR1+OSR2) network to the old Linodes via BGP, with the MED set to the IGP metric toward each route’s iBGP next hop. This would be similar to Juniper’s “metric-out igp” BGP setting. Unfortunately, I have not found a VyOS configuration parameter that does this; the most you can do is set up a route-map that forces a specific MED value.

Default behavior: no MED sent.

fabrizzio@OSR1CR2:~$ sh ip bgp ipv4 unicast neighbors 192.168.251.209 advertised-routes | match "Metric|192.168.20.0/24"
   Network          Next Hop            Metric LocPrf Weight Path
*> 192.168.20.0/24  0.0.0.0                       100      0 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 i

Because of this, unless I intervene, traffic from either Linode toward my network can be sent to either OSR1 or OSR2, since both routes look equivalent. If it arrives at the wrong POP, it must be “backhauled” to the other site.

Dirty Fix

192.168.251.186 4 4200000001    200881    153094        0    0    0 01:14:05          109      121 To OSR2CR2
192.168.251.194 4 4200000001    175286    143443        0    0    0 01:14:04          109      121 To OSR1CR1

fabrizzio@FFT1EV1:~$ sh  ip bgp ipv4 unicast neighbors 192.168.251.186 routes | match  "Metric|192.168.20.0/24"
   Network          Next Hop            Metric LocPrf Weight Path
*  192.168.20.0/24  192.168.251.186      10000             0 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 i
fabrizzio@FFT1EV1:~$ sh  ip bgp ipv4 unicast neighbors 192.168.251.194 routes | match  "Metric|192.168.20.0/24"
   Network          Next Hop            Metric LocPrf Weight Path
*  192.168.20.0/24  192.168.251.194                        0 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 i

set policy route-map METRIC_10000 rule 10 action 'permit'
set policy route-map METRIC_10000 rule 10 set metric '10000'

set protocols bgp neighbor 192.168.251.186 address-family ipv4-unicast route-map import 'METRIC_10000'
set protocols bgp neighbor 192.168.251.186 address-family ipv6-unicast route-map import 'METRIC_10000'


I have more capacity at OSR1, so on each Linode I forced a high MED (10000) on the routes received over the BGP peerings with the OSR2 nodes. Traffic destined to OSR2 now routes via OSR1 and then crosses the GRETAP to OSR2. Outbound traffic from OSR2 still goes direct.
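To see why forcing a MED on one peering is enough, here is a minimal Python sketch of the relevant slice of BGP best-path selection. It is a simplification, not FRR's actual algorithm: it assumes weight, local preference and origin are already tied, compares only AS-path length and then MED, and treats a missing MED as 0 (which, as far as I know, is the default unless configured otherwise). `Path` and `best_path` are names made up for this illustration; the neighbor addresses and MED values come from the output above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Path:
    neighbor: str
    as_path_len: int
    med: Optional[int]  # None means the route carried no MED


def best_path(paths):
    """Pick the winner by shortest AS path, then lowest MED.

    Simplified on purpose: weight, local-pref and origin are assumed equal,
    and a missing MED is treated as 0 (an assumption, see the text above).
    """
    return min(
        paths,
        key=lambda p: (p.as_path_len, p.med if p.med is not None else 0),
    )


# Both sessions as seen from the Linode for 192.168.20.0/24.
via_osr2 = Path(neighbor="192.168.251.186", as_path_len=7, med=10000)
via_osr1 = Path(neighbor="192.168.251.194", as_path_len=7, med=None)

winner = best_path([via_osr2, via_osr1])
print(winner.neighbor)
```

With the AS paths equal in length, the session without a MED (toward OSR1) wins over the one carrying 10000. MED is normally only compared between paths learned from the same neighboring AS, which holds here since both sessions peer with 4200000001.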

To be fair, this works just fine. I could refine the route-map so that it only matches some prefixes, or use two route-maps (to prefer OSR1 prefixes on the OSR1 tunnel and OSR2 prefixes on the OSR2 tunnel).

However, this can be made better, much better.

Communities to the rescue

The easiest way is to have the route reflectors within my internal network add a community to the routes received from each RR client, based on the client’s location.

set policy route-map RTR_OSR1 rule 10 action 'permit'
set policy route-map RTR_OSR1 rule 10 set community add '65000:101'
set policy route-map RTR_OSR2 rule 10 action 'permit'
set policy route-map RTR_OSR2 rule 10 set community add '65000:102'



set protocols bgp neighbor 192.168.254.10 address-family ipv4-unicast route-map import RTR_OSR1
set protocols bgp neighbor 192.168.254.10 address-family ipv6-unicast route-map import RTR_OSR1


set protocols bgp neighbor 192.168.254.16 address-family ipv4-unicast route-map import RTR_OSR2
set protocols bgp neighbor 192.168.254.16 address-family ipv6-unicast route-map import RTR_OSR2

This works just fine (I have four RRs, so each prefix shows four paths):

fabrizzio@OSR1CR1:~$ sh ip bgp ipv4 unicast 192.168.20.0/24
BGP routing table entry for 192.168.20.0/24, version 128126
Paths: (4 available, best #4, table default)
  Local
    192.168.254.33 (metric 1110) from 192.168.254.53 (192.168.254.33)
      Origin IGP, metric 0, localpref 100, valid, internal
      Community: 65000:101
      Originator: 192.168.254.33, Cluster list: 4.4.4.4
      AddPath ID: RX 85, TX-All 1231 TX-Best-Per-AS 0
      Last update: Fri Mar  3 18:07:12 2023
  Local
    192.168.254.33 (metric 1110) from 192.168.254.51 (192.168.254.33)
      Origin IGP, metric 0, localpref 100, valid, internal
      Community: 65000:101
      Originator: 192.168.254.33, Cluster list: 2.2.2.2
      AddPath ID: RX 477, TX-All 1724 TX-Best-Per-AS 0
      Last update: Fri Mar  3 18:06:24 2023
  Local
    192.168.254.33 (metric 1110) from 192.168.254.52 (192.168.254.33)
      Origin IGP, metric 0, localpref 100, valid, internal
      Community: 65000:101
      Originator: 192.168.254.33, Cluster list: 3.3.3.3
      AddPath ID: RX 478, TX-All 1728 TX-Best-Per-AS 0
      Last update: Fri Mar  3 18:06:50 2023
  Local
    192.168.254.33 (metric 1110) from 192.168.254.50 (192.168.254.33)
      Origin IGP, metric 0, localpref 100, valid, internal, best (Neighbor IP)
      Community: 65000:101
      Originator: 192.168.254.33, Cluster list: 1.1.1.1
      AddPath ID: RX 199, TX-All 1729 TX-Best-Per-AS 0
      Advertised to: 172.27.18.42 172.27.18.58 192.168.251.193
      Last update: Fri Mar  3 18:05:05 2023
fabrizzio@OSR1CR1:~$ sh ip bgp ipv4 unicast 192.168.35.0/24
BGP routing table entry for 192.168.35.0/24, version 128070
Paths: (4 available, best #3, table default)
  Local
    192.168.254.34 (metric 45110) from 192.168.254.53 (192.168.254.34)
      Origin IGP, metric 0, localpref 100, valid, internal
      Community: 65000:102
      Originator: 192.168.254.34, Cluster list: 4.4.4.4
      AddPath ID: RX 4, TX-All 1005 TX-Best-Per-AS 0
      Last update: Fri Mar  3 18:07:12 2023
  Local
    192.168.254.34 (metric 45110) from 192.168.254.51 (192.168.254.34)
      Origin IGP, metric 0, localpref 100, valid, internal
      Community: 65000:102
      Originator: 192.168.254.34, Cluster list: 2.2.2.2
      AddPath ID: RX 9, TX-All 393 TX-Best-Per-AS 0
      Last update: Fri Mar  3 18:06:24 2023
  Local
    192.168.254.34 (metric 45110) from 192.168.254.50 (192.168.254.34)
      Origin IGP, metric 0, localpref 100, valid, internal, best (Neighbor IP)
      Community: 65000:102
      Originator: 192.168.254.34, Cluster list: 1.1.1.1
      AddPath ID: RX 88, TX-All 396 TX-Best-Per-AS 0
      Advertised to: 172.27.18.42 172.27.18.58 192.168.251.193
      Last update: Fri Mar  3 18:05:05 2023
  Local
    192.168.254.34 (metric 45110) from 192.168.254.52 (192.168.254.34)
      Origin IGP, metric 0, localpref 100, valid, internal
      Community: 65000:102
      Originator: 192.168.254.34, Cluster list: 3.3.3.3
      AddPath ID: RX 27, TX-All 400 TX-Best-Per-AS 0
      Last update: Fri Mar  3 18:06:50 2023
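
The tagging done by the route reflectors can be sketched in a few lines of Python. This is only a model of the policy, not of how VyOS or FRR implement it. `SITE_COMMUNITY` and `tag_on_import` are hypothetical names, `65000:999` is a made-up pre-existing community, and the sketch assumes that `set community add` appends to whatever communities the route already carries rather than replacing them.

```python
# Map each RR client (by neighbor address) to the community for its site.
SITE_COMMUNITY = {
    "192.168.254.10": "65000:101",  # import route-map RTR_OSR1
    "192.168.254.16": "65000:102",  # import route-map RTR_OSR2
}


def tag_on_import(neighbor, communities):
    """Return the route's communities after the per-neighbor import policy."""
    tag = SITE_COMMUNITY.get(neighbor)
    if tag is None:
        # No import route-map on this neighbor: the route is left untouched.
        return set(communities)
    return set(communities) | {tag}


print(sorted(tag_on_import("192.168.254.10", set())))
print(sorted(tag_on_import("192.168.254.16", {"65000:999"})))
```

Routes learned from a neighbor that has no such import route-map keep their communities unchanged, so they end up without any site tag.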

Then on the routers at OSR1 / OSR2 peering with my Linodes:

set policy community-list FROM_OSR1 rule 10 action 'permit'
set policy community-list FROM_OSR1 rule 10 regex '65000:101'
set policy community-list FROM_OSR2 rule 10 action 'permit'
set policy community-list FROM_OSR2 rule 10 regex '65000:102'

set policy route-map LINODE_MED_FROM_OSR1 rule 10 action 'permit'
set policy route-map LINODE_MED_FROM_OSR1 rule 10 match community community-list 'FROM_OSR1'
set policy route-map LINODE_MED_FROM_OSR1 rule 10 set metric '100'
set policy route-map LINODE_MED_FROM_OSR1 rule 20 action 'permit'
set policy route-map LINODE_MED_FROM_OSR1 rule 20 match community community-list 'FROM_OSR2'
set policy route-map LINODE_MED_FROM_OSR1 rule 20 set metric '200'


set policy route-map PREPEND-9-LINODE rule 10 action 'permit'
set policy route-map PREPEND-9-LINODE rule 10 set as-path prepend '4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001'
set policy route-map PREPEND-9-LINODE rule 10 call 'LINODE_MED_FROM_OSR1'

set protocols bgp neighbor 192.168.251.209 address-family ipv4-unicast route-map export PREPEND-9-LINODE
set protocols bgp neighbor 192.168.251.209 address-family ipv6-unicast route-map export PREPEND-9-LINODE

(The amount of AS prepending varies with the distance. I used to run BGP as my IGP, and this is just a remnant of that.)

Results

fabrizzio@OSR1CR2:~$ sh ip bgp ipv4 unicast neighbors 192.168.251.209 advertised-routes | match "Metric|192.168.20.0/24"
   Network          Next Hop            Metric LocPrf Weight Path
*> 192.168.20.0/24  0.0.0.0                100    100      0 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 i
fabrizzio@OSR1CR2:~$ sh ip bgp ipv4 unicast neighbors 192.168.251.209 advertised-routes | match "Metric|192.168.35.0/24"
   Network          Next Hop            Metric LocPrf Weight Path
*> 192.168.35.0/24  0.0.0.0                200    100      0 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 i

Now the metrics for the prefixes depend on the site. This one was easy :D

The mistake and fix

After deploying this I was trying to figure out why some routes were not being advertised.

172.27.18.58    4      65108   1130276   1166833        0    0    0 01w6d08h            2      121 To OSR1E3
192.168.251.193 4      65007     47890     55758        0    0    0 00:27:11           36      106 To FFT1EV1 *LEGACY*


fabrizzio@OSR1CR1:~$ sh ip bgp ipv4 unicast neighbors 172.27.18.58 routes
BGP table version is 128213, local router ID is 192.168.254.10, vrf id 0
Default local pref 100, local AS 4200000001
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

   Network          Next Hop            Metric LocPrf Weight Path
*> 192.168.39.0/24  172.27.18.58             0             0 65108 i
*> 192.168.45.128/26
                    172.27.18.58             0             0 65108 i

Displayed  2 routes and 1787 total paths
fabrizzio@OSR1CR1:~$ sh ip bgp ipv4 unicast neighbors 192.168.251.193 advertised-routes | match "192.168.39.0/24"
fabrizzio@OSR1CR1:~$


The core routers also have eBGP sessions to routers that were part of my legacy network (or to Mikrotik boxes that I don’t want in my IGP!). Routes received via eBGP (or simply redistributed into BGP), rather than from a route reflector, don’t carry the OSR1/OSR2 community. They match neither rule 10 nor rule 20 of LINODE_MED_FROM_OSR1, fall through to the route-map’s implicit deny, and are therefore dropped.

fabrizzio@OSR1CR1:~$ sh ip bgp ipv4 unicast 192.168.39.0/24
BGP routing table entry for 192.168.39.0/24, version 128102
Paths: (5 available, best #5, table default)
<snip>
  65108
    172.27.18.58 from 172.27.18.58 (192.168.45.129)
      Origin IGP, metric 0, valid, external, best (AS Path)
      AddPath ID: RX 0, TX-All 9 TX-Best-Per-AS 0
      Advertised to: 172.27.18.42 172.27.18.58 192.168.254.50 192.168.254.51 192.168.254.52 192.168.254.53
      Last update: Sat Feb 18 11:18:30 2023

Figure: Summary of the issue

Fix and results after:

set policy route-map LINODE_MED_FROM_OSR1 rule 10 action 'permit'
set policy route-map LINODE_MED_FROM_OSR1 rule 10 match community community-list 'FROM_OSR1'
set policy route-map LINODE_MED_FROM_OSR1 rule 10 set metric '100'
set policy route-map LINODE_MED_FROM_OSR1 rule 20 action 'permit'
set policy route-map LINODE_MED_FROM_OSR1 rule 20 match community community-list 'FROM_OSR2'
set policy route-map LINODE_MED_FROM_OSR1 rule 20 set metric '200'
set policy route-map LINODE_MED_FROM_OSR1 rule 30 action permit
set policy route-map LINODE_MED_FROM_OSR1 rule 30 set metric '100'
set policy route-map LINODE_MED_FROM_OSR1 rule 30 description 'Catch-all for local non-reflected routes'


fabrizzio@OSR1CR1:~$ sh ip bgp ipv4 unicast neighbors 192.168.251.193 advertised-routes | match "192.168.39.0/24"
*> 192.168.39.0/24  0.0.0.0                100             0 4200000001 4200000001 4200000001 4200000001 4200000001 4200000001 65108 i
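
The failure mode and the fix can be reproduced with a small Python model of first-match route-map evaluation. It is a sketch under assumptions, not FRR's implementation: rules are reduced to a single optional community match plus a MED to set, a route that matches no rule is denied, and that deny is assumed to carry over to the calling route-map (PREPEND-9-LINODE). The `evaluate` function and the variable names are made up for the example.

```python
def evaluate(rules, communities):
    """First-match evaluation; return the MED to set, or None if denied.

    Each rule is (community_to_match or None, med). A rule whose match is
    None matches every route. Falling off the end is an implicit deny.
    """
    for match, med in rules:
        if match is None or match in communities:
            return med
    return None  # implicit deny: the route is not advertised


# LINODE_MED_FROM_OSR1 before the fix (rules 10 and 20 only).
original = [("65000:101", 100), ("65000:102", 200)]
# After the fix: rule 30 is a catch-all that also sets MED 100.
with_catch_all = original + [(None, 100)]

tagged = {"65000:101"}  # reflected route, tagged by the RRs
untagged = set()        # eBGP-learned route such as 192.168.39.0/24

print(evaluate(original, tagged))
print(evaluate(original, untagged))
print(evaluate(with_catch_all, untagged))
```

The untagged route is denied by the original rule set and picks up MED 100 once the catch-all is in place, while tagged routes still hit rules 10 and 20 first.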

At least it was an easy thing to fix :D

This post is licensed under CC BY 4.0 by the author.
