How I spent last summer:
Converting MAX to 2547bis VPNs

Dan Magorian
Director, Engineering and Operations
magorian@maxgigapop.net

Talk 1:  MAX’s 2547 VPN background
Heard Ivan Gonzalez of Juniper’s presentation to routing wg July 2002 on Calren 2547 mockup. Examined his configs, looked useful & feasible.
MAX in 2002 was offering standard mix of routes from Abilene, DREN, ESnet, vBNS, etc. peerings. Everything working, nobody interested in 2547.
But pressure from customers mounted to resell Qwest ISP service.  Ran tests for 6 months with UMD as ISP and GU as customer, and found routing problems.  Mocked up 2547 topology on MAX’s inexpensive lab routers, then deployed.

Mid-Atlantic Crossroads

MAX Core Optical Network

So what are RFC 2547 L3 VPNs or, Why should you let MPLS onto your network?
Probably everyone’s heard “fish problem” talks & knows about policy constrained routing issues.
Don’t want to bore everyone with fish again, or characterizations of ATM/frame vs L2 vs L3 VPNs.  Many people have done that better than I could.
This is from operational perspective:  what caused MAX to convert to them, what are pros and cons, why regional (not just national) service providers might consider using them.

Policy Constrained Routing Review
Policy Constrained Routing, Explicit Routing Objectives
Solve long-standing “fish problem” by use of single router node to create multiple policies or “routing instances”
Use more than destination as criteria for routing decision
At minimum use Source (VPN membership or L3 Info) + Destination for route decision
Technology evolution offers solution
RFC 2547bis and MPLS
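
A minimal sketch of "source + destination" route selection, assuming the ingress interface stands in for VPN membership. The instance names, interface names, prefixes and next hops are all made up for illustration; none come from MAX's network.

```python
import ipaddress

# Hypothetical tables: one per routing instance, each mapping prefix -> next hop.
TABLES = {
    "research": {"198.51.100.0/24": "research-backbone"},
    "commodity": {"198.51.100.0/24": "isp-upstream", "0.0.0.0/0": "isp-upstream"},
}

# Hypothetical mapping from ingress interface to routing instance.
INTERFACE_INSTANCE = {"ge-0/0/0.1": "research", "ge-0/0/0.2": "commodity"}


def lookup(ingress, destination):
    """Longest-prefix match inside the table chosen by the ingress interface."""
    table = TABLES[INTERFACE_INSTANCE[ingress]]
    addr = ipaddress.ip_address(destination)
    matches = [ipaddress.ip_network(p) for p in table
               if addr in ipaddress.ip_network(p)]
    if not matches:
        return None  # this instance has no route to the destination
    best = max(matches, key=lambda net: net.prefixlen)
    return table[str(best)]
```

The same destination can resolve differently, or not at all, depending on which instance the packet arrived in. That is the part a single destination-only table cannot express.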

PCR/ER Overview

PCR/ER – The Challenge

PCR/ER – Solution – (Control Plane) 2547bis L3VPN

That’s fine, but why should you care?
2547 is widely used by many national service providers to create overlay networks for different customers.  Many run IP on edge, MPLS in core.
Should mention Cisco routers might have similar VRF capability.  Don’t know, can’t speak to that.
Yet gigapops don’t usually have overlay networks, and people usually find workarounds to fish problems.
But in this case we ran into a show-stopper that caused us to really need to deploy them.

What Juniper doesn’t tell you about 2547
They’re called Routing Instances (VRFs), but they AREN’T virtual routers.  JUNOS has virtual routers in 6.1
2547 VPNs have only one iBGP.  What happens is that a Route Distinguisher and Route Target BGP communities are added, putting routes into separate tables.  IGP remains same.
The catch is that interfaces need to be in one VRF or another.  They can’t be in multiples.  There are tricky workarounds with next-table policies, but they didn’t work well in our situation.  Customers don’t know about 2547.
So since we ebgp with everyone, we establish second peerings with ISP customers.  A nuisance but it works.
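
A rough sketch of how route targets sort routes into separate tables, assuming a route is just a prefix plus a set of target communities. The `target:209:1` value and the instance name `Q` match the sample config in this talk; the prefixes and the function name are hypothetical, and real import policy can match on much more than one community.

```python
from collections import defaultdict


def build_tables(routes, vrf_import_targets):
    """Place each route in every VRF table whose import target it carries."""
    tables = defaultdict(list)
    for route in routes:
        for vrf, target in vrf_import_targets.items():
            if target in route["targets"]:
                tables[vrf].append(route["prefix"])
    return dict(tables)


# Hypothetical routes learned over the single iBGP mesh.
routes = [
    {"prefix": "192.0.2.0/24", "targets": {"target:209:1"}},
    {"prefix": "198.51.100.0/24", "targets": set()},
]

tables = build_tables(routes, {"Q": "target:209:1"})
```

Only the route tagged with the target lands in the `Q` table; the untagged one stays out of it.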

So what was the big problem?
We mark routes from upstreams with bgp communities, and use them to subset which routes are advertised to downstream customers.
E.g., everyone gets Abilene, DREN, ESnet, MAX, but only a few get vBNS (yes, we still have it)
Started out doing same w/ ISP:  BGP community to control Qwest route advertisement.
Then discovered that we were blackholing traffic from certain non-Qwest customers, e.g. NLM.

Exploring further
We found that the main problem was gigapops who advertise unequal prefix length announcements to their ISP and Abilene. If everyone’s were equal would be fine.
But we discovered that it’s unfortunately fairly common for GPs to aggregate towards Abilene and not aggregate or aggregate less towards their ISPs, very undesirable.
So when NLM saw a route and sent traffic to MAX, a more specific Qwest route not advertised to them could take precedence, and yet they hadn’t subscribed to Qwest ISP.  So that traffic would be dropped.
After some attempts to get sources to fix it we gave up.
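
The blackhole can be sketched as below. This is a toy model under stated assumptions: the prefixes are documentation addresses, the upstream labels stand in for our BGP communities, and "dropped" stands for whatever happens to traffic headed to an upstream the customer has not subscribed to. The function names are hypothetical.

```python
import ipaddress

# Hypothetical routes: (prefix, upstream). The remote site aggregates toward
# Abilene but announces a more-specific prefix toward its ISP.
ROUTES = [("198.51.100.0/24", "ABILENE"), ("198.51.100.0/25", "QWEST")]


def advertised(routes, subscriptions):
    """Community-style subsetting: only routes from subscribed upstreams."""
    return [prefix for prefix, upstream in routes if upstream in subscriptions]


def forward(routes, subscriptions, destination):
    """Single shared table: longest match wins, then subscription is checked."""
    addr = ipaddress.ip_address(destination)
    matches = [(ipaddress.ip_network(p), up) for p, up in routes
               if addr in ipaddress.ip_network(p)]
    if not matches:
        return "no route"
    _, upstream = max(matches, key=lambda m: m[0].prefixlen)
    return upstream if upstream in subscriptions else "dropped"


def forward_per_instance(routes, subscriptions, destination):
    """Separate table per offering: only subscribed routes are candidates."""
    visible = [(p, up) for p, up in routes if up in subscriptions]
    return forward(visible, subscriptions, destination)
```

An Abilene-only customer is advertised the /24 and sends traffic for it, but the shared table prefers the Qwest /25 it was never shown. With a separate table per offering the /25 is not a candidate, so the /24 via Abilene is used.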

So we realized that
Most gigapops have only one service offering, and give everyone a mix of I2 and ISP routes.
The problem would only get worse once we had customers not subscribed to Abilene (we now do).
There wasn’t an easy way out that could “get back” routes not preferred by bgp when sub-setted by bgp communities for advertisement.
We had seen same issue in minor way with vBNS, but NGIX Abilene/vBNS peering solved it
Lots of folks run 2547, and Juniper supports it well

OK, so how hard was it to deploy?
Went quite smoothly.  We had Ivan’s configs that had worked with Calren mockup.  Ben Eater of Juniper gave lots of help, but we mocked up ourselves in our lab.
Main issue was what to do about the dual peering issue.  Since we BGP with everyone except 2 downtown offices, decided to simplify to make everyone same.  One went away and UCAID/DC converting soon.  Lots would be a problem.
Basically, we turned on new VRF, ran for a while, no issues, then turned on Qwest peering & new customers.
Did cheat and keep existing inet.0:  Juniper recommends doing everything as VRFs.  Better but nasty cutover.

There’ve got to be some downsides
Two virtual circuit requirement is biggest.  We couldn’t have done it if we had lots of statics to small customers.  Ethernet easy, yet carriers like Yipes don’t support trunking and multiple vlans.  Sonet & DSXs need frame relay encap for dlcis.
Also have to get used to tracing & pinging within other routing tables, & oddities about source ints
And what MPLS is doing can be fuzzy with core and edge routers same, like nat net in 4 boxes.
Plus your bgp becomes unusual, eg Arbor mons

Sample JunOS showing VRF (partial)
routing-instances {
    Q {
        instance-type vrf;
        /* UMD-ISP */
        interface ge-2/2/0.1;
        /* GU-ISP */
        interface at-1/0/0.4;
        /* HHMI-ISP */
        interface ge-2/2/1.6;
        route-distinguisher 206.196.177.246:1;
        vrf-import QWEST-IMPORT;
        vrf-export QWEST-EXPORT;
        routing-options {
            rib Q.inet.0 {
                aggregate {
                    route 206.196.176.0/21 {
                        community 10886:1;
                        passive;

        protocols {
            bgp {
                group ISP {
                    type external;
                    export [ AGGREGATE-ROUTES NO-MAX-SPECIFICS ];
                    neighbor 206.196.177.50 {
                        description UMD-ISP;
                        import from-UMD;
                        export [ DEFAULT MAX QWEST REJECT ];
                        peer-as 27;
                    }
                    neighbor 206.196.177.154 {
                        description HHMI-ISP;
                        import from-HHMI;
                        export [ NO-DEFAULT QWEST REJECT ];
                        peer-as 16692;
                    }
    community QWEST members 10886:8;

    policy-statement QWEST-IMPORT {
        term 10 {
            from community QWEST-VRF;
            then accept;
        }
        term 20 { then reject; }
    }
    policy-statement QWEST-EXPORT {
        term 10 {
            then { community add QWEST-VRF; accept; }
        }
    }
    community ABILENE members 10886:3;
    community DREN members 10886:5;
    community ESNET members 10886:6;
    community MAX members 10886:1;
    community NISN members 10886:4;
    community NREN members 10886:9;
    community QWEST members 10886:8;
    community QWEST-CUST members 209:209;
    community QWEST-VRF members   target:209:1;
    community VBNS members 10886:2;

So do we recommend it?
Sure.  Wasn’t that hard, no real war stories, didn’t cost anything except time to scope it out.
Solves problem nicely, caused by our being late into ISP resale, and reselling each separately.
Can have as many overlay nets as we want for different service offerings (mostly different ISPs)
Might also be handy if don’t want to mix low-cost ISP routes (eg Cogent) w/higher cost (eg Qwest)
Might be trickier for GPs with lots of small custs

Thanks!

Questions?