How I spent last summer:
Converting NGIX/DC atm to gige

Dan Magorian
Director, Engineering and Operations
magorian@maxgigapop.net

Talk 2:  Converting NGIX/DC to gige
MAX has run the Fednet peer point for Abilene, vBNS, DREN, NISN, NREN, USGS for 4 years
Previously called NGIX/East coast, is meet-me point (NAP) for major Federal R&E nets.
Topology was full L2 mesh of p-t-p atm pvcs. No transit peerings, no common route pool.  Doesn’t scale to larger peer points, but kept life simple.
Then atm began to go out of style, and customer nets began to push for frame-based (gige) peering.
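To put a number on the "doesn't scale" remark about the full mesh, here is a minimal Python sketch. It assumes every pair of the peers listed on this slide had its own p-t-p pvc, which the slides do not confirm; the function names are illustrative only.

```python
from itertools import combinations

# Peer names taken from this slide. Assumption: every pair peered directly.
PEERS = ["Abilene", "vBNS", "DREN", "NISN", "NREN", "USGS"]


def full_mesh_pvcs(peers):
    """Return one (a, b) pair per point-to-point pvc in a full mesh."""
    return list(combinations(sorted(peers), 2))


def mesh_size(n):
    """Number of pvcs in a full mesh of n peers: n * (n - 1) / 2."""
    return n * (n - 1) // 2


if __name__ == "__main__":
    print(len(full_mesh_pvcs(PEERS)), "pvcs for", len(PEERS), "peers")
    # Growth is quadratic, which is why this only works for small peer points.
    for n in (6, 12, 24):
        print(n, "peers ->", mesh_size(n), "pvcs")
```

Under that assumption, six peers need 15 pvcs, and doubling the peer count roughly quadruples the number of circuits to manage.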

Took us a while to get it done
We contemplated several architectures, including putting all nets on Junipers directly and using translational cross-connects (tccs) that allow bolting ethernets to atms to sonets.
Ended up deciding to use conventional switch, because of high cost of 10G router interfaces ($110k vs $30k for switched), also no tcc IPv6.
Did side-by-side comparison of Cisco 6500 vs Extreme Black Diamond.  Liked Extremes, but no oc12 atm, and winning feature was Cisco’s OSM-2OC12-atm 1483 bridged-routing blade

RFC 1483: Multiprotocol encap over AAL5
One of the many reasons, in addition to LANE, why people run screaming from atm (actually, if you only do pvcs, atm is easy and reliable AND can do 9K and even 64k mtus).  I’m no 1483 expert.
There are many variations involving routed and bridged pdus.  Basically, bridged is used by SPs who need to connect ethernets via an atm cloud.  What Marconi’s gige int does: useless for us.  Juniper does minimal routed subset w/1 ip addr good only for dslams.  Others as well, complex.

Bridged-routed connect: the magic
There are several variants of routed pdus across different platforms, mostly also useless to us.
But on 6500 OSM blades Cisco has developed a 1483 bridged/routed encap that does just what we wanted: allow ethernets to connect to atms.  Uses proxy arp and other ugly stuff internally.
Best part was that we only needed it for the transition, so we borrowed the $60K blade for 6 months & then returned it when all atm gone.

So why are we discussing 1483 anyway?
Converting a national NAP from atm to gige isn’t like changing over your campus net at midnight.  Replacement lines have to be arranged, peer partners have to coordinate ints, downtime has to be minimal because they’re paying for uptime.
The biggest issue was how to do the transition without a massive flag-day changeover nightmare.
With testing, realized that the Cisco OSM blade bridged-routing would allow us to single-endedly change over each peer at midnight without any of their full-mesh of peers needing to be there.

How did we do it?
We hooked up both OSM OC12 atms to the Marconi atm switch (would have been a problem if more traffic than would fit, but there wasn’t).
All peers were assigned new vlans matched to their old pvcs.  The 6500 bridged routed encap allowed us to make ip connections across.  In some sense was subset of TCCs, but cheaper.
When each peer had their new line and gige int ready with new vlans, we hot-cut their line at both ends to gige, moving the /30 w/o their peers knowing!
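The single-ended changeover above can be modeled in a few lines of Python. This is a simplified sketch, not the actual procedure: the path labels and function names are assumptions for illustration, and it only tracks which peers have moved to gige and how each peering would then be carried.

```python
from itertools import combinations


def path_for(a, b, on_gige):
    """Describe how the a<->b peering is carried, given who has cut over."""
    a_new, b_new = a in on_gige, b in on_gige
    if a_new and b_new:
        return "vlan only"
    if a_new or b_new:
        # One side still on atm: the 6500 bridges the pvc to the vlan.
        return "pvc bridged to vlan"
    return "pvc only"


def cutover_plan(peers, order):
    """Yield (peer_cut, {pair: path}) after each single-ended cutover."""
    on_gige = set()
    for peer in order:
        on_gige.add(peer)
        state = {
            (a, b): path_for(a, b, on_gige)
            for a, b in combinations(peers, 2)
        }
        yield peer, state


if __name__ == "__main__":
    peers = ["NISN", "Abilene", "USGS"]
    for peer, state in cutover_plan(peers, order=peers):
        print("after cutting", peer)
        for pair, path in state.items():
            print("  ", pair, "->", path)
```

The point of the model is that every pair has a working path at every step, so no peer has to be present for another peer's cut.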

Sample bre-connect 6500 IOS (partial)
interface ATM4/1
 description marconi 1D1
 mtu 9186
 no ip address
 atm bridge-enable
 switchport trunk allowed vlan 161,170,178
 switchport mode trunk
interface ATM4/1.161 point-to-point
 pvc USGS-Abilene 161/32
    bre-connect 161
interface GigabitEthernet3/5
 description USGS
 mtu 9216
 no ip address
 switchport trunk encapsulation dot1q
 switchport trunk allowed vlan 158,161,162,178
 switchport mode trunk
interface Vlan161
 description ABIL-USGS ACTIVE gige
 no ip address
 mtu 9216
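With one stanza per pvc, configs like the sample above are easy to generate from a table. The Python sketch below only reuses the command lines that appear in the sample. It assumes the subinterface number equals the vlan number, as in the sample; the function names are hypothetical, and this is not a Cisco tool.

```python
def render_bre_stanza(atm_if, pvc_name, vpi, vci, vlan):
    """Render the per-pvc subinterface stanza in the style of the sample.

    Assumption: the subinterface number is the vlan number (true of the
    one stanza shown in the sample, not confirmed for the others).
    """
    lines = [
        f"interface {atm_if}.{vlan} point-to-point",
        f" pvc {pvc_name} {vpi}/{vci}",
        f"    bre-connect {vlan}",
    ]
    return "\n".join(lines)


def render_allowed_vlans(vlans):
    """Render the trunk allowed-vlan line for a set of vlan ids."""
    ids = ",".join(str(v) for v in sorted(vlans))
    return f" switchport trunk allowed vlan {ids}"


if __name__ == "__main__":
    # Values copied from the sample stanza.
    print(render_bre_stanza("ATM4/1", "USGS-Abilene", 161, 32, 161))
    print(render_allowed_vlans({178, 161, 170}))
```

Generating the stanzas from one pvc-to-vlan table keeps the atm side and the trunk vlan lists consistent, which matters when the cuts happen at midnight.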

How did it go?  Had to be some war story
Of course there was.
Overall had to be VERY careful at midnight cutting pvcs
Testing initially didn’t work, took conf call with Cisco product manager to resolve issues.  Then fine.  Had incredible lab jury-rig of atm switch & routers in multiple bldgs.
NASA’s NISN was first, easy because had spare fiber to GSFC, didn’t have to hotcut.
Then Abilene was easy, because MAX sells them lambda transport to NGIX, just had to change wdm.
USGS, NLM, and MAX also easy hotcuts.
But MCI’s DREN and vBNS were nightmare.

The war story
Caveat:  MCI did VERY good job, not their fault.
MCI is contractor for both original vBNS and in 2002 brought up new DREN.  MCI had Verizon OC12 from their pop to MAX for years, I suggested both share it.
With summer gige conversion, were unable to get replacement gige line to their pop.  Decided to keep OC12 atm and I installed their colo M10 to run TCCs.
Juniper and Andrea Reitzel tested in lab, worked
At midnight cut, vBNS TCCs failed, had to back out 5am.  Eventually had to loan MCI MAX on-site test routers.
Scott Robohn of Juniper found, needed old gige firmware
After another 4am-er got working, but tccs still touchy

So now that’s done…
All peers off atm, the OSM blade went back
Now the config is totally boring, pure L2 on one 16-port gige blade, runs beautifully.
Will be upgrading Abilene’s peering to 10G next week, already tested Luxn 10G transponders.
NREN joined NGIX, using GSFC wdm transport
Subsequently bought lab 6503 to test future IOSes and other changes. Upcoming release will have L2 support for netflow, now just int stats.

Thanks!
Questions?
Dan Magorian
magorian@maxgigapop.net