Working on a possible topology for a new data center buildout -- we are small enough that a single switch servicing a rack is not ideal (so we want MLAG to tolerate switch failure / maintenance events), but we want to be able to scale out appropriately and limit fault domains, so I'd like to do an L3 Clos leaf-and-spine fabric between racks.
The challenge I'm running into is how best to design that with appropriate failover, routing, etc. I am using the VX VMs to lab this up as best I can, and I'm curious if anyone else has done this or has any insight or input.
What I've thought about so far:
- iBGP between the switches in an MLAG pair, eBGP from the MLAG leafs to the spines (not sure whether it's best for the spines to share one AS or each have their own; there seem to be conflicting published designs out in the wild; each leaf MLAG pair would be its own AS). Running this in the VM lab environment now, and it seems to work alright (although in its current form each spine gets two ECMP routes for the MLAG loopbacks rather than only one).
- eBGP everywhere, every switch its own AS. MLAG peers run eBGP with each other and with every spine. Not sure of the implications of this yet; haven't labbed it up.
- Similar to option 2, but instead of eBGP between the MLAG pair, a static default route at a higher administrative distance, so that if an MLAG switch somehow loses all of its eBGP sessions to the spines while its peer still has them, traffic would ride that static default over to the peer.
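
To make option 1 concrete, here's a rough FRR-style sketch of one leaf in an MLAG pair, since that's what VX runs underneath. All ASNs, addresses, and interface names are made up for illustration, and it assumes BGP unnumbered on the spine uplinks:

```
router bgp 65101
 bgp router-id 10.0.0.11
 ! iBGP to the MLAG peer across the peerlink (example address)
 neighbor 169.254.1.2 remote-as internal
 ! eBGP unnumbered to the spines on the uplink interfaces
 neighbor swp51 interface remote-as external
 neighbor swp52 interface remote-as external
 address-family ipv4 unicast
  ! advertise this switch's loopback into the fabric
  network 10.0.0.11/32
  maximum-paths 64
```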
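
And the fallback in option 3 would just be a floating static default pointed at the MLAG peer, something like this (next-hop is an example peerlink address; the trailing number is the administrative distance, which FRR accepts on the end of the route statement):

```
! Floating static default toward the MLAG peer. FRR's default distance
! for eBGP-learned routes is 20, so at distance 250 this route only
! takes over when the switch has lost its spine-learned BGP routes.
ip route 0.0.0.0/0 169.254.1.2 250
```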
Anyone have any ideas / comments?