Friday, November 27, 2015

CCIE SPv4 Lab: Running it on my laptop (w/only 16GB of RAM)


BIG shout-out to VMware ESXi, because that finally did the trick. Currently, I'm visiting family for the holidays but I still want to be able to study when I can. I know that's super nerdy, but if you're reading this... you're not much better than me. So stop judging. I've recently purchased a server off eBay (nice Dell R610 with 32GB of DDR3 for ~300 bucks), but it hasn't arrived yet, so I've been trying to get something workable in place for the interim. At home my workstation has 32GB of RAM, and is fairly capable of running these SP labs. However, I'm getting both off point and a little ahead of myself. Let me first tell you about my lab topology, from INE's SPv4 material.

- (10) CSR1000v. The CSR1kv has certainly come a long way in terms of requirements, but it's still a hefty beast. You can technically get by on (1) vCPU and 2.5GB of RAM, but I like to run mine with 3GB of RAM. So right off the bat, I'm at a potential 30GB of RAM needed.

- (4) IOS-XRv. IOS-XRv runs perfectly well with only 2GB of RAM, so we're up to a potential 38GB of RAM needed.
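Quick sanity check on that math, as a throwaway Python sketch. The per-VM numbers come straight from the list above; the dictionary layout and the function name are just made up for illustration.

```python
# Per-node RAM figures from the topology list above.
# The structure of this dict is an illustrative assumption, not a VIRL format.
lab = {
    "CSR1000v": {"count": 10, "ram_gb": 3},
    "IOS-XRv": {"count": 4, "ram_gb": 2},
}


def total_ram_gb(nodes):
    """Sum the RAM needed if every node gets its full allocation."""
    return sum(n["count"] * n["ram_gb"] for n in nodes.values())


print(total_ram_gb(lab))  # 38
```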



So, obviously these VMs will rarely (if ever) use that much memory. That doesn't matter with VIRL though, because it will still prevent you from firing up that environment without 38GB of available memory. First attempt at resolving this: lie to VIRL. I went into my VMware Workstation settings and told it "Allow most virtual machine memory to be swapped".



However, for this to really work I'd need to limit overall memory usage of all my VMs to ~8-10GB and swap the rest. Workstation doesn't like that: with anything over 3/4 being swapped, Workstation will not let you boot the VM. Which wouldn't be a huge issue, except for the added overhead of VIRL itself, requiring anywhere from 2.5 - 5GB of RAM to run. Also keep in mind, I know that swapping that much will just murder performance, but I'm not after performance. I'm after booting 14 routers that move JUST fast enough that I can lab with them until I get home. So I hit a wall there. I tried my GNS3 Server approach (as covered in earlier posts)... but GNS3 can have odd issues with environments that get powered off and on several times. I even ran into issues with shutting down the server with GNS3 still open... then the server would just be perpetually hosed upon next boot. This just felt like WAY too much effort lol.


"Okay, we get it. How did you actually get it working?!"

I had this half-baked idea that I honestly felt wouldn't work. "What if I nest ESXi inside Workstation? Give my ESXi VM a small amount of memory, then just try to boot everything?" I know ESXi is way better about swap when it runs out of RAM (hence why I bought a server explicitly for my SP lab), but I didn't think this would work well enough to actually use. I was wrong. I gave ESXi 8GB of RAM, loaded up (10) CSR1KVs and (4) IOS-XRvs and booted everything up. My laptop's CPU spiked really hard, and remained at ~89% for 5-10 minutes. After that though, it seemed to settle down well enough, and... everything booted. More than that, I copied base configs in for all devices and everything was talking!! Response times between devices are fairly high, but no more than dynamips routers really (averaging 30-60msec). That's totally ok though, and the console response is almost perfect after everything is booted. Nearly no lag... enter screen shots!!!



As you can see, all nodes booted and ESXi is just amazing at memory management apparently. On my host side, here's what task manager looks like:





I mean, it's definitely loaded (and when my work VM is running too, mem is at 96% utilized), but my laptop isn't melting!! As a follow-up to this post, I might record a video showing off my VMware config. For the full-on home lab, I'm planning on using vSphere so I can have easier packet captures with a jump host. That's all for now though! Thanks for reading, and comment if you have questions or suggestions.



Thursday, November 19, 2015

The way forward


Blog.. TRANSFORMATION TIME!! Thanks to my long-time, and very good friend Steve Occhiogrosso (found here at ccie-or-null.net) I think I finally have a way forward. Since I passed my lab, you know WAY back in April, I've been feeling a bit empty. Just dying to be working on something new and different. For a while I toyed around with the idea of going the Data Center route (since I work in cloud services), but to be perfectly honest I just didn't feel the fire when I was going through the material. Nothing jumped out and pulled me in, and you really need that when you're considering an IE.

Enter Steve. In a brief chat, I was explaining how DC didn't seem like the path for me. With a huge influx of interest in DC resulting in next to no available rack time... I wasn't feeling it. Then Steve very casually mentioned "What about Service Provider?"

...




Of course SP makes sense for me! MPLS is one of my absolute favorite topics, so right off the bat I was really excited. So I cracked open INE's CCIE SPv4 material (all access pass, true nerd) and skimmed over it, all the while getting more and more enthused. Even better, they just revamped the exam, mirroring the style of R/S v5, so you can virtualize everything. So, SP is my next mountain to climb. Buckle up for SP related posts.


Monday, November 16, 2015

Let's talk about ACI - Cisco's SDN vision.


So everyone keeps saying "SDN is the future!! Network Engineers beware!!", and it's been a few years but generally my reply to that remains the same. "Meh." I'll save my general SDN opinions for another post. This post is about Cisco ACI, and all the fantastic fail that it is. In fairness, I'll start with things I like about ACI first... then I'll drag my soapbox out.

The Good!

(1) VXLAN... oh how I love thee. ACI brings you VXLAN right out of the box, links between spines/leaves are IS-IS, and there are literally NO L2 links between fabric nodes. This is pretty awesome, admittedly not remotely unique to ACI, but very cool that this is the default behavior. (2) Bash shell on your network devices. The latter is pretty cool, but I'll go over the downsides soon. It was unsettling to log into a very NX-OS looking shell, but have full use of my favorite Linux commands. I want you (yes you, all 5 of the people that will read this) to know, when I started writing this blog, my intention was to make this section longer. I swear.


So, that all sounds kind of cool right? ACI must not be so bad after all!! Nope.



The Bad, and the Ugly.

I'm going to try my best to keep this section short, but I get the strong feeling I'm only going to become enraged as I type along. Any Windows users out there? It's ok, this is a safe place, I'm typing this post from the comfort of my Windows 10 workstation. Remember when Microsoft decided to take the start button away? That's how MOST of ACI feels. It feels like Cisco took my [slew of obscenities] start button away. Before ranting about what an awful mess it is to try to use REST to configure anything, let's talk about the things you just don't have currently in ACI. I'll start with the biggest one, traceroute. You heard me. You. Can not. Traceroute. 

[Stopping to allow that to sink in]

That's right, no traceroute. Sure the command is there, but it does nothing. Cisco documentation will refer you to "itraceroute", however that traceroute only shows the path within the fabric... making it useless really. I also hate that when converting a 9500 chassis to ACI, if you migrate one SUP to ACI mode... the change isn't replicated to the standby, which can lead to this just... weird flapping scenario until you figure out what happened. I know, because it [slew of obscenities] happened to me. Even better, if Cisco didn't ship your SUPs with the latest SSL certificates, your fabric will not come up. At all. OH, and you have to contact TAC to get it resolved. Again, I know... because it happened to me.

Automation, pretty much the only reason anyone thinks "Yeah, we should implement an SDN solution", was just... painfully slow. I've used Python scripts to push mass changes to non-SDN architecture at other jobs, and that was faster than trying to push changes to ACI. It was just slow. Speaking of slow, the APIC GUI is, as GUIs are, slow. Shutting/no shutting a single interface can take 2 minutes. Log into the APIC, wait. Go to Fabric, wait. Expand inventory, wait. You get it, it's a WebUI... it's freaking slow.

Which brings me to REST, because surely you're reading this thinking "C'mon there has to be a faster way to make those sort of one off changes?!" Yes, there is. REST, sort of your replacement for the ultra fast CLI that we all loved. With REST, there are a number of different ways to POST changes; I decided to use a REST plugin for Firefox. Finding documentation on using REST with ACI turns out to be a nightmare. After digging through documentation for a period of time, I finally found out how to actually get my auth token so the APIC would allow me to POST said changes. Then I found a doc on shutting/no shutting ports via REST, which turned out to be wrong. Cisco's documentation on using REST with their own product was wrong. Of course I didn't find this out immediately, because REST is just using HTTP so you don't get handy CLI-like errors telling you specifically where the syntax is invalid... you get HTTP error codes lol. Which is just, awful. So I found some useful information on Cisco support forums that showed the correct formatting of a REST call for that purpose, modified the information to fit my environment AAAANNNNNNDDDD.. new and different HTTP failure code. I tried at this for another 15 minutes or so, before throwing in the towel and just using the slow GUI to make my changes. One port at a time, I finally resolved the given issue I was after. Naturally the next day I had to sit through a sales call with ACI guys talking about how ACI was the best thing since sliced bread, all the while the environment is proving to be the least stable we have, and took the most amount of time to deploy.
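For anyone who wants to see the shape of what I was fighting with, here's a rough Python sketch of the two calls: the login that gets you a token, and a POST body to take a port out of service. Big caveats: the hostname, pod/node/port values, and function names are all hypothetical, and the port-shutdown payload (the fabricRsOosPath object with lc set to "blacklist") is my understanding of the object model, not something verified against a live fabric. The sketch only builds the URLs and JSON bodies; it doesn't send anything.

```python
import json


def login_request(apic_host, username, password):
    """Build the URL and JSON body for an APIC aaaLogin call."""
    url = "https://{}/api/aaaLogin.json".format(apic_host)
    body = {"aaaUser": {"attributes": {"name": username, "pwd": password}}}
    return url, json.dumps(body)


def extract_token(response_text):
    """Pull the session token out of an aaaLogin JSON response."""
    data = json.loads(response_text)
    return data["imdata"][0]["aaaLogin"]["attributes"]["token"]


def port_shutdown_request(apic_host, pod, node, port):
    """Build a POST that blacklists (shuts) one fabric port.

    ASSUMPTION: payload shape is unverified; treat it as a starting point.
    """
    url = "https://{}/api/node/mo/uni/fabric/outofsvc.json".format(apic_host)
    tdn = "topology/pod-{}/paths-{}/pathep-[{}]".format(pod, node, port)
    body = {"fabricRsOosPath": {"attributes": {"tDn": tdn, "lc": "blacklist"}}}
    return url, json.dumps(body)


# Hypothetical values, purely for illustration.
print(login_request("apic.example.com", "admin", "not-my-real-password"))
print(port_shutdown_request("apic.example.com", 1, 101, "eth1/1"))
```

To actually send these you'd POST the login body first, then pass the returned token back as the APIC-cookie cookie on the follow-up request. And yes, if the body is off, all you get back is an HTTP error code.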




Honorable mentions for things I also hate about ACI!

- Default behavior doesn't allow for GARP tracking, or ARP flooding. VRRP fail overs were a nightmare. (Can be resolved, just annoying in the heat of tshoot).

- VLANs are localized to the switch, which is fine... but ACI also maps them seemingly arbitrarily to other VLANs. So if you're looking for where VLAN 10 is, you need to see what THAT particular switch mapped VLAN 10 to, and then you can see what ports are in the mapped VLAN. Which will change switch to switch.

- ACI NX-OS is just NX-OSish to make you feel at home until you need to actually use a show command. No tab complete, no context sensitive ANYTHING and if you want the old NX-OS back you can go into vshell, but the commands will often return empty or incomplete information.

- You cannot see the "running config" of anything. Which just gets SO old. Sometimes you just want to look at how BGP is configured, or how an interface is configured. Can't do that, so you're stuck with using a variety of other show commands to eventually get the information you were after in the first place.

- Contracts. Do a little reading on your own about endpoint groups (EPGs) and contracts. Implementation is sloppy, and doesn't make much sense. Having a white-list only LAN sounds cool, but even Cisco just ends up recommending that you permit any any inside a given group.


That's all I have for now, this post is already too long so I might just do a part 2 later. 

VPLS BGP Signaling (this is really cool)


So, a while back (~ a year ago now) I did a YouTube video/blog post about having a sort of hacked VPLS environment in GNS3. Why?! Because it's awesome to have a multi-point l2vpn running over MPLS, that's why. 


Today, I bring you a post/video that's far from a hack. Full on VPLS using my best friend the CSR1000v, and using BGP for signaling.



Wednesday, November 11, 2015

CSR1000v and GNS3, best thing?


Really quick post today about integrating the CSR1000v into GNS3. First up! Why would you want to do this?


  1.  Latest greatest IOS
  2.  Feature packed (supports VXLAN, OTV, MPLS, VPLS, and a ton more)
  3. What do you want from me? Those first two should be enough.


Video at the bottom for those too impatient to read (like me). First thing you're going to need: a fresh ISO of the CSR1000v, available from cisco.com. I'm using "csr1000v-universalk9.03.14.01.S.155-1.S1-std.iso", but depending when you read this your mileage may vary. After you have that, I think it goes without saying that you'll need GNS3 installed, and VirtualBox.

VirtualBox
In VirtualBox, create a new VM. Set the OS type to Linux and version to Other Linux (64-bit). Give it at least 2.5 GB of RAM (2560 MB), but if you can spare it 4GB is better. All default values after that are totally fine, you won't need more than the standard 8GB hard drive.

Now open the settings for your new CSR virtual machine, and on Serial Ports check port one's "Enable Serial Port". Leave it disconnected; GNS3 will handle the host pipe end (if that doesn't make sense to you, then just know that GNS3 will do some magic and you'll have a working console port).



Next up, go to storage and set the CD drive to point to the ISO you downloaded from Cisco (or from the wild wild internet). When all is said and done your VM should have settings similar to this
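For anyone who prefers scripting over clicking, here's a rough Python sketch that builds roughly equivalent VBoxManage commands for the steps above. The VM name, file paths, and function name are hypothetical, and I'm assuming a VirtualBox release that has "createmedium" (older releases call it "createhd"). It only builds and prints the commands, it doesn't run them.

```python
import shlex


def csr_vbox_commands(vm_name, iso_path, disk_path, memory_mb=2560):
    """Return VBoxManage commands (as argument lists) for a CSR1000v VM."""
    vbm = "VBoxManage"
    return [
        # "Linux_64" is the OS type ID for Other Linux (64-bit).
        [vbm, "createvm", "--name", vm_name, "--ostype", "Linux_64", "--register"],
        [vbm, "modifyvm", vm_name, "--memory", str(memory_mb)],
        # Enable serial port 1 (COM1) but leave it disconnected for GNS3.
        [vbm, "modifyvm", vm_name, "--uart1", "0x3F8", "4",
         "--uartmode1", "disconnected"],
        # Standard 8GB disk (size is given in MB).
        [vbm, "createmedium", "disk", "--filename", disk_path, "--size", "8192"],
        [vbm, "storagectl", vm_name, "--name", "IDE", "--add", "ide"],
        [vbm, "storageattach", vm_name, "--storagectl", "IDE", "--port", "0",
         "--device", "0", "--type", "hdd", "--medium", disk_path],
        # Point the CD drive at the CSR1000v ISO.
        [vbm, "storageattach", vm_name, "--storagectl", "IDE", "--port", "1",
         "--device", "0", "--type", "dvddrive", "--medium", iso_path],
    ]


# Hypothetical names and paths, purely for illustration.
for cmd in csr_vbox_commands("CSR1KV", "/tmp/csr1000v.iso", "/tmp/CSR1KV.vdi"):
    print(" ".join(shlex.quote(arg) for arg in cmd))
```

If you can spare the RAM, pass memory_mb=4096 instead of the 2560 default.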




Alright, now we can boot it up. You'll see a dialog box asking if you want to (a) auto detect console type (b) use vga or (c) use serial. For now, auto detect is just fine. This process can take a few minutes, IOS-XE will do a nice and neat auto installation and reboot. After a reboot you'll come to a familiar IOS prompt. Last thing we want to do, go into global configuration and set the console port to be serial only "platform console serial". After that, exit global config, wr mem and shut the vm down.

GNS3


Open up GNS3, go to File->Preferences->VirtualBox VM templates. From here you're going to click "new". From the drop down VM list, select your CSR1KV and check the box "Use as a linked base VM (experimental)". That last step is huge: what it means is you can dynamically create CSRs without being limited to how many are available in VirtualBox. I.e., you create (1) CSR1KV in VBOX... and that's it. GNS3 will handle dynamically cloning routers for you, giving you a very dynamips-like experience.



Now click "Finish". 

From here, I like to first right click on my newly added CSR1KV and select "change symbol". Then I select a nice router icon and set the category to "Routers". This will make it so your CSR1KV shows up with all your dynamips routers instead of as a vbox guest.







Finally, add some additional network adapters to your new VM by clicking on "edit" and going to "Network". I run mine with 5 NICs each. That's it! You're all set. Final note: linked base guests can only be run in saved projects. That means you have to create a new project (can't just use a temporary work space) to add your VMs. No big deal really, but if you forget GNS3 will give you a friendly little error/reminder.


See my video on this process below for those of you who don't like reading!