UCS Boot Camp – Day 2

Today we started with a brief discussion around Cisco’s C-Series rack mount servers.  They are based on the technology used in the B-Series blades, including the Fabric Interconnects and presumably (this is an educated guess on my part) the entire Service Profile model.  They will have an optional CNA called Monterey Park, which will provide the same functionality as the Palo adapter in the blades (multiple virtual adapters from a single physical port).  There will be three initial rack mount offerings: Los Angeles (a 3U server equivalent to the B250 full-width blade, including the Catalina chipset) and the San Diego 1U and 2U servers (based on the B200 half-width blades).  Squeezing as much life as possible out of Catalina seems to be the big driver for introducing rack mount servers today, but they do fill the small gap that blade servers don’t cover (small businesses with <4 servers).

We also discussed the five key value propositions for the UCS:

  1. Virtualization – Server hardware enhancements to extend the virtualization capability of a server using technologies like Palo and Catalina
  2. Scalability – Within a single server (memory and NICs) or within the management software (up to 40 chassis & 320 blades through one pane of glass)
  3. Management – Single point of management for all aspects (server, network, storage, chassis) and multiple modes of access to the management (GUI, console, API)
  4. Statelessness – Complete decoupling of the identity of the server from the hardware (Service Profiles)
  5. Converged Fabric – I think we all know what this one means by now: FCoE running at 10+Gbps

Another limitation that was brought up with the 6100 Fabric Interconnects (FI) is the fact that today there are no 1GbE uplinks to existing Ethernet networks, which requires a company to upgrade or replace switches as part of a UCS implementation.  This is not necessarily a cheap thing for a customer to do.  It seems like this may be a software limitation and may be relatively easy to correct.

After all the product talk, we finally got to our first lab where we got to play in the GUI and console interfaces for the UCS Manager (UCSM).  Cisco took an interesting approach by having the UCSM GUI be deployed as a downloadable Java application.  Though it is a very powerful tool, I have to admit I like the HP C-class’s web-only interface better.

I didn’t take any screenshots (I’ll remember to do so tomorrow), so there’s not a lot to share regarding the interface.  One common theme that came up several times throughout the day was the fact that the tab used to contain all the Service Profiles is called “Server” which is very confusing.  The word server indicates a physical asset (body), not the Service Profile (soul).  Given the focus on the Service Profile within UCS, I’d have thought this would’ve been an easy design element.

Another thing I noticed is that many settings within UCSM are saved, but then must be committed as a separate step.  This process is meant to reduce the potential for error, and I believe it has worked well for many years in the networking realm.  However, it may seem a bit tedious to many server admins.  It could also introduce errors of its own, since each chassis needs to be committed separately and this could easily be overlooked.

I recommend checking out Rodos’s and Rich Brambley’s blogs for some good posts that include screenshots of the GUI.

One of the more powerful features of the UCSM is the complexity masking that occurs with the GUI.  For example, when you add a VLAN within the UCSM GUI, it is a simple one-screen operation.  Behind the GUI, however, a couple of tasks are completed.  The VLAN is added to both FIs and is applied to all uplinks when running in automatic End Host Mode.  Similar functionality exists when creating Service Profiles.
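To make the fan-out concrete, here is a minimal sketch of what a single "add VLAN" request might expand into.  This is purely an illustration of the idea described above, not how UCSM is implemented: the `FabricInterconnect` class, the `add_vlan` function, and the FI and uplink names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class FabricInterconnect:
    """Hypothetical stand-in for one FI; this is not a UCSM API object."""
    name: str
    uplinks: list
    vlans: set = field(default_factory=set)
    # Maps each uplink name to the set of VLAN IDs it carries.
    uplink_vlans: dict = field(default_factory=dict)


def add_vlan(fabric_interconnects, vlan_id):
    """Fan one 'add VLAN' request out to every FI and every uplink."""
    steps = []
    for fi in fabric_interconnects:
        fi.vlans.add(vlan_id)
        steps.append(f"{fi.name}: define VLAN {vlan_id}")
        for uplink in fi.uplinks:
            fi.uplink_vlans.setdefault(uplink, set()).add(vlan_id)
            steps.append(f"{fi.name}: allow VLAN {vlan_id} on {uplink}")
    return steps


# Two FIs with two uplinks each (made-up names).
fi_a = FabricInterconnect("FI-A", ["uplink-1", "uplink-2"])
fi_b = FabricInterconnect("FI-B", ["uplink-1", "uplink-2"])
steps = add_vlan([fi_a, fi_b], 100)
for step in steps:
    print(step)
```

With two FIs and two uplinks apiece, the one-screen operation turns into six individual steps, which is the kind of busywork the GUI hides.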

After the lab, we hit the whiteboard again to discuss FCoE in more detail.  If you don’t currently have a good grasp of how Ethernet and Fibre Channel work to begin with, I’d suggest doing some studying.  If you want to know how FCoE really works, you’re going to need to know how those two protocols work.  The good news is you may not need to know it all that deeply.  The UCSM GUI seems to do a good job of hiding a lot of the intricacies of the protocol and its implementation between the different components of UCS.  This knowledge will still be crucial for anyone architecting a complete FCoE solution.

I’ll try to tackle some of the basics.  FCoE is made up of two parts: data, which is SCSI commands inside Fibre Channel frames, and control information, which uses the FCoE Initialization Protocol (FIP).  FIP is used to assign and maintain FCoE MAC addresses.  There is a process for setting this MAC address that is very similar to the process that DHCP uses.  In fact, they’re similar enough that I won’t take it any further than that.

One thing I came away with is the accuracy of my drawing from yesterday’s post (presented again here for discussion):

UCS - Day 2 - Figure 1

Figure 1: FCoE Switching (again)

I have not modified this drawing, since it is still theoretically possible (and in my opinion very probable in the near term).  What isn’t 100% correct for today is the hop between the vE-Ports.  This is considered multi-hop FCoE communication, which is not currently supported.  This is where things start getting into a nuts and bolts discussion.  Take what I posted yesterday and today and then go read Scott Lowe’s post on the same topic.  That should give you a pretty good feel for the situation.  If you survive all that, then you may be brave enough to read through http://fcoe.com/.

You could make this picture work by simply sending the communication through native Fibre Channel between the two Nexus 5000 switches instead of via FCoE.  The lack of multi-hop FCoE is something that will no doubt be fixed at some point in the future.

VNTag is another important piece of the UCS puzzle.  I don’t fully understand how it works, so I won’t dig too deep into it here for fear of providing incorrect information.  What I do understand it to be is a wrapper that exists only in the UCS that is used to route packets through the different components (FI, IOM, Adapters and optionally the hypervisor).

Another related technology is VNLink, which I again don’t fully understand.  One function VNLink does provide is the ability to shut down downstream ports if there is a connection loss.  HP provides similar functionality with SmartLink.

VNTag and VNLink are two technologies that do not need to be fully understood in order to manage a UCS.  Once again, the UCSM GUI does a great job of hiding this complexity behind a relatively simple interface.

The 6100 FI can operate in two different modes, which can be chosen during configuration of the devices:

  • Ethernet Switching + NPV – This mode offers full Ethernet switching, but all Fibre Channel storage connections are pinned directly to NPV ports.  There is no Fibre Channel switching.  This is part of the reason why FCoE outside of the UCS is not possible.
  • End Host Mode – This mode pins vEth/vHBA ports to specific uplinks.  The pinning can be done in static or dynamic mode.  Static mode is mostly self-explanatory.  Dynamic mode requires that all uplinks be trunks that allow the same set of VLANs/VSANs.  Ports are then distributed across the uplinks in round-robin fashion.

End Host Mode is the default and least complicated approach, but Ethernet Switching + NPV allows for more control over your uplink usage.  In the future a third mode will be possible: Ethernet Switching + FC Switching, which should enable FCoE to be routed outside of the UCS.
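The difference between the two pinning styles in End Host Mode is easier to see in a few lines of code.  This is only a sketch of the concept as it was explained to us, assuming that every uplink carries the same VLANs (the stated requirement for dynamic mode).  The function names and the port and uplink names are my own inventions, not anything from UCSM.

```python
from itertools import cycle


def pin_dynamic(server_ports, uplinks):
    """Assign each server-facing port to an uplink in round-robin order."""
    if not uplinks:
        raise ValueError("at least one uplink is required")
    next_uplink = cycle(uplinks)
    return {port: next(next_uplink) for port in server_ports}


def pin_static(server_ports, pin_map):
    """Look up the administrator-defined uplink for each port."""
    return {port: pin_map[port] for port in server_ports}


# Hypothetical port and uplink names, for illustration only.
ports = ["vEth1", "vEth2", "vEth3", "vEth4", "vEth5"]
dynamic = pin_dynamic(ports, ["uplink-1", "uplink-2"])
static = pin_static(["vEth1", "vEth2"], {"vEth1": "uplink-2", "vEth2": "uplink-2"})
print(dynamic)
print(static)
```

In the dynamic case the ports simply alternate between the two uplinks, while the static case lets you deliberately stack two ports on the same uplink.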

One odd setting we found today was a global policy that defined the number of server connections out of the FI: 1, 2 or 4.  It can only be set at the top of the Equipment tree and cannot be defined at a lower level.  We debated the merits of such a policy to no definitive resolution.  Later in the day one of our instructors came back with the answer.  This policy defines when an automatic discovery should occur: when the 1st, 2nd or 4th cable is connected between the FI and IOM.  I guess a quick response from knowledgeable people is one of the many advantages to having the boot camp in one of Cisco’s main buildings.
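As I understood the instructor’s explanation, the policy boils down to a simple threshold check.  The sketch below is my own interpretation, with a hypothetical function name; in particular, treating "Nth cable connected" as "at least N links up" is an assumption on my part.

```python
ALLOWED_LINK_COUNTS = (1, 2, 4)


def discovery_triggered(connected_links, policy_links):
    """Return True once enough FI-to-IOM cables are connected.

    Assumption: discovery fires when the number of connected links
    reaches the policy value (1, 2 or 4).
    """
    if policy_links not in ALLOWED_LINK_COUNTS:
        raise ValueError(f"policy must be one of {ALLOWED_LINK_COUNTS}")
    return connected_links >= policy_links


# With a 2-link policy, the first cable alone does not trigger discovery.
print(discovery_triggered(1, 2))
print(discovery_triggered(2, 2))
```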

The final section for the day introduced us to the Service Profile.  The configuration settings in the Service Profile are what allow the mobility of a server from one blade to another.  For example, if you’re booting from a SAN LUN, then that LUN is probably attached to a WWN on an HBA.  The Service Profile will mask the factory WWN and use the WWN it has defined.  Therefore, whatever blade the Service Profile is attached to will assume this WWN and will boot off the related SAN LUN.  Given the nature of these types of settings, a Service Profile can only be attached to a single blade at a time.
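Here is a rough model of the WWN example, just to show the two rules in play: the blade presents the profile’s WWN instead of its factory one, and a profile can be on only one blade at a time.  The class names, method names and WWN values are all made up for illustration and have nothing to do with the actual UCSM object model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Blade:
    slot: str
    factory_wwn: str
    profile: Optional["ServiceProfile"] = field(default=None, repr=False, compare=False)

    @property
    def effective_wwn(self):
        """WWN the SAN sees: the profile's if one is attached, else factory."""
        return self.profile.wwn if self.profile else self.factory_wwn


@dataclass
class ServiceProfile:
    name: str
    wwn: str
    blade: Optional[Blade] = field(default=None, repr=False, compare=False)

    def associate(self, blade):
        """Attach this profile to a blade, enforcing one-to-one association."""
        if self.blade is not None:
            raise RuntimeError(f"{self.name} is already on {self.blade.slot}")
        if blade.profile is not None:
            raise RuntimeError(f"{blade.slot} already has {blade.profile.name}")
        self.blade = blade
        blade.profile = self

    def disassociate(self):
        """Detach the profile; the blade falls back to its factory WWN."""
        if self.blade is not None:
            self.blade.profile = None
            self.blade = None


# Made-up WWNs; the SAN LUN would be masked to the profile's WWN.
profile = ServiceProfile("web-01", "20:00:00:25:b5:00:00:01")
blade_1 = Blade("chassis-1/blade-1", "20:00:00:00:c9:aa:aa:01")
blade_2 = Blade("chassis-1/blade-2", "20:00:00:00:c9:bb:bb:02")

profile.associate(blade_1)
print(blade_1.effective_wwn)  # the profile's WWN, not the factory one
profile.disassociate()
profile.associate(blade_2)
print(blade_2.effective_wwn)  # same WWN, now presented by a different blade
```

Because the LUN follows the profile’s WWN rather than the hardware, moving the profile from one blade to the other is all it takes to move the boot disk.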

We had an interesting debate as to whether the Service Profile represents the Operating System or the hardware.  Here are the two strongest opposing arguments:

  • You have 320 blades and 500 OS instances defined on the SAN.  In order to define these OS instances, you’d need 500 Service Profiles.  This would indicate that the Service Profile is representative of the OS.
  • The Service Profile defines hardware configuration details, including the BIOS settings and firmware versions; therefore the Service Profile is representative of a hardware configuration.

I found myself somewhere in the middle.  To me, the Service Profile exists not to represent either the OS or the HW, but to join the two together.  In a UCS system the blades and the OS (and the network connectivity) are each useless without a Service Profile to tie them all together.  Ultimately, I think it’s greater than any of these views and contains elements of each.

We didn’t have time to finish the Service Profile section, so you’ll have to wait until tomorrow for the rest (just like me!).

Toward the end of the day, I received a Direct Message on Twitter from Rodos asking me to help him validate the existence of an error he discovered in the UCS Boot Camp courseware.  If you’ve been to UCS Boot Camp, you should check out the details here: http://rodos.haywood.org/2009/10/error-on-cisco-ucs-pinning-training.html  Great catch, Rodos.  Glad I could help out.