Today we started with a brief discussion around Cisco’s C-Series
rack mount servers. They are based on
the technology used in the B-Series blades, including the Fabric Interconnects
and presumably (this is an educated guess on my part) the entire Service
Profile model. They will have an
optional CNA (converged network adapter) called Monterey Park, which will provide the same functionality as
the Palo adapter in the blades (multiple virtual adapters from a single physical
port). There will be three initial rack
mount offerings: Los Angeles (3U server equivalent to the B250 full-width blade,
including the Catalina chipset) and the San Diego 1U and 2U servers (based on
the B200 half-width blades). Squeezing
as much life as possible out of Catalina seems to be the big driver for introducing rack mount
servers today, but they do fill the small gap that blade servers don’t cover
(small businesses with fewer than 4 servers).
We also discussed the five key value propositions for the
UCS:
- Virtualization – Server hardware enhancements to
extend the virtualization capability of a server using technologies like Palo
and Catalina
- Scalability – Within a single server (memory and
NICs) or within the management software (up to 40 chassis & 320 blades
through one pane of glass)
- Management – Single point of management for all
aspects (server, network, storage, chassis) and multiple modes of access to the
management (GUI, console, API)
- Statelessness – Complete decoupling of the identity
of the server from the hardware (Service Profiles)
- Converged Fabric – I think we all know what this
one means by now: FCoE running at 10+ Gbps
One limitation that was brought up with the 6100 Fabric
Interconnects (FI) is the fact that today there are no 1GbE uplinks to existing
Ethernet networks, which requires a company to upgrade or replace switches as
part of a UCS implementation. This is
not necessarily a cheap thing for a customer to do. It seems like this may be a software
limitation and may be relatively easy to correct.
After all the product talk, we finally got to our first lab
where we got to play in the GUI and console interfaces for the UCS Manager
(UCSM). Cisco took an interesting
approach by having the UCSM GUI be deployed as a downloadable Java application. Though it is a very powerful tool, I have to
admit I like the HP C-class’s web-only interface better.
I didn’t take any screenshots (I’ll remember to do so
tomorrow), so there’s not a lot to share regarding the interface. One common theme that came up several times
throughout the day was the fact that the tab used to contain all the Service
Profiles is called “Server” which is very confusing. The word server indicates a physical asset
(body), not the Service Profile (soul).
Given the focus on the Service Profile within UCS, I’d have thought this
would’ve been an easy design element.
Another thing I noticed is that many settings within UCSM
are saved, but then must be committed as a separate step. This process reduces the potential for
error, and I believe it has worked well for many years in the networking
realm. However, it may feel a bit
tedious to many server admins. It could
also be a source of error, since each chassis needs to be committed separately
and this could easily be overlooked.
I recommend checking out Rodos’s and Rich
Brambley’s posts, which include screenshots of the GUI.
One of the more powerful features of the UCSM is the complexity
masking that occurs with the GUI. For
example, when you add a VLAN within UCSM GUI, it is a simple one screen
operation. Behind the GUI, however, a
couple of tasks are completed. The VLAN
is added to both FIs and is applied to all uplinks when running in automatic End
Host Mode. Similar functionality exists
when creating Service Profiles.
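To make the fan-out concrete, here is a minimal Python sketch of what that one-screen VLAN operation appears to do behind the scenes. This is only a toy model of my understanding, not UCSM code: the FabricInterconnect class, the add_vlan function and the port names are all made up for illustration.

```python
class FabricInterconnect:
    """Hypothetical stand-in for one 6100 FI (not a real UCSM object)."""

    def __init__(self, name, uplinks):
        self.name = name
        self.vlans = set()
        # VLANs allowed on each uplink, keyed by uplink port name
        self.uplink_vlans = {uplink: set() for uplink in uplinks}


def add_vlan(fabric_interconnects, vlan_id):
    """Model the single GUI action: define the VLAN on every FI
    and allow it on every uplink of that FI."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError(f"VLAN ID out of range: {vlan_id}")
    for fi in fabric_interconnects:
        fi.vlans.add(vlan_id)
        for allowed in fi.uplink_vlans.values():
            allowed.add(vlan_id)


# Two FIs with two uplinks each (port names are invented)
fi_a = FabricInterconnect("FI-A", ["Eth1/1", "Eth1/2"])
fi_b = FabricInterconnect("FI-B", ["Eth1/1", "Eth1/2"])

add_vlan([fi_a, fi_b], 100)
print(fi_a.uplink_vlans)
print(fi_b.uplink_vlans)
```

One call touches four uplinks across two devices, which is the kind of repetitive work the GUI is hiding.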
After the lab, we hit the whiteboard again to discuss FCoE
in more detail. If you don’t currently have
a good grasp of how Ethernet and Fibre Channel work to begin with, I’d suggest
doing some studying. If you want to know
how FCoE really works, you’re going to need to know how those two protocols
work. The good news is you may not need
to know it all that deeply. The UCSM GUI
seems to do a good job of hiding a lot of the intricacies of the protocol and
its implementation between the different components of UCS. This knowledge will still be crucial for
anyone architecting a complete FCoE solution.
I’ll try and tackle some of the basics. FCoE is made up of two parts: data, which is
SCSI commands inside Fibre Channel frames, and control information, which uses the
FCoE Initialization Protocol (FIP). FIP
is used to assign and maintain FCoE MAC addresses. There is a process for setting this MAC
address that is very similar to the process that DHCP uses. In fact, they’re similar enough that I won’t
take it any further than that.
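For anyone who does want one small step further, here is a hedged Python sketch of how a fabric-provided MAC address (FPMA) is put together. This goes beyond what we covered in class: as I understand the FC-BB-5 standard, the switch hands out a 24-bit Fibre Channel ID at fabric login, and the FCoE MAC is the 24-bit FC-MAP prefix (0E:FC:00 by default) followed by that ID. The fpma function name and the sample FC_ID are my own inventions.

```python
FC_MAP_DEFAULT = 0x0EFC00  # default FC-MAP prefix per FC-BB-5


def fpma(fc_id, fc_map=FC_MAP_DEFAULT):
    """Build a fabric-provided MAC address: 24-bit FC-MAP + 24-bit FC_ID."""
    if not 0 <= fc_id <= 0xFFFFFF:
        raise ValueError("FC_ID must fit in 24 bits")
    if not 0 <= fc_map <= 0xFFFFFF:
        raise ValueError("FC-MAP must fit in 24 bits")
    mac = (fc_map << 24) | fc_id
    # Format the 48-bit value as six colon-separated hex octets
    return ":".join(f"{(mac >> shift) & 0xFF:02x}" for shift in range(40, -8, -8))


# Hypothetical FC_ID 01.02.03 assigned by the fabric at login
print(fpma(0x010203))  # 0e:fc:00:01:02:03
```

The DHCP comparison still holds: the end node asks, and the fabric answers with the address to use.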
One thing I came away with is the accuracy of my drawing
from yesterday’s post (presented again here for discussion):
Figure 1: FCoE Switching (again)
I have not modified this drawing, since it is still theoretically
possible (and in my opinion very probable in the near term). What isn’t 100% correct for today is the hop
between the vE-Ports. This is considered
multi-hop FCoE communication, which is not currently supported. This is where things start getting into a
nuts and bolts discussion. Take what I’ve
posted yesterday and today and then go read Scott
Lowe’s post on the same topic. That
should get you a pretty good feel for the situation. If you survive all that, then you may be
brave enough to read through http://fcoe.com/.
You could make this picture work by simply sending the
communication through native Fibre Channel between the two Nexus 5000 switches
instead of via FCoE. This is something
that will no doubt be fixed at some point in the future.
VNTag is another important piece of the UCS puzzle. I don’t fully understand how it works, so I
won’t dig too deep into it here for fear of providing incorrect
information. What I do understand is that it
is a wrapper that exists only within the UCS and is used to route packets
through the different components (FI, IOM, adapters and optionally the
hypervisor).
Another related technology is VNLink, which I again don’t
fully understand. One function VNLink
does provide is the ability to shut down downstream ports if there is a
connection loss. HP provides similar
functionality with SmartLink.
VNTag and VNLink are two technologies that do not need to be
fully understood in order to manage a UCS.
Once again, the UCSM GUI does a great job of hiding this complexity
behind a relatively simple interface.
The 6100 FI can operate in two different modes, which can be
chosen during configuration of the devices:
- Ethernet Switching + NPV – This mode offers
full Ethernet switching, but all Fibre Channel storage connections are pinned directly to
NPV ports, with no Fibre Channel switching. This is part of the reason why FCoE outside
of the UCS is not possible.
- End Host Mode – This mode pins vEth/vHBA ports
to specific uplinks. The pinning can be
done in static or dynamic mode. Static
mode is mostly self-explanatory. Dynamic
mode requires that all uplinks be trunks that allow the same set of
VLANs/VSANs. Traffic is then routed out
in a round-robin style.
End Host Mode is the default and least complicated approach,
but Ethernet Switching + NPV allows for more control over your uplink
usage. In the future a third mode will
be possible: Ethernet Switching + FC Switching, which should enable FCoE to be
routed outside of the UCS.
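Here is a rough Python sketch of how I picture the two End Host Mode pinning styles. It is a toy model under my own assumptions (the function names and port labels are invented, and real pinning may well be smarter than a plain round robin), but it shows the difference between an admin-supplied map and automatic distribution.

```python
from itertools import cycle


def pin_dynamic(server_ports, uplinks):
    """Dynamic pinning: hand out uplinks to server-side ports in round-robin order."""
    if not uplinks:
        raise ValueError("at least one uplink is required")
    next_uplink = cycle(uplinks)
    return {port: next(next_uplink) for port in server_ports}


def pin_static(server_ports, uplinks, pin_map):
    """Static pinning: every server-side port must be explicitly mapped to an uplink."""
    pins = {}
    for port in server_ports:
        target = pin_map.get(port)
        if target is None:
            raise ValueError(f"{port} has no static pin defined")
        if target not in uplinks:
            raise ValueError(f"{port} is pinned to unknown uplink {target}")
        pins[port] = target
    return pins


uplinks = ["uplink-1", "uplink-2"]
ports = ["vEth1", "vEth2", "vHBA1"]

print(pin_dynamic(ports, uplinks))
print(pin_static(ports, uplinks, {"vEth1": "uplink-2",
                                  "vEth2": "uplink-2",
                                  "vHBA1": "uplink-1"}))
```

The dynamic version only works cleanly if any port can land on any uplink, which is why all uplinks need to carry the same VLANs/VSANs.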
One odd setting we found today was a global policy that
defined the number of server connections out of the FI: 1, 2 or 4. It can only be set at the top of the
Equipment tree and cannot be defined at a lower level. We debated the merits of such a policy to no
definitive resolution. Later in the day
one of our instructors came back with the answer. This policy defines when an automatic
discovery should occur: when the 1st, 2nd or 4th cable
is connected between the FI and IOM. I
guess a quick response from knowledgeable people is one of the many advantages
to having the boot camp in one of Cisco’s main buildings.
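As I understood the instructor's explanation, the policy boils down to a simple threshold check, sketched below in Python. The should_discover function is hypothetical and only mirrors the explanation we were given; it is not how UCSM exposes the setting.

```python
VALID_LINK_POLICIES = (1, 2, 4)  # the only values the global policy offers


def should_discover(policy_links, connected_links):
    """Return True once enough FI-to-IOM cables are connected to trigger discovery."""
    if policy_links not in VALID_LINK_POLICIES:
        raise ValueError(f"policy must be one of {VALID_LINK_POLICIES}")
    return connected_links >= policy_links


# With a 2-link policy, the first cable alone does not trigger discovery
print(should_discover(2, 1))  # False
print(should_discover(2, 2))  # True
```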
The final section for the day introduced us to the Service
Profile. The configuration settings in
the Service Profile are what allow the mobility of a server from one blade to
another. For example, if you’re booting
from a SAN LUN, then that LUN is probably attached to a WWN on an HBA. The Service Profile will mask the factory WWN
and use the WWN it has defined.
Therefore, whatever blade the Service Profile is attached to will assume
this WWN and will boot off the related SAN LUN.
Given the nature of these types of settings, a Service Profile can only
be attached to a single blade at a time.
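The WWN example can be modeled in a few lines of Python. This is only a sketch of the concept as I understand it: the Blade and ServiceProfile classes, the profile name and every WWN below are made up for illustration, and a real Service Profile carries far more than a single WWN.

```python
class Blade:
    """A physical blade with a burned-in (factory) WWN."""

    def __init__(self, slot, factory_wwn):
        self.slot = slot
        self.factory_wwn = factory_wwn
        self.profile = None

    @property
    def effective_wwn(self):
        # The profile's WWN masks the factory WWN while a profile is attached
        return self.profile.wwn if self.profile else self.factory_wwn


class ServiceProfile:
    """The identity (here just a WWN) that moves from blade to blade."""

    def __init__(self, name, wwn):
        self.name = name
        self.wwn = wwn
        self.blade = None

    def associate(self, blade):
        if self.blade is not None:
            raise RuntimeError(f"{self.name} is already attached to {self.blade.slot}")
        if blade.profile is not None:
            raise RuntimeError(f"{blade.slot} already has a profile")
        self.blade = blade
        blade.profile = self

    def disassociate(self):
        if self.blade is not None:
            self.blade.profile = None
            self.blade = None


profile = ServiceProfile("esx-host-01", "20:00:00:25:b5:00:00:01")
blade_1 = Blade("chassis-1/slot-1", "20:00:00:00:c9:aa:aa:01")
blade_2 = Blade("chassis-1/slot-2", "20:00:00:00:c9:bb:bb:02")

profile.associate(blade_1)
print(blade_1.effective_wwn)  # the profile's WWN, not the factory one

profile.disassociate()
profile.associate(blade_2)
print(blade_2.effective_wwn)  # same WWN, different hardware
print(blade_1.effective_wwn)  # back to its factory WWN
```

Since the SAN only ever sees the profile's WWN, the boot LUN follows the profile to whichever blade it lands on.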
We had an interesting debate as to whether the Service
Profile represents the Operating System or the hardware. Here are the two strongest opposing arguments:
- You have 320 blades and 500 OS instances defined
on the SAN. In order to define these OS
instances, you’d need 500 Service Profiles.
This would indicate that the Service Profile is representative of the
OS.
- The Service Profile defines hardware configuration
details, including the BIOS settings and firmware versions; therefore the Service
Profile is representative of a hardware configuration.
I found myself somewhere in the middle. To me, the Service Profile exists not to
represent either the OS or the HW, but to join the two together. In a UCS system the blades and the OS (and
the network connectivity) are each useless without a Service Profile to tie them
all together. Ultimately, I think it’s
greater than any of these views and contains elements of each.
We didn’t have time to finish the Service Profile section,
so you’ll have to wait until tomorrow for the rest (just like me!).
Toward the end of the day, I received a Direct Message on
Twitter from Rodos asking me to help him
validate the existence of an error he discovered in the UCS Boot Camp
courseware. If you’ve been to UCS Boot
Camp, you should check out the details here: http://rodos.haywood.org/2009/10/error-on-cisco-ucs-pinning-training.html
Great catch, Rodos; glad I could help out.