Our first lesson for today was on multitenancy, or setting up the UCS for multiple administrative users. The first step to delegating authority and setting up Role Based Access Controls (RBAC) is to create an organizational chart. This organization doesn’t need to match any company hierarchy, but it should reflect how you plan to delegate authority (not unlike approaching an Active Directory design). Locales are created at a global level to group multiple organizational nodes together (no default locales exist). Users are also created at a global level, and can be either local or from an external directory (e.g. LDAP, RADIUS). Users are then assigned one or more roles (several built-in roles are available) and one or more locales. The roles assigned to a user apply across all the organizational nodes in every locale the user is assigned to. This means a user can’t have the server-equipment role in one locale and the server-security role in another.
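To make the role and locale relationship concrete, here is a minimal Python sketch of how I understand it. This is only a toy model, not anything from UCSM itself: the class names, the `has_role_in` helper, and the org paths are my own assumptions for illustration. Only the two role names come from the class.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Locale:
    name: str
    orgs: frozenset  # organizational nodes grouped by this locale


@dataclass
class User:
    name: str
    roles: set = field(default_factory=set)
    locales: list = field(default_factory=list)

    def has_role_in(self, role: str, org: str) -> bool:
        # Roles are not scoped per locale: every role the user holds
        # applies in every org covered by any of the user's locales.
        in_scope = any(org in loc.orgs for loc in self.locales)
        return in_scope and role in self.roles


# Hypothetical org paths, made up for this example.
east = Locale("east", frozenset({"org-root/org-East"}))
west = Locale("west", frozenset({"org-root/org-West"}))
alice = User("alice", {"server-equipment", "server-security"}, [east, west])
```

With this model, `alice` holds both roles in both locales. There is no way to express "server-equipment in east only", which is exactly the limitation described above.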
When creating and tying together all these pieces, keep in mind these design factors:
- Policies can be used in a Service Profile from a higher node (toward the root), but if the same names are used at multiple levels, only the closest of these policies will be available
- When pools (server or identity) are exhausted at a leaf, then a service profile will attempt to use resources from a pool with the same name in the next node up (all the way to the root)
- While Server Pools can be created at any level of the organization, servers themselves neither exist within nor are affected by the organization
- When adding a node into a locale, all dependent nodes will be included in the locale
- There is no blocking of inheritance for any elements (including organizational nodes within a locale)
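The name resolution rules in the list above can be sketched in a few lines of Python. This is my own simplified model of the behavior as it was explained to us, not UCSM code: the `Org` class, the function names, the org names, and the MAC address are all hypothetical.

```python
class Org:
    """One node in the organizational tree (hypothetical model)."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.policies = {}  # policy name -> settings
        self.pools = {}     # pool name -> list of free items
        if parent is not None:
            parent.children.append(self)

    def path_to_root(self):
        # Yield this node, then each ancestor up to the root.
        node = self
        while node is not None:
            yield node
            node = node.parent


def resolve_policy(org, name):
    # The closest policy with a matching name wins.
    for node in org.path_to_root():
        if name in node.policies:
            return node.name, node.policies[name]
    return None


def allocate(org, pool_name):
    # Skip missing or exhausted pools and try the same name one level up.
    for node in org.path_to_root():
        free = node.pools.get(pool_name)
        if free:
            return node.name, free.pop(0)
    return None


def locale_scope(org):
    # Adding a node to a locale pulls in every node beneath it.
    scope = [org.name]
    for child in org.children:
        scope.extend(locale_scope(child))
    return scope


root = Org("root")
root.policies["bios"] = "root-bios"
root.pools["mac"] = ["00:25:B5:00:00:01"]
eng = Org("eng", root)
eng.policies["bios"] = "eng-bios"
eng.pools["mac"] = []  # exhausted pool at this level
lab = Org("lab", eng)
```

Here a service profile in `lab` asking for the `bios` policy gets the one defined in `eng`, and the copy at `root` is shadowed. A request against the `mac` pool falls through the empty pool in `eng` and is served from `root`. Putting `eng` in a locale also brings in `lab`, with no way to block it.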
These elements, particularly the last two, sparked a series of debates that lasted the entire morning. It got so deep that we had one of the developers of the RBAC and one of the product managers at the front of the room using the whiteboard to explain how it works and answering some pretty difficult questions. We also discussed a few elements in the UCSM GUI that did not seem to fully match what the underlying logic was actually doing. I really enjoyed the deep two-way discussion this allowed us to have with Cisco engineers & programmers. This kind of interaction makes having to travel all the way to San Jose to do the training in Cisco’s office (where most of the UCS team sits) completely worthwhile.
After lunch we dug into maintaining the UCS, starting with performing backups and restores. Backups can be one of the following:
- Config-system – All information pertaining to authentication, authorization and accounting (AAA)
- Config-logical – All configuration details not associated with AAA
- Config-all – everything in the config-system and config-logical
- Full-state – The complete DR backup
Currently, backups cannot be scheduled through the GUI, but they could be set up as a cron job through the CLI.
The final section we covered today was upgrading UCS firmware. Firmware updates are available as individual downloads, but my impression was that Cisco would prefer that you download the entire bundle, which contains firmware for all components, to keep versions consistent across the domain. The recommended update order is the following (this may be an updated list for previous UCS Boot Camp attendees): Mezzanine cards, BMC, BIOS, CMC, Fabric Interconnects, LSI/SCSI controllers. Just like in the HP c-Class (maybe even more so), all the components work together to implement a solution, so the firmware versions need to stay in sync to some extent.
Some of the updates could be disruptive to the system, so release notes should be used to determine which ones could cause outages. If a Fabric Interconnect or IOM outage would be necessary, we were told that the hardware failover within the Palo & Menlo cards will take about 5 milliseconds.
All firmware versions of all components loaded into UCSM will be maintained in a library, but the individual components will store up to two versions each: a backup version (the previously running version) and a startup version (the version to be loaded on next boot). There will also be a version listed for the current running state, which could be different from these two.
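One way to picture the three version fields is the small Python model below. It is a deliberately simplified sketch of what was described in class, not how UCSM is actually implemented: the class, its method names, and the version strings are all made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ComponentFirmware:
    running: str                  # version currently executing
    startup: str                  # version to load on next boot
    backup: Optional[str] = None  # previously running version

    def stage(self, version: str) -> None:
        # Staging only changes what loads next; running is untouched,
        # so running and startup can legitimately differ.
        self.startup = version

    def reboot(self) -> None:
        # On boot the startup version becomes the running version and
        # the old running version is kept as the backup.
        if self.startup != self.running:
            self.backup = self.running
            self.running = self.startup


# Hypothetical version strings.
fw = ComponentFirmware(running="1.0(1)", startup="1.0(1)")
fw.stage("1.0(2)")
```

After `stage`, the component is still running `1.0(1)` while `1.0(2)` waits as the startup version. Calling `fw.reboot()` would then promote `1.0(2)` to running and keep `1.0(1)` as the backup.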
The firmware piece felt pretty well screwed together, without any major gotchas or concerns. However, it can be a bit intimidating given the number of components that have updatable firmware, especially when you consider how many enclosures the domain can scale to.
Overall, I find the UCSM GUI to be not as well refined as the hardware itself. There seemed to be several vestigial GUI elements and other elements that didn’t fully reflect the functions they are providing behind the GUI. You are also unable to modify many elements once they have been created. Some refinement is needed, no doubt, but there is a lot of power in there too.
Miscellaneous things learned today:
- Do not use the “Recover Corrupt BIOS Firmware” functionality. Apparently it has caused major problems in the past. Instructors were not quite sure what it was intended to do in the first place.
- When using the CLI, you must always commit a change after executing the command. Another apparent holdover from switch CLI.
- The erase command on the CLI only wipes out the local Fabric Interconnect, so you’d have to run it on both devices to completely wipe out the entire UCS domain.
- UCS blades do not have a current option for embedded ESXi (as far as I have been told).
- I got confirmation that there is a promotion where, if you buy VMware licenses and a UCS, you will receive the Nexus 1000v for free. No one knew any details on the promotion, though.
One clarification from yesterday’s post, based on a question by Rodos: UCSM does contain some logic and code from BMC, but the GUI and the rest of the code (probably a majority) was created by Cisco. There is a separate SKU that is a full OEM version of BladeLogic to provide the “Manager of managers” functionality.