Integrating a Cell-based system into ROCKS

After successfully installing Fedora 9 on a Cell-based system (a Mercury 1U dual Cell blade system), we had to integrate it into a ROCKS cluster.

ROCKS sends the appropriate kernel image by looking at the vendor-class-identifier information. The current DHCP configuration file supports only IA64 (EFI), x86_64, x86 and, of course, network switches. Although ROCKS no longer supports IA64 (Itanium), the code is still there.

The first task is to add the Cell system to the ROCKS database. We decided to add the node as a “Remote Management” appliance rather than as a compute node. Adding it as a compute node would modify the configuration files for SGE or PBS, and the node would always show up with a “down” status. To add it as a “Remote Management” appliance, execute the following command:

insert-ethers --mac <give your mac id here>

When the insert-ethers UI shows up, select “Remote Management” and hit OK. You may also choose to provide your own hostname using the “--hostname” option.

The next task is to identify the vendor class identifier for the Cell system. After a quick test, it was determined that the system sent no vendor class identifier. Since we were dealing with only one system, the best option was to match the MAC address of the system with the following elsif block:

        } elsif ((binary-to-ascii(16,8,":",substring(hardware,1,6))="0:1a:64:e:2a:94")) {
                # Cell blade System
                filename "cellbe.img";
                next-server 10.1.1.1;
        }
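
A note on the match string: binary-to-ascii with base 16 prints each octet in lowercase hex without zero padding, which is why the block compares against "0:1a:64:e:2a:94" rather than "00:1a:64:0e:2a:94". A minimal shell sketch for producing that form from a conventionally written MAC address follows; the function name dhcpd_mac is made up for illustration and is not part of ROCKS or dhcpd.

```shell
#!/bin/sh
# dhcpd_mac MAC
# Print MAC the way binary-to-ascii(16, 8, ":", ...) renders it:
# lowercase hex, with the leading zero of each octet dropped.
# (Hypothetical helper; expects colon-separated input.)
dhcpd_mac() {
    printf '%s\n' "$1" |
        tr 'A-F' 'a-f' |
        sed -e 's/^0\([0-9a-f]\)/\1/' -e 's/:0\([0-9a-f]\)/:\1/g'
}

dhcpd_mac 00:1A:64:0E:2A:94   # prints 0:1a:64:e:2a:94
```

The printed string can be pasted directly into the comparison in the elsif condition.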

“cellbe.img” is the kernel image for the Cell system. This has to be copied to “/tftpboot/pxelinux/”.

These changes will be lost if dhcpd.conf is overwritten, which happens every time you execute insert-ethers or use

dbreport dhcpd

to overwrite the file.

You could generate a patch file and patch dhcpd.conf as needed, or you could edit

/opt/rocks/lib/python2.4/site-packages/rocks/reports/dhcpd.py

to include the new elsif block every time the file is generated.
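
The patch-file option could look roughly like the sketch below. The function names and file arguments are illustrative assumptions; it relies only on the standard diff and patch tools, and it assumes the lines surrounding the elsif block are unchanged in the regenerated file, otherwise patch will reject the hunk.

```shell
#!/bin/sh
# Sketch: keep the hand-edited elsif block as a patch and re-apply it
# after dhcpd.conf has been regenerated. Function names are hypothetical.

# make_dhcpd_patch GENERATED EDITED PATCHFILE
# Record the difference between the generated file and the edited copy.
make_dhcpd_patch() {
    # diff exits 1 when the files differ, which is the expected case here;
    # only a status above 1 signals a real error.
    diff -u "$1" "$2" > "$3" || [ $? -eq 1 ]
}

# apply_dhcpd_patch TARGET PATCHFILE
# Re-apply the saved change to a freshly generated dhcpd.conf.
apply_dhcpd_patch() {
    patch "$1" < "$2" > /dev/null
}
```

In practice you would save a copy of the generated dhcpd.conf before editing it, create the patch once, and call apply_dhcpd_patch after each insert-ethers or dbreport run, followed by a restart of dhcpd.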

If you see that your Cell system is trying to load

/install/sbin/kickstart.cgi

it means your dhcpd.conf file has been overwritten.
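
A quick way to check for this before rebooting the blade is to look for the Cell image entry in the configuration file. The sketch below is one way to do it; the function name is hypothetical, and the /etc/dhcpd.conf path in the usage comment is an assumption about where the frontend keeps the file.

```shell
#!/bin/sh
# cell_block_present FILE
# Succeed if FILE still contains the Cell image entry from the elsif block.
# (Hypothetical helper; matches the literal filename line only.)
cell_block_present() {
    grep -Fq 'filename "cellbe.img"' "$1"
}

# Possible use on the frontend (path is an assumption, adjust as needed):
#   cell_block_present /etc/dhcpd.conf || echo "dhcpd.conf was regenerated"
```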

References:

http://archives.devshed.com/forums/networking-100/cannot-see-the-offer-and-ack-packet-with-ethereal-2063723.html

http://forums.opensuse.org/network-internet/399162-dhcp-client-identifier-matching.html

http://osdir.com/ml/network.dhcp.isc.dhcp-server/2004-05/msg00037.html

Code block for dhcpd.conf to log the vendor class identifier and other useful information:

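               # Fields logged, tab-separated: client MAC address, requested
               # DHCP option codes, and the vendor class identifier
               # ("no-identifier" when the client sends none).
               log(info, concat("Debug Information:\t",
                   binary-to-ascii(16,8, ":", substring(hardware,1,6)),
                   "\t",
                   binary-to-ascii(10,8, "-", option dhcp-parameter-request-list),
                   "\t",
                   pick-first-value(option vendor-class-identifier,"no-identifier")
               ));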
