Archive for August, 2009

OFED 1.4 stack on RHEL 5.2

Friday, August 28th, 2009

 

I have been working with Infiniband since the first card came out from Topspin; my previous employer was a partner with Topspin for IB products. I had already worked with high-speed interconnects like Myrinet, Scali (Dolphin Wulfkit) and, of course, multiple versions of PARAMNet, among countless others. Many have come and gone, but Infiniband is here to stay.

Even with Cisco dropping out of Infiniband, strong support from QLogic, Voltaire and Mellanox will keep it going for a while. Cisco has no advantage with Infiniband; their core business is Ethernet, and they need to do whatever it takes to keep Ethernet the core interconnect for everything. That makes sense for Cisco, but HPC is not "everything"; it has never fit in the same category as everything else. The requirements of HPC interconnects are unique: low latency and high bandwidth are the heart and soul. Getting those two in a general-purpose network would be nice, but who would pay for something they don't need?

Coming to the main topic of this post: configuring ConnectX Infiniband on RHEL 5.2 x86_64 with OFED 1.4.

OFED is very well packaged and most of the time does not need additional work for installation. Here is the simple method (a command-line sketch of these steps follows the list):

  1. Download OFED.
  2. Extract the files (tar -zxvf OFED-x.y.tgz).
  3. Run the install script (install.pl).
  4. For a non-HPC installation, menu choices 2-1 will suffice; for an HPC-specific installation, choose 2-2 or 2-3. You are pretty safe choosing 2-3. If you choose 2-2, some Infiniband diagnostic utilities won't be installed; however, you will end up with HPC-specific packages like MPI.
  5. Make a note of the required packages; you can find almost all of them on the Red Hat disk. If you are registered with RHN, you can use yum to install them.
  6. At this point, the needed kernel modules (drivers & upper-level protocols) should be installed.
  7. The installer will ask if you would like to configure IPoIB (IP over Infiniband). Say Y if you plan to use IPoIB and provide the IP addresses; if not, say N.
  8. Issue a reboot command and, after the system reboots, check lsmod for the list of modules currently loaded.
  9. You should see a list of kernel modules with names starting with ib_ (ib_cm, ib_core, ib_umad, etc.).
  10. At this point, we can safely assume the drivers are loaded and the adapter is working. You can check the status of the installation using the diagnostics included with OFED. More on that below.
  11. You must have a working subnet manager for the Infiniband fabric to work. If you are using a managed switch like the QLogic 9024, it generally includes an embedded fabric management component. If you are using an entry-level switch without an embedded subnet manager, or you would like to run your own SM on a host system, you can use the OpenSM (Open Subnet Manager) component bundled with OFED. Start OpenSM using the command /etc/init.d/opensmd start. NOTE: Until you have a working subnet manager, the adapters will not be able to do any useful work.
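
Roughly, the whole sequence looks like this. It is only a sketch: the exact tarball and directory name depend on the OFED release you download, and the OpenSM step applies only if nothing else on the fabric is running a subnet manager.

# Extract the OFED bundle and run the interactive installer
tar -zxvf OFED-1.4.tgz
cd OFED-1.4
./install.pl

# After the reboot, confirm the IB kernel modules are loaded
lsmod | grep ib_

# Only if there is no embedded subnet manager on the switch:
/etc/init.d/opensmd start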

Diagnostics:

OFED comes with some basic diagnostic commands that can be used to test the status of the cards in your system. One of them is ibv_devinfo. This command prints the adapter status and attributes.

[root@localhost ~]# ibv_devinfo
hca_id: mlx4_0
        fw_ver:                         2.3.000
        node_guid:                      0030:48ff:ff95:d928
        sys_image_guid:                 0030:48ff:ff95:d92b
        vendor_id:                      0x02c9
        vendor_part_id:                 25418
        hw_ver:                         0xA0
        board_id:                       SM_1021000001
        phys_port_cnt:                  2
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 1
                        port_lid:               1
                        port_lmc:               0x00

                port:   2
                        state:                  PORT_DOWN (1)
                        max_mtu:                2048 (4)
                        active_mtu:             2048 (4)
                        sm_lid:                 0
                        port_lid:               0
                        port_lmc:               0x00

In the above output, check the port “state”. When you have a working subnet manager, it will show up as PORT_ACTIVE or PORT_UP. Without a working subnet manager, it will show up as PORT_INIT or POLLING.

The state is shown as PORT_DOWN when there is no cable connected to the port.
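
If you just want the port states without reading through the whole output, a quick filter over ibv_devinfo does the job (just a convenience one-liner, not an OFED tool):

ibv_devinfo | grep -E "port:|state:"

On the system above, this prints port 1 as PORT_ACTIVE and port 2 as PORT_DOWN.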

To list adapters in the system:

[root@localhost ~]# ibv_devices
    device                 node GUID
    ------              ----------------
    mlx4_0              003048ffff95d928

Once you have a working subnet manager and a port showing up as "PORT_ACTIVE" on at least two machines, you can test the fabric using the simple pingpong or send/receive test routines.

Start ibv_rc_pingpong on one machine.

Start ibv_rc_pingpong <host name or ip> on another machine; the hostname should be the name of the first machine, on which the command was already started.

If everything is working as it should, you should see the following output:

First host:

[root@localhost x86_64]# ibv_rc_pingpong
  local address:  LID 0x0002, QPN 0x00004a, PSN 0x43da29
  remote address: LID 0x0001, QPN 0x00004a, PSN 0x446364
8192000 bytes in 0.01 seconds = 6202.54 Mbit/sec
1000 iters in 0.01 seconds = 10.57 usec/iter

 

Second Host:

[root@localhost ~]# ibv_rc_pingpong 192.168.0.248
  local address:  LID 0x0001, QPN 0x00004a, PSN 0x446364
  remote address: LID 0x0002, QPN 0x00004a, PSN 0x43da29
8192000 bytes in 0.01 seconds = 6172.16 Mbit/sec
1000 iters in 0.01 seconds = 10.62 usec/iter

Depending on the type of card, cable, switch, OS, board chipset and PCI expansion slot you use, your bandwidth and latency will vary significantly. Also, this is only a functional test, not a test of best-case bandwidth and latency.

Other diagnostic tools:

  1. ibstat - displays IB device status such as firmware version, port state and GUIDs (similar to ibv_devinfo)
  2. ibnetdiscover - discovers the IB network topology
  3. ibhosts - shows the IB host nodes in the topology
  4. ibchecknet - runs IB network validation
  5. ibping - pings an IB address
  6. ibdatacounters - summarizes IB port data counters

and more. A quick sketch of how a few of these are typically used is below.
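
This is only a rough sketch; the LID given to ibping is a placeholder, and the remote node must be running ibping -S as a responder first.

# Local HCA status, similar to ibv_devinfo
ibstat

# Walk the fabric and list the hosts the subnet manager knows about
ibnetdiscover
ibhosts

# Validate the whole subnet
ibchecknet

# Ping a remote port by LID (start 'ibping -S' on the remote node first)
ibping 2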

Performance Tests:

OFED bundles a few programs to test the bandwidth and latency of your Infiniband fabric.

Bandwidth test:

  1. start ib_read_bw on one machine
  2. start ib_read_bw <hostname or ip> on second machine

Latency Test:

  1. start ib_read_lat on one machine
  2. start ib_read_lat <hostname or ip> on second machine

Make sure power management is turned off before you run these tests; a minimal run of both tests is sketched below.
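
This is only a sketch: the IP address is the one from the pingpong example above, and on RHEL 5 "power management" usually means the cpuspeed frequency-scaling service, so stopping it on both nodes is a reasonable precaution.

# On both nodes: stop CPU frequency scaling before measuring
service cpuspeed stop

# Bandwidth: start the server side first, then point the client at it
ib_read_bw                     # on the first machine
ib_read_bw 192.168.0.248       # on the second machine

# Latency: same pattern
ib_read_lat                    # on the first machine
ib_read_lat 192.168.0.248      # on the second machine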

In case of any problems, the first thing to check is the subnet manager, then the ibstat and ibchecknet tools.
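
For example, assuming a host-based OpenSM (if your switch runs an embedded SM, check the switch's management interface instead), a first pass might look like:

# Is OpenSM running on the host?
/etc/init.d/opensmd status

# Ask the fabric which subnet manager the ports currently see
sminfo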

AMD delivers OpenCL SDK beta for x86

Thursday, August 6th, 2009

AMD announced the availability of OpenCL SDK for x86 processor cores.

The first publicly available beta of OpenCL SDK will allow developers to write portable code supporting both x86 processors and compatible GPUs. At release, OpenCL SDK will be delivered as a part of the ATI Stream Software Development Kit.

For the uninitiated, OpenCL is an open programming standard, supported by a number of industry vendors, for writing source code that targets multi-core CPUs and GPU execution units. OpenCL is designed from the ground up to support parallel computing paradigms using both task-based and data-based parallelism.

NVIDIA also has an SDK in the works, and we can expect to see NVIDIA's version very soon, especially after their continued demos at SIGGRAPH. NVIDIA's SDK will, obviously, support NVIDIA GPUs; whether it will support x86 cores is yet to be seen. AMD, on the other hand, has an incentive to support both x86 cores and GPUs in its release, as it will accelerate adoption of ATI Stream GPUs and Opteron processors. AMD is in a unique position because of its product line (GPU and x86 CPU), whereas NVIDIA has only GPUs and Intel has only CPUs. Let's hope AMD manages to take advantage of this opportunity.

AMD's OpenCL demo with AMD Opteron Istanbul is below. This demo runs on a 4-socket AMD Opteron system with six-core Istanbul processors. I can't wait to try it on our 48-core AMD Opteron system (8 sockets) with Istanbul processors.

HPC Systems is now QLogic Infiniband SignatureHPC Partner

Thursday, August 6th, 2009

We are proud to announce that HPC Systems, Inc. is now a QLogic SignatureHPC Partner for Infiniband products.

That means our employees, both sales and technical, are certified in Infiniband technologies. Our sales team will be able to recommend to you the best Infiniband solution for your needs and our technical team will deliver on those commitments.

We have always been ahead in delivering technically sound, well-designed Infiniband solutions to our customers in the academic, research, federal government and CFD spaces. Our partnership and certification with QLogic takes us one more step ahead in delivering the best in High Performance Computing to our customers.