bwbug: BWBUG user survey

Greg Lindahl lindahl at
Wed Nov 17 11:53:16 PST 2004

[ Mike Fitzmaurice asked me to post this. I have taken the liberty of 
  producing a text version of the Excel spreadsheet he attached. -- greg ]

Following the talk by John Payne of JLP Associates last week where he
discussed the results of his survey of 45 users of Linux Clusters, much
interest has been expressed in a survey of BWBUG members.  A good start has
been made by the members who filled in the survey at the meeting.  The data

11 systems, growing to 34 systems in two years, with processors growing from
~3,000 to ~5,000.

Growing concern about cooling.  Issues with lack of global file systems and
management software.

Interconnection technology beginning to migrate to Infiniband.

And unanimous interest in forming a User Alliance Organization dedicated to
eliminating existing weaknesses and driving the technology to widespread
-------------- next part --------------

Company Name:
Your Name:
email address:

Describe your clusters:
                               Now               In 2-years
Number of Clusters
Number of CPUs

At what stage do you adopt new technologies? Early, Late-early, Middle, Late

Service level agreements for your clusters? Yes, No

What metrics are used? Availability, Jobs run, Performance, Utilization, Other

Are you measuring TCO? Yes, No

Rate the problems you experience in deploying and operating clusters
(1=no problems, 5=most problems)
Time to commission
System Availability
Cooling problems
Loss of run-data when failure occurs
Global File System
Cluster management
I/O performance

What is the latency (MPI to MPI microsecs)?

What interconnect technology:
do you use?                    Myrinet, Quadrics, Infiniband, Gig E, Other
In 2 years?

If 'Other', specify?

Processor preference?   Xeon, Itanium, Opteron, No pref
Deployed now?
In the future?

Processors per node:   Single, 2-way, 4-way, 8-way, 16-way

Major cause of failures:  Hardware, Software

MTBF in hours? for the nodes? for the entire system?

Primary source of HW failures: Disks, Fans, Power Supplies, Node, Interconnect
Secondary source of HW failures: Disks, Fans, Power Supplies, Node, Interconnect

Software Failures: Linux O/S, Middleware, Applications

Criteria you use to select vendors
(1=not important, 5=mandatory)
Fortune 500 company
Number of installed sites
Reference Customers
On-premise commissioning reputation
Technical Support
Vendor Maintenance program
COTS Product (using motherboards)
Blade product (with backplane)
Product Reliability

Would you support a User Group Alliance?
As meeting attendee?
As an organizer?

What should the initial focus be?
Backplane standard
  If other, specify