Windows Server 2008 High-Availability Clusters

It is no secret that HPC Server 2008 will offer the option to make the head node of an HPC cluster highly available. This feature is not in beta 1, but it is being developed. It will exploit the fail-over mechanisms provided by Server 2008 (Enterprise edition or higher), so I thought I'd mention some highlights in this area too.

High-availability clusters are difficult to set up and troubleshoot on several platforms. With Windows Server 2003 we made progress in simplifying them, but limitations are still significant:

  • You need a configuration that is fully and specifically certified as a cluster in order to obtain support when things go wrong.
  • There is very limited support for geo-clusters, because of limitations in intra-cluster communications and in the cluster quorum models, and because the cluster has no awareness of storage location. Also, geo-clusters require yet another level of certification.
  • Writing cluster-aware applications is not easy. It requires knowledge of cluster-specific APIs in order to produce “resources” usable by the cluster software. Scripting generic application fail-over is supported, but limited in functionality.
  • Troubleshooting by reading cluster logs requires very deep knowledge to interpret the cryptic messages therein.

Microsoft, HPC, Server, Windows Server 2008, HPC Server, Cluster, Clustering