Highly Predictable Cluster for Internet Grids
http://www.hpc4u.org/
Next Generation Grid applications will demand Grid middleware that offers a flexible negotiation mechanism supporting various kinds of Quality-of-Service (QoS) guarantees. In this context, a QoS guarantee may cover the simultaneous allocation of different kinds of resources together with a requested level of Fault Tolerance, all specified in the form of Service Level Agreements (SLAs). Currently, a gap exists between the capabilities of Grid middleware and those of the underlying resource management systems in their support for QoS and SLA negotiation.
Research
The EU project HPC4U will provide an SLA-aware and Grid-enabled Resource Management System to close this gap. The major research issues include SLA negotiation, SLA-aware scheduling, and the provision of Fault Tolerance by means of application-transparent checkpointing mechanisms, virtualization and checkpointing of storage, and fault tolerance mechanisms at the network layer.