Nimrod In The Cloud
Introduction
In late 2008, MeSsAGE Lab ran a student project to prototype a Nimrod/G interface to Amazon's Elastic Compute Cloud (EC2). A press release and further details are available here: Nimrod soars into the clouds. The project established that emerging infrastructure services such as EC2 are suitable for Nimrod/G-orchestrated high-throughput computing. Since then, researchers in the lab have re-engineered and extended the interface, and it is now included in the Nimrod daily build, available from the downloads page. We have also been fortunate to receive a grant through the Amazon in Education program, which has covered the service costs of running real scientific applications on the platform.
Features
- The interface makes use of the boto client library. This means it will work with any EC2-compatible interface, for example Eucalyptus clouds.
- Cloud resources can be used alongside clusters or computational grid resources, which allows users to seamlessly mix heterogeneous resources in a single computational experiment.
- A specialised scheduler manages the requesting of agents (remote job execution managers) for the cloud. Users can nominate conservative or aggressive scheduling with regard to the number of instances launched in a scheduling interval.
- The cloud interface makes full use of Nimrod's meta-scheduling capabilities, allowing cloud instances to be utilised efficiently by running multiple jobs or parameter sets on a single instance. This can occur both sequentially and in parallel across machine instances, and also within multi-core machine instances if desired.
- Users provide credentials, service endpoints (which default to the EC2 US-East-1 region), and the type and ID of the instances desired. A limit on the number of machine instances running in parallel can also be specified.
- Coupled with Nimrod/G's existing economic scheduling capabilities, this interface can help a user meet deadlines by scheduling overflow jobs from, for example, local free-of-charge resources onto a commercial cloud such as EC2.
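The conservative and aggressive scheduling options mentioned in the list above can be pictured with a small sketch. This is not Nimrod/G's actual scheduler: the function name `instances_to_request`, the policy strings, and the rule that a conservative policy launches at most one instance per interval are all assumptions made purely for illustration.

```python
import math


def instances_to_request(pending_jobs, running_instances, slots_per_instance,
                         instance_limit, policy="conservative"):
    """Return how many new instances to launch in this scheduling interval.

    Hypothetical policy, not Nimrod/G's real algorithm:
      - "aggressive": launch enough instances to cover all uncovered jobs now.
      - "conservative": launch at most one instance per interval.
    In both cases the user's instance limit is never exceeded.
    """
    if policy not in ("conservative", "aggressive"):
        raise ValueError(f"unknown policy: {policy!r}")
    if slots_per_instance < 1:
        raise ValueError("slots_per_instance must be at least 1")

    # Jobs that the already-running instances cannot absorb in parallel.
    uncovered = max(0, pending_jobs - running_instances * slots_per_instance)
    wanted = math.ceil(uncovered / slots_per_instance)

    # Room left under the user-specified limit on parallel instances.
    headroom = max(0, instance_limit - running_instances)

    if policy == "conservative":
        wanted = min(wanted, 1)
    return min(wanted, headroom)


if __name__ == "__main__":
    for policy in ("conservative", "aggressive"):
        n = instances_to_request(pending_jobs=20, running_instances=1,
                                 slots_per_instance=2, instance_limit=4,
                                 policy=policy)
        print(policy, n)
```

Here `slots_per_instance` stands in for running several jobs in parallel on a multi-core instance. A real scheduler would presumably also weigh instance start-up time and cost, which this sketch ignores.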
Architecture
Resource-specific code (and its associated library dependencies) is encapsulated in Nimrod/G by actuator modules. In the past, actuator implementation has involved wrapping and adapting interfaces to local resource managers (e.g. batch systems) and grid middleware. These typically provide managed execution services and, in the cloud computing taxonomy, would be categorised as platform-as-a-service (PaaS). On the other hand, infrastructure-as-a-service (IaaS) cloud platforms, such as Amazon's EC2, provide general interfaces to virtualised data-centre infrastructure without the higher-level abstractions (jobs, files, virtual organisations) associated with grid computing.
Fortunately, due to its evolution through cluster and grid computing, Nimrod/G provides much of its own execution machinery and simply uses grid services to launch agents (aka pilot-jobs or execution/job-containers) which form a distributed set of processors. As such, adapting Nimrod/G to exploit IaaS cloud services is (mainly) a matter of creating a new actuator. The high-level architectural diagram below depicts the interaction between relevant modules.
Compared to the grid actuators, the EC2 actuator must also manage the resource itself, e.g., bringing up and tearing down VM instances to meet demand from the job scheduler. It can then perform Nimrod-specific initialisation, e.g., copying agents to the VM instances; once invoked, the agents begin consuming jobs.
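The lifecycle described above (bring instances up, start agents, let the agents consume jobs, tear instances down) can be sketched as a toy simulation. None of these names come from Nimrod/G: `FakeCloud` and `SketchActuator` are hypothetical stand-ins, and a real actuator would talk to EC2 through boto rather than to an in-memory object.

```python
from collections import deque
from dataclasses import dataclass, field
from itertools import count


@dataclass
class FakeCloud:
    """In-memory stand-in for an IaaS API (hypothetical, for illustration)."""
    _ids: count = field(default_factory=lambda: count(1))
    running: set = field(default_factory=set)

    def launch(self):
        instance_id = f"i-{next(self._ids):04d}"
        self.running.add(instance_id)
        return instance_id

    def terminate(self, instance_id):
        self.running.discard(instance_id)


class SketchActuator:
    """Toy actuator: manages instances and the agents running on them."""

    def __init__(self, cloud, instance_limit):
        self.cloud = cloud
        self.instance_limit = instance_limit
        self.agents = {}  # instance id -> jobs completed by its agent

    def scale_to(self, demand):
        """Bring instances up or tear them down to match scheduler demand."""
        target = min(demand, self.instance_limit)
        while len(self.agents) < target:
            instance_id = self.cloud.launch()
            # Nimrod-specific initialisation would go here: copy the agent
            # to the new VM and invoke it.
            self.agents[instance_id] = 0
        while len(self.agents) > target:
            instance_id, _ = self.agents.popitem()
            self.cloud.terminate(instance_id)

    def run(self, jobs):
        """Let agents take jobs from a shared queue until it is empty."""
        queue = deque(jobs)
        self.scale_to(len(queue))
        if queue and not self.agents:
            raise RuntimeError("no instances available to run jobs")
        results = []
        while queue:
            for instance_id in list(self.agents):
                if not queue:
                    break
                results.append((instance_id, queue.popleft()))
                self.agents[instance_id] += 1
        self.scale_to(0)  # tear everything down once demand is gone
        return results


if __name__ == "__main__":
    actuator = SketchActuator(FakeCloud(), instance_limit=2)
    for instance_id, job in actuator.run(["p1", "p2", "p3", "p4", "p5"]):
        print(instance_id, job)
```

The agents in this sketch take jobs in turn from one queue, which mirrors the pilot-job idea of a pool of processors consuming work. Real agents run concurrently on remote machines, so the round-robin loop is only a simplification.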
Basic Example
The example illustrated by the images on the right demonstrates a Nimrod/G experiment running across a GT4 resource and the Amazon Elastic Compute Cloud. The EC2 resource has been added using the default parameters plus an instance limit of four. For the command line interface this is:

    nimrod resource add ec2 --key=./ec2id --secret=./ec2secret --instlimit 4
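To get a feel for what an instance limit such as four implies when cloud instances absorb overflow from a free resource (the deadline scenario listed under Features), the following back-of-the-envelope sketch may help. It is not part of Nimrod/G: the function `overflow_plan` and all its parameters are hypothetical, and it assumes every job takes the same time while ignoring instance start-up, data transfer, and pricing.

```python
import math


def overflow_plan(total_jobs, job_minutes, deadline_minutes,
                  local_slots, slots_per_instance, instance_limit):
    """Estimate how many cloud instances are needed to meet a deadline.

    Illustrative arithmetic only; assumes uniform job run times.
    """
    # Sequential "waves" of jobs that a single slot can finish in time.
    waves = deadline_minutes // job_minutes
    if waves == 0:
        raise ValueError("deadline is shorter than a single job")

    # Jobs the free local resource can complete before the deadline.
    local_capacity = local_slots * waves
    overflow = max(0, total_jobs - local_capacity)

    # Each cloud instance contributes slots_per_instance * waves jobs.
    needed = math.ceil(overflow / (slots_per_instance * waves))
    return {
        "overflow_jobs": overflow,
        "instances_needed": needed,
        "instances_granted": min(needed, instance_limit),
        "meets_deadline": needed <= instance_limit,
    }


if __name__ == "__main__":
    plan = overflow_plan(total_jobs=180, job_minutes=30, deadline_minutes=240,
                         local_slots=16, slots_per_instance=2,
                         instance_limit=4)
    print(plan)
```

If `meets_deadline` comes back false, the user would need either a higher instance limit or a later deadline under these simplified assumptions.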
Coming Soon
The next Nimrod release will add the following new features:
- Explicit EBS boot-volume handling.
- Spot instance request support.
Works in progress:
- Generic Windows compute instance support.
- A Nimrod AMI.
Want More Information?
Please contact: blair.bethwaite@infotech.monash.edu.au




