CCP1 - Quantum Directed Virtual Evolution

Jens Thomas Computational Chemistry Group CCLRC Daresbury Laboratory Keckwick Lane Daresbury Warrington WA4 4AD

Dr. Marcus C. Durrant School of Applied Sciences Northumbria University Ellison Building Newcastle Upon Tyne NE1 8ST

Scientific Objectives

The QDVE project uses a genetic algorithm to determine an optimal catalyst for the conversion of nitrogen to hydrazine.

A selection of potential catalysts (collectively a "generation") is created by randomly combining a number of metal centres and ligands in several pre-defined geometries, with an associated range of spin and charge states.

As an "ideal" energy profile for the catalytic cycle is already known, the energy of a complex formed from the potential catalyst and the substrate can be calculated at each point on the catalytic cycle and compared with the ideal value. A scoring system is then used to select promising catalysts, and a new generation is created by shuffling the characteristics of the selected complexes (e.g. metal centre, geometry etc.) between themselves.

This new generation is then submitted for calculation, and the process is repeated until a catalyst that most closely reproduces the desired energy profile has been found.
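The loop described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the lists of metals, ligands, geometries, charges and spins are hypothetical, the sum-of-squares score is an assumed stand-in for the unspecified scoring system, and toy_profile is a placeholder for the GAMESS-UK energy calculations.

```python
import random
from dataclasses import dataclass

# Hypothetical building blocks; the real QDVE lists are not given in this report.
FIELDS = {
    "metal": ["Mo", "Fe", "V", "W"],
    "ligand": ["NH3", "PH3", "Cl", "CO"],
    "geometry": ["octahedral", "trigonal_bipyramidal"],
    "charge": [0, 1, 2],
    "spin": [1, 2, 3],  # spin multiplicities
}


@dataclass(frozen=True)
class Catalyst:
    metal: str
    ligand: str
    geometry: str
    charge: int
    spin: int


def random_catalyst(rng):
    """Randomly combine one option for every characteristic."""
    return Catalyst(**{name: rng.choice(opts) for name, opts in FIELDS.items()})


def score(profile, ideal):
    """Assumed scoring: sum of squared deviations from the ideal profile (lower is better)."""
    return sum((e - i) ** 2 for e, i in zip(profile, ideal))


def shuffle_traits(parents, size, rng):
    """New generation: each characteristic is taken from a randomly chosen selected parent."""
    return [
        Catalyst(**{name: getattr(rng.choice(parents), name) for name in FIELDS})
        for _ in range(size)
    ]


def toy_profile(c):
    """Placeholder for the DFT energies at three points on the catalytic cycle."""
    return [float(c.charge), -10.0 + c.spin, -20.0 + len(c.metal)]


def evolve(energy_profile, ideal, size=8, keep=4, generations=5, seed=0):
    """Score each generation, keep the best candidates, shuffle them, and repeat."""
    rng = random.Random(seed)
    generation = [random_catalyst(rng) for _ in range(size)]
    best = None
    for _ in range(generations):
        ranked = sorted(generation, key=lambda c: score(energy_profile(c), ideal))
        if best is None or score(energy_profile(ranked[0]), ideal) < score(energy_profile(best), ideal):
            best = ranked[0]
        generation = shuffle_traits(ranked[:keep], size, rng)
    return best
```

Calling evolve(toy_profile, [0.0, -10.0, -20.0]) returns the best candidate seen over all generations. In the real workflow each call to energy_profile would correspond to a batch of grid jobs rather than a local function.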

The aim of this testbed was to process a single generation of the catalytic compounds on the NW-Grid using GAMESS-UK and to collate the results.

Who was involved

The originator of the project, Marcus Durrant from the University of Northumbria, supplied the 384 input files for the calculations.

Rik Tyer of the eScience department at CCLRC Daresbury Laboratory set up the eMinerals infrastructure within which the jobs were run on the NW-Grid.

Phil Couch, also from the eScience department, wrote the AgentX software that was used to extract various parameters from the XML (CML) output generated by GAMESS-UK, and assisted in altering GAMESS-UK to output the correct CML.

Jens Thomas from the Computational Chemistry Department at CCLRC Daresbury Laboratory ran the testbed and amended GAMESS-UK to write out the required parameters as XML.

Resources

The test bed consisted of 384 GAMESS-UK DFT calculations, each run on 8 processors. The disk space required for each job was of the order of 100MB, although the output files generated (from a few hundred kB to 10MB) were all stored externally on an SRB.

In total 2610 cpu hours were used.
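As a rough guide to the size of an individual job, the totals above can be turned into per-job averages. The sketch below assumes that the 2610 cpu hours are spread evenly over exactly 384 jobs, which ignores any failed or resubmitted calculations, so the figures are indicative only.

```python
total_cpu_hours = 2610
jobs = 384
processors_per_job = 8

# Average cpu time consumed by one calculation (assumes an even spread).
cpu_hours_per_job = total_cpu_hours / jobs

# Average elapsed time per calculation when run on 8 processors.
wall_hours_per_job = cpu_hours_per_job / processors_per_job

print(f"{cpu_hours_per_job:.1f} cpu hours per job")
print(f"{wall_hours_per_job * 60:.0f} minutes elapsed per job")
```

Under these assumptions each calculation averages a little under 7 cpu hours, or roughly 50 minutes of elapsed time on 8 processors.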

The jobs for this project were largely run over the week starting the 21st of February 2007. A small number of additional jobs were submitted on the 16th and 22nd of March.

Grid Technology Used

The methodology of the QDVE project is similar to a parameter-sweep approach, with a large number of broadly similar but independent calculations being run.

The energy calculations were density functional theory (DFT) calculations run using GAMESS-UK. As each calculation was reasonably large, it was run in parallel on 8 processors so that the large number of calculations required could be completed in a reasonable time.

All of the jobs were run within the "e-Minerals infrastructure" which provides a framework for accessing both data and compute resources, and for generating and processing data and metadata generated by the calculations. GAMESS-UK was amended to write out key calculation parameters as XML (specifically Chemical Markup Language), which enabled AgentX (and the associated eMinerals metadata tools) to automatically harvest this data.
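To illustrate why writing results as CML makes automatic harvesting straightforward, the sketch below pulls named properties out of a small CML-style fragment using only the Python standard library. It does not use AgentX or reproduce its interface, and the fragment itself, including the "example:" dictionary references and the values, is invented for illustration rather than taken from real GAMESS-UK output.

```python
import xml.etree.ElementTree as ET

# Namespace used by CML documents.
CML_NS = {"cml": "http://www.xml-cml.org/schema"}

# Hypothetical fragment; property names and values are made up for this example.
SAMPLE = """<cml xmlns="http://www.xml-cml.org/schema">
  <property dictRef="example:totalEnergy" title="Total energy">
    <scalar dataType="xsd:double" units="units:hartree">-1234.567890</scalar>
  </property>
  <property dictRef="example:scfCycles" title="SCF cycles">
    <scalar dataType="xsd:integer">17</scalar>
  </property>
</cml>"""


def harvest_properties(cml_text):
    """Return {dictRef: (value_text, units)} for every scalar property found."""
    root = ET.fromstring(cml_text)
    harvested = {}
    for prop in root.iterfind(".//cml:property", CML_NS):
        scalar = prop.find("cml:scalar", CML_NS)
        if scalar is None or scalar.text is None:
            continue  # skip properties without a scalar value
        harvested[prop.get("dictRef")] = (scalar.text.strip(), scalar.get("units"))
    return harvested


if __name__ == "__main__":
    for name, (value, units) in harvest_properties(SAMPLE).items():
        print(name, value, units)
```

Because each value is tagged with a dictionary reference and units, a harvesting tool can select the quantities it needs by name instead of parsing free-form program output.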

The jobs were submitted using Rik Tyer's "monty" program, which provides a way to automate the submission of a large number of jobs to RMCS (part of the eMinerals infrastructure) and to handle the uploading of the data into the SRB. The jobs were run across the Daresbury, Lancaster and Liverpool nodes of the NW-Grid.

The NW-Grid and associated middleware offered the opportunity to run these batches of calculations far more efficiently and quickly than could be achieved either on a large number of small computers or a large supercomputer.

The NW-Grid was required for this project as it was the only grid that had the large compute resources required for all the calculations, together with the integration of the eMinerals infrastructure.

Scientific Outputs

The testbed achieved more than it had set out to do. The eMinerals framework made running the calculations and processing the results so easy that a number of unsuccessful calculations could be quickly identified, their inputs altered and the calculations re-submitted, all within the timeframe allotted for the original testbed.

The scientific applications used within this testbed were GAMESS-UK to run the calculations and the CCP1GUI to visualise the results.

GAMESS-UK was altered to write out XML so that the calculation data could be automatically processed with AgentX. It was also altered to read its input from a named file and write its output to a named file (as opposed to reading from stdin and writing to stdout), due to problems redirecting stdin/stdout when using the grid middleware.

The results of this generation are currently being processed by Marcus Durrant and provide a significant advance towards the overall goal of developing a new catalyst.

Experiences

Overall, the QDVE testbed was very successful. Although there was considerable overhead in setting up the eMinerals infrastructure and altering GAMESS-UK to output the required XML, once this had been done, the time to process an individual batch of calculations was dramatically reduced from nearly a week to less than an hour.

When the initial batch of calculations was submitted, a number of the calculations failed for various reasons. However, as the eMinerals framework was so efficient, these failures were quickly identified and the calculations could be amended and resubmitted, increasing the number and quality of the results.

The initial drawback to the project was the overhead in setting up the eMinerals infrastructure on the NW-Grid and amending GAMESS-UK to work in this environment. This was further complicated by the lack of a suitable Globus jobmanager that was both SGE- and SCORE-aware on all sites. This was the initial deficiency that required changing the way GAMESS-UK handled its stdin/stdout.

In addition, the QDVE project uncovered a couple of bugs in the way that MCS and AgentX interacted with each other, which meant that, although most of the metadata was correctly handled, there were problems dealing with the metadata when the calculations failed.

QDVE (last edited 2009-02-11 13:26:11 by RobAllan)
