A general trend at synchrotron radiation facilities is the increased demand for automated on-line data analysis. Increases in flux, improved optics and fast read-out detectors now enable complex experiments to be performed routinely, in which data acquisition and data analysis must not only be linked but also take place almost simultaneously. The need to automate these processes has become more urgent with the ESRF upgrade programme, in particular for the new MASSIF beamlines for high-throughput macromolecular crystallography (MX). These experiments pose a challenge for both hardware (sample changers, sample centring, etc.) and software (data acquisition and data analysis). Automating such experiments requires a high-level software environment to coordinate data acquisition and data analysis. This software must be developed with care: it needs to be robust, yet flexible enough to respond rapidly to constantly evolving scientific needs and instrumentation developments.

To meet this challenge we have developed a data analysis workbench (DAWN) in collaboration with the Diamond Light Source, EMBL Grenoble and other institutes [1]. This software environment contains tools for data visualisation and data analysis and, most importantly, a workflow tool called Passerelle (developed by the Belgian company iSencia and based on Ptolemy II). Workflow tools have been used extensively for scientific applications [2]; however, they have not been widely used at synchrotron radiation facilities. Reported here is the first use of a workflow tool to link data analysis and data acquisition. The DAWN workflow tool provides a framework and structure onto which complex on-line data analysis workflows can be built. Workflows can also be implemented in traditional programming languages (e.g. Fortran, C/C++, Java, Python), but the DAWN workflow tool allows beamline scientists (non-programmers) to be closely involved in the developments: with a minimum of training, they can modify and adapt the workflows to meet new scientific requirements without the intervention of a programmer. The tool thus helps to bridge the gap between beamline scientists, who have the scientific expertise, and programmers, who are responsible for implementing the on-line data analysis software components. The role of the programmer is to provide robust workflow ‘actors’, i.e. plugins that perform specific tasks and that beamline scientists can use intuitively. The visual description of the workflows facilitates the communication of experimental protocols to other scientists by structuring them hierarchically and highlighting the main logic (Figure 142).

Fig. 142: The DAWN workflow tool. The screenshot shows a workflow in development. The workflow is represented graphically with actors shown as individual elements.
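
The division of labour between programmers (who write actors) and beamline scientists (who arrange them) can be illustrated with a minimal Python sketch. This is not the Passerelle or DAWN API: the `Actor` class, the `run_workflow` function, the actor names and the values they produce are all hypothetical and serve only to show the idea of chaining plugins that each perform one task.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# A message is a plain dictionary handed from one actor to the next.
Message = Dict[str, object]


@dataclass
class Actor:
    """Hypothetical workflow actor: a named plugin performing one task."""
    name: str
    task: Callable[[Message], Message]


def run_workflow(actors: List[Actor], message: Message) -> Message:
    """Run the actors in sequence, each receiving the previous actor's output."""
    executed: List[str] = []
    for actor in actors:
        message = actor.task(dict(message))  # copy so the caller's dict is untouched
        executed.append(actor.name)
    message["executed"] = executed
    return message


# Illustrative actors with made-up values; real actors would call beamline
# control and data analysis software instead.
def collect_reference_images(msg: Message) -> Message:
    msg["images"] = ["ref_0001", "ref_0002"]
    return msg


def analyse_images(msg: Message) -> Message:
    msg["n_analysed"] = len(msg["images"])
    return msg


# Re-ordering or extending this list changes the protocol without touching
# the actor code itself.
workflow = [
    Actor("collect", collect_reference_images),
    Actor("analyse", analyse_images),
]
result = run_workflow(workflow, {"sample": "crystal_1"})
```

In this sketch the workflow is just an ordered list, so adapting an experimental protocol amounts to editing that list; a graphical tool plays the same role for users who do not program.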

We have implemented a host of on-line data analysis workflows that are now available at the MX beamlines:

  • Sample re-orientation using Kappa goniometers: Data collected from macromolecular crystals can be improved and optimised by re-orienting the sample prior to data collection. The workflow for sample re-orientation consists of collecting reference images, analysing the images, controlling the Kappa goniometer and calculating optimised data collection strategies.
  • Automatic crystal centring using X-rays: This workflow scans the largest face of a loop using a 2D mesh scan and then performs a 1D scan 90 degrees away on the best area of diffraction, as determined from the first scan. The optimum crystal volume is then automatically centred with respect to the beam, ready for final characterisation and data collection.
  • Automatic control of a dehydration device: The dehydration of macromolecular crystals has long been known to have the potential to improve their diffraction quality. We have developed a workflow that automates the cycle of dehydration, equilibration, acquisition of reference images, data analysis and production of graphs describing the crystal diffraction quality as a function of the humidity level.
  • Enhanced EDNA [3] crystal characterisation workflow: When characterising crystals, the diffraction quality is initially unknown. The program BEST [4] can calculate not only the optimal data collection strategy but also the theoretical maximum resolution for a complete data set. If the crystal diffracts to a higher resolution than that of the initial reference images, this workflow automatically recollects the reference images at the maximum resolution.
  • Crystal radiation-sensitivity measurement workflow: Radiation damage severely limits the data that can be obtained from single crystals. By sacrificing a crystal, the degree of radiation-sensitivity of similar crystals can be accurately estimated and optimal data collection strategies calculated taking into account this sensitivity.
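
As an illustration of the logic behind one of these workflows, the automatic crystal centring described above can be sketched in Python. This is a simplified sketch under stated assumptions, not the code running on the beamlines: the function names, the use of a single diffraction score per scan point, the uniform step size and the choice of the highest-scoring point as the "best area of diffraction" are all hypothetical.

```python
from typing import Sequence, Tuple


def best_mesh_cell(scores: Sequence[Sequence[float]]) -> Tuple[int, int]:
    """Return (row, column) of the highest diffraction score in the 2D mesh scan."""
    cells = [(r, c) for r, row in enumerate(scores) for c in range(len(row))]
    return max(cells, key=lambda rc: scores[rc[0]][rc[1]])


def best_line_point(scores: Sequence[float]) -> int:
    """Return the index of the highest score in the 1D scan taken 90 degrees away."""
    return max(range(len(scores)), key=lambda i: scores[i])


def centre_from_scans(
    mesh_scores: Sequence[Sequence[float]],
    line_scores: Sequence[float],
    step_mm: float,
) -> Tuple[float, float, float]:
    """Combine both scans into an (x, y, depth) offset in mm from the scan origin.

    The mesh scan fixes the position on the largest face of the loop; the line
    scan, made after rotating the sample by 90 degrees, fixes the third
    coordinate. A uniform step size is assumed for simplicity.
    """
    row, col = best_mesh_cell(mesh_scores)
    depth = best_line_point(line_scores)
    return (col * step_mm, row * step_mm, depth * step_mm)


# Made-up scores for demonstration only.
mesh = [
    [0.0, 1.0, 0.0],
    [2.0, 9.0, 3.0],
    [0.0, 4.0, 1.0],
]
line = [1.0, 5.0, 2.0]
centre = centre_from_scans(mesh, line, step_mm=0.5)
```

A real implementation would probably weight neighbouring scan points rather than take a single maximum, and would convert the offsets into motor positions; those steps are omitted here.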

Fig. 143: a) The workflow GUI component in the beamline user interface MXCuBE. The GUI component displays dynamic content from the workflows and can therefore adapt to the needs of a particular experiment, in this case Kappa reorientation angles. b) The output from an auto-centring workflow. A diffraction quality map is shown overlaid with the sample.

The workflows described above have already been used successfully on ESRF MX beamlines. Most of them make use of existing EDNA plugins; the workflow tool should therefore be seen not as a replacement for, but as an enhancement of, workflows implemented in EDNA. Users interact with the workflows through the standard beamline control interface MXCuBE (Figure 143) [5]. We are now preparing new workflows that will enable even more complex experiments, for example:

  • Optimisation of data collections from multiple crystals by re-orienting samples and taking into account data already collected
  • Automatic crystal ranking by characterising a large number of samples and selecting those with the best diffraction qualities
  • High-throughput automatic MAD data collection
  • Fully-automatic data collection pipelines for drug discovery

The successful use of workflows for MX experiments suggests that other types of experiment could also profit from this approach.

 

Principal publication and authors

S. Brockhauser (a), O. Svensson (b), M.W. Bowler (a,b), M. Nanao (a), E. Gordon (b), R.M.F. Leal (b), A. Popov (b), M. Gerring (c), A.A. McCarthy (a) and A. Götz (b), Acta Cryst. D68, 975-984 (2012).

(a) EMBL, Grenoble (France)

(b) ESRF

(c) Diamond Light Source (UK)

 

References

[1] http://www.dawnsci.org. DAWN is a collaboration between the ESRF, the Diamond Light Source (Didcot, UK), EMBL Grenoble, iSencia (Gent, Belgium) and Global Phasing (Cambridge, UK).

[2] I.J. Taylor, E. Deelman, D.B. Gannon and M. Shields (Eds.), Workflows for e-Science, Springer, London (2007).

[3] M.-F. Incardona et al., J. Sync. Rad. 16, 872-879 (2009).

[4] A. Popov et al., Acta Cryst. D59, 1145-1153 (2003).

[5] J. Gabadinho et al., J. Sync. Rad. 17, 700-707 (2010).