
March 2024 ESRFnews


unannotated volumes, selecting those it has the most to

learn from. The human researchers review its output,

applying corrections to be integrated into the next

training loop. This next loop is then concluded, and

another, and so on. According to Cadiou, training a

model to segment one volume with more classical deep-

learning algorithms can take a week or two due to the

manual image-annotation process, whereas this active

learning procedure gives results of sufficient fidelity in

about a day. “The speed-up is all the more required when

we’re working with in situ or operando data, as there are

then numerous volumes to analyse conjointly,” he says.
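The selection step in such a loop can be illustrated with a small sketch. The article does not say how the model decides which volumes it has "the most to learn from"; the version below assumes a common choice, uncertainty sampling, in which the volume with the highest mean prediction entropy is sent to the human reviewers first. The function names and the toy volumes are hypothetical.

```python
import numpy as np


def mean_entropy(probs: np.ndarray) -> float:
    """Mean per-voxel entropy (in nats) of predicted class probabilities.

    probs has shape (n_classes, ...) and sums to 1 along axis 0.
    """
    eps = 1e-12  # avoids log(0) for fully confident voxels
    voxel_entropy = -np.sum(probs * np.log(probs + eps), axis=0)
    return float(voxel_entropy.mean())


def select_most_informative(predictions: dict, k: int = 1) -> list:
    """Return the names of the k volumes the model is least sure about."""
    ranked = sorted(
        predictions,
        key=lambda name: mean_entropy(predictions[name]),
        reverse=True,
    )
    return ranked[:k]


# Toy two-class predictions for two unannotated 4x4x4 volumes (assumed data).
confident = np.stack([np.full((4, 4, 4), 0.99), np.full((4, 4, 4), 0.01)])
uncertain = np.stack([np.full((4, 4, 4), 0.5), np.full((4, 4, 4), 0.5)])

picked = select_most_informative(
    {"volume_a": confident, "volume_b": uncertain}, k=1
)
print(picked)  # ['volume_b']
```

In a full loop, the corrected segmentation of the picked volume would be added to the training set before the model is retrained and the ranking repeated.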

Many other ESRF users are turning to machine

learning to assist in segmentation, especially when the

raw images contain the unprecedented levels of detail

provided by the new Extremely Brilliant Source. Backed

by a grant from the European Research Council, ESRF

scientist Alexandra Pacureanu is turning to automated

segmentation to resolve neural circuits in mammalian

brains in data from the ID16A nano-imaging beamline.

Meanwhile, drawing on hierarchical phase-contrast

tomography (HiP-CT) data taken at the ESRF’s

flagship BM18 beamline, researchers involved in the

Human Organ Atlas (HOA) project, cofunded by the

Chan Zuckerberg Initiative, are relying on automated

segmentation to identify various anatomical structures

inside organs, particularly blood vessels, but also

airways in the lungs, the glomeruli (or filtering units)

of the kidneys, and parts of the brain (figure 2).

Training, training, training

HiP-CT data expose the challenge and potential for

machine learning in segmentation. Given that it can

deliver images of entire organs with a resolution down

to the single cell in regions of interest, the data volumes

are massive, often a terabyte or more, requiring hefty

processing power. In addition, features such as blood

vessels are genuinely hierarchical – meaning that seg-

mentation has to be performed over disparate length

scales – and vary greatly from person to person. Perhaps

the trickiest problem is the sheer novelty of the imaging

technique: there are simply no data already available

that a machine-learning algorithm can draw on to train

itself. “This is something that is often glossed over, but

machine learning can only be as good as the data used to

train it,” says HOA scientist Claire Walsh at University

College London in the UK. “And making these data is

a huge undertaking. We have two experts labelling each

dataset, and a third to go over the combined labels and

mark areas that need improvement.”
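A minimal sketch of how two experts' labels might be combined before the third pass is given below. The article does not describe the HOA team's actual procedure, so everything here is an assumption: the labels are treated as binary masks, the consensus is taken as their intersection, agreement is scored with the Dice coefficient, and voxels marked by only one expert are flagged for the reviewer. The function name `combine_labels` and the toy masks are hypothetical.

```python
import numpy as np


def combine_labels(mask_a, mask_b):
    """Combine two annotators' binary masks.

    Returns the consensus mask, the disputed mask and the Dice agreement.
    """
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    consensus = a & b  # voxels both annotators labelled
    disputed = a ^ b   # voxels only one annotator labelled
    total = a.sum() + b.sum()
    # Two empty masks agree perfectly by convention.
    dice = 2.0 * consensus.sum() / total if total else 1.0
    return consensus, disputed, float(dice)


# Toy 2x3 masks from two annotators (assumed data).
expert_1 = np.array([[1, 1, 0], [0, 1, 0]])
expert_2 = np.array([[1, 0, 0], [0, 1, 1]])

consensus, disputed, dice = combine_labels(expert_1, expert_2)
print(round(dice, 3), int(disputed.sum()))  # 0.667 2
```

The disputed mask is the natural place for a third reader to start, since it marks exactly where the two experts disagree.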

The HOA has an open data policy. That permits

another avenue for machine learning, for its datasets –

which include entire organs, either healthy or afflicted by

various diseases – can be mined by independent research

groups, using their own algorithms and driven by their

own research goals. Indeed, automated mining of open

data is behind what is arguably the most scientifically

influential deep-learning product of recent years:

AlphaFold, which is developed by DeepMind, a research

laboratory based in London, UK, and owned by the

parent company of Google. Trained on experimental,

largely synchrotron-derived data in the Protein Data

Bank, AlphaFold has succeeded where humans could

not by predicting, with incredible accuracy, protein

structures from their amino acid sequences. AlphaFold

predictions can, in turn, boost the experimental

determination of new structures (figure 3).

In other areas of synchrotron science, machine

learning is not so much about breaking new ground

but making existing ground more accessible. Take X-ray


MACHINE LEARNING

[Figure 1 plots. Panel (a): X-ray reflectivity [a.u.] against momentum transfer [1/Å], showing measurement and ML result for curves labelled 75.3 Å, 97.2 Å, 147.5 Å, 199.3 Å, 299.3 Å, 398.0 Å and 499.6 Å. Panel (b): reached thickness [Å] against target thickness [Å], with a reference line for target = reached (theoretical).]

Figure 1 Measured data compared with deep-learning predictions for a crystal-growth experiment at ID10. (a) The algorithm predicts the relationship between X-ray momentum transfer and reflectivity oscillations, which are a measure of properties such as thickness and surface roughness. (b) The algorithm predicts when to stop the in situ molecular beam deposition for a certain desired film thickness.
