SSX and XFEL-PReSTO

Serial Synchrotron X-ray crystallography (SSX) is a subset of Macromolecular X-ray crystallography (MX). SSX diffraction data are collected at room temperature from many small crystals, where each crystal contributes a single diffraction pattern to the SSX dataset. X-ray Free Electron Laser (XFEL) instruments were the first light sources set up for serial crystallography; however, some synchrotron light sources also offer the technique to researchers. MicroMAX is a Swedish SSX beamline available since 2023.

To improve Serial Synchrotron X-ray crystallography (SSX) guidance in PReSTO-docs, we want support from the Swedish SSX community, such as the material posted here by MX researchers. Please contact a PReSTO team member and suggest improvements, perhaps sharing an example script, links to recent tutorials, or other information that may help structural biology newcomers.

CrystFEL

CrystFEL is designed to index and integrate diffraction patterns from serial crystallography (SX) measurements, including serial synchrotron crystallography (SSX) at synchrotrons and serial femtosecond crystallography (SFX) at X-ray Free Electron Laser (XFEL) instruments. There is a video CrystFEL tutorial available for version 0.10.0 and a GitLab tutorial for version 0.11.1.

Here we document a test run with the CrystFEL GUI version 0.10.2, running the GUI on the login node and using the compute nodes via SLURM from within the GUI, with serial-crystallography data collected at BioMAX. Please note that we allocated too little compute time in the first and second run of "Index All". Since this might happen to others as well, we document that the CrystFEL GUI had to be restarted, as well as the symptoms and consequences of allocating too little compute time from the CrystFEL GUI.

CrystFEL version 0.11.0 is the current default at NSC Tetralith and has passed initial testing; version 0.11.1 is installed but not yet released (November 8, 2024).
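As a minimal sketch of how a CrystFEL session might start on Tetralith (the module name and version below are assumptions; check module avail for what is currently installed):

module avail CrystFEL                 # list the CrystFEL modules installed at NSC
module load CrystFEL/0.11.0-PReSTO    # assumed module name; pick one from the list above
crystfel &                            # launch the CrystFEL GUI on the login node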

MAX IV staff made a bash script to generate a CrystFEL geometry (.geom) file from an HDF5 master file. To run it, one needs h5dump, available for instance by performing module load CrystFEL in a terminal window prior to running the script. The script takes an input master file and the name of the output geometry file: geom_gen.sh input_master.h5 output.geom
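For example (the file names below are placeholders):

module load CrystFEL                          # makes h5dump available
bash geom_gen.sh sample_master.h5 sample.geom # write sample.geom from the master file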

All input files needed to run CrystFEL will be generated automatically for the injector-based SSX at BioMAX - see Processing SSX Data.

NanoPeakCell

NanoPeakCell is intended to pre-process your serial crystallography data into ready-to-be-indexed images for CrystFEL, cctbx.xfel and nXDS, and can be launched from the PReSTO menu.

Cheetah

Cheetah is intended for processing serial diffraction data from free-electron laser sources, enabling users to take home only the data with meaningful content. It is available on GitHub.

nXDS

nXDS integrates and scales X-ray reflection intensities from randomly oriented single-crystals of the same compound, symmetry and cell parameters into a single data set.

cctbx.xfel

cctbx.xfel is a suite of software tools designed to process diffraction data from serial femtosecond crystallography (SFX) measurements at an X-ray free-electron laser (XFEL) or a synchrotron.

geom_gen.sh

#!/bin/bash
# Usage: geom_gen.sh input_master.h5 output.geom
MASTERFILE=$1   # Eiger HDF5 master file
OUTPUT=$2       # CrystFEL geometry file to write
  QX=`h5dump -d "/entry/instrument/detector/x_pixel_size" $MASTERFILE | awk '/\(0\): [0-9]/{print $2*1000}'`
  QY=`h5dump -d "/entry/instrument/detector/y_pixel_size" $MASTERFILE | awk '/\(0\): [0-9]/{print $2*1000}'`

  echo "Data from an Eiger HDF5 master file"
  SENSOR_THICKNESS=`h5dump -d "/entry/instrument/detector/sensor_thickness" $MASTERFILE | awk '/\(0\): [0-9]/{print $2*1000}'`
  X_RAY_WAVELENGTH=`h5dump -d "/entry/instrument/beam/incident_wavelength" $MASTERFILE | awk '/\(0\): [0-9]/{print $2}'`
  photon_energy=$(bc -l <<<"12398/${X_RAY_WAVELENGTH}")

  NX=`h5dump -d "/entry/instrument/detector/detectorSpecific/x_pixels_in_detector" $MASTERFILE | awk '/\(0\): [0-9]/{print $2}'`
  NY=`h5dump -d "/entry/instrument/detector/detectorSpecific/y_pixels_in_detector" $MASTERFILE | awk '/\(0\): [0-9]/{print $2}'`

  # find ORGX and ORGY:
  ORGX=`h5dump -d "/entry/instrument/detector/beam_center_x" $MASTERFILE | awk '/\(0\): [0-9]/{print $2}'`
  ORGY=`h5dump -d "/entry/instrument/detector/beam_center_y" $MASTERFILE | awk '/\(0\): [0-9]/{print $2}'`

  # find DETECTOR_DISTANCE and OSCILLATION_RANGE:
  DETECTOR_DISTANCE=`h5dump -d "/entry/instrument/detector/detector_distance" $MASTERFILE | awk '/\(0\): [0-9]/{print $2*1000}'`

cat > $OUTPUT << eof
; Camera length (in m) and photon energy
clen            = $(bc -l <<<"${DETECTOR_DISTANCE}/1000")
photon_energy   = $photon_energy

; adu_per_photon needs a relatively recent CrystFEL version.  If your version is
; older, change it to adu_per_eV and set it to one over the photon energy in eV
adu_per_photon = 1 
res             = 13333.3 
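; res is given in pixels per metre: 13333.3 px/m corresponds to the 75 um Eiger pixel size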

; These lines describe the data layout for the Eiger native multi-event files
dim0 = %
dim1 = ss
dim2 = fs
data = /entry/data/data

; Bad pixel mask taken from the Eiger master file (using a bad pixel mask is recommended!)
mask = /entry/instrument/detector/detectorSpecific/pixel_mask
mask_file = $MASTERFILE
mask_good = 0x0
mask_bad  = 0xFFFFFFFF

; corner_{x,y} set the position of the corner of the detector (in pixels)
; relative to the beam
p0/min_fs        = 0
p0/max_fs        = `expr $NX - 1`
p0/min_ss        = 0
p0/max_ss        = `expr $NY - 1`
p0/corner_x      = -$ORGX
p0/corner_y      = -$ORGY
p0/fs            = x
p0/ss            = y

; used by geoptimiser
;rigid_group_g0 = p0
;rigid_group_collection_c0 = g0
eof
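The generated geometry file can then be passed to indexamajig. A minimal single-node sketch, where files.lst and sample.cell are placeholder names for a frame list and a unit-cell file, and assuming the peakfinder8 peak search and xgandalf indexer are available in the installed CrystFEL build:

indexamajig -i files.lst -g sample.geom -o test.stream -j 16 \
    --peaks=peakfinder8 --indexing=xgandalf -p sample.cell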

In 2017, Oskar Aurelius from Stockholm University did the first testing of CrystFEL installed at NSC and shared the script listed below for multi-node execution of CrystFEL version 0.6.3, following a tutorial applicable to versions 0.6.0 to 0.9.1.

#!/bin/sh
# Split a large indexing job into many small tasks and submit using SLURM
# Copyright © 2016-2017 Deutsches Elektronen-Synchrotron DESY,
#                       a research centre of the Helmholtz Association.
# Authors:
#   2016      Steve Aplin <steve.aplin@desy.de>
#   2016-2017 Thomas White <taw@physics.org>
#   2017 Modified by Oskar Aurelius <oskar.aurelius@dbb.su.se>


LAUNCH=TRUE # Launch jobs directly if 'TRUE'. Otherwise just write each script file

MULTI_EVENT_FILES=FALSE #If using multi-event ("CXI") files, should be 'TRUE'
INPUT=files.lst # List of frames to be processed
RUN=trial_run1 # Name of run
GEOM=5HT2B-Liu-2013.geom # Geometry file
STREAMDIR=output # Directory for output. No trailing /
PARAM="--peaks=hdf5 --int-radius=3,4,5 --indexing=mosflm-axes-latt -p 5ht2b.cell" # Parameters for indexamajig

NPROC=16 # Number of CPU cores to use per job
MAX_TIME=01:00:00 # Maximum usage time of one node. hh:mm:ss
SPLIT=1000  # Number of frames per job/node

CRY_V=CrystFEL/0.6.3-PReSTO # Version (module name) of CrystFEL to use
CCP4_V=CCP4/7.0.045-SHELX-ARP-7.6-PReSTO # Version (module name) of CCP4. Needed for mosflm
XDS_V=XDS/20170923-PReSTO # Version (module name) of XDS
L_BIN=/home/x_user/local_bin # Path to extra executables. mosflm

PROPOSAL=snic2017-1-xxx # SNIC proposal for compute time usage
MAIL=name.surname@lu.se  # Email address for SLURM notifications

################################################################################################################

if [[ $MULTI_EVENT_FILES == TRUE ]]; then
   module load $CRY_V
   list_events -i $INPUT -g $GEOM -o events-${RUN}.lst
   if [[ $? != 0 ]]; then
      echo "list_events failed"
      exit 1
   fi
elif [[ $MULTI_EVENT_FILES == FALSE ]]; then
   cp $INPUT events-${RUN}.lst
else
   echo "Have to pick if MULTI_EVEN_FILES is TRUE or FALSE"
   exit 1
fi

# Count total number of events
wc -l events-${RUN}.lst

# Split the events up, will create files with $SPLIT lines
split -a 3 -d -l $SPLIT events-${RUN}.lst split-events-${RUN}.lst

# Clean up
rm -f events-${RUN}.lst

# Loop over the event list files, and submit a batch job for each of them
for FILE in split-events-${RUN}.lst*; do

    # Stream file is the output of crystfel
    STREAM=`echo $FILE | sed -e "s/split-events-${RUN}.lst/${RUN}.stream/"`

    # Job name
    NAME=`echo $FILE | sed -e "s/split-events-${RUN}.lst/${RUN}-/"`
 
    echo "$NAME: $FILE  --->  $STREAM"

    SLURMFILE="${NAME}.sh"

    echo "#!/bin/sh" > $SLURMFILE
    echo >> $SLURMFILE

    echo "#SBATCH --account   $PROPOSAL" >>$SLURMFILE
    echo >> $SLURMFILE

    echo "#SBATCH --time=$MAX_TIME" >> $SLURMFILE
    echo "#SBATCH --nodes=1 --exclusive" >> $SLURMFILE
    echo >> $SLURMFILE

    echo "#SBATCH --workdir   $PWD" >> $SLURMFILE
    echo "#SBATCH --job-name  $NAME" >> $SLURMFILE
    echo "#SBATCH --output    $NAME.out" >> $SLURMFILE
    echo >> $SLURMFILE

    echo "#SBATCH --mail-type ALL" >> $SLURMFILE
    echo "#SBATCH --mail-user $MAIL" >> $SLURMFILE
    echo >> $SLURMFILE

    echo "module load $CRY_V" >> $SLURMFILE  # Load CrystFEL
    echo "module load $CCP4_V" >> $SLURMFILE  # Load CCP4
    echo "module load $XDS_V" >> $SLURMFILE  # Load XDS
    echo "PATH=\$PATH:$L_BIN" >> $SLURMFILE  # Add path with extra executables
    echo >> $SLURMFILE

    command="indexamajig -i $FILE -o $STREAMDIR/$STREAM"
    command="$command -j $NPROC -g $GEOM"
    command="$command $PARAM"  # Indexing and other parameters added here

    echo $command >> $SLURMFILE

    if [ $LAUNCH == TRUE ]; then
      sbatch $SLURMFILE
    fi

done
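When all the split jobs have finished, the partial stream files can be concatenated and the intensities merged. A hedged sketch, assuming the RUN and STREAMDIR settings above; the point-group symbol (222) and output names are placeholders:

cat output/trial_run1.stream* > trial_run1_all.stream             # combine the per-job streams
module load CrystFEL
process_hkl -i trial_run1_all.stream -o trial_run1.hkl -y 222     # Monte Carlo merge of intensities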
