Table 1. Analytical and semi-analytical expressions for simple shapes.
Small-angle scattering (SAS) experiments applied to nano-scaled systems allow the investigation of the constituents’ overall shape, size, internal structure, and arrangement. A standard scattering experiment requires a relatively simple setup and is often applied to investigate a system of particles. In these cases, the measured scattering intensity represents an average over a large number of particles illuminated by the incoming beam. The scattering intensity can be calculated and modeled with analytical or semi-analytical expressions or with numerical methods. This chapter gives an overview of currently available simulation and modeling methods for SAS, for systems composed of either oriented or randomly oriented particles. Examples demonstrating the use of the finite element method are presented, as well as a newly developed method for calculating the scattering intensity of oriented particles.
- small-angle scattering
- finite element method
- oriented particles
- numerical methods
1. Introduction
The investigation of the internal structure of a system at the nanoscale permits its microstructure to be understood and correlated with its macroscopic properties. Theoretical and experimental methods are widely used to predict and characterize the properties of these systems. Density functional theory (DFT), molecular dynamics (MD) simulations, and Monte Carlo (MC) simulations are just a few examples of theoretical methods used for these investigations [2, 3]. However, the predictions of all these theoretical methods must always be checked against experimental results, for which a large number of experimental methods are available. Imaging techniques, when applicable, are very useful since they can provide a direct indication of the shape and size of the investigated system. Electron microscopy (EM) methods such as transmission electron microscopy (TEM) and scanning electron microscopy (SEM) give important high-resolution information on the structures [4, 5, 6]. However, these methods demand special experimental conditions, such as measurements in vacuum and the use of coating agents, so the obtained results can be affected by the experimental technique itself. Scattering/diffraction methods, on the other hand, can be used for systems directly in solution or in an amorphous matrix, with minimal interaction of the radiation with the matter [7, 8]. These methods, namely small-angle scattering (SAS) with either neutrons (SANS) or X-rays (SAXS), static light scattering (SLS), etc., can provide useful information about the structure of the investigated system. However, scattering methods give information in Fourier space (reciprocal space/scattering space), which can make its interpretation and modeling difficult [8, 9, 10].
In this chapter, the calculation of scattering patterns from systems composed of particles is reviewed. First, the basic scattering theory and the inverse scattering problem are discussed. Then, several analysis and modeling methods are described. Finally, state-of-the-art methods with advanced applications are shown, demonstrating the possibility of simulating scattering patterns for oriented particles.
2. Overall aspects of small-angle scattering
There are several approaches for describing the interaction of electromagnetic radiation with matter. In this chapter, the scattering of an incident beam of radiation by a scattering potential will be assumed [7, 11, 12, 13, 14]. A schematic view of the scattering process is shown in Figure 1.
The potential is assumed to be weak (first Born approximation), the scattering is considered to be elastic, and it is also assumed that the radiation does not destroy the internal structures. The target is considered to be sufficiently thin that multiple scattering events can be disregarded. In this description a plane monochromatic wave (far-field approximation) is scattered by a finite potential field, and the wave function that expresses this phenomenon is a superposition of the transmitted plane wave and the scattered wave [7, 12, 13, 14].
If the scattering potential is weak, the potential function is equal to the scattering length distribution function, which is directly related to the particle shape. This information is contained in the scattering amplitude, as shown below [7, 12, 13, 14]:
where q is the momentum transfer vector, with modulus q = 4π sin θ/λ (λ is the radiation wavelength and 2θ the scattering angle), A0 is the amplitude scattered by one electron (or atom), and the particle form factor is the Fourier transform of the scattering length distribution function [7, 12, 13, 14],
For particles immersed in a solvent or in a matrix, the important quantity is the scattering length contrast between the particles and the medium. The scattering length density of the matrix is assumed to be constant, and therefore the scattering length contrast is the difference between the scattering length density of the particle and that of the matrix. The scattering form factor is then rewritten as [7, 12, 13, 14]
The intensity scattered by an object is the product of the scattering amplitude and its complex conjugate,
The scattering intensity can then be rewritten as
If there is no preferential orientation in the system, it is necessary to average over the particle orientations. In Eq. (7) this average leads to the average correlation function, or the pair distance distribution function, which is widely used in SAS analysis [7, 12, 13, 14].
An interesting approach is to consider that the particle, or system, is composed of n scatterers, each with its own scattering length contrast. Each scatterer contributes the scattering amplitude of Eq. (4), and the resulting scattering amplitude is the sum of these amplitudes weighted by their phase factors,
It is worth mentioning that Eq. (9) can represent either a single particle composed of n subunits or a system composed of particles dispersed in a matrix. Both situations are described by this equation, and several simulation methods are based on it.
By assuming no preferential orientation, from Eq. (9), one obtains the resulting average scattering intensity,
Eq. (11) is written in terms of the so-called averaged normalized form factor of the particle. This equation is very important because it demonstrates that the scattering intensity from a system of particles at very low concentration is proportional to the scattering of a single particle.
If the system is concentrated, the second term in Eq. (10) cannot be neglected. Depending on the system characteristics, several approximations can be made. It is beyond the scope of this chapter to consider all possible approaches for the calculation of this interference term for concentrated systems; good reviews can be found in the literature [14, 16, 17, 18, 19, 20, 21, 22]. A usual approach is to decouple the particle shape from the interparticle interactions. In this way, the particle form factor and the system structure factor are introduced:
In a typical scattering experiment, after interacting with the sample, the scattered radiation is generally recorded on a two-dimensional detector. Depending on the sample, the obtained image is isotropic or anisotropic, and these patterns are related to the particle shape and size and to possible interparticle interactions. The collected scattering intensity is a direct representation of the data in reciprocal space. Therefore, the analysis of a SAS experiment consists of interpreting these data in order to retrieve structural information in real space. Even though real space is three-dimensional, the collected scattering data are two-dimensional (a projection on a specific plane) or one-dimensional (randomly oriented particles or a specific direction). Several modeling methods will be discussed for the calculation of scattering intensities from oriented and randomly oriented particles dispersed in a homogeneous matrix. Examples of these methods are analytical and semi-analytical expressions, the cube and sphere methods, spherical harmonics, the optimized Debye formula for oriented and randomly oriented systems, and the fast Fourier transformation [12, 14, 23, 24].
3. Modeling methods for SAS data
After the scattering data are collected, several procedures are necessary before the scattering intensity is ready to be analyzed. The data treatment includes normalization of the intensity, background subtraction, and normalization to absolute scale, among other steps, which depend on the specific characteristics of the experimental setup. The overall data treatment process and the procedures necessary for proper reduction of the scattering data are described in many articles and books and will not be presented here [7, 12, 14, 15, 23, 24, 25, 26, 27, 28, 29]. In this chapter we focus on methods for the calculation of the SAS intensity, for either oriented or randomly oriented particles.
|Form factor amplitude|
|R is the radius of the sphere.|
|Rin and Rout are the inner and outer radii of the shell, and V is the volume of a sphere.|
|R1, R2 and R3 are the semi-axes of the ellipsoid.|
|R is the radius, L is the length of the cylinder and J1(x) is the first-order Bessel function of the first kind.|
|a, b and c are the edge lengths.|
The calculated intensity can be compared with experimental scattering data, and the model parameters can be optimized in order to improve the agreement between the theoretical and experimental data.
3.1. Analytical and semi-analytical methods
For particles with a simple shape, analytical or semi-analytical expressions for the scattering intensity are available. There are a large number of examples in the literature, and some are shown in Table 1.
The analytical and semi-analytical equations have the advantage of giving the scattering intensities with good precision, low computational cost, and a very small number of model parameters. If the particles are randomly oriented, it is necessary to perform angular averages of the equations shown in Table 1. Also, if the system is dilute but polydisperse in size, the resulting average scattering intensity can be calculated with appropriate equations, which are described in the literature.
Although the calculation is fast and precise, analytical or semi-analytical expressions are only available for simple shapes [13, 14, 16, 30]. Several programs with a large database of equations for modeling scattering data are available, such as the SASfit program, among many others. An updated list of available programs is maintained on the webpage smallangle.org.
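As an illustration of how cheap an analytical model is to evaluate, the following minimal sketch computes the well-known normalized amplitude of a homogeneous sphere, 3[sin(qR) − qR cos(qR)]/(qR)³, and a simple size average for a dilute polydisperse system. The function names and the discrete, number-weighted size distribution are assumptions made for this example; they are not taken from any of the programs cited above.

```python
import math

def sphere_amplitude(q, radius):
    """Normalized form factor amplitude of a homogeneous sphere, F(0) = 1."""
    x = q * radius
    if abs(x) < 1e-2:
        # Series expansion avoids the numerical cancellation near x = 0.
        return 1.0 - x * x / 10.0 + x**4 / 280.0
    return 3.0 * (math.sin(x) - x * math.cos(x)) / x**3

def sphere_intensity(q, radius):
    """Normalized intensity F(q)^2; no angular average is needed for a sphere."""
    return sphere_amplitude(q, radius) ** 2

def polydisperse_intensity(q, radii, weights):
    """Number-weighted average of V^2 F(q)^2 over a discrete size distribution
    (dilute system, so no interparticle interference is included)."""
    total = 0.0
    for r, w in zip(radii, weights):
        volume = 4.0 / 3.0 * math.pi * r**3
        total += w * volume**2 * sphere_intensity(q, r)
    return total / sum(weights)
```

For example, `polydisperse_intensity(0.05, [45.0, 50.0, 55.0], [1.0, 2.0, 1.0])` averages three sphere sizes; the sharp minima of the monodisperse curve are smeared out by such an average.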
3.2. Cube method
Fedorov et al. [32, 33, 34] and Ninio et al. proposed the so-called cube method, in which the model of a macromolecule in solution is surrounded by the solvent (or by the matrix in which the particles are immersed). The cube method permits a correct calculation of the volume inaccessible to the solvent. The theoretical intensity is given by
where one term is the scattering amplitude of the macromolecule in vacuum and the other is the amplitude of a body with the same volume as the molecule but with homogeneous electron density ρ0 [32, 33]. The scattering amplitude of a protein macromolecule with known atomic coordinates can be calculated with the equation,
where the sum runs over the atoms, each with its own scattering factor and coordinate. The determination of the scattering amplitude of the homogeneous substance filling the macromolecule and its excluded volume is not trivial, and several authors have proposed solutions for it [21, 32, 33, 35, 36].
The idea is to place the macromolecule coordinates on a cubic grid composed of small cubes with edges of 0.5–1.5 Å. The calculated intensity depends on a specific direction. In order to perform the random-orientation average over direction Z, one can take N directions, in reciprocal space, on a sphere of radius
In 2011, Virtanen and collaborators presented an adaptation of the cube method [37, 38], using a procedure known as HyPred. These authors drew on the cube method to simulate scattering intensities and on molecular dynamics (MD) simulations to find the hydration layer of a protein. In this procedure the nonuniform solvent density around a protein is calculated with atomic resolution. With this information one can calculate both small- and wide-angle X-ray scattering (SAXS/WAXS) intensities. In 2014, Nguyen and collaborators presented another adaptation of the cube method, using the reference interaction site model (RISM) theory. In this application the cube method is used to calculate the solvent contribution to the scattering amplitude, and in 2016 Nguyen and collaborators proposed a procedure to extract information about water and ion distributions from the analysis of SAXS experiments. This method computes the solvent distribution around the solute, which allows SAXS/WAXS intensities to be calculated with less computational time than MD [39, 40]. One example of these applications, using RISM-SAXS and HyPred, is shown in Figure 2 for lysozyme; both simulations agree well with the experimental data. Results from other programs are also shown for comparison. CRYSOL obtained a good fit to the experiment up to 1.5 Å−1. The program CRYSOL has been used as the standard program for such calculations and uses the multipole expansion to calculate scattering intensities; this approach is discussed in Section 3.4. The web server FoXS is based on the Debye formula, which is discussed in the next section.
The web server AXES calculates the scattering amplitude of the surface solvent using an averaged sum of six elementary scattering functions. The web server AquaSAXS computes the SAXS/WAXS profile of a given structure; a PDB or PQR file is required to perform the calculation.
The HyPred method is very useful for the determination of excluded volumes and contrasts. However, it requires the numerical calculation of the intensities, and if the cubic grid is very fine, the computational time for the calculation of the intensity is very long. Approaches using spherical harmonics proved to be more efficient and precise for the calculation of scattering intensities in the usual investigations of macromolecules in solution [36, 41].
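The numerical orientational average mentioned in this section can be sketched as follows. The amplitude is evaluated as a sum over point scatterers for N directions on a sphere of radius q, and the squared moduli are averaged. This is only an illustration: the Fibonacci-lattice choice of directions and all function names are assumptions of this example, and neither the cubic grid nor the excluded-volume term of the cube method is included.

```python
import cmath
import math

def fibonacci_directions(n):
    """Return n approximately uniformly distributed unit vectors on the sphere."""
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    directions = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n          # midpoints of n equal-area bands
        rho = math.sqrt(max(0.0, 1.0 - z * z))
        phi = k * golden_angle
        directions.append((rho * math.cos(phi), rho * math.sin(phi), z))
    return directions

def amplitude(qvec, coords, factors):
    """A(q) = sum_j f_j exp(i q . r_j) for point scatterers."""
    total = 0j
    for (x, y, z), f in zip(coords, factors):
        phase = qvec[0] * x + qvec[1] * y + qvec[2] * z
        total += f * cmath.exp(1j * phase)
    return total

def orientation_averaged_intensity(q, coords, factors, n_directions=500):
    """Average |A(q)|^2 over n_directions scattering vectors of modulus q."""
    total = 0.0
    for ux, uy, uz in fibonacci_directions(n_directions):
        a = amplitude((q * ux, q * uy, q * uz), coords, factors)
        total += abs(a) ** 2
    return total / n_directions
```

For two unit scatterers a distance d apart, the result approaches 2 + 2 sin(qd)/(qd) as the number of directions grows, which gives a quick check of the averaging.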
3.3. Sphere method and Debye formula
Considering a system composed of
and calculating the average over all possible orientations,
Here each subunit scatters with the amplitude of a sphere (Eq. (13)).
The Debye equation is very useful because the volume of the particle can be composed of a sum of small spherical volumes. This modeling method, also known as the finite element (FE) method, allows the particle shape to be described by small subunits.
The main advantage of this method is that one can easily model very complex objects. However, it has the disadvantage that the calculation is proportional to
Oliveira and collaborators [45, 46] used this kind of procedure in the first analysis of nanocage structures by scattering techniques. The authors were interested in the factors that influence the stability and yield of DNA octahedron nanocages built in solution from double and single DNA strands. For the modeling and the comparison with experimental data, double-helix DNA models were positioned on the edges of an octahedron, which was truncated by single DNA strands acting as linkers between the helices. Altogether, there are 12 double-stranded B-DNA helices with 18 base pairs each (positioned on the edges) and 24 single strands (forming the truncations). The stability and yield of the nanocages were tested by varying the length of the single-strand linkers: three, four, five, six, or seven nucleotides.
The SAXS models were built from beads representing the DNA on the edges and in the linkers of the cages. The DNA models are rigid, and each bead is a sphere representing one nucleotide, placed at the position of the C2* atom (PDB format). The scattering intensities were simulated using the Debye equation, Eq. (25). The results are shown in Figure 3, where the simulated intensity was fitted to the experimental SAXS data for each of the five kinds of nanocage. From this analysis it was possible to obtain the relation between the cage size and the linker size and also to detect the presence of higher-order agglomerates (dimers and trimers of cages).
Even with the increasing performance of computer processors, the use of the Debye equation is limited to a few dozen subunits, since it involves a double sum. The next sections present some procedures that speed up the calculation and decrease the computational cost.
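The double sum that makes the Debye equation expensive can be made concrete with a short sketch. It assumes identical point-like subunits with unit scattering length; the sphere amplitude of the beads would enter as a common multiplicative factor and is left out here. The function names are chosen for this example only.

```python
import math

def sinc(x):
    """sin(x)/x with the x -> 0 limit handled explicitly."""
    return 1.0 if abs(x) < 1e-12 else math.sin(x) / x

def debye_intensity(q, coords):
    """Orientationally averaged intensity of n identical point-like subunits.

    Direct O(n^2) evaluation of sum_i sum_j sin(q r_ij) / (q r_ij).
    """
    n = len(coords)
    total = float(n)                       # the i == j terms each contribute 1
    for i in range(n - 1):
        for j in range(i + 1, n):
            r_ij = math.dist(coords[i], coords[j])
            total += 2.0 * sinc(q * r_ij)  # counts (i, j) and (j, i) together
    return total
```

At q = 0 the function returns n², and doubling the number of subunits roughly quadruples the number of distance evaluations, which is the scaling discussed above.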
3.4. Spherical harmonics and multipole expansion
In the late 1960s, Harrison and later Stuhrmann and Svergun [36, 49] proposed an alternative procedure to compute scattering intensities for particles. The main idea is to express the scattering length distribution function as a series of spherical harmonics, which describes an angular envelope function,
The envelope function is parameterized using the multipole expansion
where the angular functions are spherical harmonics and the multipole coefficients are complex numbers,
The scattering amplitude is given by,
The spatial resolution of the shape representation (Eq. (27)) is defined by the truncation value L. Thus the particle shape is parameterized by members. Also, the accuracy of its representation increases with
The shape scattering intensity is expressed as
where the partial amplitudes are represented by the power series,
The use of spherical harmonics permits the description of low-resolution shapes with a relatively small number of parameters, and it was the first approach capable of obtaining the particle shape directly from the scattering intensity, without any a priori information. This is the first of the so-called ab initio modeling methods for SAS data analysis. The method was implemented in the program SASHA, which provides the angular envelope function that gives the best fit of the scattering data. It is a good option for determining low-resolution structures without internal cavities, sharp edges, or corners; that is, it is limited to smooth shapes.
One example of application is shown in Figure 4. In this work, Arndt and collaborators investigated extracellular proteins. Using SAXS, in particular the spherical harmonics approach (program SASHA), it was possible to obtain low-resolution models for the protein
The description of the particle shape using the envelope function was a major step for the calculation of the scattering from macromolecules in solution. Given the atomic coordinates of the macromolecule, it is possible to calculate its scattering intensity and excluded volume. This was implemented in the program CRYSOL and readily demonstrated the presence of a hydration shell around macromolecules in solution. The use of spherical harmonics permits a very fast calculation of the scattering intensity and opened new research lines and opportunities for the use of SAS data.
In the late 1990s, Svergun’s group proposed a set of tools combining the use of spherical harmonics to calculate scattering amplitudes with a variation of the Debye equation. In the program called DAMMIN, a search space filled with spherical beads is created, and by the use of a heuristic optimization based on a Monte Carlo approach (simulated annealing, SA), a subset of these spheres is selected in order to provide the best fit of the scattering data. The expression used for the calculation of the scattering intensity is
The use of spherical harmonics speeds up the calculation, overcoming the main drawback of the original Debye equation (Eq. (25)), namely its high computational cost.
Other ab initio methods using the dummy atom approach were proposed by many other authors but using optimized implementations of the Debye equation (see the next section). Chacon [54, 55] proposed ab initio methods using genetic algorithm procedures for the model optimization. A modified procedure was proposed by Doniach and collaborators, replacing the genetic algorithm with the so-called “Give‘n’Take” algorithm. Due to its functionality and special features (inclusion of symmetry constraints, multiple curve fitting, etc.), the program DAMMIN is the most used and cited in the literature.
Further developments of ab initio methods for the study of macromolecules in solution took advantage of the atomic-resolution information available for proteins in the Protein Data Bank (https://www.wwpdb.org/) and are part of the ATSAS program suite [17, 41, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69]. Several good reviews of this subject can be found in the literature [14, 17, 36, 50, 60, 65, 67, 68, 69].
3.5. Optimized Debye equation
The Debye equation assumes that the subunits are identical and that the arrangement of the subunits defines the particle shape. As mentioned before, the double sum involved in the calculation limits the number of subunits, since the computational time increases as O(n²). In order to decrease the computational time, Glatter proposed the use of histograms of distances inside the particle. With this procedure the double sum, Eq. (25), is converted into a single sum [70, 71],
by the use of the histogram of distance between the subunits
Figure 5A compares the intensities from the analytical equations (Table 1) with those of the same models built with the FE method and calculated with the optimized Debye equation. The very good agreement between the analytical and the calculated intensities demonstrates the precision of the method and supports its use for calculating the scattering of complex shapes.
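The histogram idea can be sketched in a few lines. The pair distances are binned once, and each intensity point is then a single sum over the occupied bins instead of a double sum over subunits. The bin width, the use of bin centres as representative distances, and the function names are assumptions of this example, and identical point-like subunits are assumed.

```python
import math

def distance_histogram(coords, bin_width):
    """Histogram of pair distances; returns {bin_index: count} for pairs i < j."""
    hist = {}
    n = len(coords)
    for i in range(n - 1):
        for j in range(i + 1, n):
            k = int(math.dist(coords[i], coords[j]) / bin_width)
            hist[k] = hist.get(k, 0) + 1
    return hist

def histogram_debye_intensity(q, n_subunits, hist, bin_width):
    """Debye intensity evaluated as a single sum over the histogram bins."""
    total = float(n_subunits)              # self terms (i == j)
    for k, count in hist.items():
        x = q * (k + 0.5) * bin_width      # q times the bin-centre distance
        total += 2.0 * count * (math.sin(x) / x if x > 1e-12 else 1.0)
    return total
```

The O(n²) work is paid only once when the histogram is built; every further q value costs a sum over the bins. The price is a discretization error that is controlled by the bin width.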
In Figure 5B two models were built using the FE method. The red (internal) model is a DNA cage. The DNA molecule was modeled using a coarse-grained approach, so each nucleotide corresponds to one spherical subunit placed at the position of the C2* atom. The blue model is an icosahedral shell-like structure. The composed model is an icosahedral shell with a DNA cage in its interior. The scattering intensity was calculated using the optimized Debye equation in a slightly different implementation, in order to include different contrasts and demonstrate the potential of this approach. As shown in the original article, the histogram approach also permits an easy computation of affine polydispersities [8, 73].
3.6. Fast Fourier transformation
Schmidt-Rohr proposed a direct method, based on the Fourier transform (FT) of a three-dimensional model, to calculate the scattering intensity. Basically, the three-dimensional model is defined, a priori, on a cubic discrete lattice of dimension
To obtain the intensity for a subunit, one can use the equation below,
For a system of identical particles, the total intensity is represented by the convolution of the particles’ center-of-mass positions with the density distribution of one particle,
Schmidt-Rohr and Chen [74, 76] applied this method to quantitatively simulate small-angle scattering data of hydrated Nafion and were able to explain the “ionomer peak” seen in SAXS patterns, relating it to randomly packed water channels inside cylindrical inverted micelles. These results demonstrated the good transport properties of hydrated Nafion and gave details about its internal structure, such as the diameters of the water channels, cluster sizes, the shape of the channels, and crystallinity levels.
An advantage of this method is its computational cost, O(N log N); on the other hand, it does not give good results for systems where the SAS features are of the order of the system size. For this kind of system, the Monte Carlo distribution function method (MC-DFM) gives better results (the MC-DFM uses the optimized Debye equation, Eq. (33), to simulate scattering intensities). Olds and collaborators compared the efficiency of these two methods and suggest that the FFT method is more efficient for dense systems and complex dense-packed particle systems such as high-density polydisperse hard-sphere models. For dense arrays of monodisperse spheres, the FFT method can be at least three times more efficient. However, for low-density systems such as extended polymers and dilute systems, the FFT is inefficient; it is also less useful for systems where it is possible to use the diffuse character of the model and the atomic coordinates. So, for large model particle systems, dilute particle arrays, polymers, and proteins, the MC-DFM can be the more efficient procedure.
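A minimal sketch of the FFT route, assuming NumPy is available, is given below. A density defined on a cubic lattice is Fourier transformed and the squared modulus is taken as the intensity on the reciprocal lattice. The grid size, the voxel size, and the test object (a homogeneous sphere) are arbitrary choices for this example and are not the settings of the works cited above.

```python
import numpy as np

def fft_intensity(density, voxel_size):
    """Return (q_axis, intensity) for a density sampled on a cubic lattice.

    The intensity is |FT[density]|^2 on the reciprocal lattice; q_axis holds
    the angular scattering-vector components along each lattice direction.
    """
    # The factor voxel_size**3 turns the discrete sum into an approximation
    # of the Fourier integral over the volume.
    amplitude = np.fft.fftn(density) * voxel_size**3
    intensity = np.abs(amplitude) ** 2
    n = density.shape[0]
    q_axis = 2.0 * np.pi * np.fft.fftfreq(n, d=voxel_size)
    return q_axis, intensity

# Example: a homogeneous sphere of radius 8 voxels on a 32^3 lattice.
n, voxel, radius = 32, 1.0, 8.0
axis = (np.arange(n) - n / 2) * voxel
x, y, z = np.meshgrid(axis, axis, axis, indexing="ij")
density = (x**2 + y**2 + z**2 <= radius**2).astype(float)
q_axis, intensity = fft_intensity(density, voxel)
```

The result is a full three-dimensional intensity map, so oriented patterns can be read off as slices, while an isotropic curve would require an additional radial average. The smallest accessible q is set by the box size and the largest by the voxel size.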
3.7. Optimized Debye equation for oriented particles
The FE method can also be used to calculate the scattering intensities for oriented systems or particles. The calculation can be performed with Eq. (9), but the practical application of this formula is limited since it involves a double sum and vectorial arguments, which makes the computational cost very high. Since the particles are oriented, the scattering intensity is anisotropic, and it is therefore necessary to compute the two-dimensional scattering pattern. A possible approach was proposed in the seminal Guinier-Fournet book and consists in applying the equation below
to calculate the scattered intensities in a specific direction. However, this equation can only be used for centrosymmetric particles, which largely limits its application.
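The restriction to centrosymmetric particles can be illustrated numerically. For identical subunits arranged symmetrically about the origin, the sine terms of the amplitude cancel in pairs, so the intensity in a given direction reduces to the square of a single cosine sum. The sketch below compares this shortcut with the general expression; the cosine-sum form is assumed here for illustration and is not claimed to be the exact notation of the cited reference.

```python
import cmath
import math

def dot(a, b):
    """Scalar product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def intensity_general(qvec, coords):
    """|sum_j exp(i q . r_j)|^2, valid for any arrangement of identical subunits."""
    a = sum(cmath.exp(1j * dot(qvec, r)) for r in coords)
    return abs(a) ** 2

def intensity_centrosymmetric(qvec, coords):
    """[sum_j cos(q . r_j)]^2, correct only if every r_j has a partner at -r_j."""
    return sum(math.cos(dot(qvec, r)) for r in coords) ** 2
```

For a model containing the pairs (r, −r) the two functions agree to rounding error, whereas for an asymmetric model the cosine-only form drops the imaginary part of the amplitude and underestimates the intensity.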
Sjöberg proposed an approach to investigate the effects of interparticle correlations. In this approach the particles or molecules have a known form factor, and the correlations can be obtained by the use of single sums, as shown below,
One of the main difficulties in simulating anisotropic two-dimensional scattering patterns is the computational time required to perform the calculation. For example, to make a scattering image with a side of m = 512 pixels (a total of 262,144 pixels), it is necessary to perform the calculation for each pixel (which defines a specific value of the scattering vector) and for each scatterer (
In order to overcome these limitations, Alves and collaborators recently proposed an innovative procedure. Inspired by the histogram approach used in the optimized Debye equation, Eq. (33), it is possible to convert the double sum in Eq. (9) into a single sum over the bins of the histogram. This new equation
permits the fast calculation of the scattering intensity in a given direction. Its ingredients are the number of histogram channels, a unit vector along the scattering direction, and the distance vectors between the subunits composing the model.
The dot products of the unit scattering vector with the distance vectors are used to create the histogram of projected distances in a specific direction. The construction of the histograms still involves a double sum but is performed only once. Having the histograms, the intensities are easily calculated. One strategy proposed by Alves et al. is to divide the 2D scattering image into angular slices and compute the histograms for each direction. As shown by the authors, the calculation can be further optimized by the use of parallel computing.
The precision of the method is demonstrated by the use of known analytical equations for simple shapes, such as the ones shown in Table 1. Several examples demonstrating the precision of the method are described in the original article. This new method opens a large number of possibilities for the calculation of scattering intensities for oriented particles.
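The projected-distance histogram can be sketched as follows for identical point-like subunits. For a fixed unit vector u, the oriented intensity is the sum over pairs of cos(q u·r_ij), so binning the projections u·r_ij once gives the intensity at every q along that direction as a single sum over the bins. This is a simplified illustration under those assumptions, with names chosen for this example; it is not the implementation of Alves and collaborators.

```python
import math

def projection_histogram(coords, direction, bin_width):
    """Histogram of |u . (r_i - r_j)| over pairs i < j for a unit vector u."""
    # u . (r_i - r_j) is the difference of the projected positions.
    s = [sum(u * c for u, c in zip(direction, r)) for r in coords]
    hist = {}
    for i in range(len(s) - 1):
        for j in range(i + 1, len(s)):
            k = int(abs(s[i] - s[j]) / bin_width)
            hist[k] = hist.get(k, 0) + 1
    return hist

def oriented_intensity(q, n_subunits, hist, bin_width):
    """Intensity at modulus q along the direction used to build hist."""
    total = float(n_subunits)              # self terms (i == j)
    for k, count in hist.items():
        # cos is even, so only the absolute projected distance is needed.
        total += 2.0 * count * math.cos(q * (k + 0.5) * bin_width)
    return total
```

A two-dimensional pattern is then assembled by repeating the histogram construction for each angular slice of the image; the slices are independent of one another, which is what makes the scheme easy to parallelize.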
Recent X-ray free-electron laser (FEL) sources are capable of producing intense ultrashort (femtosecond) pulses in nanometer-sized coherent beams, irradiating particles in solution. Due to the special properties of these experiments, it is possible to irradiate single particles. Since the pulse durations are shorter than the characteristic rotational diffusion time of the particle, the obtained scattering intensity corresponds to a particle oriented in a given direction. Therefore, if the system is composed of identical particles, multiple scattering images correspond to the scattering intensities from multiple orientations of the particles.
Several authors have proposed methods to describe the coherent scattering pattern and recover the three-dimensional structure of the scattering particle, based on the method proposed by Kam [78, 79, 80]. The method described above for the calculation of oriented scattering intensities can potentially be used to describe data from FEL experiments. To demonstrate this potential, Figure 6 presents the two-dimensional scattering pattern simulated for the protein lysozyme in several orientations. This simulation method can also describe models with variable scattering length contrasts and interparticle interactions (structure factor). Several examples can be found in the original article.
4. General conclusions and perspectives
In this chapter a general overview of several procedures to calculate scattering intensities for systems of particles was presented. After a brief description of the general theoretical aspects, several methods for the calculation of scattering intensities were shown, with some typical applications. The main points and limitations of each procedure were discussed.
The analytical calculation of the scattering intensity is restricted to particles with simple geometries. More complicated shapes require the use of simulation methods. The Debye equation provided a first step in this direction by using spherical subunits to build the particle (the finite element, FE, description) and calculating the orientationally averaged scattering intensity. Its variation with cubic subunits gives the so-called cube method. This approach permitted a better calculation of excluded volumes but requires numerical averaging to account for the random orientation of the particles. The original Debye equation involves a double sum, which is very inefficient (high computational cost) and cannot be applied to a large number of subunits. Optimized forms of the Debye equation were proposed using histograms of pair distances, which turn the double sum over the number of subunits into a single sum over the histogram bins. In this way, the method can be used for fast calculation of scattering intensities and in modeling methods. Another modeling method is the use of spherical harmonics for the calculation of the scattering intensity. With the introduction of the envelope function to describe the particle shape, this method proved to be very powerful for the description of proteins in solution and of hydration layers. In the last decades, this approach and its development, combined with ab initio methods, promoted a revolution in the use of scattering data for the investigation and modeling of macromolecules in solution. Fast Fourier transformation methods have recently been applied to calculate the scattering patterns for known shapes, with very interesting applications. Also, based on the FE method, one can use a special development of the optimized Debye equation to compute scattering intensities for oriented particles.
This innovative approach permits the fast calculation of 2D scattering patterns and provides new perspectives for the use and analysis of the small-angle scattering method.
CLPO is supported by FAPESP, CNPQ, and INCT-FCx. We acknowledge the Crystallography Journals Online (https://journals.iucr.org/), AIP Publishing LLC, American Chemical Society, and John Wiley & Sons, Inc. for permitting the reproduction of the figures used.
Conflict of interest
The authors declare that they have no conflict of interest.