Analytical and semi-analytical expressions for simple shapes.

## Abstract

Small-angle scattering (SAS) experiments applied to nano-scaled systems allow the investigation of the constituents’ overall shape, size, internal structure and arrangement. A standard scattering experiment requires a relatively simple setup and is often applied to investigate a system of particles. In these cases, the measured scattering intensity represents an average over a large number of particles illuminated by the incoming beam. The scattering intensity can be calculated and modeled with analytical/semi-analytical expressions or with numerical methods. This book chapter gives an overview of the currently available simulation/modeling methods for SAS, for systems composed of either oriented or randomly oriented particles. Examples demonstrating the use of the finite element method are presented, as well as a newly developed method for calculating the scattering intensity of oriented particles.

### Keywords

- small-angle scattering
- nanoparticles
- finite element method
- oriented particles
- simulation
- numerical methods

## 1. Introduction

Investigating the internal structure of a system at the nanoscale makes it possible to understand its microstructure and correlate it with its macroscopic properties. Theoretical and experimental methods are widely used to predict and characterize the properties of these systems [1]. Density functional theory (DFT), molecular dynamics (MD) simulations, and Monte Carlo (MC) simulations are just a few examples of theoretical methods used for these investigations [2, 3]. However, the predictions of these theoretical methods always have to be checked against experimental results, which can be obtained with a large number of available experimental methods. Imaging techniques, when applicable, are very useful since they can provide a direct indication of the shape and size of the investigated system. Electron microscopy (EM) methods such as transmission electron microscopy (TEM) and scanning electron microscopy (SEM) give important high-resolution information on the structures [4, 5, 6]. However, these methods demand special experimental conditions, such as measurements in vacuum and the use of coating agents. Therefore, the obtained results can be affected by the experimental technique itself [7]. Scattering/diffraction methods, on the other hand, can be used for systems directly in solution or in an amorphous matrix, with minimal interaction of the radiation with the matter [7, 8]. These methods, namely small-angle scattering (SAS) with either neutrons (SANS) or X-rays (SAXS), static light scattering (SLS), etc., can provide useful information about the structure of the investigated system. However, scattering methods give information in Fourier space (reciprocal space/scattering space), which can make interpretation and modeling difficult [8, 9, 10].

This book chapter reviews the calculation of scattering patterns from systems composed of particles. First, the basic scattering theory and the inverse scattering problem are discussed. Then, several analysis and modeling methods are described and discussed. Finally, state-of-the-art methods with advanced applications are shown, demonstrating the possibility of simulating scattering patterns for oriented particles.

## 2. Overall aspects of small-angle scattering

There are several approaches for describing the interaction of electromagnetic radiation with matter. In this chapter, the scattering of an incident beam of radiation by a scattering potential will be assumed [7, 11, 12, 13, 14]. A schematic view of the scattering process is shown in Figure 1.

The potential is assumed to be weak (first Born approximation), and therefore the scattering is considered to be elastic; it is also assumed that the radiation does not destroy the internal structures. The target is considered to be sufficiently thin in order to disregard multiple scattering events. In this description a plane monochromatic wave (far-field approximation) is scattered by a finite potential field

If the scattering potential is weak, the function

where _{0} is an amplitude scattered by one electron (or atom), and the particle form factor

For the cases of particles immersed in a solvent or in a matrix, the important quantity is the scattering length contrast between the particles and the medium. The scattering length density of the matrix is assumed to be constant, and therefore the scattering length contrast is given by

The scattering intensity

Or, by using the self-correlation function [7, 12, 13, 14],

the scattering intensity can be rewritten,

If there is no preferential orientation in the system, it is necessary to perform averages over the particle orientation. In Eq. (7) this average gives rise to the calculation of the average correlation function

An interesting approach is to consider that the particle, or system, is composed of *n* scatterers with scattering length contrast

Therefore, the total scattering intensity from the group of *n* scatterers at relative positions

It is worth mentioning that Eq. (9) can represent a single particle composed of *n* subunits or a system composed of particles dispersed in a matrix. Both situations are described by this equation, and several simulation methods are based on it.

By assuming no preferential orientation, from Eq. (9), one obtains the resulting average scattering intensity,

For a system composed of particles at very low concentration, the interference term (second part of Eq. (10)) goes to zero, and the resulting average scattering intensity is [7, 12, 13, 14].

where

If the system is concentrated, the second term in Eq. (10) cannot be neglected. Depending on the system characteristics, several approximations can be made. It is beyond the scope of this chapter to consider all possible approaches for the calculation of this interference term in concentrated systems; good reviews can be found in the literature [14, 16, 17, 18, 19, 20, 21, 22]. A usual approach is to decouple the particle shape and interparticle interactions. In this way, the particle form factor

In a typical scattering experiment, after interacting with the sample, the scattered radiation is recorded, generally, on a two-dimensional detector. Depending on the sample, the obtained image is isotropic or anisotropic, and these patterns are related to the particle shape and size and to possible interparticle interactions. The collected scattering intensity is a direct representation of the data in reciprocal space. Therefore, the analysis of a SAXS experiment consists of interpreting these data in order to retrieve structural information in real space. Even though the real space is three-dimensional, the collected scattering data are two-dimensional (projection on a specific plane) or one-dimensional (particles randomly oriented or a specific

## 3. Modeling methods for SAS data

After the scattering data are collected, several procedures are necessary before the scattering intensity is ready to be analyzed. The treatment of the scattering data includes, among other steps, normalization of the intensity, background subtraction, and normalization to absolute scale, depending on the specific characteristics of the experimental setup. The overall data treatment process and the procedures necessary for proper reduction of the scattering data are described in many articles and books and will not be presented here [7, 12, 14, 15, 23, 24, 25, 26, 27, 28, 29]. In this chapter we focus on methods for calculating the SAS intensity, for either oriented or randomly oriented particles.

| Shape | Form factor amplitude | Parameters |
|---|---|---|
| Sphere | Eq. (13) | $R$ is the radius of the sphere. |
| Spherical shell | Eq. (14) | $R_{in}$ and $R_{out}$ are the inner and outer radii of the shell, and $V$ is the spherical volume. |
| Tri-axial ellipsoid | Eqs. (15) and (16) | $R_{1}$, $R_{2}$ and $R_{3}$ are the semi-axes of the ellipsoid. |
| Cylinder | Eq. (17) | $R$ is the radius, $L$ is the length of the cylinder, and $J_{1}(x)$ is the first-order Bessel function of the first kind. |
| Rectangular prism | Eq. (18) | $a$, $b$ and $c$ are the edge lengths. |

The calculated intensity can be compared with experimental scattering data, and the model parameters can be optimized in order to improve the agreement between the theoretical and experimental data. The *χ*² (chi-square) test is widely used for scattering experiments because the basic assumption of this test, a Gaussian distribution of uncertainties around a certain value, is fulfilled in SAS data. In this test the sum of squares of the differences between experimental and theoretical intensities is divided by the variance on each point, as shown below [7, 17, 30]:

$$\chi^{2} = \sum_{i=1}^{N} \frac{\left[ I_{exp}(q_i) - I_{theo}(q_i) \right]^{2}}{\sigma_i^{2}}$$

where $N$ is the number of experimental points and $\sigma_i$ is the standard deviation of the experimental intensity at $q_i$.

If the *χ*² value is normalized by the difference between the number of experimental data points and the number of independent parameters, a good fit is obtained when the normalized *χ*² approaches 1. This means that the differences between experimental and theoretical data are of the order of the standard deviations.
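The normalized chi-square described above can be sketched in a few lines. This is a minimal illustration, assuming NumPy; the function name `reduced_chi_square` and its arguments are hypothetical names chosen here, not taken from any SAS package.

```python
import numpy as np

def reduced_chi_square(i_exp, i_theo, sigma, n_params):
    """Chi-square divided by (number of points - number of fitted parameters)."""
    i_exp = np.asarray(i_exp, dtype=float)
    i_theo = np.asarray(i_theo, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    dof = i_exp.size - n_params  # degrees of freedom
    if dof <= 0:
        raise ValueError("need more data points than fitted parameters")
    # Squared residuals weighted by the variance of each point
    chi2 = np.sum(((i_exp - i_theo) / sigma) ** 2)
    return chi2 / dof
```

When every residual equals one standard deviation and no parameters are fitted, the function returns 1, which is the situation the text describes as a good fit.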

### 3.1. Analytical and semi-analytical methods

For the cases where the particle has a simple shape, it is possible to have analytical or semi-analytical expressions for the scattering intensity. There are a large number of examples in the literature [30], and some examples are shown in Table 1.

The analytical and semi-analytical equations have the advantage of giving the scattering intensities with good precision, low computational cost, and a very low number of model parameters. If the particles are randomly oriented, it is necessary to perform angular averages of the equations shown in Table 1. Also, if the system is dilute but polydisperse in size, the resulting average scattering intensity can be calculated with appropriate equations, which are described in the literature [30].
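As an illustration of how cheap such expressions are to evaluate, the sketch below uses the standard textbook form factor amplitude of a homogeneous sphere, $F(q) = 3[\sin(qR) - qR\cos(qR)]/(qR)^3$, normalized so that $F(0) = 1$. It is assumed here that this corresponds to the sphere entry of Table 1; the function names are hypothetical.

```python
import numpy as np

def sphere_amplitude(q, radius):
    """Normalized form factor amplitude of a homogeneous sphere, F(0) = 1."""
    x = np.atleast_1d(np.asarray(q, dtype=float)) * radius
    # Series expansion 1 - x^2/10 avoids cancellation errors near x = 0
    f = 1.0 - x**2 / 10.0
    large = np.abs(x) > 1e-3
    xl = x[large]
    f[large] = 3.0 * (np.sin(xl) - xl * np.cos(xl)) / xl**3
    return f

def sphere_intensity(q, radius):
    """Normalized scattering intensity of a single sphere, |F(q)|^2."""
    return sphere_amplitude(q, radius) ** 2
```

Because the sphere is isotropic, no orientational average is needed; for the anisotropic shapes the amplitude would have to be averaged numerically over orientations.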

The calculation of the scattering intensity is reasonably fast and can be performed with high precision. However, analytical or semi-analytical expressions are only available for simple shapes [13, 14, 16, 30]. Several programs with large databases of equations for modeling scattering data are available, such as the SASfit program [31], among many others. In the webpage **smallangle.org**

### 3.2. Cube method

Fedorov et al. [32, 33, 34] and Ninio et al. [35] proposed the so-called cube method, where the models of macromolecules in solution are surrounded by the solvent (or by the matrix where the particles are immersed), and the cube method permits a correct calculation of the volume inaccessible to the solvent. The theoretical intensity is given by

where _{0} [32, 33]. The calculation of the scattering amplitude of a protein macromolecule with known atomic coordinates can be done with the equation,

where

The idea is to put the macromolecule coordinates in a cubic grid composed of small cubes with edges of 0.5–1.5 Å. The calculated intensity depends on a specific *q*, so the average scattering intensity is given by [32, 33]

In 2011, Virtanen and collaborators presented an adaptation of the cube method [37, 38], a procedure known as HyPred. These authors drew on the cube method to simulate scattering intensities and on molecular dynamics (MD) simulations to determine the hydration layer of a protein. In this procedure, the nonuniform solvent density around a protein is calculated with atomic resolution [38]. With this information one can calculate both small- and wide-angle X-ray scattering (SAXS/WAXS) intensities. In 2014, Nguyen and collaborators presented another adaptation of the cube method [39], using the reference interaction site model (RISM) theory. In this application the cube method is used to calculate the contribution of the solvent to the scattering amplitude, and in 2016 Nguyen and collaborators [40] proposed a procedure to extract information about water and ion distributions from the analysis of SAXS experiments. This method computes the solvent distribution around the solute, which allows the calculation of SAXS/WAXS intensities with less computational time than MD [39, 40].

One example [39] of these applications, using RISM-SAXS and HyPred, is shown in Figure 2 for lysozyme; both approaches show good agreement with the experimental data. Results from other programs are also shown for comparison. CRYSOL obtained a good fit with the experiment up to 1.5 Å^{−1}. The program CRYSOL [41] has been used as the standard program for such calculations and uses the multipole expansion to calculate scattering intensities; this approach is discussed in Section 3.4. The web server FoXS is based on the Debye formula, which is discussed in the next section. The web server AXES calculates the scattering amplitudes of the surface of solvent using a sum of the six elementary scattering functions averaged [42]. The web server AquaSAXS [43] computes the SAXS/WAXS profile of a given structure; a PDB or PQR file is necessary to perform the calculation.

The HyPred method is very useful for the determination of excluded volumes and contrasts. However, it requires the numerical calculation of the intensities, and if the cubes of the grid are very small, the computational time for the calculation of the intensity is very long. Approaches using spherical harmonics proved to be more efficient and precise for the calculation of scattering intensities in usual investigations of macromolecules in solution [36, 41].

### 3.3. Sphere method and Debye formula

Considering a system composed of *n* identical, randomly oriented scatterers, it is possible to rewrite Eq. (10) as

and calculating the average over all possible orientations,

it is possible to obtain the Debye equation [12, 44],

where

The Debye equation is very useful because it is possible to compose the volume of the particle by a sum of small spherical volumes. This modeling method, also known as finite element (FE) method, allows the description of the particle shape by the use of small subunits.

The main advantage of this method is that one can easily model very complex objects. However, it has the disadvantage that the computational time is proportional to $n^{2}$, where $n$ is the number of small objects used in the model. The subunit size defines the precision of the method: the maximum *q* value that can be calculated without the influence of the subunits’ form factor is limited by the subunit size ($R_{s}$ is the radius of the spherical subunits) [14]. Therefore the precision of the method increases with the number of subunits used to represent the particle.
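The double sum and its $n^{2}$ cost can be made concrete with a short sketch. It uses the standard form of the Debye formula for point-like scatterers, $I(q) = \sum_i \sum_j f_i f_j \sin(q r_{ij})/(q r_{ij})$, and omits the form factor of the subunits. The function name and the optional `amplitudes` argument are assumptions made for illustration, not the notation of Eq. (25).

```python
import numpy as np

def debye_intensity(q, positions, amplitudes=None):
    """Orientationally averaged intensity of n point scatterers (direct double sum)."""
    q = np.atleast_1d(np.asarray(q, dtype=float))
    r = np.asarray(positions, dtype=float)
    n = r.shape[0]
    f = np.ones(n) if amplitudes is None else np.asarray(amplitudes, dtype=float)
    # n x n matrix of pair distances: this is the O(n^2) part
    d = np.linalg.norm(r[:, None, :] - r[None, :, :], axis=-1)
    ff = f[:, None] * f[None, :]
    intensity = np.empty_like(q)
    for k, qk in enumerate(q):
        # np.sinc(x) = sin(pi x)/(pi x), so the argument is divided by pi
        intensity[k] = np.sum(ff * np.sinc(qk * d / np.pi))
    return intensity
```

For two unit scatterers separated by a distance of 2, the result is $2 + 2\sin(2q)/(2q)$, which gives 4 at $q = 0$. The memory and time needed for the distance matrix grow quadratically with the number of subunits, which is the limitation discussed above.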

Oliveira and collaborators [45, 46] used this kind of procedure in the first analysis of nanocage structures by scattering techniques. The authors were interested in what influences the stability and yield of DNA octahedral nanocages built in solution from double and single DNA strands [45]. To build the model and compare it with experimental data, double-helix DNA models were positioned along the edges of an octahedron, which was truncated by single DNA strands acting as linkers between the helices. Altogether, there are 12 double-stranded B-DNA helices with 18 base pairs each (positioned on the edges) and 24 single strands (forming the truncations). The stability and yield of the nanocages were tested by varying the length of the single strands, with three, four, five, six, or seven nucleotides (forming the linker).

The SAXS models were built using beads to represent the DNA in the edges and in the linkers of the cages. These DNA models are rigid, and each bead is a sphere representing one nucleotide, placed at the position of the C2* atom (PDB format [47]). The scattering intensities were simulated using the Debye equation, Eq. (25). The results are shown in Figure 3, where the simulated intensity was fitted to the experimental SAXS data for each of the five kinds of nanocage. From this analysis it was possible to obtain the relation between the cage size and the linker size and also to detect the presence of higher-order agglomerates (dimers and trimers of cages).

Even with the increased performance of new computer processors, the use of the Debye equation is limited to a few dozen subunits, since it involves a double sum. The next sections present some procedures that speed up the calculation and decrease its computational cost.

### 3.4. Spherical harmonics and multipole expansion

In the late 1960s, Harrison [48] and later Stuhrmann and Svergun [36, 49] proposed an alternative procedure to compute scattering intensities for particles. The main idea is to express the scattering length distribution function distribution

The envelope function

where

The scattering amplitude

The spatial resolution of the shape representation (Eq. (27)) is defined by the truncation value L. Thus the particle shape is parameterized by *L* [14, 36, 50].

The shape scattering intensity is expressed as

where the partial amplitudes

The use of spherical harmonics permits the description of low-resolution shapes with a relatively low number of parameters, and it was the first approach capable of obtaining the particle shape directly from the scattering intensity, without any a priori information. It is the first of the so-called ab initio modeling methods for SAS data analysis. This method was implemented in the program SASHA [50], which provides the angular envelope function that gives the best fit to the scattering data. It is a good option for determining low-resolution structures without internal cavities and without sharp edges or corners, that is, it is limited to smooth shapes.

One example of application is shown in Figure 4. In this work, Arndt and collaborators investigated extracellular proteins [51]. By using SAXS, in particular the spherical harmonics approach (program SASHA), it was possible to obtain low-resolution models for the extracellular protein of *Biomphalaria glabrata* at pH 7 and pH 5.

The description of particle shape using the envelope function

In the late 1990s, Svergun’s group proposed a set of tools combining the use of spherical harmonics to calculate scattering amplitudes with a variation of the Debye equation. In the program DAMMIN [52], a search space filled with spherical beads is created, and a heuristic optimization based on a Monte Carlo approach (simulated annealing, SA [53]) selects the subset of these spheres that provides the best fit to the scattering data. The expression used for the calculation of the scattering intensity is [52]

The use of spherical harmonics speeds up the calculation, overcoming the main drawback of the original Debye equation (Eq. (25)), namely its computational cost.

Other ab initio methods using the dummy atom approach were proposed by many other authors, using optimized implementations of the Debye equation (see the next section). Chacon [54, 55] proposed ab initio methods using genetic algorithm procedures for the model optimization. Doniach and collaborators [56] proposed a modified procedure that replaces the genetic algorithm with the so-called “Give‘n’Take” algorithm. Due to its functionality and special features (inclusion of symmetry constraints, multiple curve fitting, etc.), the program DAMMIN is the most used and cited in the literature.

Further developments of ab initio methods for the study of macromolecules in solution took advantage of the atomic-resolution information known for proteins, available in the Protein Data Bank (https://www.wwpdb.org/) [47], and compose the ATSAS program suite [17, 41, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69]. Several good reviews of this subject can be found in the literature [14, 17, 36, 50, 60, 65, 67, 68, 69].

### 3.5. Optimized Debye equation

The Debye equation assumes that the subunits are identical and that the arrangement of the subunits defines the particle shape. As mentioned before, the double sum involved in the calculation limits the number of subunits, since the computational time scales as $O(n^{2})$. In order to decrease the computational time, Glatter proposed the use of histograms of distances inside the particle [70]. With this procedure the double sum, Eq. (25), is converted into a single sum [70, 71],

by the use of the histogram $h(r_k)$ of the distances between the subunits that compose the model. In this new equation, the optimized Debye equation, the construction of the histogram still involves a double sum, but it is performed only once. All further calculations are done as a single sum over the histogram bins. If the subunits in the model are randomly distributed, the intensity calculation can be optimized further by dividing the histogram into blocks. The computational cost then decreases to $O(n/num_{blocks})$, where $num_{blocks}$ is the number of blocks [8, 72, 73].
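The histogram idea can be sketched as follows for identical point-like subunits with unit scattering amplitude, for which the double sum reduces to $I(q) = n + 2\sum_k h(r_k)\,\sin(q r_k)/(q r_k)$. This is a minimal illustration under those assumptions, not the implementation of refs. [70, 71]; the function names and the fixed `bin_width` parameter are hypothetical, and the block subdivision mentioned above is not included.

```python
import numpy as np

def pair_distance_histogram(positions, bin_width):
    """Histogram h(r_k) of the distances between all distinct pairs (built once)."""
    r = np.asarray(positions, dtype=float)
    i, j = np.triu_indices(r.shape[0], k=1)  # each pair counted once
    d = np.linalg.norm(r[i] - r[j], axis=1)
    n_bins = int(np.ceil(d.max() / bin_width)) + 1
    counts, edges = np.histogram(d, bins=n_bins, range=(0.0, n_bins * bin_width))
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, counts

def debye_from_histogram(q, n, centers, counts):
    """Single sum over histogram bins; the n self-terms each contribute 1."""
    q = np.atleast_1d(np.asarray(q, dtype=float))
    kernel = np.sinc(np.outer(q, centers) / np.pi)  # sin(q r)/(q r)
    return n + 2.0 * kernel @ counts
```

The histogram is computed once and then reused for every *q* value, so the cost per *q* depends on the number of bins rather than on the number of pairs. The bin width controls the trade-off between speed and accuracy: each distance is replaced by its bin center, so the error grows with $q \times$ `bin_width`.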

Figure 5A compares the intensities given by the analytical equations (Table 1) with those of the same models built with the FE method and calculated with the optimized Debye equation. The very good agreement between the analytical intensities and the intensities calculated with the optimized Debye equation demonstrates the precision of the method and supports its use for calculating complex shapes.

In Figure 5B two models were built using the FE method. The red (internal) model is a DNA cage. The DNA molecule was modeled using a coarse-grained approach, so each nucleotide corresponds to one spherical subunit placed at the position of the C2* atom. The blue model is an icosahedral shell-like structure. The composed model is an icosahedral shell with a DNA cage in its interior. The calculation of the scattering intensity was performed using the optimized Debye equation in a slightly different implementation in order to include different contrasts and demonstrate the potential use of this approach. The figure in the original article shows that the histogram approach also permits an easy computation of affine polydispersities [8, 73].

### 3.6. Fast Fourier transformation

Schmidt-Rohr [74] proposed a direct method, based on the Fourier transform (FT) of a three-dimensional model, to calculate the scattering intensity. Basically, the three-dimensional model is defined, a priori, on a discrete cubic lattice of dimension $Na$, with $N^{3}$ points spaced by a value $a$ and with a scattering density

To obtain the intensity for a subunit, one can use the equation below,

where *m* is the number of dimensions. The orientation averaging is performed in the final stage, where the intensities corresponding to each small (discrete) subunit of the lattice are summed (using a procedure developed by the author called “channel sharing”), followed by a normalization by $q^{2}$. This procedure has a low computational cost of $O(N \ln N)$, and, according to this author, the FFT could also be applied to obtain two-dimensional diffraction patterns [74, 75].

For a system of identical particles, the total intensity is represented by the convolution of the spatial points of the particles’ center of mass

Schmidt-Rohr and Chen [74, 76] applied this method to quantitatively simulate small-angle scattering data of hydrated Nafion and were able to explain the “ionomer peak” observed in SAXS patterns, relating it to randomly packed water channels inside cylindrical inverted micelles. These results demonstrated the good transport properties of hydrated Nafion and provided details about its internal structure, such as the diameters of the water channels, cluster sizes, the shape of the channels, and crystallinity levels [76].

An advantage of this method is its computational cost, $O(N \log N)$. On the other hand, it does not give good results for systems where the SAS features are of the order of the system size. For this kind of system, the Monte Carlo distribution function method (MC-DFM) gives better results [75] (the MC-DFM uses the optimized Debye equation, Eq. (33), to simulate scattering intensities). Olds and collaborators [75] compared the efficiency of these two methods and suggest that the FFT method is more efficient for dense systems and complex densely packed particle systems, such as high-density polydisperse hard-sphere models. For dense arrays of monodisperse spheres, the FFT method can be at least three times more efficient. However, for low-density systems such as extended polymers and dilute systems, the FFT is inefficient, and it is also less useful for systems where it is possible to use the diffuse character of the model and the atomic coordinates. So, for large model particle systems, dilute particle arrays, polymers, and proteins, the MC-DFM can be a more efficient procedure [75].
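The core of an FFT-based calculation can be sketched with NumPy: the scattering density is sampled on a regular lattice, the amplitude is obtained with a discrete Fourier transform, and the intensity is its squared modulus. The orientational average below is a plain average over shells of constant $|q|$; it is a stand-in chosen for simplicity and is not the “channel sharing” procedure of ref. [74]. All function names are hypothetical.

```python
import numpy as np

def lattice_intensity(density, spacing):
    """Intensity |FT(rho)|^2 of a density sampled on a regular lattice."""
    rho = np.asarray(density, dtype=float)
    # Multiply by the cell volume so the amplitude approximates the integral
    amplitude = np.fft.fftn(rho) * spacing**rho.ndim
    intensity = np.abs(amplitude) ** 2
    # Scattering vector components associated with each FFT index
    axes_q = [2.0 * np.pi * np.fft.fftfreq(n, d=spacing) for n in rho.shape]
    grids = np.meshgrid(*axes_q, indexing="ij")
    q_mag = np.sqrt(sum(g**2 for g in grids))
    return q_mag, intensity

def radial_average(q_mag, intensity, n_bins):
    """Average the intensity over shells of constant |q| (simple binning)."""
    q_flat, i_flat = q_mag.ravel(), intensity.ravel()
    edges = np.linspace(0.0, q_flat.max(), n_bins + 1)
    which = np.clip(np.digitize(q_flat, edges) - 1, 0, n_bins - 1)
    total = np.bincount(which, weights=i_flat, minlength=n_bins)
    count = np.bincount(which, minlength=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    valid = count > 0
    return centers[valid], total[valid] / count[valid]
```

The accessible *q* range is fixed by the lattice: the spacing sets the largest *q*, and the total box size sets the smallest nonzero *q*, which is one way to see why features comparable to the system size are poorly resolved.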

### 3.7. Optimized Debye equation for oriented particles

The FE method can also be used to calculate the scattering intensities for oriented systems or particles. The calculation can be performed by the use of Eq. (9), but the practical application of this formula is limited since it involves a double sum and vectorial arguments, which makes the computational cost very high. Since the particles are oriented, the scattering intensity is anisotropic, and therefore it is necessary to compute the two-dimensional scattering pattern. A possible approach was proposed in the seminal book by Guinier and Fournet [12] and consists of applying the equation below

to calculate the scattered intensities in a specific direction. However, this equation can only be used for centrosymmetric particles, which largely limits its application.

Sjöberg [77] proposed an approach to investigate the effects of interparticle correlations. In this approach the particles or molecules have known form factor, and the correlations can be obtained by the use of single sums, as shown below,

One of the main difficulties in simulating anisotropic two-dimensional scattering patterns is the computational time required to perform the calculation. For example, to produce a scattering image with a side of m = 512 pixels (a total of 262,144 pixels), the calculation must be performed for each pixel (each pixel defines a specific scattering vector), and each calculation involves the double sum over the *n* scatterers. The calculation is impractical even for small models (low number of scatterers), even on present-day computers. These were the main difficulties reported by McAlister and Grady in their first approach to this problem [25, 26].

In order to overcome these limitations, Alves and collaborators [15] recently proposed an innovative procedure to solve this problem. Inspired by the histogram approach used in the optimized Debye equation Eq. (33), it is possible to convert the double sum in Eq. (9) into a single sum over the bins of the histogram. This new equation

permits the fast calculation of the scattering intensity in a given

The values for the dot product of

The precision of the method is demonstrated by the use of known analytical equations for simple shapes, as the ones shown in Table 1. Several examples demonstrating the precision of the method are described in the original article [15]. This new method opens a large number of possibilities for the calculation of scattering intensity for oriented particles.
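The general idea of replacing the pair double sum by a sum over histogram bins can be illustrated for a fixed direction of the scattering vector. For identical point scatterers, $I(\mathbf{q}) = n + 2\sum_{i<j}\cos[\mathbf{q}\cdot(\mathbf{r}_i - \mathbf{r}_j)]$, so only the projections of the pair separations onto the direction of $\mathbf{q}$ matter, and these can be binned once per direction. The sketch below rests on that assumption and on hypothetical function names; it is not the implementation of ref. [15], whose details are given in the original article.

```python
import numpy as np

def projected_histogram(positions, direction, bin_width):
    """Histogram of pair separations projected onto a fixed unit direction."""
    r = np.asarray(positions, dtype=float)
    u = np.asarray(direction, dtype=float)
    u = u / np.linalg.norm(u)
    p = r @ u  # projection of each scatterer on the direction of q
    i, j = np.triu_indices(p.size, k=1)
    d = np.abs(p[i] - p[j])  # cosine is even, so the sign is irrelevant
    n_bins = int(np.ceil(d.max() / bin_width)) + 1
    counts, edges = np.histogram(d, bins=n_bins, range=(0.0, n_bins * bin_width))
    return 0.5 * (edges[:-1] + edges[1:]), counts

def oriented_intensity(q, n, centers, counts):
    """Intensity along the chosen direction for several |q| values at once."""
    q = np.atleast_1d(np.asarray(q, dtype=float))
    return n + 2.0 * np.cos(np.outer(q, centers)) @ counts
```

A two-dimensional pattern would be assembled by repeating the calculation for each azimuthal direction on the detector, reusing each histogram for all *q* magnitudes along that direction.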

Recent X-ray free-electron laser (FEL) sources can produce intense ultrashort (femtosecond) pulses in nanometer-sized coherent beams, irradiating particles in solution. Due to the special properties of these experiments, it is possible to irradiate single particles. Since the pulse durations are shorter than the characteristic rotational diffusion time of the particle, the obtained scattering intensity corresponds to a particle oriented in a given direction. Therefore, if the system is composed of identical particles, multiple scattering images correspond to the scattering intensities from multiple orientations of the particles.

Several authors have proposed methods to describe the coherent scattering pattern and recover the three-dimensional structure of the scattering particle, based on the method proposed by Kam [78, 79, 80]. The proposed method for calculating oriented scattering intensities can potentially be used to describe data from FEL experiments. To demonstrate this potential, Figure 6 presents the two-dimensional scattering pattern simulated for the protein lysozyme in several orientations. This simulation method can also describe models with variable scattering length contrasts and interparticle interactions (structure factor). Several examples can be found in the original article [15].

## 4. General conclusions and perspectives

This chapter presented a general overview of several procedures for calculating scattering intensities for systems of particles. After a brief description of the general theoretical aspects, several methods for the calculation of scattering intensities were shown, with some typical applications. The main points and limitations of each procedure were discussed.

The analytical calculation of the scattering intensity is restricted to particles with simple geometries. More complicated shapes require the use of simulation methods. The Debye equation provided a first step in this direction through the use of spherical subunits to build the particle (finite element, FE, description) and calculate the orientationally averaged scattering intensity. A variation that uses cubic subunits gives the so-called cube method. This approach permits a better calculation of excluded volumes but requires numerical averaging to account for the random orientation of the particles. The original Debye equation involves a double sum, which is very inefficient (high computational cost) and cannot be applied to a large number of subunits. Optimized forms of the Debye equation were proposed based on histograms of pair distances, which turn the double sum over the number of subunits into a single sum over the histogram bins. In this way, the method can be used for the fast calculation of scattering intensities and in modeling methods. Another modeling method is the use of spherical harmonics for the calculation of the scattering intensity. With the introduction of the envelope function to describe the particle shape, this method proved to be very powerful for the description of proteins in solution and of hydration layers. In the last decades, this approach and its development, combined with ab initio methods, promoted a revolution in the use of scattering data for the investigation and modeling of macromolecules in solution. Fast Fourier transformation methods have recently been applied to calculate the scattering patterns of known shapes, with very interesting applications. Also, based on the FE method, one can use a special development of the optimized Debye equation to compute scattering intensities for oriented particles. This innovative approach permits the fast calculation of 2D scattering patterns and provides new perspectives for the use and analysis of the small-angle scattering method.

## Acknowledgments

CLPO is supported by FAPESP, CNPQ, and INCT-FCx. We acknowledge the Crystallography Journals Online (https://journals.iucr.org/), AIP Publishing LLC, American Chemical Society, and John Wiley & Sons, Inc. for permitting the reproduction of the figures used.

## Conflict of interest

The authors declare that they do not have any “conflict of interest.”