Statistical innovations for estimating shape characteristics of biological macromolecules in solution using small-angle x-ray scattering data

Date

2016

Authors

Alsaker, Cody, author
Breidt, F. Jay, advisor
Estep, Don, committee member
Kokoszka, Piotr, committee member
Luger, Karolin, committee member

Abstract

Small-angle X-ray scattering (SAXS) is a technique that yields low-resolution images of biological macromolecules by exposing a solution containing the molecule to a powerful X-ray beam. The beam scatters when it interacts with the molecule. The intensity of the scattered beam is recorded on a detector plate at various scattering angles, and contains information on structural characteristics of the molecule in solution. In particular, the radius of gyration (Rg) for a molecule, which is a measure of the spread of its mass, can be estimated from the lowest scattering angles of SAXS data using a regression technique known as Guinier analysis. The analysis requires specification of a range or “window” of scattering angles over which the regression relationship holds. We have thus developed methodology and supporting asymptotic theory for selection of an optimal window, minimum mean square error estimation of the radius of gyration, and estimation of its variance. The theory and methodology are developed using a local polynomial model with autoregressive errors. Simulation studies confirm the quality of the asymptotic approximations and the superior performance of the proposed methodology relative to the accepted standard. We show that the algorithm is applicable to data acquired from proteins, nucleic acids and their complexes, and we demonstrate with examples that the algorithm improves the ability to test biological hypotheses.

The radius of gyration is a normalized second moment of the pairwise distance distribution p(r), which describes the relative frequency of inter-atomic distances in the structure of the molecule. By extending the theory to fourth moments, we show that a new parameter ψ can be calculated theoretically from p(r) and estimated from experimental SAXS data, using a method that extends Guinier's Rg estimation procedure. This new parameter yields an enhanced ability to use intensity data to distinguish between two molecules with different but similar Rg values. Analysis of existing structures in the Protein Data Bank (PDB) shows that the theoretical ψ values relate closely to the aspect ratio of a molecular structure. The combined values for Rg and ψ acquired from experimental data provide estimates for the dimensions and associated uncertainties for a standard geometric shape, representing the particle in solution. We have chosen the cylinder as the standard shape and show that a simple, automated procedure gives a cylindrical estimate of a particle of interest. The cylindrical estimate in turn yields a good first approximation to the maximum inter-atomic distance in a molecule, Dmax, an important parameter in shape reconstruction. As with estimation of Rg, estimation of ψ requires specification of a window of angles over which to conduct the higher-order Guinier analysis. We again employ a local polynomial model with autoregressive errors to derive methodology and supporting asymptotic theory for selection of an optimal window, minimum mean square error estimation of the aspect ratio, and estimation of its variance.

Recent advances in SAXS data collection and more comprehensive data comparisons have resulted in a great need for automated scripts that analyze SAXS data. Our procedures to estimate Rg and ψ can be automated easily and can thus be used for large suites of SAXS data under various experimental conditions, in an objective and reproducible manner. The new methods are applied to 357 SAXS intensity curves arising from a study on the wild-type nucleosome core particle and its mutants and their behavior under different experimental conditions. The resulting Rg² values constitute a dataset which is then analyzed to account for the complex dependence structure induced by the experimental protocols. The analysis yields powerful scientific inferences and insight into better design of SAXS experiments.

Finally, we consider a measurement error problem relevant to the estimation of the radius of gyration. In a SAXS experiment, it is standard to obtain intensity curves at different concentrations of the molecule in solution. Concentration-by-angle interactions may be present in such data, and analysis is complicated by the fact that actual concentration levels are unknown, but are measured with some error. We therefore propose a model and estimation procedure that allow estimation of true concentration ratios and concentration-by-angle interactions, without requiring any information about concentration other than that contained in the SAXS data.
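
For readers unfamiliar with Guinier analysis, the sketch below illustrates the basic regression the abstract refers to. It is not the method developed in this work: it uses plain ordinary least squares on a window whose size the caller supplies, whereas the thesis selects the window automatically and models autoregressive errors. It relies on the standard Guinier approximation, ln I(q) ≈ ln I(0) − (Rg²/3) q² at small scattering angles, where q denotes the magnitude of the scattering vector. The function name `guinier_fit`, the parameter `n_points`, and the synthetic values are all hypothetical and chosen only for illustration.

```python
import math


def guinier_fit(q, intensity, n_points):
    """Fit ln I(q) against q^2 by ordinary least squares over the first n_points angles.

    Returns (rg, i0). The window size n_points is an input here; this sketch
    makes no attempt to choose it optimally.
    """
    if n_points < 2 or n_points > len(q):
        raise ValueError("n_points must be between 2 and len(q)")

    x = [qi * qi for qi in q[:n_points]]               # regressor: q^2
    y = [math.log(ii) for ii in intensity[:n_points]]  # response: ln I(q)

    x_bar = sum(x) / n_points
    y_bar = sum(y) / n_points
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx                  # estimates -Rg^2 / 3
    intercept = y_bar - slope * x_bar  # estimates ln I(0)
    if slope >= 0:
        raise ValueError("non-negative slope: Guinier approximation does not apply")

    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Noise-free synthetic curve that follows the Guinier law exactly.
# These values are made up for illustration, not taken from the thesis.
true_rg, true_i0 = 20.0, 100.0
q = [0.005 + 0.001 * k for k in range(60)]
intensity = [true_i0 * math.exp(-(true_rg * qi) ** 2 / 3.0) for qi in q]

rg_hat, i0_hat = guinier_fit(q, intensity, n_points=30)
print(rg_hat, i0_hat)
```

On real data the estimate depends on the choice of `n_points`, because the linear relationship holds only at the lowest angles and the measurements are noisy. That sensitivity is the window-selection problem the abstract describes.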
