QUAD package¶

Submodules¶

QUAD.dram module¶

Quantitative Uncertainty Analysis for Diffraction (QUAD)¶

Copyright (c) 2018, North Carolina State University All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. 3. The names “North Carolina State University”, “NCSU” and any trade‐name, personal name, trademark, trade device, service mark, symbol, image, icon, or any abbreviation, contraction or simulation thereof owned by North Carolina State University must not be used to endorse or promote products derived from this software without prior written permission. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS “AS IS” AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ###############################################################################

QUAD.dram.adapt_covariance(i, adapt, s_p, all_Z, epsilon, q, varS1)[source]¶

Adapt the covariance matrix at the given adaption interval.

Args:

i (int): Iteration number.
adapt (int): Adaption interval, user-defined. Default is 20.
s_p (float): Scaling parameter for adapting the covariance. Default is \(\frac{2.4^2}{q}\) where q is the size of the parameter space.
all_Z (ndarray): Storage of z-space samples for each iteration - size(iters, q)
epsilon (float): Constant to prevent singularity of adaptive covariance. Default is 0.0001.
q (int): Number of parameters.
varS1 (ndarray): Current covariance matrix - size(q, q).

Returns:

varS1 (ndarray): Adapted covariance matrix - size(q, q).

QUAD.dram.calculate_bsplinebasis(x, L)[source]¶

Calculate a B-spline basis for the 2-theta values.

Args:

x (ndarray): Vector of 2-theta values - size (n,).
L (int): Number of cubic B-spline basis functions to model the background intensity. Default is 20.

Returns:

B (ndarray): B-spline basis for data - size (n, L)

QUAD.dram.define_lattice(paramList, Calc)[source]¶

Convert the A values refined in GSAS-II to unit cell parameters (a, b, c, alpha, beta, gamma). A = [G11, G22, G33, 2*G12, 2*G13, 2*G23] where the G elements make up the reciprocal metric tensor. Add unit cell parameters to the paramList to be refined in QUAD and add the respective starting values to the starting parameter values list (start). Capabale of handling crystal systems of cubic, tetragonal, hexagonal, rhombohedral, trigonal, orthorhombic, monoclinic, and triclinic and can refine lattice parameters of multiple phases.

Args:

paramList (list): List of parameter names for refinement - size (q).
start (list): List of initial parameter values in parameter space - size (q).
Calc (Calculator): calculator operator that interacts with the designated GPX file by referencing GSAS-II libraries.

Returns:

2-tuple. Tuple entries are

paramList (list): Updated list of parameter names for refinement - size (q).
start (ndarray): Updated list of initial parameter values in parameter space - size (q).

QUAD.dram.diffraction_file_data(x, y, Calc)[source]¶

Extract the intensity values (y) and 2-theta angles (x) from the underlying GPX file or set the user-defined data values.

Args:

x (ndarray): Predefined vector of 2-theta values, if it exists.
y (ndarray): Predefined vector of diffraction pattern intensities, if it exists.
Calc (Calculator): calculator operator that interacts with the designated GPX file by referencing GSAS-II libraries.

Returns:

2-tuple containing the diffraction data. Tuple entries are

x (ndarray): Vector of 2-theta values - size (n,).
y (ndarray): Vector of diffraction pattern intensities - size (n,).

QUAD.dram.estimatecovariance(paramList, start, Calc, upper, lower, x=None, y=None, L=20, delta=0.001)[source]¶

Estimate the covariance of the initial parameter values to initialize the proposal covarianc for DRAM. This is done by first calculating the sensitivity matrix, \(\chi\), through finite differences. Next, the Fisher Information matrix is calulated as \(\chi^T\chi\). Finally, the estimated covariance matrix is calculated as \(s^2*\chi^T\chi\), where s is the standard deviation estimate.

Args:

paramList (list): List of parameter names for refinement - size (q).
start (list): List of initial parameter values in parameter space - size (q).
Calc (Calculator): calculator operator that interacts with the designated GPX file by referencing GSAS-II libraries.
upper (ndarray): Vector of upper limits on a uniform prior distribution in the parameter space - size (q,).
lower (ndarray): Vector of lower limits on a uniform prior distribution in the parameter space - size (q,).
x (ndarray): Vector of 2-theta values - size (n,). Will be created from the GPX file if not user-defined.
y (ndarray): Vector of diffraction pattern intensities - size (n,). Created from the GPX file if not user-defined.
L (int): Number of cubic B-spline basis functions to model the background intensity. Default is 20.
delta (float): Fraction by which the parameters are perturbed with respect to the initial values for finite difference calculation.

Returns:

(dict): Dictionary containing the estimated covariance matrix, estimated variance, and the eigenvalues and eigenvectors of the covariance matrix. The eigenvalues can be examined to determine potentially non-indentifiable parameters by locating the eigenvalues significantly close to zero in value.

QUAD.dram.initialize_cov(initCov, q)[source]¶

If not previously defined, initialize the covariance for the proposal distribution.

Args:

initCov (ndarray): Pre-defined initial covariance matrix in z-space - size (q, q).
q (int): Number of parameters.

Returns:

varS1 (ndarray): Covariance matrix in z-space - size (q, q).

QUAD.dram.initialize_intensity_weight(x, y, scaling_factor=1)[source]¶

Define a heteroscedastic vector to scale the residuals between the model and the data with the intensity of the smoothed data. The smoothed intensity data, y_sm, is obtained from smooth_ydata().

Args:

y (ndarray): Vector of diffraction pattern intensities - size (n,).
x (ndarray): Vector of 2-theta values from diffraction pattern- size (n,).
scaling factor (float): Contribution of smoothed intensity data. Default is 1.

Returns:

var_scale (ndarray): Vector of scaling factors corresponding to intensity data - size (n,).

QUAD.dram.log_post(y, x, BG, Calc, paramList, z, lower, upper, scale, tau_y, m0, sd0)[source]¶

Calculate the log of the posterior probabilities for a set of parameters. The posterior is defined as a log-likelihood of the product of normal distributions multiplied by the prior. For definition of the prior, see prior_loglike().

Args:

y (ndarray): Vector of diffraction pattern intensities - size (n,).
x (ndarray): Vector of 2-theta values - size (n,).
BG (ndarray): Vector of background intensity values - size (n,).
Calc (Calculator): calculator operator that interacts with the designated GPX file by referencing GSAS-II libraries.
paramList (list): List of parameter names for refinement - size (q,).
z (ndarray): Current parameter values in z-space - size (q,).
lower (ndarray): Vector of lower limits on a uniform prior distribution in the parameter space - size (q,).
upper (ndarray): Vector of upper limits on a uniform prior distribution in the parameter space - size (q,).
scale (ndarray): Vector that scales with the intensity of data, heteroscedastic. See function initialize_intensity_weight()
tau_y (float): Model precision. Default initial valus is 1.
m0 (float): Mean of prior normal distribution on z. Default is 0.
sd0 (float): Standard deviation of prior normal distribution on z. Default is 1.

Returns:

posterior (float): Value of the prior times likelihood given current z-space candidate values.

QUAD.dram.prior_loglike(par, m0, sd0)[source]¶

Calculate the log likelihood of the parameters in z-space based on the prior distributions. The prior is a uniform distribution in parameter space which corresponds to the product of normal distributions for each value in z-space.

Args:

par (ndarray): Array of parameter values in z-space - size (q,).
m0 (float): Mean of prior normal distribution on z. Default is 0.
sd0 (float): Standard deviation of prior normal distribution on z. Default is 1.

Returns:

prior (float): Value of prior distribution given current z-values.

QUAD.dram.run_summary(results, start, paramList, path)[source]¶

Print information to the console about the DRAM run after the sampling is completed.

Args:

results (dict): Array of parameter values in z-space - size (q,).
start (list): List of initial parameter values in parameter space - size (q).
paramList (list): List of parameter names for refinement - size (q).
path (str): File path to folder where results and final plots from run will be saved.

QUAD.dram.sample(GPXfile, paramList, variables, start, lower, upper, path, initCov=None, y=None, x=None, L=20, shrinkage=0.2, s_p=5.76, epsilon=0.0001, m0=0, sd0=1, c_y=0.1, d_y=0.1, c_g=0.1, d_g=0.1, c_b=0.1, d_b=0.1, adapt=20, thin=1, iters=5000, burn=2000, update=500, plot=True, fix=False)[source]¶

Args:

GPXfile (str): Filepath for the GPX file underlying the current data
paramList (list): (q) list of GSASII parameter names in the same order as the upper and lower limits being provided
variables (ndarray): (q,) vector of parameter names that matches ‘paramList’
start (list): List of initial parameter values in parameter space - size (q)
lower (ndarray): (q,) vector of lower bounds for the parameter values
upper (ndarray): (q,) vector of upper bounds for the parameter values
path (str): File path to folder where results and final plots from run will be saved. Folder must be created before calling.

Kwargs:

initCov (ndarray) - None: (q, q) matrix to be used as the covariance matrix for the proposal distribution, default value is None. Covariance matrix can be specified with the estimate covariance function. If there is no matrix specified, the function will use a diagonal matrix with 0.05 on the diagonal.
y (ndarray) - None: (n,) vector of intensities. If no values are specified, the function uses the values from the provided GPX file
x (ndarray) - None: (n,) vector of angles (2*theta). If no values are specified, the function uses the values from the provided GPX file
L (float) - 20: number of B-spline basis functions to model the background intensity.
shrinkage (float) - 0.2: Governs covariance change between proposal stages, default is 0.2
s_p (float): \(\frac{2.4^2}{q}\) Scaling parameter for the adaptive covariance, default is set to \(\frac{2.4^2}{q}\) as in Gelman (1995), where q is the dimension of the parameter space
epsilon (float) - 0.0001: Ridge constant to prevent singularity of the adaptive covariance.
m0 (float) - 0: Governs the mean value on the latent z-space of the parameters.
sd0 (float) - 1: Governs the standard deviation on the latent z-space of the parameters.
c_y (float) - 0.1: Shape parameter for Gamma distribution of the error variance.
d_y (float) - 0.1: Scale parameter for Gamma distribution of the error variance.
c_g (float) - 0.1: Shape parameter for Gamma distribution for the error in the prior distribution for the basis function loadings.
d_g (float) - 0.1: Scale parameter for Gamma distribution for the error in the prior distribution for the basis function loadings.
c_b (float) - 0.1: Shape parameter for Gamma distribution for scale of the proportional constribution to the error variance.
d_b (float) - 0.1: Scale parameter for Gamma distribution for scale of the proportional constribution to the error variance.
adapt (float): 20: Controls the adaptation period.
thin (float) - 1: Degree of thinning.
iters (float) - 5000: Number of total iterations to run.
burn (float) - 2000: Number of samples to consider as burn-in.
update (float) - 500: Period between updates printed to the console.
plot (bool) - True: Indicator for whether or not to create trace plots as the sampler progresses.

Returns:

dictionary containing the posterior samples for the parameters and the model timing, dict entries are

param_samples (ndarray): Matrix of posterior samples for the mean process parameters of interest - (nSamples, q)
number_samples (int): Number of samples kept. Equivalent to (iters - burn) if thin=1.
final_covariance (ndarray): Final adapated covariance matrix - size(q, q).
model_variance (ndarray): Vector of posterior samples for the overall model variance - size(nSamples,)
gamma_samples (ndarray): Matrix of posterior samples for the basis function loadings modeling the background intensity - (nSamples, L)
run_time (float): Number of minutes the sampler took to complete.
stage1_accept (ndarray): Acceptance rate of stage 1 DRAM - size(n_keep//update,).
stage2_accept (ndarray): Acceptance rate of stage 2 DRAM - size(n_keep//update).

QUAD.dram.save_results(results, start, lower, upper, paramList, init_cov, path)[source]¶

Save DRAM results to individual files in selected file path. Also save the input values of prior bounds, parameter starting values, and initial proposal covariance. A text file of a list of the refined parameter names is saved to the indicated file path.

Args:

results (dict): Array of parameter values in z-space - size (q,).
start (list): List of initial parameter values in parameter space - size (q).
lower (ndarray): Vector of lower limits on a uniform prior distribution in the parameter space - size (q,).
upper (ndarray): Vector of upper limits on a uniform prior distribution in the parameter space - size (q,).
paramList (list): List of parameter names for refinement - size (q).
initCov (ndarray): (q, q) matrix used as the initial covariance matrix for the proposal distribution. Covariance matrix can be specified with the estimate covariance function.
path (str): File path to folder where results and final plots from run will be saved.

QUAD.dram.smooth_ydata(x, y)[source]¶

Smooth diffraction data intensities at 2-theta values with Locally Weighted Scatterplot Smoothing (lowess) function from statsmodels.

Args:

y (ndarray): Vector of diffraction pattern intensities - size (n,).
x (ndarray): Vector of 2-theta values from diffraction pattern- size (n,).

Returns:

y_sm (ndarray): Vector of smoothed intensity data - size (n,).

QUAD.dram.stage1_acceptprob(z, varS1, y, x, BG, Calc, paramList, lower, upper, var_scale, tau_y, m0, sd0)[source]¶

Calculate the acceptance probability for Stage 1 of the DRAM algorithm.

Args:

z (ndarray): Vector of current parameter values in z-space - size(q,).
varS1 (ndarray): Current covariance matrix - size(q, q).
y (ndarray): Vector of diffraction pattern intensities - size (n,).
x (ndarray): Vector of 2-theta values - size (n,).
BG (ndarray): Vector of background intensity values - size (n,).
Calc (Calculator): calculator operator that interacts with the designated GPX file by referencing GSAS-II libraries.
paramList (list): List of parameter names for refinement - size (q,).
lower (ndarray): Vector of lower limits on a uniform prior distribution in the parameter space - size (q,).
upper (ndarray): Vector of upper limits on a uniform prior distribution in the parameter space - size (q,).
var_scale (ndarray): Vector that scales with the intensity of data, heteroscedastic. See function initialize_intensity_weight()
tau_y (float): Model precision. Default initial valus is 1.
m0 (float): Mean of prior normal distribution on z. Default is 0.
sd0 (float): Standard deviation of prior normal distribution on z. Default is 1.

Returns:

4-tuple containing the stage 1 acceptance probability, candidates, and respective log posterior values. Tuple entries are

can_z1 (ndarray): Vector of candidate 1 parameter values in z-space - size(q,). The candidate is constructed from a multivariate normal distribution with mean z and covariance of varS1. \(can\_z1 = N_q(z,varS1)\).
can1_post (float): Value of the log posterior probability for candidate 1 parameter values, can_z1. Computed with log_post().
cur_post (float): Value of the log posterior probability for the current parameter values, z. Computed with log_post().
R1 (float): Log of Stage 1 acceptance probability.

QUAD.dram.stage2_acceptprob(can1_post, can2_post, cur_post, can_z1, can_z2, z, varS1)[source]¶

Calculate the acceptance probability for Stage 2 of the DRAM algorithm.

Args:

z (ndarray): Vector of current parameter values in z-space - size(q,).
cur_post (float): Value of the log posterior probability for the current parameter values.
varS1 (ndarray): Current covariance matrix - size(q, q).
can_z1 (ndarray): Vector of candidate 1 parameter values in z-space - size(q,). The candidate is constructed from a multivariate normal distribution with mean z and covariance of varS1. \(can\_z1 = N_q(z,varS1)\).
can_z2 (ndarray): Vector of candidate 2 parameter values in z-space - size(q,). The candidate is constructed from a multivariate normal distribution with mean z and covariance of shrinkage*varS1. \(can\_z2 = N_q(z,shrinkage*varS1)\)
can1_post (float): Value of the log posterior probability for candidate 1 parameter values from Stage 1. Computed with log_post().
can2_post (float): Value of the log posterior probability for candidate 2 parameter values from Stage 2. Computed with log_post().

Returns:

R2 (float): Log of Stage 2 acceptance probability.

QUAD.dram.start2z(start, lower, upper)[source]¶

Transform starting parameter values to latent z-space for calculations. Check the compatibility of the dimensions and values for the lower and upper prior bounds.

Args:

start (list): List of initial parameter values in parameter space - size (q).
upper (ndarray): Vector of upper limits on a uniform prior distribution in the parameter space - size (q,).
lower (ndarray): Vector of lower limits on a uniform prior distribution in the parameter space - size (q,).

Returns:

init_z (ndarray): Vector of initial parameter values in z-space - size (q,).

QUAD.dram.traceplots(plot, q, keep_params, curr_keep, paramList, n_keep, update, path)[source]¶: Produce traceplots of sampling at the update intervals. Plot in the console window and save final traceplot in the current folder.

QUAD.dram.update_background(B, var_scale, tau_y, tau_b, L, Calc, y)[source]¶

Update the basis function loadings and then background values.

Args:

B (ndarray): B-spline basis for 2-theta range - size (n, L). See calculate_bsplinebasis().
var_scale (ndarray): Vector of scaling factors corresponding to intensity data- size (n,). See function initialize_intensity_weight()
tau_y (float): Model precision. See updat_tauy()
tau_b (float): Loadings precision for background. See update_taub()
L (int): Number of cubic B-spline basis functions to model the background intensity. Default is 20.
Calc (Calculator): calculator operator that interacts with the designated GPX file by referencing GSAS-II libraries.
y (ndarray): Vector of diffraction pattern intensity data - size(nx1). See diffraction_file_data().

Returns:

2-tuple containing the updated information for the background fit of the data. Tuple entries are

gamma (ndarray): Vector of updated basis loadings - size (L,)
BG (ndarray): Vector of updated background intensity values - size (n,).

QUAD.dram.update_taub(d_g, gamma, c_g, L)[source]¶

Update the background model precision.

Args:

d_g (float): Scale parameter for Gamma distribution for the error in the prior distribution for the basis function loadings. Default is 0.1.
gamma (ndarray): Basis function loadings - size (L,).
c_g (float): Shape parameter for Gamma distribution for the error in the prior distribution for the basis function loadings. Default is 0.1.
L (int): Number of cubic B-spline basis functions to model the background intensity. Default is 20.

Returns:

tau_b (float): Loadings precision for background.

QUAD.dram.update_tauy(y, BG, Calc, var_scale, d_y, c_y, n)[source]¶

Update the model precision.

Args:

y (ndarray): Vector of diffraction pattern intensities - size (n,).
BG (ndarray): Vector of background intensity values - size (n,).
Calc (Calculator): calculator operator that interacts with the designated GPX file by referencing GSAS-II libraries.
var_scale (ndarray): Vector that scales with the intensity of data, heteroscedastic. See function initialize_intensity_weight()
d_y (float): Scale parameter for Gamma distribution of the error variance. Default is 0.1
c_y (float): Shape parameter for Gamma distribution of the error variance.
n (int): Number data points (equivalent to the length of intensities vector, y).

Returns:

tau_y (float): Updated model precision.

QUAD.dram.z2par(z, lower, upper, grad=False)[source]¶

Transform between the bounded parameter space and continuous z-space.

Args:

z (ndarray): Array of parameter values in z-space - size (q,).
lower (ndarray): Vector of lower limits on a uniform prior distribution in the parameter space - size (q,).
upper (ndarray): Vector of upper limits on a uniform prior distribution in the parameter space - size (q,).
grad (bool): If False (default), converts from the z-space to the parameter space.

Returns:

par (ndarray): Array of parameter values in parameter space from the given z values- size (q,).

QUAD.gsas_tools module¶

class QUAD.gsas_tools.Calculator(GPXfile=None)[source]¶

Bases: object

This script calls the neccessary functions for QUAD from the GSAS-II program which are regularly updated by the GSAS-II developers. (https://subversion.xray.aps.anl.gov/trac/pyGSAS)

How it works:

Variables are stored in various python dictionaries (parmDict, calcControls, Phases, and Histograms) for more information about dictionaries see http://www.tutorialspoint.com/python/python_dictionary.htm

Variables of interest that are stored in parmDict include: Background parameters: :0:Back:x (x represent the order) X-ray absorption coef: :0:Absorption X-ray wavelength: :0:Lam Intensity scale factor: :0:Scale 2-theta offset: :0:Zero Various peak parameters: :0:U, :0:V, :0:W, :0:L, :0:X, :0:SH/L

Variables of interest that are stored in calcControls include: Model for background: :0:bakType (default is chebyschev) available models are: chebyschev (default), cosine, lin interpolate, inv interpolate, and log interpolate

Phases contains information about the phases in the material (Space group, atom positions, Debye-Waller factors, preferred orientation, ….)

Change between isotropic and anisotropic Debye-Waller factors:: Phases[Phases.keys()[0]][‘Atoms’][X][9] = ‘I’ (isotropic) or ‘A’ (anisotropic) [X represents atom number (0 to n-1)]
Debye-Waller factors are controled by:: Phases[Phases.keys()[0]][‘Atoms’][X][10-16] (isotropic case only used element 10)

Histograms contains information about the diffraction data file (this should not be changed)

Calculate()[source]¶: Calculate the profile fit for the current parameter setup

#Load the parameters Histograms = self._Histograms varyList = self._varyList parmDict = self._parmDict Phases = self._Phases calcControls = self._calcControls pawleyLookup = self._pawleyLookup restraintDict = self._restraintDict rbIds = self._rbIds

UpdateLattice(parmVarDict, phase_index)[source]¶: Update the lattice parameters in the current model for all phases

UpdateParameters(parmVarDict=None)[source]¶: Update parameters in the current model

convert_lattice(paramList, params)[source]¶: Convert between A and unit cell parameters. A is defined in GSAS-II as A = [G11, G22, G33, 2*G12, 2*G13, 2*G23] with G as the reciprocal metric tensor elements.

QUAD package¶

Submodules¶

QUAD.dram module¶

Quantitative Uncertainty Analysis for Diffraction (QUAD)¶

QUAD.gsas_tools module¶

Module contents¶