Diagnostics¶

class pysgmcmc.diagnostics.PYSGMCMCTrace(chain_id, samples, varnames=None)[source]¶
Adapter class to connect the worlds of pysgmcmc and pymc3. Represents a single chain/trace of samples obtained from an SGMCMC sampler.

classmethod from_sampler(chain_id, sampler, n_samples, keep_every=1, varnames=None)[source]¶
Instantiate a trace with id chain_id by extracting n_samples from sampler.
Parameters:  chain_id (int) – A numeric id that uniquely identifies this chain/trace.
 sampler (pysgmcmc.sampling.MCMCSampler subclass) – A sampler used to generate samples for this trace.
 n_samples (int) – Number of samples to extract from sampler for this chain/trace.
 keep_every (int) – Keep only every keep_every-th sample in each chain. Defaults to 1.
 varnames (List[String] or NoneType, optional) – List of names for each target parameter of the sampler. Defaults to None, which attempts to look the parameter names up from the target parameters of sampler.
Returns: trace – A wrapper around n_samples samples for variables with names varnames extracted from sampler. The (unique) chain id of this trace will be chain_id.
Return type: PYSGMCMCTrace
Examples
Below we show how to use this classmethod to obtain a PYSGMCMCTrace. Furthermore, we demonstrate that we can perform burn-in steps before passing the sampler, which allows all burn-in samples to be discarded before the actual trace is recorded.
We start by defining our problem. As cost function, we use the negative log likelihood of a mixture of Gaussians (gmm1).
>>> import tensorflow as tf
>>> from itertools import islice
>>> from pysgmcmc.samplers.relativistic_sghmc import RelativisticSGHMCSampler
>>> from pysgmcmc.diagnostics.objective_functions import gmm1_log_likelihood
>>> gmm1_negative_log_likelihood = lambda *args, **kwargs: -gmm1_log_likelihood(*args, **kwargs)
>>> session = tf.Session()
>>> params = [tf.Variable(0., dtype=tf.float32, name="p")]
Next, we set up our sampler and perform 100 steps of burn-in. The samples obtained during burn-in are discarded and not recorded as part of our PYSGMCMCTrace.
>>> n_burn_in_steps = 100
>>> sampler = RelativisticSGHMCSampler(params=params, cost_fun=gmm1_negative_log_likelihood, dtype=tf.float32, session=session)
>>> session.run(tf.global_variables_initializer())
>>> _ = list(islice(sampler, n_burn_in_steps))  # consume and discard burn-in samples
Finally, we extract a PYSGMCMCTrace from the (already burnt-in) sampler using PYSGMCMCTrace.from_sampler.
>>> n_samples = 1000
>>> varnames = ["p"]
>>> chain_id = 1234  # unique id
>>> trace = PYSGMCMCTrace.from_sampler(sampler=sampler, n_samples=n_samples, varnames=varnames, chain_id=chain_id)
>>> session.close()
>>> isinstance(trace, PYSGMCMCTrace), len(trace), trace.varnames
(True, 1000, ['p'])
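The keep_every thinning used by from_sampler can be sketched with itertools.islice over any iterator standing in for a sampler. This is a hedged sketch of the assumed semantics, not the library's actual implementation:

```python
from itertools import islice

def extract_samples(sampler, n_samples, keep_every=1):
    """Draw from `sampler`, keeping only every `keep_every`-th draw,
    until `n_samples` samples have been collected."""
    return list(islice(sampler, 0, n_samples * keep_every, keep_every))

# A simple counting generator stands in for an MCMC sampler.
dummy_sampler = iter(range(100))
samples = extract_samples(dummy_sampler, n_samples=5, keep_every=3)
print(samples)  # [0, 3, 6, 9, 12]
```

Because islice is lazy, wrapping it in list() is what actually advances the underlying sampler, mirroring the burn-in consumption shown above.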

get_values(varname, burn=0, thin=1)[source]¶
Get all sampled values in this trace for the variable with name varname.
Parameters:  varname (string) – Name of a given target parameter of the sampler. Usually, this corresponds to the name attribute of the tensorflow.Variable object for this target parameter.
 burn (int, optional) – Discard the first burn sampled values from this chain. Defaults to 0.
 thin (int, optional) – Only return every thin-th sampled value from this chain. Defaults to 1.
Returns: sampled_values – All values for variable varname that were sampled in this chain. Formatted as (N, D) numpy.ndarray where N is the number of sampler steps in this chain and D is the dimensionality of variable varname.
Return type: np.ndarray (N, D)
Examples
This method makes each variable in a trace accessible by its name:
>>> import tensorflow as tf
>>> graph = tf.Graph()
>>> params = [tf.Variable(0., name="x"), tf.Variable(0., name="y")]
>>> params[0].name, params[1].name
('x_1:0', 'y_1:0')
These names can be used to index the trace and obtain all sampled values for the corresponding target parameter:
>>> names = [variable.name for variable in params]
>>> dummy_samples = [[0., 0.], [0.2, 0.2], [0.3, 0.5], [0.1, 0.]]
>>> trace = PYSGMCMCTrace(chain_id=0, samples=dummy_samples, varnames=names)
>>> trace.get_values(varname="x_1:0"), trace.get_values(varname="y_1:0")
(array([ 0. ,  0.2,  0.3,  0.1]), array([ 0. ,  0.2,  0.5,  0. ]))
If a queried name does not correspond to any parameter in the trace, a ValueError is raised:
>>> names = [variable.name for variable in params]
>>> dummy_samples = [[0., 0.], [0.2, 0.2], [0.3, 0.5], [0.1, 0.]]
>>> trace = PYSGMCMCTrace(chain_id=0, samples=dummy_samples, varnames=names)
>>> trace.get_values(varname="FANTASYVARNAME")
Traceback (most recent call last):
  ...
ValueError: Queried `PYSGMCMCTrace` for values of parameter with name 'FANTASYVARNAME' but the trace does not contain any parameter of that name. Known variable names were: '['x_1:0', 'y_1:0']'
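The burn and thin parameters of get_values correspond to plain slice semantics. A minimal numpy sketch of the assumed behavior (the helper name burn_and_thin is hypothetical, not part of the library):

```python
import numpy as np

def burn_and_thin(values, burn=0, thin=1):
    """Discard the first `burn` draws, then keep every `thin`-th remaining draw."""
    return np.asarray(values)[burn::thin]

chain = np.arange(10)  # stand-in for one parameter's sampled values
print(burn_and_thin(chain, burn=4))          # [4 5 6 7 8 9]
print(burn_and_thin(chain, burn=2, thin=3))  # [2 5 8]
```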

pysgmcmc.diagnostics.pymc3_multitrace(get_sampler, n_chains=2, samples_per_chain=100, keep_every=10, parameter_names=None)[source]¶
Extract chains from sampler and return them as a pymc3.MultiTrace object.
Parameters:  get_sampler (callable) – A callable that takes a tensorflow.Session object as input and returns a (possibly already burnt-in) instance of a pysgmcmc.sampling.MCMCSampler subclass.
 n_chains (int, optional) – Number of individual chains/traces to extract. Defaults to 2.
 samples_per_chain (int, optional) – Number of samples each individual chain should contain. Defaults to 100.
 keep_every (int, optional) – Keep only every keep_every-th sample in each chain. Defaults to 10.
 parameter_names (List[String] or NoneType, optional) – List of names for each target parameter of the sampler. Defaults to None, which attempts to look the parameter names up from the target parameters of the sampler returned by get_sampler.
Returns: multitrace – A pymc3.MultiTrace object containing n_chains chains/traces with samples_per_chain samples each, extracted from the sampler returned by get_sampler.
Return type: pymc3.backends.base.MultiTrace
Examples
Simple (arbitrary) example to showcase usage, mirroring the diagnostics examples below:
>>> import tensorflow as tf
>>> from pysgmcmc.samplers.sghmc import SGHMCSampler
>>> def get_sampler(session):
...     params = [tf.Variable([1.0, 2.0], name="x", dtype=tf.float64)]
...     cost_fun = lambda params: tf.reduce_sum(params)
...     sampler = SGHMCSampler(params=params, cost_fun=cost_fun, session=session)
...     return sampler
...
>>> multitrace = pymc3_multitrace(get_sampler=get_sampler)

pysgmcmc.diagnostics.effective_sample_sizes(get_sampler, n_chains=2, samples_per_chain=100)[source]¶
Calculate the effective sample size (ESS) diagnostic for a sampler returned by callable get_sampler. To do so, extract n_chains traces with samples_per_chain samples each.
Parameters:  get_sampler (callable) – Callable that takes a tensorflow.Session as input and returns a (possibly already “burntin”) pysgmcmc.sampling.MCMCSampler subclass instance.
 n_chains (int, optional) – Number of individual chains/traces to extract. Defaults to 2.
 samples_per_chain (int, optional) – Number of samples each individual chain should contain. Defaults to 100.
Returns: ess – Dictionary that maps the variable name of each target parameter of a sampler obtained by callable get_sampler to the corresponding effective sample size of (each dimension of) this variable.
Return type: dict
Notes
The diagnostic is computed as:
\[\hat{n}_{eff} = \frac{mn}{1 + 2 \sum_{t=1}^T \hat{\rho}_t}\]
where \(\hat{\rho}_t\) is the estimated autocorrelation at lag t, and T is the first odd positive integer for which the sum \(\hat{\rho}_{T+1} + \hat{\rho}_{T+2}\) is negative.
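The formula above can be illustrated in numpy: estimate the autocorrelations from the draws and truncate the sum at the first odd T where the next pair of autocorrelations turns negative. This is a simplified single-chain sketch (m = 1), not the library's implementation:

```python
import numpy as np

def autocorrelation(x, lag):
    """Sample autocorrelation of a 1-d chain at the given lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return np.dot(x[:-lag], x[lag:]) / np.dot(x, x)

def effective_sample_size(chain):
    """n_eff = m * n / (1 + 2 * sum_{t=1}^T rho_t), here for one chain (m = 1)."""
    n = len(chain)
    rho = [autocorrelation(chain, t) for t in range(1, n)]  # rho[0] is rho_1
    rho_sum, T = rho[0], 1
    # Extend T by 2 while the next pair rho_{T+1} + rho_{T+2} stays non-negative.
    while T + 1 < len(rho) and rho[T] + rho[T + 1] >= 0.0:
        rho_sum += rho[T] + rho[T + 1]
        T += 2
    return n / (1.0 + 2.0 * rho_sum)

rng = np.random.RandomState(0)
iid_chain = rng.randn(1000)   # independent draws
ar_chain = np.zeros(1000)     # strongly autocorrelated AR(1) draws
for i in range(1, 1000):
    ar_chain[i] = 0.9 * ar_chain[i - 1] + rng.randn()

print(effective_sample_size(iid_chain))  # close to the nominal 1000 draws
print(effective_sample_size(ar_chain))   # substantially fewer effective samples
```

Positive autocorrelation inflates the denominator, so a sticky chain yields far fewer effective samples than its nominal length.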
References
Gelman et al. (2014)
Examples
Simple (arbitrary) example to showcase usage:
>>> import tensorflow as tf
>>> from pysgmcmc.samplers.sghmc import SGHMCSampler
>>> def get_sampler(session):
...     params = [tf.Variable([1.0, 2.0], name="x", dtype=tf.float64)]
...     cost_fun = lambda params: tf.reduce_sum(params)
...     sampler = SGHMCSampler(params=params, cost_fun=cost_fun, session=session)
...     return sampler
...
>>> ess_vals = effective_sample_sizes(get_sampler=get_sampler)
>>> type(ess_vals)
<class 'dict'>
>>> param_name = list(ess_vals.keys())[0]
>>> param_name.startswith("x")
True
>>> len(ess_vals[param_name])  # x:0 has two dimensions, we have one ess value for each
2

pysgmcmc.diagnostics.gelman_rubin(get_sampler, n_chains=2, samples_per_chain=100)[source]¶
Calculate the Gelman-Rubin convergence diagnostic for a sampler returned by callable get_sampler. To do so, extract n_chains traces with samples_per_chain samples each.
The Gelman-Rubin diagnostic tests for lack of convergence by comparing the variance between multiple chains to the variance within each chain. If convergence has been achieved, the between-chain and within-chain variances should be identical. To be most effective in detecting evidence for non-convergence, each chain should have been initialized to starting values that are dispersed relative to the target distribution.
Parameters:  get_sampler (callable) – Callable that takes a tensorflow.Session as input and returns a (possibly already “burntin”) pysgmcmc.sampling.MCMCSampler subclass instance.
 n_chains (int, optional) – Number of individual chains/traces to extract. Defaults to 2.
 samples_per_chain (int, optional) – Number of samples each individual chain should contain. Defaults to 100.
Returns: gelman_rubin – Dictionary of the potential scale reduction factors, \(\hat{R}\).
Return type: dict
Notes
The diagnostic is computed by:
\[\hat{R} = \frac{\hat{V}}{W}\]
where \(W\) is the within-chain variance and \(\hat{V}\) is the posterior variance estimate for the pooled traces. This is the potential scale reduction factor, which converges to unity when each of the traces is a sample from the target posterior. Values greater than one indicate that one or more chains have not yet converged.
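The potential scale reduction factor can be computed with numpy for a set of equal-length 1-d chains. A simplified illustration of the formula above, not the library's implementation:

```python
import numpy as np

def potential_scale_reduction(chains):
    """R-hat = sqrt(V-hat / W) for an (m, n) array of m chains of length n."""
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    B = n * chain_means.var(ddof=1)          # between-chain variance
    W = chains.var(axis=1, ddof=1).mean()    # within-chain variance
    V_hat = (n - 1) / n * W + B / n          # pooled posterior variance estimate
    return np.sqrt(V_hat / W)

rng = np.random.RandomState(0)
converged = rng.randn(4, 500)                 # chains sampling the same target
diverged = converged + np.arange(4)[:, None]  # chains stuck at different means
print(potential_scale_reduction(converged))   # close to 1
print(potential_scale_reduction(diverged))    # well above 1
```

When the chains agree, between-chain and within-chain variances match and R-hat is near one; shifting the chains apart inflates the between-chain term and pushes R-hat above one.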
References
Brooks and Gelman (1998)
Gelman and Rubin (1992)
Examples
Simple (arbitrary) example to showcase usage:
>>> import tensorflow as tf
>>> from pysgmcmc.samplers.sghmc import SGHMCSampler
>>> def get_sampler(session):
...     params = [tf.Variable([1.0, 2.0], name="x", dtype=tf.float64)]
...     cost_fun = lambda params: tf.reduce_sum(params)
...     sampler = SGHMCSampler(params=params, cost_fun=cost_fun, session=session)
...     return sampler
...
>>> factors = gelman_rubin(get_sampler=get_sampler)
>>> type(factors)
<class 'dict'>
>>> param_name = list(factors.keys())[0]
>>> param_name.startswith("x")
True
>>> len(factors[param_name])  # x:0 has two dimensions, we have one factor for each
2
Sampler Diagnostics¶

pysgmcmc.diagnostics.sampler_diagnostics.effective_sample_sizes(get_sampler, n_chains=2, samples_per_chain=100)[source]¶
Calculate the effective sample size (ESS) diagnostic for a sampler returned by callable get_sampler. To do so, extract n_chains traces with samples_per_chain samples each.
Parameters:  get_sampler (callable) – Callable that takes a tensorflow.Session as input and returns a (possibly already “burntin”) pysgmcmc.sampling.MCMCSampler subclass instance.
 n_chains (int, optional) – Number of individual chains/traces to extract. Defaults to 2.
 samples_per_chain (int, optional) – Number of samples each individual chain should contain. Defaults to 100.
Returns: ess – Dictionary that maps the variable name of each target parameter of a sampler obtained by callable get_sampler to the corresponding effective sample size of (each dimension of) this variable.
Return type: dict
Notes
The diagnostic is computed as:
\[\hat{n}_{eff} = \frac{mn}{1 + 2 \sum_{t=1}^T \hat{\rho}_t}\]
where \(\hat{\rho}_t\) is the estimated autocorrelation at lag t, and T is the first odd positive integer for which the sum \(\hat{\rho}_{T+1} + \hat{\rho}_{T+2}\) is negative.
References
Gelman et al. (2014)
Examples
Simple (arbitrary) example to showcase usage:
>>> import tensorflow as tf
>>> from pysgmcmc.samplers.sghmc import SGHMCSampler
>>> def get_sampler(session):
...     params = [tf.Variable([1.0, 2.0], name="x", dtype=tf.float64)]
...     cost_fun = lambda params: tf.reduce_sum(params)
...     sampler = SGHMCSampler(params=params, cost_fun=cost_fun, session=session)
...     return sampler
...
>>> ess_vals = effective_sample_sizes(get_sampler=get_sampler)
>>> type(ess_vals)
<class 'dict'>
>>> param_name = list(ess_vals.keys())[0]
>>> param_name.startswith("x")
True
>>> len(ess_vals[param_name])  # x:0 has two dimensions, we have one ess value for each
2

pysgmcmc.diagnostics.sampler_diagnostics.gelman_rubin(get_sampler, n_chains=2, samples_per_chain=100)[source]¶
Calculate the Gelman-Rubin convergence diagnostic for a sampler returned by callable get_sampler. To do so, extract n_chains traces with samples_per_chain samples each.
The Gelman-Rubin diagnostic tests for lack of convergence by comparing the variance between multiple chains to the variance within each chain. If convergence has been achieved, the between-chain and within-chain variances should be identical. To be most effective in detecting evidence for non-convergence, each chain should have been initialized to starting values that are dispersed relative to the target distribution.
Parameters:  get_sampler (callable) – Callable that takes a tensorflow.Session as input and returns a (possibly already “burntin”) pysgmcmc.sampling.MCMCSampler subclass instance.
 n_chains (int, optional) – Number of individual chains/traces to extract. Defaults to 2.
 samples_per_chain (int, optional) – Number of samples each individual chain should contain. Defaults to 100.
Returns: gelman_rubin – Dictionary of the potential scale reduction factors, \(\hat{R}\).
Return type: dict
Notes
The diagnostic is computed by:
\[\hat{R} = \frac{\hat{V}}{W}\]
where \(W\) is the within-chain variance and \(\hat{V}\) is the posterior variance estimate for the pooled traces. This is the potential scale reduction factor, which converges to unity when each of the traces is a sample from the target posterior. Values greater than one indicate that one or more chains have not yet converged.
References
Brooks and Gelman (1998)
Gelman and Rubin (1992)
Examples
Simple (arbitrary) example to showcase usage:
>>> import tensorflow as tf
>>> from pysgmcmc.samplers.sghmc import SGHMCSampler
>>> def get_sampler(session):
...     params = [tf.Variable([1.0, 2.0], name="x", dtype=tf.float64)]
...     cost_fun = lambda params: tf.reduce_sum(params)
...     sampler = SGHMCSampler(params=params, cost_fun=cost_fun, session=session)
...     return sampler
...
>>> factors = gelman_rubin(get_sampler=get_sampler)
>>> type(factors)
<class 'dict'>
>>> param_name = list(factors.keys())[0]
>>> param_name.startswith("x")
True
>>> len(factors[param_name])  # x:0 has two dimensions, we have one factor for each
2