Models

Bayesian Neural Network

class pysgmcmc.models.bayesian_neural_network.BayesianNeuralNetwork(network_architecture=<function simple_tanh_network>, batch_size=20, normalize_input: bool = True, normalize_output: bool = True, num_steps: int = 13000, burn_in_steps: int = 3000, keep_every: int = 100, loss=<class 'pysgmcmc.models.losses.NegativeLogLikelihood'>, metrics=(<class 'torch.nn.modules.loss.MSELoss'>, ), logging_configuration: Dict[str, Any] = {'datefmt': 'y/m/d', 'level': 20}, optimizer=<class 'pysgmcmc.optimizers.sghmc.SGHMC'>, **optimizer_kwargs)[source]
__init__(network_architecture=<function simple_tanh_network>, batch_size=20, normalize_input: bool = True, normalize_output: bool = True, num_steps: int = 13000, burn_in_steps: int = 3000, keep_every: int = 100, loss=<class 'pysgmcmc.models.losses.NegativeLogLikelihood'>, metrics=(<class 'torch.nn.modules.loss.MSELoss'>, ), logging_configuration: Dict[str, Any] = {'datefmt': 'y/m/d', 'level': 20}, optimizer=<class 'pysgmcmc.optimizers.sghmc.SGHMC'>, **optimizer_kwargs) → None[source]

Bayesian Neural Network for regression problems.

Bayesian Neural Networks use Bayesian methods to estimate the posterior distribution of a neural network’s weights. This makes it possible to also predict uncertainties for test points, which in turn makes Bayesian Neural Networks well suited for Bayesian optimization. This module uses stochastic gradient MCMC methods to sample from the posterior distribution.

See [1] for more details.

[1] J. T. Springenberg, A. Klein, S. Falkner and F. Hutter.
Bayesian Optimization with Robust Bayesian Neural Networks. In Advances in Neural Information Processing Systems 29 (2016).
Parameters:
  • network_architecture (pysgmcmc.torch_typing.NetworkFactory, optional) – Function mapping integer input dimensionality to an (initialized) torch.nn.Module.
  • normalize_input (bool, optional) – Specifies if inputs should be normalized to zero mean and unit variance.
  • normalize_output (bool, optional) – Specifies if outputs should be normalized to zero mean and unit variance.
  • num_steps (int, optional) – Number of sampling steps to perform after burn-in is finished. In total, num_steps // keep_every network weights will be sampled. Defaults to 13000.
  • burn_in_steps (int, optional) – Number of burn-in steps to perform. This value is passed to the given optimizer if it supports special burn-in specific behavior. Networks sampled during burn-in are discarded. Defaults to 3000.
  • keep_every (int, optional) – Number of sampling steps (after burn-in) to perform before keeping a sample. In total, num_steps // keep_every network weights will be sampled. Defaults to 100.
  • loss (pysgmcmc.torch_typing.TorchLoss, optional) – Loss to use. Default: pysgmcmc.models.losses.NegativeLogLikelihood
  • logging_configuration (typing.Dict[str, typing.Any], optional) – Configuration for Python’s logging module. Specifying “level” as logging.INFO or lower in this dictionary enables a progress bar during training. If no “level” is specified, logging.INFO is used by default. Defaults to {“level”: logging.INFO, “datefmt”: “y/m/d”}.
  • optimizer (torch.optim.Optimizer, optional) – Callable that constructs a torch.optim.Optimizer instance (typically the optimizer class itself), used to sample network weights; any **optimizer_kwargs are forwarded to it. Defaults to pysgmcmc.optimizers.sghmc.SGHMC.
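
Example – a minimal construction sketch illustrating these parameters. The concrete values are illustrative only, and the lr keyword is an assumption about what SGHMC accepts through **optimizer_kwargs:

    # Hedged sketch: values are illustrative; `lr` is an assumed SGHMC keyword.
    from pysgmcmc.models.bayesian_neural_network import BayesianNeuralNetwork
    from pysgmcmc.optimizers.sghmc import SGHMC

    bnn = BayesianNeuralNetwork(
        batch_size=20,        # mini-batch size used for each sampling step
        num_steps=13000,      # sampling steps; num_steps // keep_every weight samples retained
        burn_in_steps=3000,   # networks sampled during burn-in are discarded
        keep_every=100,       # keep one weight sample every 100 post-burn-in steps
        optimizer=SGHMC,      # stochastic gradient MCMC sampler
        lr=1e-2,              # forwarded to SGHMC via **optimizer_kwargs (assumed keyword)
    )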

_keep_sample(step: int) → bool[source]
Determine if the network weight sample recorded at step should be stored.
Samples are recorded after burn-in (step > self.num_burn_in_steps), and only at every self.keep_every-th step.
Parameters:step (int) – Current iteration count.
Returns:should_keep – True if and only if the network weights should be stored at step.
Return type:bool
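
A literal transcription of the two conditions above into a standalone predicate (a sketch of the documented behaviour, not the method’s actual source):

    def keep_sample(step: int, num_burn_in_steps: int = 3000, keep_every: int = 100) -> bool:
        # Discard everything recorded during burn-in ...
        if step <= num_burn_in_steps:
            return False
        # ... and afterwards keep only every `keep_every`-th step.
        return step % keep_every == 0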
network_weights

Extract current network weight values as np.ndarray.

Returns:weight_values – Numpy array containing current network weight values.
Return type:np.ndarray
train(x_train: numpy.ndarray, y_train: numpy.ndarray)[source]

Train a BNN using input datapoints x_train with corresponding labels y_train.
Parameters:
  • x_train (numpy.ndarray (N, D)) – Input training datapoints.
  • y_train (numpy.ndarray (N,)) – Input training labels.
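
An end-to-end usage sketch on toy data; the predict call and its return values (posterior mean and variance per test point) are assumptions about the model’s interface, since that method is not documented in this section:

    import numpy as np
    from pysgmcmc.models.bayesian_neural_network import BayesianNeuralNetwork

    rng = np.random.RandomState(0)
    x_train = rng.uniform(-1.0, 1.0, size=(100, 1))                 # shape (N, D)
    y_train = np.sinc(x_train[:, 0]) + rng.normal(0.0, 0.01, 100)   # shape (N,)

    bnn = BayesianNeuralNetwork(num_steps=5000, burn_in_steps=1000, keep_every=50)
    bnn.train(x_train, y_train)

    x_test = np.linspace(-1.0, 1.0, 200).reshape(-1, 1)
    mean, variance = bnn.predict(x_test)   # assumed interface: posterior mean and variance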

Losses

class pysgmcmc.models.losses.NegativeLogLikelihood(parameters: Iterable[torch.Tensor], num_datapoints: int, variance_prior: Callable[torch.Tensor, torch.Tensor] = <function log_variance_prior>, weight_prior: Callable[Iterable[torch.Tensor], torch.Tensor] = <function weight_prior>, size_average: bool = True, reduce: bool = False)[source]

Implementation of the BNN negative log likelihood for regression problems.

__init__(parameters: Iterable[torch.Tensor], num_datapoints: int, variance_prior: Callable[torch.Tensor, torch.Tensor] = <function log_variance_prior>, weight_prior: Callable[Iterable[torch.Tensor], torch.Tensor] = <function weight_prior>, size_average: bool = True, reduce: bool = False) → None[source]
Instantiate a loss object for given network parameters.
Requires num_datapoints of the entire regression dataset for proper scaling.
Parameters:
  • parameters (typing.Iterable[torch.Tensor]) – Pytorch variables of BNN parameters.
  • num_datapoints (int) – Total number of datapoints in the entire regression dataset.
  • variance_prior (pysgmcmc.torch_typing.VariancePrior, optional) – Prior for BNN variance. Default: pysgmcmc.models.priors.log_variance_prior.
  • weight_prior (pysgmcmc.torch_typing.WeightPrior, optional) – Prior for BNN weights. Default: pysgmcmc.models.priors.weight_prior.
forward(input: Iterable[torch.Tensor], target: Iterable[torch.Tensor]) → torch.Tensor[source]

Compute the NLL of the 2-d network predictions input with respect to the (batch) labels target.

Parameters:
  • input (pysgmcmc.torch_typing.Predictions) – Network predictions.
  • target (pysgmcmc.torch_typing.Targets) – Labels for each datapoint in the current batch.
Returns:nll – Scalar value: NLL of the BNN predictions input with respect to the labels target.
Return type:torch.Tensor
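
A minimal sketch of wiring this loss to a network; the two-column output (a mean and a variance-related second column) and the exact shapes are assumptions consistent with the 2-d prediction convention described above:

    import torch
    import torch.nn as nn
    from pysgmcmc.models.losses import NegativeLogLikelihood

    num_datapoints = 1000                    # size of the full regression dataset
    net = nn.Sequential(nn.Linear(1, 50), nn.Tanh(), nn.Linear(50, 2))

    loss_fn = NegativeLogLikelihood(
        parameters=list(net.parameters()),   # BNN parameters, used by the weight prior
        num_datapoints=num_datapoints,       # scales the mini-batch likelihood to the full dataset
    )

    x_batch, y_batch = torch.randn(20, 1), torch.randn(20)
    nll = loss_fn(net(x_batch), y_batch)     # scalar torch.Tensor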

pysgmcmc.models.losses.get_loss(loss_cls: Union[Callable[[Iterable[torch.Tensor], int], Callable[[Iterable[torch.Tensor], Iterable[torch.Tensor]], torch.Tensor]], Callable[Callable[[Iterable[torch.Tensor], Iterable[torch.Tensor]], torch.Tensor]]], **loss_kwargs) → Callable[[Iterable[torch.Tensor], Iterable[torch.Tensor]], torch.Tensor][source]
Wrapper to use NegativeLogLikelihood interchangeably with other pytorch losses.
loss_kwargs is expected to be a dict mapping the key parameters to the network parameters and the key num_datapoints to an integer giving the number of datapoints in the entire regression dataset.
Parameters:
  • loss_cls (pysgmcmc.torch_typing.TorchLoss) – Class type of a loss, e.g. pysgmcmc.models.losses.NegativeLogLikelihood.
  • loss_kwargs (dict) – Keyword arguments to be passed to loss_cls. Must contain the keys parameters (BNN parameters) and num_datapoints (number of datapoints in the entire regression dataset).
Returns:loss_instance – Instance of loss_cls.
Return type:pysgmcmc.torch_typing.TorchLossFunction
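
A sketch of the dispatch described above, using the documented keyword keys; the tiny nn.Linear merely stands in for a real BNN architecture:

    import torch.nn as nn
    from pysgmcmc.models.losses import get_loss, NegativeLogLikelihood

    net = nn.Linear(1, 2)                    # stand-in for a network with mean + variance outputs
    loss_fn = get_loss(
        NegativeLogLikelihood,               # loss_cls
        parameters=list(net.parameters()),   # required key: BNN parameters
        num_datapoints=1000,                 # required key: size of the full dataset
    )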

pysgmcmc.models.losses.to_bayesian_loss(torch_loss)[source]
Wrapper to make pytorch losses compatible with our BNN predictions.
BNN predictions are 2-d, with the second dimension representing model variance. This wrapper passes only the network’s mean prediction into torch_loss, which allows us to evaluate torch_loss on our network predictions as usual.
Parameters:torch_loss (pysgmcmc.torch_typing.TorchLoss) – Class type of a pytorch loss to evaluate on our BNN, e.g. torch.nn.MSELoss.
Returns:Class type that behaves like torch_loss but assumes inputs coming from a BNN. It evaluates torch_loss only on the first dimension of the BNN predictions, i.e. on the mean prediction.
Return type:torch_loss_changed
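
A sketch of wrapping a standard pytorch loss so it only sees the mean column of a 2-d BNN prediction; the column layout (mean first, variance second) follows the description above and is otherwise an assumption:

    import torch
    from pysgmcmc.models.losses import to_bayesian_loss

    BayesianMSE = to_bayesian_loss(torch.nn.MSELoss)   # returns a class, per the description
    mse = BayesianMSE()

    predictions = torch.randn(20, 2)    # column 0: mean, column 1: variance (assumed layout)
    targets = torch.randn(20)
    error = mse(predictions, targets)   # MSE computed on the mean column only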

Architectures

Priors