Likelihoods
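The Likelihood sub-module represents the likelihood we have assumed for the data in a given cluster. Each Likelihood class represents the sampling model
\[ y_1, \ldots, y_k \mid \tau \stackrel{\mathrm{iid}}{\sim} f(\cdot \mid \tau) \]
for a specific choice of the probability density function \(f\), where \(\tau\) denotes the parameters of the likelihood.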
Main operations performed
A likelihood object must be able to perform the following operations:
- Evaluate the log-likelihood at a given point or on a grid of points, also for a dependent likelihood, i.e. one with covariates associated with each observation [lpdf() and lpdf_grid()].
- Evaluate the log-likelihood of the whole cluster starting from the vector of unconstrained parameters [cluster_lpdf_from_unconstrained()]. This is needed only if you want to rely on a Metropolis-like updater. Observe that the AbstractLikelihood class provides two such methods, one returning a double and one returning a stan::math::var. The latter is used to automatically compute the gradient of the likelihood via Stan's automatic differentiation, if needed. In practice, users do not need to implement both methods separately and can implement only one templated method.
- Manage the insertion and deletion of a datum in the cluster [add_datum() and remove_datum()].
- Update the summary statistics associated with the likelihood [update_summary_statistics()]. Summary statistics (when available) are used to evaluate the likelihood function on the whole cluster, as well as to perform the posterior updates of the likelihood parameters. This usually gives a substantial speedup.
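As a rough illustration of the operations listed above, the following standalone sketch mimics a univariate normal likelihood that keeps the sum and the sum of squares of the data in its cluster. It uses only the C++ standard library. The class ToyNormLikelihood and all of its members are hypothetical and are not part of bayesmix, whose classes work on Eigen types.

```cpp
#include <cassert>
#include <cmath>
#include <set>

// Hypothetical, standalone stand-in for a univariate normal likelihood.
// It is NOT the bayesmix implementation: data are plain doubles.
class ToyNormLikelihood {
 public:
  ToyNormLikelihood(double mean, double var) : mean_(mean), var_(var) {
    assert(var > 0.0);
  }

  // Log-density of a single datum under N(mean, var).
  double lpdf(double y) const {
    const double two_pi = 2.0 * std::acos(-1.0);
    return -0.5 * std::log(two_pi * var_) -
           0.5 * (y - mean_) * (y - mean_) / var_;
  }

  // Inserts a datum: records its index and updates the summary statistics.
  void add_datum(int id, double y) {
    if (idx_.insert(id).second) update_sum_stats(y, true);
  }

  // Removes a datum: forgets its index and downdates the statistics.
  void remove_datum(int id, double y) {
    if (idx_.erase(id) > 0) update_sum_stats(y, false);
  }

  // Log-likelihood of the whole cluster, computed from the summary
  // statistics only (no loop over the data).
  double cluster_lpdf() const {
    const double n = static_cast<double>(idx_.size());
    const double two_pi = 2.0 * std::acos(-1.0);
    // sum_i (y_i - mean)^2 expanded in terms of sum_ and sum_sq_
    const double sq_dev = sum_sq_ - 2.0 * mean_ * sum_ + n * mean_ * mean_;
    return -0.5 * n * std::log(two_pi * var_) - 0.5 * sq_dev / var_;
  }

  int get_card() const { return static_cast<int>(idx_.size()); }

 private:
  void update_sum_stats(double y, bool add) {
    const double sign = add ? 1.0 : -1.0;
    sum_ += sign * y;
    sum_sq_ += sign * y * y;
  }

  double mean_, var_;
  double sum_ = 0.0, sum_sq_ = 0.0;
  std::set<int> idx_;
};
```

In this sketch the cost of cluster_lpdf() does not grow with the cluster size, which is where the speedup mentioned above comes from.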
Code structure
In principle, the Likelihood classes are responsible only for evaluating the log-likelihood function given a specific choice of the parameters \(\tau\).
Therefore, a simple inheritance structure would seem appropriate. However, the nature of the parameters \(\tau\) can be very different across different models (think for instance of the difference between the univariate normal and the multivariate normal parameters). As such, we employ the curiously recurring template pattern (CRTP) to manage the polymorphic nature of Likelihood classes.
The class AbstractLikelihood defines the API, i.e. all the methods that need to be called from outside of a Likelihood class.
A template class BaseLikelihood inherits from AbstractLikelihood and implements some of the necessary virtual methods, which need not be implemented by the child classes.
Instead, child classes must implement:
- compute_lpdf: evaluates \(\log f(y \mid \tau)\), the log-density of a single datum \(y\) given the parameters \(\tau\)
- update_sum_stats: updates the summary statistics when an observation is allocated to or de-allocated from the hierarchy
- clear_summary_statistics: clears all the summary statistics
- is_dependent: defines if the given likelihood depends on covariates
- is_multivariate: defines if the given likelihood is for multivariate data
In case the likelihood needs to be used in a Metropolis-like updater, child classes should also implement:
- cluster_lpdf_from_unconstrained: evaluates the log-likelihood of the whole cluster as a function of \(\tilde{\tau}\), where \(\tilde{\tau}\) is the vector of unconstrained parameters.
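The layering described above (an abstract interface, a templated base that knows the derived type, and a concrete child) could look roughly like the following simplified sketch. All names prefixed with Sketch are hypothetical, data are plain doubles instead of Eigen vectors, and clone() here copies the state only because this toy class holds no data.

```cpp
#include <cassert>
#include <cmath>
#include <memory>
#include <vector>

// Hypothetical names throughout: a simplified illustration of the
// Abstract / Base<Derived, State> / Derived layering, not bayesmix code.
class SketchAbstractLikelihood {
 public:
  virtual ~SketchAbstractLikelihood() = default;
  virtual std::shared_ptr<SketchAbstractLikelihood> clone() const = 0;
  virtual std::vector<double> lpdf_grid(
      const std::vector<double> &grid) const = 0;
  virtual bool is_multivariate() const = 0;
};

// The base class knows the concrete Derived type at compile time, so it can
// implement clone() and lpdf_grid() once for every likelihood.
template <class Derived, typename State>
class SketchBaseLikelihood : public SketchAbstractLikelihood {
 public:
  std::shared_ptr<SketchAbstractLikelihood> clone() const override {
    return std::make_shared<Derived>(static_cast<const Derived &>(*this));
  }

  std::vector<double> lpdf_grid(
      const std::vector<double> &grid) const override {
    std::vector<double> out;
    out.reserve(grid.size());
    for (double y : grid) {
      out.push_back(static_cast<const Derived &>(*this).compute_lpdf(y));
    }
    return out;
  }

  void set_state(const State &state) { state_ = state; }

 protected:
  State state_{};
};

// Toy state container holding a mean and a variance.
struct SketchUniLS {
  double mean;
  double var;
};

// A child class only supplies what is specific to its model.
class SketchNormLikelihood
    : public SketchBaseLikelihood<SketchNormLikelihood, SketchUniLS> {
 public:
  bool is_multivariate() const override { return false; }

  double compute_lpdf(double y) const {
    assert(state_.var > 0.0);
    const double two_pi = 2.0 * std::acos(-1.0);
    const double dev = y - state_.mean;
    return -0.5 * std::log(two_pi * state_.var) - 0.5 * dev * dev / state_.var;
  }
};
```

Because the State type is a template parameter, each child can use a differently shaped parameter container while callers still work through the common abstract interface.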
Abstract Classes
-
class AbstractLikelihood¶
Abstract class for a generic likelihood
This class is the basis for a curiously recurring template pattern (CRTP) for Likelihood objects, and is solely composed of interface functions for derived classes to use.
A likelihood can evaluate the log probability density function (lpdf) at a certain point given the current value of the parameters, or compute directly the lpdf for the whole cluster.
Whenever possible, we also store in a Likelihood instance the sufficient statistics of the data allocated to the cluster, in order to speed up computations.
Subclassed by BaseLikelihood< LaplaceLikelihood, State::UniLS >, BaseLikelihood< FALikelihood, State::FA >, BaseLikelihood< UniLinRegLikelihood, State::UniLinRegLS >, BaseLikelihood< MultiNormLikelihood, State::MultiLS >, BaseLikelihood< UniNormLikelihood, State::UniLS >, BaseLikelihood< Derived, State >
Public Functions
-
virtual ~AbstractLikelihood() = default¶
Default destructor.
-
virtual std::shared_ptr<AbstractLikelihood> clone() const = 0¶
Returns an independent, data-less copy of this object.
-
inline double lpdf(const Eigen::RowVectorXd &datum, const Eigen::RowVectorXd &covariate = Eigen::RowVectorXd(0)) const¶
Public wrapper for compute_lpdf() methods.
-
inline virtual double cluster_lpdf_from_unconstrained(Eigen::VectorXd unconstrained_params) const¶
Evaluates the log likelihood over all the data in the cluster given unconstrained parameter values. By unconstrained parameters we mean that each entry of the parameter vector can range over (-inf, inf). Usually, some kind of transformation is required from the unconstrained parameterization to the actual parameterization.
- Parameters:
unconstrained_params – vector collecting the unconstrained parameters
- Returns:
The evaluation of the log likelihood over all data in the cluster
-
inline virtual stan::math::var cluster_lpdf_from_unconstrained(Eigen::Matrix<stan::math::var, Eigen::Dynamic, 1> unconstrained_params) const¶
This version using stan::math::var type is required for Stan automatic differentiation. Evaluates the log likelihood over all the data in the cluster given unconstrained parameter values. By unconstrained parameters we mean that each entry of the parameter vector can range over (-inf, inf). Usually, some kind of transformation is required from the unconstrained parameterization to the actual parameterization.
- Parameters:
unconstrained_params – vector collecting the unconstrained parameters
- Returns:
The evaluation of the log likelihood over all data in the cluster
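To illustrate how a single templated function can back both overloads of cluster_lpdf_from_unconstrained(), here is a minimal sketch for a normal model with unconstrained parameters (mean, log-variance). The Dual type is a tiny forward-mode dual number used only as a stand-in for an automatic-differentiation scalar. It is an assumption of this example and behaves differently from stan::math::var, which implements reverse-mode differentiation. The function name toy_cluster_lpdf_from_unconstrained is likewise hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Minimal forward-mode dual number, used here ONLY as a stand-in for an
// automatic-differentiation scalar (it is not stan::math::var).
struct Dual {
  double val;  // function value
  double der;  // derivative with respect to the seeded input
  Dual(double v = 0.0, double d = 0.0) : val(v), der(d) {}
};

inline Dual operator+(Dual a, Dual b) {
  return Dual(a.val + b.val, a.der + b.der);
}
inline Dual operator-(Dual a, Dual b) {
  return Dual(a.val - b.val, a.der - b.der);
}
inline Dual operator*(Dual a, Dual b) {
  return Dual(a.val * b.val, a.der * b.val + a.val * b.der);
}
inline Dual operator/(Dual a, Dual b) {
  return Dual(a.val / b.val,
              (a.der * b.val - a.val * b.der) / (b.val * b.val));
}
inline Dual exp(Dual a) {
  return Dual(std::exp(a.val), std::exp(a.val) * a.der);
}

// One templated implementation: T is double for plain evaluation, or an
// AD scalar when gradients are needed. unconstrained = (mean, log(var)).
template <typename T>
T toy_cluster_lpdf_from_unconstrained(const std::vector<double> &data,
                                      const std::vector<T> &unconstrained) {
  using std::exp;  // falls back to std::exp when T is double
  assert(unconstrained.size() == 2);
  const T mean = unconstrained[0];
  const T log_var = unconstrained[1];
  const T var = exp(log_var);  // map back to the constrained scale
  const T log_two_pi = T(std::log(2.0 * std::acos(-1.0)));
  T out = T(0.0);
  for (double y : data) {
    const T dev = T(y) - mean;
    out = out - T(0.5) * (log_two_pi + log_var) - T(0.5) * dev * dev / var;
  }
  return out;
}
```

Instantiating the template with double gives the plain value. Instantiating it with the AD scalar and seeding the derivative of one input gives the corresponding partial derivative without a second hand-written implementation.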
-
virtual Eigen::VectorXd lpdf_grid(const Eigen::MatrixXd &data, const Eigen::MatrixXd &covariates = Eigen::MatrixXd(0, 0)) const = 0¶
Evaluates the log-likelihood of data in a grid of points
- Parameters:
data – Grid of points (by row) which are to be evaluated
covariates – (Optional) covariate vectors associated to data
- Returns:
The evaluation of the lpdf
-
virtual bool is_multivariate() const = 0¶
Returns whether the likelihood models multivariate data or not.
-
virtual bool is_dependent() const = 0¶
Returns whether the likelihood depends on covariate values or not.
-
virtual void set_state_from_proto(const google::protobuf::Message &state_, bool update_card = true) = 0¶
Read and set state values from a given Protobuf message.
-
virtual void set_state_from_unconstrained(const Eigen::VectorXd &unconstrained_state) = 0¶
Read and set state values from the vector of unconstrained parameters.
-
virtual void write_state_to_proto(google::protobuf::Message *out) const = 0¶
Writes current state to a Protobuf message by pointer.
-
virtual void set_dataset(const Eigen::MatrixXd *const dataset) = 0¶
Sets the pointer to the dataset in the cluster.
-
virtual void add_datum(const int id, const Eigen::RowVectorXd &datum, const Eigen::RowVectorXd &covariate = Eigen::RowVectorXd(0)) = 0¶
Adds a datum and its index to the likelihood.
-
virtual void remove_datum(const int id, const Eigen::RowVectorXd &datum, const Eigen::RowVectorXd &covariate = Eigen::RowVectorXd(0)) = 0¶
Removes a datum and its index from the likelihood.
-
inline void update_summary_statistics(const Eigen::RowVectorXd &datum, const Eigen::RowVectorXd &covariate, bool add)¶
Public wrapper for update_sum_stats() methods.
-
virtual void clear_summary_statistics() = 0¶
Resets the values of the summary statistics in the likelihood.
-
virtual Eigen::VectorXd get_unconstrained_state() = 0¶
Returns the vector of the unconstrained parameters for this likelihood.
-
template<class Derived, typename State>
class BaseLikelihood : public AbstractLikelihood¶
Base template class of a Likelihood object
This class derives from AbstractLikelihood and is templated over Derived (needed for the curiously recurring template pattern) and State: an instance of BaseState
- Template Parameters:
Derived – Name of the implemented derived class
State – Class name of the container for state values
Public Functions
-
BaseLikelihood() = default¶
Default constructor.
-
~BaseLikelihood() = default¶
Default destructor.
-
inline virtual std::shared_ptr<AbstractLikelihood> clone() const override¶
Returns an independent, data-less copy of this object.
-
inline virtual double cluster_lpdf_from_unconstrained(Eigen::VectorXd unconstrained_params) const override¶
Evaluates the log likelihood over all the data in the cluster given unconstrained parameter values. By unconstrained parameters we mean that each entry of the parameter vector can range over (-inf, inf). Usually, some kind of transformation is required from the unconstrained parametrization to the actual one.
- Parameters:
unconstrained_params – vector collecting the unconstrained parameters
- Returns:
The evaluation of the log likelihood over all data in the cluster
-
inline virtual stan::math::var cluster_lpdf_from_unconstrained(Eigen::Matrix<stan::math::var, Eigen::Dynamic, 1> unconstrained_params) const override¶
This version using stan::math::var type is required for Stan automatic differentiation. Evaluates the log likelihood over all the data in the cluster given unconstrained parameter values. By unconstrained parameters we mean that each entry of the parameter vector can range over (-inf, inf). Usually, some kind of transformation is required from the unconstrained parametrization to the actual one.
- Parameters:
unconstrained_params – vector collecting the unconstrained parameters
- Returns:
The evaluation of the log likelihood over all data in the cluster
-
virtual Eigen::VectorXd lpdf_grid(const Eigen::MatrixXd &data, const Eigen::MatrixXd &covariates = Eigen::MatrixXd(0, 0)) const override¶
Evaluates the log-likelihood of data in a grid of points
- Parameters:
data – Grid of points (by row) which are to be evaluated
covariates – (Optional) covariate vectors associated to data
- Returns:
The evaluation of the lpdf
-
inline int get_card() const¶
Returns the current cardinality of the cluster.
-
inline double get_log_card() const¶
Returns the logarithm of the current cardinality of the cluster.
-
inline std::set<int> get_data_idx() const¶
Returns the indexes of data points belonging to this cluster.
-
virtual void write_state_to_proto(google::protobuf::Message *out) const override¶
Writes current state to a Protobuf message by pointer.
-
inline virtual Eigen::VectorXd get_unconstrained_state() override¶
Returns a vector storing the state in its unconstrained form.
-
inline void set_state(const State &state_, bool update_card = true)¶
Updates the state of the likelihood with the object given as input.
-
inline virtual void set_state_from_proto(const google::protobuf::Message &state_, bool update_card = true) override¶
Read and set state values from a given Protobuf message.
-
inline virtual void set_state_from_unconstrained(const Eigen::VectorXd &unconstrained_state) override¶
Updates the state of the likelihood starting from its unconstrained form.
-
inline virtual void set_dataset(const Eigen::MatrixXd *const dataset) override¶
Sets the pointer to the dataset in the cluster.
-
inline const Eigen::MatrixXd *get_dataset() const¶
Returns the pointer to the dataset in the cluster.
-
virtual void add_datum(const int id, const Eigen::RowVectorXd &datum, const Eigen::RowVectorXd &covariate = Eigen::RowVectorXd(0)) override¶
Adds a datum and its index to the likelihood.
-
virtual void remove_datum(const int id, const Eigen::RowVectorXd &datum, const Eigen::RowVectorXd &covariate = Eigen::RowVectorXd(0)) override¶
Removes a datum and its index from the likelihood.
-
inline void clear_data()¶
Resets cardinality and indexes of data in this cluster.
Classes for Univariate Likelihoods
-
class UniNormLikelihood : public BaseLikelihood<UniNormLikelihood, State::UniLS>¶
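A univariate normal likelihood, using the State::UniLS state. Represents the model:
\[ y_1, \ldots, y_k \mid \mu, \sigma^2 \stackrel{\mathrm{iid}}{\sim} N(\mu, \sigma^2), \]
where \((\mu, \sigma^2)\) are stored in a State::UniLS state. The sufficient statistics stored are the sum of the \(y_i\)'s and the sum of \(y_i^2\).
Public Functions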
-
inline virtual bool is_multivariate() const override¶
Returns whether the likelihood models multivariate data or not.
-
inline virtual bool is_dependent() const override¶
Returns whether the likelihood depends on covariate values or not.
-
virtual void clear_summary_statistics() override¶
Resets the values of the summary statistics in the likelihood.
Protected Functions
-
virtual double compute_lpdf(const Eigen::RowVectorXd &datum) const override¶
Evaluates the log-likelihood of data in a single point
- Parameters:
datum – Point which is to be evaluated
- Returns:
The evaluation of the lpdf
-
virtual void update_sum_stats(const Eigen::RowVectorXd &datum, bool add) override¶
Updates cluster statistics when a datum is added or removed from it
- Parameters:
datum – Data point which is being added or removed
add – Whether the datum is being added or removed
-
class UniLinRegLikelihood : public BaseLikelihood<UniLinRegLikelihood, State::UniLinRegLS>¶
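A scalar linear regression model, using the State::UniLinRegLS state. Represents the model:
\[ y_i \mid x_i, \beta, \sigma^2 \stackrel{\mathrm{ind}}{\sim} N(x_i \beta, \sigma^2), \]
where \(x_i\) is the covariate row vector of observation \(i\) and \((\beta, \sigma^2)\) are stored in a State::UniLinRegLS state. The sufficient statistics stored are the sum of \(y_i^2\), the sum of \(x_i^T x_i\) and the sum of \(y_i x_i^T\).
Public Functions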
-
inline virtual bool is_multivariate() const override¶
Returns whether the likelihood models multivariate data or not.
-
inline virtual bool is_dependent() const override¶
Returns whether the likelihood depends on covariate values or not.
-
virtual void clear_summary_statistics() override¶
Resets the values of the summary statistics in the likelihood.
Protected Functions
-
virtual double compute_lpdf(const Eigen::RowVectorXd &datum, const Eigen::RowVectorXd &covariate) const override¶
Evaluates the log-likelihood of data in a single point
- Parameters:
datum – Point which is to be evaluated
covariate – Covariate vector associated to datum
- Returns:
The evaluation of the lpdf
-
virtual void update_sum_stats(const Eigen::RowVectorXd &datum, const Eigen::RowVectorXd &covariate, bool add) override¶
Updates cluster statistics when a datum is added or removed from it
- Parameters:
datum – Data point which is being added or removed
covariate – Covariate vector associated to datum
add – Whether the datum is being added or removed
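To see why three sums are enough for a Gaussian linear regression, the residual sum of squares can be expanded as below. This is a standard identity written out for illustration, not an excerpt from the library source. It assumes that each covariate \(x_i\) is a \(1 \times p\) row vector, that \(\beta\) is a \(p \times 1\) column vector, and that \(k\) is the number of observations in the cluster.

```latex
\begin{aligned}
\sum_{i=1}^{k} (y_i - x_i \beta)^2
  &= \sum_{i=1}^{k} y_i^2
   - 2\, \beta^T \sum_{i=1}^{k} y_i x_i^T
   + \beta^T \Big( \sum_{i=1}^{k} x_i^T x_i \Big) \beta, \\
\log L(\beta, \sigma^2)
  &= -\frac{k}{2} \log(2 \pi \sigma^2)
   - \frac{1}{2 \sigma^2} \sum_{i=1}^{k} (y_i - x_i \beta)^2 .
\end{aligned}
```

The cluster log-likelihood therefore depends on the data only through the sum of \(y_i^2\), the sum of \(y_i x_i^T\) and the sum of \(x_i^T x_i\), each of which can be updated in constant time when a datum is added or removed.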
-
class LaplaceLikelihood : public BaseLikelihood<LaplaceLikelihood, State::UniLS>¶
A univariate Laplace likelihood, using the State::UniLS state. Represents the model:
\[ y_1, \ldots, y_k \mid \mu, \sigma^2 \stackrel{\mathrm{iid}}{\sim} \mathrm{Laplace}(\mu, \lambda), \]
where \(\mu\) is the mean and center of the distribution and \(\sigma^2\) is the variance. The scale parameter \(\lambda\) is then \(\sqrt{\sigma^2 / 2}\). These parameters are stored in a State::UniLS state. Since the Laplace likelihood does not have sufficient statistics other than the whole sample, the update_sum_stats() method does nothing.
Public Functions
-
inline virtual bool is_multivariate() const override¶
Returns whether the likelihood models multivariate data or not.
-
inline virtual bool is_dependent() const override¶
Returns whether the likelihood depends on covariate values or not.
-
inline virtual void clear_summary_statistics() override¶
Resets the values of the summary statistics in the likelihood.
Protected Functions
-
virtual double compute_lpdf(const Eigen::RowVectorXd &datum) const override¶
Evaluates the log-likelihood of data in a single point
- Parameters:
datum – Point which is to be evaluated
- Returns:
The evaluation of the lpdf
-
inline virtual void update_sum_stats(const Eigen::RowVectorXd &datum, bool add) override¶
Updates cluster statistics when a datum is added or removed from it
- Parameters:
datum – Data point which is being added or removed
add – Whether the datum is being added or removed
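As a small worked example of the mean and variance parameterization of the Laplace likelihood, the sketch below converts a variance into the Laplace scale (the variance of a Laplace distribution is twice the squared scale) and evaluates the log-density. The function names are hypothetical helpers written for this example and are not part of bayesmix.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: Laplace scale implied by a given variance,
// using var = 2 * scale^2.
inline double laplace_scale_from_var(double var) {
  assert(var > 0.0);
  return std::sqrt(var / 2.0);
}

// Hypothetical helper: Laplace log-density parameterized by mean and
// variance, log f(y) = -log(2 * scale) - |y - mean| / scale.
inline double laplace_lpdf(double y, double mean, double var) {
  const double scale = laplace_scale_from_var(var);
  return -std::log(2.0 * scale) - std::abs(y - mean) / scale;
}
```

Since the absolute deviations |y - mean| cannot be summarized by a fixed number of sums, a cluster log-likelihood built on this density has to loop over all the data in the cluster.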
Classes for Multivariate Likelihoods
-
class MultiNormLikelihood : public BaseLikelihood<MultiNormLikelihood, State::MultiLS>¶
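A multivariate normal likelihood, using the State::MultiLS state. Represents the model:
\[ y_1, \ldots, y_k \mid \mu, \Sigma \stackrel{\mathrm{iid}}{\sim} N_p(\mu, \Sigma), \]
where \((\mu, \Sigma)\) are stored in a State::MultiLS state. The sufficient statistics stored are the sum of the \(y_i\)'s and the sum of \(y_i^T y_i\).
Public Functions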
-
inline virtual bool is_multivariate() const override¶
Returns whether the likelihood models multivariate data or not.
-
inline virtual bool is_dependent() const override¶
Returns whether the likelihood depends on covariate values or not.
-
virtual void clear_summary_statistics() override¶
Resets the values of the summary statistics in the likelihood.
Protected Functions
-
virtual double compute_lpdf(const Eigen::RowVectorXd &datum) const override¶
Evaluates the log-likelihood of data in a single point
- Parameters:
datum – Point which is to be evaluated
- Returns:
The evaluation of the lpdf
-
virtual void update_sum_stats(const Eigen::RowVectorXd &datum, bool add) override¶
Updates cluster statistics when a datum is added or removed from it
- Parameters:
datum – Data point which is being added or removed
add – Whether the datum is being added or removed
-
class FALikelihood : public BaseLikelihood<FALikelihood, State::FA>¶
A Gaussian factor analytic likelihood, using the State::FA state. Represents the model:
\[ y_1, \ldots, y_k \mid \mu, \Sigma, \Lambda \stackrel{\mathrm{iid}}{\sim} N_p(\mu, \Sigma + \Lambda \Lambda^T), \]
where \(\Lambda\) is a \(p \times d\) matrix, usually \(d \ll p\), and \(\Sigma\) is a diagonal matrix. Parameters are stored in a State::FA state. We store as summary statistics the sum of the \(y_i\)'s, but it is not sufficient for all the updates involved. Therefore, all the observations allocated to a cluster are processed when computing the cluster lpdf.
Public Functions
-
inline virtual bool is_multivariate() const override¶
Returns whether the likelihood models multivariate data or not.
-
inline virtual bool is_dependent() const override¶
Returns whether the likelihood depends on covariate values or not.
-
virtual void clear_summary_statistics() override¶
Resets the values of the summary statistics in the likelihood.
Protected Functions
-
virtual double compute_lpdf(const Eigen::RowVectorXd &datum) const override¶
Evaluates the log-likelihood of data in a single point
- Parameters:
datum – Point which is to be evaluated
- Returns:
The evaluation of the lpdf
-
virtual void update_sum_stats(const Eigen::RowVectorXd &datum, bool add) override¶
Updates cluster statistics when a datum is added or removed from it
- Parameters:
datum – Data point which is being added or removed
add – Whether the datum is being added or removed
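For intuition about the factor analytic model, the sketch below builds a covariance matrix from a loading matrix Lambda and a diagonal matrix, assuming the usual factor-analytic form Lambda * Lambda^T + Sigma. That form, the helper name fa_covariance, and the use of nested std::vector instead of Eigen matrices are all assumptions of this example and not the bayesmix implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<std::vector<double>>;

// Hypothetical helper (not a bayesmix function): builds the p x p matrix
// Lambda * Lambda^T + diag(sigma_diag) from a p x d loading matrix.
inline Matrix fa_covariance(const Matrix &lambda,
                            const std::vector<double> &sigma_diag) {
  const std::size_t p = lambda.size();
  assert(sigma_diag.size() == p);
  Matrix cov(p, std::vector<double>(p, 0.0));
  for (std::size_t i = 0; i < p; ++i) {
    for (std::size_t j = 0; j < p; ++j) {
      assert(lambda[i].size() == lambda[j].size());
      // (Lambda * Lambda^T)_{ij} = sum_l Lambda_{il} * Lambda_{jl}
      for (std::size_t l = 0; l < lambda[i].size(); ++l) {
        cov[i][j] += lambda[i][l] * lambda[j][l];
      }
    }
    cov[i][i] += sigma_diag[i];  // add the diagonal part
  }
  return cov;
}
```

With d much smaller than p, the covariance is described by p * d + p numbers instead of the p * (p + 1) / 2 free entries of an unrestricted covariance matrix.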