Practice Exercise 7.C
Bayesian Learning and Continuous Probability


1: Background Reading

2: Learning Goals

  • Review continuous probability.
  • Compute posterior probability densities from a conjugate prior.
  • Apply Bayesian parameter estimation to perform classification.

3: Directed Questions

  1. Which of the following distributions are discrete, which are continuous, and which are special cases of one another?
    • Uniform(0, 1)
    • Bernoulli(p)
    • Beta(α, β)
    • Categorical(p1, ..., pk)
    • Dirichlet(α1, ..., αk)
    [solution]

  2. What is a hyperparameter? [solution]

  3. What does it mean to say that the Beta distribution family is a conjugate prior for the Bernoulli distribution? [solution]
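Conjugacy (question 3) can be checked numerically: multiplying a Beta prior by a Bernoulli likelihood and renormalizing must reproduce another Beta density. The sketch below does this on a grid using only the standard library; the particular prior parameters and data counts are arbitrary choices for illustration.

```python
import math

def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x, via the gamma function."""
    const = math.gamma(a + b) / (math.gamma(a) * math.gamma(b))
    return const * x ** (a - 1) * (1 - x) ** (b - 1)

# Prior Beta(a, b); observe k successes in n Bernoulli(theta) trials.
a, b = 2.0, 3.0
n, k = 5, 4

# Unnormalized posterior: prior(theta) * likelihood(theta),
# evaluated on a grid, then normalized by a Riemann sum.
grid = [i / 1000 for i in range(1, 1000)]
unnorm = [beta_pdf(t, a, b) * t ** k * (1 - t) ** (n - k) for t in grid]
z = sum(unnorm) / 1000  # approximate normalizing constant
posterior = [u / z for u in unnorm]

# Conjugacy says the posterior is exactly Beta(a + k, b + n - k).
closed_form = [beta_pdf(t, a + k, b + n - k) for t in grid]
max_err = max(abs(p - c) for p, c in zip(posterior, closed_form))
print(max_err)  # small; only Riemann-sum error remains
```

The grid posterior and the closed-form Beta density agree up to discretization error, which is the practical content of "the Beta family is a conjugate prior for the Bernoulli".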

4: Exercise: Parameter Estimation with the Beta Distribution

  1. Suppose X and Y are samples from the same Bernoulli(θ) distribution, conditionally independent given θ, where we assume θ ~ Uniform(0, 1) in the prior.
    • (a) If we observe X = 1, what is the posterior distribution of the parameter θ? [solution]
    • (b) What is the expected value of Y given X = 1? That is, find E(Y | X = 1) = E(E(Y | θ) | X = 1), where the outer expectation is taken over the posterior distribution from part (a). [solution]
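Exercise 1 can be sanity-checked by simulation (this sketch is not part of the exercise): draw θ from the prior, draw X and Y from Bernoulli(θ), and keep only the draws where X = 1. The kept θ values approximate the posterior of part (a), and the kept Y values approximate the conditional expectation of part (b).

```python
import random

random.seed(0)

# theta ~ Uniform(0, 1); X, Y ~ Bernoulli(theta), conditionally
# independent given theta.  Conditioning on X = 1 by rejection.
kept_theta = []
kept_y = []
for _ in range(200_000):
    theta = random.random()
    x = 1 if random.random() < theta else 0
    if x == 1:
        y = 1 if random.random() < theta else 0
        kept_theta.append(theta)
        kept_y.append(y)

post_mean = sum(kept_theta) / len(kept_theta)  # ~ E(theta | X = 1)
pred = sum(kept_y) / len(kept_y)               # ~ E(Y | X = 1)
print(post_mean, pred)
```

The two estimates should agree up to Monte Carlo noise, since E(Y | θ) = θ, which is exactly the identity the exercise asks you to exploit.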

  2. Open the beta distribution applet. Which member of the beta distribution family corresponds to the uniform prior? In other words, for which values of α and β is Beta(α, β) equivalent to Uniform(0, 1)? [solution]
    • (a) Suppose we start with a Beta(α, β) prior belief in the parameter θ. If n positive examples (class 1) and m negative examples (class 0) are observed, what are the parameters of the posterior beta distribution? [solution]
    • (b) What is the posterior distribution when the prior is uniform? [solution]
    • (c) What is the expected value of θ over the posterior from part (b)? [solution]
    • (d) Compare your answer in part (c) against the MLE of θ using pseudocounts. [solution]
    • (e) According to our posterior belief, what is the probability of the next sample belonging to class 1? [solution]
    • (f) The value calculated in part (c) is called the Bayes estimate. Suppose the true parameter value is θ*. What is the expected value of the Bayes estimate, in terms of θ*, if we obtain a random sample of size N (i.e. m+n = N is fixed)? [solution]
    • (g) Name one optimality property of the Bayes estimate. [solution]
    • (h) An estimator is called unbiased if its expected value is equal to the true parameter value, regardless of the true parameter's value. Is the Bayes estimate unbiased? [solution]
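The update rule in parts (a)-(d) is short enough to write down directly. The following is a minimal sketch, with the caveat that it uses Beta(1, 1) as the uniform prior (the member asked about in the applet question) and made-up counts of 7 positives out of 10 observations:

```python
# Beta posterior update: from a Beta(alpha, beta) prior, observing
# n positives and m negatives gives a Beta(alpha + n, beta + m)
# posterior, whose mean is the Bayes estimate of theta.
def posterior_params(alpha, beta, n, m):
    return alpha + n, beta + m

def posterior_mean(alpha, beta, n, m):
    a, b = posterior_params(alpha, beta, n, m)
    return a / (a + b)

# Uniform prior Beta(1, 1), then 7 positives and 3 negatives:
bayes = posterior_mean(1, 1, 7, 3)

# Pseudocount MLE ("add-one" smoothing) for comparison, as in part (d):
pseudo_mle = (7 + 1) / (10 + 2)
print(bayes, pseudo_mle)  # the two coincide
```

That the two quantities coincide is the point of part (d): the Bayes estimate under a uniform prior is the MLE with one pseudocount per class.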

  3. Suppose corn fields are chosen at random, as squares of side length between 1 and 5 metres.
    • (a) What are the minimum and maximum possible areas? [solution]
    • (b) Assuming a uniform distribution over the allowed range of lengths, what are the median length and median area? [solution]
    • (c) What would the median area be if we instead assumed the area to be uniformly distributed between 1 and 25 square metres? [solution]
    • (d) Does it make sense to use a uniform prior when no additional information is given? [solution]
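The corn-field question contrasts two seemingly "uninformative" priors: side length uniform on [1, 5] versus area uniform on [1, 25]. A quick simulation (an illustrative sketch, not part of the exercise) shows they are not the same assumption:

```python
import random

random.seed(1)

def median(xs):
    """Sample median (middle element of the sorted sample)."""
    xs = sorted(xs)
    return xs[len(xs) // 2]

# Prior 1: length uniform on [1, 5]; the area is the length squared.
areas_from_length = [random.uniform(1, 5) ** 2 for _ in range(100_001)]

# Prior 2: area uniform on [1, 25] directly.
areas_direct = [random.uniform(1, 25) for _ in range(100_001)]

print(median(areas_from_length), median(areas_direct))
```

The two medians differ, which is the heart of part (d): "uniform" depends on the parameterization, so a uniform prior is not automatically a neutral choice.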

5: Learning Goals Revisited

  • Review continuous probability.
  • Compute posterior probability densities from a conjugate prior.
  • Apply Bayesian parameter estimation to perform classification.
