4 Matching Annotations
  1. Last 7 days
    1. Describe how you could incorporate this information into your analysis.

      Flag: suggested answer (don't read if you don't want to see a (possibly incorrect) attempt):

      Update: on reflection, a bi-modal continuous distribution over pi may be a better prior (though the update could be harder to perform).

      Attempt: we model the parameter pi in a Bayesian way. We put a discrete prior on pi (0.7 with probability 1/2, 0.2 with probability 1/2), then weight each 1/2 by the likelihood of the observations given that parameter value. That is, compute the likelihood when pi = 0.7, multiply it by 1/2, then divide by the normalising constant to get our new probability that pi = 0.7 (and do the same for pi = 0.2). The normalising constant is the sum of the 'scores' for 0.7 and 0.2 (each 1/2 times the corresponding likelihood), so we can't divide by it until we have the score for both 0.2 and 0.7.
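      A minimal sketch of this update (the function name and the k-successes-in-n-trials setup are my assumptions, not given in the question):

      ```python
      # Bernoulli trials with unknown success probability pi, believed to be
      # either 0.7 or 0.2, each with prior probability 1/2. After observing
      # k successes in n trials, reweight each candidate by its likelihood
      # and renormalise.

      def posterior(k, n, candidates=(0.7, 0.2), prior=(0.5, 0.5)):
          # Unnormalised score for each candidate: prior * likelihood.
          # (The binomial coefficient is common to both candidates, so it
          # cancels in the normalisation and can be omitted.)
          scores = [pr * p**k * (1 - p)**(n - k)
                    for p, pr in zip(candidates, prior)]
          z = sum(scores)  # normalising constant: sum of both scores
          return {p: s / z for p, s in zip(candidates, scores)}

      post = posterior(2, 3)  # e.g. 2 successes in 3 trials
      print(post)
      ```

      Note how neither candidate's posterior probability can be computed on its own: the normalising constant z needs both scores, exactly as described above.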

    2. Explain your answers

      Flag: suggested answer (don't read if you don't want to see a (possibly incorrect) attempt):

      Grateful for comments here, as I am not very certain about the situations where the MLE approach is better versus those where the Bayesian approach is better.

      Suggested answer:

      c(i) is the frequentist approach, where we have a single parameter estimate (the MLE). c(ii) is the Bayesian approach: a distribution over parameters, where we update our prior belief based on observations.

      If we have no prior belief, c(i) may be a better estimate: in (my version of) c(ii) we constrain the parameter to be 0.7 or 0.2 and only update our relative convictions about these two values, which is a strong prior assumption (we can never obtain 0.5, for instance).

      If we do have a prior belief, and also want to incorporate uncertainty estimates for our parameters, I think c(ii) is better.

      If the MLE is 0.7, then c(i) gives 0.7 and c(ii) gives 0.7 with very high probability (and 0.2 with very low probability), so the two methods will perform similarly.
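      A quick numerical illustration of the last point, under an assumed setup of my own (a hypothetical 70 successes in 100 trials, so the empirical rate is 0.7):

      ```python
      # If the empirical success rate is 0.7, the MLE is 0.7 and the
      # two-point posterior puts almost all its mass on pi = 0.7,
      # so the frequentist and Bayesian answers effectively agree.
      from math import exp, log

      k, n = 70, 100
      mle = k / n  # frequentist point estimate

      # Work with log-scores to avoid numerical underflow for large n.
      def log_score(p):
          return log(0.5) + k * log(p) + (n - k) * log(1 - p)

      s7, s2 = log_score(0.7), log_score(0.2)
      post_07 = 1 / (1 + exp(s2 - s7))  # posterior probability of pi = 0.7
      print(mle, post_07)
      ```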

    3. likelihood estimator of π?

      Flag: suggested answer (don't read if you don't want to see a (possibly incorrect) attempt):

      Attempt: MLE = k/3
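      A grid-search sanity check of this answer (the n = 3 Bernoulli setup is taken from the question; the grid search itself is my own sketch):

      ```python
      # For k successes in n = 3 Bernoulli trials, the likelihood
      # pi^k * (1 - pi)^(n - k) should be maximised at pi = k/n.
      n = 3

      def likelihood(pi, k):
          return pi**k * (1 - pi)**(n - k)

      grid = [i / 1000 for i in range(1001)]  # pi values in [0, 1]
      results = {}
      for k in range(n + 1):
          results[k] = max(grid, key=lambda pi: likelihood(pi, k))
      print(results)  # each maximiser should be close to k/3
      ```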

    4. If you thought that this assumption was unrealistic, how would you relax this assumption?

      Flag: don't read if you don't want to see a (possibly incorrect) attempt at an answer. (Grateful for any comments/disagreements or further points to add.)

      Attempted answer: the assumption is that, given a class, the features are independent. We could relax this by using 2-d Gaussians for our class-conditional distributions with non-zero covariance (off-diagonal) terms, so that we have dependencies between features (currently these terms are set to zero, which enforces independence).
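      A sketch of the relaxation on synthetic data (the dataset and all names are my assumptions; requires numpy):

      ```python
      # Gaussian class-conditionals in 2-d: a diagonal covariance encodes
      # conditional independence of the two features given the class;
      # allowing a non-zero off-diagonal term relaxes that assumption.
      import numpy as np

      rng = np.random.default_rng(0)
      # One class's data, with genuinely correlated features (true cov 0.8).
      X = rng.multivariate_normal([0.0, 0.0],
                                  [[1.0, 0.8], [0.8, 1.0]], size=500)

      mu = X.mean(axis=0)
      full_cov = np.cov(X.T)                 # full 2x2 covariance: dependence allowed
      diag_cov = np.diag(np.diag(full_cov))  # naive-Bayes-style: off-diagonals zeroed
      print(full_cov)
      print(diag_cov)
      ```

      Fitting the full covariance recovers the feature dependence that the diagonal (independence) version throws away.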