Describe how you could incorporate this information into your analysis.
Flag: suggested answer (don't read if you don't want to see a (possibly incorrect) attempt):
Update: on reflection, some bi-modal continuous distribution may be better (though it would potentially be more difficult to perform the update).
Attempt: we model the parameter π in a Bayesian way. We put a distribution on π (0.7 with probability 1/2, 0.2 with probability 1/2), then weight each 1/2 by the likelihood of the observations given that parameter value. That is, we compute the likelihood when π = 0.7, multiply it by 1/2, then divide by the normalising constant to get our new probability for π = 0.7 (and do the same for π = 0.2). The normalising constant is the sum of the 'scores' (1/2 × likelihood) for 0.7 and 0.2, so we cannot divide by it until we have computed the score for both values.
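For concreteness, here is a minimal sketch of that update in Python, assuming the data are k successes out of n Bernoulli(π) trials (the values of k, n and the prior weights below are placeholders, not taken from the question):

```python
from scipy.stats import binom

# Two-point prior over pi: P(pi = 0.7) = P(pi = 0.2) = 1/2
prior = {0.7: 0.5, 0.2: 0.5}

# Hypothetical data: k successes in n trials (placeholder values)
k, n = 2, 3

# Unnormalised posterior 'scores': prior weight * likelihood
scores = {pi: w * binom.pmf(k, n, pi) for pi, w in prior.items()}

# The normalising constant is the sum of both scores, so we can
# only divide once both scores have been computed
Z = sum(scores.values())
posterior = {pi: s / Z for pi, s in scores.items()}

print(posterior)  # new probabilities for pi = 0.7 and pi = 0.2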
-
Explain your answers.
Flag: suggested answer (don't read if you don't want to see a (possibly incorrect) attempt):
Grateful for comments here, as I am not very certain about the situations where the MLE approach is better versus those where the Bayesian approach is better.
Suggested answer:
c(i) is the frequentist approach, where we have a single parameter estimate (the MLE). c(ii) is the Bayesian approach: a distribution over parameters, where we update our prior belief based on the observations.

If we have no prior belief, c(i) may be a better estimate: in (my version of) c(ii) we are constraining the parameter to be 0.7 or 0.2 and only updating our relative convictions about these two values, which is a strong prior assumption (we can never get 0.5, for instance).

If we do have a prior belief, and also want to incorporate uncertainty estimates for our parameters, I think c(ii) is better.

If the MLE is 0.7, then c(i) gives 0.7 and c(ii) gives 0.7 with a very high probability and 0.2 with a very low probability, so the methods will perform similarly (see the sketch below).
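A quick numerical check of that last point, under the same hypothetical Bernoulli setup as above: as n grows with true π = 0.7, the MLE approaches 0.7 and the two-point posterior puts almost all its mass on 0.7, so the two approaches give similar answers.

```python
import numpy as np
from scipy.stats import binom

rng = np.random.default_rng(0)
true_pi = 0.7

for n in [3, 30, 300]:
    k = rng.binomial(n, true_pi)
    mle = k / n  # c(i): the frequentist point estimate

    # c(ii): posterior over the two allowed values of pi
    scores = {pi: 0.5 * binom.pmf(k, n, pi) for pi in (0.7, 0.2)}
    Z = sum(scores.values())
    post = {pi: s / Z for pi, s in scores.items()}

    print(f"n={n}: MLE={mle:.3f}, P(pi=0.7 | data)={post[0.7]:.3f}")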
-
What is the maximum likelihood estimator of π?
Flag: suggested answer (don't read if you don't want to see a (possibly incorrect) attempt):
Attempt: MLE = k/3
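For reference, assuming the setup is k successes in 3 Bernoulli(π) trials (my reading of the question), this answer drops out of maximising the binomial log-likelihood:

```latex
L(\pi) = \binom{3}{k}\,\pi^{k}(1-\pi)^{3-k}
\quad\Rightarrow\quad
\frac{d}{d\pi}\log L(\pi) = \frac{k}{\pi} - \frac{3-k}{1-\pi} = 0
\quad\Rightarrow\quad
\hat{\pi} = \frac{k}{3}.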
-
If you thought that this assumption was unrealistic, how would you relax this assumption?
Flag: don't read if you don't want to see a (possibly incorrect) attempt at an answer. (Grateful for any comments/disagreements, or further points to add.)
Attempted answer: the assumption is that, given a class, the features are independent. We could relax this by using 2-D Gaussians for our class distributions with non-zero covariance (off-diagonal) terms, so that we have dependencies between the features (currently these terms are set to zero to enforce independence).
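A minimal sketch of that relaxation, with hypothetical 2-D data (X0, X1 and all the numbers below are placeholders, not from the question): fit a full-covariance Gaussian per class instead of a diagonal one.

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)

# Hypothetical 2-D data for two classes (placeholders)
X0 = rng.multivariate_normal([0, 0], [[1.0, 0.8], [0.8, 1.0]], size=100)
X1 = rng.multivariate_normal([2, 2], [[1.0, -0.5], [-0.5, 1.0]], size=100)

# Naive Bayes would force the off-diagonal covariance terms to zero;
# fitting the full covariance allows within-class feature dependence.
dists = [
    multivariate_normal(mean=X.mean(axis=0), cov=np.cov(X, rowvar=False))
    for X in (X0, X1)
]

# Classify a new point by comparing class log-likelihoods (equal priors assumed)
x_new = np.array([1.0, 1.5])
log_liks = [d.logpdf(x_new) for d in dists]
print("predicted class:", int(np.argmax(log_liks)))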
-