Hypothesis

34 Matching Annotations

Jun 2016
rasbt.github.io

SequentialFeatureSelector - mlxtend

7
1. hmf 27 Jun 2016
  
  in Public
  
  devation
  
  deviation
2. hmf 27 Jun 2016
  
  in Public
  
  lenghts
  
  lengths
3. hmf 27 Jun 2016
  
  in Public
  
  exlusion
  
  exclusion
4. hmf 27 Jun 2016
  
  in Public
  
  reapeated
  
  repeated
5. hmf 27 Jun 2016
  
  in Public
  
  reapeated
  
  repeated
6. hmf 27 Jun 2016
  
  in Public
  
  perfomance
  
  performance
7. hmf 27 Jun 2016
  
  in Public
  
  feeature
  
  Typo

Annotators

hmf

URL

rasbt.github.io/mlxtend/user_guide/feature_selection/SequentialFeatureSelector/
Mar 2015
cs231n.github.io

CS231n Convolutional Neural Networks for Visual Recognition

16
1. hmf 09 Mar 2015
  
  in Public
  
  to beat random search in a carefully-chosen intervals.
  
  to beat random search in a carefully-chosen intervals.
  
  remove the "a"?
  
  Typo
2. hmf 09 Mar 2015
  
  in Public
  
  only training for 1 epoch or even less
  
  only training for 1 epoch or even less .. so we check only in several layers of the network or maybe for all of them during the first epoch?
  
  question
3. hmf 09 Mar 2015
  
  in Public
  
  That is, we are generating a random random with a uniform distribution, but then raising it to the power of 10.
  
  That is, we are generating a random random with a uniform distribution, but then raising it to the power of 10.
  
  The word random is repeated?
  
  Typo
4. hmf 09 Mar 2015
  
  in Public
  
  Tue to the denominator term in the RMSprop update
  
  Due to the denominator term in the RMSprop update
  
  Typo
5. hmf 09 Mar 2015
  
  in Public
  
  the step decay dropout is slightly
  
  remove the word dropout?
  
  Typo
6. hmf 09 Mar 2015
  
  in Public
  
  theoretical converge guarantees
  
  theoretical convergence guarantees
  
  Typo
7. hmf 09 Mar 2015
  
  in Public
  
  update has recently
  
  update that has recently
  
  Typo
8. hmf 09 Mar 2015
  
  in Public
  
  set of parameters
  
  set of weights per network layer?
9. hmf 09 Mar 2015
  
  in Public
  
  model capacity
  
  Needs some more explaining? Reference to bias-variance trade-off? Link to VC dimension?
10. hmf 09 Mar 2015
  
  in Public
  
  validation/training accuracy
  
  I have usually encountered the use of error instead of accuracy, normally when discussing the bias-variance trade-off. It seems more intuitive to me. Maybe we can have an equivalent error graph on the opposite side of the accuracy graph?
11. hmf 09 Mar 2015
  
  in Public
  
  appears more as a slightly more interpretable
  
  appears as a slightly more interpretable (remove first more)
  
  typo
12. hmf 09 Mar 2015
  
  in Public
  
  sizes of million parameters
  
  "can have sizes in the millions of parameters" or "can have millions of parameters"
  
  Typo suggestion
13. hmf 09 Mar 2015
  
  in Public
  
  Therefore, a better solution might be to force a particular random seed before evaluating
  
  Don't understand. What is the random seed used for? Selecting drop-out nodes whose back prop will be checked?
  
  Question
14. hmf 09 Mar 2015
  
  in Public
  
  If your gradcheck for only ~2 or 3 datapoints then you will almost certainly gradcheck for an entire batch.
  
  Just to confirm: if I am using a batch of 10 data points to update the gradient, I need only 2 to 3 of those data points. And this is true irrespective of the size of the batch?
  
  Question
15. hmf 09 Mar 2015
  
  in Public
  
  combine the parameters into a single large parameter vector
  
  The documentation talks of weights and parameters. I assume in this case the parameters are the weights. Maybe reinforce this by adding the word weights in parentheses? It helps us differentiate between the weight matrix and the hyper-parameters.
16. hmf 09 Mar 2015
  
  in Public
  
  hack the code to remove the data loss contribution.
  
  Maybe it should be: hack the code to remove the regularization loss contribution.
  
  Typo
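On the random-seed question in annotation 13, one possible reading is that the seed fixes the dropout mask, so that the numerical gradient (two extra forward passes per weight) and the analytic gradient all see the same network. The sketch below illustrates that idea on a toy one-layer loss. The function names `forward_loss`, `analytic_grad`, and `numeric_grad`, the shapes, and the seed values are all hypothetical and are not taken from the annotated page.

```python
import numpy as np

def forward_loss(W, x, p, seed):
    """Toy loss: sum of a ReLU layer after dropout (hypothetical helper).

    Re-creating the generator from the same seed gives the same mask on every call.
    """
    rng = np.random.default_rng(seed)
    h = np.maximum(0.0, W @ x)
    mask = rng.random(h.shape) < p      # keep each unit with probability p
    return float(np.sum(h * mask)), mask

def analytic_grad(W, x, mask):
    # d/dW of sum(mask * relu(W @ x)): row i is x if unit i is kept and active, else 0
    active = (W @ x > 0) & mask
    return np.outer(active.astype(float), x)

def numeric_grad(W, x, p, seed, h=1e-5):
    # Centered differences; every forward pass reuses the same seed, hence the same mask
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        old = W[idx]
        W[idx] = old + h
        plus, _ = forward_loss(W, x, p, seed)
        W[idx] = old - h
        minus, _ = forward_loss(W, x, p, seed)
        W[idx] = old                    # restore the weight
        grad[idx] = (plus - minus) / (2 * h)
    return grad

data_rng = np.random.default_rng(0)
W = data_rng.standard_normal((4, 3))
x = data_rng.standard_normal(3)

_, mask = forward_loss(W, x, p=0.5, seed=42)
max_diff = np.max(np.abs(analytic_grad(W, x, mask) - numeric_grad(W, x, 0.5, 42)))
print("max abs difference:", max_diff)
```

If each forward pass drew a fresh mask instead, the two loss values in the centered difference would come from different sub-networks and the comparison would be meaningless.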

Tags

typo

question

suggestion

Typo

Question

Annotators

hmf

URL

cs231n.github.io/neural-networks-3/
cs231n.github.io

CS231n Convolutional Neural Networks for Visual Recognition

4
1. hmf 09 Mar 2015
  
  in Public
  
  U1 = np.random.rand(*H1.shape) < p
  
  How does this work? Does it set all elements of some randomly selected rows of the weight matrix to 0?
2. hmf 09 Mar 2015
  
  in Public
  
  This is motivated by based on a compromise and an equivalent analysis
  
  Typo/grammar: The motivation for this is based on a compromise and an equivalent analysis
  
  Typo
3. hmf 09 Mar 2015
  
  in Public
  
  This turns out to be a mistake,
  
  I think SGD will also fail for 0 values because any change in the gradient will be multiplied by these zeros and will therefore never change. Is this thinking correct?
4. hmf 09 Mar 2015
  
  in Public
  
  with proper data normalization it is reasonable to assume that approximately half of the weights will be positive and half of them will be negative
  
  Can anyone explain why?
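On the first question above, a minimal sketch may help, assuming `H1` holds the hidden-layer activations as its name suggests. The comparison `< p` is elementwise, so `U1` is a boolean mask with one entry per activation. Multiplying by it zeroes individual activations and leaves the weight matrix itself untouched. The shapes, the seed, and the names `W1`, `x`, and `H1_dropped` are made up for illustration.

```python
import numpy as np

np.random.seed(0)                    # fixed seed only so the example is repeatable
p = 0.5                              # probability of keeping a unit

W1 = np.random.randn(4, 3)           # hypothetical first-layer weights
x = np.random.randn(3)               # hypothetical input
W1_before = W1.copy()

H1 = np.maximum(0, W1 @ x)           # hidden activations, shape (4,)
U1 = np.random.rand(*H1.shape) < p   # boolean mask, same shape as H1
H1_dropped = H1 * U1                 # masked-out activations become 0

print(U1)
print(H1_dropped)
```

Dropping unit `i` has the same effect on this forward pass as ignoring row `i` of `W1`, but the stored weights do not change, and a new mask is drawn on the next pass.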

Tags

Typo

Annotators

hmf

URL

cs231n.github.io/neural-networks-2/
cs231n.github.io

CS231n Convolutional Neural Networks for Visual Recognition

3
1. hmf 09 Mar 2015
  
  in Public
  
  to to
  
  Typo
  
  typo
2. hmf 09 Mar 2015
  
  in Public
  
  xi scaled
  
  \(x_i\) is scaled
  
  typo
3. hmf 06 Mar 2015
  
  in Public
  
  Notice that this is the gradient only with respect to the row of W that corresponds to the correct class. For the other rows where j≠yi the gradient is:
  
  I am at a loss here (no pun intended). So for a given class \(y_i\) I only calculate the gradient for all those \(L_i\) that are labelled with \(y_i\). I assume I have to do this for all \(y_i\). So what do I use the expression below for?
  
  TIA.
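On the question in annotation 3, one way to read the two expressions is that a single example \(x_i\) contributes to every row of \(W\): the row of the correct class \(y_i\) gets one formula, and each other row \(j \neq y_i\) gets the other. The sketch below assumes the standard multiclass SVM (hinge) loss \(L_i = \sum_{j \neq y_i} \max(0, w_j^T x_i - w_{y_i}^T x_i + \Delta)\). The function name `svm_loss_and_grad` and the small numbers are hypothetical.

```python
import numpy as np

def svm_loss_and_grad(W, x, y, delta=1.0):
    """Hinge loss and gradient for one example (hypothetical helper)."""
    scores = W @ x
    margins = scores - scores[y] + delta
    margins[y] = 0.0                   # the correct class is excluded from the sum
    violated = margins > 0             # classes whose margin is not met
    loss = float(np.sum(margins[violated]))

    dW = np.zeros_like(W)
    dW[violated] = x                   # rows j != y with positive margin: +x
    dW[y] = -np.sum(violated) * x      # correct-class row: -(number of violations) * x
    return loss, dW

W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
x = np.array([1.0, 2.0])
loss, dW = svm_loss_and_grad(W, x, y=0)
print(loss)   # 5.0
print(dW)     # rows: [-2, -4], [1, 2], [1, 2]
```

Here both wrong classes violate the margin, so the correct-class row receives \(-2x\) and each of the other two rows receives \(+x\). Rows whose margin is satisfied would receive zero.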

Tags

typo

Annotators

hmf

URL

cs231n.github.io/optimization-1/
cs231n.github.io

CS231n Convolutional Neural Networks for Visual Recognition

1
1. hmf 09 Mar 2015
  
  in Public
  
  The final loss for this example is 1.58 for the SVM and 0.452 for the Softmax classifier
  
  The figure above has a value of 1.04 for the softmax case. I think that should be \(0.452\).
  
  typo
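One way to check the two Softmax values mentioned in the annotation above is to recompute them. The sketch below assumes the figure's class scores are [-2.85, 0.86, 0.28], with the third class correct and a margin of 1. Those numbers are recalled from the figure and are not quoted on this page, so treat them as an assumption.

```python
import math

scores = [-2.85, 0.86, 0.28]   # assumed scores, not quoted on this page
correct = 2                    # assumed index of the correct class
delta = 1.0                    # assumed SVM margin

# Multiclass SVM (hinge) loss: sum over the wrong classes
svm_loss = sum(
    max(0.0, s - scores[correct] + delta)
    for j, s in enumerate(scores)
    if j != correct
)

# Softmax probability of the correct class
exps = [math.exp(s) for s in scores]
prob_correct = exps[correct] / sum(exps)

softmax_loss_ln = -math.log(prob_correct)       # natural logarithm
softmax_loss_log10 = -math.log10(prob_correct)  # base-10 logarithm

print(round(svm_loss, 2), round(softmax_loss_ln, 2), round(softmax_loss_log10, 3))
```

Under that assumption the hinge loss comes out at 1.58, and the cross-entropy is about 1.04 with the natural logarithm and about 0.452 with base 10. That would suggest the two numbers differ only in the base of the logarithm rather than one of them being a miscalculation.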

Tags

typo

Annotators

hmf

URL

cs231n.github.io/linear-classify/
cs231n.github.io

CS231n Convolutional Neural Networks for Visual Recognition

3
1. hmf 09 Mar 2015
  
  in Public
  
  The synapses are not just a single weight a complex non-linear dynamical system
  
  Typo. Grammar.
  
  The synapses are not just a single weight, but a complex non-linear dynamical system
  
  typo
2. hmf 09 Mar 2015
  
  in Public
  
  noone
  
  The usual spelling is "no one" and to a lesser extent "no-one". Just nit-picking. B-)
  
  typo
3. hmf 09 Mar 2015
  
  in Public
  
  dropou
  
  typo: drop-out
  
  typo

Tags

typo

Annotators

hmf

URL

cs231n.github.io/neural-networks-1/
