The frame is likely made of wood or metal, because most picture frames are made of these materials.
Literally Bogdanov's ‘degression’ (https://philarchive.org/archive/GARABA-3 p. 11 (351))
It's important to note that array numbering in Julia starts with 1, not with 0 as in most other languages.
“It makes sense from a mathematics point of view” © Julia creators. Another surprise for coders
the following dataset
age is String?
You do not need to include any additional packages for it.
Are you serious? Built-in linear algebra? Who would have thought
Basic linear algebra features are already integrated into the Julia standard library.
Whoa! How surprising for Python coders
because of its simplicity
Nope. R is simple. Python is of acceptable difficulty
iterator
A very confusing term for someone outside Python
When indexing into a dataset(), what happens in the background is a call to .getitem(i)
Interestingly, identical(ds$.getitem(1), ds[1]) returns FALSE
.getitem = function(i) { list(x = self$x[i, ], y = self$y[i]) },
It is straightforward like that. I wonder if there are cases when it is not?
items
Observations?
consisting of one input
What is “one input” in this context? A single observation?
For example, for sentiment classification, the tokens awesome, brilliant, great will have higher probability given the positive class than the negative. Similarly, the tokens awful, boring, bad will have higher probability given the negative class than the positive.
Certainly not the reason why the Naïve Bayes is called so, but this is indeed amazingly naïve.
This is the reason why it is not suitable for sentiment analysis, except maybe for baby-talk sentiment analysis.
aka Add-1 smoothing
Pythonist jargon
prior probability P(y=k): class probability before looking at data (i.e., before knowing x);
Use responsibly
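The three ideas annotated above (per-class token probabilities, Add-1 smoothing, and the prior P(y=k)) can be put together in a minimal sketch. The training sentences are made up for illustration, and the names `fit`, `predict`, and `alpha` are assumptions of this sketch, not taken from the source text.

```python
import math
from collections import Counter

# Toy, made-up training data (hypothetical, for illustration only).
train = [
    ("awesome brilliant great", "pos"),
    ("great awesome", "pos"),
    ("awful boring bad", "neg"),
    ("bad boring", "neg"),
]


def fit(docs, alpha=1.0):
    """Estimate priors P(y=k) and smoothed likelihoods P(token | y=k)."""
    class_counts = Counter(label for _, label in docs)
    token_counts = {label: Counter() for label in class_counts}
    for text, label in docs:
        token_counts[label].update(text.split())
    vocab = {tok for counts in token_counts.values() for tok in counts}

    # Prior: share of training documents in each class, before seeing x.
    priors = {k: class_counts[k] / len(docs) for k in class_counts}

    # Add-alpha smoothing (alpha=1 is Add-1): no token gets probability zero.
    likelihoods = {}
    for k, counts in token_counts.items():
        total = sum(counts.values())
        likelihoods[k] = {
            tok: (counts[tok] + alpha) / (total + alpha * len(vocab))
            for tok in vocab
        }
    return priors, likelihoods


def predict(text, priors, likelihoods):
    """Pick the class with the highest log prior plus summed log likelihoods."""
    scores = {}
    for k in priors:
        score = math.log(priors[k])
        for tok in text.split():
            if tok in likelihoods[k]:  # tokens outside the vocabulary are skipped
                score += math.log(likelihoods[k][tok])
        scores[k] = score
    return max(scores, key=scores.get)


priors, likelihoods = fit(train)
```

Because each token contributes independently, a sentence like "not great" would still lean positive in this sketch, which is the naivety the comment above points at.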
The hypothesis is that your brain searched for other words that can be used in the same contexts, found some (e.g., wine), and made a conclusion that tezgüino has meaning similar to those other words. This is the distributional hypothesis:
Our brain can also notice the spelling, so it rather thinks of tequila than wine
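A rough sketch of the distributional hypothesis in code: words are compared only by the words that appear around them, not by spelling. The corpus is made up, the diacritic in tezgüino is dropped for simplicity, and `context_vector`, `cosine`, and the window size are assumptions of this sketch.

```python
import math
from collections import Counter

# Made-up sentences (hypothetical corpus, for illustration only).
corpus = [
    "a bottle of tezguino is on the table",
    "a bottle of wine is on the table",
    "everyone likes tezguino",
    "everyone likes wine",
    "the car is on the road",
]


def context_vector(word, sentences, window=2):
    """Count tokens that occur within `window` positions of `word`."""
    ctx = Counter()
    for sentence in sentences:
        toks = sentence.split()
        for i, tok in enumerate(toks):
            if tok == word:
                lo, hi = max(0, i - window), min(len(toks), i + window + 1)
                ctx.update(toks[j] for j in range(lo, hi) if j != i)
    return ctx


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


tez = context_vector("tezguino", corpus)
sim_wine = cosine(tez, context_vector("wine", corpus))
sim_car = cosine(tez, context_vector("car", corpus))
```

In this toy corpus tezguino and wine share all their contexts, so `sim_wine` comes out higher than `sim_car`. Spelling plays no role here, which is exactly what the comment about tequila says the model misses.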
But how do we know what is meaning?
!!
Functions should take the data structures that users have as opposed to the data structure that developers want. For example, a model function’s only interface should not be constrained to matrices. Frequently, users will have non-numeric predictors such as factors.
Sacred truth
In the context of modeling, it is also important to avoid highly technical jargon, such as Greek letters or obscure terms.
Why do they hate Greek letters so much in America? Greek letters are a concise way of naming values that you would otherwise have to name with long strings of symbols. In addition, Greek argument and value names connect the code to the theory, leaving the user better prepared to understand the corresponding papers if she has to.
Medicine: doctors can use the bot to get answers to medical questions and to improve the quality of diagnosis and treatment.
No! Only for brainstorming, if at all.
Note that parsnip constrains the outcome column of a classification model to be encoded as a factor; using binary numeric values will result in an error.
This is right!
Some of the original argument names can be fairly jargon-y. For example, to specify the amount of regularization to use in a glmnet model, the Greek letter lambda is used. While this mathematical notation is commonly used in the statistics literature, it is not obvious to many people what lambda represents (especially those who consume the model results).
Hands off the Greek letters! :)) They imply minimal math background necessary to understand the models.
For other types of models, the interfaces may be even more disparate. For a person trying to do data analysis, these differences require the memorization of each package’s syntax and can be very frustrating.
Holy truth
The parsnip package, one of the R packages that are part of the tidymodels metapackage, provides a fluent and standardized interface for a variety of different models. In this chapter, we give some motivation for why a common interface is beneficial for understanding and building models in practice and show how to use the parsnip package.
Unified interface for different modelling functions from different packages is a great idea, and it is the prime reason I learn tidymodels at all.
This begs the question: “How can we tell what is best if we don’t measure performance until the test set?”
How can we tell which model is final?
These results can be “stacked” and added to a ggplot(), as shown in Figure 3.3.
Cool!
As an example, M. Kuhn and Johnson (2020) use data to model the daily ridership of Chicago’s public train system using predictors such as the date, the previous ridership results, the weather, and other factors. Table 1.1 shows an approximation of these authors’ hypothetical inner monologue when analyzing these data and eventually selecting a model with sufficient performance.
A great example
Chapter 4, “Subsetting”, now distinguishes between [ and [[ by their intention: [ extracts many values and [[ extracts a single value (previously they were characterised by whether they “simplified” or “preserved”).
I think the difference is that [ extracts a subset and [[ extracts an element.
Without calculus and the ideas of functions and their derivatives, Smith was not able to think about prices in a modern way where price is shaped by demand and supply. Instead, for Smith, each item has a “natural price”: a fixed quantity that depends on the amount of labor used to produce the item. Nowadays, we understand that productivity changes as new methods of production and new inventions are introduced.
So the amount of labor used to produce the item changes with advances in technology, and so does its "natural price", while the market price fluctuates around the latter under the influence of demand and supply. What's wrong with that?
“people” / “passengers” / “customers” / “patients” / “cases” / “passenger deaths”: these are different ways to refer to people. We will consider such quantities to have dimension P, for population.
And also media audience
There are other dimensions: volume, force, pressure, energy, torque, velocity, acceleration, and such. These are called compound dimensions because we represent them as combinations of the fundamental dimensions, L, T, and M. The notation for these combinations involves multiplication and division.
Interesting to think about dimension in media analysis
Another clue is whether “zero” means “nothing.” Daily temperatures in the winter are often near “zero” on the Fahrenheit or Celsius scales, but that in no way means there is a complete absence of heat. Those scales are arbitrary. Another way to think about this clue is whether negative values are meaningful. If so, thinking in terms of orders of magnitude is not likely to be useful.
So it is better to think about media audience in terms of orders of magnitude
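A small sketch of what that could look like, assuming audience size is a count where zero really means nobody and negative values are meaningless. The outlet names and numbers are invented for illustration, and `order_of_magnitude` is a hypothetical helper.

```python
import math


def order_of_magnitude(x):
    """Return floor(log10(x)); only defined for strictly positive quantities."""
    if x <= 0:
        # Zero or negative values have no order of magnitude, which is the
        # clue from the passage above about when this view is not useful.
        raise ValueError("orders of magnitude need a strictly positive quantity")
    return math.floor(math.log10(x))


# Invented audience sizes (hypothetical, not real data).
audiences = {
    "local blog": 1_200,
    "regional paper": 45_000,
    "national broadcaster": 3_000_000,
}
orders = {name: order_of_magnitude(n) for name, n in audiences.items()}
```

On this scale the regional paper sits one step above the blog and the broadcaster two steps above the paper, even though the raw differences are wildly unequal.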