57 Matching Annotations
  1. Last 7 days
    1. When the dimension of a space is infinite and uncountable (there is no countable basis), the methods used to define norms on finite- or countably-infinite-dimensional spaces no longer apply; but for spaces of integrable functions, an analogous notion can still be defined.

      What is an infinite-dimensional space?

  2. Jun 2021
    1. two-dimensional shapes must be scaled by area rather than radius or by side length. Linear scaling for a two-dimensional shape results in exaggerated differences.

      [[distinguish between]] symbols() and points() in terms of the arguments: if using symbols, the shape must be scaled by area rather than by radius or by side length.

    2. Drawing symbols on a map is a lot like drawing points on a map. There are two main differences. The first is that you use symbols() instead of points(). The former lets you add scaled circles, squares, rectangles, stars, and other symbols. The second difference is that you must scale the two-dimensional symbols correctly. Whereas points only represent location, scaled shapes both represent location and a second metric that corresponds to the location.

      Pictures drawn using the function symbols() compared with points()

      image
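      The area-scaling rule above can be sketched in Python (the `radius_for_value` helper is a hypothetical illustration, not part of the R `symbols()` API): for the drawn area of a circle to be proportional to a data value, the radius must grow with the square root of the value.

```python
import math

def radius_for_value(value, scale=1.0):
    # A circle's area is pi * r^2, so for the drawn AREA to be
    # proportional to the data value, the radius must grow with
    # the square root of the value.
    return scale * math.sqrt(value / math.pi)

# Area (correct) scaling for a value 4x as large:
r1 = radius_for_value(1)
r4 = radius_for_value(4)
area_ratio = (r4 / r1) ** 2  # the areas differ by the data ratio, 4.0
```

      Scaling the radius linearly by the value instead would make the larger circle look 16x bigger, which is the exaggeration the highlight warns about.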

    1. Oftentimes, other files accompany the .shp file with extensions .dbf and .prj. The first is called the shapefile attribute format, and it stores data that is associated to the geographic features encoded in the main shapefile. For example, you might have counties in your shapefile (.shp), and the unemployment rate for each county would be in the attribute file format (.dbf).

      The shapefile is stored with the [[.shp]] extension. The file format encodes points, lines, and polygons in geometric space and is a common way to distribute spatial data.

      Other files with extensions such as .dbf or .prj may accompany the .shp file.

      [[.dbf]] is the shapefile attribute format; it stores data associated with the geographic features, e.g. the unemployment rate for each county. [[.prj]] contains the **projection** of the coordinates in the [[shapefiles]].

    1. There are many ways to add geographic boundaries in R, but there are two main ways. The first uses the maps package and relies on boundary data loaded by the package. The second brings in external data in the form of shapefiles, which is more flexible but requires a bit more care.

      Maps are layers placed one on top of the other. #[[important gain]].

  3. flowingdata.com
    1. The third package, rgdal, relies on a library called GDAL, which stands for Geospatial Data Abstraction Library. It helps you with map projections and geographic math.

      rgdal is a package that helps with map projections and geographic math in #R. #package #[[R package]] #outline

  4. May 2021
    1. the meaning of a function should be independent of the parameter names chosen by its author -- has important consequences for programming languages. The simplest consequence is that the parameter names of a function must remain local to the body of the function.

      The parameter names of a function must remain local to the body of the function

      [[important gain]] #python #[[function {programming}]]

    1. “As the number of identically distributed, randomly generated variables increases, their sample mean (average) approaches their theoretical mean.” Besides being easily one of the most important laws of statistics, this is the basis for Monte Carlo simulations and allows us to build a stochastic model by the method of statistical trials.

      [[strong law of large numbers(SLLN)]] is the basis for the [[monte carlo method]] and allows us to build a [[stochastic model]] by the [[method of statistical trials]]
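      A minimal sketch of the law in action using Python's standard `random` module (the die and the sample size are arbitrary choices): the sample mean of repeated die rolls approaches the theoretical mean 3.5.

```python
import random

random.seed(0)  # fixed seed so the run is reproducible

# Sample mean of identically distributed die rolls approaches
# the theoretical mean (1+2+3+4+5+6)/6 = 3.5 as n grows.
n = 100_000
rolls = [random.randint(1, 6) for _ in range(n)]
sample_mean = sum(rolls) / n  # close to 3.5 for large n
```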

    2. He used the tools of random sampling and inferential statistics to model likelihoods of outcomes, originally applied to a card game (Monte Carlo Solitaire).

      [[monte carlo method]] is used to model the [[likelihood]] of outcomes using random sampling and inferential statistics

    3. There are a broad spectrum of Monte Carlo methods, but they all share the commonality that they rely on random number generation to solve deterministic problems.

      The key to #[[monte carlo method]] is the [[random number generation]].

    4. Monte Carlo (MC) methods are a subset of computational algorithms that use the process of repeated random sampling to make numerical estimations of unknown parameters.

      Definition of #[[monte carlo method]] : it uses the process of repeated random sampling to make numerical estimations of unknown parameters.
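      A classic instance of this definition, sketched in Python with the standard library (sample size and seed are arbitrary): estimating the deterministic quantity pi by repeated random sampling in the unit square.

```python
import random

random.seed(0)  # reproducible run

# Fraction of uniform points in the unit square that fall inside
# the quarter circle of radius 1 approaches pi/4.
n = 100_000
inside = sum(
    1 for _ in range(n)
    if random.random() ** 2 + random.random() ** 2 <= 1.0
)
pi_estimate = 4 * inside / n  # numerical estimate of pi
```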

    1. The difference between the two is that machine learning emphasizes optimization and performance over statistical inference. Statistics is also concerned about performance but would like to calculate the uncertainty associated with parameters of the model. It will try to model the population statistics from the sample data points to assess that uncertainty.

      [[[[difference between]] Machine Learning and Statistics]] : statistics focuses on [[statistical inference]] while machine learning emphasizes optimization and performance (prediction).

      Statisticians assess the [[uncertainty]] associated with the parameters of the model (e.g. the standard deviation of a parameter estimate?). #Answer to #Question [[[[What]] does [[uncertainty]] refer to in [[statistical inference]]?]]

    2. The aim of inference is to find statistical properties of the underlying data and to estimate the uncertainty about those properties.

      [[What]] is [[statistical inference]]: to find [[statistical property]] of the underlying data and to estimate the [[uncertainty]] about those properties.

    1. Even ignoring the advantage of avoiding a notational disagreement between countable and uncountable cases, just because something is countable doesn't mean that there is a canonical bijection between it and ℕ

      The map between an [[indexed set]] and the set being indexed need not be a [[bijection]]; the index function is a [[surjection]].

  5. Apr 2021
    1. For example, some intervals of the function on the top map to smaller intervals on the substituted function indicating "a higher density".

      The difference in the areas reflects the difference in [[density]]: when $$x$$ moves by $$\Delta x$$, the corresponding $$\Delta u$$ may be smaller; this reflects a "higher density", defined as $$\Delta x/\Delta u$$
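      The density idea can be made concrete with a worked substitution (the particular choice u = x^2 is an arbitrary illustration):

```latex
% Substitution u = x^2 on [0, 2]: du = 2x\,dx, so dx/du = 1/(2x).
\int_0^2 f(x^2)\,2x\,dx = \int_0^4 f(u)\,du
% Near x = 2, a step \Delta x maps to \Delta u \approx 4\,\Delta x,
% so the density \Delta x/\Delta u \approx 1/4 is low there;
% near x = 0, \Delta u \approx 0 and the density \Delta x/\Delta u
% becomes arbitrarily large: equal x-intervals are "compressed" in u.
```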

    2. It can be seen from the plot that the areas under the curves differ a lot

    1. Gaussian processes are non-parametric (although kernel hyperparameters blur the picture); they need to take into account the whole training data each time they make a prediction.

      For example, when making a prediction based on a [[linear regression model]] with only two parameters, we only need to take those two numbers into account, while a [[Gaussian process]] needs to take the entire training data into account.

    2. Gaussian processes are a non-parametric method. Parametric approaches distill knowledge about the training data into a set of numbers. For linear regression this is just two numbers, the slope and the intercept, whereas other approaches like neural networks may have 10s of millions. This means that after they are trained the cost of making predictions is dependent only on the number of parameters.

      A [[Gaussian process]] is a non-parametric method. A [[parametric approach]] reduces the training data to a set of numbers. For instance, a [[linear regression model]] has only two numbers (slope, intercept). The cost of making predictions depends only on the number of parameters.

    3. how Gaussian processes are ever supposed to generalize beyond their training data given the uncertainty property discussed above. Well the answer is that the generalization properties of GPs rest almost entirely within the choice of kernel.

      A natural #Question : how can a GP generalize beyond the scope of the training data, given the uncertainty property of a [[Gaussian process]]?

      The answer rests almost entirely within the [[choice of kernel]]

    4. When you’re using a GP to model your problem you can shape your prior belief via the choice of kernel

      A [[Gaussian process]] allows you to incorporate expert knowledge by shaping the [[prior belief]] over functions via the choice of [[kernel]]

    5. A key benefit is that the uncertainty of a fitted GP increases away from the training data — this is a direct consequence of GPs roots in probability and Bayesian inference.

      Advantage of a [[Gaussian process]] compared to other [[machine learning]] methods: the [[uncertainty]] increases when moving away from the training data, as can be seen from the graph.

    6. Similarly to the narrowed distribution of possible heights of Obama what you can see is a narrower distribution of functions. The updated Gaussian process is constrained to the possible functions that fit our training data —the mean of our function intercepts all training points and so does every sampled function. We can also see that the standard deviation is higher away from our training data which reflects our lack of knowledge about these areas.

    7. use Bayes’ rule to update our belief about the function to get the posterior Gaussian process AKA our updated belief about the function we’re trying to fit.

      The [[posterior gaussian process]] is obtained through [[Bayes' rule]]; it is also known as our [[updated belief]] about the function we are trying to fit.
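      A minimal numpy sketch of this posterior update, assuming noiseless observations and an RBF kernel (the training points, test grid, and helper names are illustrative, not from the article): conditioning the Gaussian prior on the data gives the posterior mean and covariance in closed form.

```python
import numpy as np

def rbf(xa, xb, ell=1.0):
    # squared-exponential (RBF) kernel matrix
    d = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

# hypothetical training data: noiseless observations of sin(x)
x_train = np.array([-2.0, 0.0, 1.5])
y_train = np.sin(x_train)
x_test = np.linspace(-3, 3, 50)

# standard Gaussian conditioning:
#   mu*    = K*^T K^{-1} y
#   Sigma* = K** - K*^T K^{-1} K*
K = rbf(x_train, x_train) + 1e-8 * np.eye(len(x_train))  # jitter
Ks = rbf(x_train, x_test)
Kss = rbf(x_test, x_test)
alpha = np.linalg.solve(K, y_train)
mu_post = Ks.T @ alpha
cov_post = Kss - Ks.T @ np.linalg.solve(K, Ks)
# posterior variance shrinks near the training inputs and grows
# away from them, matching the note below about uncertainty
```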

    8. Instead of observing some photos of Obama we will instead observe some outputs of the unknown function at various points. For Gaussian processes our evidence is the training data.

      An observation (analogy: Obama's photo) is an output of the unknown function at various points (for a [[Gaussian process]], the evidence is called the [[training data]])

    9. A Gaussian process is a probability distribution over possible functions.

      [[Gaussian process]] is a probability distribution over possible functions.
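      This "distribution over functions" view can be sketched in numpy (the grid, kernel length scale, and seed are arbitrary choices): on a finite grid a GP is just a multivariate normal, so drawing a sample vector amounts to drawing one plausible function.

```python
import numpy as np

def rbf(xa, xb, ell=1.0):
    # squared-exponential kernel: nearby inputs are highly correlated
    d = xa[:, None] - xb[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

# Evaluate the GP prior on a finite grid: a multivariate normal
x = np.linspace(-3, 3, 60)
K = rbf(x, x) + 1e-8 * np.eye(len(x))  # jitter for numerical stability
rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mean=np.zeros(len(x)), cov=K, size=3)
# each row of `samples` is one smooth "function" drawn from the prior
```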

    10. We can see that Obama is definitely taller than average, coming slightly above several other world leaders, however we can’t be quite sure how tall exactly. The probability distribution shown still reflects the small chance that Obama is average height and everyone else in the photo is unusually short.

    11. Let’s consider that we’ve never heard of Barack Obama (bear with me), or at least we have no idea what his height is. However we do know he’s a male human being resident in the USA. Hence our belief about Obama’s height before seeing any evidence (in Bayesian terms this is our prior belief) should just be the distribution of heights of American males.

      An illustrative #Example of how to update beliefs: #insight the [[belief]] about a quantity follows a distribution!!! Example: Barack Obama's height. Even if we cannot measure it precisely, we can still make a guess (what we believe is true) about his height.

    12. Bayesian inference might be an intimidating phrase but it boils down to just a method for updating our beliefs about the world based on evidence that we observe. In Bayesian inference our beliefs about the world are typically represented as probability distributions and Bayes’ rule tells us how to update these probability distributions.

      [[intuition]] of Bayesian inference: updating beliefs (parameters in the model) based on evidence. A natural question to ask is how to update? - based on [[Bayes' rule]]

    13. sampling from a probability distribution. This means going from a set of possible outcomes to just one real outcome — rolling the dice in this example.

      [[Sampling]] means going from a set of possible outcomes to just one real outcome
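      The dice example can be written out in one line of Python (the seed is an arbitrary choice for reproducibility): sampling collapses the set of possible outcomes into a single realized outcome.

```python
import random

random.seed(42)  # reproducible "roll"

# The set of possible outcomes of a fair die...
possible_outcomes = [1, 2, 3, 4, 5, 6]
# ...and sampling: going from that set to just one real outcome.
outcome = random.choice(possible_outcomes)
```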

    14. In many real world scenarios a continuous probability distribution is more appropriate as the outcome could be any real number and example of one is explored in the next section.

      Why model the real world using a [[continuous probability distribution]]: it makes sense because the outcome is neither finite nor countable - it could be ANY REAL number

    15. discrete probability distributions as there are a finite number of possible outcomes

      A [[discrete probability distribution]] contrasts with a [[continuous distribution]] in the number of outcomes: a [[discrete probability distribution]] has only a finite number of possible outcomes?????? (Strictly, a discrete distribution may also have countably many outcomes, e.g. the Poisson distribution.)

    16. Some uncertainty is due to our lack of knowledge is intrinsic to the world no matter how much knowledge we have. Since we are unable to completely remove uncertainty from the universe we best have a good way of dealing with it. Probability distributions are exactly that and it turns out that these are the key to understanding Gaussian processes.

      Some [[uncertainty]] in the world is intrinsic ([[uncertainty principle]]). This uncertainty cannot be removed regardless of how much knowledge we have. [[probability distribution]]s are the way to deal with intrinsic uncertainty, and they are the key to understanding the [[Gaussian process]].

    17. Gaussian processes are another of these methods and their primary distinction is their relation to uncertainty.

      [[Gaussian process]] has distinctive relation to [[uncertainty]]

    18. Machine learning is an extension of linear regression in a few ways. Firstly is that modern ML

      Machine learning is an extension of the linear model that deals with much more complicated situations, where we take a few different inputs and get outputs.

    19. Machine learning is using data we have (known as training data) to learn a function that we can use to make predictions about data we don’t have yet.

      Machine learning, like linear regression, uses training data to learn a function for making predictions

    1. Mathematical explanations are fundamentally different because no part of a mathematical system can be otherwise than it is given without changing the entire system as a whole.

      Why mathematical statements are not causal

    2. Mathematical explanations are not causal. From someone whose primary professional activity is to teach the mathematical modelling of causal relations, this might seem like a controversial, not to say contrary statement. But while mathematics can be used to describe causal relations — that is to say mathematical systems can be put into correspondence with real-world relationships — those systems, in contrast to the theoretical systems of many other disciplines, are not themselves causally constructed.

      Mathematical explanations are not causal

    3. Everyone who has studied mathematics has experienced the chill of mathematical limbo, but my experience is that graduates of other disciplines, accustomed to prompt acquisition of perhaps only provisional apprehension of a theoretical whole, find the cold all the more acutely uncomfortable on account of its unfamiliarity. For them, that long interlude without even a glimpse of the big picture can be intensely disconcerting if not downright terrifying.

      ... sad and true - -

    4. In the course of this instruction, I have noticed a number of fundamental differences in perspective and disposition between graduates from different disciplinary backgrounds. The identification and recognition of these paradigmatic distinctions has been instrumental in helping those individuals master those foundational mathematical concepts that were otherwise so elusive to them.

      I am reading this article to identify whether I hold perspectives that are bad for understanding maths

    5. I will argue that high-functioning in the literary humanities (as well as the social sciences) can actually present a substantial impediment to the mastery of the mathematical language that mediates scientific knowledge.

      The key points of the paper

    6. intellectual activity into two distinct “cultures”: science, engineering and mathematics on the one hand, and what he called literary humanities on the other

      Intellectual activity being separated into scientific vs literary humanities.

  6. Mar 2021
    1. The Taylor series of a real or complex-valued function f(x) that is infinitely differentiable at a real or complex number a is the power series $$f(a)+\frac{f'(a)}{1!}(x-a)+\frac{f''(a)}{2!}(x-a)^{2}+\frac{f'''(a)}{3!}(x-a)^{3}+\cdots$$

      What's the connection between a series and the function?

      -- because of the phrase: "the Taylor series of a ... function "
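      One concrete answer to the question above, sketched in Python (the choice of e^x at a = 0 and the number of terms are arbitrary): the partial sums of the Taylor series converge to the function's value, which is the sense in which the series "is" the function.

```python
import math

def taylor_exp(x, n_terms):
    # Partial sum of the Taylor series of e^x at a = 0:
    #   sum_{k=0}^{n_terms-1} x^k / k!
    return sum(x ** k / math.factorial(k) for k in range(n_terms))

# The partial sums approach the true function value e^1 = e:
approx = taylor_exp(1.0, 12)  # close to 2.71828...
```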

    1. one of Julia's most important benefits is its ability to achieve C-like performance while keeping the syntax elegant.

      C-like performance with elegant syntax

  7. Feb 2021
    1. Defining a measure on certain subsets of the real numbers means defining a length for them. From the probabilistic point of view, determining the measure of a subset of an interval means determining the probability that a point chosen at random from the interval falls inside that subset.

      Understanding measure via geometry (length) and probability.

    1. He skillfully takes the focus off from matrices and shifts the reader’s attention more towards linear mappings. This makes his proofs elegant, simple, and pleasing. Conscious of the reader’s possible unfamiliarity as well as time frame, Axler does a fine job of preparing and developing readers’ understanding rather than fully detailing application methods and formulas.

      Linear Algebra Done Right is a book that develops the reader's understanding gradually... so if I don't find my understanding gradually developing, what does it mean?

    1. A sigma-algebra covers "every possible combination of every possible event"

      Every combination of every possible event seems like a decent definition of information...

    2. We say this is the information carried by the random variable Z

      If we know Z, we know whether each event in \(\sigma(Z)\) has occurred

      For example, if we know Z=3, we know the event \(\{(3,3),(3,4),(4,3)\} \in \sigma\langle Z\rangle\) has occurred, but that does not tell us whether \(\{(3,3)\}\) occurred, since \(\{(3,3)\} \notin \sigma\langle Z\rangle\)

    3. So the larger the sigma-algebra, the more information it carries

      Is the largest sigma-algebra \(\mathcal{F}\)?

      • How should the "size" of a sigma-algebra be understood?
      • What is the definition of information?
    4. filtration (滤链)

      $$\left\{\mathcal{I}_{n}, n \geq 0\right\}$$

      A filtration

      • A filtration is defined on $$(\Omega, \mathscr{F}, \mathscr{P})$$

      • A filtration satisfies $$\mathcal{I}_{0} \subseteq \mathcal{I}_{1} \subseteq \mathcal{I}_{2} \subseteq \cdots$$

    5. So why do we say a sigma-algebra represents information? When we are given a sigma-algebra, we are assuming that for every event in it we can know with certainty whether it has occurred; that is exactly a piece of our knowledge about which events in the sample space have occurred!

      Given a sigma-algebra, we can know definitively whether each of its events has occurred.
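      The "knowing Z tells us which events in σ(Z) occurred" idea can be sketched in Python (the random variable Z = max of two dice is a hypothetical choice, not the one from the notes above): the preimages {Z = z} form the partition that generates σ(Z), and learning the value of Z identifies exactly one atom, but not the individual outcome inside it.

```python
from itertools import product

# Sample space for two dice
omega = list(product(range(1, 7), repeat=2))

def Z(w):
    # hypothetical random variable: the larger of the two faces
    return max(w)

# Preimages {Z = z}: the atoms of the partition generating sigma(Z)
atoms = {}
for w in omega:
    atoms.setdefault(Z(w), []).append(w)

# Learning Z = 3 tells us the event atoms[3] occurred, but not which
# individual outcome inside it (e.g. (3,3) vs (1,3)) was realized.
```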

    1. illness, she proposed, had a startling ability to renew our awareness of the worlds inside and outside of us.

      Why illness is important

    2. Woolf achieves two things: she argues that illness has been unfairly dismissed as unworthy of representation in literature, and, before she has even made this argument, she has already proved it by showing the vast range of experiences that illness comprises, the way sickness makes even an otherwise mundane experience seem tinged with Bardic drama, or the rainy-night fire of noir; illness, in other words, contains the grand battles and unmapped tundras and emotions bright-dark as Picasso’s harlequins present in so many books labeled “important,” yet it rarely appears as a main theme.

      Woolf's justification of illness as a literary theme

    3. “Considering how common illness is,” Woolf writes, how tremendous the spiritual change that it brings, how astonishing, when the lights of health go down, the undiscovered countries that are then disclosed, what wastes and deserts of the soul a slight attack of influenza brings to view, what precipices and lawns sprinkled with bright flowers a little rise of temperature reveals, what ancient and obdurate oaks are uprooted in us by the act of sickness, how we go down into the pit of death and feel the waters of annihilation close above our heads and wake thinking to find ourselves in the presence of the angels and the harpers when we have a tooth out and come to the surface in the dentist’s arm-chair and confuse his “Rinse the mouth—rinse the mouth”

      The elaboration of why it is

      a strange thing illness has not taken its place with love and battle and jealousy among the prime themes of literature

    4. Still, despite her macabre history with influenza, Woolf still managed to scoff at the pandemic in its early days. In a diary entry from 1918, she off-handedly recorded a neighbor’s succumbing to influenza along with the weather, as if both were equally mundane and unimportant: “Rain for the first time for weeks today, & a funeral next door; dead of influenza.” A few months later, she remarked sarcastically, upon noting that her writerly friend Lytton Strachey was avoiding London due to the pandemic, that “we are, by the way, in the midst of a plague unmatched since the Black Death, according to the Times, who seem to tremble lest it settle upon Lord Northcliffe, & thus precipitate us into peace.” The sardonic tone suggests that Woolf initially viewed the pandemic as a bit of an overblown joke, the comparisons to the plague histrionic.

      Woolf's view of the pandemic during 1918

  8. Jan 2021
    1. In the era of mature capitalism, everything becomes commodified; in Marx's words, from the old order "all that is solid melts into air".

      Modernity at the economic level is the tendency to commodify everything

    2. Its core is that the individual is the highest value, and the individual and their rights are the foundation of society's law, politics, economy, and culture.

      Discussing modernity from the perspective of liberalism

    1. This, in turn, leads us to explore our own selves more deeply, drifting through our Escherian staircases, our orca-dotted seas. In this way, illness becomes a bridge to writing about both ourselves and the world around us more sharply. Illness is itself novelistic, epic, lyric, if we allow ourselves to express its contours.

      Woolf's main idea on illness.