  1. Oct 2021
    1. Looking at the OLS plot, we get the impression that we've identified certain text messages that outperform others. The Bayesian plots show us that this perception is incorrect. According to the Bayesian models, we have almost no idea which messages are better than others.

      Two of my colleagues ask:

      I'm a bit confused about where the final two graphs have come from - it almost looks like an error?! The Bayesian models clearly did indicate variability among the messages, just not as much as OLS, then all of a sudden the messages seem exactly the same as one another in the final graphs. They look like the presentation of prior distributions around .02 rather than a posterior estimated from the data!

    2. Imagine we randomly select one text message and put the data for that treatment in a locked box. What should our prior belief about the effect of this text message be? Empirical Bayes says, roughly, that our prior belief about the effect of the message we locked in the box should be the average effect of the other 18. We can also use the variability in the effects of the other 18 messages to tell us how confident we should be in our prior, giving us a prior distribution.

      Something like ‘estimating a posterior for the effect of treatment 1 using the effects on treatments 2-17 to construct a prior’ …

      it feels like ‘unfair peeking’ along one margin perhaps, leading to a too-narrow posterior for the treatment effects overall, but perhaps a reasonable posterior for relative effects?

    3. We can apply the same logic to the flu study. Imagine we randomly select one text message and put the data for that treatment in a locked box. What should our prior belief about the effect of this text message be? Empirical Bayes says, roughly, that our prior belief about the effect of the message we locked in the box should be the average effect of the other 18. We can also use the variability in the effects of the other 18 messages to tell us how confident we should be in our prior, giving us a prior distribution.

      I almost get this argument but something feels missing
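
      For my own understanding, here is a toy numerical sketch of the leave-one-out logic as I read it (made-up effect sizes and standard errors, not the flu-study numbers): centre the prior for the held-out arm on the mean of the other 18 estimates, back out a rough between-arm variance with a method-of-moments step, and form the usual precision-weighted (normal-normal) posterior.

      set.seed(42)
      est <- rnorm(19, mean = 0.02, sd = 0.015)   # estimated effects of the 19 messages (hypothetical)
      se  <- rep(0.01, 19)                        # their standard errors (hypothetical)

      j <- 1                                      # the arm we 'lock in the box'
      prior_mean <- mean(est[-j])                 # prior centre = average of the other 18
      prior_var  <- max(var(est[-j]) - mean(se[-j]^2), 0)   # crude estimate of true between-arm variance

      w <- prior_var / (prior_var + se[j]^2)      # shrinkage weight on the arm's own estimate
      c(raw = est[j], shrunk = w * est[j] + (1 - w) * prior_mean)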

    4. In Bayesian terms, we've constructed a prior belief about the next season's rookies' OBP from data about this season's rookies' OBP.

      but isn't this like using "data from previous studies" as you said the Classical Bayes did?

      Others at RP:

      I agree, using last season's data to estimate the likely performance of a fresh rookie is just a prior plain and simple, and not quite the same as generating the prior of each condition of a current experiment from all the other conditions, one by one!

    5. That's where empirical Bayes comes to the rescue. Empirical Bayes estimates a prior based on the data

      my reading of McElreath is that he would be against that sort of 'peeking'

    6. The original data aren't yet available, but we can approximately reproduce the data given what we know about the study. We know that 47,306 participants were evenly assigned to one of 19 treatments or a control condition. The outcome was binary (did the patient get a vaccine or not), and we know the vaccination rate in each treatment from the PNAS publication.

      How is this not exactly the same as what you would get from the individual data? What more information would that get you?
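
      A sketch of why this reconstruction gives up so little: with a binary outcome and (roughly) equal-sized arms, the published arm-level rates pin down the individual-level data up to row order, so only covariates or clustering would add information. The rates below are placeholders; the real ones are in the PNAS paper.

      n_total <- 47306
      arms    <- c("control", paste0("message_", 1:19))
      n_arm   <- round(n_total / length(arms))
      rates   <- c(0.42, seq(0.43, 0.47, length.out = 19))   # hypothetical vaccination rates

      flu <- do.call(rbind, lapply(seq_along(arms), function(i) {
        k <- round(rates[i] * n_arm)                          # vaccinated count implied by the rate
        data.frame(arm = arms[i], vaccinated = rep(c(1L, 0L), c(k, n_arm - k)))
      }))
      prop.table(table(flu$arm, flu$vaccinated), margin = 1)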

    7. You should use Bayesian analysis when comparing 4 or more "things."

      This seems like a strange blanket rule. Only if I'm comparing 4 or more things?

    1. declare_sampling(S = complete_rs(N, n = 50)) +

      I'm not sure this is letting sample size vary in a meaningful way ... or the way intended. With the 'n=50' option, it seems to do the same thing when we compare designs?
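
      A sketch of what I think was intended: make the sample size a named parameter instead of hard-coding n = 50, and then let redesign() vary it. This is my own minimal design, not the book's; note also that the potential_outcomes() call is what creates the Y_Z_0 and Y_Z_1 columns asked about further down.

      library(DeclareDesign)

      n <- 50
      design <-
        declare_model(N = 1000, U = rnorm(N),
                      potential_outcomes(Y ~ 0.25 * Z + U)) +   # creates Y_Z_0 and Y_Z_1
        declare_inquiry(PATE = mean(Y_Z_1 - Y_Z_0)) +
        declare_sampling(S = complete_rs(N, n = n)) +
        declare_assignment(Z = complete_ra(N)) +
        declare_measurement(Y = reveal_outcomes(Y ~ Z)) +
        declare_estimator(Y ~ Z, inquiry = "PATE")

      designs <- redesign(design, n = c(25, 50, 100))   # now the designs genuinely differ in sample size
      # diagnose_design(designs)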

    2. diagnose_design(designs)

      show the output here.

      What is 'coverage' telling us here? Something about traditional confidence intervals versus the simulated ones?

    3. Our simulation and diagnosis tools can take a list of designs and simulate all of them at once, creating a column called design to keep track. For example: diagnose_design(designs)

      This is spilling out a massive number of warnings like:

      Warning messages:
      1: The argument 'estimand = ' is deprecated. Please use 'inquiry = ' instead.
      
    4. A designer is a function that makes designs based on a few design parameters.

      Making the arguments the things we wish to consider varying.

      I was confused at first because I thought designer was a new object in your package.

    5. redesign(design, N = c(100, 200, 300, 400, 500))

      This generated a list of (lists of?) dataframes. I'm not sure what to do with it. I guess it generates the simulated sample data, estimates, etc. (but not diagnosands?) for each of the combinations in the list of arguments, here the sample sizes of 100, 200, etc. (as well as the original 50)?

    6. diagnose_design(simulation_df, diagnosands = study_diagnosands)

      when I run this I get a much larger output matrix, including columns Design, Inquiry, and more...

    7. Building a design from design steps

      Suggestion: You might put this section first so that people have an idea of how it all fits together. Otherwise it's hard to understand what each step is all about and why we are doing this.

    8. declare_inquiry(PATE = mean(Y_Z_1 - Y_Z_0)) +

      but where were Y_Z_1 and Y_Z_0 defined as potential outcomes?

      Y_Z_1 and Y_Z_0 are created (and named) as potential outcomes as a result of the potential_outcomes function

    9. difference_in_means

      "Difference-in-means estimators that selects the appropriate point estimate, standard errors, and degrees of freedom for a variety of designs: unit randomized, cluster randomized, block randomized, block-cluster randomized, matched-pairs, and matched-pair cluster randomized designs"

      difference_in_means takes the place of 'a model function, e.g. lm or glm' (default is lm_robust)

    10. diagnose_design(design, diagnosands = study_diagnosands)

      What seems to be missing here is ... what if I want the free parameter to be the treatment effect size ... i.e., minimum detectable effect size? Maybe they give a guide to this elsewhere?

    11. declare_model( Y_Z_0 = U, Y_Z_1 = Y_Z_0 + 0.25)

      I'm not sure that this first bit of code was necessary. What does it do? Is this meant to be here?

      Ah, this is another syntax for declaring/defining the potential outcomes... you should assign it to an object so this is clear. And is the first part standalone, or does it need N and U to be defined?

    12. When we run this population function, we will get a different 100-unit dataset each time, as shown here.

      You don't really show how to generate this dataset at this point

      You should add the draw_data(+ NULL) thing here
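
      Something like the following is how I read it (a sketch, not the book's own code): the explicit-column syntax needs N and U declared in the same step (or an earlier one), and the '+ NULL' trick mentioned above turns the single step into a one-step design that draw_data() can run.

      library(DeclareDesign)

      model <- declare_model(
        N     = 100,
        U     = rnorm(N),
        Y_Z_0 = U,                 # potential outcome under control
        Y_Z_1 = Y_Z_0 + 0.25       # potential outcome under treatment
      )

      head(draw_data(model + NULL))   # a different 100-unit dataset on each call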

    13. As an example of a two-level hierarchical data structure, here is a declaration for 100 households with a random number of individuals within each household.

      cf fundraising pages and donations/donors
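
      A minimal fabricatr-style sketch of the household declaration (column names are my own; for the fundraising analogy, read households as fundraising pages and individuals as donations):

      library(fabricatr)

      hh_data <- fabricate(
        households  = add_level(N = 100,
                                hh_size = sample(1:6, N, replace = TRUE)),
        individuals = add_level(N = hh_size,                   # a random number of members per household
                                age = round(runif(N, 18, 80)))
      )
      head(hh_data); nrow(hh_data)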

    14. As a bonus, the data also includes the probability that each unit is assigned to the condition in which it is in (Z_cond),

      where is Z_cond? I don't see it

    1. see also [39])

      The second study seems to go a bit in the opposite direction -- the people in the vignettes did seem to be rewarded (assessed as having both higher warmth and competence) by the participants for achieving a greater benefit, holding cost constant

    2. Although altruism is generally rewarded, several studies suggest that effective altruism is not [6,36,37].

      this seems important. I like how it is stated here "several studies suggest" ... rather than as a universal truth

    3. Parochialism also biases cost-benefit calculations that could lead to effective giving.

      This doesn't seem like evidence of a bias to me. Also, the evidence seems a bit thin here.

      And why should parochialism cause a bias to CB calculation anyway?

    4. K.F. Law, D. Campbell, B. Gaesser, Biased benevolence: The perceived morality of effective altruism across social distance. Pers Soc Psychol Bull (2021), Article 014616722110027

      return to this

    5. approximately two thousand cases of trachoma—a bacterial infection of the eye that can lead to permanent blindness [4].

      This latter figure has been 'recanted' by Singer and others; see posts on the EA Forum. It is true that charities differ by orders of magnitude in their effectiveness, but this particular claim about blindness prevention is overstated.

  2. Sep 2021
    1. ...: Optional model-function-specific parameters. As with args, these will be quosures and can be varying().

      I think this is meant to be a first-level (bulleted) item.

    1. Linear Models: the absolute value of the t-statistic for each model parameter is used.

      t-stat is the coefficient divided by the estimated standard error of the coefficient.

      But we are already using 'standardized coefficients' ... so (recall) why wouldn't these already denote importance?
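
      A quick sketch of the rule quoted above, |t| = |coefficient / standard error|, on a made-up lm fit; with unstandardized predictors the |t| ranking and the raw-coefficient ranking can differ, which is presumably why t rather than the coefficient itself is used.

      fit   <- lm(mpg ~ wt + hp + qsec, data = mtcars)
      coefs <- summary(fit)$coefficients
      sort(abs(coefs[-1, "t value"]), decreasing = TRUE)   # drop the intercept; larger |t| = more 'important'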

  3. Aug 2021
    1. Encourage more “here’s what we aren’t doing” posts (e.g. OpenPhil posting what they don’t fund)

      How does this encourage retention? Can you elaborate on the theory of change here?

    2. Make EA easier for women with children

      In what ways? Is it a timing/scheduling thing? Or should, e.g., GWWC have a 'child deduction' in counting income for pledges?

    3. Number of people who listed that reason (without prompting)

      Recall this is the count of people who list that they think 'this is the reason for others leaving'. The table, seen by itself, might make it seem like these are reasons people themselves cited for leaving

    4. I spoke with approximately 20 people who were recommended on the basis of their knowledge about EA retention. These were mostly non-University group organizers. There was moderate agreement about the reasons people leave and stay in EA.The major reasons why people leave EA are: inability to contribute, lack of cultural fit or interpersonal conflict, major life events (moving, having a child), burnout/mental health.

      This seems like a qualitative/vignette sort of study. Interviews were conducted with individuals separately.

      I think there is some risk of a sort of 'double-counting' here: people may be reporting and taking on-board what they heard others say or write.

    1. Given this, CEA is evaluating alternative metrics. Our current top choice is to focus on people who use our products, instead of those who are "engaged" with EA in a more subjective sense. This allows us to analyze larger populations, improving the power of our tests.

      This seems very promising to me, I'd definitely recommend pursuing this approach.

      Also, as I said in the notes on the other post, you can get more of an upper bound to complement these lower bounds by reaching out to people/ giving people incentives to respond, and using 'those who respond' as the denominator.

    2. For example, to detect a change in retention rate from 95% to 90%, we need a sample of 185 individuals.[2]

      the basis for this calculation seems approximately correct, but "detect" has a particular operationalization here. You still should be able to have some reasonably tight credible-intervals over the change in retention with a smaller sample.

      That said, the selectivity of the sample, and some of the concerns above, make this limited in other ways
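
      For what it's worth, the 185 figure appears to match a one-sample test of a proportion (H0: retention = 95%, true value 90%, two-sided alpha = .05, 80% power). That is a guess on my part about footnote [2], but the arithmetic is below, and it illustrates how much the answer depends on the chosen operationalization of "detect".

      p0 <- 0.95; p1 <- 0.90
      n  <- (qnorm(0.975) * sqrt(p0 * (1 - p0)) +
             qnorm(0.80)  * sqrt(p1 * (1 - p1)))^2 / (p1 - p0)^2
      ceiling(n)   # 185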

    3. remaining attendees who worked for an EA organization.

      But this is a very special group. 'People who work for an EA org' should not be taken as representative of a typical engaged EA

    4. population, 50-70% of the individuals who engaged in 2020 also engaged in some way in 2021.

      again this is likely a lower bound because of:

      • limits to email matching
      • other forms of engagement not mentioned here, e.g., local EA group meetings, EA Facebook groups, etc

      Make it clearer that it's a lower bound, I would suggest.

    5. Over the past six months, CEA has moved to unify our login systems. As of this writing, event applications, the EA Forum, and EA Funds/GWWC all use the same login system. This means that we are less likely to have issues with people using different emails.

      This is really promising!

    6. Peter Wildeford has done the largest non-manual retention analysis I know, which looked at the percentage of people who answered the EA survey using the same email in multiple years. He found retention rates of around 27%, but cautioned that this was inaccurate due to people using different email addresses each year.

      He didn't really consider these to be 'retention rates in EA'

    7. 50-70% of people who engaged with CEA's projects in 2020 also engaged with one of our projects so far in 2021, using a naïve method of matching people (mostly looking at email addresses).

      Maybe better to note that this is likely a lower bound.

    8. engagement with CEA's projects as a proxy.

      "Projects" makes it sound like a higher level of engagement than it is. Accessing the EA forum is high engagement but 'projects' sounds like they were given funding, working for CEA, etc.

    1. 29.8% is much closer to the annual retention estimate produced by Peter Wildeford based on the 2018 EA Survey

      This is probably not a correct interpretation of what Peter was saying ... something like "30% return to complete the EA Survey again year on year" is not at all the same as a measure of 'retention'

    2. EAG Reattendees

      this needs to be made more clear -- what exactly are the denominators here? And what is the outcome denoted 'retention'? Does an EAG attendee have to go to another EAG to count as 'retained'?

    3. We identified markers of retention that we could reliably analyze from year to year for the three cohorts. 

      These are mainly "search for positive values" methods .. I expect high precision (few false positives) but poor recall (many false negatives). I.e., I expect this to underestimate retention for the group in question.

      An alternative which could get at the upper bound would take the people that you could locate/contact as the denominator. ...

      And perhaps so that it wasn't too much of an overestimate, give a strong incentive for people to respond/identify themselves.

    4. Employees at certain organizations connected to effective altruismRecipients of Community Building GrantsAttendees of EA Global

      This is obviously a selected group that is on the very very highly engaged end of the spectrum.

    1. specify() allows you to specify the variable, or relationship between variables, that you’re interested in. hypothesize() allows you to declare the null hypothesis. generate() allows you to generate data reflecting the null hypothesis. calculate() allows you to calculate a distribution of statistics from the generated data to form the null distribution.

      This is the core, it seems.
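
      The four verbs chained together, roughly as in the package's own examples (using the gss data bundled with infer):

      library(infer)

      obs_mean <- gss %>% specify(response = hours) %>% calculate(stat = "mean")

      null_dist <- gss %>%
        specify(response = hours) %>%
        hypothesize(null = "point", mu = 40) %>%     # "nothing going on": mean hours worked is 40
        generate(reps = 1000, type = "bootstrap") %>%
        calculate(stat = "mean")

      get_p_value(null_dist, obs_stat = obs_mean, direction = "two-sided")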

    2. If this probability is below some pre-defined significance level \(\alpha\), then we can reject our null hypothesis.

      NHST ... can this tool accommodate more informative measures, like Bayes factors (as I understand them)?

    3. we start by assuming that the observed data came from some world where “nothing is going on” (i.e. the observed effect was simply due to random chance), and call this assumption our null hypothesis.

      very much classical frequentist NHST

    1. formal S4 classes describe the data model and the conditional test procedures, consisting of multivariate linear statistics, univariate test statistics and a reference distribution.

      What are S4 classes... I wish I knew.
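
      For my own reference, a toy illustration of what an S4 class is (a formal class from base R's methods package, with typed slots accessed via @); coin's actual class definitions are of course far richer than this:

      setClass("ToyTest", slots = c(statistic = "numeric", distribution = "character"))
      tst <- new("ToyTest", statistic = 3.2, distribution = "asymptotic")
      tst@statistic                          # slots are accessed with @
      # new("ToyTest", statistic = "oops")   # would fail: slot types are checked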

  4. Jul 2021
    1. To create a symbol, simply convert a string or a variable containing a string with the sym() function. Plug in these symbols anywhere dplyr expects a symbol or an expression of only one field name as a function argument using the !! operator.

      you need to turn a field name (text) into a 'symbol object' and then you can use it to represent a tibble column etc. But I don't understand how this is different from a quosure
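
      A small sketch of the contrast as I understand it: sym() gives a bare name built from a string (no environment attached), while quo() captures a whole expression together with the environment it was written in; both are spliced in with !!.

      library(dplyr)
      library(rlang)

      col  <- sym("cyl")        # just a column name, built from a string
      expr <- quo(cyl + 1)      # an expression plus the environment it was created in

      mtcars %>%
        mutate(bumped  = !!expr,
               doubled = (!!col) * 2) %>%
        head()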

    2. To capture a quosure, simply capture an expression within the quo() function

      the example below is great:

      # Capture a quosure
      gear_filter <- quo(gear == 5)
      
      # Filter our data with our quosure
      mtcars %>%
        filter(!!gear_filter)
      
    3. he R interpreter does not evaluate the expression before it is passed to the mutate function. Since the expression is passed in a raw format to the function, mutate can do whatever it wants with it and evaluate it however it likes

      a function gets the 'raw' argument, here cyl+1 ... not an evaluated version of it (here would be a vector of numbers)

      So the function can do what it wants with cyl+1

    1. The funding overhang also created bottlenecks for people able to staff projects, and to work in supporting roles.

      I don't understand what the 'bottlenecks' being referred to here are.

    2. Personally, if given the choice between finding an extra person for one of these roles who’s a good fit or someone donating $X million per year, to think the two options were similarly valuable, X would typically need to be over three, and often over 10.

      this is shockingly high

    3. A big uncertainty here is what fraction of people will ever be interested in EA in the long term – it’s possible its appeal is very narrow, but happens to include an unusually large fraction of wealthy people. In that case, the overhang could persist much longer.

      motivation for 'market testing' survey/outreach work.

    4. Working at an EA org is only one option, and a better estimate would aim to track the number of people ‘deployed’ in research, policy, earning to give, etc. as well.

      we have this in the EA survey ... something we could put more focus on in cross-year work

    5. Overall, my guess is that we’re only deploying 1–2% of the net present value of the labour of the current membership.

      I don't get where this estimate comes from. And is this 1-2% of lifetime labour, or 1-2% of the available labour per year?

    6. GWWC members, EA Funds, Founders Pledge members, Longview Philanthropy, SFF, etc. have all grown significantly (i.e. more than doubling) in the last five years.

      this is not reflected in EA Survey data, although that may not be picking up the same things. The EA Survey seems to show fairly constant donation rates across years, and response rates are not increasing.

      But it could be that differential non-response to the EA survey is masking a trend of growth and donations.

    7. t could crash the price

      This needs to be a bit more nuanced. This is a particular kind of 'illiquidity' ... and I'm not even sure that's the right word.

      If 'selling means crashing', one should reestimate the value

    8. I’ve

      In the case above, the 'crash' concern would usually be overstated, but here the market is probably pretty volatile in response to perceived inside information.

    9. hen I think the stock is most relevant, since that determines how many resources will be deployed in the long term.

      I disagree -- the total expenditure over time is important. If each year we raise and spend more and more we are still growing and having a big impact!

    1. the slope becomes r if x and y have identical standard deviations.

      important point to remember... just normalize and the correlation coefficient is the slope
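
      In general the OLS slope is r · sd(y)/sd(x), so standardizing both variables makes the slope equal to r; a one-line check on arbitrary data:

      x <- mtcars$wt; y <- mtcars$mpg
      c(slope_std = unname(coef(lm(scale(y) ~ scale(x)))[2]), r = cor(x, y))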

  5. Jun 2021
    1. rbeta(1, alpha, alpha) - 1

      rbeta ... generates n=1 random 'deviates' from the Beta distribution with both shape parameters equal to alpha ... what does this mean in context?
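
      On the mechanics at least: Beta(alpha, alpha) is symmetric on [0, 1] with mean 1/2, so subtracting 1 gives a draw on [-1, 0] centred at -1/2 (what that represents in this model, I still don't know).

      set.seed(1)
      draws <- rbeta(1e4, 2, 2) - 1
      c(range(draws), mean = mean(draws))   # within [-1, 0], centred near -0.5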

    1. Account for a positive correlation between the growth premium and population growth, as people are more likely to move to a fast growing city

      Is this different from 5?

    1. inary search is particularly useful for this. To do a binary search, you repeatedly remove half of the code until you find the bug. This is fast because, with each step, you reduce the amount of code to look through by half.

      I think I came up with this on my own, it seemed obvious. Anyone else?

  6. May 2021
    1. One restriction of summarise() is that it only works with summary functions that return a single value. That means that you can’t use it with functions like quantile() that return a vector of arbitrary length:

      you can use it, but it adds rows
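
      Illustration (recent dplyr versions allow this but warn and point you to reframe(); each group expands to one row per quantile):

      library(dplyr)

      mtcars %>%
        group_by(cyl) %>%
        summarise(prob = c(0.25, 0.5, 0.75),
                  q    = quantile(mpg, prob))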

    2. workflow for managing many models,

      We saw a workflow for managing many models (a minimal sketch follows this list):

      • split the data apart
      • run the same model model for each of the groups
      • save this all in a single organized tibble
      • report and graph the results in different ways
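
      A minimal sketch of that workflow, reconstructed from memory of the chapter's gapminder/by_country example (details may differ slightly):

      library(tidyverse)
      library(gapminder)

      by_country <- gapminder %>%
        group_by(country, continent) %>%
        nest() %>%                                                    # split: one row per country
        mutate(model  = map(data, ~ lm(lifeExp ~ year, data = .x)),   # same model for each group
               glance = map(model, broom::glance))                    # everything in one organized tibble

      by_country %>% unnest(glance) %>% arrange(r.squared) %>% head() # report: worst-fitting countries first
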
    3. by_country %>% mutate(glance = map(model, broom::glance)) %>% unnest(glance)

      Note that unnest seems to spread out the elements of the glance output into columns, but as these are specific to each country, it doesn't add more rows (while unnesting resids would do so).

    4. resids = map2(data, model, add_residuals)

      how does this syntax work? How do data and model end up referring to columns in the by_country tibble?

      Because it's inside the 'mutate' I guess, so the data frame is 'implied'.

    1. mutate(fit = flatten(pmap(.l = list(.f = funcs, .formulas = models, data = dat), .f = modelr::fit_with)))

      this actually 'runs' the model. Check out the element [[4]][1] of the object defined here

    1. You can also use an integer to select elements by position: x <- list(list(1, 2, 3), list(4, 5, 6), list(7, 8, 9)) x %>% map_dbl(2) #> [1] 2 5 8

      select same element across each list

    1. The immediate consequence of the exogeneity assumption is that the errors have mean zero: E[ε] = 0, and that the regressors are uncorrelated with the errors: E[XTε] = 0.

      remember we are talking about the errors in the true equation not the estimated residuals ... the latter are set orthogonal to the X's as part of the minimization problem.

    2. OLS estimation can be viewed as a projection onto the linear space spanned by the regressors. (Here each of \(X_1\) and \(X_2\) refers to a column of the data matrix.)

      the diagram is not so clear. So the \(X\hat{\beta}\) is some vector in \(x_1, x_2\) space, but what exactly does it represent?

      Perhaps it would be the line in the direction along which we get the greatest predicted increase (per unit distance) in the y variable ... but so what?

    3. In other words, the gradient equations at the minimum can be written as: \((\mathbf{y} - X\hat{\boldsymbol{\beta}})^{\mathsf T}X = 0\).

      this comes from a standard first order condition in vector calculus
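
      Spelling that step out: minimise the residual sum of squares and set the gradient with respect to β to zero,

      $$\min_{\beta}\,(y - X\beta)^{\mathsf T}(y - X\beta) \;\Rightarrow\; -2X^{\mathsf T}(y - X\hat{\beta}) = 0 \;\Leftrightarrow\; (y - X\hat{\beta})^{\mathsf T}X = 0.$$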

    4. when y is projected orthogonally onto the linear subspace spanned by the columns of X.

      Here I believe they mean ...

      • the goal is to minimize the 'length' of the residual vector, where the 'length' is defined by squaring all the residuals and adding them up

      • this goal is attained (derived through vector calculus) when 'y is projected orthogonally onto the linear subspace spanned by the column.'

      But suppose there are 2 predictor variables, age and height. The linear subspace spanned by the columns of X will typically represent (e.g.) all combinations of values of age and height (including negative and ridiculously large ones).

      So what does it mean 'when \(y\) is projected orthogonally onto this subspace'?
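
      My attempt at unpacking it: the subspace lives in observation space (each column of X is a vector with one entry per observation), not in (age, height) space, and

      $$\operatorname{col}(X) = \{Xb : b \in \mathbb{R}^p\}, \qquad \hat{y} = X\hat{\beta} = \underbrace{X(X^{\mathsf T}X)^{-1}X^{\mathsf T}}_{P}\,y, \qquad X^{\mathsf T}(y - \hat{y}) = 0,$$

      i.e. \(\hat{y}\) is the point of \(\operatorname{col}(X)\) closest to \(y\) in Euclidean distance, and 'orthogonal' refers to the residual vector \(y - \hat{y}\) being perpendicular to every column of X (every direction within that subspace).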

    5. ing from the moment conditions \(\mathrm{E}\big[\,x_i(y_i - x_i^{\mathsf T}\beta)\,\big] = 0\).

      rem: \(y_i - x_i^{T}\beta\) is the distance between the "predicted" (or 'projected') and actual y. It is a distance or difference in the y (outcome) dimension only. I.e., (in 2 dimensions) the 'vertical distance'.

      This is not the same as the Euclidean distance (L2 norm) between the observation in x,y space and the 'prediction plane' -- the latter is orthogonal by definition.

      Here we are saying that the sum of the 'prediction errors' (the signed vertical differences), weighted by the values of each x, is set to zero, and this must hold for every x variable. Note these are signed differences rather than distances, so positive and negative errors can offset. The orthogonality condition is saying 'get these x-weighted vertical differences to sum to zero, please'.

      This condition arises as a result of the previous 'minimize the sum of squared deviations' problem.

    6. In particular, this assumption implies that for any vector-function ƒ, the moment condition E[ƒ(xi)·εi] = 0 will hold.

      but we sometimes use other fancier moment conditions in Econometrics IIRC.

    7. The OLS estimator \(\hat{\beta}\) in this case can be interpreted as the coefficients of vector decomposition of \(\hat{y} = Py\) along the basis of X.

      as in principal-component analysis

    1. In statistics and signal processing, a minimum mean square error (MMSE) estimator is an estimation method which minimizes the mean square error (MSE), which is a common measure of estimator quality, of the fitted values of a dependent variable. In the Bayesian setting, the term MMSE more specifically refers to estimation with quadratic loss function. In such case, the MMSE estimator is given by the posterior mean of the parameter to be estimated. Since the posterior mean is cumbersome to calculate, the form of the MMSE estimator is usually constrained to be within a certain class of functions. Linear MMSE estimators are a popular choice since they are easy to use, easy to calculate, and very versatile. It has given rise to many popular estimators such as the Wiener–Kolmogorov filter and Kalman filter.

      And remember that the Max Likelihood estimator is setting the parameters (betas) so as to 'maximize the likelihood of the data given these parameters, assuming (e.g.) a normally distributed error structure'.

      In contrast, the Bayesian 'estimator' (really the posterior) combines the likelihood of the data under different parameter values with the prior (normal) distribution over those parameters.

  7. Apr 2021
    1. Principal axis factoring is not to be confused with principal components analysis (PCA), which strictly speaking is not a type of common factor analysis because it generates components rather than factors. Unlike factors, components include the unique variances of the observed variables. This is similar to how we calculate composite scores in classical test theory (though PCA finds the best possible solution to reducing variables, with their error, into composites). PCA is appropriate when we are not interested in assuming that there is an underlying latent variable such as a construct corresponding with the components. When we assume that the factors do represent latent traits, common factor analysis is more appropriate than PCA. Because we are interested in measuring latent traits or constructs, we do not use PCA in this chapter.

      PCA versus EFA

    2. library(psych) fa(r = , nfactors = , n.obs = , fm = , rotate = ) The first argument, which can replace the r =, is for the data, whether it be a data frame of the raw data or a correlation or covariance matrix.

      So whatever this is doing, it's a function of the correlation/covariance matrix and the sample size.
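
      A runnable sketch using a correlation matrix that ships with base R (Harman74.cor, 24 psychological tests, n = 145); fm = "pa" requests the principal-axis method this chapter is about, and the number of factors here is purely for illustration:

      library(psych)

      efa_fit <- fa(r = Harman74.cor$cov, nfactors = 4, n.obs = 145,
                    fm = "pa", rotate = "varimax")
      print(efa_fit$loadings, cutoff = 0.3)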

    1. but in order to account for correlations, the current best-practice approach is to follow Katz, Kling and Liebman (2007) in calculating bootstrapped estimates of adjusted p-values using a modification of the free step-down algorithm of Westfall and Young (1993).

      outdated?

    1. Do a bit more work. Re-check that your project is still in a functional state. Commit again but this time amend your previous commit. RStudio offers a check box for “Amend previous commit” or in the shell: git commit --amend --no-edit The --no-edit part retains the current commit message of “WIP”. Don’t push! Your history now looks like this: A -- B -- C -- WIP* but the changes associated with the WIP* commit now represent your last two commits, i.e. all the accumulated changes since state C. Keep going like this. Let’s say you’ve finally achieved

      repeated 'amending commits' to avoid a clutter of commits. You can't push in the interim

    1. git commit --all -m "WIP" git checkout master Then when you come back to the branch and continue your work, you need to undo the temporary commit by resetting your state. Specifically, we want a mixed reset. This is “working directory safe”, i.e. it does not affect the state of any files. But it does peel off the temporary WIP commit. Below, the reference HEAD^ says to roll the commit state back to the parent of the current commit (HEAD). git checkout issue-5 git reset HEAD^

      quick switches between branches without losing content

  8. Mar 2021
    1. Bilder and Loughin (2007) introduced a flexible loglinear modeling approach that allows researchers to consider alternative association structures somewhere between SPMI and complete dependence. Within this framework, a model under SPMI is given as \(\log(\mu_{ab(ij)}) = \gamma_{ij} + \eta^{W}_{a(ij)} + \eta^{Y}_{b(ij)}\)

      association under independence (as a loglinear model): the joint probability is the product of the marginals, so the log probability is the sum of the log probabilities

    2. asymptotic distribution is a linear combination of independent \(\chi^2_1\) random variables (Bilder and Loughin, 2004)

      but with a different asymptotic distribution

    3. a test for simultaneous pairwise marginal independence (SPMI), involves determining whether each \(W_1, \ldots, W_I\) is pairwise independent of each \(Y_1, \ldots, Y_J\). Our MI.test() function calculates their modified Pearson statistic

      it sums the Chi-sq over pairwise combinations

    4. Examining all possible combinations of the positive/negative item responses between MRCVs is the preferred way to display and subsequently analyze MRCV data.

      rather bulky


  9. Feb 2021
    1. To deal with this, we organised all of the factors into six overarching categories, comprising three barriers and three facilitators: 1. Difficulties in accessing evidence (six studies) 2. Challenges in understanding the evidence (three studies) 3. Insufficient resources (six studies) 4. Knowledge sharing and ease of access (six studies) 5. Professional advisors and networks (three studies) 6. A broader definition of what counts as credible evidence and better standardisation of reporting (three studies).

      barriers and facilitators organised - seems to miss psychological factors?

    1. hey run conjoint analysis: in which customers are offered goods with various combinations of characteristics and price – maybe a pink car with a stereo for £1,000, a pink car without a stereo for £800, a blue car for £1,100 and a blue car without a stereo for £950 – to identify how much customers value each characteristic.

      But these are usually (always) hypothetical choices, I believe.

    2. et me tell you a story. Once upon a time, researcher Dean Karlan was investigating microloans to poor people in South Africa, and what encourages people to take them. He sent people flyers with various designs and offering loans at various rates and sizes. It turns out that giving consumers only one choice of loan size, rather than four, increased their take-up of loans as much as if the lender had reduced the interest rate by about 20 percent. And if the flyer features a picture of a woman, people will pay more for their loan – demand was as high as if the lender had reduced the interest rate by about a third. Nobody would say in a survey or interview that they would pay more if a flyer has a lady on it. But they do. Similarly, Nobel Laureate Daniel Kahneman reports that, empirically, people are more likely to be believe a statement if it is written in red than in green. But nobody would say that in a survey, not least because we don’t know it about ourselves.

      on self-reported motivations

    1. Just like K-means and hierarchical algorithms go hand-in-hand with Euclidean distance, the Partitioning Around Medoids (PAM) algorithm goes along with the Gower distance.

      why can't I do hierarchical with Gower distance?
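
      On that question: hclust() will in fact accept a Gower dissimilarity object, so the pairing in the text reads more like a convention than a hard constraint. A small sketch with mixed-type data cobbled together from mtcars:

      library(cluster)

      df <- transform(mtcars[, c("mpg", "wt", "cyl", "am")],
                      cyl = factor(cyl), am = factor(am))    # mixed numeric + categorical columns

      gower_d <- daisy(df, metric = "gower")

      hc      <- hclust(gower_d, method = "average")         # hierarchical clustering on Gower distances
      pam_fit <- pam(gower_d, diss = TRUE, k = 3)            # PAM on the same dissimilarities
      table(hclust = cutree(hc, k = 3), pam = pam_fit$clustering)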

    2. The silhouette width is one of the very popular choices when it comes to selecting the optimal number of clusters. It measures the similarity of each point to its cluster, and compares that to the similarity of the point with the closest neighboring cluster. This metric ranges between -1 to 1, where a higher value implies better similarity of the points to their clusters.

      This is under-explained.

      Silhouette width of each obs: Scaled measure of dissimilarity from (nearest) neighbor cluster relative to dissimilarity from own cluster.

    3. library(cluster); gower_df <- daisy(german_credit_clean, metric = "gower", type = list(logratio = 2))

      Code needs a line

        mutate_if(is.character, as.factor)
      

      To avoid an error

    4. We find that the variable amount needs a log transformation due to the positive skew in its distribution.

      just by visual inspection?

      the others DON'T all seem normally distributed to me

    1. For each observation \(i\), calculate the average dissimilarity \(a_i\) between \(i\) and all other points of the cluster to which \(i\) belongs. For all other clusters \(C\), to which \(i\) does not belong, calculate the average dissimilarity \(d(i, C)\) of \(i\) to all observations of C. The smallest of these \(d(i, C)\) is defined as \(b_i = \min_C d(i, C)\). The value of \(b_i\) can be seen as the dissimilarity between \(i\) and its “neighbor” cluster, i.e., the nearest one to which it does not belong. Finally, the silhouette width of observation \(i\) is defined by the formula: \(S_i = (b_i - a_i)/\max(a_i, b_i)\).

      Silhouette width of each obs: Scaled measure of dissimilarity from (nearest) neighbor cluster relative to dissimilarity from own cluster.
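
      cluster::silhouette() computes exactly these \(a_i\), \(b_i\) and \(S_i\) per observation, given a clustering and a dissimilarity object; a minimal sketch:

      library(cluster)

      d   <- daisy(scale(mtcars[, c("mpg", "hp", "wt")]))   # any dissimilarity object works
      cl  <- pam(d, diss = TRUE, k = 3)$clustering
      sil <- silhouette(cl, d)
      head(sil[, c("cluster", "neighbor", "sil_width")])
      mean(sil[, "sil_width"])                              # average silhouette width for this k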

    2. The total WSS measures the compactness of the clustering and we want it to be as small as possible.

      as small as possible (within sample) for a given number of clusters

    3. To avoid distortions caused by excessive outliers, it’s possible to use PAM algorithm, which is less sensitive to outliers.

      another solution to outliers?

    4. Next, the wss (within sum of square) is drawn according to the number of clusters. The location of a bend (knee) in the plot is generally considered as an indicator of the appropriate number of clusters.

      need more explanation here. What is the value of this "within sum of square" and why does a 'bend' lead to the appropriate number
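
      The "wss" being plotted is the total within-cluster sum of squares as a function of k; a base-R sketch of the elbow plot (factoextra::fviz_nbclust(..., method = "wss") wraps the same idea):

      dat <- scale(mtcars[, c("mpg", "hp", "wt", "qsec")])
      wss <- sapply(1:10, function(k) kmeans(dat, centers = k, nstart = 25)$tot.withinss)
      plot(1:10, wss, type = "b", xlab = "number of clusters k",
           ylab = "total within-cluster sum of squares")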

    5. K-means algorithm can be summarized as follow: Specify the number of clusters (K) to be created (by the analyst) Select randomly k objects from the dataset as the initial cluster centers or means Assigns each observation to their closest centroid, based on the Euclidean distance between the object and the centroid For each of the k clusters update the cluster centroid by calculating the new mean values of all the data points in the cluster. The centoid of a Kth cluster is a vector of length p containing the means of all variables for the observations in the kth cluster; p is the number of variables. Iteratively minimize the total within sum of square. That is, iterate steps 3 and 4 until the cluster assignments stop changing or the maximum number of iterations is reached. By default, the R software uses 10 as the default value for the maximum number of iterations.

      the implicit claim is that this 'mean-finding' procedure will minimise the sum of squared distances

    1. A successful evaluation of discriminant validity shows that a test of a concept is not highly correlated with other tests designed to measure theoretically different concepts.

      But what if the traits you are trying to measure are actually correlated in the real world?

    1. The remaining term, \(1/(1 - R_j^2)\), is the VIF. It reflects all other factors that influence the uncertainty in the coefficient estimates. The VIF equals 1 when the vector \(X_j\) is orthogonal to each column of the design matrix for the regression of \(X_j\) on the other covariates. By contrast, the VIF is greater than 1 when the vector \(X_j\) is not orthogonal to all columns of the design matrix for the regression of \(X_j\) on the other covariates. Finally, note that the VIF is invariant to the scaling of the variables

      VIF interpretation

    2. It turns out that the square of this standard error, the estimated variance of the estimate of \(\beta_j\), can be equivalently expressed as \(\widehat{\operatorname{var}}(\hat{\beta}_j) = \frac{s^2}{(n-1)\,\widehat{\operatorname{var}}(X_j)} \cdot \frac{1}{1 - R_j^2}\), where \(R_j^2\) is the multiple \(R^2\) for the regression of \(X_j\) on the other covariates (a regression that does not involve the response variable Y). This identity separates the influences of several distinct factors on the variance of the coefficient estimate: \(s^2\): greater scatter in the data around the regression surface leads to proportionately more variance in the coefficient estimates; \(n\): greater sample size results in proportionately less variance in the coefficient estimates; \(\widehat{\operatorname{var}}(X_j)\): greater variability in a particular covariate leads to proportionately less variance in the corresponding coefficient estimate. The remaining term, \(1/(1 - R_j^2)\), is the VIF. It reflects all other factors that influence the uncertainty in the coefficient estimates

      a useful decomposition of the variance of the estimated coefficient
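
      The decomposition suggests computing the VIF 'by hand' as \(1/(1 - R_j^2)\) from the auxiliary regression; a quick check against car::vif() on any small dataset:

      library(car)

      fit_aux <- lm(wt ~ hp + disp + qsec, data = mtcars)    # regress X_j on the other covariates
      c(by_hand = 1 / (1 - summary(fit_aux)$r.squared),
        car_vif = vif(lm(mpg ~ wt + hp + disp + qsec, data = mtcars))["wt"])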

    1. When treatment assignment takes place in waves, it is natural to adapt Thompson sampling by assigning a non-random number \(p_{dt} N_t\) of observations in wave \(t\) to treatment \(d\), in order to reduce randomness. The remainder of observations are assigned randomly so that expected shares remain equal to \(p_{dt}\).

      not sure what this means
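
      My reading of it, as a sketch with made-up numbers: the Thompson probabilities \(p_{dt}\) fix a deterministic share of the wave for each arm, and only the leftover units are randomized, so realized shares track the target shares more tightly than fully independent assignment would.

      p_dt <- c(0.5, 0.3, 0.2)                  # Thompson assignment probabilities for wave t (hypothetical)
      N_t  <- 100                               # wave size
      fixed    <- floor(p_dt * N_t)             # assigned deterministically to each arm
      leftover <- N_t - sum(fixed)
      extra    <- sample(seq_along(p_dt), leftover, replace = TRUE, prob = p_dt)
      table(arm = c(rep(seq_along(p_dt), fixed), extra))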

    1. \(Q = \frac{12n}{k(k+1)} \sum_{j=1}^{k} \left( \bar{r}_{\cdot j} - \frac{k+1}{2} \right)^2\). Note

      Q is something that will increase the more a given wine tends to be ranked systematically lower or higher than average.

    2. Find the values \(\bar{r}_{\cdot j} = \frac{1}{n} \sum_{i=1}^{n} r_{ij}\)

      average rank of wine j across all raters
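
      Base R implements this test directly (blocks/raters in rows, treatments/wines in columns); the toy ratings below are invented:

      ratings <- matrix(c(5, 3, 4,
                          4, 2, 5,
                          5, 3, 4,
                          3, 2, 4),
                        nrow = 4, byrow = TRUE,
                        dimnames = list(paste0("rater", 1:4), paste0("wine", 1:3)))
      friedman.test(ratings)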

  10. Jan 2021
    1. Definitions

      @Jasonschukraft wrote:

      Not sure where to put this comment, but how are you thinking about uncertainty about effectiveness? There's a small pool of donors who deny that GiveWell has identified the most effective global poverty/health charities because (e.g.) GiveWell is too focused on "randomista" interventions and doesn't give enough weight to "systematic" interventions.

    2. Individual donors, governments and firms demonstrate substantial generosity (e.g., UK charity represents 0.5-1% of GDP, US charity around 2% of GDP).

      Things to emphasize, from Jason Shukraft conversation.

      Do the ‘masses of donors’ matter, or only the multimillionaire response? The average person … do small donations add up? Also, knowing more about how average people respond to analytical information (in an other-regarding/social context) will inform how to influence good long-term decision-making: how to get the USDA to care about animals/WAW, and government to care about the long term.

    1. how people react to the presentation of charity-effectiveness information.

      @JasonSchukraft wrote:

      Maybe. I suppose it depends on our goals. Do we want people to give to top charities for the right reason (i.e., because those charities are effective) or do we just want people to give to top charities, simpliciter? If the latter, then maybe it doesn't matter how people react to effectiveness information; we should just go with whatever marketing strategy maximizes donations.

  11. Dec 2020
    1. Beem101: Project, discussion of research

      I was asked about the 'structure' of the project. This depends on the option, on your topic choice, and on how you wish to pursue it. Nonetheless, a rough structure might look like the following:

      Across the topics (more or less... it depends on the project option and topic)

      1. Introduce the topic, model, question, overview of what you are going to do, and why this is relevant and interesting (some combination of this)
      • The Economic theory/theories and model(s) presented

      • with reference to academic authors (papers textbooks)

      • using formal (maths) modeling, giving at least one simple but formal presentation, and explaining it clearly and in your own voice (remember to explain what all variables mean),

      • considering the assumptions and simplifications of the model, and the 'Economic tools/fields' considered (e.g., optimisation, equilibrium)

      • Sensitivity of the 'predictions' to the assumptions

      • The justification for these assumptions

      • Relationship between this model and your (applied) topic or focus... are the assumptions relevant, what are the 'predictions' etc.

      2. The application or real world example:
      • Explain it in practical terms and what the 'issues and questions' are (possibly engaging the previous literature a bit, but not too much)
      • describe and express it formally
      • relate it to the model/theory and apply the model theory to your real world example

      • Try to 'model it' and derive 'results' or predictions, continually justifying the application of the model to the example

      3. Presenting and assessing the insights from the model for the application and vice versa
      • considering the relevance and sensitivity
      • what alternative models might be applied, how might it be adjusted
      • Discuss 'what modeling and theory achieved or did not achieve here'

      Note that "2" could come before or after "3" ... you can present the application first, or the model first... (or there might even be a way to go between the two, presenting one part of each)

  12. Oct 2020
    1. pure’ altruism or ‘warm glow’ altruism (Andreoni 1990; Ashraf and Bandiera 2017)

      This classification is often misunderstood and misused. The Andreoni 'Warm Glow' paper was meant to consider a fairly simple general question about giving overall, not to unpick psychological motivations.

    1. The Y-intercept of the SML is equal to the risk-free interest rate. The slope of the SML is equal to the market risk premium and reflects the risk-return tradeoff at a given time: \(\mathrm{SML}: E(R_i) = R_f + \beta_i [E(R_M) - R_f]\), where E(Ri) is the expected return on the security, E(RM) is the expected return on market portfolio M, βi is the nondiversifiable or systematic risk, RM is the market rate of return, and Rf is the risk-free rate.

      The key equation ... specifying risk vs return

    2. The Y-intercept of the SML is equal to the risk-free interest rate. The slope of the SML is equal to the market risk premium and reflects the risk-return tradeoff at a given time: \(\mathrm{SML}: E(R_i) = R_f + \beta_i [E(R_M) - R_f]\), where E(Ri) is the expected return on the security, E(RM) is the expected return on market portfolio M, βi is the nondiversifiable or systematic risk, RM is the market rate of return, and Rf is the risk-free rate.

      This is one statement of the key relationship.

      The point is that the market will have a single tradeoff between unavoidable (nondiversifiable) risk and return.

      Assets' returns must reflect this, according to the theory. Their prices will be bid up (or down) until this is the case ... the 'arbitrage' process.

      Why? Because (assuming borrowing/lending at a risk-free rate) any investor can achieve a particular return for a given risk level simply by buying the 'diversified market basket' and leveraging it (for more risk) or investing the remainder in the risk-free asset (for less risk). (And she can do no better than this.)
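
      A worked number (values invented): with \(R_f = 2\%\), \(E(R_M) = 8\%\) and \(\beta_i = 1.5\),

      $$E(R_i) = 0.02 + 1.5 \times (0.08 - 0.02) = 0.11 = 11\%,$$

      which is exactly what the leverage argument above delivers: 150% in the market portfolio financed by borrowing 50% at the risk-free rate returns \(1.5 \times 8\% - 0.5 \times 2\% = 11\%\) with the same beta.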

    1. If the fraction \(q\) of a one-unit (e.g. one-million-dollar) portfolio is placed in asset X and the fraction \(1-q\) is placed in Y, the stochastic portfolio return is \(qx + (1-q)y\). If \(x\) and \(y\) are uncorrelated, the variance of portfolio return is \(\text{var}(qx + (1-q)y) = q^2\sigma_x^2 + (1-q)^2\sigma_y^2\). The variance-minimizing value of \(q\) is \(q = \sigma_y^2/[\sigma_x^2 + \sigma_y^2]\), which is strictly between 0 and 1. Using this value of \(q\) in the expression for the variance of portfolio return gives the latter as \(\sigma_x^2\sigma_y^2/[\sigma_x^2 + \sigma_y^2]\), which is less than what it would be at either of the undiversified values \(q = 1\) and \(q = 0\) (which respectively give portfolio return variance of \(\sigma_x^2\) and \(\sigma_y^2\)). Note that the favorable effect of diversification on portfolio variance would be enhanced if \(x\) and \(y\) were negatively correlated but diminished (though not eliminated) if they were positively correlated.

      Key building block formulae.

      • Start with 'what happens to the variance when we combine two assets (uncorrelated with same expected return)'

      • What are the variance minimizing shares and what is the resulting variance of the portfolio.
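
      The variance-minimising share drops out of a one-line first-order condition:

      $$\frac{d}{dq}\left[q^2\sigma_x^2 + (1-q)^2\sigma_y^2\right] = 2q\sigma_x^2 - 2(1-q)\sigma_y^2 = 0 \;\Rightarrow\; q^* = \frac{\sigma_y^2}{\sigma_x^2 + \sigma_y^2}, \qquad \text{var}^* = \frac{\sigma_x^2\sigma_y^2}{\sigma_x^2 + \sigma_y^2} < \min(\sigma_x^2,\, \sigma_y^2).$$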

    1. “Sue’s mother” \(R_a\) “Sue’s lecturer in the UK” \(\rightarrow\) false (so it’s not ‘transitive’)

      I think this is where Andrea meant to ask her question:

      I wanted to ask how is this a false statement? I want to clarify. Is it that, she is a mother and this does not relate with her being a lecturer in the UK? From my understanding the theory of transitive means that there is consistency, hence from the first statement to the last it would make sense…

    1. Students

      A household chooses how to invest: how to lay aside money for future consumption, and which asset to buy to store this value and hopefully get “high payoffs” with little risk.


  13. Sep 2020
  14. Aug 2020
    1. 4. It is difficult to find cause neutral funding.I think funders like to choose their cause and stick with it so there is a lack of cause neutral funding. 

      A good point!

    2. (Also, I have never worked in academia so there may be theories of change in the academic space that others could identify.)

      There are some explicit 'Impact targets' in the REF, and pots of ESRC funding for 'impact activities'.

      In general I don't think we believe that our 'publications' will themselves drive change. It's more like publications → status → influence policymakers.

    3. But for a new organisation to solely focus on doing the research that they believed would be most useful for improving the world it is unclear what the theory of change would be.

      I'm not quite sure how this is differentiated from 'for a big funder'.

    4. I think that people are hesitant to do something new if they think it is being done, and funders want to know why the new thing is different so the abundance of organisations that used to do cause prioritisation research or do research that is a subcategory of cause prioritisation research limits other organisations from starting up.

      Very good point. I think this happens in a lot of spheres.

    5. Theoretical cause selection beyond speculation. Evidence of how to reason well despite uncertainty and more comparisons of different causes.

      I also think this may have run into some fundamental obstacles.

    6. Let me give just one example, if you look at best practice in risk assessment methodologies[5] it looks very different from the naive expected value calculations used in EA

      I agree somewhat, but I'm not sure if the 'risk-assessment methodologies' are easily communicated, nor if they apply to the EA concerns here.

    7. e. From my point of view, I could save a human life for ~£3000. I don’t want to let kids die needlessly if I can stop it. I personally think that the future is really important but before I drop the ball on all the things I know will have an impact it would be nice to have:

      A reasonable statement of 'risk-aversion over the impact that I have'.

    8. Now let’s get a bit more complicated and do some more research and find other interventions and consider long run effects and so on”. There could be research looking for strong empirical evidence into:the second order or long run effects of existing interventions.how to drive economic growth, policy change, structural changes, and so forth.

      These are just extremely difficult to do/learn about. Economists, political scientists, and policy analysts have been debating these for centuries. I'm not sure there are huge easy wins here.