147 Matching Annotations
  1. Sep 2021
  2. Jul 2020
    1. The ball is now in your court: think how to use skills developed through the book to formulate good research questions, form collaborative teams, and shape the future of scholarship.

      nice

    1. .

      The challenges you describe here are real and important, and I think the story is helpful. However, I think these problems could occur in one paper, not just a series of papers.

    1. Therefore, we see hybrid approaches where humans and computational analysis are deeply engaged in the sense-making process as one emergent approach for doing computational social science work.

      Agreed. But I think that some of the people doing "solo computational social science" probably think they are doing hybrid. I think the lines between these could be more clear.

    1. These highlights invite us to consider new ways of doing research, relaxing the traditional strong division between theory-first (deductive) and theory-last (inductive) thinking.

      agreed. I like the contrast between theory-first and theory-last

    2. (With any and all meanings of theory - disciplines within social sciences and computational social science may have different ideas about what constitutes theory.)

      this seems too important to put in parentheses

    1. This was a long introduction to say that this book tries not to advocate for a particular way of doing research, but to embrace openness on these questions.

      I would say you are advocating for an approach: an open approach.

    1. In computational social sciences, the operationalisation process involves transforming a theoretical perspective into an algorithm.

      I'm not sure this is always true

    1. Technological imagination refers to the ability to use computational tools to capture the interesting questions.

      This is important. I like the way you make this parallel to sociological imagination.

    2. Based on my own experiences, these challenges can even become emotionally charged. Differences in core assumptions may be difficult to discuss even among academics. However, for computational social science to succeed, everyone must be able to cooperate across these types of collaborations. We will return to the opportunities and problems of multidisciplinary collaboration in Chapter 11.

      I'm glad that you address this explicitly

    3. However, the gist of the work is to ensure that the research and findings speak with existing social science literature.

      the theory-driven approach feels a bit different from the others to me because it includes work from some of the other categories, whereas some of the other categories are mutually exclusive.

    1. Within this book, social science theory refers to the collective knowledge developed and conceptualised by social scientists over decades.

      does "theory" also then include empirical results?

    1. decreases a negative social phenomenon or increases a positive social phenomenon.

      I often hear this community described as data science for social good.

    1. Rather, it seems to adapt methods familiar to physicists and complexity scientists to social science research problems.

      Based on this I think you are talking about more than just agent-based modeling.

    2. After the tournament, the winning strategy was tit-for-tat, a simple approach where one defected only in cases where the opposite side had already defected in the previous turn.

      To me this is a really surprising result, but you don't really tell the story and emphasize the cool ending.
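
A minimal sketch of the strategy described in the quoted sentence, to make the rule concrete. The payoff values (5/3/1/0) are the conventional prisoner's dilemma numbers and are an assumption here, not something stated in the quoted text; the function names and the `always_defect` opponent are hypothetical.

```python
# Payoffs as (player A, player B); "C" = cooperate, "D" = defect.
# These values are assumed for illustration, not taken from the source.
PAYOFFS = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}


def tit_for_tat(opponent_history):
    """Cooperate first; afterwards copy the opponent's previous move."""
    return "C" if not opponent_history else opponent_history[-1]


def always_defect(opponent_history):
    """Hypothetical opponent that defects on every turn."""
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play an iterated game and return the total scores (a, b)."""
    history_a, history_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        # Each strategy only sees the other side's past moves.
        move_a = strategy_a(history_b)
        move_b = strategy_b(history_a)
        payoff_a, payoff_b = PAYOFFS[(move_a, move_b)]
        score_a += payoff_a
        score_b += payoff_b
        history_a.append(move_a)
        history_b.append(move_b)
    return score_a, score_b
```

Under these assumed payoffs, tit-for-tat loses a single head-to-head match against a constant defector but earns full mutual cooperation against a copy of itself, which is the kind of contrast the story could emphasize.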

    1. Thus, their work shows that political polarisation occurs in political blogs. There were clear divisions based on parties.

      could these sentences be combined?

    1. Therefore, it is hard to justify calling computational social science a holistic discipline; rather, it is a multidisciplinary mesh of scholars doing research using computation with social science questions in mind.

      what do you think?

    2. including me

      I like that you explicitly include yourself here. In the previous paragraph I was wondering what you think about the "end of theory" debate.

  3. Dec 2019
    1. .

      Further, I am grateful to the following people for telling me about errors and typos in the hardback edition: Nimrod Priell, David Marker, Giannis Kanellopoulos, Hiroki Takikawa, Jun Tsunematsu, Takuto Sakamoto, Shinya Obayashi, Anna Ballarino, Arthur Spirling, and the hypothesis user named arnaud.

  4. Aug 2019
    1. Figure 4.3: Schematic of the experimental design from Schultz et al. (2007). The field experiment involved visiting about 300 households in San Marcos, California five times over an eight-week period. On each visit, the researchers manually took a reading from the house’s power meter. On two of the visits, they placed doorhangers on each house providing some information about the household’s energy usage. The research question was how the content of these messages would impact energy use.

      In the figure, "3 week" should be "3 weeks"

    1. Figure 3.7: Demographics of respondents in W. Wang et al. (2015). Because respondents were recruited from XBox, they were more likely to be young and more likely to be male, relative to voters in the 2012 election. Adapted from W. Wang et al. (2015), figure 1.

      In the x-tick marks in the panel titled "State", Obama should be capitalized and Romney should be capitalized.

  5. Apr 2019
  6. Feb 2019
  7. Oct 2018
  8. Jul 2018
  9. Dec 2016
    1. In a lovely paper, Lewis and Rao (2015) vividly illustrate a fundamental statistical limitation of even massive experiments. The paper, which originally had the provocative title “On the Near-impossibility of Measuring the Returns to Advertising”, shows how difficult it is to measure the return on investment of online ads, even with digital experiments involving millions of customers. More generally, the paper clearly shows that it is hard to estimate small treatment effects amidst noisy outcome data. Or stated differently, the paper shows that estimated treatment effects will have large confidence intervals when the impact-to-standard-deviation ($\delta \bar{y} / \sigma$

      Here's an improved version of this activity:

      https://gist.github.com/msalganik/064678b4eb7625e3ecb25e8a65eff38b
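
To make the point about large confidence intervals concrete, here is a rough sketch. It assumes a two-arm experiment with equal group sizes, a common outcome standard deviation, and a normal approximation; none of these details come from the quoted passage, and the function and parameter names (`effect`, `sigma`, `n_per_arm`) are hypothetical.

```python
import math


def ci_half_width(sigma, n_per_arm, z=1.96):
    """Approximate 95% CI half-width for a difference in means.

    Assumes two arms of equal size and a common standard deviation,
    so the standard error is sigma * sqrt(2 / n_per_arm).
    """
    return z * sigma * math.sqrt(2.0 / n_per_arm)


def n_per_arm_for_half_width(effect, sigma, z=1.96):
    """Smallest per-arm sample size whose half-width is at most `effect`."""
    return math.ceil(2.0 * (z * sigma / effect) ** 2)


# Illustrative values only: an effect that is 1% of the outcome's
# standard deviation.
needed = n_per_arm_for_half_width(effect=0.01, sigma=1.0)
```

With these illustrative inputs the required sample is in the tens of thousands per arm just to get an interval as narrow as the effect itself, and it grows with the square of sigma divided by the effect.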

    1. Michel et al. (2011) constructed a corpus emerging from Google’s effort to digitize books. Using the first version of the corpus, which was published in 2009 and contained over 5 million digitized books, the authors analyzed word usage frequency to investigate linguistic changes and cultural trends. Soon the Google Books Corpus became a popular data source for researchers, and a 2nd version of the database was released in 2012. However, Pechenick, Danforth, and Dodds (2015) warned that researchers need to fully characterize the sampling process of the corpus before using it for drawing broad conclusions. The main issue is that the corpus is library-like, containing one of each book. As a result, an individual, prolific author is able to noticeably insert new phrases into the Google Books lexicon. Moreover, scientific texts constitute an increasingly substantive portion of the corpus throughout the 1900s. In addition, by comparing two versions of the English Fiction datasets, Pechenick et al. found evidence that insufficient filtering was used in producing the first version. All of the data needed for this activity is available here: http://storage.googleapis.com/books/ngrams/books/datasetsv2.html

       a) In Michel et al.’s original paper (2011), they used the 1st version of the English data set, plotted the frequency of usage of the years “1880”, “1912” and “1973”, and concluded that “we are forgetting our past faster with each passing year” (Fig. 3A, Michel et al.). Replicate the same plot using the 1st version of the corpus, English dataset (same as Fig. 3A, Michel et al.).
       b) Now replicate the same plot with the 1st version, English fiction dataset.
       c) Now replicate the same plot with the 2nd version of the corpus, English dataset.
       d) Finally, replicate the same plot with the 2nd version, English fiction dataset.
       e) Describe the differences and similarities between these four plots. Do you agree with Michel et al.’s original interpretation of the observed trend? (Hint: c) and d) should be the same as Figure 16 in Pechenick et al.)
       f) Now that you have replicated this one finding using different Google Books corpora, choose another linguistic change or cultural phenomenon presented in Michel et al.’s original paper. Do you agree with their interpretation in light of the limitations presented in Pechenick et al.? To make your argument stronger, try to replicate the same graph using different versions of the data set as above.

      Here's an improved version of this activity: https://gist.github.com/msalganik/21a585ff38bee58db320ed3329d801b1
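
For the replication steps in this activity, a rough sketch of the frequency calculation might look like the following. It assumes the tab-separated 1-gram layout of the 2nd-version files (ngram, year, match_count, volume_count) and that per-year totals have already been read into a dictionary; the 1st-version files may use a different column layout, so check the dataset page before relying on this. The function name and the toy numbers are hypothetical.

```python
import csv
import io
from collections import defaultdict


def relative_frequency(rows, totals, targets):
    """Return {ngram: {year: match_count / total_count_for_year}}.

    rows    -- iterable of (ngram, year, match_count, volume_count) fields
               (assumed layout; verify against the dataset documentation)
    totals  -- dict mapping year (int) to total 1-gram count for that year
    targets -- set of ngrams to keep, e.g. {"1880", "1912", "1973"}
    """
    counts = defaultdict(dict)
    for ngram, year, match_count, _volume_count in rows:
        if ngram in targets:
            y = int(year)
            counts[ngram][y] = counts[ngram].get(y, 0) + int(match_count)
    return {
        ngram: {y: c / totals[y] for y, c in sorted(by_year.items()) if y in totals}
        for ngram, by_year in counts.items()
    }


# Toy data (made-up numbers) standing in for a downloaded 1-gram file.
sample = "1880\t1900\t50\t10\n1880\t1901\t40\t9\nother\t1900\t7\t1\n"
rows = csv.reader(io.StringIO(sample), delimiter="\t", quoting=csv.QUOTE_NONE)
freqs = relative_frequency(rows, totals={1900: 1000, 1901: 2000}, targets={"1880"})
```

Running the same function over each of the four corpus and version combinations gives the four series to plot and compare.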

  10. Nov 2016
  11. Jul 2016
  12. bit-by-bit.lukebaker.aghosted.com
  13. May 2016