59 Matching Annotations
  1. Nov 2018
    1. Disambiguation is done by appending the political division to the name of the place in order of specificity. If this fails to uniquely determine locations, distances to the closest reference points in the text are used to break ties.

      is specificity referring to the political divisions mentioned in the surrounding context? is it a scale?

    1. lynching was a ritual that made power visible, yet its power depended in part on its lack of visibility in the official records.

      data shows this? i thought this was inherent based on the act of lynching

    2. In fact, Mathews turns away from the empirical and quantitative methods most akin to data visualization in history.

      how is data collected if not relying on this info

    1. The fault is not with the source, since it is the borrowing for humanistic projects that is problematic, not the statistical graphics themselves. They work just fine for statistical matters.

      problem is with humanists, not graphs

    2. graphical expressions suited to the needs and methods of humanists should get a boost from exposing the operations and limitations of current conventions.

      how would one suit these needs?

  2. Oct 2018
    1. people and their social lives based on metadata only, without much reference to the actual content of what they say.

      someone can obtain a lot of information about a person by looking at what groups they are members of. reminds me of metadata collected by social media: you can learn a lot about someone based on their groups of followers or the pages they like or repost

    2. I will show how we can use this “metadata” to find key persons involved in terrorist groups operating within the Colonies at the present time

      very interesting

    1. indicating the degree to which Jefferson relied on his staff to implement his various directives

      concluded that jefferson relied on his slaves heavily from the data

    2. You’ll notice that I’ve arranged the correspondents in groups—indicated by the different colors

      did this require him to go through thousands of written historical archives?

    3. It’s an arc diagram that visualizes the people with whom Jefferson corresponded about James Hemings.

      data similar to those that we collect in class, only visual

    4. both those that the editors missed, and there are quite a few, as I’ve discovered in my research; and those that refer to Hemings by one of his several nicknames

      incredible how other scarce, historical documents can confirm this kind of information

    1. as big a proportion

      if they are comparing these words to all other abstracts relating to history, wouldn't these words show up significantly less than others because these words deal with a smaller scope of historical events?

    1. Because topic modeling transforms or compresses free data (raw narrative text) into structured data (topics as a ratio of word tokens and their strength of representation in documents) it is tempting to think of it as “solving” text

      not able to "solve" abstract texts, how can one compress raw data into significance of pieces like poetry

    1. we do. When you don't expect much from tools, it shifts the interpretative responsibility for making sense of the rich variety of ways that texts can be represented

      important distinction: analysis can come from these programs but interpretation is reliant on the individual

    2. The hundred hours of video uploaded to YouTube every minute would remain largely inaccessible were it not for text-based searches of the title, description, and other metadata.

      did not realize the gravity of text analysis; used everywhere

    3. called “intelligent” and blacks more likely to be called “natural.”

      effective use of text analysis to identify differences in the language used to describe different races

    1. Each text is converted into a matrix of word frequencies, transforming it into an entirely numerical dataset.

      unclear how numerical data set is calculated, probability?

    2. The computer is directed to create a set of probability tables populated with random numbers, and then it gradually refines them by computing the same pair of mathematical functions hundreds or thousands of times in a row.

      probability is primary source of this type of system
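
A minimal sketch of the first passage's "matrix of word frequencies", assuming the simplest possible reading: each text is split on whitespace and every distinct word gets a column holding its raw count. The function name `word_frequency_matrix` and the two sample sentences are made up for illustration; the quoted article does not specify its tokenization or weighting. Under this reading the matrix holds plain counts, and probabilities only enter at the later table-refinement step.

```python
from collections import Counter


def word_frequency_matrix(texts):
    """Return (vocabulary, matrix); matrix[i][j] counts vocabulary[j] in texts[i]."""
    # Count lowercase, whitespace-separated tokens in each text (an assumption).
    counts = [Counter(text.lower().split()) for text in texts]
    # One column per distinct word across all texts, in alphabetical order.
    vocabulary = sorted(set().union(*counts))
    # A Counter returns 0 for words a text does not contain.
    matrix = [[count[word] for word in vocabulary] for count in counts]
    return vocabulary, matrix


# Hypothetical sample texts, not taken from the annotated article.
texts = ["the cat sat", "the cat saw the dog"]
vocabulary, matrix = word_frequency_matrix(texts)
```

Each row of `matrix` is one text and each column one word, so the texts are now purely numerical and can be handed to a model.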

    1. Needless to say, any attempts at finding direct correlations between historical events and stylistic breaks are subject to human prejudices

      if data is reliant on researcher's judgement isn't the data skewed?

    2. the method of plotting and inspecting the trend may be applied only to verify hypotheses stipulated earlier by traditional diachronic linguistics.

      suggests that researchers cannot predict this data which is problematic

    1. The results also suggest that all six measures of psychological well-being are lower in middle age compared with younger and older age groups. In addition, women appear to be less affected by unemployment than men.

      why are older and younger individuals more happy than middle-aged individuals?

    1. Kievit et al. Simpson’s paradox. FIGURE 1 | Example of Simpson’s Paradox. Despite the fact that there exists a negative relationship between dosage and recovery in both males and females, when grouped together, there exists a positive relationship. All figures created using ggplot2 (Wickham, 2009). Data in arbitrary units. Here, we argue that (a) SP occurs more frequently than commonly thought, and (b) inadequate attention to SP results in incorrect inferences that may compromise not only the quest for truth, but may also jeopardize public health and policy. We examine the relevance of SP in several steps. First, we describe SP, investigate how likely it is to occur, and discuss work showing that people are not adept at recognizing it. Next, we review examples drawn from a range of psychological fields, to illustrate the circumstances, types of design and analyses that are particularly vulnerable to instances of the paradox. Based on this analysis, we specify the circumstances in which SP is likely to occur, and identify a set of statistical markers that aid in its identification. Finally, we will provide countermeasures, aimed at the prevention, diagnosis, and treatment of SP, including a software package in the free statistical environment R (Team, 2013) created to help researchers detect SP when testing bivariate relationships. WHAT IS SIMPSON’S PARADOX? Strictly speaking, SP is not actually a paradox, but a counterintuitive feature of aggregated data, which may arise when (causal) inferences are drawn across different explanatory levels:

      can all conceptual ideas be explained through the collection of data?

    2. This apparent paradox has significant implications for the medical and social sciences:

      extremely interesting that this seemingly unrelated paradox affects medical and social sciences
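
The reversal described in the Figure 1 caption quoted above can be reproduced with a few invented numbers. This is only a sketch: the (dosage, recovery) pairs below are made up in arbitrary units, the helper `slope` is a plain least-squares slope written for this note, and none of it comes from the Kievit et al. data or their R package.

```python
def slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    return covariance / variance


# Invented (dosage, recovery) pairs: recovery falls with dosage inside each group,
# but one group sits higher on both axes than the other.
males = [(1, 5), (2, 4), (3, 3)]
females = [(6, 10), (7, 9), (8, 8)]

print("males:", slope(males))
print("females:", slope(females))
print("pooled:", slope(males + females))
```

Both within-group slopes are negative while the pooled slope is positive, because the gap between the two groups dominates the trend once they are merged.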

    1. have had sex with either exclusively the same sex or both sexes in the past year; have had sex with either exclusively the same sex or both sexes in the past 5 years; have had at least one same-sex sex partner since 18; have had at least half of sex partners since 18 to be of the same sex.

      answers my previous question

    2. arriving at these results, I am the first to be able to compare self-identified sexual orientation with the sexual behavior proxies utilized in the bulk of this emerging literature

      interested in how the data was collected. is an individual's employer required to know someone's sexual preference?

    1. But a lack of data has complicated efforts to understand the aggregate effects of myriad federal, state, and local efforts to reduce reoffending.

      This data is extremely useful, individuals should be able to collect this information

    2. Longer-term recidivism also fell. Prisoners released in these states in 2010 were 13 percent less likely than the 2005 cohort to return to prison at least once by the end of the fifth year after release

      I wonder what caused this steep decline in the long term

    1. This might include differences in the individual characteristics of prisoners, changes in sentence length of those released, or the relative number of releases from each state.

      how can one attribute a specific reason for the decline

    2. The BJS study includes only those sentenced to one year or more, whereas Pew has included all released prisoners in each year.

      large difference in numbers

    1. n the development of national income accounts and related measures of macroeconomic behavior. The new DAE program is concerned mainly with long-term chan

      this data is much smaller in scale as opposed to collecting data at the macro level

    2. nineteenth century. Similar cycling appears to have occurred in England. A variety of factors, including crop mix, urbanization, occupation, intensity of lab

      extremely interesting that these factors contributed to height and health issues

    1. nough. And what of the numerical probability of total destruction within three centuries

      seemingly relies heavily on numerical information

  3. Sep 2018
    1. Desmond’s ultimate policy solution is a universal housing voucher program, which would certainly alleviate the eviction crisis, but will require a long-term push of political will and does nothing to disrupt the profitability and exploitation in the landlord-tenant relationship that Evicted draws attention to.

      i wish they had given more detail about his voucher program proposal

    2. data gathered by the corporation is not nearly as complete or accurate as that gathered by community organizations in the state.

      relying on big companies to collect data= a lot of money and not as accurate

    3. nor did they articulate a clear political program of how the data would be used to advocate for policies to protect the tenants in the communities we work in. We asked for further clarification on how the

      if they are not creating policies to help those who are encountering this problem what is the point in collecting that data?

    1. it is only appropriate that we will be moved to model it as a database.

      an extremely interesting way to think about the world through the lens of a database

    1. we just need to help make people more aware of the details of doing this in the digital, as well as our physical worlds.

      this article did not do enough to inform me of my basic securities as an "average citizen" online