140 Matching Annotations
  1. Dec 2023
  2. Nov 2023
    1. // NOTE: The element exists on the original form but is hidden and gets rerendered, which leads to intermittent detached DOM issues
       cy.contains('Next').click().wait(4000)
    1. // instead of visiting each page and waiting for all
       // the associated resources to load, we can instead
       // just issue a simple HTTP request and make an
       // assertion about the response body
       cy.request('/admin')
         .its('body')
         .should('include', '<h1>Admin</h1>')

      instead of cy.visit

  3. Oct 2023
    1. Morgan, Robert R. “Opinion | Hard-Pressed Teachers Don’t Have a Choice on Multiple Choice.” The New York Times, October 22, 1988, sec. Opinion. https://www.nytimes.com/1988/10/22/opinion/l-hard-pressed-teachers-don-t-have-a-choice-on-multiple-choice-563988.html.

      Archived copy at https://web.archive.org/web/20150525091818/https://www.nytimes.com/1988/10/22/opinion/l-hard-pressed-teachers-don-t-have-a-choice-on-multiple-choice-563988.html. Internet Archive.

      Example of a teacher pressed into using multiple-choice tests for evaluation because of time constraints on grading.

      He falls prey to the teacher's guilt of feeling the need to grade every single essay written. This may be possible at the higher-paid levels of university teaching with incredibly low student-to-teacher ratios, but not at the mass-production level of public education.

      While we'd like to have education match the mass production assembly lines of the industrial revolution, this is sadly nowhere near the case with current technology. Why fall prey to the logical trap?

    1. Barzun, Jacques. “Opinion | Multiple Choice Flunks Out.” The New York Times, October 11, 1988, sec. Opinion. https://www.nytimes.com/1988/10/11/opinion/multiple-choice-flunks-out.html.

      Archived copy at https://web.archive.org/web/20231022192353/https://www.nytimes.com/1988/10/11/opinion/multiple-choice-flunks-out.html. Internet Archive.

      Barzun takes standardized multiple-choice tests to task.

      A version of this article appears in Barzun's book: Barzun, Jacques. Begin Here: The Forgotten Conditions of Teaching and Learning. University of Chicago Press, 1991. http://archive.org/details/begin-here-the-forgotten-conditions-of-teaching-and-learning.

    2. He pointed out that these questions penalize the more imaginative and favor those who are content to collect facts. Therefore, multiple-choice test statistics, in all their uses, are misleading.

      He = Banesh Hoffman

      This is tangentially similar to Malcolm Gladwell's claim that standardized testing for law school privileges certain types of thinkers over others: it rewards those who are quick under time pressure rather than the slower, more deliberate thinkers who are needed for higher-level functions like the Supreme Court.

      See: The Tortoise and the Hare, S4 E2 of Revisionist History https://www.pushkin.fm/podcasts/revisionist-history/the-tortoise-and-the-hare

      testing imagination versus fact memorization/simple recall compared with thinking quickly under pressure or slowly with time and increased ability to reason

    3. Multiple-choice questions test nothing but passive-recognition knowledge, not active usable knowledge.
    4. But since their adoption, the results of the huge effort and expense of public schooling have been less and less satisfactory.

      their = multiple-choice tests


      Multiple-choice tests usually test for basic facts or simple answers, and aren't well designed for testing complex chains of reasoning, particularly at the lower levels.

    5. arguments in favor of these ''objective'' tests: They are easy to grade; uniformity and unmistakable answers imply fairness; one can compare performance over time and gauge the results of programs; the validity of questions is statistically tested and the performance of students is followed up through later years.

      Some of the benefits of multiple-choice tests.

      Barzun misses the fact that these tests are not just easy for teachers to grade; they are also easy for machines to grade en masse, in a century dominated by the standardization of knowledge and by mechanization for a mass-production-oriented society.

      Cross reference educational reforms of Eliot following the rise of Taylorism.

    6. It is harmful to learning and teaching.

      Barzun calls out multiple-choice tests as harmful to both learning and teaching.

    7. But to the best of my knowledge the central feature of modern schooling has never been taken up: the multiple-choice test.

      Barzun places the multiple-choice test as the central feature of modern schooling. This has a bit of a hyperbolic feel, but it's certainly a modern invention which aims to evaluate a low level of learning while still making it simple for teachers to quickly grade students' work.

      Because of their incredibly low-level function, these multiple-choice tests should be used only for the lowest-level functionality as well.

  4. Sep 2023
    1. We will try to add two tests. One checks the response code, in order to know that our request was successful. The other checks for a response time < 2 sec, in order to understand how fast the request was processed by the server. If the request executes slower than 2 seconds, our test will fail. In this case I use 2 seconds just as an example; it might be a greater or lower number, but 7 seconds is usually the maximum time for request execution. So in order to add tests, go to “Tests” in the request section of the application and add these few lines:
       tests["Status code is 200"] = responseCode.code === 200;
       tests["Response time is less than 2000ms"] = responseTime < 2000;
       When this is done, hit the Send button again and execute your first test.

      Good case -- importance of adding tests to validate response codes and times, ensuring optimal server performance and response.
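The two checks described in the quoted passage can be sketched outside Postman as a plain function. This is only an illustration under stated assumptions: `checkResponse`, its `maxMs` parameter, and the shape of the `response` object are hypothetical names and are not part of Postman's API.

```javascript
// Hypothetical helper: evaluates the two checks from the quoted tutorial
// against a plain object instead of Postman's built-in variables.
function checkResponse(response, maxMs = 2000) {
  return {
    "Status code is 200": response.code === 200,
    [`Response time is less than ${maxMs}ms`]: response.timeMs < maxMs,
  };
}

// A fast, successful response passes both checks.
console.log(checkResponse({ code: 200, timeMs: 350 }));
// A slow response fails only the timing check.
console.log(checkResponse({ code: 200, timeMs: 2500 }));
```

Keeping the threshold as a parameter reflects the passage's point that 2 seconds is just an example value.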

  5. Aug 2023
    1. I ran into the same problem and never really found a good answer via the test objects. The only solution I saw was to actually update the session via a controller. I defined a new action in one of my controllers from within test_helper (so the action does not exist when actually running the application). I also had to create an entry in routes. Maybe there’s a better way to update routes while testing. So from my integration test I can do the following and verify:
       assert(session[:fake].nil?, "starts empty")
       v = 'Yuck'
       get '/user_session', :fake => v
       assert_equal(v, session[:fake], "value was set")
  6. May 2023
  7. Mar 2023
    1. Detailed descriptions, assumptions, limitations and test cases of many popular statistical methods for ecological research can be found in the GUSTAME server (Buttigieg and Ramette, 2014), and in the review by Paliy and Shankar (2016).
  8. Oct 2022
  9. Aug 2022
    1. The narrator considers this as vandalism and finds it hard to believe how anyone "educated enough to have access to a university library should do this to a book." To him "the treatment of books is a test of civilized behaviour."

      Highlighted portion is a quote from Kuehn sub-quoting David Lodge, Deaf Sentence (New York: Viking 2008)

      Ownership is certainly a factor here, but given how inexpensive many books are now, if you own it, why not mark it up? See also: Mortimer J. Adler's position on this.


      Marking up library books is a barbarism; not marking up your own books is a worse sin.

  10. Jul 2022
  11. May 2022
    1. The term independent is considered more appropriate than self, as in self-hosted, considering the latter can give the wrong impression that it only refers to situations where the owners of a website decided to physically host it on hardware that is physically controlled and managed by them.

      This idea of independently hosted versus self-hosted comes up frequently in IndieWeb chat. The IndieWeb doesn't generally participate in the "purity test" of requiring full self-hosting as a result.

  12. Apr 2022
  13. Mar 2022
  14. Jan 2022
    1. Dr. Thrasher wrote a book! (2022, January 8). My cousin wanted to get tested. She waited in an auto testing line for 6.5 hours, and stayed in it bc she was traveling to bury her Daddy. How many people give up in such long lines? How many cases upwards of a million are we losing bc Biden et all failed on home tests? Https://t.co/Q7WVy5qD4v [Tweet]. @thrasherxy. https://twitter.com/thrasherxy/status/1479826389142491146

  15. Nov 2021
    1. Nobody is perfect; nobody is pure; and once people set out to interpret ambiguous incidents in a particular way, it’s not hard to find new evidence.

      Wouldn't it be better for us to focus our efforts and energies on people who are doing bigger mass scale harms on society?

      Surely the ability to protect against some of these small harms undergirds the ability to build up protection against much larger harms.

      Why are we prosecuting these smaller harms rather than the larger (especially financial and) institutional harms?

      It is easier to focus on the small and specific rather than the broad and unspecific. (Is there a name for this as a cognitive bias? There should be, if not. Perhaps it is related to the base rate fallacy or base rate neglect (a form of extension neglect), which is "the tendency to ignore general information and focus on information only pertaining to the specific case, even when the general information is more important." (via Wikipedia))

      Could the Jesuits' descent into the particular as a method help out here?

  16. Sep 2021
  17. Jun 2021
  18. watermark.silverchair.com
    1. Qureshi, A. I., Baskett, W. I., Huang, W., Lobanova, I., Naqvi, S. H., & Shyu, C.-R. (2021). Re-infection with SARS-CoV-2 in Patients Undergoing Serial Laboratory Testing. Clinical Infectious Diseases, ciab345. https://doi.org/10.1093/cid/ciab345

    1. Thus, by adding system tests, we increase the maintenance costs for development and CI environments and introduce potential points of failures or instability: due to the complex setup, flakiness is the most common problem with end-to-end testing. And most of this flakiness comes from communication with a browser.
    2. For example, Database Cleaner for a long time was a must-have add-on: we couldn’t use transactions to automatically rollback the database state, because each thread used its own connection; we had to use TRUNCATE ... or DELETE FROM ... for each table instead, which is much slower. We solved this problem by using a shared connection in all threads (via the TestProf extension). Rails 5.1 was released with a similar functionality out-of-the-box.
    3. even acceptance tests (though the latter are ideologically different)
    1. How to test at the correct level?
    2. As with many things in life, deciding what to test at each level of testing is a trade-off:
    3. Unit tests are usually cheap, and you should consider them like the basement of your house
    4. A system test is often better than an integration test that is stubbing a lot of internals.
    5. Only test the happy path, but make sure to add a test case for any regression that couldn’t have been caught at lower levels with better tests (for example, if a regression is found, regression tests should be added at the lowest level possible).
    6. GitLab is transitioning from controller specs to request specs.
    7. These kinds of tests ensure that individual parts of the application work well together, without the overhead of the actual app environment (i.e. the browser). These tests should assert at the request/response level: status code, headers, body. They’re useful to test permissions, redirections, what view is rendered etc.
    8. These tests should be isolated as much as possible. For example, model methods that don’t do anything with the database shouldn’t need a DB record. Classes that don’t need database records should use stubs/doubles as much as possible.
    9. Black-box tests at the system level (aka end-to-end or QA tests)
    10. White-box tests at the system level (aka system or feature tests)
    1. A common cause of a large number of created factories is factory cascades, which result when factories create and recreate associations.
    2. Test speed: GitLab has a massive test suite that, without parallelization, can take hours to run. It’s important that we make an effort to write tests that are accurate and effective as well as fast.
    3. :js is particularly important to avoid. This must only be used if the feature test requires JavaScript reactivity in the browser. Using a headless browser is much slower than parsing the HTML response from the app.
    4. Use Factory Doctor to find cases where database persistence is not needed in a given test.
    5. Parameterized tests
    6. This style of testing is used to exercise one piece of code with a comprehensive range of inputs. By specifying the test case once, alongside a table of inputs and the expected output for each, your tests can be made easier to read and more compact.
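The table-driven style described in the last quote can be sketched in plain JavaScript without any test framework. This is a minimal illustration, not GitLab's own tooling: `slugify`, `cases`, and `runCases` are hypothetical names chosen for the example.

```javascript
// Function under test (hypothetical example, not taken from GitLab's code).
function slugify(title) {
  return title
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Table of inputs and expected outputs: the test case is specified once
// and then run for every row.
const cases = [
  { input: "Hello World", expected: "hello-world" },
  { input: "  Trim me  ", expected: "trim-me" },
  { input: "Already-slugged", expected: "already-slugged" },
  { input: "Symbols & more!", expected: "symbols-more" },
];

// Runs the function against each row and reports whether it matched.
function runCases(fn, table) {
  return table.map(({ input, expected }) => {
    const actual = fn(input);
    return { input, expected, actual, passed: actual === expected };
  });
}

for (const r of runCases(slugify, cases)) {
  console.log(`${r.passed ? "ok" : "FAIL"} slugify(${JSON.stringify(r.input)}) -> ${JSON.stringify(r.actual)}`);
}
```

Adding a new scenario means adding one row to the table rather than writing another test body, which is where the compactness comes from.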
  19. Apr 2021
  20. Mar 2021
    1. While Europe and other regions of the world continue to contend with severe outbreaks of the coronavirus and its different variants, and are imposing new restrictions as a result, in England millions of residents were permitted today (Monday) to leave their homes, as part of a wave of easing of the lockdown imposed there at the beginning of the year. As part of the wave of easing, residents of England are permitted to leave their homes freely and to gather outdoors in groups of up to six people, from two different households. Outdoor sports activities are also now possible.

    1. Why separate out red tests from green tests? Because my green tests serve a fundamentally different purpose. They are there to act as a living specification, validating that the behaviors work as expected. Regardless of whether they are implemented in a unit testing framework or an acceptance testing framework, they are in essence acceptance tests because they’re based upon validating behaviors or acceptance criteria rather than implementation details.
    1. There’s no need to test controllers, models, service objects, etc. in isolation
    2. Run the complete unit with a certain input set, and test the side-effects. This differs to the Rails Way™ testing style, where smaller units of code, such as a specific validation or a callback, are tested in complete isolation. While that might look tempting and clean, it will create a test environment that is not identical to what happens in production.
    3. Integration tests for controllers: These Smoke tests only test the wiring between controller, operation and presentation layer.
    4. Unit tests for operations: They test all edge cases in a nice, fast unit test environment without any HTTP involved.
  21. Feb 2021
  22. Oct 2020
  23. Sep 2020
  24. Aug 2020
    1. Test Your Readiness: Data Practices

      This seems to be the same overall Readiness test available in all Chapters. Consider segmenting the Readiness test into portions that align with the particular chapter that the learner is in.

  25. Jul 2020
  26. Jun 2020
    1. It is not customary in Rails to run the full test suite before pushing changes. The railties test suite in particular takes a long time, and takes an especially long time if the source code is mounted in /vagrant as happens in the recommended workflow with the rails-dev-box. As a compromise, test what your code obviously affects, and if the change is not in railties, run the whole test suite of the affected component. If all tests are passing, that's enough to propose your contribution.
  27. May 2020
  28. Apr 2020
    1. Then the programmer(s) will go over the scenarios, refining the steps for clarification and increased testability. The result is then reviewed by the domain expert to ensure the intent has not been compromised by the programmers’ reworking.
    1. Enable Frictionless Collaboration CucumberStudio empowers the whole team to read and refine executable specifications without needing technical tools. Business and technology teams can collaborate on acceptance criteria and bridge their gap.
  29. Mar 2020
    1. However, the data did not support a mere similarity effect: Our results were robust to controlling for participants’ own moral judgments, such that participants who made a deontological judgment (the majority) strongly preferred a deontological agent, whereas participants who made a consequentialist judgment (the minority) showed no preference between the two

      But this is a lack of a result in the context of a critical underlying assumption. Yes, the results were 'robust', but could we really be statistically confident that this was not driving the outcome? How tight are the error bounds?

    1. For automated testing, include the parameter is_test=1 in your tests. That will tell Akismet not to change its behaviour based on those API calls – they will have no training effect. That means your tests will be somewhat repeatable, in the sense that one test won’t influence subsequent calls.
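As a rough sketch of the advice above, a test helper can add `is_test=1` to every spam-check request it builds. Only the `is_test` parameter comes from the quoted text; the helper name `buildTestCheckBody`, the `comment` object shape, and the other fields shown are assumptions for illustration, and the endpoint and API key handling are deliberately left out.

```javascript
// Hypothetical helper: builds a form-encoded request body for a spam-check
// call made from an automated test. Field values are placeholders.
function buildTestCheckBody(comment) {
  const params = new URLSearchParams({
    blog: comment.blog,
    user_ip: comment.userIp,
    comment_content: comment.content,
    is_test: "1", // marks the call as a test so it has no training effect
  });
  return params.toString();
}

const body = buildTestCheckBody({
  blog: "https://example.com",
  userIp: "127.0.0.1",
  content: "hello world",
});
console.log(body);
```

Putting the flag in one shared helper makes it harder for an individual test to forget it.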
  30. Jan 2020
    1. Yes; everything needed to run the tests are bundled inside the test suite or executable. There's no connections to foreign processes or systems. I.e, no talking to databases or reading files from disk. If necessary, these connection points are faked / mocked.

      Tests running in isolation don't depend on external systems to work.
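One common way to fake a connection point is to pass the dependency in, so a test can hand over an in-memory stand-in. The sketch below assumes hypothetical names (`greetUser`, `makeFakeUserStore`) and a made-up `findById` interface; it is not taken from the quoted source.

```javascript
// Code under test: it receives a "user store" from outside, so a test can
// substitute a fake instead of talking to a real database.
function greetUser(userStore, id) {
  const user = userStore.findById(id);
  return user ? `Hello, ${user.name}!` : "Hello, stranger!";
}

// In-memory fake standing in for the database connection point.
function makeFakeUserStore(users) {
  return {
    findById: (id) => users.find((u) => u.id === id) || null,
  };
}

const fakeStore = makeFakeUserStore([{ id: 1, name: "Ada" }]);
console.log(greetUser(fakeStore, 1)); // Hello, Ada!
console.log(greetUser(fakeStore, 2)); // Hello, stranger!
```

Because nothing here touches the network or the disk, the test gives the same result on any machine.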

  31. Dec 2019
    1. In React, there are different aspects of UI Testing. We categorize them as follows along with their tooling:
    2. Storybook integrates with Jest Snapshot through an add-on called StoryShots. StoryShots adds automatic Jest Snapshot Testing to our codebase by using our existing Storybook stories as the input for Jest Snapshot Testing.
  32. Nov 2019
    1. If you're writing a tool for developers, it's a really common case that you want to write a test to ensure that a good error or warning message is logged to the console for the developers using your tool. Before snapshot testing I would always write a silly regex that got the basic gist of what the message should say, but with snapshot testing it's so much easier.
    2. These four things lead to a near total loss in the intended utility of integrated/functional tests: as the code changes make sure nothing is broken.
    3. (After all, it's not like the past snapshot was well understood or carefully expressed authorial intent.) As a result, if a snapshot test fails because some intended behavior disappeared, then there's little stated intention describing it and we'd much rather regenerate the file than spend a lot of time agonizing over how to get the same test green again.
    4. They are generated files, and developers tend to be undisciplined about scrutinizing generated files before committing them, if not at first then definitely over time. Most developers, upon seeing a snapshot test fail, will sooner just nuke the snapshot and record a fresh passing one instead of agonizing over what broke it.
    5. Good tests encode the developer's intention, they don't only lock in the test's behavior without editorialization of what's important and why. Snapshot tests lack (or at least, fail to encourage) expressing the author's intent as to what the code does (much less why)
    1. I very rarely use snapshot testing with react and I certainly wouldn't use it with shallow. That's a recipe for implementation details. The whole snapshot is nothing but implementation details (it's full of component and prop names that change all the time on refactors). It'll fail any time you touch the component and the git diff for the snapshot will look almost identical to the one for your changes to the component. This will make people careless about the snapshot updates because they change all the time. So it's basically worthless (almost worse than no tests because it makes you think you're covered when you're not and you won't write proper tests because they're in place).
    2. I should also add that I'm in favor of relying more heavily on integration testing. When you do this, you need to unit test fewer of your simple components and wind up only having to unit test edge cases for components (which can mock all they want).
    1. You want to write maintainable tests for your React components. As a part of this goal, you want your tests to avoid including implementation details of your components and rather focus on making your tests give you the confidence for which they are intended. As part of this, you want your testbase to be maintainable in the long run so refactors of your components (changes to implementation but not functionality) don't break your tests and slow you and your team down.
    2. We try to only expose methods and utilities that encourage you to write tests that closely resemble how your web pages are used.
    3. The more your tests resemble the way your software is used, the more confidence they can give you.
    4. Most of the damaging features have to do with encouraging testing implementation details. Primarily, these are shallow rendering, APIs which allow selecting rendered elements by component constructors, and APIs which allow you to get and interact with component instances (and their state/properties) (most of enzyme's wrapper APIs allow this).
    1. For instance, an integration test could verify that all necessary props are passed through from the tested component to a specific child component.
    1. Snapshot testing is great as it lets us capture strings that represent our rendered components and then store them in a separate snapshot file to compare later, in order to ensure that the UI has not changed. While it is ideal for React apps, we can use snapshots for comparing values that are serialized from other frameworks.
    2. Screenshot Test: Applications are not often screenshot tested. However, if the business requirement is there, screenshot tests can be used to diff two screenshots from the same application state in order to verify whether something (styling, layout, …) has changed. It’s similar to a snapshot test, whereas the snapshot test only diffs the DOM and the screenshot test diffs screenshots.
    3. Snapshot Test: Introduced by Facebook’s library Jest, Snapshot Tests should be the lightweight variation of testing (React) components. It should be possible to create a DOM snapshot of a component once a test for it runs for the first time and compare this snapshot to a future snapshot, when the test runs again, to make sure that nothing has changed. If something has changed, the developer has to either accept the new snapshot test (the developer is okay with the changes) or deny them and fix the component instead.
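The record-then-compare cycle described in these quotes can be sketched with an in-memory store. This is a simplified stand-in and not how Jest is implemented: `createSnapshotStore`, `match`, `update`, and `renderButton` are hypothetical names, and a real tool writes snapshots to files rather than to a `Map`.

```javascript
// Simplified stand-in for a snapshot runner, using an in-memory store
// instead of snapshot files on disk.
function createSnapshotStore() {
  const snapshots = new Map();
  return {
    // The first run records the serialized value; later runs compare against it.
    match(name, value) {
      const serialized = JSON.stringify(value, null, 2);
      if (!snapshots.has(name)) {
        snapshots.set(name, serialized);
        return { status: "written" };
      }
      const previous = snapshots.get(name);
      return previous === serialized
        ? { status: "passed" }
        : { status: "failed", previous, current: serialized };
    },
    // Accepting a change overwrites the stored snapshot.
    update(name, value) {
      snapshots.set(name, JSON.stringify(value, null, 2));
    },
  };
}

// Hypothetical "render" output represented as a plain object.
const renderButton = (label) => ({ tag: "button", children: [label] });

const store = createSnapshotStore();
console.log(store.match("button", renderButton("Save")).status);   // written
console.log(store.match("button", renderButton("Save")).status);   // passed
console.log(store.match("button", renderButton("Submit")).status); // failed
```

The `update` step is the point the critical quotes worry about: overwriting a failing snapshot takes one call and no stated intent.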
  33. Oct 2019
  34. Aug 2019
  35. Mar 2019
    1. This 69-page PDF offers good advice on writing a variety of types of test questions. It is called "Is this a trick question?" Despite the length of the PDF, it is easy to browse if you are interested in writing a specific type of question. As may be suggested by the length, this resource is more comprehensive than others. Rating 5/5

  36. Jan 2019
    1. Although we believe that this study establishes the presence of g in data from these non-Western cultures, this study says nothing about the relative level of general cognitive ability in various societies, nor can it be used to make cross-cultural comparisons. For this purpose, one must establish measurement invariance of a test across different cultural groups (e.g., Holding et al., 2018) to ensure that test items and tasks function in a similar way for each group.

      This is absolutely essential to understanding the implications of the article.

    2. Panga Munthu test of intelligence

      To me, this is the way to create tests of intelligence for non-Western cultures: find skills and manifestations of intelligence that are culturally appropriate for a group of examinees and use those skills to tap g. Cross-cultural testing would require identifying skills that are valued or developed in both cultures.

    1. This means that it is actually easier to measure intelligence than many other psychological constructs. Indeed, some individuals trying to measure other constructs have inadvertently created intelligence tests

      When I learned this, it blew my mind.

    2. the Stanford-Binet intelligence test

      Although the Stanford-Binet is historically important, the Wechsler family of intelligence tests has been more popular since the 1970s.

  37. Aug 2018
  38. Jan 2017
    1. Sara Holbrook had two of her poems used on the Texas state assessment tests. She verifies what I thought as a student. The questions are ridiculous. The test makers seem to think that their interpretation of a work is the only interpretation, and that they can read the author's mind and know their intent.

      "Texas paid Pearson $500 million bucks to administer the tests". Is that right? Was that for just one year? What else could we do with $500 million?

      She mentions a study showing that the results of another standardized test could be predicted pretty well using just three data points about families in the community: the percentage with income over $200K; the percentage in poverty; the percentage with bachelor's degrees. So the standardized test tells you nothing that you can't guess by looking at local incomes and education levels.

      What a scam.

  39. Aug 2016
  40. Oct 2014