So I realized what I value more than the quality of the tests and documentation is that I want somebody to have _used_ the thing.
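Simon considers real-world usage more important than the quality of the tests and documentation, which reflects his focus on whether software is actually useful in practice.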
So I realized what I value more than the quality of the tests and documentation is that I want somebody to have _used_ the thing. If you've got a vibe coded thing which you have used every day for the past two weeks, that's much more valuable to me than something that you've just spat out and hardly even exercised.
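The author argues that when evaluating software, hands-on usage experience matters more than the quality of the tests and documentation, which upends the traditional criteria for evaluating software.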
Most passing SWE-Bench solutions are not accepted by maintainers.
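Most people assume that AI systems which pass automated benchmarks such as SWE-Bench will also perform well in real-world use, but the author points out that the opposite is true: most solutions that pass the tests are not actually accepted by maintainers. This calls the validity of current AI evaluation into question and suggests that automated tests may fail to reflect real-world quality standards.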
This class of bug is insidious because it evades every layer of defense. It will not be caught in development testing — who runs a test for 50 days? It will not be flagged in code review — the logic looks perfectly reasonable.
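Most people assume that code review and testing catch most systemic defects, but the author argues that the particular nature of this bug lets it evade every conventional means of detection. This challenges a basic assumption of software quality assurance and implies that some defects only surface under extreme conditions that routine development processes do not cover.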
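One common way to exercise a time-dependent failure like this without waiting is to inject the clock, so a test can jump forward by weeks in an instant. The sketch below is only an illustration: the source does not say what the bug actually was, and the 32-bit millisecond counter, `FakeClock`, `uptime_ms_32bit` and `has_timed_out` are all hypothetical names made up for the example.

```python
from dataclasses import dataclass

MS_PER_DAY = 24 * 60 * 60 * 1000
WRAP = 2 ** 32  # assumption: uptime is kept in an unsigned 32-bit counter


@dataclass
class FakeClock:
    """Test double for a monotonic clock; time only moves when told to."""
    now_ms: int = 0

    def advance_days(self, days: float) -> None:
        self.now_ms += int(days * MS_PER_DAY)


def uptime_ms_32bit(clock: FakeClock) -> int:
    """Hypothetical buggy code: truncates uptime to 32 bits."""
    return clock.now_ms % WRAP


def has_timed_out(clock: FakeClock, started_ms: int, timeout_ms: int) -> bool:
    """Looks reasonable in review, but breaks once the counter wraps."""
    return uptime_ms_32bit(clock) - started_ms >= timeout_ms


clock = FakeClock()
clock.advance_days(49)
started = uptime_ms_32bit(clock)
clock.advance_days(1)  # crosses the wrap point of the 32-bit counter
timed_out = has_timed_out(clock, started, timeout_ms=60_000)
```

Here a full day has elapsed against a one-minute timeout, yet `timed_out` is false because the counter wrapped in between. With a real clock the same test would need roughly 50 days to show the problem.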
This is the most realistic version of a controller. It tries to mimic real user behaviour. It's the recommended version to use when the goal of the load test is to find out how many concurrently active users the target instance supports.
As of right now the full build takes over an hour to run, and this makes cycling for PRs and quick iterative development very difficult.
Video on the Functional Core, Imperative Shell paradigm. Recommended in the Hypothes.is testing documentation.
the functional core, imperative shell pattern
Link to video on "Boundaries" doesn't go into depth on the functional core, imperative shell pattern. However, this one does: https://www.destroyallsoftware.com/screencasts/catalog/functional-core-imperative-shell
For new code, it’s usually a good idea to design the code so that it’s easy to test with “real” objects, rather than stubs or mocks.
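One way to get code that is easy to test with real objects is to keep the decision logic in pure functions that take and return plain values, and push I/O to a thin outer layer (the functional core, imperative shell idea noted above). A minimal sketch, assuming a made-up feed-filtering feature; `Annotation`, `visible_annotations` and `render_feed` are hypothetical names, not taken from any real codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    user: str
    text: str


def visible_annotations(annotations, blocked_users):
    """Functional core: pure decision logic, plain values in and out."""
    return [a for a in annotations if a.user not in blocked_users]


def render_feed(fetch, blocked_users, write):
    """Imperative shell: performs the I/O and delegates decisions to the core."""
    for a in visible_annotations(fetch(), blocked_users):
        write(f"{a.user}: {a.text}")


# Testing the core needs no stubs or mocks, only real values.
sample = [
    Annotation("alice", "Nice point"),
    Annotation("spammer", "Buy now"),
]
kept = visible_annotations(sample, blocked_users={"spammer"})

# The shell is thin enough that plain callables are sufficient to drive it.
lines = []
render_feed(lambda: sample, {"spammer"}, lines.append)
```

Because the core never touches the network or a database, its tests stay fast and use the same objects as production code.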
We keep our functional tests separate from our unit tests, in the tests/functional directory. Because these are slow to run, we will usually write one or two functional tests to check a new feature works in the common case, and unit tests for all the other cases.
Keep functional & unit tests separate. Functional for common cases, unit for all others.
To run the backend test suite only call tox directly
Probably means, "Call tox directly if you only want to run the backend test suite."
Black Box testing: Software on the rack
Black Box testing is defined as a testing technique in which the functionality of an application is tested without looking at the internal code structure, implementation details and knowledge of internal paths of the software. This type of testing is completely based on software requirements and specifications.

It is important that you achieve optimal test results with software testing without deviating from the test goal. But how do you determine whether you are following the right test strategy? For this you have to follow a number of basic principles.

Test cases, examples and Best Practices
A test case is a series of actions that are performed to determine a specific function or functionality of your application. Test scenarios are rather vague and include a wide range of variables. However, testing is all about being very specific. That is why we need elaborate test cases.
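To make "very specific" concrete, here is a sketch of test cases written as explicit input and expected-result pairs, derived from a requirement rather than from the code. The discount rule and the name `discounted_total` are invented for the illustration.

```python
# Hypothetical requirement: orders of 100.00 or more get a 10% discount,
# smaller orders get none, and negative totals are rejected.
def discounted_total(total: float) -> float:
    if total < 0:
        raise ValueError("total must be non-negative")
    return round(total * 0.9, 2) if total >= 100 else total


# Each test case is specific: a name, one input, one expected result.
TEST_CASES = [
    ("below threshold", 99.99, 99.99),
    ("exactly at threshold", 100.00, 90.00),
    ("above threshold", 250.00, 225.00),
    ("zero total", 0.00, 0.00),
]


def run_cases():
    """Return the list of failing cases; an empty list means all passed."""
    failures = []
    for name, given, expected in TEST_CASES:
        actual = discounted_total(given)
        if actual != expected:
            failures.append((name, given, expected, actual))
    return failures
```

A vague scenario such as "check that discounts work" becomes four cases that each pin down one boundary of the rule.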

STLC - Software Testing Life Cycle
Software Testing Life Cycle (STLC) is defined as the set of activities carried out during software testing. It refers to a testing process with specific steps that must be performed in a specific order to ensure that quality objectives are met.

I am firmly convinced that asserting on the state of the interface is in every way superior to asserting on the state of your model objects in a full-stack test.
The more your tests resemble the way your software is used, the more confidence they can give you.
This is why for a recent Angular+Rails project we chose to use a testing stack from the backend technology’s ecosystem for e2e testing.
There are times to stretch individually and as a team, but there are also times to take advantage of what you already know.
targeting what the user actually sees
The most important guideline to give is the following: write clean unit tests if there is actual value in testing a complex piece of logic in isolation to prevent it from breaking in the future. Otherwise, try to write your specs as close to the user’s flow as possible.
It’s better to test a component in the way the user interacts with it: checking the rendered template.
There’s no need to test controllers, models, service objects, etc. in isolation
Run the complete unit with a certain input set, and test the side effects. This differs from the Rails Way™ testing style, where smaller units of code, such as a specific validation or a callback, are tested in complete isolation. While that might look tempting and clean, it will create a test environment that is not identical to what happens in production.
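The same idea sketched in Python rather than Rails: call the whole unit through its entry point with a set of inputs, then assert on the side effects instead of testing each validation in isolation. `InMemoryStore` and `sign_up` are hypothetical names for the illustration.

```python
class InMemoryStore:
    """Stands in for the database and mailer; records what happened."""

    def __init__(self):
        self.users = []
        self.outbox = []


def sign_up(store, email):
    """The complete unit: normalisation, validation, persistence, notification."""
    email = email.strip().lower()
    if "@" not in email:
        return False
    if email in store.users:
        return False
    store.users.append(email)
    store.outbox.append(f"Welcome, {email}!")
    return True


store = InMemoryStore()
results = [
    sign_up(store, "  Ada@Example.com "),
    sign_up(store, "ada@example.com"),  # duplicate after normalisation
    sign_up(store, "not-an-email"),
]
```

The assertions then target what production would observe: one stored user and one welcome message, regardless of how the validations are arranged internally.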
Excel: Why using Microsoft’s tool caused Covid-19 results to be lost. (2020, October 5). BBC News. https://www.bbc.com/news/technology-54423988
Library author here. I'm always fascinated by new ways people can invalidate my assumptions. I mean that in a sincerely positive way, as it results in learning.
Nivi Mani on Twitter. "I cannot stop smiling! Here is a first peek at the data from our online browser-based intermodal preferential looking set-up! We replicate the prediction effect (boy eats big cake, Mani & Huettig, 2012) using our online webcam testing software @julien__mayor @Kindskoepfe_Lab" / Twitter. (n.d.). Twitter. Retrieved June 15, 2020, from https://twitter.com/nivedita_mani/status/1265556217486815232
Han, H., & Dawson, K. J. (2020). JASP (Software) [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/67dcb
What’s the COVID-19 re-entry plan? Experts debate Europe’s tricky road ahead. (n.d.). Science|Business. Retrieved April 20, 2020, from https://sciencebusiness.net/news/whats-covid-19-re-entry-plan-experts-debate-europes-tricky-road-ahead
development is hard because you have to preserve the ability to quickly improve the product in the future
It is also good practice to make sure that your load testing is functionally correct. Both the performance and functional goals can be codified using thresholds and checks (like asserts).
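A rough sketch of how a performance goal and a functional goal can be evaluated together on the results of a load run. This is plain Python written for illustration, not the API of any load-testing tool; `evaluate`, `percentile` and the sample data are assumptions.

```python
import math


def percentile(values, pct):
    """Nearest-rank percentile of a non-empty list."""
    ordered = sorted(values)
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def evaluate(samples, max_p95_ms, min_check_rate):
    """samples: list of (status_code, duration_ms) pairs from a load run."""
    checks = [status == 200 for status, _ in samples]  # functional check
    check_rate = sum(checks) / len(checks)
    p95 = percentile([ms for _, ms in samples], 95)  # performance goal
    return {
        "p95_ms": p95,
        "check_rate": check_rate,
        "passed": p95 <= max_p95_ms and check_rate >= min_check_rate,
    }


# Made-up samples: 19 fast successes and one slow server error.
samples = [(200, 120)] * 19 + [(500, 900)]
report = evaluate(samples, max_p95_ms=500, min_check_rate=0.99)
```

In this run the latency threshold is met, but the run still fails because one response in twenty was an error. That is the case a purely performance-focused load test would miss.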
"The more your tests resemble the way your software is used, the more confidence they can give you. "
You want to write maintainable tests for your React components. As a part of this goal, you want your tests to avoid including implementation details of your components and rather focus on making your tests give you the confidence for which they are intended. As part of this, you want your testbase to be maintainable in the long run so refactors of your components (changes to implementation but not functionality) don't break your tests and slow you and your team down.
We try to only expose methods and utilities that encourage you to write tests that closely resemble how your web pages are used.
Most of the damaging features have to do with encouraging testing implementation details. Primarily, these are shallow rendering, APIs which allow selecting rendered elements by component constructors, and APIs which allow you to get and interact with component instances (and their state/properties) (most of enzyme's wrapper APIs allow this).
This library is a replacement for Enzyme.
Here are my tools of choice for testing React apps: react-test-renderer for snapshot unit testing; Act API for unit testing React components; Jest for unit and integration testing of JavaScript code; Cypress for end-to-end / UI testing.
Something that I've found helps greatly with testing is that when you have code with lots of nested function calls, you should try to refactor it into a flat, top-level pipeline rather than calling each function from inside its parent function. Luckily, in Clojure this is really easy to do with macros like -> and friends, and once you start embracing this approach you can enter a whole new world of transducers. What I mean by a top-level pipeline is that, for example, instead of writing code like this:

    (defn remap-keys [data] ...some logic...)
    (defn process-data [data] (remap-keys (flatten data)))
    (defn get-data [] (let [data (http/get *url*)] (process-data data)))

you should make each step its own pure function which receives and returns data, and join them all together at the top with a threading macro:

    (defn fetch-data [url] (http/get url))
    (defn process-data [data] (flatten data))
    (defn remap-keys [data] ...some logic...)
    (defn get-data [] (-> *url* fetch-data process-data remap-keys))

Your code hasn't really changed, but now each function can be tested completely independently of the others, because each one is a pure data transform with no internal calls to any of your other functions. You can use the REPL to run each step one at a time and, in doing so, also capture some mock data to use for your tests! Additionally, you could make an e2e test pipeline which runs the same code as get-data but just starts with a different URL. However, I would not advise doing this in most cases, and prefer to pass the URL as a parameter (or at least a dynamic var) when feasible.
For testability, keep code flat: avoid deep call stacks and use pipelines.
You shouldn’t really be doing this anyway – you should have composed them, possibly via IOC.
Can anybody explain his idea in more detail?