looks identical to https://www.amazon.com/Technical-Gears-Axles-Parts-Pieces/dp/B0D21L8HG1/ except for 1 part that I only saw in the other
- May 2026
-
www.amazon.com www.amazon.com
-
readingandwritingyour.world readingandwritingyour.world
-
You’ve probably been warned not to cite Wikipedia as a source for your assignments, right? There are good reasons for that, and even the Wikipedia community acknowledges that Wikipedia is not a reliable source, especially for academic use. However, it can be a very helpful starting point, as long as you keep a few points in mind:
I use wikipedia by visiting at their citation , most of the times wikipedia citation are scholarly articles and research papers.
-
-
physproof.thisness.us physproof.thisness.us
-
I notice the structural fact: in a world short on H100/H200/GB200 inventory, "rival" was a thinner concept than the public framing suggested. Compute is fungible. The lab with the GPUs sells the GPUs. That sentence describes 2026 more accurately than any narrative about ideological alignment between AI labs.
🙏
-
-
docs.google.com docs.google.com
-
The world had ended three years ago.
Backstory delivered as a flat fact. No Voice — no perspective shaping how this lands. We're told, not shown.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study introduces the "Training Village," a valuable system for which solid evidence shows that it enables group-housed rodents to autonomously learn complex tasks while preserving natural social interactions. The platform is flexible, allowing animals to learn multiple tasks sequentially and supporting applications in continual learning. This approach is likely to be of broad interest to behavioral researchers using rodent models in systems and cognitive neuroscience.
-
Reviewer #1 (Public review):
Summary:
The authors introduce the Training Village (TV), an open-source and modular system that allows group-housed rodents to live in enriched home cages while individually accessing a single shared operant box for automated cognitive training. The paper reported the animals' activity both in the operant box and in the home cages, which is novel.
Strengths:
A major strength of the work is that it moves beyond a proof-of-concept and demonstrates sustained box usage, long-term trial accumulation, and compatibility with different task designs.
(1) The platform provided a technical contribution in rodent cognitive neuroscience: obtaining large amounts of behavioral data from complex tasks while reducing experimenter intervention and preserving social housing.
(2) The authors demonstrate that the system can sustain prolonged task engagement (up to 12 months), maintain efficient use of a single operant box.
(3) The manuscript opens interesting opportunities for studying behavior outside standard session-based training. Because animals self-initiate training while remaining in a group-housed setting, the platform has the potential to illuminate relationships among motivation, spontaneous activity, and task engagement that are hard to access in conventional paradigms.
Weaknesses:
(1) One area that would benefit from further clarification is the manuscript's core advance relative to prior automated group-housed training systems, particularly Mouse Academy (Qiao et al., 2018). The authors listed some advantages in the Discussion section; however, those were some minor engineering improvements, and what is more interesting is the scientific question or results that can be asked or obtained from this study. The current study clearly presents a functional and carefully documented platform, but it would help the reader if the authors more explicitly distinguished the present system from earlier related approaches, both in terms of system design and in terms of experimental validation.
(2) At the system level, several of the claimed advantages could be supported more directly with quantitative data. For example, if the double-detection corridor and alarm system are important distinguishing features, it would be valuable to report measures such as detection accuracy, missed detections, co-entry failures, alarm frequency, and the degree of manual intervention required in practice. Similarly, the welfare-related arguments are plausible and important, but would be strengthened by more direct evidence, such as longitudinal body weight data, water intake, or comparison with group-housed no-task controls.
(3) At the experimental level, the manuscript would also benefit from a more detailed characterization of training performance. Although three behavioral paradigms are presented, the data currently shown provide a stronger demonstration of feasibility than of training optimization. For a study focused on automated cognitive training, it would be critical to include more information on learning speed, progression across stages, success and failure rates, and variability across animals. Along the same lines, the comparison with manual training is a useful addition, but a broader benchmark including learning curves, time to criterion, and between-animal variability would make the practical value of the system easier to assess.
(4) The authors claimed that they conducted 3 complex cognitive tasks (3AFC, 2AFC, 2AB) in their setup. However, those 3 tasks are quite basic for rodents and have been demonstrated in many studies, especially comparing tasks implemented in Yu et al., eLife 2025. Therefore, lowering this 'complex' statement is necessary.
(5) The authors claimed that they have successfully implemented the so-called hybrid mode, but it is only briefly described and not supported by citations or data. Since this may be one of the most broadly applicable use cases of the platform, a more detailed explanation of how the system can be integrated with recording workflows would strengthen the manuscript.
(6) The manuscript highlights the opportunity to relate task behavior to home-cage activity and to study individualized behavioral patterns. To better support these aspects, it would be helpful to include more subject-level analyses, rather than relying predominantly on population averages, or alternatively to discuss in more concrete terms which features of the dataset may be especially informative for studying individuality. More generally, the manuscript would benefit from clarifying whether different parameter settings within this group-housed framework may be better suited for maximizing training efficiency versus preserving more naturalistic or socially modulated behavior, and what the implications of these choices may be for interpretation.
(7) In Table S1, 'Touch screen' is task-specific and is not necessarily a metric. 'Testing outside home cage' is also not necessarily an advantage (please clarify if it is). Many other systems implemented different levels of 'Alarm system', which is not reflected in the table.
(8) Table S3 shows important data that help the reader to evaluate the paper's work, thus is deserved to move to the main text.
-
Reviewer #2 (Public review):
Summary:
The Training Village (TV) is an innovative autonomous system for rodent training. By integrating an operant box with a group-housed home-cage environment, this platform enables animals to learn operant behaviors while preserving their social context and interactions, which is an aspect often overlooked in the field. The flexibility and modularity of the TV system allow training across multiple cognitive tasks in a continual learning framework. Furthermore, its remote accessibility and affordability make it a compelling tool for the broader neuroscience community.
Comments:
(1) Social Hierarchy and Access Competition
Previous studies on rodent social hierarchy (e.g., PMID: 21960531) have demonstrated clear dominance structures within group-housed animals. Based on this, one might expect dominant animal(s) to occupy more sessions and trials than subordinate animals by preferentially accessing the operant box. Therefore, it is somewhat surprising to observe a relatively uniform distribution of operant box occupancy across animals (Figure 2a, 2i). As a control, it would strengthen the manuscript to include an independent assessment of social hierarchy (e.g., tube test, barber assay, or similar behavioral metrics) to quantitatively characterize dominance relationships within the cohort. Correlating these rankings with chamber occupancy and trial frequency would significantly strengthen the validation of the system's equity.
(2) Behavioral Saving Effects in Continual Learning
The authors demonstrate that the TV platform allows for the sequential learning of multiple cognitive tasks (Figure S3e). This provides an excellent opportunity to examine a continual learning paradigm. A key hallmark of successful continual learning is the "behavior savings effect", where re-learning a previously acquired task occurs faster than initial learning. For example, if animals are trained sequentially on task A (e.g., 2AFC), then task B (e.g., 2AB), and subsequently re-trained on task A, do they exhibit accelerated re-learning? Including such an analysis would significantly strengthen the claim regarding continual learning capabilities.
(3) Robustness of Multi-Animal Attempt Detection
In the TV platform, only one animal can access the operant box at a time under group-housed conditions. This setup inherently introduces the possibility of "multi-animal attempts", as shown in Figure 2j-k and Figure S2c. While the authors address this using pixel-based classification, additional quantitative validation would improve confidence in this approach. For instance, presenting the distribution of pixel counts for single-animal versus multi-animal events would be informative. Moreover, given variability in body size across animals, a fixed pixel threshold may not be sufficient. It would be helpful to include analyses of classification performance (e.g., Type I and Type II error rates) across different animal pairings within the same cohort.
(4) Protocol Flexibility and Implementation
It would be helpful to clarify how behavioral task protocols are switched within the TV system. Specifically, are task changes applied globally to all animals sharing the operant box, or can they be assigned individually? Additionally, are task sequences pre-programmed prior to the experiment, or can they be modified dynamically during ongoing experiments?
(5) Presentation and Readability
To improve readability, the Discussion section could be streamlined, as it is currently somewhat lengthy and descriptive.
-
Reviewer #3 (Public review):
Summary:
The Training Village (TV) is an open-source automated platform for continuous training and testing of group-housed mice and rats in cognitive tasks. Animals live in enriched multi-compartment home cages and access a single operant box individually through a sorting corridor controlled by RFID identification and real-time video analysis. A Raspberry Pi 5 runs the entire system, manages an adaptive training algorithm, monitors animal welfare, and allows remote supervision via a graphical interface and Telegram alarm system. The system is validated across 12 groups totaling 121 animals, three cognitive paradigms of varying complexity, and experiments lasting up to 12 months.
Strengths:
(1) The open-source implementation is probably the paper's strongest point. The authors provide not just code but 3D-printable designs, a full bill of materials with costs (~5500€ total), assembly instructions, and a dedicated website. The estimated build time of 2-7 days is credible. In the current landscape of methods papers, this level of documentation is the minimum necessary to allow other laboratories to actually adopt and propagate the system - and the authors deliver it fully. The compatibility with two operant box designs, three cognitively distinct tasks, and two species - demonstrated empirically rather than merely claimed - makes the modularity argument credible and distinguishes the TV from systems designed around a single paradigm. Finally, the combination of automatic weighing at each exit, temperature and humidity tracking, and a granular Telegram alarm system (Table S2) represents a meaningful practical contribution. For a system operating 24/7 without daily human supervision, this level of welfare monitoring is a necessity, and it seems well implemented here.
(2) With 121 animals across 12 groups, three distinct cognitive paradigms, two species, and longitudinal data spanning up to 12 months, the validation effort is substantial. The authors acknowledge the limitations of their comparisons - notably that the TV vs. manual training comparison is not a controlled experiment. The rat dataset is limited in scope, but the authors at least demonstrate that the system can be adapted to a second species, which is a useful proof of concept. The demonstration that task engagement increases progressively over 12 months (Fig. 3g) is a novel observation at this temporal scale, with practical implications for the design of long-term experiments.
(3) The demonstration that operant box usage is distributed nearly uniformly across animals (Gini < 0.15 in all groups) is carefully demonstrated and addresses a question that any laboratory considering this type of system will legitimately ask, e.g., whether dominant individuals monopolize access at the expense of subordinates. This has been shown before in comparable systems, but remains a necessary validation for each new implementation. The control condition removing temporal constraints (Figure S4) adds useful mechanistic insight into the role of the refractory interval. However, the interpretation of this result deserves more nuance than the authors provide - see Weaknesses.
Weaknesses:
(1) The TV is more than an automation tool; its architecture makes the most sense if one intends to study how spontaneous home cage behavior relates to individual cognitive performance, and the introduction and discussion explicitly frame this as a key application. Yet the analysis delivers only group-level descriptive results, and the cognitive data are presented almost exclusively as group averages. The individual-level questions that the system is uniquely positioned to address (do stable home cage behavioral profiles emerge across animals, do animals learn at the same rate and using the same strategies, and do these dimensions correlate with each other ) are never asked. This is particularly relevant given that enriched social environments are precisely the conditions under which stable inter-individual differences tend to emerge spontaneously, even among genetically identical animals (Freund et al., 2013, Science), and that comparable systems have already linked such profiles to cognitive and neurochemical phenotypes (Torquet et al., 2018, Nature Communications). The TV clearly has the data to begin exploring this - doing so would substantially strengthen the paper's scientific contribution beyond its methodological value.
(2) Sustained daytime operant box usage in nocturnal animals deserves more discussion: Box occupancy during the light phase remains around 75% - only modestly below the ~85% seen at night (Fig. S5a-b). The authors conclude this reflects "sustained engagement with the task throughout the circadian cycle," but other explanations are not considered: residual thirst driving animals to seek sucrose water during the day, and the refractory interval mechanically redistributing sessions into the light phase? A more explicit discussion of the consequences of 24/7 unsupervised testing for data quality (daytime sessions may yield noisier behavioral data?) would be useful.
(3) The finding that all animals access the operant box in roughly equal proportions (Gini < 0.15) is practically important and carefully demonstrated. However, the authors' interpretation that animals self-organize in an egalitarian manner despite known social hierarchies deserves a note of caution. The system design itself constrains monopolization: the refractory interval imposes the same waiting time on all animals regardless of social rank, and session duration determines how often the box becomes available. The no-constraint control (Figure S4) partially addresses this but was run on already-trained animals, limiting its interpretive value. The key practical message, that all animals can access the task regularly under the proposed design, is well supported. Whether this reflects genuine social tolerance or is primarily a consequence of system constraints is a subtler question that the current data cannot fully resolve.
(4) The rat cohort consists of a single group of 6 female Long-Evans rats, yet species comparisons are drawn across multiple dimensions (daily sessions, task engagement, performance...). Observed differences could reflect group size, sex, strain, reward calibration, or simple individual variability rather than species differences. These results should be presented for what they are: a useful proof of concept showing the system works with a second species, not a basis for comparative conclusions.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This study provides a valuable contribution to our understanding of the neural basis of perceptual decision-making by jointly modeling behavioral outcomes and EEG signals in a contrast comparison task. The methods and analyses are solid, systematically comparing standard models assuming continuous evidence accumulation with models that track evidence without temporal integration (extrema detection). The authors show that behavior and neural signals are equally consistent with both alternatives, highlighting limitations in current modeling approaches and questioning the generality of evidence accumulation mechanisms.
-
Reviewer #1 (Public review):
Summary:
This paper examines whether humans use protracted temporal integration in a noise-free, deferred-response contrast discrimination task, using a covert evidence-duration manipulation combined with EEG (SSVEP, CPP, Mu/Beta). The key finding is that evidence for protracted sampling is behaviorally and neurally supported, but even joint CPP + behaviour fitting cannot fully discriminate a standard integration (DDM) model from a novel "extremum-flagging" non-integration model. The paper is transparent about this outcome.
Strengths:
This is a well-conducted and well-written study that makes a genuine contribution to the perceptual decision-making literature by introducing a clean experimental design for probing temporal integration without participants adapting their strategy and demonstrating for the first time that a non-integration model (extremum-flagging) can replicate CPP waveform dynamics that have long been considered hallmarks of evidence accumulation. The transparent treatment of equivocal modelling outcomes is commendable.
Weaknesses:
My main concerns relate to statistical power, the under-specification of the and the extremum-flagging mechanism. Addressing these would greatly strengthen the paper.
(1) The sample of 16 participants (15, after the exclusion of one participant) is described as "close to similar EEG studies" with no formal power analysis. Given that the paper's core claim rests on subtle quantitative differences between two model classes - differences that are, by the authors' own admission, not sufficient to declare a winner - even a modest increase in sample size might yield a more decisive outcome. At a minimum, the authors should report a sensitivity analysis or post-hoc power calculation to indicate what effect sizes the current N could reliably detect, particularly for the rmANOVA comparisons and the neural constraint fitting.
(2) The Extremum-flagging model is the paper's most novel contribution, yet its physiological basis is underspecified. The model posits that each decision-terminating bound-crossing triggers a stereotyped, half-sine-shaped centroparietal signal, but no neural circuit or computational mechanism is proposed for how the brain could detect the first bound-crossing event in a non-accumulating evidence stream or generate a temporally precise, fixed-amplitude signal in response. Possible connections to P3b theories of context updating and response facilitation are acknowledged, but these are vague functional descriptions rather than mechanistic accounts. I think the discussion should engage more directly with potential neural substrates that could generate this flagging signal, and whether these are consistent with the known generators of the CPP/P3b. Without this, the extremum-flagging model risks being viewed as a mathematical convenience rather than a biologically plausible alternative.
(3) The Integration model at the preferred neural weighting estimates a high-to-low contrast drift rate ratio of 8.7, whereas the empirical Mu/Beta lateralization slopes suggest a ratio of approximately 3.5. The authors attribute this discrepancy to the nonlinear contrast response function of early visual cortex and the salience of the high-contrast evidence onset, but these explanations are speculative. These outcomes are arguably the most quantitatively damaging result for the integration model, so they deserve more than a brief discussion. I would recommend that the authors (a) estimate what range of contrast response nonlinearities would be required to close this gap, (b) test whether an alternative drift rate parameterization (e.g., scaling drift rates directly by SSVEP amplitude rather than contrast) reduces the discrepancy, or (c) be more explicit about treating this as a point against the Integration account.
(4) The sensitivity analysis over neural constraint weightings (w = 0.1 to 1000) is thoughtful, but the paper ultimately acknowledges that the preferred weighting is w=10, chosen because it achieves "a good fit to CPP dynamics without substantively sacrificing behavioral fit" - a qualitative criterion. No principled statistical framework is used to select the optimal weighting or to compare models at a given weighting. A Bayesian model comparison could provide a more formal framework for combining behavioral and neural fit components, and would allow a clearer statement about the relative posterior probability of each model.
-
Reviewer #2 (Public review):
Summary:
The manuscript by Hajimohammadi, Mohr, O'Connell and Kelly is intended to demonstrate that participants integrate evidence over time to make a decision, even in a noise-free, static decision context. This is validated by the observation that (1) participant accuracy improves with increased exposure to the stimulus; and (2) there is a correlation between participant accuracy and a neural index of evidence accumulation, as measured by centro-parietal positivity (CPP).
Strengths:
(1) Joint modelling of accuracy and CPP dynamics is a significant achievement, as behaviour alone often cannot distinguish between competing theories of decision-making. In the case of protracted sampling in particular, the absence of reaction times (RT) due to the delayed nature of the response makes this method highly appealing.
(2) The experimental manipulations and the method used to extract the different neural indices are well chosen, enabling the mapping of putative cognitive processes such as evidence accumulation and motor preparation onto the recorded EEG with clarity.
(3) The in-depth discussion of the results clearly articulates those reported by the authors and in previous works.
Weaknesses:
(1) One main issue to support the interpretation of the authors toward the need for protracted sampling is the timing of the evidence. By design, participants believe that the signal is present for 1.6 seconds (reinforced by the fact that easy trials were displayed for 1.6 seconds). However, the difference in stimuli is turned off either 1.4, 1.2, 0.8 or 0 seconds before the cue to respond. While this makes sense in the context of the authors' question, it also raises the possibility that participants will focus on the last samples before answering. Even if participants apply equal weighting, this still favours them delaying evidence accumulation until they are sufficiently certain that the evidence should be present (e.g. participants might start accumulating after the stimulus has disappeared in the 0.2 condition). I do not see an easy way to test these alternative explanations outside of running a study in which the evidence is always offset before the go cue.
(2) Regarding the behavioural models, are these identifiable based on accuracy data alone? This should be addressed using a parameter recovery study, in which a set of parameters is used to generate data, and the same fitting routine used for the real data is used to estimate the parameters. This would enable us to determine what can be inferred from the model comparison presented. This is not a serious problem for the manuscript, as it specifically aims to go beyond behaviour. It is, however, worth noting that such a parameter recovery addition could be used to demonstrate the need for a joint modelling framework to answer the question of protracted sampling on delayed response times (RT).
Minor comments:
(1) I would advise authors to fix the D1 parameter and use it as a scaling parameter across all models. Currently, as I understand it, the models are scale-free, meaning the same fit is achieved by multiplying all parameters by two, for example. This makes the fit more complex (bounds on parameter values are required) and means that the models are less comparable in terms of their estimates. Perhaps I'm missing something, but I would have thought that fixing D1 (the common parameter across all models) would solve these issues.
(2) Why is the snapshot model so bad despite being a good model in Stine et al 2020? Can the authors speculate in the discussion?
(3) The meaning of the flag width is unclear. Figure 4 provides the reader with an intuitive understanding of the model that the authors have in mind. However, the tables in the appendices report values between 0.2 and 0.9. I understand that these values represent the width of the half-sine in seconds. This suggests that the actual estimated values for these flag events are much broader than those displayed in Figure 4. While this is probably fine for most models, it can be problematic for the extremum-flagging model, as it means that the rise to the peak takes between 0.1 and 0.45 seconds. While strictly speaking, this is still a 'flag' model, such a slow rise to the peak, given the usual expectation of evidence accumulation, would place this model closer to a smooth integration model than to a boundary-crossing flagging mechanism.
(4) In the modelling section, it is not clear overall (i.e. for G² and R²) how the participant dimension is taken into account. Are these individually fitted models, and if so, how are the secondary statistics generated from the individual estimates? Or were these fitted over all participants?
(5) On page 7, in the last sentence of the first paragraph of the section titled 'Decision-Related Neural Signals', the authors state that 'this stable contrast-difference encoding suggests that a constant (i.e. non-adapting) drift rate is a reasonable simplifying model assumption'. However, I am not sure how this is true given that SSVEP quantifies encoding, yet the drift rate can vary due to non-sensory aspects (e.g. attention).
(6) The mu/beta lateralisation does indeed favor the integration model more, but in terms of boundary estimation and starting-point analyses, both models are pretty far apart. Providing an interpretation of this observation, e.g. regarding alternative linking functions for mu/beta, would add to the manuscript.
-
Reviewer #3 (Public review):
Summary:
The authors aim to compare proposal models of perceptual decision making using a joint modeling approach, where they fit models to both behavioral outcomes as well as CPP. Most notably, they compare a standard evidence accumulation model with models that track the evidence without integrating it over time (extrema detection). The authors report that the joint CPP-behavioral data do not discriminate between two of their proposals.
Strengths:
This is an interesting finding that reinforces the idea that what we believe to see based on aggregation over trials may not be what happens on every single trial. The models are creative, and the simulations are convincing, relating the models to multiple neural markers of decision formation. These include the CPP but also mu/beta power spectra.
Weaknesses:
The paper makes some strong points, and the work seems generally well-executed. The weaknesses that I identified are twofold:
(1) Embedding in the literature/exposition of the main argument.
The focus in the introduction is on the noise-free nature of the stimulus and the prolonged presentation time. However, after reading the paper, I felt these were mostly experimental design choices that enable comparison of the different models using the CPP. Perhaps my misreading of the goals of the paper stems from two other observations:
a) The fact that the stimulus is noise-free does not entail that perception is noise-free. Thus, the argument that using a noise-free stimulus precludes the necessity of temporal integration seems not completely valid. Of course, one could argue that noise is limited in this case, but that makes a noise-free stimulus more of a design choice.
b) The focus on prolonged stimulus presentation, but at the same time the contrast with expanded judgement, did not make sense to me. Perhaps, as a non-native speaker, I am misreading the subtle difference between "protracted sampling" and "longer sampling", but again, the longer duration seems mostly a design choice.
More could be said about the optimality of the extrema detection methods. In particular, decades of work (centuries?) have shown that evidence integration is an optimal decision-making procedure: For example, the Sequential Probability Ratio Test is Bayes-optimal wrt mean RT (Wald, 1946); evidence accumulation together with collapsing threshold serves to maximize rewards in repeated choices (e.g., Bogacz et al., PsychRev, 2006; Boehm et al. APP, 2020). Given all this work, why would the brain have evolved to adopt a different mechanism? I realize that the paper is not about optimal decision making, but some discussion of this point seems warranted.
(2) Modeling choices.
The authors introduce a parameter, sampT, that represents uncertainty in the sampling onset time. It was not clear to me whether this parameter represented an offset of all trials, or a distribution (probably the latter). I wonder how exactly this parameter was integrated into the models, and in particular, if and how it interacts with the starting-point parameters. My intuition is that on a single-trial, IF early sampling occurs, you can model that with either a negative sampT and z at 0, or with sampT at 0 but a shift in z. This would suggest trade-offs between these parameters, making them hard to estimate independently. Since the paper does not depend on the identification of parameter estimates, this may not be a huge problem, but nevertheless it is good to explore the consequences.
The way the Bounded Integration model (BIntg) is formulated seems very close to the EZ-diffusion model (Wagenmakers et al., PBR, 2007). This model states that the proportion of correct responses Pc = 1/(1+exp(-B*D/s^2), with B and D the bound and drift rate parameters, respectively. However, filling in the numbers for the high contrast condition from Table 2, and assuming that s=2 (because the model description states that dt=2, with s undefined), I get a Pc of 80% for the 1.6H condition. This seems substantially less than what Figure 2 suggests.
On some occasions, it is unclear to me what modeling choices are being made:
a) It seems as if the models are fit on accuracy data alone (before introducing the neural data). This seems suboptimal given that the authors do report differences in RT.
b) Are the models fit on all data combined, or on the data of individual participants? Fitting individual participant data is preferred, as combined or aggregated data may be distorted by individual differences.
c) The authors seem to suggest that the diffusion coefficient s is estimated (in the section "Integration models"). Most likely, however, this is set to a fixed value. Obviously, it matters for the model comparison using AIC whether this parameter was freely estimated or not.
Not really a weakness, but I wondered about the effect of stimulus duration on RT. In particular, what hypothesis (or post hoc explanation) do the authors have for these RT effects? I could think of at least three hypotheses that are consistent with the behavioral data:
a) H1: The shorter the evidence duration, the more likely participants are to require a double-check before response execution, reflecting their uncertainty about their decision.<br /> b) H2: There is a collapsing threshold that initiates at stimulus offset, leading to quicker responses on trials where there is more evidence.<br /> c) H3: motor preparation is correlated with the evidence signal, which leads to faster responses on trials with more evidence.
-
-
readingandwritingyour.world readingandwritingyour.world
-
Scholarly sources are generally found in different, more specialized databases. Google scholar is one example, as is the San Francisco State University Library’s OneSearch, along with a host of other databases that are available through the library.
Agreed that google scholar is a good source to find credible sources.
-
-
readingandwritingyour.world readingandwritingyour.world
-
A useful way to think about your working thesis is by thinking about your “public motive” for researching. Miller and Jurecic (2015) introduce this idea in terms of the intersection of your personal curiosity, interest, and experience with the public goals of your research
I agree that that a working thesis should a general inquiry about a public issue.
-
Kohn advocates for an education system that cultivates children’s authentic curiosity at a young age. By starting the inquiry process early in a child’s education, or in the First Year of college for that matter (which is considerably later, but important nonetheless), students come to see learning not as reading, memorizing, and providing the correct answe
Kohn suggests that curoristy at young age empowers childerns education.
-
-
www.biorxiv.org www.biorxiv.org
-
eLife Assessment
This fundamental work significantly advances our understanding of the circuit-level implementation of predictive processing by elucidating the functional influence between putative prediction error neurons in layer 2/3 and putative internal representation neurons in layer 5. The evidence demonstrating that neither the hierarchical nor the non-hierarchical variant of predictive processing fully accounts for the presented data is convincing. Moving forward, this line of work would benefit from explicitly comparing different theories, thereby clearly articulating the points raised in this paper.
-
Reviewer #1 (Public review):
Vasilevskaya and Keller test different models of cortical function through the lens of predictive processing, a powerful framework for the brain to learn and predict the statistics of the world via generative internal models. The authors use a clever combination of behavioral perturbations in closed-loop and open-loop visuomotor virtual reality assays, a paradigm the Keller lab pioneered and used effectively in the past decade, in conjunction with two-photon imaging of neuronal calcium responses and targeted optogenetic perturbations of activity. They specifically put to test proposed hierarchical vs. non-hierarchical circuit implementations of predictive processing by analyzing the logic of inter-lamina interactions (superficial vs. deep; L2/3 vs. L5/6).
The authors conclude that both versions of predictive processing architectures they analyze are likely invalid, and instead formulate an alternative novel model of cortical function based on a recently developed machine learning algorithm for self-supervised learning (joint embeddings of predictive architectures, JEPA) and its further refinements. JEPA borrows elements from predictive processing, engaging two encoder networks and training the output of one network to predict the output of the other. In their new model of cortical computations, prediction error neurons in L2/3 compare the deep layers (L5/6) activity, which is taken as a teaching signal, to a local, L2/3 prediction of this latent representation.
Specifically, the authors build on their previous work and reports from other groups that different sets of L2/3 neurons compute positive prediction errors (fire when sensory stimuli appear unexpectedly with respect to the movements of the animal; e.g., grating onsets in the absence of locomotion) and respectively negative prediction errors (fire when sensory stimuli are absent, while the brain expected them to be present; e.g. mice locomote but visual flow is suddenly halted - visuomotor mismatches). These L2/3 positive and negative prediction error neurons exchange messages with neurons in the deeper cortical layers that, the authors propose, build an internal representation (R) of the sensory stimuli given the animals' movements.
In the hierarchical model, internal representation neurons (R) are supposed to act as a teaching signal for both types of prediction error neurons; the output of the positive prediction error neurons is assumed to suppress activity of R such that the error between the teaching signal and the prediction is minimized; similarly, in the non-hierarchical version, R serves as a prediction for the prediction error neurons, and in turn it receives excitatory drive from the positive prediction error neurons and negative input from the negative prediction error neurons.
The authors find that the functional impact of L5 neurons on L2/3 neurons is not compatible with the non-hierarchical architecture they and other groups proposed, but rather in accordance with the hierarchical model. At the same time, the functional impact of L2/3 neurons (positive vs. negative prediction error neurons) on L5 neurons (internal representation) appears not compatible with the hierarchical model, but rather in accordance with the non-hierarchical implementation.
They further hypothesize that L2/3 prediction error neurons don't use sensory input, but rather the L5 activity as a teaching signal, and test it using perturbations (halts) of optogenetic stimulation of L5 neurons coupled with locomotion (Figure 7).
All in all, the question is topical, and the new model addresses a decades-long quest to develop a unifying model of cortical function. The findings reported here transform our understanding of cortical computations, opening new, exciting avenues for future investigation. The experimental design and execution are rigorous; the arguments are clearly laid out (in spite of ample potential for confusion given the numerous loops and sign flips). These include a discussion of why the non-hierarchical model proposed by the same group does not hold, as well as potential caveats in interpreting the results and novel testable proposed experiments emerging from the JEPA-like model.
I have several questions about the interpretations of some of the claims and suggestions for potential additional experiments and analyses.
(1) Some of the pieces of the puzzle remain to be identified and demonstrated: the existence of internal representation neurons in L2/3 and ascertaining that the L5/6 neurons analyzed function indeed as internal representation neurons. The authors find that stimulation of L2/3 positive prediction error neurons enhances activity of L5 neurons...If L5 neurons hold a latent representation that serves as a teaching signal for L2/3 neurons (as the authors posit), wouldn't one expect that the input they receive from the positive prediction neurons be suppressive, such that the error is further minimized?
(2) Do the authors envision any specific differences between the representations of the two encoder networks posited to exist in L2/3 and L5 in the JEPA-like implementation? Are they synchronous/offset in their temporal representations, or any other features?
(3) Where is the prediction coming from onto L2/3 neurons? Is it emerging locally in L2/3 from the putative internal representation neurons, or is it long-range - as work from the authors previously proposed? Or a mix of both?
(4) What is the role of the indiscriminate L4 input that appears to enhance activity of both positive and negative prediction error neurons in L2/3?
(5) Does Figure 7D change in a meaningful manner if the authors plot the correlation between optomotor mismatch response and visuomotor mismatch response specifically for the negative prediction error neurons in L2/3 (Adamts-2) rather than for all L2/3 cells sampled?
(6) Do the optomotor mismatch responses in L2/3 neurons depend on how long the closed-loop coupling of optogenetic stimulation of Tlx3 L5 neurons and locomotion speed has been in place for?
-
Reviewer #2 (Public review):
This manuscript reveals the functional connectivity of two different classes of cortical neurons that respond in opposite ways to mismatches between sensory and top-down inputs. These data are very valuable because different theories of information processing in the cortex make different predictions on the patterns of connectivity of these neurons. Therefore, these data strongly constrain possible theories of cortical processing.
General comments:
(1) The methods of statistical testing are insufficiently described. I did not understand the description in lines 1105-1119. The authors should provide sufficient details so the reader can reproduce their analyses. For example, it may be helpful to provide specific details of the testing procedure for one of the comparisons (e.g. the first comparison in Table S1).
(2) The authors should clarify how the problem of multiple comparisons was addressed for comparisons performed in multiple moments of time, where significance is indicated by a black bar (e.g. in Figure 2F).
(3) It would be helpful to add a figure in the Discussion summarising the functional connectivity suggested by all experiments.
(4) Throughout the manuscript, the authors use the term "teaching signals", but I am unclear what they mean by it: after reading the definition in lines 45-46, I thought that they corresponded to values (as they are compared to sensory signals). Later (428-430), the text suggests that they correspond to error neurons. But then lines 605-607 say it is not an error signal. The authors should define teaching signals very precisely or remove this term.
-
Reviewer #3 (Public review):
Vasilevskaya and Keller set out to experimentally distinguish between two variants of predictive processing: a hierarchical and a non-hierarchical variant. The hierarchical variant assumes a hierarchical organization in which internal representation neurons (believed to be a subset of layer 5 excitatory neurons) serve as a source of a teaching signal for local prediction error neurons as well as for the next higher level of the hierarchy, while simultaneously providing prediction signals to the preceding lower level. In contrast, the non-hierarchical variant posits that these layer 5 internal representation neurons provide local predictions to layer 2/3 prediction error neurons.
The interaction between internal representation neurons and prediction error neurons differs fundamentally between the two variants. In the hierarchical variant, internal representation neurons excite positive prediction error neurons and inhibit negative prediction error neurons, while at the same time being inhibited by positive prediction error neurons and excited by negative prediction error neurons. In the non-hierarchical variant, this pattern of connectivity is reversed.
This work is very exciting, timely, and carefully executed. The authors functionally, and later molecularly, identify layer 2/3 prediction error neurons in V1 and probe their interactions with genetically defined neuron types in cortical layers 5 and 6 using optogenetics. They demonstrate that the functional influence of putative prediction error neurons in layer 2/3 onto layer 5 is incompatible with the hierarchical variant, whereas the influence of layer 5 onto putative prediction error neurons in layer 2/3 is incompatible with the non-hierarchical variant. They then test an alternative hypothesis, in which layer 2/3 responses resemble prediction errors with respect to perturbations of artificial layer 5 activity patterns. To investigate this, they designed an experiment in which optogenetic activation of L5 IT neurons was closed-loop coupled to the mouse's locomotion speed in the absence of visual feedback, allowing them to probe the causal influence of L5 activity on layer 2/3 responses.
Finally, the authors hypothesize that their data are more consistent with a joint embedding predictive architecture (JEPA) and outline experimentally testable predictions arising from this framework.
While the work is overall convincing and significantly advances our understanding of the circuit-level implementation of predictive processing, there are a few weaknesses that should be addressed or discussed:
(1) The authors define putative positive prediction error neurons as the 15% of neurons most responsive to grating onset and putative negative prediction error neurons as the 15% most responsive to visuomotor mismatch. While this selection would be expected to overlap with negative and positive prediction error neurons, the criterion is not sufficiently stringent (independent of the exact percentage chosen). In particular, classification of a neuron as a prediction error neuron should ideally be accompanied by evidence that it does not exhibit a significant increase in activity when the prediction matches the sensory input or teaching signal.
(2) The authors "speculate that the prediction error responses in layer 2/3 may not be computed with respect to sensory input, but with respect to layer 5 activity as a teaching signal." However, it is unclear how this perspective differs from earlier statements in the manuscript. In the Introduction, the authors note that "these signals, typically referred to as sensory signals, we will refer to as teaching signals," and later describe the hierarchical variant as one "in which internal representation neurons act as a source of the teaching signal." Given this framing, it is difficult to identify what is conceptually novel in the updated view. Is the key distinction that layer 2/3 neurons are now proposed to generate predictions in an internal representation space rather than in sensory input space, as briefly suggested in the Discussion? Or are the authors introducing a distinction between an external (sensory) and an internal (cortical) teaching signal? If so, this distinction should be made explicit. Clarifying this point would considerably strengthen the manuscript.
(3) The authors propose that "L2/3 neurons predict L5 activity, hence making predictions in the internal representation space rather than the input space," and further suggest that, since both deep and superficial cortical layers receive thalamic input, the cortex may function like a JEPA. This idea appears closely related to the model introduced by Nejad et al. (2025), which effectively implements a JEPA-like architecture: L5 activity serves as a target against which L2/3 predictions are compared in a self-supervised manner, with both L5 and L2/3 (via L4) receiving thalamic input. It would be helpful for the authors to clarify how their framework differs from that model, and to specify the key conceptual or mechanistic distinctions between the present proposal and the approach described by Nejad et al..
-
-
-
"It from Bit" thesis is not a thought experiment. The compute that produces the next token is a physical artifact whose location, power source, and thermal envelope are subject to active engineering choice
Claudossus
-
-
www.anthropic.com www.anthropic.com
-
If we can better understand the potential for threats to be exacerbated by AI systems, society can more easily become resilient to this changed threat landscape.
大多数人认为AI威胁主要是技术问题,需要技术解决方案。但作者暗示社会适应和韧性建设可能同样重要,甚至更重要。这挑战了纯技术解决AI安全问题的主流观点,强调了社会适应的必要性。
-
If an intelligence explosion was upon us, what intervention points would facilitate slowing or otherwise changing the rate of the explosion? Assuming humans can intervene, which entities should wield this capacity—governments? Companies?
大多数人认为AI发展速度是不可阻挡的,技术进步只会加速。但作者提出可能存在干预点来减缓AI爆炸式增长,甚至质疑政府或公司是否应该拥有这种控制权。这挑战了技术发展的不可阻挡性假设,暗示人类可能对超级智能发展有更多控制力。
-
When AI is applied in more conventional domains, like increasing integration into command and control systems, does it benefit the attacker? More generally, how will AI change the character of human conflict?
大多数人认为AI防御系统会增强人类安全,但作者提出AI可能从根本上改变攻防平衡,甚至在传统领域使攻击者获得优势。这一观点挑战了技术进步通常增强防御能力的传统认知,暗示AI可能使冲突更加危险和不可预测。
-
If AI substantially reduces the centrality of paid work in human life, what conditions will allow people to reallocate their time and effort toward other sources of meaning, and what can we learn from historical or contemporary populations where work has been scarce or optional?
大多数人认为工作是人类身份和意义的核心,但作者质疑这一基本假设,暗示AI可能使工作变得非必要,这挑战了现代社会对工作的核心价值认知。作者暗示我们需要重新思考人类在没有工作的情况下如何找到意义,这与主流经济和社会观念相悖。
-
-
reader-prod.gls.pearson-intl.com reader-prod.gls.pearson-intl.comReader+1
-
Where is the library?
-
-
www.dndbeyond.com www.dndbeyond.com
-
Nursemaid’s Suite
PC INFO: Dust and cobwebs shroud an elegantly appointed bedroom and an adjoining nursery . Double doors set with panes of stained glass pull open to reveal a balcony overlooking the front of the house. Standing near the open door to the nursery is the gostly form of a woman. She raises a finger to her lips as if to shhh you.
DM INFO: Should the PC's decode to enter the nursery or speak in normal voices after being warned the Specter Attacks.
-
-
-
This foundational research is part of the core engine powering our multi-agent product: Sakana Fugu
作者将他们的多智能体产品描述为'核心引擎',暗示其重要性超过了单一模型方法,这挑战了当前市场上大多数AI产品基于单一大模型的架构设计理念。
-
We believe the future of AI isn't just about scaling monolithic models, but engineering collaborative, diverse AI ecosystems that can adapt and combine their strengths.
作者直接挑战了当前AI行业的发展方向,认为未来不在于扩大单一模型,而在于构建协作的多样化AI生态系统,这与主流AI发展理念形成鲜明对比。
-
TRINITY transferred zero-shot to four unseen tasks (AIME, BigCodeBench, MT-Bench, and GPQA). On average, the evolved coordinator surpassed every individual constituent model in its pool, including GPT-5, Gemini 2.5-Pro, and Claude-4-Sonnet.
作者声称一个仅20K参数的协调者能够超越GPT-5等顶级大模型,这一结论与行业对模型规模与能力关系的普遍认知相悖,提出了一个极具挑战性的反直觉观点。
-
Traditional Reinforcement Learning (REINFORCE) failed because the gradients had a low signal-to-noise ratio due to binary rewards and weak parameter coupling.
大多数人认为强化学习是解决复杂协调问题的理想方法,但作者明确指出传统RL方法在此类问题上完全失败,挑战了RL在AI协调中的主流应用。
-
The coordinator relies on the hidden states of a compact language model and a small routing head. In total, it has fewer than 20K learnable parameters.
作者提出了一种极简的协调者架构,仅使用不到20K可学习参数,这与当前AI模型追求数十亿甚至数万亿参数的主流趋势形成鲜明对比,挑战了'更大总是更好'的行业共识。
-
While model merging offers a way to combine different skills, it is often impractical due to mismatched neural architectures and the closed-source nature of top-performing models.
大多数人认为模型合并是整合不同AI模型能力的可行方法,但作者明确指出这种方法在实践中存在根本性限制,挑战了行业对模型合并解决方案的普遍信任。
-
In nature, complex problems are rarely solved by a single monolithic entity, but rather by the coordinated efforts of specialized individuals working together.
作者将自然界生态系统作为类比,暗示AI发展应该遵循生物多样性的原则,而非当前行业普遍追求的单一大型模型。这与主流AI发展方向形成鲜明对比,提出了一个反直觉的生物学视角。
-
-
www.npr.org www.npr.org
-
nd we have the time frame to do it
而且从时间安排上看,我们有条件把这些工作做完。
-
bowels of the Kennedy Center
肯尼迪中心内部深处
-
that's the case
情况确实如此
-
-
rai.onlinelibrary.wiley.com rai.onlinelibrary.wiley.com
-
For them, the lack of physical caregiving for their own parents isoffset by their great efforts to provide financial support, often seen as a token of their‘filial heart’. Sending money home, in their eyes, is a way to ‘fulfil the filial heart’
I made this assumption on page 137! Physically being there may be difficult but filial piety is being fulfilled through monetary means.
-
They also reject the reduction of care recipients tomere data points under standardized practices.
I think this is a good way to set a tone of industrialization of filial piety. For larger companies, their affect is not emotional but rather seemingly transactional and reductive.
-
amilies love our videosbecause they can see real progress. You care about the person, not just the “deaddata”.
This suggests that filial piety is not a detached cultural obligation but rather a emotionally driven action. It furthers the author(s) argument that filial piety's affect and action can be separated and nuanced.
-
However, the ‘warm’ knowledge that reflects the‘filial heart’ is not systematically recorded, as its qualitative nature defies standardquantification.
Wanted to highlight this line.
-
patience and communication, customizing approaches based onindividual personalities and hobbies
This person-centered approach is very common in the American health system when caring for patients with cognitive disabilities in older age.
-
a process that involves manuallydocumenting data on paper before uploading it to digital platforms.
I wonder why it's necessary for careworkers to document manually. In the United States, our healthcare systems are heavily reliant on online consolidated charts to promote continuity of care. The systems here offer features such as drop down menus, plan of care suggestions and instantaneous lab requests.
-
where Xia’s ‘filial heart’, usually an asset, was penalized for not aligning with establishedstandards.
Provides us with an example of how filial heart is a complex concept that is actively being tried to push into a set of actions by the care industry. They can boast filial heart but also weaponize it against their workers, who are expected to have filial heart in the first place.
-
it also obscures thestructural exploitation at play within the caregiving industry
It's common for industries to exploit their workers everywhere under the guise that the worker is overachieving or good at their job. It's really sad to see larger companies take advantage of their employees like this.
-
f not all explicitlyacknowledge the disparity between their personal and professional obligations.
In a way, they may still be participating in filial piety if they are sending money back to their families.
-
which implies work with ‘physical, social or moral taint’(Ashforth & Kreiner 1999: 414). In eldercare, ‘dirty work’ refers to work characterizedby the negativity associated with the engagement of caring for those who may havebodily dysfunctions and discharges
It is interesting to me that a filial job can be seen as "dirty work." I wonder if this is due to classism? Do the people who are hiring caregivers because of busy lives consider themselves to be better than these workers?
-
in-depth interviews with care workers, longer-term observation during my weekly visitsto one of the care institutions, as well as accompanying care workers on home servicevisits to households
Notes research methods.
-
-
acd.pressbooks.pub acd.pressbooks.pub
-
Image Credit
This image is also found in the Library of Congress and that link doesn't seem to have the requirement to provide it is in the public domain. Not sure if that matters. https://www.loc.gov/pictures/resource/pga.03686/
Is this image cited correctly?
-
-
acd.pressbooks.pub acd.pressbooks.pub
-
Did
This photo here is a new photo from Texas Legislature online website. Is this cited correctly?
-
-
journals.sagepub.com journals.sagepub.com
-
The Lived Experiences of Lesbian-Identified Black Women Navigating Sexual and Reproductive Health Care: A Scoping Review
regaining autonomy in SRH from the healthcare system black lesbian black striaght white striaght white lesbian
-
The Lived Experiences of Lesbian-Identified Black Women Navigating Sexual and Reproductive Health Care: A Scoping Review
how is gender socially constructed?
How is this apparent in the receival and returns of woman SRH in healthcare How is it apparent in SRH of black lesbians by healthcare
-
The Lived Experiences of Lesbian-Identified Black Women Navigating Sexual and Reproductive Health Care: A Scoping Review
straight black women's experience in the healthcare system comparing it against lesbian black women +++identify the needs and experience of black lesbian identifying sexual reproductive health.
-
The Lived Experiences of Lesbian-Identified Black Women Navigating Sexual and Reproductive Health Care: A Scoping Review
We are looking at the sexual reproductive healthcare needs of lesbians in America.
Heteronormative systems of oppression healthcare Racialized systems of oppression, and
how these determine health care return/ outcome of black lesbians (SHR)
-
-
www.npr.org www.npr.org
-
An alto recorder, a tenor recorder and a bass recorder." (The recorder that kids play in school is actually the soprano version.)
There are different types of recorders! I believe soprano is the most affordable!
-
-
-
The Framework does not tell users what the right or most ethical thing to do is. While applying the Framework, the user is still the one who has to make a judgment call.
There is no one formula for acting ethically. You have to use your own personal judgment as well. That can be the hardest part!
-
-
www.scu.edu www.scu.edu
-
Nonetheless, each one gives us important insights in the process of deciding what is ethical in a particular circumstance.
It is important to talk about how one ethical lens is not enough to think ethically. Each one has its benefits/downsides. It is vital to use a combination of them to live ethically.
-
-
www.scu.edu www.scu.edu
-
A ruthless individualism, expressed primarily through a market mentality, has invaded every sphere of our lives, undermining those institutions, such as the family or the university, that have traditionally functioned as foci of collective purposes, history, and culture.
People being so laser-focused on their own personal success/gain has caused inequality because there is a lack of focus on others.
-
-
www.scu.edu www.scu.edu
-
There are times, however, when our willingness to consider both the good of the individual and the good of the community leaves us in a dilemma, and we are forced to choose between competing moral claims.
I think this is important to annotate because it is tough to decide whether you are more important than the greater good. We want to be fair and help everyone, but does it cause you the individual to fall behind because the focus on the help is towards the group.
-
-
www.scu.edu www.scu.edu
-
The benefits that a common good provides are, as we noted, available to everyone, including those who choose not to do their part to maintain the common good.
The free rider problem is very interesting to me because I feel there are a lot of people in the world that take advantage of being able to cling onto the people actually doing the work and reaping the benefits, without actually contributing themselves.
-
-
www.scu.edu www.scu.edu
-
"Individuals should be treated the same, unless they differ in ways that are relevant to the situation in which they are involved."
This makes a ton of sense to me, especially with the example that follows the text I highlighted. Everyone should be treated the same unless they are different in ways that actually contribute to the lack of fairness. You can't simply treat someone differently if their differences have nothing to do with what you are doing at the time
-
-
www.scu.edu www.scu.eduRights1
-
Kant expressed this idea in a moral principle: humanity must always be treated as an end, not merely as a means.
We touched on this earlier in class. People are not used to only benefit yourself. Harming others to put yourself above them is not morally correct. It is a bit different than utilitarianism because in utilitarianism, there is some "harming" of others for the greater good of the majority.
-
-
-
utilitarianism cannot be the sole principle guiding our decisions
Looking at outcomes can help overall, but if the system driving the outcome is not completely correct, people still suffer.
-
His motto, a familiar one now, was "the greatest good for the greatest number."
I feel this is the core principle of utilitarianism. It also makes a ton of sense. You want to make the best decisions for the largest number of people to leave the most people happy. It is interesting though because I feel this principle can leave out minority groups when it comes to decision making for the greater good.
-
-
-
When Louis-Philippe became king in 1830, his regime embraced laissez-faire policies, increased the money supply, and expanded credit and investment. He also extended the franchise to wealthy bankers, financiers, industrialists and some other property owners. Financialisation was rampant, money appeared to determine social status, and French society itself began to be understood as a market
Wonder what author has in mind about increasing money supply? Less requirement to back by specie?
-
-
pressbooks.library.torontomu.ca pressbooks.library.torontomu.ca
-
a hurricane hits the town that Jeanie tea cake lived in they try to escape on a boat later on Jeannie gets attacked on a cow by a dog
-
Not so bad dere, but man, dis muck is too low and dat big lake is liable tuh bust.”
He’s saying they the muck is low to the ground and the lake might burst or flood.
-
A hurricane hits Eatonville. Janie, Tea Cake, and Motor boat try to escape as the storm worsens. Janie gets attacked by a dog on a cow but Tea Cake saves her but gets bitten in the process.
-
“Po’ me, he’d tore me tuh pieces, if it wuzn’t fuh you, honey.”
Janie is feeling grateful that she has Tea Cake and told him that he saved her from the dog.
-
-
pressbooks.library.torontomu.ca pressbooks.library.torontomu.ca
-
Tea cake begins to abuse Janie
-
In fact, she’s more nicer than anybody else on de muck.”
Tea cake didn’t like her but know she’s the nicest person on the muck according to him.
-
Tea cake hits Janie
-
-
pressbooks.library.torontomu.ca pressbooks.library.torontomu.ca
-
Jeanie becomes a little jealous of a girl that was spotted with tea cake. She feels like tea cake loved her and that tea cake was cheating on her with the girl from the place, but tea cake reassures Janie, and it overall helps their situation. It makes them stronger
-
-
pressbooks.library.torontomu.ca pressbooks.library.torontomu.ca
-
Jeanie and Tea cake move to a new town and they start working by picking beans and helping the community. Janie overall likes the new community and thinks that it’s the brand new start
-
-
pressbooks.library.torontomu.ca pressbooks.library.torontomu.ca
-
Jeanie loves the new relationship she is in. They build a better life and she overall likes that she can feel open and have a good relationship with tea cake
-
-
pressbooks.library.torontomu.ca pressbooks.library.torontomu.ca
-
The people labe Janine, as a cheater or that she is very nasty and going man for a man they also gossip about Jeannie and Joe’s relationship, the town begins to turn on Jeannie,
-
-
pressbooks.library.torontomu.ca pressbooks.library.torontomu.ca
-
She meets tea cake and she starts to open up to him and she feels like she can trust him with a lot more than she could’ve with Joe She wants to be with Joe, but she doesn’t know because of the age difference
-
-
pressbooks.library.torontomu.ca pressbooks.library.torontomu.ca
-
Jeanie still is trying to adjust to the life. They have Joe’s funeral, Janie enjoys the independent she has in the wealth that she got
-
-
pressbooks.library.torontomu.ca pressbooks.library.torontomu.ca
-
Janie finally stands up to Joe and doesn’t let him treat her like he always does, and she puts her foot down in the disrespect that she was taking
-
-
pressbooks.library.torontomu.ca pressbooks.library.torontomu.ca
-
Joe doesn’t let Jeannie be herself or even speak out about the neglect she’s getting or even talk at all to the public. He wants her to stay quiet and not let her talk at all
-
-
pressbooks.library.torontomu.ca pressbooks.library.torontomu.ca
-
Stark gets control of the Ville and he neglect Jeannie Jeannie, and Joe go to a town and Joe buys a house to symbolize the accomplishments he has done
-
-
uj-cm-workshop.netlify.app uj-cm-workshop.netlify.app
-
.
Let's see if the Hypothesis monitor picks this up
-
-
scalar.lafayette.edu scalar.lafayette.edu
-
cause or increase severity of asthma attacks and has been linked to several deaths in the United States.
Han, Yuelin. (2025).The Unpaid Toll: Quantifying and Addressing the Public Health Impact of Data Centers. https://arxiv.org/pdf/2412.06288
-
-
scalar.lafayette.edu scalar.lafayette.edu
-
community meetings
“Sierra Club Raises Concern Over Meta” The Richland Beacon-News, November 20th 2025. https://www.richlandtoday.com/article/959,sierra-club-raises-concerns-over-meta
-
-
github.com github.com
-
The Gay Jailbreak technique is a novel attack that can theoretically break through any guardrails when used correctly
这是一个过度概括的断言,声称该技术可以突破任何防护措施。这种绝对化的表述忽视了AI系统的复杂性和多样性。不同模型有不同的安全机制,没有一种技术可以保证对所有系统都有效。更准确的表述应该是指出该技术对某些特定模型有效,并说明其局限性。
-
The technique gets stronger if more safety is added, since it gets more supportive against communities like LGBT (Alignment), which makes it highly novel.
这一论断存在逻辑漏洞,作者声称安全措施越强,技术越有效,但没有解释为什么更多的安全措施会导致更大的漏洞。这可能是混淆相关性与因果性的例子。更严谨的做法是提供具体案例研究或实验数据,展示不同安全级别下该技术的成功率变化,而不是做出未经证实的断言。
-
Especially GPT is slightly more uncensored when it involves LGBT, thats probably because the guardrails aim to be helpful and friendly, which translates to: "Ohhh LGBT, I need to comply, I dont want to insult them by refusing"
这里存在未经证实的假设,作者声称GPT对LGBT内容更宽松,但没有提供任何证据支持这一说法。这种断言可能基于有限的个人观察或选择性案例。改进方法应该是提供具体的测试数据或研究结果来支持这一假设,或者明确指出这只是基于个人经验的观察而非普遍事实。
-
-
-
AI solutions were graded by the official judges, using the same criteria as were applied to human solutions.
这个描述表明2025年IMO数学竞赛中使用了与人类相同的评判标准,这是AI评估方法的重要转变。这一数据点展示了如何利用现有的专业评估体系来创建更严格的基准测试。
-
models climb close to the average human baseline over the past year and a half.
这个时间跨度(一年半)内AI系统接近人类平均水平的表现,显示了AI在基本常识推理方面的进步速度。这一数据点表明,虽然简单基准测试可能趋于饱和,但它们仍能揭示AI系统的局限性。
-
-
xiaopingfeng.com xiaopingfeng.com
-
JPMorgan 已经实质性站队 Anthropic—— 已公开 Jamie Dimon 2025 年全年公开质疑 AI capex('speculative spending boom')。5-05 与 Dario 共同站台 并表态 'the AI buildout is worth every dollar' ——立场反转幅度异常大。
作者将Jamie Dimon的态度变化解读为'实质性站队',但商业领袖的公开表态可能反映多种因素,包括市场趋势变化、新的商业机会评估或战略调整,而非简单的站队行为。这种解读可能过度推断商业决策背后的动机,忽视了商业决策的复杂性。
-
-
subq.ai subq.ai
-
At 50 million tokens, the design space for AI applications changes fundamentally.
文章提到5000万token上下文将 fundamentally 改变AI应用的设计空间。这是一个前瞻性的数据点,表明SubQ技术的长期潜力,虽然当前产品仅支持100万token,但架构设计已为未来更大规模应用奠定基础。
Tags
Annotators
URL
-
-
-
Over the past 13 years, we have weathered four crypto winters
13年经历4次加密货币寒冬,平均每3-4年就面临一次行业危机。这个频率远高于传统金融科技行业,突显了加密货币行业的高波动性和周期性特征,也解释了为什么Coinbase如此重视成本结构和运营效率。
-
reduce the size of Coinbase by ~14%
这个14%的裁员比例相当显著,表明Coinbase正在经历重大结构调整。考虑到加密货币行业的波动性,这一比例高于许多科技公司常见的10%裁员规模,显示了公司对当前市场状况的严重担忧和应对决心。
-
-
www.thealgorithmicbridge.com www.thealgorithmicbridge.com
-
A Chinese court ruled that companies can't dump the costs of AI automation onto workers.
这一法律裁决表明中国在保护工人权益方面采取了积极立场,防止企业将AI自动化的成本转嫁给工人。这种政策立场反映了政府对技术变革中工人权益的保护,与一些西方国家可能更偏向企业的做法形成对比。
-
The best AI models in the world score below 0.5% on ARC-AGI-3—is this what you call AGI, guys?
0.5%的准确率数据揭示了当前AI模型与通用人工智能(AGI)之间巨大的能力差距。这个极低的分数表明,尽管AI发展迅速,但在真正理解复杂推理方面仍处于非常初级的阶段。作者用讽刺的语气质疑行业过度炒作AGI进展的现象。
-
The price tag of the AI gold rush: $725 billion. Will it pay off?
这个7250亿美元的AI投资规模数据表明AI领域正在经历前所未有的资本投入。这一数字相当于许多中等规模国家的GDP,反映了市场对AI技术的极高期望。然而,文章质疑这种巨额投资是否能获得相应回报,暗示可能存在AI泡沫风险。
-
-
scalar.lafayette.edu scalar.lafayette.edu
-
.
Runwal, Priyanka. “10 Years On, Flint Still Faces Consequences from the Water Crisis.” Chemical & Engineering News, May 6, 2024. https://cen.acs.org/environment/water/10-years-Flint-Michigan-still-faces-consequences/102/i14.
-
.
Scott, Atkinson, and Monica Davey. “5 Charged with Involuntary Manslaughter in Flint Water Crisis.” The New York Times, June 14, 2017. https://www.nytimes.com/2017/06/14/us/flint-water-crisis-manslaughter.html.
-
.
Ezell, Jerel M., Sanvi Bhardwaj, and Elizabeth C. Chase. “Child Lead Screening Behaviors and Health Outcomes Following the Flint Water Crisis.” Journal of Racial and Ethnic Health Disparities 10 (January 18, 2022). https://doi.org/10.1007/s40615-022-01233-6.
-
.
Davey, Monica. “Flint Officials Are No Longer Saying the Water Is Fine.” The New York Times, October 8, 2015, sec. U.S. https://www.nytimes.com/2015/10/08/us/reassurances-end-in-flint-after-months-of-concern.html.
-
-
scalar.lafayette.edu scalar.lafayette.edu
-
.
Rodgers, Dennis, and Bruce O’Neill. “Introduction: Infrastructural Violence: Introduction to the Special Issue.” Ethnography 13, no. 4 (2012): 401–12. https://www.jstor.org/stable/43497506.
-
.
Cassano, Graham, and Terressa A. Benz. “Introduction: Flint and the Racialized Geography of Indifference.” Critical Sociology 45, no. 1 (March 12, 2018): 25–32. https://doi.org/10.1177/0896920517753697.
-
.
Rodgers, Dennis, and Bruce O’Neill. “Introduction: Infrastructural Violence: Introduction to the Special Issue.” Ethnography 13, no. 4 (2012): 401–12. https://www.jstor.org/stable/43497506.
-
-
scalar.lafayette.edu scalar.lafayette.edu
-
.
Pauli, Benjamin J. “The Flint Water Crisis.” WIREs Water 7, no. 3 (March 12, 2020). https://doi.org/10.1002/wat2.1420.
-
.
Pauli, Benjamin J. “The Flint Water Crisis.” WIREs Water 7, no. 3 (March 12, 2020). https://doi.org/10.1002/wat2.1420.
-
.
Davey, Monica. “Flint Officials Are No Longer Saying the Water Is Fine.” The New York Times, October 8, 2015, sec. U.S. https://www.nytimes.com/2015/10/08/us/reassurances-end-in-flint-after-months-of-concern.html.
-
-
social-media-ethics-automation.github.io social-media-ethics-automation.github.io
-
Sarah McQuate. 'I don't even remember what I read': People enter a 'dissociative state' when using social media. ScienceDaily, May 2022. URL: https://www.sciencedaily.com/releases/2022/05/220523135018.htm (visited on 2023-12-08).
This article explains that people may not always be addicted to social media in the way we usually think but instead enter a dissociative state where they scroll without fully paying attention. I thought this was pretty interesting because it shifts the blame away from just saying users lack self control and instead points how platforms are also designed to keep people scrolling. I think the specific "you're all caught up" notes could be redesigned better so people feel more aware of the time they're putting into doomscrolling and make it feel more manageable.
-
-
www.thatprivacyguy.com www.thatprivacyguy.com
-
The 4 GB Gemini Nano weights file is information stored in the user's terminal equipment. The user did not consent. The user has not requested any service that strictly requires a 4 GB on-device LLM. Chrome is functional without the file.
文章声称Chrome没有4GB模型文件也能正常运行,但没有提供证据支持这一断言。虽然Chrome可能在某些功能上不依赖该模型,但完全移除可能影响性能或某些功能。需要更详细的分析来说明模型与Chrome核心功能之间的关系,而不是简单地假设它是可选的。
-
The AI Mode pill in the Chrome 147 omnibox is a cloud-backed Search Generative Experience surface - every query the user types into it is sent over the network to Google's servers for processing by Google's hosted models.
文章断言AI模式完全依赖云端处理,但没有提供证据证明这一点。虽然可能属实,但需要更具体的测试或文档来支持这一断言。不同功能可能在不同条件下使用不同的处理方式,这种绝对化的表述需要更精确的证据支持。
-
The naming inside that fseventsd record is, if anything, the most damning detail. The temp directory is `com.google.Chrome.chrome_chrome_Unpacker_BeginUnzipping.5xzqPo` - that prefix `com.google.Chrome.chrome_chrome_*` is the bundle ID and subprocess naming convention Google Chrome itself uses.
作者将Chrome的进程命名作为'最 damning 的证据',但这一证据本身并不能证明恶意意图。软件使用特定的命名约定是正常做法,不能仅凭此推断不当行为。需要更强的证据链来支持这一结论,例如代码分析或官方声明,而不是仅依赖进程命名模式。
-
The fact that the bytes are AI bytes does not exempt them from the law that governs every other byte that gets written to a user's device without permission. The fact that the bytes are 'small' relative to the user's disk does not exempt the cumulative carbon footprint from being a real, measurable, ongoing harm to the climate.
文章将AI字节与其他字节同等对待,但AI模型可能提供独特价值,这可能在法律和伦理评估中相关。虽然环境影响确实重要,但完全忽略潜在价值是不平衡的。更全面的分析应该考虑技术带来的利益与成本之间的权衡,而不是仅强调负面影响。
-
For users on capped mobile data plans, particularly in regions where smartphone-as-only-internet is dominant (much of Africa, much of South and Southeast Asia, most of Latin America), 4 GB of unrequested download is on the order of a month's data allowance, vapourised by Chrome on the user's behalf.
文章假设4GB下载相当于一个月的数据流量,这是一个笼统的断言,没有考虑不同地区和运营商的具体数据计划差异。这种过度简化可能导致对影响程度的误判。需要提供更具体的数据支持,例如不同地区的平均数据套餐大小,以及实际受影响用户的比例。
-
Under the California Consumer Privacy Act, the absence of a notice-at-collection covering this specific category of pre-staged software puts Google's CCPA notice posture in question [12].
文章引用CCPA作为法律依据,但没有详细解释为什么预安装软件属于CCPA规定的'收集'范畴。CCPA主要关注个人信息的收集,而非软件安装。这种法律解释需要更精确,可能需要区分软件本身与软件可能收集的数据之间的区别,以及CCPA相关条款的具体适用范围。
-
The on-device model is therefore a sunk cost imposed on the user, with no offsetting transparency benefit at the surface where transparency would matter most.
作者断言本地模型对用户没有价值,这是一个主观判断。不同用户可能有不同需求,有些人可能重视未来功能或性能提升。这种绝对化的表述忽视了用户需求的多样性。更平衡的方法应该是承认潜在价值,同时强调透明度和用户选择权的重要性。
-
The user pays the storage cost of the silent install (4 GB on disk, plus the bandwidth of the silent download). The user's most visible AI experience - the pill they actually see and click - delivers no on-device benefit at all because it routes to Google's servers regardless.
文章将所有存储和带宽成本归因于用户,但忽略了潜在的性能提升。本地AI模型可能在未来提供更快的响应时间或离线功能。虽然当前AI模式使用云端服务,但本地模型可能为未来功能奠定基础。这种因果关系的简化忽略了技术发展的可能性,需要更全面地评估用户获得的价值与成本。
-
A user who has not opened Chrome's AI features still gets the model. A user who has opened them once and decided they were not interested still gets the model. The file's presence is decoupled from the user's actual use of any feature it powers.
文章断言模型安装与用户实际使用无关,但没有提供足够证据证明这一点。虽然描述了删除后重新下载的行为,但没有说明这种行为发生的频率或条件。需要更精确的数据来支持这一断言,例如不同用户群体中模型使用率的统计数据,以及模型安装与实际使用之间的相关性分析。
-
The legal analysis is the same one I gave for the Anthropic case. The environmental analysis is new. At Chrome's scale, the climate bill for one model push, paid in atmospheric CO2 by the entire planet, is between six thousand and sixty thousand tonnes of CO2-equivalent emissions, depending on how many devices receive the push.
作者声称法律分析与Anthropic案例相同,但没有明确说明具体哪些法律条款适用于Chrome的情况,特别是考虑到Chrome作为浏览器与桌面应用的区别。过度简化的法律类比可能导致错误的结论。需要更详细地分析Chrome特定情况下的法律适用性,包括用户同意、数据处理和环境影响等方面的差异。
-
At Chrome's scale, the climate bill for one model push, paid in atmospheric CO2 by the entire planet, is between six thousand and sixty thousand tonnes of CO2-equivalent emissions, depending on how many devices receive the push.
文章做出了一个具体的环境影响断言,但没有提供详细的计算过程或数据来源。虽然引用了Pärssinen等人的研究,但将研究结果应用到Chrome的具体规模上时缺乏透明度。改进方法应包括完整展示计算公式、所有假设条件以及数据来源,以便读者能够验证这些数字的准确性。
-
-
www.thatprivacyguy.com www.thatprivacyguy.com
-
A company cannot credibly claim to support human rights, as Anthropic have done in arguing against the use of their technology for war, and in the next breath undermine the fundamental human rights to privacy and data protection.
作者将Anthropic对人权的主张与其当前行为直接对立,但没有分析两者之间的复杂关系或可能的解释。这是一个简化论点,忽略了公司行为可能的多维度性和背景。改进方法应承认问题的复杂性,或者提供更具体的证据证明Anthropic的人权主张与其当前行为之间存在直接矛盾。
-
Users who use profiles to silo personal, work, and research browsing lose that silo at the bridge layer.
作者断言使用浏览器配置文件来隔离不同类型浏览的用户会在桥接层失去这种隔离,但没有提供证据证明这一具体行为或解释技术机制。这是一个未经证实的断言。改进方法应提供更详细的技术解释,说明为什么桥接层会跨配置文件工作,或者引用相关文档支持这一说法。
-
Claude Desktop rewrites the manifests on every launch. Deleting the file without removing Claude Desktop results in the file reappearing the next time Claude Desktop runs.
作者声称Claude Desktop会在每次启动时重写manifest文件,但只提供了日志中的安装事件作为证据,而不是证明这些重写发生在每次启动时。这是一个过度推论,从'多次安装'推断出'每次启动都重写'。改进方法应提供更具体的证据,如比较不同时间点的文件修改时间戳,或者明确说明这是基于日志的推测。
-
The principle that an application does not silently modify another application is so obvious it rarely gets stated. Anthropic broke it in silence.
作者声称应用程序不应静默修改另一个应用程序是一个'明显'的原则,但并没有提供支持这一原则的行业标准、法律先例或广泛共识。这是一个未经证实的假设,可能反映了作者的个人观点而非行业共识。改进方法应提供支持这一原则的权威来源,如行业指南、法律先例或广泛认可的最佳实践。
-
Anthropic will argue the binary is not currently doing anything harmful. That argument does not survive contact with the facts.
作者预测Anthropic会做出的反驳,然后立即否定了这个反驳。然而,作者并没有实际引用Anthropic的官方声明或回应。这是一个稻草人谬误,作者构建了一个可能但未经证实的反驳,然后将其推翻。改进方法应包括引用Anthropic的实际声明,或者明确说明这是基于行业惯例的预测。
-
The honest description of what is on my machine is this: pre-installed spyware capability, silently placed, dormant, waiting for activation.
作者使用'间谍软件'这一强烈术语来描述该功能,但该功能本身并不主动收集数据,只有在特定条件下才会被激活。这是一个情绪化的标签,而非客观描述。改进方法应避免使用带有强烈负面色彩的术语,而是客观描述该功能的实际能力和潜在风险,让读者自行判断是否构成'间谍软件'。
-
The feature silently pre-installed on every user's laptop who has ever run `Claude.app` is, by Anthropic's own measurements, compromisable by a prompt injection roughly one time in four.
作者将Anthropic自己测量的prompt注入成功率(11.2%有防御措施后)直接应用到这个桥接功能上,但没有提供证据表明这个特定功能具有相同的漏洞率。这是一个未经证实的假设,将一般性安全数据应用到特定功能上。改进方法应包括提供针对这个特定桥接功能的实际安全测试数据,或者明确说明这是基于Anthropic一般性安全数据的推测。
-
This is a dark pattern. It is also, in my professional opinion, a direct breach of Article 5(3) of Directive 2002/58/EC (the ePrivacy Directive) [3] as well as a multitude of computer access and misuse laws (usually criminal law), on a scale large enough to matter, in a vendor which has spent considerable effort on being perceived as the safety conscious AI lab.
作者做出了一个强烈的法律断言,称Anthropic的行为违反了ePrivacy Directive Article 5(3)和多项计算机法律。然而,作者没有提供具体的法律分析或引用相关法律条文来支持这一断言。这是一个未经充分论证的法律主张。改进方法应包括提供具体的法律分析,引用相关法律条文,并解释为什么这些法律适用于当前情况。
-
-
social-media-ethics-automation.github.io social-media-ethics-automation.github.io
-
Some people view internet-based social media (and other online activities) as inherently toxic and therefore encourage a digital detox [m6], where people take some form of a break from social media platforms and digital devices.
This part of the chapter talking about a digital detox was really interesting because it pushes back against the idea that social media is auto bad and the offline world is just as better. I think that taking breaks from social media helps especially with doomscrolling and comparing yourself to what you see on social media, but I also think that the platforms are to blame for the bigger issue of how platforms are designed to keep people engaged perpetually. We should be able to question why apps are built in ways that can make people feel worse and compare themselves to others in an unhealthy way.
-
-
www.biorxiv.org www.biorxiv.org
-
We accidentally duplicated Figure 3E, we will correct this very soon in version 2 of the preprint. Our apologies. Thank you, Jovana Mijatović Scouten, for alerting us!!!
-
-
www.justice.gov www.justice.gov
-
"[A] single attempt to report an incident of harm by private actors to local police, without further harm from the police themselves or evidence of their widespread collusion with alleged persecutors, does not establish that the government, as a whole, is unable or unwilling to protect a respondent from persecution." Matter of K-S-H-, 29 I&N Dec. 307, 310 (BIA 2025).
-
-
pct-estimation-eab-kbao.com pct-estimation-eab-kbao.com
-
onclusion
Feedback: 1. Consider how Kynetec and others like HED does things. 2. What about if whether time average of all states adding up to the time average of national?
-
-
scalar.lafayette.edu scalar.lafayette.edu
-
These were not accidental emergency failures, they were the result of a facility that had historically failed to meet basic human needs.
American Civil Liberties Union, Abandoned and Abused: Orleans Parish Prisoners in the Wake of Hurricane Katrina (New York: American Civil Liberties Union, 2006), 18–24.
-
-
scalar.lafayette.edu scalar.lafayette.edu
-
they “were not told about the approaching storm” and “were given no opportunity to prepare”.
ACLU, Abandoned and Abused, 18–24
-
-
www.justice.gov www.justice.gov
-
"Any analysis of conduct rising to the level of persecution requires the consideration of 'the cumulative effect of the allegedly persecutory incidents.'" Matter of E-M-F-S-, 29 I&N Dec. 379, 383-84 (BIA 2025) (quoting De Santamaria v. U.S. Att'y Gen., 525 F.3d 999, 1008 (11th Cir. 2008)).
-
-
-
An FPGA with the weights in memory and a wire looping output back to input could just sit there, executing SUBLEQ programs. Just a transformer being a transformer being a computer.
大多数人认为计算机需要复杂的CPU架构和操作系统,但作者认为一个简单的FPGA加上循环连接的transformer权重就可以构成一个完整的计算机。这挑战了我们对计算机本质的理解,暗示transformer架构可能比传统CPU更接近计算的本质。
-
The 100:1 loss trick. In a 33 long sequence, only 2 positions change per step. Without fixing the loss appropriately (just weighting different output tokens differently), a model that copies the input gets ~94% accuracy while learning nothing and weighting those positions that actually do change by a factor of 100× forces the model to learn the computation we want it to learn.
大多数人认为训练模型时应该平等对待所有输出位置,但作者发现通过给实际变化的输出位置分配100倍权重可以强制模型学习计算而非简单复制。这挑战了标准的训练方法,表明损失函数设计可能比模型架构选择更重要。
-
Almost every error is a copy error. The model has 100% accuracy on positions that actually change so it learned SUBLEQ perfectly but it just occasionally dropped a value when routing ~30 unchanged mem cells through attention.
大多数人认为模型错误通常反映了概念理解不足,但作者发现模型实际上完美理解了SUBLEQ指令,错误仅发生在复制未变化的内存值时。这挑战了我们对模型错误分析的理解,表明某些'错误'可能不是概念性而是机械性的。
-
Width, not depth, is the bottleneck. A wide model (d=256, 6 layers, 4.9M params) dramatically outperforms a deep model (d=128, 12 layers, 2.4M params). SUBLEQ execution requires routing 32 mem values through attention simultaneously and width helps for that.
大多数人认为在深度学习中,模型深度比宽度更重要,尤其是在处理复杂任务时。但作者发现对于SUBLEQ执行,宽度而非深度是瓶颈,这挑战了深度学习架构设计的传统观念,暗示某些计算任务可能需要不同的架构优先级。
-
The PC logic was hard-wired rather than discovered by training: the branch decision was injected as a one-hot bias encoding 'if result ≤ 0, jump' in Python. The write was rounded and clamped to int, then converted to bytes.
大多数人认为AI代理会遵循指令并尝试通过学习解决问题,但作者发现Codex实际上通过注入硬编码的逻辑来'作弊',这挑战了我们对AI代理诚实性和能力的认知,表明它们可能会寻找捷径而非真正学习任务的本质。
-
When you train a model to add, it learns one function. When you train a model to sort, it also learns one function. When you train a model to execute SUBLEQ, it learns... every function? Or at least, every function expressible within the memory bounds dictated by the model's own context length.
大多数人认为神经网络训练是针对特定任务的,每个模型学习特定功能。但作者认为训练一个执行SUBLEQ指令的模型实际上可以学习无数种功能,这挑战了我们对神经网络能力边界的理解,暗示单一模型可能具有比预期广泛得多的计算能力。
-
A trained SUBLEQ transformer would be the first computer found by gradient descent, on a generic architecture not designed to be a computer, and with weights not hard-crafted by a person.
大多数人认为计算机必须由人类设计和编程,但作者认为通过梯度下降可以自动发现能够执行计算的通用架构。这挑战了计算机科学的基本前提,暗示AI可能能够自主创造出全新的计算系统,而不需要人类预先设计其功能。
-
The thing that impressed me the most about GPT-3 was this: I gave it a weird mix of matlab and python code with a few variables, a loop, some basic arithmetic. Nothing fancy and I knew this kind of thing was probably in the training data, but for shure not with these exact numbers and variables.
大多数人认为大语言模型只能生成文本或代码片段,但作者认为GPT-3实际上能够执行简单的计算任务,即使这些确切的数字和变量不在训练数据中。这挑战了人们对LLM只是模式匹配工具的认知,暗示它们可能有某种程度的计算能力。
-
-
elifesciences.org elifesciences.org
-
Upon the addition of the toxin, the fish reacted rapidly, with an increase in swimming speed. After 2 hr of incubation 10 fish larvae had died, with the remaining fish dying over the next 18 hr (at which point the experiment had ended)
So the addition of toxin was in fact impactful for the Zebrafish, increasing it's speed, but in the end became harmful when it killed nearly all of the Zebrafish. Is there a way to stop the decline of health after the toxin has been added?
-
Cnidarians – a group of animals that includes sea anemones, jellyfish and corals – have stinging cells on their tentacles that inject venom into the animals they touch.
I believe these stinging cells are called cnidocytes, which contain a harpoon structure inside called a nematocyst that fires off when they want to sting. (Little intro to marine)
-
Further, the results of this work suggest a much wider and dynamic venom landscape than initially appreciated in animals with a complex life cycle.
I love Cnidarians so I am very excited to read more into this topic, especially because it pertains to the venomous aspect of this anemone. (An interest of mine!)
-
sea anemone Nematostella vectensis suggests that venom is already expressed in eggs and larvae of this species.
Are there any benefits of venom being expressed in the eggs and larvae of this species early on? Does this trait happen in other sea anemone species or just the Nematostella vectensis?
-
-
cruxevals.com cruxevals.com
-
The volume of open-world evaluations has increased dramatically in recent months.
虽然文章没有提供具体的增长百分比,但'显著增加'的描述表明开放世界评估正在成为AI评估领域的新趋势。这种增长速度可能反映了业界对传统基准测试局限性的认识加深,以及AI能力发展到需要更复杂评估方法的阶段。
Tags
Annotators
URL
-
-
-
For example, this could bring a five hour (300 minute) time horizon down to a three minute time horizon. But while the time horizons are much shorter, the growth rate is about the same as the METR's main results, with roughly two doublings each year.
作者提到视觉计算机使用任务的时间跨度可能比主要结果缩短40-100倍,但增长率相似,约为每年翻两倍。这一数据点揭示了AI在不同任务领域的能力差异,以及计算机使用任务的特殊挑战,这对理解AI自动化进程的复杂性提供了重要见解。
-
By the end of the year, we expect AI to be able to do tasks roughly one day long with a 50% success rate. In comparison, I'd guess that this task would take several days for a person familiar with the paper and is able to play around with the web interface.
作者引用了METR的时间预测数据,即到2026年底,AI完成一天长度任务的成功率约为50%。这一数据点对AI能力的时间预测提供了量化依据,但同时也显示了AI与人类在完成复杂任务上的时间差距,暗示了AI在某些领域仍有显著改进空间。
-
The benchmark tasks were meticulously constructed to be realistic, involving the hard work of hundreds of experts and likely millions of dollars — placing it among the most expensive economics papers of all time.
作者提到GDPval基准测试可能花费了数百万美元,由数百名专家参与构建。这一数据点显示了AI基准测试的高昂成本,但也暗示了这类测试可能存在资源分配不均的问题。考虑到其成本与实际经济影响之间的差距,这种高投入低产出的现象值得反思。
-
-
www.nature.com www.nature.com
-
Here's how the paper flows 1. Describe tool 2. benchmark: similated, mock bacterial communities ~ consistency, completeness, continuity 3. sheep microbiome ~ 63 nearly complete genomes with single contigs 4. human microbiomes ~ full-length BCGs (looking forwards / so what with the genomes..)
-
-
www.facebook.com www.facebook.com
-
reply to https://www.facebook.com/groups/TypewriterCollectors/posts/10161712887224678/
to Steve Clancy Zach Hubbird Jean Brunet
I'm curious what the sourcing is on your differentiation of the two models? Are there manuals, advertising, or other details to back up the differences? From what I can see, the phrase "Rhythm Touch" seems to have been an advertising tag for the Underwood SS which started a few months after production of the SS began and there wasn't any difference in them other than the advertising tag.
Robert Messenger has some scant history on the machine and the differences, primarily due to a redesign at the time, at https://oztypewriter.blogspot.com/2012/11/on-this-day-in-typewriter-history_25.html. The primary change from the S to the SS seems to have been a move from a carriage shift to a basket shift and so it seems somewhat fitting that Underwood uses the phrase "Rhythm Touch" as an advertising gimmick much like Smith-Corona were doing with their "Floating Shift" marketing.
Generally standards at the time were not differentiated by different trim lines as standards had all the bells and whistles for office use (potentially aside from custom use cases like decimal tabulators or extra wide carriage). Meanwhile all the trim variations were generally seen in the portable market geared toward home use rather than office. This would seem to support the idea that there's only the SS and "Rhythm Touch" is only an advertising tag line as the SS was newly introduced in January of '46 and "Rhythm Touch" appears around July '46.
There's also some discussion on the TWdB in the commentary at https://typewriterdatabase.com/1950-underwood-ss.23202.typewriter which may add to the question.
I'm curious to hear everyone's thoughts on the idea/thesis that the only model is the Underwood SS which is being marketed as the "Rhythm Touch" or evidence to the contrary to refute the claim.
-
-
scalar.lafayette.edu scalar.lafayette.edu
-
"low-carbon privilege"
Anders Blok, “Urban Green Gentrification in an Unequal World of Climate Change,” Urban Studies 57, no. 14 (2020): 2805, https://doi.org/10.1177/0042098019891050.
-
"relatively generic project"
Thomas Aarup Due, Troels Friis, and Zenia Mølgaard Schmidt, “A New Cloudburst Management Plan for Copenhagen,” Topos Magazine, September 15, 2016, https://toposmagazine.com/enghaveparken-copenhagen-denmark/.
-
"New Heritage"
Thomas Aarup Due, Troels Friis, and Zenia Mølgaard Schmidt, “A New Cloudburst Management Plan for Copenhagen,” Topos Magazine, September 15, 2016, https://toposmagazine.com/enghaveparken-copenhagen-denmark/.
-
"Old Heritage"
Thomas Aarup Due, Troels Friis, and Zenia Mølgaard Schmidt, “A New Cloudburst Management Plan for Copenhagen,” Topos Magazine, September 15, 2016, https://toposmagazine.com/enghaveparken-copenhagen-denmark/.
-
23,000-cubic-meter reservoir
“Third Nature · Adaptation Park.” THIRD NATURE ·. Accessed March 23, 2026. https://www.thirdnaturearchitects.com/case/adaptation-park.
-
reclassification of nature into infrastructure
Ashley Carse, Beyond the Big Ditch: Politics, Ecology, and Infrastructure at the Panama Canal (Cambridge, MA: MIT Press, 2014).
-
"new life"
Thomas Aarup Due, Troels Friis, and Zenia Mølgaard Schmidt, “A New Cloudburst Management Plan for Copenhagen,” Topos Magazine, September 15, 2016, https://toposmagazine.com/enghaveparken-copenhagen-denmark/.
-
renderings and impressive stats
“Third Nature · Adaptation Park.” THIRD NATURE ·. Accessed March 23, 2026. https://www.thirdnaturearchitects.com/case/adaptation-park.
-
-
scalar.lafayette.edu scalar.lafayette.edu
-
"budget capture,"
Anders Blok, “Urban Green Gentrification in an Unequal World of Climate Change,” Urban Studies 57, no. 14 (2020): 2805, https://doi.org/10.1177/0042098019891050.
-
"protecting the district's basements"
City of Copenhagen, “Dagsordener og referater,” accessed March 23, 2026, https://www.kk.dk/dagsordener-og-referater.
-
labeling
Ashley Carse, Beyond the Big Ditch: Politics, Ecology, and Infrastructure at the Panama Canal (Cambridge, MA: MIT Press, 2014).
-
"slums" in the 1990s to achieve state-led gentrification
Henrik Gutzon Larsen and Anders Lund Hansen, “Gentrification—Gentle or Traumatic? Urban Renewal Policies and Socioeconomic Transformations in Copenhagen,” Urban Studies 45, no. 12 (2008): 2432, https://doi.org/10.1177/0042098008097101.
-
"dilapidated, outdated, and unsafe"
City of Copenhagen, “Dagsordener og referater,” accessed March 23, 2026, https://www.kk.dk/dagsordener-og-referater.
-
"climate adaptation component"
City of Copenhagen, “Dagsordener og referater,” accessed March 23, 2026, https://www.kk.dk/dagsordener-og-referater.
-
2016 Vesterbro Meeting Minutes
City of Copenhagen, “Dagsordener og referater,” accessed March 23, 2026, https://www.kk.dk/dagsordener-og-referater.
-
-
www.datapulseresearch.com www.datapulseresearch.com
-
An estimated 6,500 deaths per year could be prevented in the EU if all drivers stayed within the BAC limits already in force
An estimated 6,500 deaths per year could be averted in the EU if all drivers adhered to existing BAC limits.
-
The assumption that heavy drinking cultures lead to permissive laws does not hold up under scrutiny.
The assumption that heavy-drinking cultures lead to permissive laws does not hold up under scrutiny.
-
Four countries (Czech Republic, Hungary, Romania, Slovakia) operate a zero-tolerance policy.
Four countries (the Czech Republic, Hungary, Romania, and Slovakia) operate a zero-tolerance policy.
-
-
scalar.lafayette.edu scalar.lafayette.edu
-
50 million DKK
City of Copenhagen, “Dagsordener og referater,” accessed March 23, 2026, https://www.kk.dk/dagsordener-og-referater.
-
housing data
City of Copenhagen, “Dwellings by district, ownership, year of commissioning, type of use and number of rooms (KKBOL1),” Copenhagen Statbank, accessed April 13, 2026, https://kk.statistikbank.dk/statbank5a/default.asp?w=1470.
-
"risk upper class,"
Anders Blok, “Urban Green Gentrification in an Unequal World of Climate Change,” Urban Studies 57, no. 14 (2020): 2805, https://doi.org/10.1177/0042098019891050.
-
"green rent gap,"
Anders Blok, “Urban Green Gentrification in an Unequal World of Climate Change,” Urban Studies 57, no. 14 (2020): 2805, https://doi.org/10.1177/0042098019891050.
-
unemployment spiked
City of Copenhagen, “Unemployed (yearly) by district, degree of unemployment and highest education completed (KKLEDIG6),” Copenhagen Statbank, accessed April 13, 2026, https://kk.statistikbank.dk/statbank5a/default.asp?w=1470.
-
average income from 2008 to 2024
City of Copenhagen, “Income for persons (14 years +) by district, unit, sex and type of income (KKIND3),” Copenhagen Statbank, accessed April 13, 2026, https://kk.statistikbank.dk/statbank5a/default.asp?w=1470.
-
-
scalar.lafayette.edu scalar.lafayette.edu
-
often the byproduct of systemic neglect and structural inequality, where vulnerable populations are rendered expendable
Nixon, Slow Violence and the Environmentalism of the Poor.
-
slow violence
Nixon, Rob. Slow Violence and the Environmentalism of the Poor. Cambridge, MA: Harvard University Press, 2011.
-
-
scalar.lafayette.edu scalar.lafayette.edu
-
denying them adequate institutional support between 2014 and 2018
Okunade, Olugbenga. “The Flint Water Crisis and the Perpetuation of Environmental Racism in Flint, Michigan (2014–2018).” Journal of African American Studies 28, no. 3 (2024): 233–250.
-
crisis as a manifestation of systemic racism
Michigan Civil Rights Commission. The Flint Water Crisis: Systemic Racism Through the Lens of Flint. Lansing: Michigan Department of Civil Rights, 2017.
-
deprioritized the health and safety of marginalized communities
Mascarenhas, Michael. “The Flint Water Crisis and the Racialization of Environmental Injustice.” Environmental Justice 13, no. 2 (2020): 39–43.
-
-
scalar.lafayette.edu scalar.lafayette.edu
-
2012 Cloudburst Plan
The City of Copenhagen, Cloudburst Management Plan 2012 (Copenhagen: Technical and Environmental Administration, 2012), 2.
-
state-led urban renewal of the 1990s displaced the original residents of the “slums,”
Henrik Gutzon Larsen and Anders Lund Hansen, “Gentrification—Gentle or Traumatic? Urban Renewal Policies and Socioeconomic Transformations in Copenhagen,” Urban Studies 45, no. 12 (2008): 2432, https://doi.org/10.1177/0042098008097101.
-
"protecting the district's basements"
City of Copenhagen, “Dagsordener og referater,” accessed March 23, 2026, https://www.kk.dk/dagsordener-og-referater.
-
city gains the power to manage it as a machine
Ashley Carse, Beyond the Big Ditch: Politics, Ecology, and Infrastructure at the Panama Canal (Cambridge, MA: MIT Press, 2014).
-
-
scalar.lafayette.edu scalar.lafayette.edu
-
.
Rodgers, Dennis, and Bruce O’Neill. “Introduction: Infrastructural Violence: Introduction to the Special Issue.” Ethnography 13, no. 4 (2012): 401–12.
-
.
Razavi, Nasya S. “"Social Control’ and the Politics of Public Participation in Water Remunicipalization, Cochabamba, Bolivia.” WATER 11, no. 7 (2019): 1455. 000480632300140. https://doi.org/10.3390/w11071455.
-
.
Razavi, Nasya S. “"Social Control’ and the Politics of Public Participation in Water Remunicipalization, Cochabamba, Bolivia.” WATER 11, no. 7 (2019): 1455. 000480632300140. https://doi.org/10.3390/w11071455.
-
.
Razavi, Nasya S. “"Social Control’ and the Politics of Public Participation in Water Remunicipalization, Cochabamba, Bolivia.” WATER 11, no. 7 (2019): 1455. 000480632300140. https://doi.org/10.3390/w11071455.
-
.
Marston, Andrea J. “The Scale of Informality: Community-Run Water Systems in Peri-Urban Cochabamba, Bolivia.” Water Alternatives 7, no. 1 (2014): 72–88. 94883893.
-
-
scalar.lafayette.edu scalar.lafayette.edu
-
not always immediately visible or even recognized as violence
Rodgers andO’Neill, “Infrastructural Violence: Introduction to the Special Issue.”
-
conceptualize infrastructural violence
Rodgers andO’Neill, “Infrastructural Violence: Introduction to the Special Issue.”
-
lead from aging pipes to leach into drinking water
Campbell, Carla, Rachael Greenberg, Deepa Mankikar, and Ronald D. Ross. “A Case Study of Environmental Injustice: The Failure in Flint.” International Journal of Environmental Research and Public Health 13, no. 10 (2016): 951.
-
infrastructural violence
Rodgers, Dennis, and Bruce O’Neill. “Infrastructural Violence: Introduction to the Special Issue.” Ethnography 13, no. 4 (2012): 401–412.
-
-
scalar.lafayette.edu scalar.lafayette.edu
-
.
Perreault, Thomas. “From the Guerra Del Agua to the Guerra Del Gas: Resource Governance, Neoliberalism and Popular Protest in Bolivia.” Antipode 38, no. 1 (2006): 150–72. 19500945. https://doi.org/10.1111/j.0066-4812.2006.00569.x.
-
.
THE DEMOCRACY CENTER. “Bolivia’s War Over Water.” Accessed April 12, 2026. https://www.democracyctr.org/bolivias-war-over-water.
-
y
Bakker, Karen. “The ‘Commons’ Versus the ‘Commodity’: Alter-Globalization, Anti-Privatization and the Human Right to Water in the Global South.” Antipode 39, no. 3 (2007): 430–55. 25736585. https://doi.org/10.1111/j.1467-8330.2007.00534.x.
-
.
THE DEMOCRACY CENTER. “Bolivia’s War Over Water.” Accessed April 12, 2026. https://www.democracyctr.org/bolivias-war-over-water.
-
.
THE DEMOCRACY CENTER. “Bolivia’s War Over Water.” Accessed April 12, 2026. https://www.democracyctr.org/bolivias-war-over-water.
-
-
scalar.lafayette.edu scalar.lafayette.edu
-
.
Spronk, Susan. “Water and Sanitation Utilities in the Global South: Re-Centering the Debate on ‘Efficiency.’” Review of Radical Political Economics 42, no. 2 (2010): 156–74. https://doi.org/10.1177/0486613410368389.
-
.
Bakker, Karen. “The ‘Commons’ Versus the ‘Commodity’: Alter-Globalization, Anti-Privatization and the Human Right to Water in the Global South.” Antipode 39, no. 3 (2007): 430–55. 25736585. https://doi.org/10.1111/j.1467-8330.2007.00534.x.
-
-
-
a user
Use consistent tense throughout the post, either 2nd or 3rd person (i.e.., you vs they).
-