models climb close to the average human baseline over the past year and a half.
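That AI systems approached the average human baseline within this span (a year and a half) shows how quickly AI is progressing on basic commonsense reasoning. This data point suggests that although simple benchmarks may be nearing saturation, they can still expose the limitations of AI systems.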
The volume of open-world evaluations has increased dramatically in recent months.
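Although the article gives no specific growth percentage, the phrase "increased dramatically" suggests that open-world evaluations are becoming a new trend in AI evaluation. This growth may reflect the industry's deepening awareness of the limits of traditional benchmarks, and the fact that AI capabilities have reached a stage that requires more complex evaluation methods.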
Total AI computing capacity has been doubling approximately every seven months
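A doubling of AI compute every 7 months far outpaces Moore's law (a doubling roughly every 18 to 24 months), reflecting the AI field's intense appetite for compute and rapidly growing industry investment. This exponential trend is unsustainable: it will run into physical limits, energy supply constraints, and manufacturing costs, and may slow within the next few years.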
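To make the comparison with Moore's law concrete, the short sketch below converts a doubling time into an annual growth multiple. It assumes clean exponential growth, which the source does not claim holds exactly; the function names are illustrative only.

```python
import math


def annual_growth_factor(doubling_months: float) -> float:
    """Growth multiple over 12 months for a given doubling time in months."""
    return 2 ** (12 / doubling_months)


def months_to_multiply(factor: float, doubling_months: float) -> float:
    """Months needed to grow by `factor`, assuming a constant doubling time."""
    return doubling_months * math.log2(factor)


if __name__ == "__main__":
    # 7 months is the figure quoted in the highlight; 18 and 24 months are
    # the commonly cited Moore's-law range mentioned in the note.
    for label, months in [("AI compute", 7), ("Moore (fast)", 18), ("Moore (slow)", 24)]:
        print(f"{label}: x{annual_growth_factor(months):.2f} per year")
    print(f"Months to reach 1000x at a 7-month doubling: {months_to_multiply(1000, 7):.1f}")
```

Under these assumptions, a 7-month doubling time works out to a bit more than a threefold increase per year, against roughly 1.4x to 1.6x per year for an 18 to 24 month doubling.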
Worldwide IT spending is expected to reach $6.31 trillion in 2026, up 13.5% from 2025, according to Gartner.
Gartner's forecast provides an important data point on the growth of global IT spending, which may conceal deeper changes in the industry.
The rankings, set up by a Meta employee on its intranet using company data, measure how many tokens — the units of data processed by AI models — employees are burning through.
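This observation reveals "tokenmaxxing" as a new trend in gauging employees' AI usage, and implies that data consumption is becoming a measure of productivity.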
Pindrop reported a 475 percent year-over-year increase in synthetic voice attacks against insurance call centers across 2025.
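A 475% year-over-year growth rate shows that synthetic voice attacks are growing explosively. The striking figure reflects both the spread of AI voice technology and the speed at which attackers are adopting it. Insurers are a prime target because claims are handled mainly by phone, which makes voice verification a critical security step.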
Three of the four metrics (ECI, log METR 50% time horizon, and a math-focused index we constructed from several math benchmarks) show strong evidence that progress has sped up relative to a global linear trend fit to data from 2023 onward.
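This data point shows that 75% of the AI capability metrics (three of four) display an accelerating trend, a fairly high share. The article notes that the acceleration began in 2023, coinciding with the emergence of reasoning models. The share is notable because it suggests AI progress may be undergoing a qualitative shift rather than merely a quantitative accumulation.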
Meta is not alone in pursuing such a vision: Anthropic debuted tech capable of doing this [in 2024] and OpenAI last year announced [“Operator”] – a tool that can use a web browser on a human’s behalf.
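Most people might assume Meta is unique in pursuing this vision, but the author points out that Anthropic and OpenAI are doing similar work, which suggests the trend may be more widespread than people think.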
I just hope the industry doesn't abandon the Model Context Protocol. The dream of seamless AI integration relies on standardized interfaces, not a fractured landscape of hacky CLIs.
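This is a serious concern about the industry's direction. The author hints at a troubling trend: the industry may abandon MCP, a standardized interface, too early in favor of fragmented CLI approaches. That would not only degrade the user experience but could also hinder seamless integration between AI and services, and hold back the development of the whole ecosystem.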
The industry is currently witnessing a decisive shift toward more permissive, standardized licenses as developers increasingly prioritize ease of integration and legal certainty.
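Surprisingly, the AI industry is undergoing a clear shift toward more permissive, standardized licenses, reflecting developers' growing emphasis on ease of integration and legal certainty. The trend suggests that as AI models mature, license choice is becoming as important a factor as model performance, changing the landscape of AI development.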
Five hyperscalers now own over two-thirds of global AI compute, rising from 60% in Q1 2024.
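Surprisingly, these five hyperscalers' share of global AI compute grew from 60% to about 67% in just one year, showing that AI compute is concentrating in the hands of a few tech giants at an unprecedented pace, which may deepen imbalances in AI development.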
The launch shows Meta is increasingly betting that efficiency, product integration, and distribution, not just model size, will define the next phase of competition in AI.
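This reveals an important shift in the AI industry, from simply pursuing larger models toward practicality and integration. Meta's strategy suggests that the key to future AI competition may not be model scale but how seamlessly AI is integrated into existing products and how much it improves efficiency. The shift could reshape the direction and investment priorities of the entire AI industry.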
The significance of knowing blood glucose trends (fluctuations)
the field of smart wearable devices has developed rapidly in recent years. Among them, medical wearable devices have become one of the most promising fields.
Highlights the rise of wearable medical devices as a key trend in mobile health.
Deutscher explains that languages tend to shift from being more complex to less complex
for - language evolution - trend - simplification
if your treatments are ordered, don't compare each mean with each other mean (multiple comparisons), instead do one test for trend to ask if the outcome is linearly related with treatment number
How do you do hypothesis testing for trends for an ordered categorical variable?
Could you convert x to numbers (1,2,3) and run a linear regression y ~ x? or even categorical ordered variables can be linearly regressed?
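A minimal sketch of the idea raised in the question above: score the ordered treatments as 1, 2, 3 and test whether the outcome is linearly related to the score. It assumes the levels are equally spaced, which is a modelling choice and not something the ordering itself guarantees. The data are made up for illustration, and the permutation test stands in for the usual t-test p-value so that the example needs only the standard library.

```python
import math
import random


def slope_of(scores, outcomes):
    """Least-squares slope of outcome on treatment score."""
    n = len(scores)
    mean_x = sum(scores) / n
    mean_y = sum(outcomes) / n
    sxx = sum((x - mean_x) ** 2 for x in scores)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(scores, outcomes))
    return sxy / sxx


def trend_t_statistic(scores, outcomes):
    """t statistic for the slope in y ~ x (n - 2 degrees of freedom)."""
    n = len(scores)
    mean_x = sum(scores) / n
    mean_y = sum(outcomes) / n
    slope = slope_of(scores, outcomes)
    intercept = mean_y - slope * mean_x
    sxx = sum((x - mean_x) ** 2 for x in scores)
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(scores, outcomes))
    std_err = math.sqrt(rss / (n - 2) / sxx)
    return slope / std_err


def permutation_p_value(scores, outcomes, n_permutations=2000, seed=0):
    """Two-sided p-value: how often a shuffled slope is as extreme as the observed one."""
    rng = random.Random(seed)
    observed = abs(slope_of(scores, outcomes))
    shuffled = list(outcomes)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(slope_of(scores, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical data: three ordered treatment levels, four observations each.
    scores = [1] * 4 + [2] * 4 + [3] * 4
    outcomes = [4.1, 3.8, 4.4, 4.0, 4.9, 5.2, 4.7, 5.1, 5.8, 6.1, 5.6, 6.0]
    print("slope:", slope_of(scores, outcomes))
    print("t statistic:", trend_t_statistic(scores, outcomes))
    print("permutation p-value:", permutation_p_value(scores, outcomes))
```

This runs one test for the trend instead of three pairwise comparisons. If equal spacing between levels is not plausible, a rank-based trend test would be a more cautious choice.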
Michel Forst, UN rapporteur for the Aarhus Convention, has called on European governments to support climate activists instead of criminalizing them. Growing repression, he says, endangers the achievement of the Paris climate goals as well as democracy and human rights in Europe. Forst expects protest and direct action to increase because the current policies of many European governments do not respect the scientific findings on global heating, biodiversity loss, and pollution. https://www.theguardian.com/environment/2024/feb/28/european-nations-must-end-repression-of-peaceful-climate-protest-says-un-expert
Position paper by Michel Forst: https://unece.org/sites/default/files/2024-02/UNSR_EnvDefenders_Aarhus_Position_Paper_Civil_Disobedience_EN.pdf
here are several ways I have found useful to invite the sociological imagination:
C. Wright Mills delineates a rough definition of "sociological imagination", which could be thought of as a framework within tools for thought:
1. Combinatorial creativity
2. Diffuse thinking, flâneur
3. Changing perspective (how would x see this?). Writing dialogues is a useful method to accomplish this. (He doesn't state it, but acting as a devil's advocate is a useful technique here as well.)
4. Collecting and laying out all the multiple viewpoints and arguments on a topic. (This might presume the method of devil's advocate I mentioned above 😀)
5. Play and exploration with words and terms
6. Watching levels of generality and breaking things down into smaller constituent parts or building blocks. (This also might benefit from abstracting ideas from one space to another.)
7. Categorization or casting ideas into types
8. Cross-tabulating and creation of charts, tables, and diagrams or other visualizations
9. Comparative cases and examples: finding examples of an idea in other contexts and time settings for comparison and contrast
10. Extreme types and opposites (or polar types): coming up with the most extreme examples of comparative cases or opposites of one's idea. (Cross reference: Compass Points https://hypothes.is/a/Di4hzvftEeyY9EOsxaOg7w and thinking routines.) This includes creating dimensions of study on an object: what axes define it? What indices can one find data or statistics on?
11. Create historical depth: examples may be limited in number, so what might exist in the historical record to provide depth?
towards amateur cultural modes aided by digital media and networking is a much broader sociotechnical trend
Newsnodes USA COVID-19 Monitor. (n.d.). Retrieved April 20, 2022, from https://newsnodes.com/us
Katherine Ognyanova. (2022, February 15). Americans who believe COVID vaccine misinformation tend to be more vaccine-resistant. They are also more likely to distrust the government, media, science, and medicine. That pattern is reversed with regard to trust in Fox News and Donald Trump. Https://osf.io/9ua2x/ (5/7) https://t.co/f6jTRWhmdF [Tweet]. @Ognyanova. https://twitter.com/Ognyanova/status/1493596109926768645
Carl T. Bergstrom. (2022, January 20). Hope springs eternal. Https://t.co/GAghN6U4LI [Tweet]. @CT_Bergstrom. https://twitter.com/CT_Bergstrom/status/1484279472605982720
Dr Duncan Robertson [@Dr_D_Robertson]. (2021, October 29). ONS Covid survey. 2% of the population +ve. “The percentage of people testing positive for COVID-19 increased for all age groups, except for those in school Year 12 to those aged 34 years, where the trend was uncertain in the week ending 22 October 2021” https://ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/conditionsanddiseases/bulletins/coronaviruscovid19infectionsurveypilot/29october2021 https://t.co/1n9KVq6wDT [Tweet]. Twitter. https://twitter.com/Dr_D_Robertson/status/1454050450106376192
Coronavirus (COVID-19) Infection Survey, UK: 29 October 2021, Office for National Statistics
the growing importance of "explainable AI", which can be implemented with the help of semantic knowledge models, among other means
Explainable AI
Use of machine learning and graph technologies
ML and graph technology
Use of highly scalable graph database technologies that, through the integration of semantic middleware components, visualization tools, and editors, can also be operated by non-technical users and domain experts,
Trend
Cataloging of data using a strong Metadata Management operating model that includes the business data stewards will be a priority in 2022.
trend
Elliott, P., Eales, O., Bodinier, B., Tang, D., Wang, H., Jonnerby, J., Haw, D., Elliott, J., Whitaker, M., Walters, C., Atchison, C., Diggle, P., Page, A., Trotter, A., Ashby, D., Barclay, W., Taylor, G., Ward, H., Darzi, A., … Donnelly, C. (2022). Post-peak dynamics of a national Omicron SARS-CoV-2 epidemic during January 2022 [Working Paper]. http://spiral.imperial.ac.uk/handle/10044/1/93887
Diego Bassani, PhD 🏠😷 💉 💉 💉. (2022, January 7). Seasonality, huh? Https://t.co/WcarGXqRSY [Tweet]. @DGBassani. https://twitter.com/DGBassani/status/1479278943328944130
Colin Davis. (2021, December 20). Update for 20th Dec. The trend line still reflects 1.8 day doubling (it’s 1.7 days if we look at just the last week). Today’s number is down, but I wouldn’t read too much into that at this point. Https://t.co/kOCjxhRbop [Tweet]. @ProfColinDavis. https://twitter.com/ProfColinDavis/status/1472969632705392640
ReconfigBehSci. (2021, December 12). RT @thehowie: BREAKING: Hospitalizations South Africa Hospitalizations⬆️70% week over week Gauteng Province⬆️50% GP⬆️1.2% from yesterday IC… [Tweet]. @SciBeh. https://twitter.com/SciBeh/status/1470065925596618755
COVID-19 Vaccine Breakthrough Weekly Update—Minnesota Dept. Of Health. (n.d.). Retrieved November 23, 2021, from https://www.health.state.mn.us/diseases/coronavirus/stats/vbt.html
Greg Jericho. (2021, August 29). The next fortnight could get pretty damn ugly. Here’s hoping the we are near the peak https://t.co/HW1JY4xy8c [Tweet]. @GrogsGamut. https://twitter.com/GrogsGamut/status/1431789682380066818
COVID in Israel: Serious Cases Decline for Fourth Consecutive Day. (n.d.). Haaretz. Retrieved 6 September 2021, from https://www.haaretz.com/israel-news/covid-in-israel-serious-cases-decline-for-fourth-consecutive-day-1.10179105
Derek Thompson. (2021, August 25). Adult hospitalizations since July 1 vs. Vaccinations, by state: 1) The relationship between more vaccines and less hospitalization is pretty straightforward. 2) Holy moly, Florida. Among states with more than one shot per person, FL really is on its own island of pain. Https://t.co/tuTAdUT0OM [Tweet]. @DKThomp. https://twitter.com/DKThomp/status/1430643278337163267
We’ve analyzed thousands of COVID-19 misinformation narratives. Here are six regional takeaways—Bulletin of the Atomic Scientists. (n.d.). Retrieved August 1, 2021, from https://thebulletin.org/2021/06/weve-analyzed-thousands-of-covid-19-misinformation-narratives-here-are-six-regional-takeaways/
Dvir Aran. (2021, July 27). You’ve probably seen reports from Israel on low vaccine effectiveness in this wave. Is it because of Delta? Waning immunity? We think the reason is mostly that we got the denominator wrong. Https://t.co/yloh5Vo9Xi [Tweet]. @dvir_a. https://twitter.com/dvir_a/status/1420059124700700677
ReconfigBehSci. (2021, June 8). RT @TWenseleers: Across whole of England, 76% [74-78%] of all newly diagnosed infections now B.1.617.2. We see familiar pattern of combinat… [Tweet]. @SciBeh. https://twitter.com/SciBeh/status/1402633076396552193
ReconfigBehSci. (2021, June 7). RT @JamesWard73: Thanks to the person who pointed me towards this report: Https://t.co/XZZ7JsdA8t (you know who you are!)—Lots of interes… [Tweet]. @SciBeh. https://twitter.com/SciBeh/status/1402050855449202696
ReconfigBehSci. (2021, June 5). RT @TWenseleers: Estimated growth rate advantage of B.1.617.2 vs B.1.1.7 is 8.3%/day [7.9-8.8%] 95% CLs and 4%/day [3-5%] for B.1.617.1 vs… [Tweet]. @SciBeh. https://twitter.com/SciBeh/status/1402243029407178757
ReconfigBehSci. (2021, June 14). RT @AdamJKucharski: Compare relationship between R and proportion of sequences consistent with the B.1.1.7 variant in autumn 2020 (below le… [Tweet]. @SciBeh. https://twitter.com/SciBeh/status/1404604886117797888
ReconfigBehSci. (2021, June 1). RT @nikosbosse: Predictions from the second week of the UK Covid-19 Crowd Forecasting Challenge are in. Https://t.co/GfzSBYRmgq On average… [Tweet]. @SciBeh. https://twitter.com/SciBeh/status/1399869840928612354
Prof. Christina Pagel. (2021, April 15). THREAD on VACCINATION & EQUITY in ENGLAND: I know I’ve tweeted about this before, but now we can look at how gaps by deprivation and ethnicity change with age groups and what that might mean... TLDR: widening gaps but access and communication will be key I suspect 1/5 [Tweet]. @chrischirp. https://twitter.com/chrischirp/status/1382725119773134848
Researchers reported that large populations consistently develop highly similar category systems. This may be relevant to lexical aspects of large communication networks and cultures such as folksonomies and language or human communication, and sense-making in general.
how we might accelerate the path to the Metaverse by a focus on desktop access by remote workers: Could it be that the Metaverse starts with people working together in virtual offices, and then staying around and connecting for various reasons outside of work?
this is actually a fascinating theory and i betcha this is actually how the metaverse starts to take hold... people looking for better alternatives than zoom for virtual events, conferences, work meetings, and birthday parties online
this will merge with video games to create the metaverse
it's hilarious to think of it this way
Katz points out this is an “extreme example to prove a point.” YellowHeart wants to show people how much control can be put into the ticket with smart contracts. Going forward, he says this same tech can be used for general tickets, which could be a huge advancement in the secondary market. Every time an NFT is resold, a percentage of money earned could go to the artist — or whoever is included in the contract, perhaps even a charity. (In such instances, YellowHeart can also set a maximum price that the NFT can be resold at, eradicating scalpers.)
In my view this is a killer feature of NFTs in music. The rest is nice, but this is super interesting.
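The resale rule described in the highlight (a cut of every resale goes to a named recipient, and the resale price is capped) can be sketched as a small plain-Python model. This is not YellowHeart's contract or any real smart-contract API. The class name, its fields, and the 10% rate are hypothetical and chosen only to show the mechanism. Amounts are integers in cents to avoid rounding issues.

```python
from dataclasses import dataclass, field


@dataclass
class TicketContract:
    """Toy model of a ticket whose resale terms are fixed when it is issued."""

    owner: str
    royalty_recipient: str          # the artist, or e.g. a charity
    royalty_percent: int            # share of each resale, in whole percent
    max_resale_cents: int           # price cap intended to deter scalping
    payouts: dict = field(default_factory=dict)

    def resell(self, buyer: str, price_cents: int) -> None:
        """Transfer the ticket, splitting the price between recipient and seller."""
        if price_cents > self.max_resale_cents:
            raise ValueError("price exceeds the contract's resale cap")
        royalty = price_cents * self.royalty_percent // 100
        self._credit(self.royalty_recipient, royalty)
        self._credit(self.owner, price_cents - royalty)
        self.owner = buyer

    def _credit(self, account: str, amount_cents: int) -> None:
        self.payouts[account] = self.payouts.get(account, 0) + amount_cents


if __name__ == "__main__":
    ticket = TicketContract(owner="alice", royalty_recipient="artist",
                            royalty_percent=10, max_resale_cents=20_000)
    ticket.resell("bob", 15_000)    # artist is credited on this resale
    ticket.resell("carol", 18_000)  # and again on the next one
    print(ticket.owner, ticket.payouts)
```

The point of the sketch is that both rules live in the ticket itself, so they apply to every later sale without the original seller being involved.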
The COVID Tracking Project. (2020, November 11). Our daily update is published. States reported 1.2 million tests and 131k cases, the highest single-day total since the pandemic started. There are 62k people currently hospitalized with COVID-19. The death toll was 1,347. Https://t.co/WPoX9Nj7ef [Tweet]. @COVID19Tracking. https://twitter.com/COVID19Tracking/status/1326321342933831680
ReconfigBehSci. (2020, December 5). RT @bhrenton: On average, one person in the U.S. died of Covid-19 every 43 seconds last week. On this morning’s @CNNnewsroom, @Christi_Paul… [Tweet]. @SciBeh. https://twitter.com/SciBeh/status/1335906973750059009
Miro Weinberger. (2020, December 3). Our 1st Covid-19 wastewater tests since Thanksgiving just came in—Virus levels are up significantly citywide. I hope that all of #BTV will look at this graph and see what I see: A call to action, to stop gathering with other households, and to get tested ASAP if you have https://t.co/8nxTwOOcFA [Tweet]. @MiroBTV. https://twitter.com/MiroBTV/status/1334613511692017664
Dr Dominic Pimenta 💙 Please Just Stay At Home. (2021, January 3). @piersmorgan All the times the down-players were plain wrong: Https://t.co/Rx25xBHSgR https://t.co/ilXjRxMtXG [Tweet]. @DrDomPimenta. https://twitter.com/DrDomPimenta/status/1345737029464670208
Turk, E., Čelik, T., Smrdu, M., Šet, J., Kuder, A., Gregorič, M., & Kralj-Fišer, S. (2021). ADHERENCE TO COVID-19 MITIGATION MEASURES IN SLOVENIA: THE ROLE OF SOCIODEMOGRAPHIC AND PERSONALITY FACTORS. PsyArXiv. https://doi.org/10.31234/osf.io/hrfyk
Tim Blais (2021, January 17). @acapellascience: Vaccine wellerman let’s gooooo #seashantytok https://t.co/JUv0l82cf3 [Tweet]. https://twitter.com/acapellascience/status/1350782855564111875?s=20
Refactoring at scale is challenging and a significant investment, so communicating the progress is important both from a business perspective but also as a motivational effect to developers. In this case we used a combination of complexity trend visualizations and Code Health Metric as shown in the preceding example.
[[refactoring at scale]]
Kolko, J. (2020, September 30). Coronavirus and US Job Postings Through Sept 25. Indeed Hiring Lab. https://www.hiringlab.org/2020/09/30/job-postings-through-sept-25/
COVID Projections Tracker. (n.d.). Retrieved September 7, 2020, from https://www.covid-projections.com/
Bavadekar, Shailesh, Andrew Dai, John Davis, Damien Desfontaines, Ilya Eckstein, Katie Everett, Alex Fabrikant, et al. ‘Google COVID-19 Search Trends Symptoms Dataset: Anonymization Process Description (Version 1.0)’. ArXiv:2009.01265 [Cs], 2 September 2020. http://arxiv.org/abs/2009.01265.
Working at Home in Greece: Unexplored Potential at Times of Social Distancing?. COVID-19 and the Labor Market. (n.d.). IZA – Institute of Labor Economics. Retrieved July 31, 2020, from https://covid-19.iza.org/publications/dp13408/
Garrett, G. (2020 May 18) The post-COVID-19 world could be less global and less urban. World Economic Forum. https://www.weforum.org/agenda/2020/05/coronavirus-covid19-urbanization-globalization-change/
Dudel, C., Riffe, T., Acosta, E., van Raalte, A. A., Strozza, C., & Myrskylä, M. (2020). Monitoring trends and differences in COVID-19 case fatality rates using decomposition methods: Contributions of age structure and age-specific fatality [Preprint]. SocArXiv. https://doi.org/10.31235/osf.io/j4a3d
Schraff, D. (2020). Political trust during the Covid-19 pandemic: Rally around the flag or lockdown effects? [Preprint]. SocArXiv. https://doi.org/10.31235/osf.io/pu47c
Cohen, P. N. (2020). The COVID-19 epidemic in rural U.S. counties. https://doi.org/10.31235/osf.io/pnqrd
Webster, G. D., Mahar, E., & Wongsomboon, V. (2020). American Psychology Is Becoming More International, But Too Slowly: Comment on Thalmayer et al. (2020). https://doi.org/10.31234/osf.io/wqmer
Views, R. (2020, June 3). More Select COVID-19 Resources. /2020/06/03/more-select-covid-19-resources/
Fränken, J.-P., & Pilditch, T. (2020). Cascades across networks are sufficient for the formation of echo chambers: An agent-based model. https://doi.org/10.31234/osf.io/8rgkc
Winton Centre for Risk and Evidence Communication
Bento, A. I., Nguyen, T., Wing, C., Lozano-Rojas, F., Ahn, Y.-Y., & Simon, K. (2020). Evidence from internet search data shows information-seeking responses to news of local COVID-19 cases. Proceedings of the National Academy of Sciences, 202005335. https://doi.org/10.1073/pnas.2005335117
Stephens-Davidowitz, S. (2020, April 5). Opinion | Google Searches Can Help Us Find Emerging Covid-19 Outbreaks. The New York Times. https://www.nytimes.com/2020/04/05/opinion/coronavirus-google-searches.html
Du, H., Yang, J., King, R. B., Yang, L., & Chi, P. (2020). COVID-19 Increases Online Emotional and Health-Related Searches [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/5gskw
COVID-19 Community Mobility Report. (n.d.). COVID-19 Community Mobility Report. Retrieved April 8, 2020, from https://www.google.com/covid19/mobility
Wang, T., Chen, X., Zhang, Q., & Jin, X. (2020, April 26). Use of Internet data to track Chinese behavior and interest in COVID-19. https://doi.org/10.31234/osf.io/j6m8q
Breckon, J. (2020 April 16). Seven welcome Covid-19 trends. Researchprofessionalnews.com. https://www.researchprofessionalnews.com/rr-news-uk-views-of-the-uk-2020-4-seven-welcome-covid-19-trends/
the deceptive practices it has been used to shield and enable are on borrowed time. The direction of travel — and the direction of innovation — is pro-privacy, pro-user control and therefore anti-deceptive-design.
During the recession everyone was trying to grab onto anything that would get them more business. Even small trucking companies were trying to get in on the action by saying they offered 3PL and 4PL services, but most were well out of their element.”
Crime increasing in the past 3 years: 47.75
47.75
Following the history of information technology and the massive trend towards open source, we can see that democratizing information is the natural next step in the incessant trend to open source, and thus the next big opportunity for innovation.
Those who succeed the most and establish successful platforms “on top” of the open standard later tend to consolidate the industry by leveraging their scale (in assets and distribution) to integrate vertically and expand horizontally at the expense of smaller companies. Competing in this new environment suddenly becomes expensive and startups struggle to create value in the shadow of incumbents, compressing venture returns. Demand then builds for a low cost, open source alternative to the incumbent platforms, and the cycle repeats itself: the new open standard emerges and gets adopted, the market decentralizes as new firms leverage the cost savings to compete with the old on price, value creation shifts upwards (once more), and so on
Information technology evolves in multi-decade cycles of expansion, consolidation and decentralization. Periods of expansion follow the introduction of a new open platform that reduces the production costs of technology as it becomes a shared standard. As production costs fall, new firms come to market leveraging the standard to compete with established incumbents, pushing down prices and margins, and decentralizing existing market powers. The price drop attracts new users, increasing the overall size of the market and creating new opportunities for mass consumer applications. Entrepreneurial talent moves to serve the new markets where costs are low, competition is scarce, and the upside is high. Often these early entrepreneurs will introduce new kinds of business models, orthogonal to existing ones
Games such as The Sims Series, and Second Life are designed to be non-linear and to depend on collective intelligence for expansion. This way of sharing is gradually evolving and influencing the mindset of the current and future generations.[117] For them, collective intelligence has become a norm.
Soon after, Genius made a definitive push to realize Andreessen’s vision. By 2015, Genius claimed 40 million visitors to its website a month, 1 million of whom had annotated a post.
But the biggest problem with the annotator from Genius’ perspective is that few individuals are using it. After more than two years of development, the Chrome extension has only 12,320 users. It was last updated in June 2016.
Recent studies have indicated that Uber’s U.S. driver churn has sharply increased this year, to rates as high as 96%. Needless to say, it’s hard (and costly) to maintain double-digit growth rates, when only 4% of mission critical, de facto employees stay on the job for more than a year.
In historical context, Uber’s extraordinary losses are thus not just a case of growing pains of an ambitious Silicon Valley startup, but a reflection of the deep structural deficiencies in ride-hail industry economics. Prior to artificial regulatory supply caps, the unregulated taxi industry was unprofitable and subject to growing concerns over negative externalities. Uber is now facing the same relentless drag on its P&L.
An underlying theme in much of the work in the field is that existing government regulation of copyright, security, and antitrust is inappropriate in the modern world. For example, information goods, such as news articles and movies, now have zero marginal costs of production and sharing. This has made the redistribution without permission common and has increased competition between providers of information goods.
Research suggests that the number of close friends people report having has declined by one-third over the past generation (at least in the United States).