Hypothesis

33 Matching Annotations

Jul 2024
whoosh.readthedocs.io

Query expansion and Key word extraction — Whoosh 2.7.4 documentation

1
1. Spinningthoughts 03 Jul 2024
  
  in Public
  
  Whoosh provides methods for computing the “key terms” of a set of documents. For these methods, “key terms” basically means terms that are frequent in the given documents, but relatively infrequent in the indexed collection as a whole.
  
  Very interesting method, and an interesting way of looking at the signal: what makes a document exceptional is something that is common within it and uncommon outside it.
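
The idea in the quoted passage can be sketched in a few lines of Python. This is a minimal TF-IDF-style illustration, not Whoosh's own scoring model or API; the function name `key_terms`, the whitespace tokenizer, and the smoothing inside the logarithm are all assumptions made for the example.

```python
import math
from collections import Counter


def key_terms(selected_docs, all_docs, top_n=3):
    """Rank terms that are frequent in selected_docs but rare in all_docs."""
    # Term frequency inside the documents of interest.
    selected_counts = Counter(
        term for doc in selected_docs for term in doc.lower().split()
    )
    # Document frequency across the whole collection.
    doc_freq = Counter()
    for doc in all_docs:
        doc_freq.update(set(doc.lower().split()))
    n_docs = len(all_docs)
    # Frequent locally, rare globally -> high score (smoothed to avoid 0 division).
    scores = {
        term: count * math.log((1 + n_docs) / (1 + doc_freq[term]))
        for term, count in selected_counts.items()
    }
    # Sort by descending score, breaking ties alphabetically.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_n]


collection = [
    "the cat sat on the mat",
    "the dog sat on the log",
    "the whoosh index stores the whoosh postings",
    "the bird sat on the branch",
]
top = key_terms([collection[2]], collection)
```

In this toy collection "the" scores zero because it occurs in every document, while "whoosh" ranks first because it is repeated in the selected document and absent elsewhere.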
  
  natural language processing

Tags

natural language processing

Annotators

Spinningthoughts

URL

whoosh.readthedocs.io/en/latest/keywords.html
May 2024
media.dltj.org

Video: Handling Academic Copyright and Artificial Intelligence Research Questions as the Law Develops by CNI Spring Meeting 2024, annotated

1
1. peter_murray 28 May 2024
  
  in Public
  
  Google translate is generative AI
  
  Google Translate as generative AI
  
  natural language translation

Tags

natural language translation

Annotators

peter_murray

URL

media.dltj.org/annotated-video/20240527T173838-GMttBH1oAD4-handling-academic-copyright-artificial-intelligence-research-questions-law-develops/index.html
Feb 2024
www.cortical.io

Semantic Folding | Semantic Fingerprinting | Language Intelligence | Cortical.io

1
1. stopresetgo 04 Feb 2024
  
  in Public
  
  for - semantic folding - semantic fingerprint - natural language processing - NLP - cortical.io - Numenta
  
  cortical.io Numenta natural language processing NLP semantic fingerprint semantic folding

Tags

semantic fingerprint

NLP

semantic folding

natural language processing

Numenta

cortical.io

Annotators

stopresetgo

URL

cortical.io/science/semantic-folding/
Feb 2023
arstechnica.com

The generative AI revolution has begun—how did we get here?

1
1. peter_murray 01 Feb 2023
  
  in Public
  
  An AI model that can learn and work with this kind of problem needs to handle order in a very flexible way. The old models—LSTMs and RNNs—had word order implicitly built into the models. Processing an input sequence of words meant feeding them into the model in order. A model knew what word went first because that’s the word it saw first. Transformers instead handled sequence order numerically, with every word assigned a number. This is called "positional encoding." So to the model, the sentence “I love AI; I wish AI loved me” looks something like (I 1) (love 2) (AI 3) (; 4) (I 5) (wish 6) (AI 7) (loved 8) (me 9).
  
  Google’s “the transformer”
  
  One breakthrough was positional encoding, which removed the need to process the input in the order it was given. A second was using matrices rather than vectors. This research came from Google Translate.
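
The numbering described in the quoted passage can be reproduced with a short Python sketch. It only mirrors the article's simplified picture, where each token is paired with an integer position; real transformer models turn positions into vectors, which is not shown here. The function name `number_tokens` and the regex tokenizer are assumptions made for the example.

```python
import re


def number_tokens(sentence):
    """Pair each token with its 1-based position in the sentence."""
    # Words are runs of word characters; each punctuation mark is its own token.
    tokens = re.findall(r"\w+|[^\w\s]", sentence)
    return [(token, position) for position, token in enumerate(tokens, start=1)]


pairs = number_tokens("I love AI; I wish AI loved me")
print(" ".join(f"({token} {position})" for token, position in pairs))
# prints: (I 1) (love 2) (AI 3) (; 4) (I 5) (wish 6) (AI 7) (loved 8) (me 9)
```

Because the position travels with each token, the pairs can be shuffled or processed in parallel without losing the original word order.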
  
  natural language translation

Tags

natural language translation

Annotators

peter_murray

URL

arstechnica.com/gadgets/2023/01/the-generative-ai-revolution-has-begun-how-did-we-get-here/
Jan 2023
www.complexityexplorer.org

Complexity Explorer

1
1. chrisaldrich 23 Jan 2023
  
  in Public
  
  a common technique in natural language processing is to operationalize certain semantic concepts (e.g., "synonym") in terms of syntactic structure (two words that tend to occur nearby in a sentence are more likely to be synonyms, etc). This is what word2vec does.
  
  Can I use some of these sorts of methods with respect to corpus linguistics over time to better identify calcified words or archaic phrases that stick with the language, but are heavily limited to narrower(ing) contexts?
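
The "words that occur nearby" signal mentioned in the quote can be illustrated with a small Python sketch. This is not word2vec itself, which trains vector embeddings; it only counts raw co-occurrences within a window, the kind of evidence such models learn from. The function name `cooccurrence_counts`, the whitespace tokenizer, and the default window size are assumptions made for the example.

```python
from collections import Counter


def cooccurrence_counts(sentences, window=2):
    """Count unordered word pairs appearing within `window` tokens of each other."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            # Look only forward so each pair is counted once per occurrence.
            for neighbor in tokens[i + 1 : i + 1 + window]:
                if neighbor != word:
                    counts[tuple(sorted((word, neighbor)))] += 1
    return counts


sample = ["the quick brown fox", "the quick red fox"]
counts = cooccurrence_counts(sample, window=1)
```

Running the same count over separate time slices of a corpus and comparing how many distinct neighbors a word has in each slice would be one rough way to look for words whose contexts are narrowing.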
  
  calcified words word2vec operationalization natural language processing historical linguistics open questions archaic phrases information theory

Tags

word2vec

calcified words

open questions

information theory

historical linguistics

operationalization

archaic phrases

natural language processing

Annotators

chrisaldrich

URL

complexityexplorer.org/courses/162-foundations-applications-of-humanities-analytics/segments/15624
genizalab.princeton.edu

Princeton Machine Learning and the Future of Philology Symposium

1
1. chrisaldrich 09 Jan 2023
  
  in Public
  
  https://genizalab.princeton.edu/events/2022/princeton-machine-learning-and-future-philology-symposium
  
  Was this recorded?
  
  machine learning philology symposia digital humanities manuscript studies artificial intelligence corpus linguistics incunabula handwriting recognition natural language processing

Tags

handwriting recognition

artificial intelligence

philology

symposia

natural language processing

manuscript studies

incunabula

machine learning

digital humanities

corpus linguistics

Annotators

chrisaldrich

URL

genizalab.princeton.edu/events/2022/princeton-machine-learning-and-future-philology-symposium
Local file

Finding a Fragment in a Pile of Geniza: A Practical Guide to Collections, Editions, and Resources

1
1. chrisaldrich 09 Jan 2023
  
  in Public
  
  Friedberg Judeo-Arabic Project, accessible at http://fjms.genizah.org. This project maintains a digital corpus of Judeo-Arabic texts that can be searched and analyzed.
  
  The Friedberg Judeo-Arabic Project contains a large corpus of Judeo-Arabic text which can be manually searched to help improve translations of texts, but it might also be profitably mined using information theoretic and corpus linguistic methods to provide larger group textual translations and suggestions at a grander scale.
  
  Friedberg Jewish Manuscript Society Friedberg Judeo-Arabic Project corpus linguistics digital humanities information theory artificial intelligence natural language processing contextual clues contextual extrapolation
Tags

contextual clues

Friedberg Judeo-Arabic Project

artificial intelligence

information theory

contextual extrapolation

Friedberg Jewish Manuscript Society

natural language processing

digital humanities

corpus linguistics

Annotators

chrisaldrich
Dec 2022
dl.acm.org

On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜

1
1. peter_murray 30 Dec 2022
  
  in Public
  
  Emily M. Bender, Timnit Gebru, Angelina McMillan-Major, and Shmargaret Shmitchell. 2021. On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? 🦜. In Proceedings of the 2021 ACM Conference on Fairness, Accountability, and Transparency (FAccT '21). Association for Computing Machinery, New York, NY, USA, 610–623. https://doi.org/10.1145/3442188.3445922
  
  natural language processing

Tags

natural language processing

Annotators

peter_murray

URL

dl.acm.org/doi/pdf/10.1145/3442188.3445922
www.nlpdemystified.org

Natural Language Processing Demystified: Course Content

1
1. chrisaldrich 10 Dec 2022
  
  in Public
  
  https://www.nlpdemystified.org/course
  
  MOOC natural language processing online courseware neural networks

Tags

natural language processing

online courseware

MOOC

neural networks

Annotators

chrisaldrich

URL

nlpdemystified.org/course
Nov 2022
www.researchgate.net

Robert Amsler

1
1. chrisaldrich 14 Nov 2022
  
  in Public
  
  Robert Amsler is a retired computational lexicologist, computational linguist, information scientist. His Ph.D. was from UT-Austin in 1980. His primary work was in the area of understanding how machine-readable dictionaries could be used to create a taxonomy of dictionary word senses (which served as the motivation for the creation of WordNet) and in understanding how lexicon can be extracted from text corpora. He also invented a new technique in citation analysis that bears his name. His work is mentioned in Wikipedia articles on Machine-Readable dictionary, Computational lexicology, Bibliographic coupling, and Text mining. He currently lives in Vienna, VA and reads email at robert.amsler at utexas. edu. He is currently interested in chronological studies of vocabulary, esp. computer terms.
  
  https://www.researchgate.net/profile/Robert-Amsler
  
  Apparently he follows my blog. :)
  
  Makes me wonder how we might better process and semantically parse people's personal notes, particularly when they're atomic and cross-linked?
  
  Robert Amsler linguistics dictionaries natural language processing corpus linguistics idea links open questions

Tags

dictionaries

linguistics

idea links

natural language processing

open questions

corpus linguistics

Robert Amsler

Annotators

chrisaldrich

URL

researchgate.net/profile/Robert-Amsler
Oct 2022
www.explainpaper.com

Explainpaper

1
1. chrisaldrich 27 Oct 2022
  
  in Public
  
  https://www.explainpaper.com/
  
  Another in a growing line of research tools for processing and making sense of research literature including Research Rabbit, Connected Papers, Semantic Scholar, etc.
  
  Functionality includes highlighting sections of a research paper and using natural language processing to explain what those sections mean. There's also a "chat" feature for asking questions about the paper; it attempts to return reasonable answers, which is an artificial intelligence sort of means of having an artificial "conversation with the text".
  
  cc: @dwhly @remikalir @jeremydean
  
  artificial intelligence research papers tools tools for thought literature review literature search information overload research tools Explainpaper annotations natural language processing conversations with the text

Tags

research tools

conversations with the text

tools for thought

Explainpaper

research papers

tools

literature review

literature search

natural language processing

artificial intelligence

information overload

annotations

Annotators

chrisaldrich

URL

explainpaper.com/
Aug 2022
maggieappleton.com

Joining Ought

1
1. chrisaldrich 05 Aug 2022
  
  in Public
  
  https://maggieappleton.com/joining-ought
  
  read Maggie Appleton machine learning natural language processing GPT-3 Elicit Ought

Tags

Maggie Appleton

Elicit

natural language processing

GPT-3

machine learning

read

Ought

Annotators

chrisaldrich

URL

maggieappleton.com/joining-ought
Dec 2021
cacm.acm.org

Converting Laws to Programs

1
1. peter_murray 23 Dec 2021
  
  in Public
  
  Catala, a programming language developed by Protzenko's graduate student Denis Merigoux, who is working at the National Institute for Research in Digital Science and Technology (INRIA) in Paris, France. It is not often lawyers and programmers find themselves working together, but Catala was designed to capture and execute legal algorithms and to be understood by lawyers and programmers alike in a language "that lets you follow the very specific legal train of thought," Protzenko says.
  
  A domain-specific language for encoding legal interpretations.
  
  natural-language-processing legal legislative-history

Tags

natural-language-processing legal legislative-history

Annotators

peter_murray

URL

cacm.acm.org/magazines/2022/1/257436-converting-laws-to-programs/fulltext
Nov 2021
www.nature.com

Natural language processing and network analysis provide novel insights on policy and scientific discourse around Sustainable Development Goals

1
1. SamRose 20 Nov 2021
  
  in Public
  
  natural language processing nlp policy

Tags

nlp

natural language processing

policy

Annotators

SamRose

URL

nature.com/articles/s41598-021-01801-6
Jun 2021
psyarxiv.com

Web-scraping the Expression of Loneliness during COVID-19

1
1. XanaButt 28 Jun 2021
  
  in BehSci
  
  Jung, Y., Lee, Y. K., & Hahn, S. (2021). Web-scraping the Expression of Loneliness during COVID-19. PsyArXiv. https://doi.org/10.31234/osf.io/59gwk
  
  is:preprint lang:en COVID-19 loneliness Natural Language Processing modeling internet social media emotion internal state appraisal online relationship

Tags

online relationship

appraisal

social media

internal state

modeling

Natural Language Processing

loneliness

internet

is:preprint

COVID-19

lang:en

emotion

Annotators

XanaButt

URL

psyarxiv.com/59gwk/
en.wikipedia.org

ISO 639-3 - Wikipedia

1
1. TylerRick 04 Jun 2021
  
  in Public
  
  ISO 639-3 extends the ISO 639-2 alpha-3 codes with an aim to cover all known natural languages.
  
  ISO language codes ISO language codes: ISO 639-3 natural languages

Tags

ISO language codes: ISO 639-3

ISO language codes

natural languages

Annotators

TylerRick

URL

en.wikipedia.org/wiki/ISO_639-3
loc.gov

Untitled document

1
1. TylerRick 04 Jun 2021
  
  in Public
  
  This doesn't seem entirely trust-worthy/useful.
  
  The native name seems incorrect/missing for some languages, like German, Hebrew, compared to https://gist.github.com/piraveen/fafd0d984b2236e809d03a0e306c8a4d
  
  free data list natural languages ISO language codes

Tags

ISO language codes

free data

list

natural languages

Annotators

TylerRick

URL

loc.gov/standards/iso639-2/ISO-639-2_utf-8.txt
en.wikipedia.org

Bolivian Spanish - Wikipedia

1
1. TylerRick 03 Jun 2021
  
  in Public
  
  Similarities in dialects
  
  natural languages language: dialect comparison table see content below

Tags

language: dialect

comparison table

natural languages

see content below

Annotators

TylerRick

URL

en.wikipedia.org/wiki/Bolivian_Spanish
Mar 2021
en.wikipedia.org

Use case - Wikipedia

1
1. TylerRick 25 Mar 2021
  
  in Public
  
  Originally he had used the terms usage scenarios and usage case – the latter a direct translation of his Swedish term användningsfall – but found that neither of these terms sounded natural in English, and eventually he settled on use case.
  
  evolution of language origin story etymology feels natural euphony English

Tags

English

feels natural

etymology

evolution of language

euphony

origin story

Annotators

TylerRick

URL

en.wikipedia.org/wiki/Use_case
psyarxiv.com

Scared into Action: How Partisanship and Fear are Associated with Reactions to Public Health Directives

1
1. sophia.sterckx 15 Mar 2021
  
  in BehSci
  
  Lindow, Mike, David DeFranza, Arul Mishra, and Himanshu Mishra. ‘Scared into Action: How Partisanship and Fear Are Associated with Reactions to Public Health Directives’. PsyArXiv, 12 January 2021. https://doi.org/10.31234/osf.io/8me7q.
  
  is:preprint lang:en COVID-19 political ideology tweets natural language processing word embedding gradient boosted decision trees corona coronavirus health directives liberals conservatives politics federal government USA twitter

Tags

lang:en

processing

tweets

liberals

federal government

word embedding

gradient boosted decision trees

natural language

politics

COVID-19

corona

political ideology

conservatives

coronavirus

health directives

is:preprint

twitter

USA

Annotators

sophia.sterckx

URL

psyarxiv.com/8me7q/
arxiv.org

Semantic and Relational Spaces in Science of Science: Deep Learning Models for Article Vectorisation

1
1. n.parfitt 15 Mar 2021
  
  in BehSci
  
  Kozlowski, Diego, Jennifer Dusdal, Jun Pang, and Andreas Zilian. ‘Semantic and Relational Spaces in Science of Science: Deep Learning Models for Article Vectorisation’. ArXiv:2011.02887 [Physics], 5 November 2020. http://arxiv.org/abs/2011.02887.
  
  lang:en is:article semantic relational science deep learning model article vectorization literature review epistemic social pattern computer science tool research Natural Language Processing Graph Neural Networks

Tags

Graph Neural Networks

research

tool

article

Natural Language Processing

literature

semantic

deep

vectorization

science

model

is:article

epistemic

computer science

relational

learning

review

social

lang:en

pattern

Annotators

n.parfitt

URL

arxiv.org/abs/2011.02887
Sep 2020
markojs.com

Marko

1
1. TylerRick 17 Sep 2020
  
  in Public
  
  UI framework HTML template language feels natural intuitive Marko

Tags

HTML

UI framework

intuitive

feels natural

template language

Marko

Annotators

TylerRick

URL

markojs.com/docs/syntax/
Aug 2020
onlinelibrary.wiley.com

The More Who Die, the Less We Care: Evidence from Natural Language Analysis of Online News Articles and Social Media Posts

1
1. ErikStuchly 31 Aug 2020
  
  in BehSci
  
  Bhatia, S., Walasek, L., Slovic, P., & Kunreuther, H. (2020). The More Who Die, the Less We Care: Evidence from Natural Language Analysis of Online News Articles and Social Media Posts. Risk Analysis, risa.13582. https://doi.org/10.1111/risa.13582
  
  is:article lang:en COVID-19 natural language processing big data online news social media psychic numbing death rate caring affective reaction loss of life valence arousal emotional content psychology

Tags

valence

lang:en

online news

affective reaction

loss of life

caring

emotional content

COVID-19

social media

psychic numbing

is:article

natural language processing

arousal

death rate

big data

psychology

Annotators

ErikStuchly

URL

onlinelibrary.wiley.com/doi/abs/10.1111/risa.13582
psyarxiv.com

Digital phenotyping of complex psychological responses to the COVID-19 pandemic

1
1. Gaurav_Saxena 14 Aug 2020
  
  in BehSci
  
  Hull, T., Levine, J., Bantilan, N., Desai, A., & Majumder, M. S. (2020, August 13). Digital phenotyping of complex psychological responses to the COVID-19 pandemic. https://doi.org/10.31234/osf.io/qtrpf
  
  is:preprint lang:en COVID-19 symptom tracking digital phenotyping psychological sequalae telehealth digital mental health natural language processing machine learning

Tags

psychological sequalae

digital mental health

digital phenotyping

machine learning

symptom tracking

telehealth

is:preprint

natural language processing

COVID-19

lang:en

Annotators

Gaurav_Saxena

URL

psyarxiv.com/qtrpf/
Jul 2020
wit.ai

Wit.ai

1
1. TylerRick 23 Jul 2020
  
  in Public
  
  natural language processing AI

Tags

natural language processing

AI

Annotators

TylerRick

URL

wit.ai/
osf.io

Capturing and analyzing social representations. A first application of Natural Language Processing techniques to reader’s comments in COVID-19 news. Argentina, 2020

1
1. ErikStuchly 15 Jul 2020
  
  in BehSci
  
  Rosati, G., Domenech, L., Chazarreta, A., & Maguire, T. (2020). Capturing and analyzing social representations. A first application of Natural Language Processing techniques to reader’s comments in COVID-19 news. Argentina, 2020 [Preprint]. SocArXiv. https://doi.org/10.31235/osf.io/3pcdu
  
  is:preprint lang:en COVID-19 social representation analysis natural language processing comment news Argentina quantification topic Latent Dirichlet Allocation prototype FastText

Tags

topic

FastText

Argentina

analysis

comment

quantification

prototype

news

natural language processing

is:preprint

Latent Dirichlet Allocation

social representation

COVID-19

lang:en

Annotators

ErikStuchly

URL

osf.io/preprints/socarxiv/3pcdu/
May 2020
arxiv.org

Complex Societies and the Growth of the Law

1
1. edampf 28 May 2020
  
  in BehSci
  
  Katz, D. M., Coupette, C., Beckedorf, J., & Hartung, D. (2020). Complex Societies and the Growth of the Law. ArXiv:2005.07646 [Physics]. http://arxiv.org/abs/2005.07646
  
  is:preprint lang:en computer science complex societies law Germany USA legislation modeling multidimensional time-evolving natural language processing network science welfare state tax state

Tags

network science

tax state

time-evolving

modeling

law

multidimensional

Germany

computer science

natural language processing

is:preprint

complex societies

legislation

lang:en

USA

welfare state

Annotators

edampf

URL

arxiv.org/abs/2005.07646
psyarxiv.com

Moral Concerns are Differentially Observable in Language

1
1. edampf 13 May 2020
  
  in BehSci
  
  Kennedy, B., Atari, M., Davani, A. M., Hoover, J., Omrani, A., Graham, J., & Dehghani, M. (2020, May 7). Moral Concerns are Differentially Observable in Language. https://doi.org/10.31234/osf.io/uqmty
  
  is:preprint lang:en morality language text analysis moral foundations theory observational analysis psychology communication Facebook online status update questionnaire prediction self-report natural language processing

Tags

morality

text analysis

moral foundations theory

communication

Facebook

language

prediction

is:preprint

questionnaire

self-report

online status

update

observational analysis

natural language processing

lang:en

psychology

Annotators

edampf

URL

psyarxiv.com/uqmty/
Apr 2020
en.wikipedia.org

Hyphenation algorithm - Wikipedia

1
1. TylerRick 29 Apr 2020
  
  in Public
  
  algorithms natural language processing

Tags

algorithms

natural language processing

Annotators

TylerRick

URL

en.wikipedia.org/wiki/Hyphenation_algorithm
english.stackexchange.com

Is it proper to use the word "bandwidth" as it relates to time allotment?

1
1. TylerRick 20 Apr 2020
  
  in Public
  
  The question of whether or not it is "proper" is meaningless, unless you define the particular arbiter of manners who you want to defer to. There is no authority for the English language.
  
  authority English language natural langauges context

Tags

English

natural langauges

authority

language

context

Annotators

TylerRick

URL

english.stackexchange.com/questions/57935/is-it-proper-to-use-the-word-bandwidth-as-it-relates-to-time-allotment
kokociel.blogspot.com

Fashion advice from cartoon characters

1
1. TylerRick 01 Apr 2020
  
  in Public
  
  Just as with wine-tasting, having a bigger vocabulary for colours allows specific colours to be perceived more readily and remembered more easily, even if not done consciously.
  
  be precise natural languages language precision vocabulary

Tags

be precise

vocabulary

precision

language

natural languages

Annotators

TylerRick

URL

kokociel.blogspot.com/2016/01/fashion-advice-from-cartoon-characters.html
Mar 2020
developer.wordpress.org

How to Internationalize Your Plugin | Plugin Developer Handbook | WordPress Developer Resources

1
1. TylerRick 06 Mar 2020
  
  in Public
  
  the singular form of the string (note that it can be used for numbers other than one in some languages, so '%s item' should be used instead of 'One item')
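
Why '%s item' is safer than 'One item' can be seen in a short Python sketch. Russian is picked here purely as an illustrative language (it is not named in the quoted handbook text); the plural rule below is the standard gettext rule for Russian, and the function names and the `FORMS` list are assumptions made for the example.

```python
def russian_plural_index(n):
    """Return which of the three Russian plural forms applies to n."""
    if n % 10 == 1 and n % 100 != 11:
        return 0  # the "singular" form: 1, 21, 31, 101, ...
    if 2 <= n % 10 <= 4 and not (12 <= n % 100 <= 14):
        return 1  # 2-4, 22-24, ...
    return 2  # everything else, including 11-14


# Every form keeps the %s placeholder, including the "singular" one.
FORMS = ["%s элемент", "%s элемента", "%s элементов"]


def format_items(n):
    """Format a count of items using the plural form selected for n."""
    return FORMS[russian_plural_index(n)] % n


print(format_items(1))   # 1 элемент
print(format_items(21))  # 21 элемент
```

The count 21 selects the same form as 1, so a hard-coded translation of 'One item' in that slot would display the wrong number.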
  
  natural langauges language: grammar

Tags

language: grammar

natural langauges

Annotators

TylerRick

URL

developer.wordpress.org/plugins/internationalization/how-to-internationalize-your-plugin/
thepugautomatic.com

Rails i18n tips - The Pug Automatic

1
1. TylerRick 06 Mar 2020
  
  in Public
  
  This will of course depend on your perspective, but: beware Finnish and other highly inflected languages. As a grammar nerd, I actually love this stuff. But judging by my colleagues, you won’t.
  
  fun language: grammar natural langauges

Tags

language: grammar

natural langauges

fun

Annotators

TylerRick

URL

thepugautomatic.com//2012/07/rails-i18n-tips/