- Mar 2024
- Oct 2023
-
arxiv.org arxiv.org
-
"Causal Triplet: An Open Challenge for Intervention-centric Causal Representation Learning" Yuejiang Liu1, 2,* YUEJIANG.LIU@EPFL.CH Alexandre Alahi2 ALEXANDRE.ALAHI@EPFL.CH Chris Russell1 CMRUSS@AMAZON.DE Max Horn1 HORNMAX@AMAZON.DE Dominik Zietlow1 ZIETLD@AMAZON.DE Bernhard Sch Ìolkopf1, 3 BS@TUEBINGEN.MPG.DE Francesco Locatello1 LOCATELF@AMAZON.DE
-
- Oct 2022
-
repositorio.usp.br repositorio.usp.br
-
Free ions in kerosene-based ferrofluid detected by impedance spectroscopy (2021)
-
-
www.alice.cnptia.embrapa.br www.alice.cnptia.embrapa.br
-
Tempo de cultivo contínuo de cana-de-açúcar e influência nas características físicas e carbono orgânico de latossolos vermelhos distróficos em Guaíra/SP.
-
-
www.arca.fiocruz.br www.arca.fiocruz.br
-
An old drug and different ways to treat cutaneous leishmaniasis: Intralesional and intramuscular meglumine antimoniate in a reference center, Rio de Janeiro, Brazil.
-
- Jun 2022
-
data-feminism.mitpress.mit.edu data-feminism.mitpress.mit.edu
-
The major issue with much of the data that can be downloaded from web portals or through APIs is that they come without context or metadata. If you are lucky you might get a paragraph about where the data are from or a data dictionary that describes what each column in a particular spreadsheet means. But more often than not, you get something that looks like figure 6.3.
I think the reason data lack context is a reluctance to add an extra column describing the data, together with the inconsiderate and misleading view, held by many in technology, that data should be clean and concise.
I have run into insufficiently documented data many times, and I found it extremely inconvenient when I tried to attach downloaded online reports to my work experience to illustrate how I had grown the audience of a social media page on Facebook. I used to help run a Facebook page for a student organization. After finishing the role, I went to the "Insights" section of Facebook, hoping to download a report of the increases in Page Likes, Visits, and Interactions during the period when I was an admin of the page. It took several attempts and glitches to download the report (because it covered a year-long term). When the PDF file was ready to view, I was surprised: it did not mention the years I was working, the name of the student organization, or other categories that should have been highlighted. It should not be hard to include the years, because they were part of the filter I set to extract that portion of the report, or the name, because the page was the source of the data. This laziness in presenting data fit for analysis was frustrating, and I had to add extra analysis of my own. Even after I finished the "extra work", I started to question the validity of the report I had downloaded. Could it still be trusted, when without my clarification no analysis could be made, even by someone working in data science? Even if they could, it would take them a while to collect other external information before making sense of the data presented to them.
Understanding this ongoing problem, and being constantly bothered by it, gives me reason to call for a more thorough data translation and presentation process. More questions should be raised and answered about what a user might wonder when encountering a dataset.
-
- May 2022
-
www.gwern.net www.gwern.net
-
Such a highly non-linear problem would clearly benefit from the computational power of many layers. Unfortunately, back-propagation learning generally slows down by an order of magnitude every time a layer is added to a network.
The problem in 1988
-
- Apr 2022
-
www.abc.net.au www.abc.net.au
-
Charting the COVID-19 spread: How Australia is faring. (2020, March 16). ABC News. https://www.abc.net.au/news/2020-03-17/coronavirus-cases-data-reveals-how-covid-19-spreads-in-australia/12060704
-
- Nov 2021
-
arxiv.org arxiv.org
-
Just because a dataset is publicly available doesn't mean that you can use it to build commercial AI software.
-
- Jun 2021
-
www.medrxiv.org www.medrxiv.org
-
Karlinsky, A., & Kobak, D. (2021). The World Mortality Dataset: Tracking excess mortality across countries during the COVID-19 pandemic. MedRxiv, 2021.01.27.21250604. https://doi.org/10.1101/2021.01.27.21250604
-
- May 2021
-
moodle.southwestern.edu moodle.southwestern.edu
-
To investigate these hypotheses, I created an election-year-country dataset covering the period from the early 1990s to the present for all post-communist democracies. The dataset is structured as a quasi-time series of 93 parliamentary elections in 17 countries from 1991 to 2012, and the dependent variable is the natural log of the radical right party's combined vote share in elections held at time t.
this is the data, her explanation of the dataset she created
-
- Mar 2021
-
-
Karimi, Fariba, and Petter Holme. “A Temporal Network Version of Watts’s Cascade Model”. ArXiv:2103.13604 [Physics], 25 March 2021. http://arxiv.org/abs/2103.13604.
-
-
data.cdc.gov data.cdc.gov
-
Calgary, Open. “COVID-19 Case Surveillance Public Use Data with Geography | Data | Centers for Disease Control and Prevention”. Accessed 26 March 2021. https://data.cdc.gov/Case-Surveillance/COVID-19-Case-Surveillance-Public-Use-Data-with-Ge/n8mc-b4w4.
-
-
www.ncbi.nlm.nih.gov www.ncbi.nlm.nih.gov
-
14 of which were sampled at multiple timepoints
-
RNA sequencing on samples from 46 individuals with PCR-positive, symptomatic SARS-CoV-2 infection
-
77 peripheral blood samples across 46 subjects with COVID-19 and compared them to subjects with seasonal coronavirus, influenza, bacterial pneumonia, and healthy controls.
-
seasonal coronavirus (n=59)
-
divided based on disease severity and time from symptom onset
-
elucidate novel aspects of the host response to SARS-CoV-2
-
influenza (n=17)
-
bacterial pneumonia (n=20)
-
healthy controls (n=19)
-
-
www.ncbi.nlm.nih.gov www.ncbi.nlm.nih.gov
-
elucidate key pathways in the host transcriptome of patients infected with SARS-CoV-2, we used RNA sequencing (RNA Seq) to analyze nasopharyngeal (NP) swab and whole blood (WB) samples from 333 COVID-19 patients and controls, including patients with other viral and bacterial infections.
-
host response biosignature for COVID-19 from RNA profiling of nasal swabs and blood
-
-
-
Cheng, C., Barceló, J., Hartnett, A. S., Kubinec, R., & Messerschmidt, L. (2020). COVID-19 Government Response Event Dataset (CoronaNet v.1.0). Nature Human Behaviour, 1–13. https://doi.org/10.1038/s41562-020-0909-7
-
- Dec 2020
-
saveriomiroddi.github.io saveriomiroddi.github.io
-
Databases
If databases data is stored on a ZFS filesystem, it's better to create a separate dataset with several tweaks:
zfs create -o recordsize=8K -o primarycache=metadata -o logbias=throughput -o mountpoint=/path/to/db_data rpool/db_data
- recordsize: match the typical RDBMSs page size (8 KiB)
- primarycache: disable ZFS data caching, as RDBMSs have their own
- logbias: essentially, disabled log-based writes, relying on the RDBMSs' integrity measures (see detailed Oracle post)
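As an illustration only, the quoted command can be assembled from a table of properties, which makes it easier to see which tweak maps to which `-o` flag. This is a minimal sketch: the helper `zfs_create_command` and the `DB_PROPERTIES` dictionary are hypothetical names introduced here, the property values and the placeholder mountpoint are copied from the passage, and the script only prints the command rather than running it.

```python
import shlex


def zfs_create_command(dataset, properties):
    """Build the argument list for `zfs create`, one -o flag per property.

    Hypothetical helper for illustration; it does not execute anything.
    """
    argv = ["zfs", "create"]
    for name, value in properties.items():
        argv += ["-o", f"{name}={value}"]
    argv.append(dataset)
    return argv


# Values taken from the quoted passage; the mountpoint is its placeholder path.
DB_PROPERTIES = {
    "recordsize": "8K",          # match the typical RDBMS page size
    "primarycache": "metadata",  # leave data caching to the RDBMS
    "logbias": "throughput",     # rely on the RDBMS's own integrity measures
    "mountpoint": "/path/to/db_data",
}

command = zfs_create_command("rpool/db_data", DB_PROPERTIES)
print(shlex.join(command))
```

Keeping the properties in a dictionary means a different page size or mountpoint is a one-line change; whether 8K is the right record size depends on the database actually in use.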
-
- Oct 2020
-
ourworldindata.org ourworldindata.org
-
docs.google.com docs.google.com
-
publications clinical trials datasets
-
-
www.kaggle.com www.kaggle.com
-
github.com github.com
-
storymaps.arcgis.com storymaps.arcgis.com
-
nextstrain.org nextstrain.org
-
www.arcgis.com www.arcgis.com
-
441187 total confirmed cases 111933 recovered 19784 deaths
-
- Sep 2020
-
github.com github.com
-
I forgot to mention in the original issue way back that I have a lot of data. Like 1 to 3 MB that is being passed around via export let foo.
-
-
arxiv.org arxiv.org
-
Bavadekar, Shailesh, Andrew Dai, John Davis, Damien Desfontaines, Ilya Eckstein, Katie Everett, Alex Fabrikant, et al. “Google COVID-19 Search Trends Symptoms Dataset: Anonymization Process Description (Version 1.0)”. ArXiv:2009.01265 [Cs], 2 September 2020. http://arxiv.org/abs/2009.01265.
-
- Jul 2020
-
osf.io osf.io
-
Morgan, L., Protopopova, A., Birkler, R. I. D., Itin-Shwartz, B., Sutton, G. A., gamliel, alexandra, Yakobson, B., & Raz, T. (2020). Human-dog relationships during COVID-19 pandemic; booming dog adoption during social isolation [Preprint]. SocArXiv. https://doi.org/10.31235/osf.io/s9k4y
-
-
psyarxiv.com psyarxiv.com
-
Schelhorn, I., Ecker, A., Bereznai, J., Tran, T., Rehm, S., Lugo, R., Sütterlin, S., Kinateder, M., & Shiban, Y. (2020). Depression symptoms during the COVID-19 pandemic in different regions in Germany. [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/p9wz8
-
- Jun 2020
-
www.youtube.com www.youtube.com
-
EU Datathon 2020 – Webinar on COVID-19 and media and data monitoring. (2020, April 22). https://www.youtube.com/watch?v=wyNgmEfi_vk&feature=youtu.be
-
-
www.youtube.com www.youtube.com
-
EU Datathon 2020 – Webinar dedicated to COVID-19 data. (2020, April 9). https://www.youtube.com/watch?v=JIy6NO7QRQM&list=PLT5rARDev_rlAZ21iedz0ynnN4Na3UIoW&index=14&t=270s
-
-
eml.berkeley.edu eml.berkeley.edu
-
DellaVigna, S & Linos E. (2020). RCTs to scale: Comprehensive evidence from two nudge units. UC Berkeley. https://eml.berkeley.edu/~sdellavi/wp/NudgeToScale2020-03-20.pdf
-
-
psyarxiv.com psyarxiv.com
-
Yamada, Y., Čepulić, D.-B., Coll-Martín, T., Debove, S., Gautreau, G., Han, H., Rasmussen, J., Tran, T. P., Travaglino, G. A., & Lieberoth, A. (2020). COVIDiSTRESS Global Survey dataset on psychological and behavioural consequences of the COVID-19 outbreak [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/v7cep
-
- May 2020
-
docs.google.com docs.google.com
-
www.ukcdr.org.uk www.ukcdr.org.uk
-
UKCDR - COVID-19 Research Project Tracker
-
-
ai.googleblog.com ai.googleblog.com
-
Tsitsulin, A. & Perozzi B. Understanding the Shape of Large-Scale Data. (2020 May 05). Google AI Blog. http://ai.googleblog.com/2020/05/understanding-shape-of-large-scale-data.html
-
-
www.kaggle.com www.kaggle.com
-
COVID-19 Open Research Dataset Challenge (CORD-19). (n.d.). Retrieved May 6, 2020, from https://kaggle.com/allen-institute-for-ai/CORD-19-research-challenge
-
-
leoferres.info leoferres.info
-
Ferres, L. (2020 April 10). COVID19 mobility reports. Leo's Blog. https://leoferres.info/blog/2020/04/10/covid19-mobility-reports/
-
-
coviz.apps.allenai.org coviz.apps.allenai.org
-
About. (n.d.). Retrieved May 6, 2020, from https://coviz.apps.allenai.org/
-
-
epjdatascience.springeropen.com epjdatascience.springeropen.com
-
Vilella, S., Paolotti, D., Ruffo, G. et al. News and the city: understanding online press consumption patterns through mobile data. EPJ Data Sci. 9, 10 (2020). https://doi.org/10.1140/epjds/s13688-020-00228-9
-
- Apr 2020
-
rajpurkar.github.io rajpurkar.github.io
-
-
-
Killeen, B.D., et al. (2020, April 1). A county-level dataset for informing the United States' response to COVID-19. Cornell University. arXiv:2004.00756.
-
-
www.ofcom.org.uk www.ofcom.org.uk
-
Ofcom. (2020 April 09). Covid-19 news and information: consumption and attitudes. https://www.ofcom.org.uk/research-and-data/tv-radio-and-on-demand/news-media/coronavirus-news-consumption-attitudes-behaviour
Tags
- access
- consumption
- interactive
- BARB
- lang:en
- comScore
- is:webpage
- survey
- COVID-19
- misinformation
- information
- news
- attitude
- dataset
- response
-
-
www.pnas.org www.pnas.org
-
Salganik, M. J., Lundberg, I., Kindel, A. T., Ahearn, C. E., Al-Ghoneim, K., Almaatouq, A., Altschul, D. M., Brand, J. E., Carnegie, N. B., Compton, R. J., Datta, D., Davidson, T., Filippova, A., Gilroy, C., Goode, B. J., Jahani, E., Kashyap, R., Kirchner, A., McKay, S., … McLanahan, S. (2020). Measuring the predictability of life outcomes with a scientific mass collaboration. Proceedings of the National Academy of Sciences. https://doi.org/10.1073/pnas.1915006117
-
-
trello.com trello.com
-
Collective Intelligence and COVID-19 | Trello. (n.d.). Retrieved April 20, 2020, from https://trello.com/b/STdgEhvX/collective-intelligence-and-covid-19
-
-
arxiv.org arxiv.org
-
Alam, F., Sajjad, H., Imran, M., & Ofli, F. (2020). Standardizing and Benchmarking Crisis-related Social Media Datasets for Humanitarian Information Processing. ArXiv:2004.06774 [Cs]. http://arxiv.org/abs/2004.06774
-
-
github.com github.com
-
experience.arcgis.com experience.arcgis.com
- Mar 2020
-
Local file Local file
-
All datasets were supplied by Sutherland in the Supporting Information as 3D geometries aligned according to the original literature, namely by flexible alignment on one or more templates obtained by crystallographic enzyme-inhibitor complexes
-
eight comprehensive datasets
What do the datasets look like? This may help to understand the application domain of this tool.
-
-
ourworldindata.org ourworldindata.org
-
favorito,data_science
-
-
multimedia.scmp.com multimedia.scmp.com
-
unidad_COVID2019,favorita
-
-
www.visualcapitalist.com www.visualcapitalist.com
-
favorito,hermoso
-
-
coronavirus.thebaselab.com coronavirus.thebaselab.com
-
-
www.apprise.org.au www.apprise.org.au
-
www.gov.uk www.gov.uk
-
github.com github.com
-
unidad_COVID2019
-
-
coronavirus.jhu.edu coronavirus.jhu.edu
-
unidad_COVID2019
-
-
www.worldometers.info www.worldometers.info
-
bnonews.com bnonews.com
-
linea_tiempo
-
-
covid2019.app covid2019.app
-
acceso_abierto
-
-
www.consulta.mx www.consulta.mx
-
unidad_COVID2019,encuesta
-
-
coronavirus-disasterresponse.hub.arcgis.com coronavirus-disasterresponse.hub.arcgis.com
-
unidad_COVID2019,imprescindible
-
-
www.kff.org www.kff.org
- Feb 2019
-
iphysresearch.github.io iphysresearch.github.io
-
Impact of Fully Connected Layers on Performance of Convolutional Neural Networks for Image Classification
The authors conclude: (1) the fewer layers a CNN has, the more nodes its FC layers need, whereas a deeper CNN gets by with fewer FC nodes; (2) besides needing more FC nodes, a shallow CNN should have more FC layers the more classes the dataset has, and vice versa; (3) for datasets with more samples per class, deeper networks do better, but when the number of classes is large, shallow networks perform better.
-
Do we train on test data? Purging CIFAR of near-duplicates
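The authors played around with the CIFAR test set. They argue that some test samples are too close to training samples, which raises an overfitting concern, so they replaced the suspect samples and proposed a new test set. After rerunning the well-known models on it, they were relieved to report that the models' rankings do not seem to have been misjudged because of overfitting~ (feels a bit like a slap in the face~)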
-
Semantic Redundancies in Image-Classification Datasets: The 10% You Don't Need
A "feature engineering" technique, deep-neural-network edition~ [doge]
-
Deep Learning on Small Datasets without Pre-Training using Cosine Loss
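In contemporary deep learning, two things seem beyond dispute:
- Categorical cross-entropy loss after softmax activation is the method of choice for classification;
- Training a CNN classifier from scratch on a small dataset does not work well. In this paper the authors show that, when there are only a few samples per class, the cosine loss function gives better performance than cross-entropy.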
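To make the comparison concrete, here is a minimal sketch of the two losses for a single sample, assuming the cosine loss is taken as one minus the cosine similarity between the network outputs and the one-hot target vector. The function names and the example outputs are hypothetical, and plain Python lists stand in for tensors.

```python
import math


def cosine_loss(outputs, class_index):
    """1 - cosine similarity between the outputs and the one-hot target."""
    norm = math.sqrt(sum(v * v for v in outputs))
    if norm == 0.0:
        raise ValueError("outputs must not be the zero vector")
    # The one-hot target has unit length, so the dot product with it is
    # simply the output at the true class.
    return 1.0 - outputs[class_index] / norm


def softmax_cross_entropy(outputs, class_index):
    """Categorical cross-entropy after softmax, shown for comparison."""
    peak = max(outputs)  # subtract the maximum for numerical stability
    log_sum = peak + math.log(sum(math.exp(v - peak) for v in outputs))
    return log_sum - outputs[class_index]


# Hypothetical raw outputs for a three-class problem; the true class is index 2.
outputs = [3.0, 0.0, 4.0]
print(cosine_loss(outputs, 2))
print(softmax_cross_entropy(outputs, 2))
```

The cosine loss depends only on the direction of the output vector and stays within [0, 2], whereas cross-entropy keeps shrinking as the correct logit grows.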
-
-
towardsdatascience.com towardsdatascience.com
-
Top Sources For Machine Learning Datasets
-
- Jan 2019
-
iphysresearch.github.io iphysresearch.github.io
-
Fitting A Mixture Distribution to Data: Tutorial
A very endearing tutorial!
-
Optimization Models for Machine Learning: A Survey
For me, the only truly valuable part of this paper is probably the summary of Dataset tables in the appendix at the end...
-
- Dec 2018
-
iphysresearch.github.io iphysresearch.github.io
-
Are All Training Examples Created Equal? An Empirical Study
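From this paper I learned about an interesting concept called active learning, which seems close to the continuously refreshed training-data sample pool I designed myself...
The paper's main contribution is a study of training-sample importance in image classification, measuring the importance of each sample with a gradient-based method. Its conclusions suggest that active learning may not always be effective in deep learning.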
-
Image Score: How to Select Useful Samples
The semi-supervised learning concept it proposes is fairly interesting. Scoring every sample in a dataset may help a little with interpretability...
-
- Nov 2018
-
iphysresearch.github.io iphysresearch.github.io
-
Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift
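The paper's experiments explore how models behave after shifts (a kind of controlled perturbation) are applied to a dataset, and it proposes a classifier-based method/pipeline to observe and evaluate them.
For my gravitational-wave data research, I can borrow its data-shift methods and its evaluation mechanism (two-sample tests).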
-
Training neural audio classifiers with few data
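This is a fairly preliminary, simple experiment.
Many of the conclusions are not surprising: more data naturally gives better performance; transfer learning performs well with very little data; Prototypical models may show some advantage owing to the peculiarities of their structure; and the smaller the dataset, the more severe the overfitting...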
-
- Sep 2016
-
www.ukbiobank.ac.uk www.ukbiobank.ac.uk
-
UK Biobank
Large UK dataset containing extensive phenotypic, genotypic, and neuroimaging data.
- License: Unclear, but restrictive
- Access: Human, ?
- Needs data use agreement: Yes
- Needs institutional signature for access: No (?)
-
-
openfmri.org openfmri.org
-
View Data Sets
Public fMRI dataset repository.
- License: PDDL v.1.0
- Access: Human, s3
- Needs data use agreement: No
- Needs institutional signature for access: No
-
-
dataverse.harvard.edu dataverse.harvard.edu
-
Brain Genomics Superstruct Project (GSP)
- License: Data use agreement
- Access: Human, API
- Needs data use agreement: Yes
- Needs institutional signature for access: No
-
-
studyforrest.org studyforrest.org
-
What is studyforrest?
Rich multimodal dataset on naturalistic stimuli
- License: PDDL v.1.0
- Access: Human, rsync, git annex
- Needs data use agreement: No
- Needs institutional signature for access: No
-
-
myconnectome.org myconnectome.org
-
- License: PDDL v.1.0
- Access: Human, s3, openfmri
- Needs data use agreement: No
- Needs institutional signature for access: No
-
- May 2016
-
www.jstage.jst.go.jp www.jstage.jst.go.jp
-
Bird song data set
-
- Aug 2015
-
europepmc.org europepmc.org
-
the definition of a “dataset,”
this is interesting, and will be interesting to track within and across disciplines
-