105 Matching Annotations
  1. Mar 2024
  2. Oct 2023
    1. "Causal Triplet: An Open Challenge for Intervention-centric Causal Representation Learning" Yuejiang Liu1,2,* YUEJIANG.LIU@EPFL.CH Alexandre Alahi2 ALEXANDRE.ALAHI@EPFL.CH Chris Russell1 CMRUSS@AMAZON.DE Max Horn1 HORNMAX@AMAZON.DE Dominik Zietlow1 ZIETLD@AMAZON.DE Bernhard Schölkopf1,3 BS@TUEBINGEN.MPG.DE Francesco Locatello1 LOCATELF@AMAZON.DE

  3. Oct 2022
    1. An old drug and different ways to treat cutaneous leishmaniasis: Intralesional and intramuscular meglumine antimoniate in a reference center, Rio de Janeiro, Brazil.
  4. Jun 2022
    1. The major issue with much of the data that can be downloaded from web portals or through APIs is that they come without context or metadata. If you are lucky you might get a paragraph about where the data are from or a data dictionary that describes what each column in a particular spreadsheet means. But more often than not, you get something that looks like figure 6.3.

      I think data lack context for two reasons: a reluctance to add an extra column describing the data, and the inconsiderate, misleading view held by many in technology that data should be clean and concise.

      I have run into insufficient context in data many times, and it was especially inconvenient when I tried to attach downloaded online reports to my work experience to illustrate how effectively I had grown the audience on a social media platform (Facebook). I used to help run a Facebook page for a student organization. After finishing the role, I went to the "Insights" section of Facebook, hoping to download a report of the increases in Page Likes, Visits, and Interactions during the period when I was an admin of the page. It took several attempts and glitches to download the report (because it covered a year-long term). When the PDF was ready, I was surprised: it did not mention the years I was working, the name of the student organization, or other categorizations that should have been highlighted. It would not have been hard to include them: the years were part of the filter I set when extracting that portion of the report, and the organization's name belongs to the page the data came from. This laziness in presenting data fit for analysis was frustrating, and I had to add my own annotations to the report. Even after I finished the "extra work", I started to question the validity of the report I had downloaded. Could it still be trusted? Without my clarification, no analysis could be made, even by someone in the data science field. Even if they could, it would take them a while to gather external information before making sense of the data presented to them.

      Understanding this ongoing problem, and being constantly bothered by it, gives me justification to call for a more thorough data translation and presentation process. More questions should be raised and answered about what a user might wonder when encountering a dataset.

  5. May 2022
    1. Such a highly non-linear problem would clearly benefit from the computational power of many layers. Unfortunately, back-propagation learning generally slows down by an order of magnitude every time a layer is added to a network.

      The problem in 1988

  6. Apr 2022
  7. Nov 2021
  8. Jun 2021
  9. May 2021
    1. To investigate these hypotheses, I created an election-year-country dataset covering the period from the early 1990s to the present for all post-communist democracies. The dataset is structured as a quasi-time series of 93 parliamentary elections in 17 countries from 1991 to 2012, and the dependent variable is the natural log of the radical right party’s combined vote share in elections held at time t.

      This is the data: her explanation of the dataset she created.

  10. Mar 2021
    1. 14 of which were sampled at multiple timepoints
    2. RNA sequencing on samples from 46 individuals with PCR-positive, symptomatic SARS-CoV-2 infection
    3. 77 peripheral blood samples across 46 subjects with COVID-19 and compared them to subjects with seasonal coronavirus, influenza, bacterial pneumonia, and healthy controls.
    4. seasonal coronavirus (n=59)
    5. divided based on disease severity and time from symptom onset
    6. elucidate novel aspects of the host response to SARS-CoV-2
    7. influenza (n=17)
    8. bacterial pneumonia (n=20)
    9. healthy controls (n=19)
    1. elucidate key pathways in the host transcriptome of patients infected with SARS-CoV-2, we used RNA sequencing (RNA Seq) to analyze nasopharyngeal (NP) swab and whole blood (WB) samples from 333 COVID-19 patients and controls, including patients with other viral and bacterial infections.
    2. host response biosignature for COVID-19 from RNA profiling of nasal swabs and blood
  11. Dec 2020
    1. Databases

      If database data is stored on a ZFS filesystem, it’s better to create a separate dataset with several tweaks:

      zfs create -o recordsize=8K -o primarycache=metadata -o logbias=throughput -o mountpoint=/path/to/db_data rpool/db_data

      • recordsize: match the typical RDBMS page size (8 KiB)
      • primarycache: disable ZFS data caching, as RDBMSs have their own
      • logbias: essentially, disables log-based writes, relying on the RDBMSs’ integrity measures (see detailed Oracle post)
  12. Oct 2020
  13. Sep 2020
    1. Bavadekar, Shailesh, Andrew Dai, John Davis, Damien Desfontaines, Ilya Eckstein, Katie Everett, Alex Fabrikant, et al. ‘Google COVID-19 Search Trends Symptoms Dataset: Anonymization Process Description (Version 1.0)’. ArXiv:2009.01265 [Cs], 2 September 2020. http://arxiv.org/abs/2009.01265.

  14. Jul 2020
  15. Jun 2020
  16. May 2020
  17. Apr 2020
    1. Salganik, M. J., Lundberg, I., Kindel, A. T., Ahearn, C. E., Al-Ghoneim, K., Almaatouq, A., Altschul, D. M., Brand, J. E., Carnegie, N. B., Compton, R. J., Datta, D., Davidson, T., Filippova, A., Gilroy, C., Goode, B. J., Jahani, E., Kashyap, R., Kirchner, A., McKay, S., … McLanahan, S. (2020). Measuring the predictability of life outcomes with a scientific mass collaboration. Proceedings of the National Academy of Sciences. https://doi.org/10.1073/pnas.1915006117

  18. Mar 2020
    1. All datasets were supplied by Sutherland in the Supporting Information as 3D geometries aligned according to the original literature, namely by flexible alignment on one or more templates obtained by crystallographic enzyme-inhibitor complexes
    2. eight comprehensive datasets

      What do the datasets look like? This may help to understand the application domain of this tool.


  19. Feb 2019
    1. Impact of Fully Connected Layers on Performance of Convolutional Neural Networks for Image Classification

      The authors' summary: (1) the fewer layers a CNN has, the more nodes the FC layers need; conversely, for a deeper CNN, fewer FC nodes are enough; (2) besides a shallow CNN needing more FC nodes, the more classes a dataset has, the more FC layers the better, and vice versa; (3) for datasets with more samples per class, deeper networks do better, but if the number of classes is large, shallow networks perform better.

    2. Do we train on test data? Purging CIFAR of near-duplicates

      The authors played around with the CIFAR test set. They argue that some test samples are so close to training samples that they raise an overfitting concern, so they replaced the suspect samples and proposed a new test set. After re-running the well-known models on it, they were relieved to report that those models do not seem to have overfit in a way that would have distorted model comparisons~ It feels a bit like a slap in their own face~

    3. Semantic Redundancies in Image-Classification Datasets: The 10% You Don't Need

      A deep-neural-network version of the "feature engineering" technique~ [doge]

    4. Deep Learning on Small Datasets without Pre-Training using Cosine Loss

      In contemporary deep learning, two things seem to be indisputable:

      1. Categorical cross-entropy loss after softmax activation is the method of choice for classification;
      2. Training a CNN classifier from scratch on small datasets does not work well. In this paper, the authors show that the cosine loss function can provide better performance than cross-entropy when each class has only a few samples.
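
The contrast between the two losses in the note above can be made concrete with a small sketch. This is an illustration only, not the paper's code: the function names and the example vectors are assumptions, and the cosine loss is written as one minus the cosine similarity between the network output and the one-hot target. One visible difference is that the cosine loss ignores the scale of the output vector, while softmax cross-entropy does not.

```python
import math


def softmax_cross_entropy(logits, label):
    """Categorical cross-entropy after softmax, for a single sample."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def cosine_loss(outputs, label):
    """1 - cosine similarity between the outputs and the one-hot target."""
    norm = math.sqrt(sum(z * z for z in outputs))
    if norm == 0.0:
        return 1.0  # convention chosen here: a zero vector has similarity 0
    # The one-hot target has unit norm, so the dot product is outputs[label].
    return 1.0 - outputs[label] / norm


# Hypothetical network outputs for a 3-class problem; the true class is 0.
logits = [2.0, 0.5, -1.0]
scaled = [10.0 * z for z in logits]

print("cross-entropy:", softmax_cross_entropy(logits, 0), softmax_cross_entropy(scaled, 0))
print("cosine loss:  ", cosine_loss(logits, 0), cosine_loss(scaled, 0))
```

Scaling the outputs by 10 changes the cross-entropy value but leaves the cosine loss unchanged, since only the direction of the output vector matters to it.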
  20. Jan 2019
    1. Fitting A Mixture Distribution to Data: Tutorial

      At a glance, this looks like a really lovely tutorial!

    2. Optimization Models for Machine Learning: A Survey

      I feel that the only part of this paper that is truly valuable to me is probably the summary of dataset tables in the appendix at the end...

  21. Dec 2018
    1. Are All Training Examples Created Equal? An Empirical Study

      From this paper I learned about an interesting concept called active learning, which seems very close to the continuous-parameter training-data sampling pool I designed myself...

      The main contribution of this paper is a study of the importance of training samples in image classification, with sample importance measured by a gradient-based method. Its conclusions may suggest that active learning is not always effective in deep learning.

    2. Image Score: How to Select Useful Samples

      The semi-supervised learning idea proposed here is fairly interesting. Scoring every sample in a dataset might help interpretability a little...

  22. Nov 2018
    1. Failing Loudly: An Empirical Study of Methods for Detecting Dataset Shift

      The experiments in this paper explore model behavior after applying shifts (a kind of controlled perturbation) to a dataset, and propose a classifier-based method/pipeline to observe and evaluate them:

      For my gravitational-wave data research, I can borrow its data shift methods and its evaluation mechanism (two-sample tests).
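
As a rough illustration of the two-sample tests mentioned in the note above, the sketch below compares a "source" sample with a "shifted" sample using a Kolmogorov-Smirnov statistic and a permutation p-value. It is not the paper's pipeline: the one-dimensional synthetic Gaussian data, the function names, and the permutation scheme are all assumptions chosen to keep the example self-contained.

```python
import random


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    i = j = 0
    d = 0.0
    while i < len(a) and j < len(b):
        x = min(a[i], b[j])
        # Advance both samples past x, then compare the two empirical CDFs at x.
        while i < len(a) and a[i] <= x:
            i += 1
        while j < len(b) and b[j] <= x:
            j += 1
        d = max(d, abs(i / len(a) - j / len(b)))
    return d


def permutation_p_value(a, b, n_permutations=500, seed=0):
    """Share of random relabelings whose KS statistic reaches the observed one."""
    rng = random.Random(seed)
    observed = ks_statistic(a, b)
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        if ks_statistic(pooled[:len(a)], pooled[len(a):]) >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)


# Synthetic stand-ins for a single feature before and after a shift.
rng = random.Random(42)
source = [rng.gauss(0.0, 1.0) for _ in range(200)]
shifted = [rng.gauss(1.5, 1.0) for _ in range(200)]

print("KS statistic:", ks_statistic(source, shifted))
print("p-value:", permutation_p_value(source, shifted))
```

A small p-value suggests the two samples were not drawn from the same distribution. For high-dimensional data, a test like this would typically be applied to a reduced representation rather than to the raw inputs.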

    2. Training neural audio classifiers with few data

      This is a fairly preliminary, simple experiment.

      The conclusions shown in the figures are not really surprising: more data naturally gives better performance; transfer learning performs well with very little data; prototypical models may show some advantage because of the specifics of their structure; and the less data there is, the more severe the overfitting...

  23. Sep 2016
    1. UK Biobank

      Large UK dataset containing extensive phenotypic, genotypic, and neuroimaging data.

      • License: Unclear, but restrictive
      • Access: Human, ?
      • Needs data use agreement: Yes
      • Needs institutional signature for access: No (?)

    1. View Data Sets

      Public fMRI dataset repository.

      • License: PDDL v.1.0
      • Access: Human, s3
      • Needs data use agreement: No
      • Needs institutional signature for access: No
    1. Brain Genomics Superstruct Project (GSP)

      • License: Data use agreement
      • Access: Human, API
      • Needs data use agreement: Yes
      • Needs institutional signature for access: No

    1. What is studyforrest?

      Rich multimodal dataset on naturalistic stimuli

      • License: PDDL v.1.0
      • Access: Human, rsync, git annex
      • Needs data use agreement: No
      • Needs institutional signature for access: No
      • License: PDDL v.1.0
      • Access: Human, s3, openfmri
      • Needs data use agreement: No
      • Needs institutional signature for access: No
  24. May 2016
  25. Aug 2015
    1. the definition of a “dataset,”

      this is interesting, and will be interesting to track within and across disciplines