- Dec 2020
-
www.sciencedirect.com www.sciencedirect.com
-
The theme and motivation of this paper is that Auto-ML can streamline the building of machine learning models by non-experts in healthcare. What is not clear is why they would need to build them. A more prevalent question is how to streamline the use of algorithms based on machine/deep learning in healthcare settings.
-
This is a good manuscript to assign as a reading for the Machine Learning for Medical Applications course.
-
Overall, this is a good overview of what is out there vis-a-vis Auto-ML, associated concepts, and variations.
-
-
sites.research.google sites.research.google
-
machinelearningmastery.com machinelearningmastery.com
-
Central to the approach is defining a large hierarchical optimization problem that involves identifying data transforms and the machine learning models themselves, in addition to the hyperparameters for the models.
Essentially, the Python Auto-ML library does the legwork of building the model and optimizing its macro structure.
-
-
psyarxiv.com psyarxiv.com
-
Rocca, R., & Yarkoni, T. (2020). Putting psychology to the test: Rethinking model evaluation through benchmarking and prediction. PsyArXiv. https://doi.org/10.31234/osf.io/e437b
-
- Nov 2020
-
www.youtube.com www.youtube.com
-
Nye Warburton 47:36
-
Moises Sanabria 1:19
-
- Oct 2020
-
-
f′(2)
Did he mean to write f'(3) here?
-
-
danmackinlay.name danmackinlay.name
-
A statistician is the exact same thing as a data scientist or machine learning researcher with the differences that there are qualifications needed to be a statistician, and that we are snarkier.
-
-
en.wikipedia.org en.wikipedia.org
-
numerically evaluate the derivative of a function specified by a computer program
I understand what they're saying, but one should be careful here not to confuse this with numerical differentiation à la finite differences.
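To make the distinction in the note above concrete, here is a minimal sketch contrasting forward-mode automatic differentiation (via dual numbers) with a central finite difference. The `Dual` class, the test function `f`, and the step size `h` are illustrative assumptions, not anything taken from the quoted article.

```python
from dataclasses import dataclass
import math


@dataclass
class Dual:
    """A dual number val + dot*eps with eps**2 == 0; `dot` carries the derivative."""
    val: float
    dot: float = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        return Dual(self.val + other.val, self.dot + other.dot)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # Product rule applied to the derivative part.
        return Dual(self.val * other.val, self.dot * other.val + self.val * other.dot)

    __rmul__ = __mul__


def sin(x):
    # Chain rule for one primitive; plain floats fall through to math.sin.
    if isinstance(x, Dual):
        return Dual(math.sin(x.val), math.cos(x.val) * x.dot)
    return math.sin(x)


def f(x):
    # The "computer program" being differentiated: f(x) = x^3 + 2*sin(x).
    return x * x * x + 2 * sin(x)


def autodiff(func, x):
    # Seed the derivative part with 1.0 and read it back after running the program.
    return func(Dual(x, 1.0)).dot


def finite_difference(func, x, h=1e-5):
    # Central difference: an approximation whose error depends on the step size h.
    return (func(x + h) - func(x - h)) / (2 * h)


exact = 3 * 2.0**2 + 2 * math.cos(2.0)
ad = autodiff(f, 2.0)
fd = finite_difference(f, 2.0)
```

The automatic-differentiation result follows the derivative rules operation by operation, so it matches the analytic value up to floating-point rounding, whereas the finite-difference result carries a truncation error that depends on `h`.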
-
-
bartwronski.com bartwronski.com
-
no gradient tapes, no graph definitions required
Note to self: look up what this means
-
- Sep 2020
-
iriss.stanford.edu iriss.stanford.edu
-
2020 Conference on Computational Sociology | IRiSS. (n.d.). Retrieved 30 September 2020, from https://iriss.stanford.edu/css/conferences/2020-conference-computational-sociology
-
-
www.thelancet.com www.thelancet.com
-
Wilkinson, Jack, Kellyn F. Arnold, Eleanor J. Murray, Maarten van Smeden, Kareem Carr, Rachel Sippy, Marc de Kamps, et al. ‘Time to Reality Check the Promises of Machine Learning-Powered Precision Medicine’. The Lancet Digital Health 0, no. 0 (16 September 2020). https://doi.org/10.1016/S2589-7500(20)30200-4.
Tags
- algorithmic complexity
- prediction of individual responses
- collaboration
- lang:en
- machine learning
- personalised medical approach
- clinical science
- machine learning powered precision medicine
- revolution
- electronic health database
- clinical practice
- challenges
- is:report
- improved diagnosis
-
-
robjhyndman.com robjhyndman.com
-
cross-validation is sometimes not valid for time series models
What? Why? Does he mean k-fold specifically?
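One common reading of the quoted caution (an assumption here, since the quoted sentence does not say which scheme is meant) is that plain k-fold lets a model train on observations that come after the ones it is tested on. The sketch below compares plain k-fold with a rolling-origin split; the function names and the sizes `12`, `4`, and `8` are made up for illustration.

```python
def kfold_splits(n, k):
    """Plain k-fold over indices 0..n-1 (no shuffling): each fold is the test set once."""
    fold = n // k
    for i in range(k):
        test = list(range(i * fold, (i + 1) * fold))
        train = [j for j in range(n) if j not in test]
        yield train, test


def rolling_origin_splits(n, initial, horizon=1):
    """Train on observations [0, t), test on the next `horizon` observations."""
    for t in range(initial, n - horizon + 1):
        yield list(range(t)), list(range(t, t + horizon))


def trains_on_future(train, test):
    # True when some training index comes after the earliest test index.
    return max(train) > min(test)


kfold_leaks = [trains_on_future(tr, te) for tr, te in kfold_splits(12, 4)]
rolling_leaks = [trains_on_future(tr, te) for tr, te in rolling_origin_splits(12, initial=8)]
```

With these sizes, three of the four k-fold splits train on data from the future of their test fold, while none of the rolling-origin splits do.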
-
-
fermatslibrary.com fermatslibrary.com
-
For example, the one-pass (hardware) translator generated a symbol table and reverse Polish code as in conventional software interpretive languages. The translator hardware (compiler) operated at disk transfer speeds and was so fast there was no need to keep and store object code, since it could be quickly regenerated on-the-fly. The hardware-implemented job controller performed conventional operating system functions. The memory controller provided
A hardware-assisted compiler is a fantastic idea. TPUs from Google are essentially this. They're hardware assistance for matrix multiplication operations for machine learning workloads created by tools like TensorFlow.
-
-
psyarxiv.com psyarxiv.com
-
Yang, Scott Cheng-Hsin, Chirag Rank, Jake Alden Whritner, Olfa Nasraoui, and Patrick Shafto. ‘Unifying Recommendation and Active Learning for Information Filtering and Recommender Systems’. Preprint. PsyArXiv, 25 August 2020. https://doi.org/10.31234/osf.io/jqa83.
Tags
- computer science
- exploration-exploitation tradeoff
- is:preprint
- AI
- Internet
- algorithms
- information filtering
- recommendation accuracy
- cognitive science
- lang:en
- machine learning
- predictive accuracy
- parameterized model
- experimental approach
- active learning
- artificial intelligence
- recommender system
-
- Aug 2020
-
psyarxiv.com psyarxiv.com
-
Hull, T., Levine, J., Bantilan, N., Desai, A., & Majumder, M. S. (2020, August 13). Digital phenotyping of complex psychological responses to the COVID-19 pandemic. https://doi.org/10.31234/osf.io/qtrpf
-
-
www.nber.org www.nber.org
-
Augenblick, N., Kolstad, J. T., Obermeyer, Z., & Wang, A. (2020). Group Testing in a Pandemic: The Role of Frequent Testing, Correlated Risk, and Machine Learning (Working Paper No. 27457; Working Paper Series). National Bureau of Economic Research. https://doi.org/10.3386/w27457
-
-
covid-19.iza.org covid-19.iza.org
-
Work That Can Be Done from Home: Evidence on Variation within and across Occupations and Industries. COVID-19 and the Labor Market. (n.d.). IZA – Institute of Labor Economics. Retrieved August 4, 2020, from https://covid-19.iza.org/publications/dp13374/
-
-
jasp-stats.org jasp-stats.org
-
Introducing JASP 0.11: The Machine Learning Module. (2019, September 24). JASP - Free and User-Friendly Statistical Software. https://jasp-stats.org/2019/09/24/introducing-jasp-0-11-the-machine-learning-module/
-
- Jul 2020
-
smile.amazon.com smile.amazon.com
-
github.com github.com
-
Determine whether the person using my computer is me by training an ML model on data about how I use my computer. This is a project for the Intrusion Detection Systems course at Columbia University.
-
-
www.youtube.com www.youtube.com
-
Virtual MLSS 2020 (Opening Remarks). (2020, June 29). https://www.youtube.com/watch?v=8staJlMbAig
-
-
www.cs.cornell.edu www.cs.cornell.edu
-
Our membership inference attack exploits the observation that machine learning models often behave differently on the data that they were trained on versus the data that they “see” for the first time.
How well would this work on some of the more recent zero-shot models?
-
-
veekaybee.github.io veekaybee.github.io
-
data leakage (data from outside of your test set making it back into your test set and biasing the results)
This sounds like the inverse of “snooping”, where information about the test data is inadvertently built into the model.
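A small sketch of the "snooping" direction mentioned in the note above: preprocessing statistics computed over all rows let the held-out point shape the training features. The data values and the choice of standardization are assumptions made purely for illustration.

```python
from statistics import mean, pstdev

values = [1.0, 2.0, 3.0, 4.0, 100.0]  # made-up data; the last point is held out as the test row
train, test = values[:4], values[4:]


def standardize(xs, mu, sigma):
    return [(x - mu) / sigma for x in xs]


# Leaky: statistics computed on train + test, so the test point shapes the training features.
leaky_train = standardize(train, mean(values), pstdev(values))

# Clean: statistics computed on the training rows only, then reused for the test row.
mu, sigma = mean(train), pstdev(train)
clean_train = standardize(train, mu, sigma)
clean_test = standardize(test, mu, sigma)
```

The two versions of the training features differ only because the leaky one has already "seen" the test observation.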
-
-
-
Shah, C., Dehmamy, N., Perra, N., Chinazzi, M., Barabási, A.-L., Vespignani, A., & Yu, R. (2020). Finding Patient Zero: Learning Contagion Source with Graph Neural Networks. ArXiv:2006.11913 [Cs]. http://arxiv.org/abs/2006.11913
-
- Jun 2020
-
-
Ben-David, S. (2018). Clustering—What Both Theoreticians and Practitioners are Doing Wrong. ArXiv:1805.08838 [Cs, Stat]. http://arxiv.org/abs/1805.08838
-
-
arxiv.org arxiv.org
-
Velásquez, N., Leahy, R., Restrepo, N. J., Lupu, Y., Sear, R., Gabriel, N., Jha, O., Goldberg, B., & Johnson, N. F. (2020). Hate multiverse spreads malicious COVID-19 content online beyond individual platform control. ArXiv:2004.00673 [Nlin, Physics:Physics]. http://arxiv.org/abs/2004.00673
-
-
psyarxiv.com psyarxiv.com
-
Rahman, M. (2020, June 1). COVID-19 Public Sentiment Insights and Machine Learning for Tweets Classification. https://doi.org/10.31234/osf.io/sw2dn
-
- May 2020
-
-
Lanovaz, M., & Turgeon, S. (2020). Tutorial: Applying Machine Learning in Behavioral Research [Preprint]. PsyArXiv. https://doi.org/10.31234/osf.io/9w6a3
-
-
www.preprints.org www.preprints.org
-
Samuel, J.; Ali, G.G.M.N.; Rahman, M.M.; Esawi, E.; Samuel, Y. COVID-19 Public Sentiment Insights and Machine Learning for Tweets Classification. Preprints 2020, 2020050015 (doi: 10.20944/preprints202005.0015.v1)
-
-
psyarxiv.com psyarxiv.com
-
Donnellan, E., Sumeyye, Fastrich, G. M., & Murayama, K. (2020). How are Curiosity and Interest Different? Naïve Bayes Classification of People’s Naïve Belief. https://doi.org/10.31234/osf.io/697gk
-
-
www.deeplearningbook.org www.deeplearningbook.org
-
the network typically learns to use h(t) as a kind of lossy summary of the task-relevant aspects of the past sequence of inputs up to t
The hidden state h(t) is a high-level representation of whatever happened until time step t.
-
Parameter sharing makes it possible to extend and apply the model to examples of different forms (different lengths, here) and generalize across them. If we had separate parameters for each value of the time index, we could not generalize to sequence lengths not seen during training, nor share statistical strength across different sequence lengths and across different positions in time. Such sharing is particularly important when a specific piece of information can occur at multiple positions within the sequence.
RNNs use the same parameters at every time step. This lets the model generalize the inferred "meaning", even when it is inferred at different steps.
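A minimal sketch of the parameter sharing described above, assuming a scalar Elman-style recurrence. The parameter values are illustrative and untrained; `rnn_final_state` is a hypothetical helper, not code from the book.

```python
import math


def rnn_final_state(inputs, w_x, w_h, b, h0=0.0):
    """Scalar recurrence h(t) = tanh(w_x * x(t) + w_h * h(t-1) + b)."""
    h = h0
    for x in inputs:
        # The same three parameters are reused at every time step.
        h = math.tanh(w_x * x + w_h * h + b)
    return h


params = dict(w_x=0.5, w_h=0.9, b=0.0)  # illustrative values, not trained
short = rnn_final_state([1.0, 0.0], **params)
long = rnn_final_state([1.0, 0.0, 0.0, 0.0, 0.0, 0.0], **params)
```

Because no parameter is tied to a particular time index, the same function handles a sequence of length 2 and one of length 6; the final state is a lossy summary in which the early input fades as more steps pass.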
-
-
expertsystem.com expertsystem.com
-
Machine learning is an application of artificial intelligence (AI) that provides systems the ability to automatically learn and improve from experience without being explicitly programmed
-
-
www.thelancet.com www.thelancet.com
-
Schwalbe, N., & Wahl, B. (2020). Artificial intelligence and the future of global health. The Lancet, 395(10236), 1579–1586. https://doi.org/10.1016/S0140-6736(20)30226-9
-
-
www.ft.com www.ft.com
-
Multiple articles from Financial Times - Future of AI and Digital Healthcare
-
- Apr 2020
-
deepspeech.readthedocs.io deepspeech.readthedocs.io
-
Python contributed examples
Mic VAD Streaming
This example demonstrates getting audio from microphone, running Voice-Activity-Detection and then outputting text. Full source code available on https://github.com/mozilla/DeepSpeech-examples.
VAD Transcriber
This example demonstrates VAD-based transcription with both console and graphical interface. Full source code available on https://github.com/mozilla/DeepSpeech-examples.
-
-
deepspeech.readthedocs.io deepspeech.readthedocs.io
-
Python API Usage example
Examples are from native_client/python/client.cc.
Creating a model instance and loading model
    ds = Model(args.model)
Performing inference
    if args.extended:
        print(metadata_to_string(ds.sttWithMetadata(audio, 1).transcripts[0]))
    elif args.json:
        print(metadata_json_output(ds.sttWithMetadata(audio, 3)))
    else:
        print(ds.stt(audio))
Full source code
-
-
github.com github.com
-
DeepSpeech is an open source Speech-To-Text engine, using a model trained by machine learning techniques based on Baidu's Deep Speech research paper. Project DeepSpeech uses Google's TensorFlow to make the implementation easier.
NOTE: This documentation applies to the 0.7.0 version of DeepSpeech only. Documentation for all versions is published on deepspeech.readthedocs.io.
To install and use DeepSpeech all you have to do is:
    # Create and activate a virtualenv
    virtualenv -p python3 $HOME/tmp/deepspeech-venv/
    source $HOME/tmp/deepspeech-venv/bin/activate
    # Install DeepSpeech
    pip3 install deepspeech
    # Download pre-trained English model files
    curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.pbmm
    curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/deepspeech-0.7.0-models.scorer
    # Download example audio files
    curl -LO https://github.com/mozilla/DeepSpeech/releases/download/v0.7.0/audio-0.7.0.tar.gz
    tar xvf audio-0.7.0.tar.gz
    # Transcribe an audio file
    deepspeech --model deepspeech-0.7.0-models.pbmm --scorer deepspeech-0.7.0-models.scorer --audio audio/2830-3980-0043.wav
A pre-trained English model is available for use and can be downloaded using the instructions below. A package with some example audio files is available for download in our release notes.
-
-
research.mozilla.org research.mozilla.org
-
Speech & Machine Learning
-
-
www.analyticsvidhya.com www.analyticsvidhya.com
-
import all the necessary libraries into our notebook. LibROSA and SciPy are the Python libraries used for processing audio signals.
    import os
    import librosa  # for audio processing
    import IPython.display as ipd
    import matplotlib.pyplot as plt
    import numpy as np
    from scipy.io import wavfile  # for audio processing
    import warnings
    warnings.filterwarnings("ignore")
Data Exploration and Visualization
Data Exploration and Visualization helps us to understand the data as well as pre-processing steps in a better way.
-
TensorFlow recently released the Speech Commands Datasets. It includes 65,000 one-second long utterances of 30 short words, by thousands of different people. We’ll build a speech recognition system that understands simple spoken commands. You can download the dataset from here.
-
Learn how to Build your own Speech-to-Text Model (using Python)
Aravind Pai, July 15, 2019
Overview
- Learn how to build your very own speech-to-text model using Python in this article
- The ability to weave deep learning skills with NLP is a coveted one in the industry; add this to your skillset today
- We will use a real-world dataset and build this speech-to-text model so get ready to use your Python skills!
-
-
keras.io keras.io
-
Keras is a high-level neural networks API, written in Python and capable of running on top of TensorFlow, CNTK, or Theano. It was developed with a focus on enabling fast experimentation. Being able to go from idea to result with the least possible delay is key to doing good research. Use Keras if you need a deep learning library that: Allows for easy and fast prototyping (through user friendliness, modularity, and extensibility). Supports both convolutional networks and recurrent networks, as well as combinations of the two. Runs seamlessly on CPU and GPU. Read the documentation at Keras.io. Keras is compatible with: Python 2.7-3.6.
-
-
docs.opencv.org docs.opencv.org
-
Installation in Windows Compatibility: > OpenCV 2.0 Author: Bernát Gábor You will learn how to setup OpenCV in your Windows Operating System!
-
Here you can read tutorials about how to set up your computer to work with the OpenCV library. Additionally you can find very basic sample source code to introduce you to the world of the OpenCV. Installation in Linux Compatibility: > OpenCV 2.0
-
-
opencv.org opencv.org
-
OpenCV (Open Source Computer Vision Library) is an open source computer vision and machine learning software library. OpenCV was built to provide a common infrastructure for computer vision applications and to accelerate the use of machine perception in the commercial products. Being a BSD-licensed product, OpenCV makes it easy for businesses to utilize and modify the code. The library has more than 2500 optimized algorithms, which includes a comprehensive set of both classic and state-of-the-art computer vision and machine learning algorithms. These algorithms can be used to detect and recognize faces, identify objects, classify human actions in videos, track camera movements, track moving objects, extract 3D models of objects, produce 3D point clouds from stereo cameras, stitch images together to produce a high resolution image of an entire scene, find similar images from an image database, remove red eyes from images taken using flash, follow eye movements, recognize scenery and establish markers to overlay it with augmented reality, etc. OpenCV has more than 47 thousand people of user community and estimated number of downloads exceeding 18 million. The library is used extensively in companies, research groups and by governmental bodies. Along with well-established companies like Google, Yahoo, Microsoft, Intel, IBM, Sony, Honda, Toyota that employ the library, there are many startups such as Applied Minds, VideoSurf, and Zeitera, that make extensive use of OpenCV. OpenCV’s deployed uses span the range from stitching streetview images together, detecting intrusions in surveillance video in Israel, monitoring mine equipment in China, helping robots navigate and pick up objects at Willow Garage, detection of swimming pool drowning accidents in Europe, running interactive art in Spain and New York, checking runways for debris in Turkey, inspecting labels on products in factories around the world on to rapid face detection in Japan. 
It has C++, Python, Java and MATLAB interfaces and supports Windows, Linux, Android and Mac OS. OpenCV leans mostly towards real-time vision applications and takes advantage of MMX and SSE instructions when available. Full-featured CUDA and OpenCL interfaces are being actively developed right now. There are over 500 algorithms and about 10 times as many functions that compose or support those algorithms. OpenCV is written natively in C++ and has a templated interface that works seamlessly with STL containers.
-
-
arxiv.org arxiv.org
-
Liu, D., Clemente, L., Poirier, C., Ding, X., Chinazzi, M., Davis, J. T., Vespignani, A., & Santillana, M. (2020). A machine learning methodology for real-time forecasting of the 2019-2020 COVID-19 outbreak using Internet searches, news alerts, and estimates from mechanistic models. ArXiv:2004.04019 [Cs, q-Bio, Stat]. http://arxiv.org/abs/2004.04019
-
-
arxiv.org arxiv.org
-
Kerzendorf, W. E., Patat, F., Bordelon, D., van de Ven, G., & Pritchard, T. A. (2020). Distributed peer review enhanced with natural language processing and machine learning. Nature Astronomy. https://doi.org/10.1038/s41550-020-1038-y
-
-
-
Punn, N. S., Sonbhadra, S. K., & Agarwal, S. (2020). COVID-19 Epidemic Analysis using Machine Learning and Deep Learning Algorithms [Preprint]. Health Informatics. https://doi.org/10.1101/2020.04.08.20057679
-
-
www.mdpi.com www.mdpi.com
-
there is also strong encouragement to make code re-usable, shareable, and citable, via DOI or other persistent link systems. For example, GitHub projects can be connected with Zenodo for indexing, archiving, and making them easier to cite alongside the principles of software citation [25].
GitHub and GitLab focus on plain-text formats that machines can easily recognize and read (machine readable).
Text mining is currently a major, fast-growing technology. Machine learning cannot run without the raw material that text mining provides.
For that reason, journals, especially those published abroad, have long released two versions of every paper: a PDF version (which is really no different from the paper of old) and an HTML version (which machines can read).
Binary word processors such as MS Word depend heavily on software technology owned by a business entity. Naturally, the code needed to read their files is locked away.
Even PDF, which is regarded as the easiest and safest way to share files, cannot easily be read by machines.
-
- Mar 2020
-
www.wired.com www.wired.com
-
a black software developer embarrassed Google by tweeting that the company’s Photos service had labeled photos of him with a black friend as “gorillas.”
-
More than two years later, one of those fixes is erasing gorillas, and some other primates, from the service’s lexicon. The awkward workaround illustrates the difficulties Google and other tech companies face in advancing image-recognition technology
-
-
www.quora.com www.quora.com
-
It doesn’t. What it does do is teach AI to recognize various things and fool you into thinking you’re getting better security. When you get something for free, you are the product.
-
-
www.youtube.com www.youtube.com
-
-
github.com github.com
-
en.wikipedia.org en.wikipedia.org
-
cloud.google.com cloud.google.com
-
- Nov 2019
-
www.cleveroad.com www.cleveroad.com
-
What’s the Difference Between AI, Machine Learning and Data Science?
-
- Sep 2019
-
onezero.medium.com onezero.medium.com
-
At the moment, GPT-2 uses a binary search algorithm, which means that its output can be considered a ‘true’ set of rules. If OpenAI is right, it could eventually generate a Turing complete program, a self-improving machine that can learn (and then improve) itself from the data it encounters. And that would make OpenAI a threat to IBM’s own goals of machine learning and AI, as it could essentially make better than even humans the best possible model that the future machines can use to improve their systems. However, there’s a catch: not just any new AI will do, but a specific type; one that uses deep learning to learn the rules, algorithms, and data necessary to run the machine to any given level of AI.
This is a machine-generated response from 2019. We are clearly closer than most people realize to machines that can pass a text-based Turing Test.
-
-
en.wikipedia.org en.wikipedia.org
-
Since all neurons in a single depth slice share the same parameters, the forward pass in each depth slice of the convolutional layer can be computed as a convolution of the neuron's weights with the input volume.[nb 2] Therefore, it is common to refer to the sets of weights as a filter (or a kernel), which is convolved with the input. The result of this convolution is an activation map, and the set of activation maps for each different filter are stacked together along the depth dimension to produce the output volume. Parameter sharing contributes to the translation invariance of the CNN architecture. Sometimes, the parameter sharing assumption may not make sense. This is especially the case when the input images to a CNN have some specific centered structure; for which we expect completely different features to be learned on different spatial locations. One practical example is when the inputs are faces that have been centered in the image: we might expect different eye-specific or hair-specific features to be learned in different parts of the image. In that case it is common to relax the parameter sharing scheme, and instead simply call the layer a "locally connected layer".
Important terms you hear repeatedly. Great visuals and graphics at https://distill.pub/2018/building-blocks/
-
-
setosa.io setosa.io
-
Here's a playground where you can select different kernel matrices and see how they affect the original image or build your own kernel. You can also upload your own image or use live video if your browser supports it.
Kernels: blur, bottom sobel, custom, emboss, identity, left sobel, outline, right sobel, sharpen, top sobel
The sharpen kernel emphasizes differences in adjacent pixel values. This makes the image look more vivid. The blur kernel de-emphasizes differences in adjacent pixel values. The emboss kernel (similar to the sobel kernel and sometimes referred to mean the same) gives the illusion of depth by emphasizing the differences of pixels in a given direction. In this case, in a direction along a line from the top left to the bottom right. The identity kernel leaves the image unchanged. How boring! The custom kernel is whatever you make it.
I'm all about my custom kernels!
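For anyone who wants to try a custom kernel outside the playground, here is a rough sketch of applying a 3x3 kernel to a tiny grayscale image. The kernel values below are the commonly used identity and sharpen matrices and are an assumption, not copied from the page; the loop computes a cross-correlation (no kernel flip), which gives the same result as convolution for these symmetric kernels.

```python
def apply_kernel(image, kernel):
    """Slide a 3x3 kernel over a 2D list of pixel values (interior pixels only, no padding)."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            acc = 0
            for i in range(3):
                for j in range(3):
                    # Weighted sum of the pixel and its eight neighbours.
                    acc += kernel[i][j] * image[r + i - 1][c + j - 1]
            out_row.append(acc)
        out.append(out_row)
    return out


IDENTITY = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
SHARPEN = [[0, -1, 0], [-1, 5, -1], [0, -1, 0]]

# Made-up 4x4 image with one brighter pixel.
image = [
    [10, 10, 10, 10],
    [10, 20, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]

same = apply_kernel(image, IDENTITY)
sharper = apply_kernel(image, SHARPEN)
```

The identity kernel returns the interior pixels unchanged, while the sharpen kernel pushes the bright pixel further from its neighbours, which is the "emphasizes differences in adjacent pixel values" behaviour described above.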
-
-
-
We developed a new metric, UAR, which compares the robustness of a model against an attack to adversarial training against that attack. Adversarial training is a strong defense that uses knowledge of an adversary by training on adversarially attacked images. A UAR score near 100 against an unforeseen adversarial attack implies performance comparable to a defense with prior knowledge of the attack, making this a challenging objective.
[3] To compute UAR, we average the accuracy of the defense across multiple distortion sizes and normalize by the performance of an adversarially trained model; a precise definition is in our paper.
@metric
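A rough sketch of the metric as the quoted footnote describes it: average accuracy across distortion sizes, normalized by the adversarially trained baseline and scaled to 100. The precise definition is in the paper; the function below is one reading of the sentence, and the accuracy numbers are hypothetical.

```python
def uar(defense_acc, adv_trained_acc):
    """100 * (summed defense accuracy) / (summed accuracy of adversarially trained models)."""
    if len(defense_acc) != len(adv_trained_acc):
        raise ValueError("need one accuracy per distortion size in both lists")
    # The ratio of sums equals the ratio of averages because both lists have the same length.
    return 100.0 * sum(defense_acc) / sum(adv_trained_acc)


# Hypothetical accuracies at three distortion sizes, made up for illustration.
defense = [0.80, 0.60, 0.40]
adv_trained = [0.90, 0.75, 0.60]
score = uar(defense, adv_trained)
```

With these made-up numbers the score comes out near 80, meaning the defense retains roughly four fifths of the accuracy of models trained with knowledge of the attack.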
-
- Aug 2019
-
colah.github.io colah.github.io
-
Using multiple copies of a neuron in different places is the neural network equivalent of using functions. Because there is less to learn, the model learns more quickly and learns a better model. This technique – the technical name for it is ‘weight tying’ – is essential to the phenomenal results we’ve recently seen from deep learning.
This parameter sharing allows CNNs, for example, to need far fewer parameters/weights than fully connected NNs.
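A back-of-the-envelope sketch of that difference, counting parameters for a dense layer and a convolutional layer that produce the same output shape. The layer sizes are hypothetical, and the convolution is assumed to be padded so the spatial size is preserved.

```python
def dense_params(n_in, n_out):
    # One weight per (input, output) pair plus one bias per output.
    return n_in * n_out + n_out


def conv_params(kernel_h, kernel_w, in_channels, out_channels):
    # One small kernel per output channel, reused at every spatial position.
    return kernel_h * kernel_w * in_channels * out_channels + out_channels


# Hypothetical sizes: a 28x28 single-channel image mapped to 32 feature maps of 28x28.
dense = dense_params(28 * 28, 32 * 28 * 28)
conv = conv_params(3, 3, 1, 32)
```

Under these assumptions the dense layer needs about 19.7 million parameters and the convolutional layer needs 320.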
-
The known connection between geometry, logic, topology, and functional programming suggests that the connections between representations and types may be of fundamental significance.
Examples for each?
-
Representations are Types With every layer, neural networks transform data, molding it into a form that makes their task easier to do. We call these transformed versions of data “representations.” Representations correspond to types.
Interesting.
Like a Queue Type represents a FIFO flow and a Stack a FILO flow, where the space we transformed is the operation space of the type (eg a Queue has a folded operation space compared to an Array)
Just free styling here...
-
In this view, the representations narrative in deep learning corresponds to type theory in functional programming. It sees deep learning as the junction of two fields we already know to be incredibly rich. What we find, seems so beautiful to me, feels so natural, that the mathematician in me could believe it to be something fundamental about reality.
compositional deep learning
-
Appendix: Functional Names of Common Layers
Deep Learning Name | Functional Name
Learned Vector | Constant
Embedding Layer | List Indexing
Encoding RNN | Fold
Generating RNN | Unfold
General RNN | Accumulating Map
Bidirectional RNN | Zipped Left/Right Accumulating Maps
Conv Layer | “Window Map”
TreeNet | Catamorphism
Inverse TreeNet | Anamorphism
👌translation. I like to think about embeddings as List lookups
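A few rows of that table, sketched with ordinary Python functions. The `step` function stands in for a learned RNN cell and the numbers are made up; `unfold` is a hypothetical helper, since Python's standard library has no built-in unfold.

```python
from functools import reduce
from itertools import accumulate

# Embedding layer as list indexing: token id -> stored vector (values are made up).
embedding_table = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
embedded = [embedding_table[i] for i in [2, 0]]


def step(state, x):
    # Stand-in for an RNN cell; a real cell would be a learned function.
    return 0.5 * state + x


inputs = [1.0, 2.0, 3.0]

# Encoding RNN as a fold: consume the whole sequence, keep only the final state.
encoded = reduce(step, inputs, 0.0)

# General RNN as an accumulating map: keep the state after every input.
states = list(accumulate(inputs, step, initial=0.0))[1:]


def unfold(cell, state, n):
    """Generating RNN as an unfold: grow a sequence from a single seed state."""
    out = []
    for _ in range(n):
        state, y = cell(state)
        out.append(y)
    return out


generated = unfold(lambda s: (s / 2, s), 8.0, 3)
```

The fold and the accumulating map share the same `step`; the only difference is whether the intermediate states are kept.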
-
-
en.wikipedia.org en.wikipedia.org
-
As a log-bilinear regression model for unsupervised learning of word representations, it combines the features of two model families, namely the global matrix factorization and local context window methods
What does "log-bilinear regression" mean exactly?
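A possible answer, based on the GloVe objective as I understand it rather than on the quoted sentence: the model scores a word pair with a dot product of a word vector $w_i$ and a context vector $\tilde{w}_j$, which is linear in each vector when the other is held fixed (hence "bilinear"), and it fits that score to the logarithm of the co-occurrence count $X_{ij}$ (hence "log"). The weighting function $f$ and the bias terms follow the usual presentation of the model.

```latex
% Model: a bilinear score (plus biases) approximates the log co-occurrence count.
w_i^{\top} \tilde{w}_j + b_i + \tilde{b}_j \approx \log X_{ij}

% Fitted by weighted least squares over word pairs, i.e. a regression on \log X_{ij}.
J = \sum_{i,j} f(X_{ij}) \left( w_i^{\top} \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2
```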
-
-
labsblog.f-secure.com labsblog.f-secure.com
-
Security Issues, Dangers And Implications Of Smart Systems
-
- Jul 2019
-
mml-book.github.io mml-book.github.io
-
We will discuss classification in the context of support vector machines
SVMs aren't used that much in practice anymore. It's more of an academic fling, because they're nice to work with mathematically. Empirically, Tree Ensembles or Neural Nets are almost always better.
-
-
machinelearningmastery.com machinelearningmastery.com
-
Ensemble Machine Learning Algorithms in Python with scikit-learn
Read on July 4, 2019
-
-
www.oreilly.com www.oreilly.com
-
Machine learning models are basically mathematical functions that represent the relationship between different aspects of data.
-
-
jmlr.csail.mit.edu jmlr.csail.mit.edu
-
Compared with neural networks configured by a pure grid search, we find that random search over the same domain is able to find models that are as good or better within a small fraction of the computation time.
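One commonly cited intuition for this result, which the quoted sentence itself does not spell out, is that when only a few hyperparameters really matter, a grid wastes trials by repeating the same values of the important one. The sketch below counts distinct values tried along one dimension for the same budget; the two-dimensional search space and the budget of 9 trials are assumptions for illustration.

```python
import random


def distinct_values_tried(points, dim):
    # Number of different settings explored along one hyperparameter.
    return len({p[dim] for p in points})


# Grid search: 3 x 3 = 9 trials over two hyperparameters scaled to [0, 1].
levels = [0.0, 0.5, 1.0]
grid = [(a, b) for a in levels for b in levels]

# Random search: the same budget of 9 trials, sampled uniformly.
rng = random.Random(0)
rand = [(rng.random(), rng.random()) for _ in range(9)]

grid_distinct = distinct_values_tried(grid, 0)
rand_distinct = distinct_values_tried(rand, 0)
```

For the same nine trials, the grid explores three settings of the first hyperparameter and random search explores nine.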
-
- Jun 2019
-
towardsdatascience.com towardsdatascience.com
-
To interpret a model, we require the following insights:
- Features in the model which are most important.
- For any single prediction from a model, the effect of each feature in the data on that particular prediction.
- Effect of each feature over a large number of possible predictions.
Machine learning interpretability
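As a sketch of the first insight only (which features matter most), here is permutation importance: shuffle one feature column and measure how much the error grows. This is one technique among several and is not necessarily the one the article uses; the `model` below is a hand-written stand-in, and the data are made up.

```python
import random


def model(row):
    # Stand-in for a trained model: it uses feature 0 and ignores feature 1.
    return 3.0 * row[0]


def mse(rows, targets):
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)


def permutation_importance(rows, targets, feature, rng):
    """Increase in error after shuffling one feature column across rows."""
    column = [r[feature] for r in rows]
    rng.shuffle(column)
    shuffled = [r[:feature] + [v] + r[feature + 1:] for r, v in zip(rows, column)]
    return mse(shuffled, targets) - mse(rows, targets)


rng = random.Random(0)
rows = [[float(i), float(i % 3)] for i in range(20)]
targets = [model(r) for r in rows]

importance_used = permutation_importance(rows, targets, 0, rng)
importance_ignored = permutation_importance(rows, targets, 1, rng)
```

Shuffling the feature the model relies on raises the error, while shuffling the ignored feature leaves it unchanged.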
-
-
www.wired.com www.wired.com
-
By comparison, Amazon’s Best Seller badges, which flag the most popular products based on sales and are updated hourly, are far more straightforward. For third-party sellers, “that’s a lot more powerful than this Choice badge, which is totally algorithmically calculated and sometimes it’s totally off,” says Bryant.
"Amazon's Choice" is made by an algorithm.
Essentially, "Amazon" is Skynet.
-
-
www.tensorflow.org www.tensorflow.org
-
This problem is called overfitting—it's like memorizing the answers instead of understanding how to solve a problem.
Simple and clear explanation of overfitting
-
- May 2019
-
cdn.aiindex.org cdn.aiindex.org
-
increased participation in organizations like AI4ALL and Women in Machine Learning
-
-
ML teaching events
-
-
policychangeindex.org policychangeindex.org
-
policy change index - machine learning on corpus of text to identify and predict policy changes in China
-
-
parametric.press parametric.press
-
simple explanation of AI bias - sources etc.
-
- Mar 2019
-
www.jeremyjordan.me www.jeremyjordan.me
-
normalization
-
-
-
Mention McDonald’s to someone today, and they're more likely to think about Big Mac than Big Data. But that could soon change: The fast-food giant has embraced machine learning, in a fittingly super-sized way. McDonald’s is set to announce that it has reached an agreement to acquire Dynamic Yield, a startup based in Tel Aviv that provides retailers with algorithmically driven "decision logic" technology. When you add an item to an online shopping cart, it’s the tech that nudges you about what other customers bought as well. Dynamic Yield reportedly had been recently valued in the hundreds of millions of dollars; people familiar with the details of the McDonald’s offer put it at over $300 million. That would make it the company's largest purchase since it acquired Boston Market in 1999.
McDonald's are getting into machine learning. Beware.
-
- Feb 2019
-
stackoverflow.com stackoverflow.com
-
Efficient way to loop over Pandas Dataframe to make dummy variables (1 or 0 input)
dummy encoding
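A minimal sketch of dummy (one-hot) encoding without any library, to show what the 1-or-0 columns look like. The `colors` column is made up, and `dummy_encode` is a hypothetical helper; in pandas itself, `pandas.get_dummies` produces these columns without an explicit loop.

```python
def dummy_encode(values):
    """Return one 0/1 column per distinct category, in sorted category order."""
    categories = sorted(set(values))
    return {cat: [1 if v == cat else 0 for v in values] for cat in categories}


colors = ["red", "green", "red", "blue"]  # made-up example column
dummies = dummy_encode(colors)
```

Each row has exactly one 1 across the generated columns, marking which category it belongs to.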
-
-
dougengelbart.org dougengelbart.org
-
For instance, an aborigine who possesses all of our basic sensory-mental-motor capabilities, but does not possess our background of indirect knowledge and procedure, cannot organize the proper direct actions necessary to drive a car through traffic, request a book from the library, call a committee meeting to discuss a tentative plan, call someone on the telephone, or compose a letter on the typewriter.
In other words: culture. I'm pretty sure that Engelbart would agree with the statement that someone who could order a book from a library would likely not know the best way to find a nearby water source, as the right kind of aborigine would know. Collective intelligence is a monotonically increasing store of knowledge that is maintained through social learning -- not just social learning, but teaching. Many species engage in social learning, but humans are the only primates with visible sclera -- the whites of our eyeballs -- which enables even infants to track where their teacher/parent is looking. I think this function of culture is what Engelbart would call "C work"
A Activity: 'Business as Usual'. The organization's day to day core business activity, such as customer engagement and support, product development, R&D, marketing, sales, accounting, legal, manufacturing (if any), etc. Examples: Aerospace - all the activities involved in producing a plane; Congress - passing legislation; Medicine - researching a cure for disease; Education - teaching and mentoring students; Professional Societies - advancing a field or discipline; Initiatives or Nonprofits - advancing a cause. B Activity: Improving how we do that. Improving how A work is done, asking 'How can we do this better?' Examples: adopting a new tool(s) or technique(s) for how we go about working together, pursuing leads, conducting research, designing, planning, understanding the customer, coordinating efforts, tracking issues, managing budgets, delivering internal services. Could be an individual introducing a new technique gleaned from reading, conferences, or networking with peers, or an internal initiative tasked with improving core capability within or across various A Activities. C Activity: Improving how we improve. Improving how B work is done, asking 'How can we improve the way we improve?' Examples: improving effectiveness of B Activity teams in how they foster relations with their A Activity customers, collaborate to identify needs and opportunities, research, innovate, and implement available solutions, incorporate input, feedback, and lessons learned, run pilot projects, etc. Could be a B Activity individual learning about new techniques for innovation teams (reading, conferences, networking), or an initiative, innovation team or improvement community engaging with B Activity and other key stakeholders to implement new/improved capability for one or more B activities.
In other words, human culture, using language, artifacts, methodology, and training, bootstrapped collective intelligence; what Engelbart proposed, then, was to apply C work to culture's bootstrapping capabilities.
-
-
rightsanddissent.org rightsanddissent.org
-
Nearly half of FBI rap sheets failed to include information on the outcome of a case after an arrest—for example, whether a charge was dismissed or otherwise disposed of without a conviction, or if a record was expunged
This explains my personal experience here: https://hyp.is/EIfMfivUEem7SFcAiWxUpA/epic.org/privacy/global_entry/default.html (Why someone who had Global Entry was flagged for a police incident before he applied for Global Entry).
-
Applicants also agree to have their fingerprints entered into DHS’ Automatic Biometric Identification System (IDENT) “for recurrent immigration, law enforcement, and intelligence checks, including checks against latent prints associated with unsolved crimes.”
The phrase "intelligence checks" is very concerning here, as it suggests pretty much what has already been leaked: that the US is running complex autonomous screening of all of this data all the time. This also opens up the possibility of discriminatory algorithms, since most of these are probably rooted in machine learning techniques and the criminal justice system in the US today tends to be fairly biased towards certain groups of people to begin with.
-
It cited research, including some authored by the FBI, indicating that “some of the biometrics at the core of NGI, like facial recognition, may misidentify African Americans, young people, and women at higher rates than whites, older people, and men, respectively.”
This re-affirms the previous annotation that the set of training data for the intelligence checks the US runs on global entry data is biased towards certain groups of people.
-
- Jan 2019
-
www.learndatasci.com www.learndatasci.com
-
Measurements are variables that can be quantified. All data in the output above are measurements. Some of these measurements, such as state_percentile_16, avg_score_16 and school_rating, are outcomes; these outcomes cannot be used to explain one another. For example, explaining school_rating as a result of state_percentile_16 (test scores) is circular logic. Therefore we need a second class of variables.
-
-
hypothes.is hypothes.is
-
nning. It's time to start annotating som
sfsdf
-
- Nov 2018
-
www.technologyreview.com www.technologyreview.com
-
The vast majority of machine-learning applications rely on supervised learning.
So then we know that most people will use supervised learning that requires less computational power and knowledge.
-
-
www.coursera.org www.coursera.org
-
Mathematics for Machine Learning
specialization
-
- Sep 2018
-
am207.github.io am207.github.io
-
in equation B for the marginal of a Gaussian, only the covariance of the block of the matrix involving the unmarginalized dimensions matters! Thus “if you ask only for the properties of the function (you are fitting to the data) at a finite number of points, then inference in the Gaussian process will give you the same answer if you ignore the infinitely many other points, as if you would have taken them all into account!” (Rasmussen)
key insight into Gaussian processes
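The marginalization property quoted above can be checked numerically. The following is a minimal sketch in plain Python, not code from the annotated course: the squared-exponential kernel, the input points, and the helper names `rbf_kernel` and `marginal_block` are all assumptions chosen for illustration. It shows that the covariance of the marginal over a subset of points is just the matching sub-block of the full covariance, which equals the kernel evaluated on that subset alone.

```python
import math

def rbf_kernel(xs, length_scale=1.0):
    """Covariance matrix with K[i][j] = exp(-(x_i - x_j)^2 / (2 * length_scale^2)).

    The squared-exponential kernel is an assumption made for this sketch.
    """
    return [
        [math.exp(-((a - b) ** 2) / (2.0 * length_scale ** 2)) for b in xs]
        for a in xs
    ]

def marginal_block(cov, keep):
    """Covariance of the Gaussian marginal over the dimensions listed in `keep`.

    Marginalizing a Gaussian only requires selecting the matching sub-block.
    """
    return [[cov[i][j] for j in keep] for i in keep]

# Hypothetical inputs: five points, of which we keep two.
all_points = [0.0, 0.5, 1.0, 1.5, 2.0]
keep = [1, 3]

# Route 1: build the covariance over every point, then marginalize.
full_cov = rbf_kernel(all_points)
marginal_from_full = marginal_block(full_cov, keep)

# Route 2: ignore the other points and build the covariance directly.
direct_cov = rbf_kernel([all_points[i] for i in keep])

blocks_match = all(
    math.isclose(marginal_from_full[i][j], direct_cov[i][j])
    for i in range(len(keep))
    for j in range(len(keep))
)
```

Here `blocks_match` compares the two routes entry by entry. Adding more points to `all_points` does not change the entries for the kept points, which is the sense in which the other points can be ignored.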
-
-
www.nature.com www.nature.com
-
The new learned optical correlator uses existing light to save energy costs over an optoelectronic two-layer convolutional neural network (CNN). https://www.sciencedaily.com/releases/2018/08/180802130750.htm
-
-
machinelearnings.co machinelearnings.co
-
AI and machine learning
-
-
www.statisticssolutions.com www.statisticssolutions.com
-
predictive analysis
Predictive analytics encompasses a variety of statistical techniques from data mining, predictive modelling, and machine learning, that analyze current and historical facts to make predictions about future or otherwise unknown events.
-
- Jul 2018
-
course-computational-literary-analysis.netlify.com course-computational-literary-analysis.netlify.com
-
There is here, moral, if not legal, evidence, that the murder was committed by the Indians.
This is a very interesting take by Sergeant Cuff on "evidence" as being moral if not legal. It makes me question exactly what he means by that, and whether there is a way to use computational analysis to find out. We could perhaps start by parsing out "evidence" throughout the text with a machine learning algorithm to help define evidence and then, going forward, devise a way (maybe with sentiment analysis) to distinguish moral evidence from legal evidence.
-
-
www.youtube.com www.youtube.com
-
~32:00 What about the domain of the function being effectively lower dimensional, rather than a strong regularity assumption? That would also work, right? Could this be the case for images? (What's the dimensionality of the manifold of natural images?)
Nice. I like the idea of regularity <> low dimensional representation. I guess by that general definition, the above is a form of regularity..
He comments on this at 38:30
-
-
www.sapiens.org www.sapiens.org
-
This system of demonstrating tasks to one robot that can then transfer its skills to other robots with different body shapes, strengths, and constraints might just be the first step toward independent social learning in robots. From there, we might be on the road to creating cultured robots.
-
Soon we might add robots to this list. While our fanciful desert scene of robots teaching each other how to defuse bombs lies in the distant future, robots are beginning to learn socially. If one day robots start to develop and share knowledge independently of humans, might that be the seed for robot culture?
-
This imaginary scene shows the power of learning from others. Anthropologists and zoologists call this “social learning”: picking up new information by observing or interacting with others and the things others produce. Social learning is rife among humans and across the wider animal kingdom. As we discussed in our previous post, learning socially is fundamental to how humans become fully rounded people, in all our diversity, creativity, and splendor.
-
-
www.npr.org www.npr.org
-
"It's so scary that it works," Perelman sighs. "Machines are very brilliant for certain things and very stupid on other things. This is a case where the machines are very, very stupid."
-
- Apr 2018
-
imgflip.com imgflip.com
-
- Mar 2018
-
webfoundation.org webfoundation.org
-
Artificial intelligence (AI), machine learning and deep learning
Graphical explanation of artificial intelligence, machine learning, and deep learning
-
- Dec 2017
-
-
Most of the recent advances in AI depend on deep learning, which is the use of backpropagation to train neural nets with multiple layers ("deep" neural nets).
Neural nets consist of layers of nodes, with edges from each node to the nodes in the next layer. The first and last layers are input and output. The output layer might only have two nodes, representing true or false. Each node holds a value representing how excited it is. Each edge has a value representing strength of connection, which determines how much of the excitement passes through.
The edges in an untrained neural net start with random values. The training data consists of a series of samples that are already labeled. If the output is wrong, the edges are adjusted according to how much they contributed to the error. It's called backpropagation because it starts with the output nodes and works toward the input nodes.
Deep neural nets can be effective, but only for single specific tasks. And they need huge sets of training data. They can also be tricked rather easily. Worse, someone who has access to the net can discover ways of adding noise to images that will make the net "see" things that obviously aren't there.
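The training procedure summarized above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not the method of any particular library or of the annotated article: the 2-2-1 layout, sigmoid activations, squared-error loss, learning rate, single training sample, and every function name here are hypothetical choices. Edge weights start random, the error is computed at the output node, and the error signal is pushed back toward the input to adjust each edge by its contribution.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w_hidden, w_out, x):
    """Return the hidden activations and the output activation for input x."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    output = sigmoid(sum(w * h for w, h in zip(w_out, hidden)))
    return hidden, output

def loss(w_hidden, w_out, x, target):
    """Squared error between the net's output and the label."""
    _, output = forward(w_hidden, w_out, x)
    return 0.5 * (output - target) ** 2

def backprop(w_hidden, w_out, x, target):
    """Gradient of the loss for every edge, starting from the output node."""
    hidden, output = forward(w_hidden, w_out, x)
    # Error signal at the output node (chain rule through the sigmoid).
    delta_out = (output - target) * output * (1.0 - output)
    grad_out = [delta_out * h for h in hidden]
    # Pass the error signal back to each hidden node through its outgoing edge.
    delta_hidden = [delta_out * w * h * (1.0 - h) for w, h in zip(w_out, hidden)]
    grad_hidden = [[d * xi for xi in x] for d in delta_hidden]
    return grad_hidden, grad_out

def train_step(w_hidden, w_out, x, target, lr=0.5):
    """Adjust each edge in proportion to how much it contributed to the error."""
    grad_hidden, grad_out = backprop(w_hidden, w_out, x, target)
    new_hidden = [
        [w - lr * g for w, g in zip(row, grad_row)]
        for row, grad_row in zip(w_hidden, grad_hidden)
    ]
    new_out = [w - lr * g for w, g in zip(w_out, grad_out)]
    return new_hidden, new_out

# Edges in an untrained net start with random values.
random.seed(0)
w_hidden = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(2)]
w_out = [random.uniform(-1, 1) for _ in range(2)]

# One labeled sample (hypothetical): input [1, 0] should produce output 1.
x, target = [1.0, 0.0], 1.0
loss_before = loss(w_hidden, w_out, x, target)
w_hidden, w_out = train_step(w_hidden, w_out, x, target)
loss_after = loss(w_hidden, w_out, x, target)
```

Comparing `loss_before` with `loss_after` shows the effect of a single adjustment. A real deep net repeats this step over many labeled samples and many more layers, which is where the need for large training sets comes from.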
-
- Nov 2017
-
www.datavisor.com www.datavisor.com
-
UML automatically finds these hidden patterns to link seemingly unrelated accounts and customers. These links can be one of thousands of data fields that the UML model ingests.
Why does this have to be done in a different system?
-
-
genomebiology.biomedcentral.com genomebiology.biomedcentral.com
-
MCC
Matthews correlation coefficient
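For reference, the Matthews correlation coefficient for a binary classifier is computed from the four confusion-matrix counts. The sketch below uses the standard formula; the function name `matthews_corrcoef_from_counts` and the choice to return 0.0 when the denominator is zero are assumptions made here (a common convention), not something stated in the annotated paper.

```python
import math

def matthews_corrcoef_from_counts(tp, tn, fp, fn):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Ranges from -1 (total disagreement) through 0 (no better than chance)
    to +1 (perfect prediction).
    """
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        # Assumed convention: an undefined MCC is reported as 0.0.
        return 0.0
    return numerator / denominator

# Hypothetical counts for illustration only.
perfect = matthews_corrcoef_from_counts(tp=5, tn=5, fp=0, fn=0)
inverted = matthews_corrcoef_from_counts(tp=0, tn=0, fp=5, fn=5)
mixed = matthews_corrcoef_from_counts(tp=6, tn=3, fp=1, fn=2)
```

Unlike plain accuracy, the score uses all four counts, so it stays informative when the two classes are very different in size.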
-
- Oct 2017
-
www.nytimes.com www.nytimes.com
-
Back in 2012, Zeynep Tufekci pointed out that election campaigns driven by big-data and social media could be bad for democracy.
-
- Sep 2017
-
toidicodedao.com toidicodedao.com
-
First, I think you need to get a grasp of machine learning and algorithms; you can start with online courses. I recommend Andrew Ng's Machine Learning course, which is considered the bible for data scientists. After that, you can start with Python or R and take part in challenges on Kaggle. Kaggle is a platform where data scientists participate, win prize money, and compete with one another for ranking. Many people have also told me that Kaggle is the best and shortest path into data science.
Learning the basics
-
- Aug 2017
-
www.lenddo.com www.lenddo.com
-
financial inclusion startup
-
-
www.americanbanker.com www.americanbanker.com
-
blog.athelas.com blog.athelas.com
-
Excellent overview. I found the papers a little hard to grasp, and this cleared a lot of that up.
-
- Jul 2017
-
ml.berkeley.edu ml.berkeley.edu
-
A very accessible explanation of the bias-variance trade-off in ML.
-
great examples and interactive viz
-
- Jun 2017
-
w4nderlu.st w4nderlu.st
-