In an early blog on Recurrent Neural Networks (still my favorite for its insight into what's really going on), the author found individual neurons that evolved/optimized themselves into being gatekeepers for certain high-level features in the text: this section.

After training an RNN to generate C code, one neuron became sensitive to the length of the line, another turned on inside quotation marks and was off outside, another was only on for the predicate of if statements, another for comments or quoted text, and one more for the depth of nested brackets/indentation. Most of the neurons were not easily interpretable, but presumably, combinations of them controlled combinations of high-level features.

Could a set of booleans controlling things like "if quotation," "if comment," "if predicate" plus many other conditions be considered an internal representation of C code? If I were to write an algorithm for generating C code, it almost certainly would include variables that controlled these things.

The way I look at it, the biggest difference between machine learning and hand-written code is the development process. Hand-written code is like craftsmanship, like building a chair from wood, while machine learning is like farming: putting a seed in the ground, controlling the environment (humidity, temperature, hyperparameters, training datasets) and waiting. Design is good for some products and agriculture is good for others. Agriculture is a particularly good way to make very complex things with loose constraints on how it works: I would not want to design a tree, but when nature grows a tree, I don't care if it has three branches on the left and two on the right or vice-versa.

I'm glad that we now have two ways of making software, craftsmanship and farming. It's good to have more tools.
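To make the "set of booleans" idea concrete, here is a rough sketch of the craftsmanship version: a hand-written scanner that keeps the same kind of state the RNN neurons seemed to discover. Everything in it (the `Features` tuple, `trace_features`, the choice of features) is my own illustration, not anything from the blog, and it is deliberately simplified: it ignores `//` comments, character literals, and the "if predicate" feature.

```python
from typing import NamedTuple


class Features(NamedTuple):
    """Hypothetical hand-designed state, one snapshot per character."""
    in_quote: bool    # inside a "..." string literal
    in_comment: bool  # inside a /* ... */ block comment
    depth: int        # nesting depth of { } brackets
    line_length: int  # characters since the last newline


def trace_features(source: str) -> list[Features]:
    """Scan C-like text and record the state after each character."""
    in_quote = False
    in_comment = False
    depth = 0
    line_length = 0
    prev = ""
    trace = []

    for ch in source:
        line_length = 0 if ch == "\n" else line_length + 1

        if in_comment:
            if prev == "*" and ch == "/":
                in_comment = False
                ch = ""  # this '/' must not start a new token
        elif in_quote:
            if prev == "\\":
                ch = ""  # escaped character: consume it
            elif ch == '"':
                in_quote = False
        elif prev == "/" and ch == "*":
            in_comment = True
            ch = ""  # so "/*/" is not read as open-then-close
        elif ch == '"':
            in_quote = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1

        trace.append(Features(in_quote, in_comment, depth, line_length))
        prev = ch

    return trace


if __name__ == "__main__":
    sample = 'if (x) { s = "a{b"; /* } */ }'
    for ch, feats in zip(sample, trace_features(sample)):
        print(repr(ch), feats)
```

Each field plays the role of one interpretable "neuron": it switches on and off as the scan moves through the text. The difference is only in how it got there; here I chose the variables up front, whereas the network grew its own.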
He basically likens it to evolution without being aware of it. I love how complex and intricate the functions of these neurons become as they emerge; it really is like evolution.