- Jan 2019
A comment at the bottom by Barbara H. Partee, another panelist alongside Chomsky:
I'd like to see inclusion of a version of the interpretation problem that reflects my own work as a working formal semanticist and is not inherently more probabilistic than the formal 'generation task' (which, by the way, has very little in common with the real-world sentence production task, a task that is probably just as probabilistic as the real-world interpretation task).
There is a notion of success ... which I think is novel in the history of science. It interprets success as approximating unanalyzed data.
This article makes a solid argument that statistical and probabilistic models are useful not only for prediction but also for understanding. Perhaps this is a key point that Chomsky misses, but his quote narrows the definition of success to models that approximate "unanalyzed data".
However, it seems clear from this article that the successes of ML models have gone beyond approximating unanalyzed data.
But O'Reilly realizes that it doesn't matter what his detractors think of his astronomical ignorance, because his supporters think he has gotten exactly to the key issue: why? He doesn't care how the tides work, tell him why they work. Why is the moon at the right distance to provide a gentle tide, and exert a stabilizing effect on earth's axis of rotation, thus protecting life here? Why does gravity work the way it does? Why does anything at all exist rather than not exist? O'Reilly is correct that these questions can only be addressed by mythmaking, religion or philosophy, not by science.
Scientific insight is not the same as metaphysical questioning, despite sharing the same question word. Asking "Why do epidemics have a peak?" is not the same as asking "Why does life exist?". Actually, that second question can be interpreted in two different ways, one metaphysical and one physical; the physical interpretation means that "why" is looking for a material cause. So even simple, approximate models can have generalizing value, such as the Schelling segregation model. There is a difference between models built to predict and models built to explain, and both have value. As mentioned later in this document, theory and data are like two feet: each needs the other.
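The Schelling model is a nice illustration of a simple model with generalizing value. Here is a minimal one-dimensional sketch (an assumption-laden toy, not Schelling's original 2-D formulation): agents of two types sit on a line with some empty cells, and any agent with too few same-type neighbors moves to a random empty cell. Even with mild preferences, segregation tends to emerge.

```python
import random

def neighbors(grid, i, radius=2):
    # Occupied cells within `radius` of position i (excluding i itself).
    lo, hi = max(0, i - radius), min(len(grid), i + radius + 1)
    return [grid[j] for j in range(lo, hi) if j != i and grid[j] is not None]

def is_happy(grid, i, threshold=0.5):
    # An agent is happy if at least `threshold` of its neighbors match its type.
    ns = neighbors(grid, i)
    if not ns:
        return True
    same = sum(1 for n in ns if n == grid[i])
    return same / len(ns) >= threshold

def step(grid):
    # Move each unhappy agent to a random empty cell; report whether anyone moved.
    empties = [i for i, c in enumerate(grid) if c is None]
    moved = False
    for i in range(len(grid)):
        if grid[i] is not None and not is_happy(grid, i) and empties:
            j = random.choice(empties)
            grid[j], grid[i] = grid[i], None
            empties.remove(j)
            empties.append(i)
            moved = True
    return moved

random.seed(0)
grid = [random.choice(['A', 'B', None]) for _ in range(60)]
for _ in range(50):
    if not step(grid):
        break  # everyone is happy: clustering has settled in

print(''.join(c or '.' for c in grid))
```

The point of the model is exactly the one made above: it predicts almost nothing about any real city, yet it explains how macro-level segregation can emerge from micro-level preferences far weaker than outright intolerance.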
This page discusses different types of models:
- statistical models
- probabilistic models
- trained models
and explores the interaction between prediction and insight.
Chomsky (1991) shows that he is happy with a Mystical answer, although he shifts vocabulary from "soul" to "biological endowment."
Wasn't one of Chomsky's ideas that humans are uniquely suited to language? The counter-perspective espoused here appears to be that language emerges, and that humans are only distinguished by the magnitude of their capacity for language; other species probably have proto-language, and there is likely a smooth transition from one to the other. In fact, there isn't a "one" nor an "other" in a true qualitative sense.
So what if we discover something about humans that appears to be required for our language? Does that, then, lead us to knowledge of how human language is qualitatively different from the communication of other species?
Can probabilistic models account for qualitative differences? If a very low, but nonzero, probability is assigned to an event that our theory-based view says is impossible, that doesn't make the probabilistic model useless: "All models are wrong, some are useful." But it does seem to carry the assumption that there are no real categories, that categories shift according to need and are useful only for describing things, and that the underlying nature of reality is a continuum.
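This nonzero-probability-for-the-"impossible" behavior is easy to see in a toy language model. The sketch below (corpus and sentences are made up for illustration) builds a bigram model with add-one smoothing, so even a word salad our grammatical intuition rules out gets a small positive probability, just a much smaller one than a plausible sentence:

```python
from collections import Counter

# Toy corpus; every probability below depends on these made-up sentences.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()
vocab = set(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus[:-1])

def prob(w2, w1):
    # P(w2 | w1) with add-one (Laplace) smoothing: never exactly zero.
    return (bigrams[(w1, w2)] + 1) / (unigrams[w1] + len(vocab))

def sentence_prob(words):
    p = 1.0
    for w1, w2 in zip(words, words[1:]):
        p *= prob(w2, w1)
    return p

plausible = sentence_prob("the cat sat on the rug".split())
weird = sentence_prob("mat the on sat cat the".split())
print(plausible, weird)
```

The scrambled sentence's probability is tiny but not zero, which is exactly the property at issue: the model expresses "ungrammatical" as a quantitative judgment rather than a categorical one.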
To Guide Data Collection
This seems to say, essentially, that models are useful for prediction, but prediction of unknowns in the existing data rather than of future system dynamics.
Without models, in other words, it is not always clear what data to collect!
Or how to interpret that data in the light of complex systems.
Plate tectonics surely explains earthquakes, but does not permit us to predict the time and place of their occurrence.
But how do you tell the value of an explanation? Shouldn't it empower you with some new action or ability? It could be that the explanation is partly a by-product of other prediction-making theories (the way plate tectonics relies on thermodynamics, fluid dynamics, and rock mechanics, which do make predictions).
It might also make predictions itself: that volcanoes not on clear plate boundaries might somehow differ (in distribution of occurrence over time, correlation with earthquakes, magma content, eruption size...), or that understanding the explanation for lightning lets you predict that a grounded metal pole above a house will help protect it from strikes. This might be a different kind of prediction, though, since it isn't predicting future dynamics. Knowing how epidemics work doesn't necessarily allow prediction of total infected counts or outbreak length, but it does allow prediction of the minimum vaccination rate needed to avert an outbreak.
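The vaccination example can be made concrete with the standard herd-immunity threshold. If each case infects R0 others in a fully susceptible population, vaccinating a fraction p cuts the effective reproduction number to R0·(1 − p); outbreaks die out when that falls below 1, giving p_c = 1 − 1/R0. The R0 values below are illustrative, not sourced estimates:

```python
def herd_immunity_threshold(r0):
    # Minimum vaccinated fraction p_c such that R0 * (1 - p_c) <= 1.
    # If R0 <= 1 the disease cannot sustain an outbreak anyway, so 0.
    return max(0.0, 1.0 - 1.0 / r0)

for name, r0 in [("flu-like", 1.5), ("measles-like", 15.0)]:
    pc = herd_immunity_threshold(r0)
    print(f"{name}: R0={r0}, minimum vaccination ~{pc:.0%}")
```

This is the sense in which an explanatory model predicts: not when the next outbreak happens, but what intervention level suffices to prevent one.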
Nonetheless, a theory that explains but predicts poorly can still be useful, though it is less valuable than one that also makes testable predictions.
But in general, it seems that data -> theory is explanation, and theory -> data is prediction. The strength of a prediction depends on the strength of the theory.