3 Matching Annotations
  1. Nov 2025
    1. Smith suggests that experimental data can help us better understand the causal mechanisms behind typological generalizations, something observational typological studies cannot do. We generally agree that some research setups are more adequate for investigating certain types of questions, and a division of labor, or triangulation, makes sense from this perspective. The difficulty emerges, again, with cases of disagreeing results between experimental and typological studies. Smith provides two very insightful examples of such cases. We will react to the first example, as it concerns a topic that we also explored in previous work, namely the relation between sociolinguistic factors and linguistic complexity (cf. Becker et al. 2023; Guzmán Naranjo et al. 2025). In both cases, we failed to find clear, convincing evidence for sociolinguistic correlates of linguistic complexity. In contrast, Smith (2024) reports on an artificial language learning experiment that supports the presence of mechanisms proposed in the typological literature to account for an association between sociolinguistic factors and linguistic complexity. In such a situation, the important question arises: how can we understand the discrepancy between the results? Smith mentions two hypotheses: (i) the factors identified in the experiments are outweighed by other factors in the wild, and (ii) natural language data cannot show the correlation with sufficient confidence. We agree, and we can think of a number of other potential explanations that can lead to the situation of finding an effect of, e.g., socio-linguistic factors on linguistic complexity in experimental studies but not in typological ones. 
We think that all these issues should be explored and subsequently discarded in order to understand diverging results:
- experimental studies:
  - the experimental design may not be suitable
  - the experimental study may not reflect natural language learning
  - the data analysis of the experimental study may have issues
- typological studies:
  - the study may not operationalize the actual socio-linguistic hypotheses well
  - the data collection and annotation may contain too many mistakes
  - the language sample may be too small to detect the (potentially weak) effects
  - the language sample may be wrong in just the right way, hiding the effects
  - the data analysis of the typological study may have issues

These issues all highlight the possibility that either the experimental or typological studies could lead to fundamentally incorrect results. This goes back to our main point: we can only increase our confidence about our findings with more transparency about the work process, with robustness tests and with replication. If at some point we reach high confidence about results from both experimental and typological studies, and these still diverge, we can then start to think about how and why they diverge. Currently, we do not believe that we can have high certainty about our typological results regarding sociolinguistic effects on linguistic complexity to begin with. Therefore, we should be cautious when trying to interpret differences between the typological and experimental results.

      B&GN appreciate Smith’s contribution and agree on the importance of combining typology with cognitive experiments. Nevertheless, while Smith discusses two types of mismatch between typological and experimental results, B&GN argue that there are many more possible explanations for such mismatches (they list methodological problems in both approaches). B&GN hold that we cannot yet blindly trust typological results, because these remain too uncertain.

    1. If this joint approach is widely adopted, there will be cases where we need to resolve mismatches between typological and experimental findings. I have encountered two such mismatches in my recent work: one case where I think the hypothesised mechanisms are plausible even if the typology is highly contested, and one where the natural language facts seem to be agreed upon but we cannot support the mechanism experimentally. The first case relates to the well-known claim that languages spoken in larger, more heterogeneous communities, with more non-native speakers, tend to be morphologically simpler (e.g. Trudgill 2011); there is some quantitative evidence in support of this claim (e.g. Lupyan and Dale 2010; Sinnemäki 2020), although as we might expect in light of B&GN, different measures of morphological and social complexity and different analytic techniques produce different results (Koplenig 2019; Kauhanen et al. 2023; Shcherbakova et al. 2023). A small series of experiments has nevertheless tested some of the proposed mechanisms. In Smith (2024) I used an artificial language learning paradigm to test whether L2-like morphological simplifications made during (imperfect) learning could result in cumulative simplification of complex morphology as a language is transmitted across generations, and found that they could. Other work suggests other mechanisms that could explain the putative correlation, including e.g. the difficulty of converging on shared conventions with many versus few interlocutors (Raviv et al. 2019). My current impression is that there are probably several plausible mechanisms, with at least some experimental support, by which population size and proportion of non-native speakers could influence language complexity.
If that link is not evident in the cross-linguistic data, it could be that the factors identified in the experiments are outweighed by other factors in the wild, or that the natural language data cannot show the correlation with confidence. In the second case we are testing mechanisms for unidirectionality in grammaticalisation (Kapron-King et al. under review, 2025). Concrete concepts (e.g. terms for body parts) tend to become grammatical markers (e.g. adpositions marking spatial relationships) but not vice versa. The intuitive and apparently quite widely-held assumption is that this unidirectionality is due to an inherent asymmetry in associations between these sets of concepts, such that e.g. body-part terms evoke spatial concepts but not the reverse. We have not been able to find evidence for this asymmetry across several semantic extension experiments; while our participants reliably associate body part terms and spatial relationships that are frequently involved in grammaticalisation pathways (as documented in Heine and Kuteva 2002, e.g. “head” and “above”), these associations are quite symmetrical. We have therefore (reluctantly!) become somewhat sceptical about association-based explanations of unidirectionality, and are exploring reanalysis-based accounts which do not rely on this asymmetry.

      If researchers combine typological and experimental approaches, they will find cases where the two types of evidence do not match:
      - The linguistic facts observed in natural languages are well established, but experiments fail to support the proposed cognitive mechanism (ex. typological data: ‘grammaticalisation’ goes from concrete concepts to abstract concepts, but not vice versa; hypothesis: it is cognitively easier to make the association in this direction; experimental data: false, people can make the association in both directions, and it is quite symmetrical; so why does it only happen in one direction in language?)
      - The cognitive mechanisms proposed by experimental data seem plausible, but the typological evidence is highly disputed among linguists (ex. experimental data: non-native speakers simplify language, and over time the standard variety also loses complexity; typological data: imprecise, variable depending on the definition of ‘complexity’, and many more factors can influence the evolution of a language)

    2. If one’s goal is primarily to document constraints on cross-linguistic variation then this is obviously deeply troubling. However, if the central interest is the cognitive and interactional mechanisms responsible for those constraints – what it is about the way languages are learned, used and transmitted that leads to convergent cultural evolution on recurring constellations of linguistic features (see e.g. Haspelmath 2019, 2021) – then this uncertainty may be less problematic than it first appears, since we should in any case be running controlled experiments to test hypotheses about those mechanisms. B&GN (Becker and Guzmán Naranjo 2025) refer to experimental approaches briefly in a footnote as “triangulation”, “the combination of different empirical approaches to study the same phenomenon in order to test how robust results are across methods and to, ideally, find converging evidence”. I think the value of experimental work lies not in providing some additional data from another source, but a fundamentally different kind of data which allows us to test cognitive and interactional mechanisms hypothesised to be responsible for potential universals. No matter how rigorously conducted, analyses of typological data are observational and cannot speak to those causal mechanisms. However, the observational data from typology is a rich source of potential hypotheses about mechanisms shaping linguistic systems, which can subsequently be tested in controlled experiments that can go beyond correlation and speak to causality.

      According to Smith, analyses of typological data can be a source of potential hypotheses about the mechanisms shaping linguistic systems, but they cannot speak to those causal mechanisms. Here lies the value of experimental work: testing the cognitive and interactional mechanisms that may be the cause of potential universals. For this reason, unlike B&GN, Smith thinks experimental data should not be used only to test the robustness of results about the same phenomenon.