When we conduct a scientific study, it is usually not possible to collect data from every person in the population in the exact situation we want to study. Instead, we typically have only a sample of subjects, whom we observe in one particular situation. For example, if we want to study adherence to red lights in traffic, we cannot check whether every person stops at every red light while driving a car, riding a bike, walking, skateboarding, or using any other means of transportation. We could, however, observe the behavior of 200 pedestrians at the traffic light in front of a university.
Generalizability refers to whether a study’s findings, obtained under its own restricted circumstances, can be extended to the population in general and to similar situations. Consider again the study of adherence to red lights that observes 200 pedestrians at the traffic light in front of a university. The sample is small and not representative, because the people in front of a university are mostly students, a very specific group. The situation we observe is also only one facet of participation in traffic: it ignores driving, cycling, skateboarding, and so on. For both reasons, we could not draw reliable conclusions about adherence to red lights in general.
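The gap between such a sample and the wider population can be made concrete with a small simulation. The sketch below is purely illustrative: the road-user groups, their shares, and their probabilities of stopping at a red light are invented numbers, not data from any study, and the helper names `population_rate` and `observed_rate` are hypothetical.

```python
import random

# Hypothetical road-user groups. "share" is the group's fraction of all
# road users; "p_stop" is the assumed probability of stopping at a red light.
# All values are made up for illustration.
POPULATION = {
    "student_pedestrian": {"share": 0.05, "p_stop": 0.60},
    "other_pedestrian":   {"share": 0.25, "p_stop": 0.80},
    "cyclist":            {"share": 0.20, "p_stop": 0.70},
    "driver":             {"share": 0.50, "p_stop": 0.97},
}

# Who we would actually meet in front of a university: pedestrians only,
# and mostly students (again, assumed shares).
UNIVERSITY_CROSSING = {
    "student_pedestrian": {"share": 0.90, "p_stop": 0.60},
    "other_pedestrian":   {"share": 0.10, "p_stop": 0.80},
}


def population_rate(groups):
    """Exact adherence rate implied by the group shares and probabilities."""
    return sum(g["share"] * g["p_stop"] for g in groups.values())


def observed_rate(groups, n, rng):
    """Observe n randomly drawn people and return the share who stopped."""
    names = list(groups)
    weights = [groups[name]["share"] for name in names]
    stopped = 0
    for _ in range(n):
        group = rng.choices(names, weights=weights)[0]
        if rng.random() < groups[group]["p_stop"]:
            stopped += 1
    return stopped / n


rng = random.Random(42)  # fixed seed so the run is repeatable
true_rate = population_rate(POPULATION)
sample_rate = observed_rate(UNIVERSITY_CROSSING, n=200, rng=rng)

print(f"adherence implied for all road users: {true_rate:.3f}")
print(f"adherence observed at the university: {sample_rate:.3f}")
```

Under these assumed numbers, the 200 observations describe the people at that one crossing reasonably well, but they systematically miss the rate for road users as a whole. Observing more pedestrians at the same crossing would not close that gap, because the difference comes from who is sampled and in which situation, not only from how many people are sampled.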