There might be a simple linear function that classifies the majority group correctly and there might be a (different) simple linear function that classifies the minority group correctly, but learning a (non-linear) combination of two linear classifiers is in general a computationally much harder problem.
This quote reminds me of what Jon mentioned about the distinct difference between customers who speak English and customers who speak another language. He stressed that the two groups were very different and hard to compare. A single linear model, such as a linear regression fit to everyone at once, might also be part of the problem: it produces one overarching view that mostly reflects the majority and does not match up well with each group looked at on its own.
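To make the quote's point concrete for myself, here is a minimal sketch in Python. Everything in it is made up for illustration: the data is synthetic (it is not Jon's customer data), the single feature `x` is arbitrary, and the names `predict`, `accuracy`, and `fit` are hypothetical helpers. I assume a majority group whose label is 1 when `x > 0` and a minority group whose label is 1 when `x < 0`, and I fit the simplest possible linear rule (a sign and a threshold) by brute-force search.

```python
# Toy illustration with synthetic data (an assumption, not real customer data).
# A 1-D linear classifier here is a rule: predict 1 if sign * (x - threshold) > 0.

def predict(x, sign, threshold):
    return 1 if sign * (x - threshold) > 0 else 0

def accuracy(points, sign, threshold):
    correct = sum(1 for x, y in points if predict(x, sign, threshold) == y)
    return correct / len(points)

def fit(points):
    # Brute-force search over a small grid of (sign, threshold) rules,
    # keeping the rule with the highest accuracy on the given points.
    candidates = [(s, t) for s in (1, -1) for t in range(-46, 47)]
    return max(candidates, key=lambda rule: accuracy(points, *rule))

# Majority group: 90 points, label is 1 when x > 0.
majority = [(i + 0.5, 1 if i + 0.5 > 0 else 0) for i in range(-45, 45)]
# Minority group: 10 points, label follows the opposite rule (1 when x < 0).
minority = [(10 * i + 5, 1 if 10 * i + 5 < 0 else 0) for i in range(-5, 5)]

# One linear rule fit to everyone at once.
pooled_rule = fit(majority + minority)
pooled_majority_acc = accuracy(majority, *pooled_rule)
pooled_minority_acc = accuracy(minority, *pooled_rule)

# One linear rule per group.
separate_majority_acc = accuracy(majority, *fit(majority))
separate_minority_acc = accuracy(minority, *fit(minority))

print("pooled rule (sign, threshold):", pooled_rule)
print("pooled rule, majority accuracy:", pooled_majority_acc)
print("pooled rule, minority accuracy:", pooled_minority_acc)
print("separate rules, majority accuracy:", separate_majority_acc)
print("separate rules, minority accuracy:", separate_minority_acc)
```

In this toy setup the pooled rule simply copies the majority's pattern, so it looks 90% accurate overall while getting the minority group wrong, whereas each group has a perfectly good simple rule of its own. Real data would be far messier than this, but it shows why one overarching model can hide how different the two groups are.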