- Aug 2016
A team at Facebook reviewed thousands of headlines using these criteria, validating each other’s work to identify a large set of clickbait headlines. From there, we built a system that looks at the set of clickbait headlines to determine what phrases are commonly used in clickbait headlines that are not used in other headlines. This is similar to how many email spam filters work.
Though details are scarce, the very idea that Facebook would tackle this problem with both humans and algorithms is reassuring. The common argument against human filtering is that it doesn’t scale. The common argument against algorithmic filtering is that it requires good signal (though some transhumanists keep saying that things are getting better). So it’s useful to know that Facebook used such a hybrid approach. Of course, even algo-obsessed Google has used human filtering, or at least human judgment to tweak its filtering algorithms. (Can’t remember who was in charge of this. Was a semi-frequent guest on This Week in Google… Update: Matt Cutts) But this very simple “we sat down and carefully identified stuff we think qualifies as clickbait before we fed the algorithm” is refreshingly clear.
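To make the "humans label first, then the algorithm looks for telltale phrases" idea a bit more concrete, here is a minimal sketch in Python. It is not Facebook's system (they haven't published one). It only assumes the general spam-filter-style technique the quote hints at: count two-word phrases in a hand-labeled clickbait set and in a set of other headlines, then give each phrase a smoothed log-odds weight. The function names and the tiny sample headlines are made up for illustration.

```python
import math
import re
from collections import Counter


def phrases(headline, n=2):
    """Return the lowercase word n-grams (phrases) in a headline."""
    words = re.findall(r"[a-z0-9']+", headline.lower())
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def train(clickbait, other, n=2):
    """Compute a smoothed log-odds weight for each phrase.

    Positive weight: the phrase is relatively more common in the clickbait set.
    Negative weight: it is relatively more common in the other headlines.
    """
    cb = Counter(p for h in clickbait for p in phrases(h, n))
    ot = Counter(p for h in other for p in phrases(h, n))
    vocab = set(cb) | set(ot)
    # Add-one smoothing so a phrase unseen in one set never yields log(0).
    cb_total = sum(cb.values()) + len(vocab)
    ot_total = sum(ot.values()) + len(vocab)
    return {
        p: math.log((cb[p] + 1) / cb_total) - math.log((ot[p] + 1) / ot_total)
        for p in vocab
    }


def score(headline, weights, n=2):
    """Sum the weights of known phrases; unknown phrases contribute 0."""
    return sum(weights.get(p, 0.0) for p in phrases(headline, n))


# Hypothetical, hand-labeled examples standing in for the human-reviewed set.
CLICKBAIT = [
    "You won't believe what happened next",
    "What happened next will shock you",
    "You won't believe these ten tricks",
]
OTHER = [
    "City council approves new budget",
    "Central bank holds interest rates steady",
    "Council approves plan for new bridge",
]

WEIGHTS = train(CLICKBAIT, OTHER)
```

With these toy sets, `score("You won't believe this", WEIGHTS)` comes out positive and `score("Council approves new budget", WEIGHTS)` comes out negative. A real filter would need far more labeled headlines, which is exactly where the human review step earns its keep.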