Extractive summarization may be regarded as a contextual bandit as follows. Each document is a context, and each ordered subset of a document’s sentences is a different action.
We can represent extractive summarization as a bandit problem by treating the document as the context and each ordered subset of its sentences as an action an agent could take.
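To make the formulation concrete, here is a minimal sketch of one bandit round. It is only an illustration under stated assumptions: the fixed summary length, the uniform random policy, and the word-overlap reward are hypothetical stand-ins, and none of the function names come from the source.

```python
from itertools import permutations
import random


def enumerate_actions(num_sentences, summary_length):
    """Return every ordered subset of sentence indices of the given size.

    Assumption: summaries have a fixed length; the source only says that
    each ordered subset of sentences is an action.
    """
    return list(permutations(range(num_sentences), summary_length))


def overlap_reward(document, action, reference):
    """Toy reward (hypothetical): fraction of distinct reference words
    that appear in the selected sentences."""
    summary_words = {w for i in action for w in document[i].lower().split()}
    reference_words = set(reference.lower().split())
    if not reference_words:
        return 0.0
    return len(summary_words & reference_words) / len(reference_words)


def play_round(document, reference, summary_length, rng):
    """One bandit round: observe the context (the document), pick an action
    (an ordered subset of sentences), and receive a scalar reward."""
    actions = enumerate_actions(len(document), summary_length)
    action = rng.choice(actions)  # placeholder policy: uniform at random
    return action, overlap_reward(document, action, reference)


if __name__ == "__main__":
    doc = ["the cat sat", "dogs bark loudly", "the mat was red"]
    ref = "the cat sat on the mat"
    action, reward = play_round(doc, ref, summary_length=2, rng=random.Random(0))
    print(action, reward)
```

Because actions are ordered, `(0, 1)` and `(1, 0)` count as distinct actions even though they select the same sentences. The action space also grows quickly with document length, so a practical system would sample from a learned policy rather than enumerate every action as this sketch does.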