Teacher forcing is a training technique that is applicable to RNNs that have connections from their output to their hidden states at the next time step. (Left) At train time, we feed the correct output y(t), drawn from the training set, as input to h(t+1). (Right) When the model is deployed, the true output is generally not known. In this case, we approximate the correct output y(t) with the model's output o(t), and feed the output back into the model.
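Teacher forcing is also a strategy to parallelize training. In a model whose only recurrence runs from output to hidden state, each step is conditioned on the ground-truth output of the previous step rather than on the model's own prediction, so the time steps are decoupled and can be computed independently.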
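
The difference between the two modes can be illustrated with a minimal sketch. It assumes a toy scalar RNN whose only recurrence is from the previous output to the current hidden state; the parameter values and the names `step`, `run_teacher_forced`, and `run_free_running` are hypothetical and chosen only for illustration. Feeding y(t-1) into step t is the same as feeding y(t) into h(t+1).

```python
import math

# Hypothetical scalar parameters for a toy output-recurrent RNN (assumed values).
W_X, W_O, B = 0.5, 0.8, 0.0   # input weight, output-to-hidden weight, hidden bias
V, C = 1.5, 0.1               # hidden-to-output weight, output bias


def step(x_t, prev_out):
    """One time step: the hidden state sees the input and the previous output."""
    h_t = math.tanh(W_X * x_t + W_O * prev_out + B)
    return V * h_t + C


def run_teacher_forced(xs, ys, y_init=0.0):
    """Train-time pass: step t receives the ground-truth output y(t-1)."""
    prev_targets = [y_init] + list(ys[:-1])
    # Every step depends only on known data, so the steps are independent
    # of one another and could be evaluated in parallel.
    return [step(x, y_prev) for x, y_prev in zip(xs, prev_targets)]


def run_free_running(xs, o_init=0.0):
    """Deployment pass: step t receives the model's own output o(t-1)."""
    outputs, prev = [], o_init
    for x in xs:
        prev = step(x, prev)  # inherently sequential: needs the previous output
        outputs.append(prev)
    return outputs


xs = [1.0, -0.5, 0.25]   # example inputs (made up)
ys = [0.3, 0.1, -0.2]    # example ground-truth outputs (made up)
forced = run_teacher_forced(xs, ys)
free = run_free_running(xs)
```

In this sketch the two passes agree at the first step, since both start from the same initial value, and then diverge wherever the model's own output differs from the ground truth. The list comprehension in `run_teacher_forced` has no dependency between iterations, which is what makes the parallelization possible; the loop in `run_free_running` cannot be reordered.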