Natural language processing tasks, such as caption generation and machine translation, involve generating sequences of words. Models developed for these problems often operate by generating probability distributions across the vocabulary of output words and it is up to decoding algorithms to sample the probability distributions to generate the most likely sequences of words. In this […]
