# Recurrent Continuous Translation Models

@inproceedings{Kalchbrenner2013RecurrentCT,
  title     = {Recurrent Continuous Translation Models},
  author    = {Nal Kalchbrenner and Phil Blunsom},
  booktitle = {EMNLP},
  year      = {2013}
}

We introduce a class of probabilistic continuous translation models called Recurrent Continuous Translation Models that are purely based on continuous representations for words, phrases and sentences and do not rely on alignments or phrasal translation units. [...] Finally, we show that they match a state-of-the-art system when rescoring n-best lists of translations.

#### 1,149 Citations

Translation Modeling with Bidirectional Recurrent Neural Networks

- Computer Science
- EMNLP
- 2014

This work presents phrase-based translation models that are more consistent with phrase-based decoding and introduces bidirectional recurrent neural models to the problem of machine translation, allowing the full source sentence to be used in the models.

Character-based Neural Machine Translation

- Computer Science
- ArXiv
- 2015

A neural machine translation model that views the input and output sentences as sequences of characters rather than words, which alleviates many of the challenges associated with preprocessing and tokenization of the source and target languages.

Translating TED speeches by recurrent neural network based translation model

- Computer Science
- 2014 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
- 2014

This paper uses word-to-word alignment to compose translation units of bilingual tuples and presents a recurrent neural network-based translation model (RNNTM) to capture long-span context when estimating translation probabilities of bilingual tuples.

A Convolutional Encoder Model for Neural Machine Translation

- Computer Science
- ACL
- 2017

A faster and simpler architecture based on a succession of convolutional layers is presented, which allows the source sentence to be encoded simultaneously, in contrast to recurrent networks whose computation is constrained by temporal dependencies.

Improving the Quality of Neural Machine Translation

- 2018

Over the last few years, neural machine translation has become the major approach to the problem of automatic translation. Nonetheless, even though current models are able to output fluent…

Local Translation Prediction with Global Sentence Representation

- Computer Science
- IJCAI
- 2015

A novel bilingually-constrained chunk-based convolutional neural network to learn sentence semantic representations is proposed, and a feed-forward neural network is designed to better predict translations using both local and global information.

Incorporating Discrete Translation Lexicons into Neural Machine Translation

- Computer Science
- EMNLP
- 2016

A method to calculate the lexicon probability of the next word in the translation candidate by using the attention vector of the NMT model to select which source word lexical probabilities the model should focus on is described.

Context-dependent word representation for neural machine translation

- Computer Science
- Comput. Speech Lang.
- 2017

This paper proposes to contextualize the word embedding vectors using a nonlinear bag-of-words representation of the source sentence and proposes to represent special tokens with typed symbols to facilitate translating those words that are not well-suited to be translated via continuous vectors.

Exploiting Cross-Sentence Context for Neural Machine Translation

- Computer Science
- EMNLP
- 2017

This paper proposes a cross-sentence context-aware approach and investigates the influence of historical contextual information on the performance of neural machine translation (NMT).

Phrase-Based & Neural Unsupervised Machine Translation

- Computer Science
- EMNLP
- 2018

This work investigates how to learn to translate with access only to large monolingual corpora in each language, and proposes two model variants, a neural and a phrase-based model, which are significantly better than methods from the literature while being simpler and having fewer hyper-parameters.

#### References

Showing 1–10 of 18 references

Continuous Space Translation Models for Phrase-Based Statistical Machine Translation

- Computer Science
- COLING
- 2012

Experimental evidence is provided that the approach seems to be able to infer meaningful translation probabilities for phrase pairs not seen in the training data, or even predict a list of the most likely translations given a source phrase.

Continuous Space Translation Models with Neural Networks

- Computer Science
- NAACL
- 2012

Several continuous space translation models are explored, where translation probabilities are estimated using a continuous representation of translation units in lieu of standard discrete representations, jointly computed using a multi-layer neural network with a SOUL architecture.

Continuous Space Language Models for Statistical Machine Translation

- Computer Science
- ACL
- 2006

This work proposes to use a new statistical language model that is based on a continuous representation of the words in the vocabulary, which achieves consistent improvements in the BLEU score on the development and test data.

cdec: A Decoder, Alignment, and Learning Framework for Finite-State and Context-Free Translation Models

- Computer Science
- ACL
- 2010

We present cdec, an open source framework for decoding, aligning with, and training a number of statistical machine translation models, including word-based models, phrase-based models, and models…

Recurrent Convolutional Neural Networks for Discourse Compositionality

- Computer Science
- CVSM@ACL
- 2013

The discourse model coupled to the sentence model obtains state-of-the-art performance on a dialogue act classification experiment and is able to capture both the sequentiality of sentences and the interaction between different speakers.

Context dependent recurrent neural network language model

- Computer Science
- 2012 IEEE Spoken Language Technology Workshop (SLT)
- 2012

This paper improves the performance of recurrent neural network language models by providing, alongside each word, a contextual real-valued input vector that conveys information about the sentence being modeled; the vector is obtained by performing Latent Dirichlet Allocation on a block of preceding text.

The Mathematics of Statistical Machine Translation: Parameter Estimation

- Computer Science
- Comput. Linguistics
- 1993

It is reasonable to argue that word-by-word alignments are inherent in any sufficiently large bilingual corpus, given a set of pairs of sentences that are translations of one another.

A Neural Probabilistic Language Model

- Computer Science
- J. Mach. Learn. Res.
- 2003

This work proposes to fight the curse of dimensionality by learning a distributed representation for words which allows each training sentence to inform the model about an exponential number of semantically neighboring sentences.

Semantic Compositionality through Recursive Matrix-Vector Spaces

- Computer Science
- EMNLP
- 2012

A recursive neural network model is introduced that learns compositional vector representations for phrases and sentences of arbitrary syntactic type and length, and can learn the meaning of operators in propositional logic and natural language.

The Role of Syntax in Vector Space Models of Compositional Semantics

- Computer Science
- ACL
- 2013

This model leverages the CCG combinatory operators to guide a non-linear transformation of meaning within a sentence. It is used to learn high-dimensional embeddings for sentences and evaluate them on a range of tasks, demonstrating that the incorporation of syntax allows a concise model to learn representations that are both effective and general.