While the relations word2vec captured had an intuitive and almost magical quality to them, later studies showed that there is nothing inherently special about word2vec: word embeddings can also be learned via matrix factorization (Pennington et al., 2014; Levy & Goldberg, 2014), and with proper tuning, classic matrix factorization approaches like SVD and LSA achieve similar results (Levy et al., 2015).
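A minimal sketch of the matrix-factorization route: build a PPMI-weighted co-occurrence matrix and take a truncated SVD, keeping the top singular vectors as word vectors. The vocabulary and counts below are toy, hypothetical numbers, not data from the cited papers.

```python
import numpy as np

# Toy co-occurrence counts for a 4-word vocabulary (hypothetical numbers).
vocab = ["ice", "steam", "solid", "gas"]
C = np.array([
    [0, 1, 4, 0],
    [1, 0, 0, 4],
    [4, 0, 0, 1],
    [0, 4, 1, 0],
], dtype=float)

# Positive PMI: max(0, log P(w, c) / (P(w) P(c))).
total = C.sum()
Pw = C.sum(axis=1, keepdims=True) / total
Pc = C.sum(axis=0, keepdims=True) / total
with np.errstate(divide="ignore"):
    pmi = np.log((C / total) / (Pw * Pc))  # log(0) -> -inf for zero counts
ppmi = np.maximum(pmi, 0.0)                # clip negatives (and -inf) to zero

# Truncated SVD: keep the top-d left singular vectors, scaled by the
# singular values, as d-dimensional word embeddings.
U, S, Vt = np.linalg.svd(ppmi)
d = 2
embeddings = U[:, :d] * S[:d]  # one row per word in `vocab`
```

With proper count weighting and hyperparameter tuning this simple pipeline is what Levy et al. (2015) found competitive with word2vec.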
Semantic sentence embeddings for paraphrasing and text summarization
On labeled examples, standard supervised learning is used;
On unlabeled examples, CVT trains auxiliary prediction modules that see restricted views of the input (e.g., only part of a sentence) to match the predictions of the full model that sees the whole input;
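The unlabeled half of CVT can be sketched as a consistency loss: the full model's softmax output is treated as a soft target, and each auxiliary module (restricted view) is trained to match it via soft cross-entropy. The logits below are hypothetical placeholders for real model outputs.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def soft_xent(p, logits):
    """Cross-entropy between fixed soft targets p and predicted logits."""
    logq = logits - np.log(np.exp(logits).sum(axis=-1, keepdims=True))
    return -(p * logq).sum(axis=-1).mean()

# Hypothetical logits for one unlabeled token over 3 classes.
full_logits = np.array([[2.0, 0.5, -1.0]])   # full view, acts as the teacher
aux_logits = [np.array([[1.2, 0.8, -0.5]]),  # e.g. forward-only context view
              np.array([[0.9, 1.1, -0.2]])]  # e.g. backward-only context view

p_teacher = softmax(full_logits)  # soft targets; no gradient flows here

# CVT consistency loss: each restricted view must match the full view.
cvt_loss = sum(soft_xent(p_teacher, l) for l in aux_logits)
```

In training, this loss is added to the supervised loss computed on labeled batches.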
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
The BERT Transformer uses bidirectional self-attention, while the GPT Transformer uses constrained self-attention, where every token can only attend to the context to its left;
All in all, it looks overly complicated for down-to-earth applications;
The masked pre-training schemes are cool, but also overly complicated;
Zero-Shot Style Transfer in Text Using Recurrent Neural Networks
Create paraphrases that are written in the style of another existing text;
Use thirty-two stylistically distinct versions of the Bible (Old and New Testaments);
In zero-shot translation, the system must translate from one language to another even though it has never seen a translation between the particular language pair;
A multi-layer recurrent neural network encoder and a multi-layer recurrent network with attention for decoding;
Essentially a seq2seq model for paraphrasing. No details about actual forward / backward pass mechanics;
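The zero-shot conditioning trick itself is tiny: as in Google's multilingual NMT, a special tag naming the desired target style is prepended to the source sentence, and one shared seq2seq model is trained across all styles. The tag format and style ids below are hypothetical illustrations, not taken from the paper.

```python
# Hypothetical style-tag scheme, analogous to the "<2xx>" target-language
# tokens used in Google's zero-shot multilingual NMT.
def prepare_example(source_tokens, target_style):
    """Prepend a style tag, e.g. '<2kjv>' for a King James-style target."""
    return [f"<2{target_style}>"] + source_tokens

src = "In the beginning God created the heaven and the earth".split()
model_input = prepare_example(src, "ylt")  # hypothetical style id

# At inference, swapping the tag requests a (source, style) pairing the
# model never saw during training - that is the zero-shot setting.
```

Everything else is a standard attentional encoder-decoder, which matches the note above that no forward/backward-pass details are given.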
New super cool competitions:
Astronomical dataset competition - the dataset is very cool, but as a competition it is of little interest;
ISS RFID tracking challenge - very challenging and interesting;
Human protein multi-class - huge full-rez original dataset;
Google + DL + cancer - the reality of using DL in assisting diagnostics: “Algorithm-assisted pathologists demonstrated higher accuracy than either the algorithm or the pathologist alone”;