2018 DS/ML digest 28

Posted by snakers41 on October 15, 2018

  • Google open-sources BERT:

    • Google releases pre-trained multi-language BERT for 102 languages;
    • Line-by-line PyTorch re-implementation + checkpoint load support. Consider using this to jump-start your seq2seq project (together with the Annotated Transformer);
  • Notes on writing NLP code - too many AllenNLP ads, but some of the ideas are really good, e.g.:

    • Mentally block all the ads;
    • Code version control and data version control;
    • Code is for humans;
    • Minimal tests for research;
  • Seq2seq visualization;

  • Knowledge transfer from unsupervised MT to NMT;

News / market / articles

  • Facebook builds a library of optimized mobile neural network primitives compatible with PyTorch 1.0:

    • A slight modification — computing dot products of several rows of A and several columns of B simultaneously — dramatically improves performance;
    • Low-precision integer representation offers several benefits over single-precision and even half-precision floating point: a 2x-4x smaller memory footprint, which helps keep a neural network model inside small caches of mobile processors; improved performance on memory bandwidth-bound operations; increased energy efficiency; and, on many types of hardware, higher computational throughput;
    • Uses the Android NN API;
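The low-precision benefit can be sketched in a few lines (a toy illustration, not the library's actual kernels; the `quantize` helper and the scale value are made up): fp32 values are stored as int8 with a shared scale, the dot product accumulates in int32 to avoid overflow, and the result is rescaled back.

```python
import numpy as np

# Hypothetical sketch: symmetric int8 quantization with a single scale.
def quantize(x, scale):
    # round to nearest, clamp to the int8 range
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

a = np.array([0.5, -1.25, 2.0, 0.75], dtype=np.float32)
b = np.array([1.0, 0.5, -0.25, 2.0], dtype=np.float32)
scale = 0.05

qa, qb = quantize(a, scale), quantize(b, scale)
# accumulate in int32, then rescale back to fp32
approx = np.dot(qa.astype(np.int32), qb.astype(np.int32)) * scale * scale
exact = float(np.dot(a, b))

print(qa.nbytes, a.nbytes)  # int8 storage is 4x smaller than fp32
print(abs(approx - exact))  # quantization error is small
```

The 4x smaller rows and columns are what make the blocking trick above pay off: more of A and B fit in registers and L1 cache at once.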
  • Detecting whales in the ocean with spectrograms and CNNs;

  • Evolution of Airbnb search ranking;

  • A GAN-generated painting sold for a lot of money;

  • Google combines memory + curiosity + RL;

  • Brief history of NMT before RNNs from Yandex;

  • Google explores automated stacking … the examples they show are on … CIFAR;

  • A novel TL-GAN architecture explained:

    • Essentially this is a Progressive GAN from Nvidia + several classifiers on categorical features;
    • Video;
    • Repo;
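The core TL-GAN trick can be sketched like this (a toy sketch of the idea, not the authors' code; the synthetic data, dimensions, and the least-squares probe are my assumptions): fit a linear classifier on (latent vector, attribute label) pairs, and its weight vector gives a direction in latent space along which that attribute varies.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 16
true_axis = np.zeros(dim)
true_axis[3] = 1.0  # toy ground-truth attribute direction

z = rng.normal(size=(500, dim))             # stand-in for GAN latent codes
labels = (z @ true_axis > 0).astype(float)  # binary attribute, e.g. "smiling"

# linear probe via least squares: its weights approximate the attribute axis
w, *_ = np.linalg.lstsq(z, labels - 0.5, rcond=None)
direction = w / np.linalg.norm(w)

# moving a latent code along `direction` should change the attribute
print(abs(direction[3]))  # close to 1: the true axis is recovered
```

In the real setup the latent codes come from a pre-trained Progressive GAN and the labels from face-attribute classifiers, but the geometry is the same.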


  • A guide on Python testing - the overall logic: use the built-in unittest module and just write a test class;
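That advice boils down to something like this (a minimal sketch; `tokenize` is a made-up function under test):

```python
import unittest

def tokenize(text):
    # toy function under test: lowercase whitespace tokenizer
    return text.lower().split()

class TestTokenize(unittest.TestCase):
    def test_lowercases(self):
        self.assertEqual(tokenize("Hello World"), ["hello", "world"])

    def test_empty_string(self):
        self.assertEqual(tokenize(""), [])

if __name__ == "__main__":
    # exit=False lets the script continue after the test run
    unittest.main(argv=["prog"], exit=False)
```

No extra dependencies, and `python -m unittest` discovers such classes automatically.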


  • CurriculumNet: Weakly Supervised Learning from Large-Scale Web Images;
  • New regularization for CNNs?

Just for lulz