2019 DS/ML digest 03

Posted by snakers41 on February 8, 2019

Week highlights


  • An article explaining current SOTA NLP models;
  • Let’s revisit this amazing piece on attention;


Post / blogs / articles

  • Looks like a decent new product from Google: Live Transcribe for deaf people. Good job!;
  • How to install private repos from pip;
  • AlphaStar beats humans in SC2. What is interesting:
    • Average bot’s APM is 280;
    • 16 TPUs * 14 days for each agent (200 years of play);
    • Camera interface bot almost as good as raw SC2 interface bot;
  • An Interactive Introduction to Fourier Transforms link;
  • SpaceNet 4 competition results - top competitors had 10-30 models trained on all types of images;
  • Do sliding window approaches … work for super large histology images?;
  • Scholarship awards by Yandex;
  • Car plate recognition baseline;
  • Burnout poll;
  • Bayesian statistics explained in simple terms;
  • Mixture density networks + some nice remarks about CE / logloss / ML estimation;
  • Benedict Evans digest;
  • Yes, too late here, but you CAN use CNNs for mixed / tabular data;
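The private-repos-from-pip item above is just a link, but the usual pattern is worth sketching. This is my own illustration, not necessarily what the linked post describes; the org/repo names and the `GITHUB_TOKEN` variable are hypothetical placeholders:

```shell
# Install straight from a private git repo over SSH
# (assumes your SSH key already has access; repo name is made up)
pip install "git+ssh://git@github.com/your-org/private-repo.git@v1.2.0"

# Or over HTTPS, pulling a personal access token from the environment
pip install "git+https://${GITHUB_TOKEN}@github.com/your-org/private-repo.git"

# The same URL forms also work as lines in requirements.txt:
# git+ssh://git@github.com/your-org/private-repo.git@v1.2.0
```

Pinning to a tag or commit (the `@v1.2.0` suffix) keeps installs reproducible.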


  • Google’s Audioset - did not know about it. Looks useful;


Deep Learning on Small Datasets without Pre-Training using Cosine Loss

  • Link;
  • The categorical cross-entropy loss after softmax activation is the method of choice for classification;
  • Training a CNN classifier from scratch on small datasets does not work well;
  • In contrast, they show that the cosine loss function provides significantly better performance than cross-entropy on datasets with only a handful of samples per class;
  • The problem is that they appear to use ImageNet for some kind of transfer learning;
  • Key idea - use information “between” the labels, because some classes clearly correlate;
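The core loss is simple enough to sketch. This is my own numpy illustration of the idea (not the authors' code): L2-normalize the network output and take one minus its cosine similarity with the one-hot target:

```python
import numpy as np

def cosine_loss(outputs, labels, num_classes):
    """1 - cosine similarity between L2-normalized outputs and one-hot targets."""
    # Project raw network outputs onto the unit sphere
    preds = outputs / np.linalg.norm(outputs, axis=1, keepdims=True)
    onehot = np.eye(num_classes)[labels]
    # One-hot vectors are already unit-length, so the dot product is the cosine
    return 1.0 - np.sum(preds * onehot, axis=1)

# A prediction aligned with the target gives loss ~0; an orthogonal one gives 1
print(cosine_loss(np.array([[5.0, 0.0, 0.0]]), np.array([0]), 3))  # → [0.]
print(cosine_loss(np.array([[0.0, 5.0, 0.0]]), np.array([0]), 3))  # → [1.]
```

Unlike cross-entropy, the loss is bounded in [0, 2] and indifferent to the magnitude of the logits, which is the intuition behind its behavior on tiny datasets.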

Must … not … post BERT papers