2019 DS/ML digest 04

Posted by snakers41 on February 18, 2019

NLP / presentations

  • Modern state of pre-trained NLP vectors - BERT pre-training. Problems:
    • Does not work for morphologically rich languages;
    • Trained on 4x4 or 8x8 TPU slice for 4 days;
  • OpenAI’s controversial take on LM pre-training:
    • It seems to be capable of generating reasonable samples about 50% of the time;
    • Large transformer-based language model with 1.5 billion parameters, trained on a dataset of 8 million web pages;
    • On highly technical or esoteric types of content, the model can perform poorly;
  • Using [transformer](thomwolf.io/data/Amsterdam_Uni_2019_01_18 - final.pdf) on conversational challenges:
    • A set of pre-trained transformer based models for LM and classification tasks;
  • OpenAI threat analysis - all of it boils down to investing US$50-100k into training such a model;
  • Drawbacks of BLEU metric:
    • BLEU was always intended to be a corpus-level measure => inflation on sentence level;
    • It doesn’t consider meaning;
    • It doesn’t directly consider sentence structure;
    • It doesn’t handle morphologically rich languages well;
    • It doesn’t map well to human judgments;
    • ROUGE - a related n-gram overlap metric that focuses on recall rather than precision;
  • AAAI Conference highlights;
  • Evolved transformer;
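
To make the BLEU / ROUGE points above a bit more concrete, here is a minimal sketch of n-gram precision (BLEU-style) versus n-gram recall (ROUGE-N-style). It is a simplification under stated assumptions: real BLEU combines n = 1..4 with a geometric mean and a brevity penalty, and the function names and toy sentences here are made up for illustration, not taken from any library.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams (as tuples) in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap(candidate, reference, n):
    """Clipped overlap: a candidate n-gram is credited at most as many
    times as it occurs in the reference."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    return sum(min(count, ref[gram]) for gram, count in cand.items())


def ngram_precision(candidate, reference, n=1):
    """BLEU-style modified precision: overlap / candidate n-gram count."""
    total = sum(ngrams(candidate, n).values())
    return overlap(candidate, reference, n) / total if total else 0.0


def ngram_recall(candidate, reference, n=1):
    """ROUGE-N-style recall: overlap / reference n-gram count."""
    total = sum(ngrams(reference, n).values())
    return overlap(candidate, reference, n) / total if total else 0.0


# Toy data (hypothetical, for illustration only)
reference = "the cat sat on the mat".split()
short_candidate = "the cat".split()
shuffled_candidate = "mat the on sat cat the".split()

# A very short candidate gets perfect precision but poor recall
print(ngram_precision(short_candidate, reference))       # 1.0
print(ngram_recall(short_candidate, reference))          # 2 of 6 reference unigrams

# A shuffled candidate keeps perfect unigram precision;
# only higher-order n-grams notice that the structure is gone
print(ngram_precision(shuffled_candidate, reference, 1)) # 1.0
print(ngram_precision(shuffled_candidate, reference, 2)) # 0.0
```

The short candidate shows why precision alone needs something like a brevity penalty, and why recall-oriented metrics exist; the shuffled candidate shows that word order is only captured indirectly through higher-order n-grams. Neither metric looks at meaning.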

Articles / blog posts

  • Troubleshooting CNNs, guide itself - mostly obvious pieces of advice;
  • Preventing fraud at Uber;
    • Altitude map;
    • Action sequences are different for fraudsters and non-fraudsters + LSTM usage;
    • Speed abnormalities;
  • The limitations of Deep Learning (CV) and how to fix them;
  • Google adds street view localization features to its products based on camera images;
  • Open-domain question answering with DeepPavlov - this is very niche + TF based;
  • Cameras that understand;
  • Ben Evans;
  • A New Golden Age for Computer Architecture;
  • NLP news;
  • Why CAPTCHAs are getting more difficult;
  • Where ML is headed - TLDR - RL;
  • Cut the AllenNLP crap and you have some arguments for proper code;
  • We are living in the post-truth world;
  • How Yandex uses Semseg for satellite maps;
  • LDA explained;

Cool things