2020 DS/ML digest 06

Posted by snakers41 on April 30, 2020

Ben Evans on COVID-19 - https://www.ben-evans.com/benedictevans/2020/4/13/covid-and-forced-experiments
Srsly read Ben Evans - https://mailchi.mp/32f65c509253/benedicts-newsletter-no-451093?e=b7fff6bc1c
A GitHub CI workflow for pre-commit hooks - https://github.com/ternaus/iglovikov_helper_functions/blob/master/.github/workflows/ci.yml

Featured Stuff

Training with quantization noise for extreme model compression

Audio noise reduction on Nvidia RTX


Looks like razdel got a decent facelift - https://github.com/natasha/razdel#sentencies
Nice thread on Twitter about training transformers on small data - https://twitter.com/Tim_Dettmers/status/1247998807494684672
New library for language modelling written from scratch - https://habr.com/ru/post/499064/
A Scalable Approach to Reducing Gender Bias in Google Translate - https://ai.googleblog.com/2020/04/a-scalable-approach-to-reducing-gender.html - https://1.bp.blogspot.com/-bI20JjyYU1E/XqCMCVjkEDI/AAAAAAAAF1E/YkfKSMrsktol3Wy5LzC4ZwNJrMYNqC0ZgCLcBGAsYHQ/s640/pipeline.png
Text normalization paper by Facebook - https://research.fb.com/publications/neural-models-of-text-normalization-for-speech-applications/
Russian News Scraping Project - https://github.com/ods-ai-ml4sg/proj_news_viz/wiki/История-проекта-и-как-сейчас-всё-устроено


Just love this guy’s posts - https://blog.piekniewski.info/2020/04/13/deflaition/
Some nice algorithms in GPU - https://github.com/rapidsai/cuml#supported-algorithms
Real application of ML in agriculture - https://habr.com/ru/company/cognitivepilot/blog/497098/, https://habr.com/ru/company/cognitivepilot/blog/496058/
Tram autopilot (RU) - https://habr.com/ru/company/cognitivepilot/blog/498660/
Train autopilot (RU) - https://habr.com/ru/company/cognitivepilot/blog/499440/
Library to study how autograd works - https://github.com/karpathy/micrograd

3D sensors on Google Pixel phones - https://ai.googleblog.com/2020/04/udepth-real-time-3d-depth-sensing-on.html

  • Key insights
    • Given a pair of regions that are similar to each other, most corresponding subsets of those regions are also similar
    • Lightweight (75k parameter) convolutional architecture, using IR brightness and neighbor information to adjust incorrect matches
    • System is self-calibrating
    • End-to-end deep learning architecture that enhances the raw uDepth data, inferring a complete, dense 3D depth map


Gradient Centralization: A New Optimization Technique for Deep Neural Networks - http://arxiv.org/abs/2004.01461
Dynamic Relu - http://arxiv.org/abs/2003.10027

Designing Network Design Spaces - http://arxiv.org/abs/2003.13678

  • Very similar to how I do it, albeit with 1000x more compute
  • Essentially this is differential evolution of network design in a low-compute, low-epoch training regime with only a single block type
  • Interesting design principles: (i) shared bottleneck ratio, (ii) shared group width, (iii) good networks have increasing widths
  • The depth of best models is stable across regimes, with an optimal depth of ∼20 blocks (60 layers)
  • The best models use a bottleneck ratio b of 1.0 which effectively removes the bottleneck
  • The width multiplier wm of good models is ∼2.5, similar but not identical to the popular recipe of doubling widths across stages
  • Inverted bottleneck degrades the quality
  • SE modules add a good gain
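
To make the width-related notes above concrete, here is a minimal sketch of per-stage widths grown by a multiplier of about 2.5 and rounded to a shared group width. The function name, the starting width of 64 and the group width of 8 are hypothetical values picked for illustration; this is not the paper's actual parameterization, just the listed design principles spelled out.

```python
def stage_widths(initial_width, width_multiplier, num_stages, group_width):
    """Return one channel width per stage (illustrative, not the paper's method)."""
    widths = []
    for stage in range(num_stages):
        # Widths increase geometrically across stages.
        raw = initial_width * width_multiplier ** stage
        # Round to a multiple of the shared group width so that
        # grouped convolutions divide the channels evenly.
        rounded = max(group_width, int(round(raw / group_width)) * group_width)
        widths.append(rounded)
    return widths


def bottleneck_width(width, bottleneck_ratio=1.0):
    # With a ratio of 1.0 the inner width equals the block width,
    # i.e. there is effectively no bottleneck.
    return int(width / bottleneck_ratio)


# Hypothetical settings: 4 stages, wm = 2.5, shared group width of 8.
widths = stage_widths(64, 2.5, 4, 8)
inner = [bottleneck_width(w) for w in widths]
```

Compared with the usual doubling recipe, the same four stages starting at 64 would end at 512 channels rather than 1000.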

A blog of an open-source search engine - https://0x65.dev/ ?

Some ez tweaks to the transformer architecture for NER - http://arxiv.org/abs/1911.04474

  • Transformer-based character encoder
  • Un-scaled Dot-Product Attention
  • Sinusoidal position embeddings
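
A rough pure-Python sketch of two of the tweaks listed above: dot-product attention without the usual 1/sqrt(d) scaling, and sinusoidal position embeddings. The embeddings follow the standard formulation from the original Transformer, which is an assumption here; the paper's exact variant may differ. All function names are made up for illustration.

```python
import math


def sinusoidal_positions(seq_len, dim):
    """Standard sin/cos position table; dim is assumed to be even."""
    assert dim % 2 == 0
    table = []
    for pos in range(seq_len):
        row = []
        for i in range(0, dim, 2):
            angle = pos / (10000 ** (i / dim))
            row.extend([math.sin(angle), math.cos(angle)])
        table.append(row)
    return table


def softmax(scores):
    # Subtract the max for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def unscaled_attention(queries, keys, values):
    """Dot-product attention with NO division by sqrt(d_k)."""
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in keys]
        weights = softmax(scores)
        out_dim = len(values[0])
        outputs.append(
            [sum(w * v[j] for w, v in zip(weights, values)) for j in range(out_dim)]
        )
    return outputs


# Toy usage: attend over the position table itself.
positions = sinusoidal_positions(4, 6)
attended = unscaled_attention(positions, positions, positions)
```

Dropping the scaling makes the softmax sharper for the same inputs, so each position concentrates its attention on fewer tokens.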

Looks like a very cool set of CUDA based ML libraries - https://github.com/rapidsai

Advancing Self-Supervised and Semi-Supervised Learning with SimCLR

Interesting, new “internal” NN visualizations:



  • 40 typologically diverse languages (spanning 12 language families)
  • 9 tasks