Ben Evans on COVID-19 and forced experiments - https://www.ben-evans.com/benedictevans/2020/4/13/covid-and-forced-experiments
Srsly, read Ben Evans' newsletter - https://mailchi.mp/32f65c509253/benedicts-newsletter-no-451093?e=b7fff6bc1c
A GitHub CI workflow for pre-commit hooks - https://github.com/ternaus/iglovikov_helper_functions/blob/master/.github/workflows/ci.yml
Featured Stuff
Training with quantization noise for extreme model compression
- https://ai.facebook.com/blog/training-with-quantization-noise-for-extreme-model-compression/
- https://arxiv.org/abs/2004.07320
- https://github.com/pytorch/fairseq/tree/master/examples/quant_noise
- The authors have not replied yet, but their 14x result looks a bit bogus - it does not provide any speed-up in production (a rough sketch of the quantization-noise idea is below)
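A rough, illustrative sketch of the Quant-Noise idea - this is not the fairseq implementation; the class and helper names are mine, and simple int8-style rounding stands in for the paper's product quantization. The point is that during training only a random subset of weights sees quantized values, so most gradients still flow through full-precision weights.

```python
import torch
import torch.nn as nn


class QuantNoiseLinear(nn.Linear):
    """Linear layer that quantizes a random fraction p of its weights per forward pass."""

    def __init__(self, in_features, out_features, p=0.1, bias=True):
        super().__init__(in_features, out_features, bias=bias)
        self.p = p  # fraction of weights hit by quantization noise during training

    def _fake_int8(self, w):
        # toy stand-in for a real quantizer (the paper uses int quantization / iPQ)
        scale = w.abs().max() / 127.0 + 1e-8
        return torch.round(w / scale) * scale

    def forward(self, x):
        w = self.weight
        if self.training and self.p > 0:
            mask = torch.rand_like(w) < self.p               # weights that get the "noise"
            w_q = self._fake_int8(w)
            # straight-through: quantized values only on the masked subset
            w = torch.where(mask, w + (w_q - w).detach(), w)
        else:
            w = self._fake_int8(w)                           # fully quantized at inference
        return nn.functional.linear(x, w, self.bias)


layer = QuantNoiseLinear(512, 512, p=0.1)
out = layer(torch.randn(8, 512))
```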
Audio noise reduction on Nvidia RTX
- https://www.nvidia.com/en-us/geforce/guides/nvidia-rtx-voice-setup-guide/
- You can unzip the .exe file and there is a model inside, but it looks like it is encrypted (OpenSSL is bundled there as well)
- Typical dick move by Nvidia - release something useful as an .exe file, encrypt it, and claim it works only on new RTX GPUs (which is false, btw)
NLP
Looks like razdel got a decent facelift - https://github.com/natasha/razdel#sentencies
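Minimal usage, assuming the API I remember from the README (sentenize / tokenize yield substrings with offsets):

```python
from razdel import sentenize, tokenize

text = 'Привет! Как дела? Всё хорошо.'
print([s.text for s in sentenize(text)])                      # sentence splitting
print([t.text for t in tokenize('Кружка-термос на 0.5л')])    # tokenization
```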
Nice thread on Twitter about training transformers on small data - https://twitter.com/Tim_Dettmers/status/1247998807494684672
New library for language modelling written from scratch - https://habr.com/ru/post/499064/
A Scalable Approach to Reducing Gender Bias in Google Translate - https://ai.googleblog.com/2020/04/a-scalable-approach-to-reducing-gender.html
- Pipeline illustration: https://1.bp.blogspot.com/-bI20JjyYU1E/XqCMCVjkEDI/AAAAAAAAF1E/YkfKSMrsktol3Wy5LzC4ZwNJrMYNqC0ZgCLcBGAsYHQ/s640/pipeline.png
Text normalization paper by Facebook - https://research.fb.com/publications/neural-models-of-text-normalization-for-speech-applications/
Russian News Scraping Project - https://github.com/ods-ai-ml4sg/proj_news_viz/wiki/История-проекта-и-как-сейчас-всё-устроено
ML
Just love this guy’s posts - https://blog.piekniewski.info/2020/04/13/deflaition/
Some nice classic ML algorithms on GPU - https://github.com/rapidsai/cuml#supported-algorithms
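cuML's selling point is a scikit-learn-like API that runs on the GPU; a hedged sketch (check the cuML docs for the exact imports and accepted input types):

```python
import numpy as np
from cuml.cluster import KMeans   # GPU k-means with an sklearn-style interface

X = np.random.rand(100_000, 16).astype(np.float32)
km = KMeans(n_clusters=8)
km.fit(X)
labels = km.predict(X)
```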
Real application of ML in agriculture - https://habr.com/ru/company/cognitivepilot/blog/497098/, https://habr.com/ru/company/cognitivepilot/blog/496058/
Tram autopilot (RU) - https://habr.com/ru/company/cognitivepilot/blog/498660/
Train autopilot (RU) - https://habr.com/ru/company/cognitivepilot/blog/499440/
Library to study how autograd works - https://github.com/karpathy/micrograd
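The whole point of micrograd is that the engine fits in ~100 lines of Python; usage is roughly as in the repo README:

```python
from micrograd.engine import Value

a = Value(2.0)
b = Value(3.0)
c = (a * b + 1).relu()   # builds a tiny computation graph
c.backward()             # reverse-mode autodiff over that graph
print(a.grad, b.grad)    # dc/da = 3.0, dc/db = 2.0
```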
3D sensors on Google Pixel phones - https://ai.googleblog.com/2020/04/udepth-real-time-3d-depth-sensing-on.html
- Key insights
- Given a pair of regions that are similar to each other, most corresponding subsets of those regions are also similar
- Lightweight (75k parameter) convolutional architecture, using IR brightness and neighbor information to adjust incorrect matches
- System is self-calibrating
- End-to-end deep learning architecture that enhances the raw uDepth data, inferring a complete, dense 3D depth map
Papers
Gradient Centralization: A New Optimization Technique for Deep Neural Networks - http://arxiv.org/abs/2004.01461
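The trick itself is tiny: before the optimizer step, subtract from each weight gradient its mean taken over all dimensions except the output one. A minimal sketch via parameter hooks (not the authors' code; they fold it into the optimizer):

```python
import torch
import torch.nn as nn

def centralize_grad(grad):
    # only weights with >1 dim (conv / linear); biases are left alone
    if grad.dim() > 1:
        return grad - grad.mean(dim=tuple(range(1, grad.dim())), keepdim=True)
    return grad

model = nn.Sequential(nn.Conv2d(3, 16, 3), nn.ReLU(), nn.Flatten(), nn.Linear(16 * 30 * 30, 10))
for p in model.parameters():
    if p.dim() > 1:
        p.register_hook(centralize_grad)

loss = model(torch.randn(2, 3, 32, 32)).sum()
loss.backward()   # gradients arrive already centralized
```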
Dynamic Relu - http://arxiv.org/abs/2003.10027
Designing Network Design Spaces - http://arxiv.org/abs/2003.13678
- Very similar to how I do it, albeit with 1000x more compute
- Essentially this is differential evolution of network design in a low-compute, low-epoch training regime with only a single block type
- Interesting design principles: (i) shared bottleneck ratio (ii) shared group width (iii) good networks have increasing widths
- The depth of best models is stable across regimes, with an optimal depth of ∼20 blocks (60 layers)
- The best models use a bottleneck ratio b of 1.0, which effectively removes the bottleneck
- The width multiplier wm of good models is ∼2.5, similar but not identical to the popular recipe of doubling widths across stages (see the width sketch after this list)
- Inverted bottleneck degrades the quality
- SE modules add a good gain
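A tiny sketch of the width rule as I remember it from the paper (parameter values are just plausible examples, not a quoted config): per-block widths grow linearly in the continuous parameterization, then get snapped to powers of the width multiplier w_m and rounded to multiples of 8, which yields the stage-wise increasing widths mentioned above.

```python
import numpy as np

def regnet_widths(w_0=48, w_a=36.0, w_m=2.5, depth=20):
    j = np.arange(depth)
    u = w_0 + w_a * j                                # continuous per-block widths
    s = np.round(np.log(u / w_0) / np.log(w_m))      # stage index for each block
    w = w_0 * np.power(w_m, s)                       # quantize widths to w_0 * w_m^s
    return (np.round(w / 8) * 8).astype(int)         # round to multiples of 8

print(regnet_widths())   # monotonically increasing, piecewise-constant widths
```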
A blog of an open-source search engine(?) - https://0x65.dev/
Some ez tweaks to the transformer architecture for NER - http://arxiv.org/abs/1911.04474
- Transformer-based character encoder
- Un-scaled Dot-Product Attention (see the sketch after this list)
- Sinusoidal position embeddings
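The "un-scaled" bit really is just dropping the 1/sqrt(d_k) factor from vanilla dot-product attention; a minimal sketch (function name and shapes are mine, not the paper's):

```python
import torch

def unscaled_attention(q, k, v, mask=None):
    scores = q @ k.transpose(-2, -1)                 # note: no division by sqrt(d_k)
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float('-inf'))
    return torch.softmax(scores, dim=-1) @ v

q = k = v = torch.randn(2, 10, 64)                   # (batch, seq_len, d_k)
out = unscaled_attention(q, k, v)
```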
Looks like a very cool set of CUDA-based ML libraries - https://github.com/rapidsai
Advancing Self-Supervised and Semi-Supervised Learning with SimCLR
- https://ai.googleblog.com/2020/04/advancing-self-supervised-and-semi.html
- Contrastive learning (loss sketch after this list)
- A combination of simple augmentations (random cropping, random color distortion, and Gaussian blur)
- ResNet
- Two transformations stand out: random cropping and random color distortion
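A compact sketch of the contrastive (NT-Xent) loss behind SimCLR: two augmented views of each image, the matching view is the positive, everything else in the batch is a negative. Function name and the temperature value are assumptions, not the paper's code:

```python
import torch
import torch.nn.functional as F

def nt_xent(z1, z2, temperature=0.5):
    # z1, z2: (N, d) projections of two augmented views of the same N images
    n = z1.size(0)
    z = F.normalize(torch.cat([z1, z2], dim=0), dim=1)        # (2N, d)
    sim = z @ z.t() / temperature                              # scaled cosine similarities
    mask = torch.eye(2 * n, dtype=torch.bool, device=z.device)
    sim = sim.masked_fill(mask, float('-inf'))                 # drop self-similarity
    targets = torch.cat([torch.arange(n, 2 * n), torch.arange(0, n)]).to(z.device)
    return F.cross_entropy(sim, targets)                       # the positive is the other view

loss = nt_xent(torch.randn(32, 128), torch.randn(32, 128))
```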
Interesting new “internal” NN visualizations:
Datasets
XTREME: a massively multilingual multi-task benchmark - https://ai.googleblog.com/2020/04/xtreme-massively-multilingual-multi.html
- 40 typologically diverse languages (spanning 12 language families)
- 9 tasks