2021 DS/ML digest 06

Posted by snakers41 on July 1, 2021

Speech

U2++: Unified Two-pass Bidirectional End-to-end Model for Speech Recognition - https://arxiv.org/pdf/2106.05642.pdf

ML / Papers

Contrastive Representation Learning - https://lilianweng.github.io/lil-log/2021/05/31/contrastive-representation-learning.html
Extending Contrastive Learning to the Supervised Setting - https://ai.googleblog.com/2021/06/extending-contrastive-learning-to.html
A Browsable Petascale Reconstruction of the Human Cortex - https://ai.googleblog.com/2021/06/a-browsable-petascale-reconstruction-of.html
Image “Cloaking” for Personal Privacy - https://sandlab.cs.uchicago.edu/fawkes/
Paper Review: ByT5: Towards a token-free future with pre-trained byte-to-byte models - https://andlukyane.com//blog/paper-review-byt5
Paper Review: Long Text Generation by Modeling Sentence-Level and Discourse-Level Coherence - https://andlukyane.com//blog/paper-review-hint
A small and fast BERT for the Russian language - https://habr.com/ru/post/562064/
FRILL: On-Device Speech Representations using TensorFlow-Lite - https://ai.googleblog.com/2021/06/frill-on-device-speech-representations.html
Paper Review: CoAtNet Marrying Convolution and Attention for All Data Sizes - https://andlukyane.com//blog/paper-review-coatnet
Are Self-Driving Cars Really Safer Than Human Drivers? - https://thegradient.pub/are-self-driving-cars-really-safer-than-human-drivers/
Scaling Vision Transformers - https://arxiv.org/pdf/2106.04560.pdf
How to Do Multi-Task Learning Intelligently - https://thegradient.pub/how-to-do-multi-task-learning-intelligently/
Take All Your Pictures to the Cleaners, with Google Photos Noise and Blur Reduction - https://ai.googleblog.com/2021/06/take-all-your-pictures-to-cleaners-with.html
A review of numerical optimization methods. Unconstrained optimization: the line search method - https://habr.com/ru/post/561128/
GitHub Copilot - https://copilot.github.com/
Advancing AI to make shopping easier for everyone - https://ai.facebook.com/blog/advancing-ai-to-make-shopping-easier-for-everyone/

Code

Why your multiprocessing Pool is stuck (it’s full of sharks!) - https://pythonspeed.com/articles/python-multiprocessing/
The Art of Writing Loops in Python - https://medium.com/techtofreedom/the-art-of-writing-loops-in-python-68e9869e4ed4
Python requests alternative - https://github.com/encode/httpx/
Python string parser - https://github.com/r1chardj0n3s/parse
Measuring memory usage in Python: it’s tricky! - https://pythonspeed.com/articles/measuring-memory-python/
Python: the non-obvious in the obvious - https://habr.com/ru/post/564804/
Functools - The Power of Higher-Order Functions in Python - https://martinheinz.dev/blog/52
Measuring the memory usage of a Pandas DataFrame - https://pythonspeed.com/articles/pandas-dataframe-series-memory-usage/
Some Python patterns - https://habr.com/ru/post/564598/

Tech

Sober Post: Things I’ve learned down-leveling my career - https://old.reddit.com/r/ExperiencedDevs/comments/nnw7yd/sober_post_things_ive_learned_downleveling_my/
Self-Driving Truck Completes 950-Mile Trip 10 Hours Faster Than Human Driver - https://interestingengineering.com/self-driving-truck-completes-950-mile-trip-10-hours-faster-than-human-driver
Freelance hell - https://habr.com/ru/post/561288/

Blogs

Gaming 004: Pirates of the 7 CDs - https://madned.substack.com/p/gaming-004-pirates-of-the-7-cds
The Rise of SPACs: IPO Disruptors or Blank Check Distortions? - http://aswathdamodaran.blogspot.com/2021/06/the-rise-of-spacs-ipo-disruptors-or.html
Dunning Kruger and the emperor’s new clothes - https://cerebralab.com/Dunning Kruger and the emperor’s new clothes
30-minute presentation from Tesla’s Andrej Karpathy on the company’s counter-consensus approach to building autonomous driving - https://www.youtube.com/watch?v=NSDTZQdo6H8&ab_channel=YarrowB.

Hardware

Datasets

GigaSpeech, 10k hours of transcribed English speech - https://github.com/SpeechColab/GigaSpeech
PatentNet: A Large-Scale Incomplete Multiview, Multimodal, Multilabel Industrial Goods Image Database - https://arxiv.org/abs/2106.12139