2021 DS/ML digest 05

Posted by snakers41 on June 2, 2021

Speech

Scaling End-to-End Models for Large-Scale Multilingual ASR - http://arxiv.org/abs/2104.14830
Unsupervised Speech Recognition - https://arxiv.org/abs/2105.11084

ML / Papers

Holistic Video Scene Understanding with ViP-DeepLab - https://ai.googleblog.com/2021/04/holistic-video-scene-understanding-with.html
Do Wide and Deep Networks Learn the Same Things? - https://ai.googleblog.com/2021/05/do-wide-and-deep-networks-learn-same.html
Advancing the state of the art in computer vision with self-supervised Transformers and 10x more efficient training (semseg) - https://ai.facebook.com/blog/dino-paws-computer-vision-with-self-supervised-transformers-and-10x-more-efficient-training/; the code looks decent and minimal - https://github.com/facebookresearch/dino
Introducing FELIX: Flexible Text Editing Through Tagging and Insertion - https://ai.googleblog.com/2021/05/introducing-felix-flexible-text-editing.html
Infinite Nature: Perpetual View Generation of Natural Scenes from a Single Image - https://www.youtube.com/watch?v=oXUf6anNAtc&ab_channel=AngjooKanazawa
Open source software, freedom zero, and machine learning - https://thegradient.pub/machine-learning-ethics-and-open-source-licensing-2/
MLP-Mixer: An all-MLP Architecture for Vision - https://arxiv.org/abs/2105.01601
Paper Review: Are Pre-trained Convolutions Better than Pre-trained Transformers? - https://andlukyane.com//blog/paper-review-cnnbettertransformers
Accelerating Eye Movement Research for Wellness and Accessibility - https://ai.googleblog.com/2021/05/accelerating-eye-movement-research-for.html
Rotary Embeddings for transformers - https://blog.eleuther.ai/rotary-embeddings/
FNet: Mixing Tokens with Fourier Transforms - http://arxiv.org/abs/2105.03824
Class imbalance (in Russian) - https://dyakonov.org/2021/05/27/imbalance/

New “notation” for Deep learning

ALIGN: Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision - https://ai.googleblog.com/2021/05/align-scaling-up-visual-and-vision.html
Understanding Contextual Facial Expressions Across the Globe - https://ai.googleblog.com/2021/05/understanding-contextual-facial.html

Code

A repo with standalone transformer implementations - https://github.com/lucidrains/x-transformers
Sorted sets in Python - https://habr.com/ru/company/macloud/blog/558724/
The Correct Way to Overload Functions in Python - https://martinheinz.dev/blog/50

Tech

Resetting the App Store - https://www.ben-evans.com/benedictevans/2021/4/30/resetting-the-app-store
Step changes in ecommerce - https://www.ben-evans.com/benedictevans/2021/4/25/step-changes-in-ecommerce
Why A Unified ID Will Never Work - And Why Walled Gardens Will - https://www.kevel.co/blog/unified-id-cookies/

Blogs

Last Week in AI #114 - https://lastweekin.ai/p/115
Farming Robot Kills 100,000 Weeds per Hour With Lasers - https://www.freethink.com/articles/farming-robot
Top 3 Statistical Paradoxes in Data Science - https://towardsdatascience.com/top-3-statistical-paradoxes-in-data-science-e2dc37535d99
Transformers, Explained: Understand the Model Behind GPT-3, BERT, and T5 - https://daleonai.com/transformers-explained
GOOGLE’S VCU – A NEW CHIP ENTERS THE FRAY - https://digitstodollars.com/2021/05/11/googles-vcu-a-new-chip-enters-the-fray/
Andrew Ng X-Rays the AI Hype: AI pioneer says machine learning may work on test sets, but that’s a long way from real-world use - https://spectrum.ieee.org/view-from-the-valley/artificial-intelligence/machine-learning/andrew-ng-xrays-the-ai-hype
GOOGLE BIGCHIP? - https://digitstodollars.com/2021/05/14/google-bigchip/
The Gradient bi-weekly newsletter - https://thegradientpub.substack.com/p/update-1-fbi-usage-of-facial-recognition
Can Apple change ads? - https://www.ben-evans.com/benedictevans/2021/5/13/apples-ads-music
Unravelling the with statement - https://snarky.ca/unravelling-the-with-statement/
Project Guideline: Enabling Those with Low Vision to Run Independently - https://ai.googleblog.com/2021/05/project-guideline-enabling-those-with.html
Reader feedback on CPUs: pay attention to those frequencies - https://rachelbythebay.com/w/2021/05/22/cpu/
Unravelling async and await - https://snarky.ca/unravelling-async-and-await/
LET’S BUILD A CHIP – WITH MATH - https://digitstodollars.com/2021/05/28/lets-build-a-chip-with-math/
Please don’t count outages - https://rachelbythebay.com/w/2021/06/01/count/

Hardware

New Nvidia GPUs, you know - https://pc-01.tech/nvidia-3080-ti-3070-ti/

Datasets

Crisscrossed Captions: Semantic Similarity for Images and Text - https://ai.googleblog.com/2021/05/crisscrossed-captions-semantic.html

TextOCR dataset:

  • 28,134 natural images from TextVQA
  • 903,069 annotated scene-text words
  • 32 words per image on average
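
The average above follows directly from the two counts. A minimal sanity check in Python (the variable names are illustrative and not part of any TextOCR tooling):

```python
# Counts as listed in the digest for the TextOCR dataset.
num_images = 28_134
num_words = 903_069

# Average number of annotated scene-text words per image.
avg_words_per_image = num_words / num_images

print(f"{avg_words_per_image:.1f} words per image")
```

Rounded to the nearest integer this gives the 32 words per image quoted in the list.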