2018 DS/ML digest 20
Market / posts:
(1) Classic AI competition https://habr.com/post/419745/
- Looks really cool, but there is no leaderboard
- Also, too little time to participate
(2) fast.ai and their fast ImageNet training http://www.fast.ai/2018/08/10/fastai-diu-imagenet/
- Too much hype and over-engineering?
- Sound ideas: (1) using variable image sizes (because CNNs are fully convolutional inside); (2) progressive resizing: train on smaller images first, then increase the size, which also lets you use larger batch sizes; (3) removing weight decay from batchnorm layers (a small PyTorch sketch follows this item)
- PyTorch has a distributed training tutorial https://pytorch.org/tutorials/intermediate/dist_tuto.html#distributed-training
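The batchnorm weight-decay trick is easy to reproduce in plain PyTorch by splitting parameters into optimizer groups. A minimal sketch, assuming a ResNet-18 and placeholder hyper-parameters (this is not the fast.ai code):

```python
import torch
import torch.nn as nn
import torchvision.models as models

model = models.resnet18()

# Split parameters: batchnorm weights/biases and all biases get zero weight decay
decay, no_decay = [], []
for module in model.modules():
    for name, param in module.named_parameters(recurse=False):
        if isinstance(module, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)) or name == "bias":
            no_decay.append(param)
        else:
            decay.append(param)

optimizer = torch.optim.SGD(
    [
        {"params": decay, "weight_decay": 1e-4},  # placeholder value
        {"params": no_decay, "weight_decay": 0.0},
    ],
    lr=0.1,
    momentum=0.9,
)
```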
(3) Intel NLP overview https://software.intel.com/en-us/articles/transfer-learning-in-natural-language-processing
- DropConnect for LSTMs (randomly set inner LSTM weights to zero)
- Concat pooling of the LSTM mean output and the last output (a minimal sketch follows this item)
- Gradual unfreezing, different learning rates per layer
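A minimal sketch of concat pooling as described above (mean of all LSTM outputs concatenated with the last output); the vocabulary size, dimensions and classification head are placeholders of my own:

```python
import torch
import torch.nn as nn


class ConcatPoolClassifier(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=100, hidden_dim=256, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        # mean-pooled output + last output => 2 * hidden_dim features
        self.head = nn.Linear(2 * hidden_dim, n_classes)

    def forward(self, tokens):                     # tokens: (batch, seq_len)
        outputs, _ = self.lstm(self.emb(tokens))   # (batch, seq_len, hidden_dim)
        mean_pool = outputs.mean(dim=1)            # (batch, hidden_dim)
        last = outputs[:, -1, :]                   # (batch, hidden_dim)
        return self.head(torch.cat([mean_pool, last], dim=1))


model = ConcatPoolClassifier()
logits = model(torch.randint(0, 10000, (4, 50)))   # (4, n_classes)
```

Per-layer learning rates are set the same way as the weight-decay groups above, via separate optimizer parameter groups.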
(4) Kaggle provides free GPUs … yeah right https://mailchi.mp/kaggle/tap-into-the-power-of-gpus-for-deep-learning-2577173?e=fbd638a11f
(5) Recurrent models do not need to be recurrent http://www.offconvex.org/2018/07/27/approximating-recurrent/
(6) The state of conversational AI https://www.poly-ai.com/docs/naacl18.pdf
Papers / interesting abstracts:
- Google’s new MnasNet, a faster mobile architecture https://ai.googleblog.com/2018/08/mnasnet-towards-automating-design-of.html - this time they actually show the final architecture
Applied code / libraries:
- Choosing a syntactic parser for the Russian language https://habr.com/company/sberbank/blog/418701/ - TL;DR - use UDPipe
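A minimal UDPipe usage sketch via the ufal.udpipe Python bindings; the model file name is a placeholder, and a pretrained Russian model has to be downloaded separately:

```python
# pip install ufal.udpipe, then download a pretrained Russian model (path below is a placeholder)
from ufal.udpipe import Model, Pipeline

model = Model.load("russian-syntagrus-ud-2.0.udpipe")
pipeline = Pipeline(model, "tokenize", Pipeline.DEFAULT, Pipeline.DEFAULT, "conllu")

# Returns CoNLL-U output with lemmas, POS tags and dependency arcs
print(pipeline.process("Мама мыла раму."))
```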