My portfolio / CV

Just an up-to-date list of my accomplishments / competitions / credentials in a human-readable form with hyperlinks

Posted by snakers41 on January 1, 2018

TLDR

My name is Alexander Veysov, I am a Machine Learning practitioner / Data Scientist with a background in Economics / Finance. I have pivoted my path roughly every 3-4 years: from a VC fund in Moscow, to an online e-commerce startup, to machine learning.

My favorite DL framework is PyTorch. I have experience in building production-ready ML pipelines in Computer Vision (CV) and Natural Language Processing (NLP).

Now I work at Profi.ru as the head of the Data Science department.

Contacts

References

Machine learning competitions / projects / pet projects

Obviously, this list excludes anything that I have done under NDA / any kind of oral agreement of this sort.

Also I am not particularly fond of the latest developments happening to Kaggle (I entered too late I guess, it started spoiling ~2016 and was recently purchased), and I like to choose ML competitions that:

  • Open a new and unique domain for me;
  • Are genuinely interesting (not stacking 100 XGBoosts, or grinding an ensemble of 300 ResNets);
  • Provide some value for me as well as the community / my channel;

Nevertheless I have found motivation to enter these competitions / projects and learn in the process:

  • 2018 - How we took first place in CFT-2018 spelling competition (TLDR - proper seq2seq meets typos) + TDS article + code release (not full, sorry);
  • 2018 - A summary publication about what we managed to do in Profi.ru, I was also accepted as a TDS author. No code release here, for obvious reasons;
  • 2018 - Learning that Google’s new papers are not easy to replicate - I contacted the authors, and they promised to release the GPU training regime, which they did not (alas);
  • 2018 - Parsing Wikipedia and Common Crawl in literally a handful of commands;
  • 2018 - Solo medal in Airbus ship detection challenge on Kaggle, which finally led me to believe that Kaggle is a joke. I was sceptical about Kaggle from the beginning, but this kind of nailed it. No proper article and code release;
  • 2018 - Some ideas about building model cascades in scenarios close to real life (Google’s Open Images);
  • 2018 - CrowdAI mapping challenge - I was not really interested in competing per se, but refining self-supervision techniques in semantic segmentation was nice;
  • 2018 - how I learned a bit about VAEs;
  • 2018 - Machines Can See 2018 2nd place + proper code release - new domain - adversarial attacks;
  • 2018 - Kaggle DS bowl 2018 article + proper code release - new domain for me (instance segmentation) and ~100-150th place on private LB within one week on single model w/o ensembles (also later I found a stupid bug in my code);
  • 2018 - more or less finished my exploratory hobby project I did for a friend (part one). It was a moonshot, but a lot of cool experience;
  • 2018 - a 5000+ time series forecasting competition. At first it seemed really cool, but in the end, even though we finished in 16th place, the overall experience was disappointing. On the other hand, the techniques we tried here were really cool - tabular data embeddings, RNNs and various forests as baselines;
  • 2018 - SpaceNet 3 Road Detector + proper code release. I ended up in top 10 with the advice of Dmytro. I believe this to be my best achievement;
  • 2017-2018 our prize winning Jungle animal detection challenge + proper code;
  • 2017 - a small social phenomena classification project I did. Though it seems a bit straightforward, there was a lot of feature engineering involved;
  • 2017 - my first acquaintance with object detection: classifying fish and measuring their length;
  • 2017 - a single model submission in Carvana segmentation challenge - I ranked 67th. I mostly did this to learn about UNet;
  • 2017 - our hobby neuro chicken coop project - where we basically detected chickens in a remote location with Raspberry Pi and bad Internet. Just a passion project I did with my girlfriend;
  • 2017 - just for fun I did a hobby project on classifying birds by their songs;

My Telegram channel

I have a Telegram channel with 1k+ subscribers (~1400 as of latest revision). Also do not mix up these numbers with something more conventional like email - Telegram’s open rates are ~80-100%+, i.e. the audience is really involved and active.

Telegram is essentially a messaging app with public groups / chats / open API / chat bots and a plethora of other features. It also boasts the most advanced tech and is robust to many things, if you know what I mean. I mostly post things that I find interesting / worth reading / outstanding about ML / DL / CV / Internet and Linux. It also mirrors and follows our blog (this website) a bit. Also we have a web Telegram feed with RSS if you are into this sort of thing.

Spark-in.me website

In a nutshell, after leaving Ponominalu.ru, I made this website my pet-project, having written the back-end / API layer and a simple CMS using the tools I knew - PHP and PostgreSQL. The front-end part was outsourced to a third party. So the way this site is structured (namely its tags, SEO capabilities and overall architecture) can tell you a bit about myself.

Education / background

I hold a BA and an MA in Economics from this university. Originally born in Siberia, I worked my way up by entering one of the most prestigious universities in the CIS (for free =) ) and graduating magna cum laude. I would not really say that it was difficult there given that we mostly studied foreign languages and econometrics-related disciplines, but it was a challenge to combine some orthogonal things. MGIMO in general sounds a bit like a diplomatic / economic background, but I graduated from a more math / econometrics / IT-driven faculty there.

Also as a result of this, I believe that my English is fluent, with German and Spanish just being additional perks on top.

Major stepping stones:

  • Buran VC. My first major place of work was this formerly Moscow-based VC fund (the previous version of its website I wrote in Notepad++ back in 2012, it stayed alive for 5-6 years). I was the first employee hired apart from the 2 managing partners, who worked for free at that time. I worked there for ~3 years and was promoted to associate. It was really mind-broadening to see ~1000 start-ups and more or less actively get involved with a couple of dozen. When the crisis hit Russia back in 2014 - 2015, they decided to gravitate towards Europe, while my long-term plan was to move towards more technical roles;

  • Ponominalu.ru. Initially I entered the company more or less in the capacity of financial controller for a Buran VC portfolio company / corporate secretary for communication with Cypriot administrators. Then I gravitated towards more technical projects within the company. In the end the company was sold to one of the Russian public telcos, which I enjoyed as one of the option holders. In total I also stayed there for ~3 years. At first my projects were more or less financial (the most notable was making sure that the Formula 1 event did not bankrupt us and that we remained CF positive, because the promoter - a company owning Sochi Olympics objects - was notorious for litigation and delaying payments). The company was structured without explicit hierarchy, so this is a brief list of my best project achievements there:

    • Major refactoring of back-end to fit proper financial tracking;
    • Overseeing the online acquiring CF stream, I made sure that our online acquiring rate decreased by ~30-40% and online conversion grew by ~20 percentage points;
    • Mobile app 2.0 on iOS and Android;
    • Some SEO-driven projects;
    • Leading a team of 5 analysts;
  • ~1 year of ML competitions and self-education;

  • Brief work at PicsArt Inc in Armenia (a USA VC-backed company outsourcing its operations to Armenia) for ~4 months as a Data Scientist. At first I was very enthusiastic about joining an international company, but then mindset differences / corporate “culture” drove me away from there (read between the lines - I would not say that people were … striving for results there 100% of the time). During this short term, I finished 2 production-ready models there;

What inspires me

Building beautiful and useful pipelines / models.
ML / DL provides a constant stimulus to read about new things, learn new technologies.

During the last year I was particularly inspired by:

  • Almost all the things that FAIR produces. Such a drastic difference with Google / TF / their main business;
  • UMAP / HDBSCAN. Leland McInnes is my role model;
  • Needless to say - PyTorch is awesome;