Data Science by ODS.ai 🦜 – Telegram

Data Science by ODS.ai 🦜

@opendatascience

46K subscribers

664 photos

77 videos

7 files

1.75K links

First Telegram Data Science channel. Covering all technical and popular stuff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of the former. To reach editors contact: @malev

Data Science by ODS.ai 🦜

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer (T5)

The approach casts every language problem as a text-to-text task. For example, English-to-German translation – input: "translate English to German: That is good." target: "Das ist gut." or sentiment ID – input: "sentiment: This movie is terrible!", target: "negative"
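A minimal sketch of the text-to-text framing described above, assuming each task is identified only by a plain-text prefix. The helper name `to_text_to_text` and the task keys are hypothetical; the prefixes and examples are the ones quoted in the post.

```python
# Hypothetical helper: every task becomes an (input string, target string) pair.
PREFIXES = {
    "translate_en_de": "translate English to German: ",
    "sentiment": "sentiment: ",
}

def to_text_to_text(task, text, target):
    """Prepend the task prefix to the input; the target stays plain text."""
    return PREFIXES[task] + text, target

examples = [
    to_text_to_text("translate_en_de", "That is good.", "Das ist gut."),
    to_text_to_text("sentiment", "This movie is terrible!", "negative"),
]

for model_input, model_target in examples:
    print(repr(model_input), "->", repr(model_target))
```

Because inputs and targets are both strings, the same model and loss can in principle be reused across tasks without task-specific output layers.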

Transfer learning for NLP usually uses unlabeled data for pre-training, so they assembled the "Colossal Clean Crawled Corpus" (C4), ~750GB of cleaned text from Common Crawl.

The authors compared different architectural variants, including encoder-decoder models and language models in various configurations and with various objectives. The encoder-decoder architecture performed best in their text-to-text setting.

More in the Twitter thread: https://twitter.com/colinraffel/status/1187161460033458177?s=20

Paper: https://arxiv.org/abs/1910.10683
Code/models/data/etc: https://github.com/google-research/text-to-text-transfer-transformer

#NLP #DL #transformer

8.97K views, 12:17

Data Science by ODS.ai 🦜

How Trip Inferences and Machine Learning Optimize Delivery Times on Uber Eats

An article on how a business task can be decomposed into an ML problem.

Link: https://eng.uber.com/uber-eats-trip-optimization/

#Uber #ml #taskdesign #analytics

How Trip Inferences and Machine Learning Optimize Delivery Times on Uber Eats | Uber Blog

Using GPS and sensor data from Android phones, Uber engineers develop a state model for trips taken by Uber Eats delivery-partners, helping to optimize trip timing for delivery-partners and eaters alike.

8.91K views, 05:19

Data Science by ODS.ai 🦜

Two papers stating random architecture search is a competitive (in some cases superior) baseline for NAS methods.

These papers demonstrate that Neural Architecture Search can be stochastic.
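A rough sketch of what a random-search baseline looks like: sample architectures uniformly from a search space and keep the best-scoring one. The search space and the `toy_score` function are hypothetical stand-ins; in a real setup the score would come from training and validating each sampled network.

```python
import random

# Hypothetical toy search space; real NAS spaces are far larger.
SEARCH_SPACE = {
    "num_layers": [2, 4, 8],
    "width": [64, 128, 256],
    "activation": ["relu", "tanh"],
}

def sample_architecture(rng):
    """Draw one architecture uniformly at random from the search space."""
    return {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}

def toy_score(arch):
    """Stand-in for 'train and validate'; NOT a real accuracy estimate."""
    bonus = 0.05 if arch["activation"] == "relu" else 0.0
    return arch["num_layers"] * 0.1 + arch["width"] / 1000 + bonus

def random_search(num_trials, seed=0):
    """Evaluate num_trials random architectures and return the best one."""
    rng = random.Random(seed)
    candidates = [sample_architecture(rng) for _ in range(num_trials)]
    return max(candidates, key=toy_score)

best = random_search(num_trials=20)
print(best)
```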

Paper 1: https://arxiv.org/abs/1902.08142
Paper 2: https://arxiv.org/abs/1902.07638

#NAS #nn #DL

Evaluating the Search Phase of Neural Architecture Search

Neural Architecture Search (NAS) aims to facilitate the design of deep networks for new tasks. Existing techniques rely on two stages: searching over the architecture space and validating the best...

10.4K views, 07:05

Data Science by ODS.ai 🦜

ICCV 2019 papers

ICCV 2019 is one of the major tier-A conferences on Computer Vision. These are the papers presented at the conference. We are definitely going to post short descriptions of the most influential ones, but if you don't want to wait, here is the link:

Link: http://openaccess.thecvf.com/ICCV2019.py

#CV #Papers

9.48K views, 14:03

Data Science by ODS.ai 🦜

FUNIT: Few-Shot Unsupervised Image-to-Image Translation

A team of NVIDIA researchers has defined new AI techniques that give computers enough smarts to see a picture of one animal and recreate its expression and pose on the face of any other creature. The work is powered in part by generative adversarial networks (GANs), an emerging AI technique that pits one neural network against another.

Blog: https://blogs.nvidia.com/blog/2019/10/27/ai-gans-pets-ganimals/
Paper: https://arxiv.org/abs/1905.01723
Code: https://github.com/NVlabs/FUNIT
GANimal app: http://nvidia-research-mingyuliu.com/ganimal/

#CV #GAN #ICCV

9.28K views, 18:42

👎 1 🐶 15 👍 19

Data Science by ODS.ai 🦜

YOLACT_ Real-Time Instance Segmentation [ICCV Trailer].mp4

YOLACT: Real-time Instance Segmentation

Fully-convolutional model for real-time instance segmentation that achieves 29.8 mAP on MS COCO at 33.5 fps evaluated on a single Titan Xp, which is significantly faster than any previous competitive approach. They obtain this result after training on only one GPU.

video: https://www.youtube.com/watch?v=0pMfmo8qfpQ
paper: https://arxiv.org/abs/1904.02689
code: https://github.com/dbolya/yolact

#yolo #instance_segmentation #segmentation #real_time

14.7K views, 08:03

Data Science by ODS.ai 🦜

🎃Moscow Data Halloween on the 31st of October

It’s gonna be one of the most unusual data science meetups!

We will have several Black ML talks, Data Science PPT Karaoke from Hell, costume contest with prizes, lots of fun and afterparty.

Registration link: https://corp.mail.ru/ru/press/events/678/

On October 31, 2019, Mail.ru Group and the Open Data Science community invite you to Data Halloween!

8.48K views, edited 10:42

Data Science by ODS.ai 🦜

NLP News: Deep Learning Indaba, EurNLP, ML echo chamber, Pretrained LMs, Reproducibility papers

The famous Sebastian Ruder (Research scientist @ DeepMindAI) wrote an interesting article about the latest NLP news.

article: http://newsletter.ruder.io/issues/deep-learning-indaba-eurnlp-ml-echo-chamber-pretrained-lms-reproducibility-papers-199557
tweet: https://twitter.com/seb_ruder/status/1186567939232817153?s=20

#NLP #News #Conference

9.99K views, 13:13

Data Science by ODS.ai 🦜

🏆 Moscow ML Trainings meetup on the 2nd of November

ML Trainings are based on Kaggle and other platform competitions and are held regularly with free attendance and a live stream. Winners and top-performing participants discuss competition tasks and share their solutions and results.

Program and the registration link - https://corp.mail.ru/ru/press/events/682/
Live stream link - https://youtu.be/VNsXzK4C7gg
* Note: this time all the talks will be in Russian. Usually, we have one talk in English. @mltrainings

VK / Machine Learning Training

Machine Learning Training is an open meetup where we invite participants of data analysis competitions to get acquainted, talk about the tasks, share their competition experience, and socialize.

9.99K views, 14:21

Data Science by ODS.ai 🦜

ODS breakfast in Paris! See you this Saturday (2nd of November) at 10:30 at Malongo Café, 50 Rue Saint-André des Arts.

8.51K views, 09:35

Data Science by ODS.ai 🦜

6-PACK: Category-level 6D Pose Tracker with Anchor-Based Keypoints

A deep learning approach to category-level 6D object pose tracking on RGB-D data. The method tracks, in real time, novel object instances of known object categories such as bowls, laptops, and mugs. 6-PACK learns to compactly represent an object by a handful of 3D keypoints, based on which the interframe motion of an object instance can be estimated through keypoint matching.
These keypoints are learned end-to-end without manual supervision to be most effective for tracking. Their experiments show that the method substantially outperforms existing methods on the NOCS category-level 6D pose estimation benchmark and supports a physical robot to perform simple vision-based closed-loop manipulation tasks.

preprint: https://arxiv.org/abs/1910.10750
code: https://github.com/j96w/6-PACK
tweet: https://twitter.com/RobobertoMM/status/1187617487837257733?s=20
video: https://www.youtube.com/watch?v=INBjNZsnfy4

#CV #DL #PatternRecognition

8.89K views, 10:53

Data Science by ODS.ai 🦜

Keras Tuner

Fully-featured, scalable, easy-to-use hyperparameter tuning for Keras & beyond.

It supports RandomSearch, BayesianOptimization, and Hyperband. It can run locally or in a distributed setting. It's possible to have both multi-device single-model training (one machine training one model over 8 GPUs) and distributed search (many models in parallel) at the same time.
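To give a feel for what a Hyperband-style tuner does, here is a plain-Python sketch of successive halving, the subroutine Hyperband is built on. This is not the Keras Tuner API: `successive_halving`, `toy_evaluate`, and the `learning_rate` search space are hypothetical names, and the scoring function is a stand-in for real training.

```python
import math
import random

def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Keep the best 1/eta of the configs each round; survivors get eta times more budget."""
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget), reverse=True)
        survivors = ranked[: max(1, len(ranked) // eta)]
        budget *= eta
    return survivors[0]

def toy_evaluate(config, budget):
    """Hypothetical stand-in for 'train for `budget` epochs and return a validation score'."""
    closeness = -abs(math.log10(config["learning_rate"]) + 2)  # best near 1e-2
    return closeness + 0.01 * budget

rng = random.Random(0)
configs = [{"learning_rate": 10 ** rng.uniform(-5, 0)} for _ in range(8)]
best = successive_halving(configs, toy_evaluate)
print(best)
```

The idea is to spend little compute on many candidates first and reserve long training runs for the few that look promising.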

documentation: https://keras-team.github.io/keras-tuner/
tweet: https://twitter.com/fchollet/status/1189992078991708160?s=21

#DL #keras #Tuning #BayesianOptimization

8.57K views, 15:57

👎 1 😱 19 👍 41

Data Science by ODS.ai 🦜

🔥DeepMind’s AlphaStar beats top human players at strategy game StarCraft II

AlphaStar by Google’s DeepMind can now play StarCraft 2 so well that it places in the 99.8th percentile on the European server. In other words, way better than even great human players, achieving performance similar to gods of StarCraft.

The solution basically combines reinforcement learning with a quality-diversity algorithm, which is similar to an evolutionary algorithm.

What’s difficult about StarCraft and how it differs from recent #Go and #Chess AI solutions: even finding a winning strategy is not enough to win (StarCraft is famous for its rock-paper-scissors-like, not-so-transitive game design, unlike chess and Go), since the result depends on execution at different macro and micro levels and at different timescales.

How this applies to the real world: basically, it is running logistics, manufacturing, and research with complex operations and different units.

Why this matters: it brings AI one step closer to running a real business.

Blog post: https://deepmind.com/blog/article/AlphaStar-Grandmaster-level-in-StarCraft-II-using-multi-agent-reinforcement-learning
Nature: https://www.nature.com/articles/d41586-019-03298-6
ArXiV: https://arxiv.org/abs/1902.01724
Nontechnical video: https://www.youtube.com/watch?v=6eiErYh_FeY

#Google #GoogleAI #AlphaStar #Starcraft #Deepmind #nature #AlphaZero

The AI that mastered Starcraft II

Google’s DeepMind artificial intelligence researchers have already mastered games like Pong, Chess and Go but their latest triumph is on another planet. AlphaStar is an artificial intelligence trained to play the science fiction video game StarCraft II.
…

8.14K views, 05:19

🔥 22 🐜 5 🌏 1 💎 2

Data Science by ODS.ai 🦜

SinGAN: Learning a Generative Model from a Single Natural Image

Best Paper Award at #ICCV2019. A generative model that learns from a single natural image and then generates random samples.

ArXiV: https://arxiv.org/pdf/1905.01164v2.pdf
Github: https://github.com/tamarott/SinGAN

#GAN #ICCV #BestPaperAward

7.39K views, 06:35

Data Science by ODS.ai 🦜

Matus Telgarsky’s Deep Learning Theory course

Course syllabus and lecture handouts from the University of Illinois.

Link: http://mjt.cs.illinois.edu/courses/dlt-f19/

#MOOC #DL #Theory #Course

7.9K views, 06:41

Data Science by ODS.ai 🦜

Prescribed Generative Adversarial Networks

Adding noise to the generator's output prevents the mode collapse common in GANs, and also allows approximate log-likelihood evaluation.
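A toy illustration of why output noise makes a density available: if a sample is the generator output plus Gaussian noise, its conditional density given the latent code is a well-defined Gaussian. This sketch assumes a fixed scalar noise level `sigma` and uses a hypothetical two-dimensional `generator`; the paper's actual parameterization and likelihood estimator may differ.

```python
import math
import random

def generator(z):
    """Hypothetical stand-in for a trained generator network."""
    return [2.0 * z[0], z[0] + z[1]]

def sample_noisy(z, sigma, rng):
    """x = g(z) + sigma * eps, with eps drawn from a standard normal."""
    return [mean + sigma * rng.gauss(0.0, 1.0) for mean in generator(z)]

def log_p_x_given_z(x, z, sigma):
    """Log-density of the isotropic Gaussian N(x; g(z), sigma^2 I)."""
    mean = generator(z)
    dim = len(x)
    squared_error = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
    return (-0.5 * squared_error / sigma ** 2
            - dim * math.log(sigma)
            - 0.5 * dim * math.log(2 * math.pi))

rng = random.Random(0)
z = [rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)]
x = sample_noisy(z, sigma=0.1, rng=rng)
print(log_p_x_given_z(x, z, sigma=0.1))
```

Without the noise term, the generator output would sit on a lower-dimensional surface and this density would not exist, which is why a plain GAN offers no direct likelihood.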

#GAN

Link: https://arxiv.org/abs/1910.04302

7.88K views, 07:05

Data Science by ODS.ai 🦜

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

A method for pre-training seq2seq models by denoising text.

BART is trained by (1) corrupting text with an arbitrary noising function, and (2) learning a model to reconstruct the original text.

They evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token.
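The two noising steps mentioned above, sentence shuffling and span in-filling, can be sketched on whitespace tokens. This assumes one fixed-length span per sentence for simplicity, which is a simplification of the scheme in the paper; `corrupt`, `infill_span`, and the `<mask>` string are illustrative names.

```python
import random

MASK = "<mask>"

def permute_sentences(sentences, rng):
    """Return the sentences in a random order."""
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return shuffled

def infill_span(tokens, start, length):
    """Replace tokens[start:start + length] with a single mask token."""
    return tokens[:start] + [MASK] + tokens[start + length:]

def corrupt(document, rng, span_length=2):
    """Shuffle sentence order, then mask one span in each long-enough sentence."""
    corrupted = []
    for sentence in permute_sentences(document, rng):
        tokens = sentence.split()
        if len(tokens) > span_length:
            start = rng.randrange(len(tokens) - span_length + 1)
            tokens = infill_span(tokens, start, span_length)
        corrupted.append(" ".join(tokens))
    return corrupted

document = ["The cat sat on the mat.", "It was a sunny day.", "Birds sang outside."]
corrupted = corrupt(document, random.Random(0))
print(corrupted)  # model input; the training target is the original document
```

Since a whole span collapses into one mask token, the model also has to work out how many tokens are missing.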

BART matches the performance of RoBERTa with comparable training resources on GLUE and SQuAD, and achieves new state-of-the-art results on a range of abstractive dialogue, Q&A, and summarization tasks, with gains of up to 6 ROUGE.

Paper: https://arxiv.org/abs/1910.13461

#nlp #bert

8.29K views, edited 19:27

Data Science by ODS.ai 🦜

Function-Space Distributions over Kernels

A function-space approach to kernel learning helps to incorporate interpretable inductive biases, manage uncertainty, and discover rich representations of data.

ArXiV: https://arxiv.org/abs/1910.13565

#gaussianprocess #NeurIPS #NeurIPS2019 #FKL #kernellearning

8.58K views, edited 07:03

Data Science by ODS.ai 🦜

Forwarded from Spark in me (Alexander)

The current state of "DIY" ML hardware

(i.e. that you can actually assemble and maintain and use in a small team)

Wanted to write a large post, but decided to just write a TLDR.
In case you need a super-computer / cluster / devbox with 4 - 16 GPUs.

The bad
- Nvidia DGX and similar - 3-5x overpriced (sic!)
- Cloud providers (Amazon) - 2-3x overpriced

The ugly
- Supermicro GPU server solutions. This server hardware is a bit overpriced, but its biggest problem is old processor sockets
- Custom shop built machines (with water) - very nice, but (except for water) you just pay US$5 - 10 - 15k for work you can do yourself in one day
- 2 CPU professional level motherboards - very cool, but powerful Intel Xeons are also very overpriced

The good
- Powerful AMD processor with 12-32 cores + top tier motherboard. This will support 4 GPUs on x8 speed and have a 10 Gb/s ethernet port
- Just add more servers with 10 Gb/s connection and probably later connect them into a ring ... cheap / powerful / easy to maintain

More democratization soon?

Probably the following technologies will untie our hands

- Single slot GPUs - Zotac clearly thought about it, maybe it will become mainstream in the professional market
- PCIE 4.0 => enough speed for ML even on cheaper motherboards
- New motherboards for AMD processors => maybe more PCIE slots will become normal
- Intel optane persistent memory => slow and expensive now, maybe RAM / SSD will merge (imagine having 2 TB of cheap RAM on your box)
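The point about x8 slots and PCIe 4.0 can be put in rough numbers. This is a back-of-the-envelope sketch using the nominal per-lane rates (8 GT/s for PCIe 3.0, 16 GT/s for PCIe 4.0, both with 128b/130b encoding). It ignores protocol overhead, so real throughput is lower, and it assumes the x8 slots mentioned above are PCIe 3.0.

```python
# Nominal per-lane transfer rates in GT/s.
GT_PER_S = {"PCIe 3.0": 8.0, "PCIe 4.0": 16.0}
ENCODING_EFFICIENCY = 128 / 130  # 128b/130b line encoding

def bandwidth_gb_per_s(generation, lanes):
    """Theoretical one-direction bandwidth in GB/s, protocol overhead ignored."""
    return GT_PER_S[generation] * ENCODING_EFFICIENCY / 8 * lanes

for generation, lanes in [("PCIe 3.0", 8), ("PCIe 3.0", 16), ("PCIe 4.0", 8)]:
    print(f"{generation} x{lanes}: {bandwidth_gb_per_s(generation, lanes):.2f} GB/s")
```

Under these assumptions a PCIe 4.0 x8 slot offers the same theoretical bandwidth as a PCIe 3.0 x16 slot, which is the sense in which cheaper boards with fewer lanes per GPU could become sufficient.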

Good chat in ODS on same topic.

#hardware

ZOTAC’s GeForce RTX 2080 Ti ArcticStorm: A Single-Slot Water Cooled GeForce RTX 2080 Ti

Ultra-high-end graphics cards these days all seem to either come with a very large triple fan cooler, or more exotically, a hybrid cooling system based around a large heatsink with fans and a liquid cooling block. Naturally, these cards use two or more slots…

378 views, 11:22

Data Science by ODS.ai 🦜

Forwarded from Spark in me (Alexander)

Open STT v1.0 release

Finally we released open STT v1.0 =)

Highlights

- 20 000 hours of annotated data
- 2 new large and diverse domains
- 12k speakers (to be released soon)
- Overall quality improvement
- See below posts and releases for more details

+---------------+------+--------+------+
| Domain        | Utts | Hours  | GB   |
+---------------+------+--------+------+
| Radio         | 8.3M | 11,996 | 1367 |
+---------------+------+--------+------+
| Public Speech | 1.7M | 2,709  | 301  |
+---------------+------+--------+------+
| Youtube       | 2.6M | 2,117  | 346  |
+---------------+------+--------+------+
| Books         | 1.3M | 1,632  | 180  |
+---------------+------+--------+------+
| Calls         | 695K | 819    | 91   |
+---------------+------+--------+------+
| Other         | 1.9M | 835    | 95   |
+---------------+------+--------+------+
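A quick sanity check that the per-domain figures add up to the headline number of roughly 20,000 hours. The values are copied from the table; the variable names are illustrative.

```python
# (domain, hours, size in GB) as listed in the table
rows = [
    ("Radio", 11996, 1367),
    ("Public Speech", 2709, 301),
    ("Youtube", 2117, 346),
    ("Books", 1632, 180),
    ("Calls", 819, 91),
    ("Other", 835, 95),
]

total_hours = sum(hours for _, hours, _ in rows)
total_gb = sum(size_gb for _, _, size_gb in rows)
print(total_hours, total_gb)  # 20108 2380
```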

How can I help?
- Share our dataset
- Share / publish your dataset - the more domains the better
- Upvote on habr
- Upvote on TDS (when released)
- We have an Open Collective page for donations

Links
- Open STT https://github.com/snakers4/open_stt
- Release https://github.com/snakers4/open_stt/releases
- Open TTS https://github.com/snakers4/open_tts
- Habr https://habr.com/ru/post/474462/
- Towards Data Science (coming soon)
- Blog https://spark-in.me/post/open-stt-release-v10
- Open collective https://opencollective.com/open_stt

302 views, 20:58