Data Science by ODS.ai 🦜
46.1K subscribers
663 photos
77 videos
7 files
1.75K links
First Telegram Data Science channel. Covering all the technical and popular stuff related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of the former. To reach the editors, contact: @malev
Looks like #google has started fighting automatic #captcha recognition systems by adding noise to the images.
Don't forget to give the author claps!
Large-Scale Multilingual Speech Recognition with a Streaming End-to-End Model

High-quality #speechrecognition systems require large amounts of data—yet many languages have little data available. Check out new research into an end-to-end system trained as a single model allowing for real-time multilingual speech recognition.

Link: https://ai.googleblog.com/2019/09/large-scale-multilingual-speech.html

#speech #audio #DL #Google
Library for Scikit-learn parallelization

Operations such as grid search, random forests, and others that use the n_jobs parameter in Scikit-Learn can automatically hand off parallelism to a Dask cluster.

Link: https://ml.dask.org/joblib.html

#ML #scikitlearn
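
A minimal sketch of the hand-off described above, assuming dask[distributed], joblib and scikit-learn are installed. The toy dataset, the parameter grid, the function name fit_grid_search_on_dask and the in-process Client(processes=False) are illustrative assumptions, not taken from the linked page; a real setup would point the client at an existing cluster.

```python
import joblib
from dask.distributed import Client
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV


def fit_grid_search_on_dask():
    """Run a small grid search with joblib's work routed to Dask."""
    # Toy data and grid, chosen only to keep the example fast (assumption).
    X, y = make_classification(n_samples=200, n_features=10, random_state=0)
    param_grid = {"n_estimators": [10, 20], "max_depth": [2, 4]}
    search = GridSearchCV(
        RandomForestClassifier(random_state=0),
        param_grid,
        cv=3,
        n_jobs=-1,  # the parallel work that gets handed to the backend
    )
    # Threads-only local cluster for the demo (assumption); pass a
    # scheduler address to Client(...) to use a real cluster instead.
    with Client(processes=False):
        # Inside this block joblib dispatches its tasks to Dask workers.
        with joblib.parallel_backend("dask"):
            search.fit(X, y)
    return search


if __name__ == "__main__":
    print(fit_grid_search_on_dask().best_params_)
```

The estimator code itself does not change; only the joblib backend that executes the n_jobs work is swapped.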
🔥🔥🔥Tomorrow we will hold an AMA session with Alexey Moiseenkov, ex-founder of the #Prisma app (2016), which popularized neural networks and helped make them a commodity. He now works on the #Capture app, bringing the power of visual search to messengers in an attempt to revolutionize them as we know them.

Please send your questions through the Google form. Make sure you provide your Telegram nickname so that we can clarify question details.

It is best to ask questions related to his areas of expertise (please do your homework and read the existing interview):

1. Managing DS research
2. Product management in DS: how to control engineers, how to manage a team
3. How to build viral products
4. Fundraising for messenger / DS products
5. Recruitment questions, building an HR brand
6. How to find an idea for a startup

Interview with Alexey: https://www.businessinsider.com/prisma-labs-app-profile-interview-with-ceo-alexey-moiseenkov-2016-8
Google forms link for questions: https://forms.gle/GupBUvkyqLp6kDvi8
AMA today at 15:00 GMT (in 4 hours). In a couple of hours we will publish a link to the private chat for the AMA session.

Stay tuned and prepare your questions. Please do not ask trivial or grammatically incorrect questions like 'where to start data science'.
First of all, use search: we have nice collections of resources for starting a DS career, tagged with #wheretostart #entrylevel #novice. Secondly, show respect for our guest and ask questions relevant to his area of expertise.
Hello!

We are announcing the first-ever Munich Data Science #meetup on Oct 24th, held jointly with LMU.
Please come, grab snacks, chill with your peers, and discuss #ml magic 🙂
Evgenii +4916091541827

https://www.meetup.com/Munich-Data-Science/events/265339172/
A simple comic from #Google on how #ML works

Save the link (or this message) to show to people without a strong technical background: it is one of the best and clearest explanations out there.

Link: https://cloud.google.com/products/ai/ml-comic-1/

#wheretostart #entrylevel #novice #explainingtochildren
The ODS AMA with the ex-Prisma founder, now founder of Capture, has finished.

By request, the chat link will remain available here (at least for some time), so feel free to read it. Messaging is disabled until further AMAs.

Stats:
155 people joined the special AMA chat.
7 questions were pre-submitted through Google Form.
1 participant got banned.
ODS breakfast in Paris! See you this Saturday (12th of October) at 10:30 at Malongo Café, 50 Rue Saint-André des Arts.
ODS Frühstück Munich! Everyone is welcome, but the official language is English.

ODS breakfast in Munich! See you this Friday (11th) at 8:30 at

Schmalznudel - Cafe Frischhut
Prälat-Zistl-Straße 8, 80331 München
https://goo.gl/maps/LnX8QVpjDM6sDCNQ8

Evgenii +4916091541827
PyTorch 1.3 released

- named tensors support
- general availability of Google Cloud TPU support
- Captum: SOTA tools for understanding how specific neurons and layers affect the predictions made by a model
- CrypTen: a new research tool for secure machine learning with PyTorch
- many other improvements

Official announcement: https://pytorch.org/blog/pytorch-1-dot-3-adds-mobile-privacy-quantization-and-named-tensors/
Captum website: https://www.captum.ai
CrypTen code: https://github.com/facebookresearch/CrypTen
#DL #PyTorch #TPU #GCP #Captum #CrypTen
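
A small sketch of the named tensors feature from the release notes above, assuming PyTorch 1.3 or newer is installed. The dimension names N, C, H, W and the function name demo_named_tensors are illustrative choices; the named tensor API is marked experimental by PyTorch and may emit a warning.

```python
import torch


def demo_named_tensors():
    """Reduce and reorder dimensions by name instead of by position."""
    # A batch of 2 images with 3 channels and 4x4 pixels, dimensions labeled.
    imgs = torch.ones(2, 3, 4, 4, names=("N", "C", "H", "W"))
    # Sum over the spatial dimensions by name; the remaining names are N and C.
    per_channel = imgs.sum(["H", "W"])
    # Reorder to channels-last by listing the names in the desired order.
    nhwc = imgs.align_to("N", "H", "W", "C")
    return per_channel, nhwc


if __name__ == "__main__":
    per_channel, nhwc = demo_named_tensors()
    print(per_channel.names, nhwc.names)
```

Referring to dimensions by name avoids having to remember which positional index is the channel axis.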
DeepPrivacy: a model for making people in photos unrecognizable (to humans)


arXiv: https://arxiv.org/pdf/1909.04538.pdf

#MaskRCNN #DeepPrivacy #CV #DL
Self-supervised QA from Facebook AI

Researchers from Facebook AI published a paper exploring unsupervised extractive question answering: question-answer training data is generated without supervision and then used to train a supervised question answering model. This approach achieves 56.41 F1 on the SQuAD2 dataset.


Original paper: https://research.fb.com/wp-content/uploads/2019/07/Unsupervised-Question-Answering-by-Cloze-Translation.pdf?
Code for experiments: https://github.com/facebookresearch/UnsupervisedQA


#NLP #BERT #FacebookAI #SelfSupervised
Simple, Scalable Adaptation for Neural Machine Translation

Fine-tuning pre-trained Neural Machine Translation (NMT) models is the dominant approach for adapting to new languages and domains. However, fine-tuning requires adapting and maintaining a separate model for each target task. Researchers from Google propose a simple yet efficient approach to adaptation in #NMT: injecting tiny task-specific adapter layers into a pre-trained model. These lightweight adapters, a small fraction of the original model's size, adapt the model to multiple individual tasks simultaneously.

I guess it can be applied not only to #NMT but also to many other #NLP, #NLU and #NLG tasks.

Paper: https://arxiv.org/pdf/1909.08478.pdf

#BERT #NMT #FineTuning
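
A rough sketch of the adapter idea described above, written with NumPy for illustration. The exact structure (layer normalization, down-projection, ReLU, up-projection, residual connection), the zero initialization of the up-projection, the sizes and all names are assumptions made for this example, not the authors' implementation.

```python
import numpy as np


def layer_norm(x, eps=1e-6):
    """Normalize each vector along the last axis (no learned gain or bias)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)


class Adapter:
    """Small bottleneck block applied to the output of a frozen layer."""

    def __init__(self, d_model, d_bottleneck, rng):
        self.w_down = rng.normal(0.0, 0.02, size=(d_model, d_bottleneck))
        # Zero init (assumption): the adapter starts out as the identity,
        # so the pre-trained model's behavior is unchanged before training.
        self.w_up = np.zeros((d_bottleneck, d_model))

    def __call__(self, h):
        z = np.maximum(layer_norm(h) @ self.w_down, 0.0)  # down-project + ReLU
        return h + z @ self.w_up  # up-project and add the residual

    def num_params(self):
        return self.w_down.size + self.w_up.size


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # One adapter per task; the pre-trained weights would be shared and frozen.
    adapters = {task: Adapter(512, 64, rng) for task in ("en-de", "en-fr")}
    hidden = rng.normal(size=(10, 512))  # stand-in for a frozen layer's output
    print(adapters["en-de"](hidden).shape, adapters["en-de"].num_params())
```

Only the adapter weights would be trained per task, which is why a single pre-trained model can serve several tasks at once.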
Communication-based evaluation for Natural Language Generation (#NLG) that dramatically outperforms standard n-gram-based methods.

Have you ever thought that n-gram overlap measures like #BLEU or #ROUGE are not good enough for #NLG evaluation, and that human-based evaluation is too expensive? Researchers from Stanford University think so too. The main shortcoming of #BLEU and #ROUGE is that they fail to take into account the communicative function of language; a speaker's goal is not only to produce well-formed expressions, but also to convey relevant information to a listener.

The researchers propose an approach based on a color reference game. In this game, a speaker and a listener see a set of three colors. The speaker is told that one color is the target and tries to communicate the target to the listener using a natural language utterance. A good utterance is more likely to lead the listener to select the target, while a bad utterance is less likely to do so. In turn, effective metrics should assign high scores to good utterances and low scores to bad ones.

Paper: https://arxiv.org/pdf/1909.07290.pdf
Code: https://github.com/bnewm0609/comm-eval

#NLP #NLU
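
A toy illustration of the contrast described above between n-gram overlap and communication-based scoring. The keyword lexicon, the rule-based listener and all function names are hypothetical stand-ins invented for this example; the paper relies on learned models, not on this lookup.

```python
from collections import Counter

# Hypothetical keyword lexicon standing in for a learned listener model.
LEXICON = {
    "dark_blue": {"dark", "blue", "navy"},
    "light_blue": {"light", "blue", "sky"},
    "red": {"red"},
}


def unigram_precision(candidate, reference):
    """Share of candidate tokens that also appear in the reference."""
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())
    matched = sum(min(n, ref_counts[tok]) for tok, n in cand_counts.items())
    total = sum(cand_counts.values())
    return matched / total if total else 0.0


def toy_listener(utterance, context):
    """Pick the color in the context whose keywords best match the utterance."""
    words = set(utterance.lower().split())
    scores = {color: len(words & LEXICON[color]) for color in context}
    return max(scores, key=scores.get)


def communication_score(utterance, context, target):
    """1.0 if the listener recovers the target color, else 0.0."""
    return 1.0 if toy_listener(utterance, context) == target else 0.0


if __name__ == "__main__":
    context = ["dark_blue", "light_blue", "red"]
    reference = "the dark blue one"
    for utterance in ("the light blue one", "navy"):
        print(
            utterance,
            unigram_precision(utterance, reference),
            communication_score(utterance, context, "dark_blue"),
        )
```

Here "the light blue one" overlaps heavily with the reference but sends the listener to the wrong color, while "navy" shares no words with the reference yet communicates the target.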