Data Science by ODS.ai 🦜
The first Telegram Data Science channel. Covers all technical and popular stuff related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math, and the applications of the former. To reach the editors, contact @malev
CACTUs: an unsupervised meta-learning algorithm that learns to learn from tasks constructed out of unlabeled data. It leads to significantly more effective downstream learning and enables few-shot learning *without* labeled meta-learning datasets.

ArXiv: https://arxiv.org/abs/1810.02334

#cactus #unsupervised
Hitchhiker’s guide to Exploratory Data Analysis

Exploratory Data Analysis is the stage where you find out the distribution of the data, its volume, the number of missing values, and the other characteristics of the available dataset.
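
A minimal sketch of what such a first pass might look like, using only the Python standard library. The toy `rows` dataset and the `summarize` helper are assumptions made up for illustration; they are not taken from the linked articles, which may use different tools.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical toy dataset: each row is a dict, None marks a missing value.
rows = [
    {"age": 34, "city": "Berlin"},
    {"age": None, "city": "Paris"},
    {"age": 29, "city": "Berlin"},
    {"age": 41, "city": None},
]

def summarize(rows):
    """Per column: row count, missing count, and a simple distribution summary."""
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        present = [v for v in values if v is not None]
        info = {"rows": len(values), "missing": len(values) - len(present)}
        if present and all(isinstance(v, (int, float)) for v in present):
            # Numeric column: range and central tendency.
            info.update(min=min(present), max=max(present),
                        mean=mean(present), median=median(present))
        else:
            # Categorical column: most frequent values.
            info["top_values"] = Counter(present).most_common(3)
        report[col] = info
    return report

report = summarize(rows)
for col, info in report.items():
    print(col, info)
```

On a real dataset the same questions (how many rows, how many gaps, what the values look like) are usually answered with a dataframe library, but the checks themselves stay the same.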

Part 1: https://towardsdatascience.com/hitchhikers-guide-to-exploratory-data-analysis-6e8d896d3f7e
Part 2: https://towardsdatascience.com/hitchhikers-guide-to-exploratory-data-analysis-part-2-36ab72201e1d

#ExploratoryDA #novice #entrylevel
The Code for Facial Identity in the Primate Brain

This paper shows that facial images can be reconstructed with a simple linear model from the responses of only ~200 visual neurons recorded in a monkey. The approach relies on "face cells", which encode how much a face differs from the average along particular axes ("eigenface dimensions").

https://www.sciencedirect.com/science/article/pii/S009286741730538X

#cv #dl
A great example of how different approaches to feature encoding can influence the results.

Mean (likelihood) encoding for categorical variables with high cardinality and feature interactions: a comprehensive study with Python
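
For readers new to the idea, here is a minimal sketch of mean (target) encoding with additive smoothing toward the global mean. The function names, the `smoothing` parameter, and the toy data are assumptions for illustration; the linked study may use a different regularization scheme.

```python
from collections import defaultdict

def mean_encode(categories, targets, smoothing=1.0):
    """Map each category to a smoothed mean of the target.

    smoothing acts like a number of pseudo-observations at the global mean,
    so rare categories are pulled toward the overall average.
    """
    global_mean = sum(targets) / len(targets)
    sums = defaultdict(float)
    counts = defaultdict(int)
    for cat, y in zip(categories, targets):
        sums[cat] += y
        counts[cat] += 1
    encoding = {
        cat: (sums[cat] + smoothing * global_mean) / (counts[cat] + smoothing)
        for cat in counts
    }
    return encoding, global_mean

def transform(values, encoding, default):
    """Replace categories with their encoding; unseen categories get the default."""
    return [encoding.get(v, default) for v in values]

# Hypothetical toy data: a categorical feature and a binary target.
categories = ["a", "a", "b", "b", "b", "c"]
targets = [1, 0, 1, 1, 1, 0]

raw, prior = mean_encode(categories, targets, smoothing=0.0)
smoothed, _ = mean_encode(categories, targets, smoothing=1.0)
print(raw)
print(smoothed)
print(transform(["a", "z"], smoothed, prior))
```

Because the encoding is computed from the target, it should be fitted on training folds only and then applied to validation data; otherwise the target leaks into the feature.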

Link: https://www.kaggle.com/vprokopev/mean-likelihood-encodings-a-comprehensive-study

#FeatureEngineering #FeatureEncoding #Kaggle
Neural Network Embeddings Explained

How deep learning can represent War and Peace as a vector

An easy-to-read #novice article about #embeddings. Basically, it shows how to represent anything as a vector.
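
A small sketch of why vectors are useful: once items are embedded, "similar" becomes a geometric question. The three-dimensional vectors below are invented for illustration (real embeddings are learned by a network and have many more dimensions), and `nearest` is a hypothetical helper.

```python
import math

# Hypothetical, hand-made embeddings; real ones are learned from data.
embeddings = {
    "War and Peace": [0.9, 0.1, 0.3],
    "Anna Karenina": [0.8, 0.2, 0.4],
    "A Brief History of Time": [0.1, 0.9, 0.2],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest(title):
    """Return the other item whose embedding is most similar to the given one."""
    query = embeddings[title]
    scored = [
        (cosine_similarity(query, vec), name)
        for name, vec in embeddings.items()
        if name != title
    ]
    return max(scored)[1]

print(nearest("War and Peace"))
```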

Link: https://towardsdatascience.com/neural-network-embeddings-explained-4d028e6f0526
Test-Driven Data Analysis

TDD (test-driven development) is an approach to software development that treats tests as an essential part of the process. Over the years, TDD has proven necessary for maintaining a good code base and has become one of the most common requirements for a long-lived project.

A test-driven approach can be applied to data analysis too, through reproducible research or TDDA (test-driven data analysis), which the link below describes.
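
One way to picture testing data rather than code: infer simple constraints from a reference dataset, then verify that new data still satisfies them. The sketch below is plain Python written for illustration; `discover_constraints` and `verify` are hypothetical helpers and are not the API of the tdda library.

```python
def discover_constraints(rows):
    """Infer per-field constraints (type, min, max, nullability) from reference rows."""
    constraints = {}
    fields = sorted({key for row in rows for key in row})
    for field in fields:
        values = [row.get(field) for row in rows]
        present = [v for v in values if v is not None]
        rule = {"allow_null": len(present) < len(values)}
        types = {type(v) for v in present}
        if len(types) == 1:
            rule["type"] = types.pop()
            if rule["type"] in (int, float):
                rule["min"], rule["max"] = min(present), max(present)
        constraints[field] = rule
    return constraints

def verify(rows, constraints):
    """Return a list of violations; an empty list means every check passed."""
    failures = []
    for i, row in enumerate(rows):
        for field, rule in constraints.items():
            value = row.get(field)
            if value is None:
                if not rule["allow_null"]:
                    failures.append(f"row {i}: {field} is missing")
                continue
            if "type" in rule and not isinstance(value, rule["type"]):
                failures.append(f"row {i}: {field} has type {type(value).__name__}")
                continue
            if "min" in rule and not (rule["min"] <= value <= rule["max"]):
                failures.append(
                    f"row {i}: {field}={value} outside [{rule['min']}, {rule['max']}]"
                )
    return failures

# Hypothetical reference data and a new batch to check against it.
reference = [{"age": 34, "name": "Ann"}, {"age": 29, "name": "Bob"}]
new_batch = [{"age": 31, "name": "Cy"}, {"age": 150, "name": None}]

constraints = discover_constraints(reference)
failures = verify(new_batch, constraints)
for failure in failures:
    print(failure)
```

Constraints discovered this way are only as good as the reference data, so in practice they would be reviewed and loosened or tightened by hand before being used as tests.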

Link: http://www.tdda.info

#tdda
The Artificial Intelligence Clinician learns optimal treatment strategies for sepsis in intensive care

Interesting work looking at how AI could suggest optimal treatment for sepsis. Sepsis is a life-threatening complication of infection, and many deaths could be prevented with earlier identification and more targeted therapies.

Link: https://www.nature.com/articles/s41591-018-0213-5

#medical #health
Dynamic Meta-Embeddings for Improved Sentence Representations

While one of the first steps in many NLP systems is selecting what pre-trained word embeddings to use, we argue that such a step is better left for neural networks to figure out by themselves. To that end, we introduce dynamic meta-embeddings, a simple yet effective method for the supervised learning of embedding ensembles, which leads to state-of-the-art performance within the same model class on a variety of tasks. We subsequently show how the technique can be used to shed new light on the usage of word embeddings in NLP systems.
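
A rough sketch of the core idea as I understand it: several embeddings of the same word are combined with attention weights, so the network decides how much to trust each source. This simplified version assumes the embeddings are already projected to a common dimension, and the scoring parameters `a` and `b` are fixed by hand here, whereas in the paper they would be learned together with the rest of the model.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_meta_embedding(projected, a, b):
    """Attention-weighted sum of same-dimension embeddings of one word.

    projected: list of vectors, one per embedding source (assumed pre-projected).
    a, b: scoring parameters (hypothetical fixed values; learned in practice).
    """
    scores = [sum(ai * xi for ai, xi in zip(a, vec)) + b for vec in projected]
    weights = softmax(scores)
    dim = len(projected[0])
    combined = [
        sum(w * vec[d] for w, vec in zip(weights, projected))
        for d in range(dim)
    ]
    return combined, weights

# Two hypothetical embedding sources for the same word.
projected = [[1.0, 0.0], [0.0, 1.0]]

combined, weights = dynamic_meta_embedding(projected, a=[0.0, 0.0], b=0.0)
print(combined, weights)

skewed, skewed_weights = dynamic_meta_embedding(projected, a=[2.0, 0.0], b=0.0)
print(skewed, skewed_weights)
```

With a zero scoring vector the sources are averaged equally; a non-zero one shifts weight toward the source it scores higher.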

Paper: https://research.fb.com/wp-content/uploads/2018/10/Dynamic-Meta-Embeddings-for-Improved-Sentence-Representations.pdf
Link: https://research.fb.com/publications/dynamic-meta-embeddings-for-improved-sentence-representations/

P.S. Note the date of the publication

#embeddings #NLP #facebook
Really scary