Data Science by ODS.ai 🦜
First Telegram Data Science channel. Covering all the technical and popular stuff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of the former. To reach editors contact: @malev
ๅŠ ๅ…ฅ้ข‘้“
Building Automated Feature Rollouts on Robust Regression Analysis

A nice article on an important topic: statistical analysis for hypothesis testing. Every new feature, or change to an existing one, is essentially an experiment. The article covers how the #Uber team handles this in a live system.

Link: https://eng.uber.com/autonomous-rollouts-regression-analysis/

#Uber #statistics #production #truestory
Valuing Life as an Asset, as a Statistic and at Gunpoint

Ever wondered how much your life is worth? This is an article about valuing life as an asset. It is extremely useful for insurance companies and as a metric for calculating compensation after tragic events, but it is also a key to understanding how valuable (or not) life is.

Math is beautiful.

Link: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3156911

#math #life #insurance #statistics
IQ is largely a pseudoscientific swindle

A note by Nassim Taleb on how IQ works. He argues that high IQ is not well correlated with wealth or overall cognitive performance.

Link: https://medium.com/incerto/iq-is-largely-a-pseudoscientific-swindle-f131c101ba39

#statistics #iq #fallacy
Fair Regression for Health Care Spending

What happens if fairness is built into the objective function for continuous outcomes? The authors report large improvements in group undercompensation.

This is the most interesting & potentially impactful analysis of fairness in #ML for #healthcare, which can lead to significant improvement in the life of millions.

arXiv: https://arxiv.org/abs/1901.10566
GitHub: https://github.com/zinka88/Fair-Regression

#statistics #regression
Why Financial Planning is Exciting… At Least for a Data Scientist

A great introduction to the finance world and to what a data scientist may be missing when diving into the topic.

Link: https://eng.uber.com/financial-planning-for-data-scientist/

#Financial #statistics #Uber
Analyzing Experiment Outcomes: Beyond Average Treatment Effects

A good #statistics article on why the tails of a distribution and #experimentdesign matter. Quantile treatment effects (QTEs) help capture the inherent heterogeneity in treatment effects when riders and drivers interact within the #Uber marketplace.

Link: https://eng.uber.com/analyzing-experiment-outcomes/
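To make the idea concrete, here is a minimal sketch of quantile treatment effects on made-up numbers. It is not Uber's implementation: the data, the function name `quantile_treatment_effects` and the choice of deciles are all assumptions for illustration. The effect at each quantile is simply the treatment quantile minus the control quantile.

```python
import statistics


def quantile_treatment_effects(control, treatment, n_quantiles=10):
    """Treatment quantile minus control quantile at each interior cut point."""
    q_control = statistics.quantiles(control, n=n_quantiles, method="inclusive")
    q_treatment = statistics.quantiles(treatment, n=n_quantiles, method="inclusive")
    return [t - c for c, t in zip(q_control, q_treatment)]


# Made-up data: e.g. wait times of 1..100 minutes in the control group.
control = [float(x) for x in range(1, 101)]
# Hypothetical treatment that only shortens waits longer than 80 minutes.
treatment = [x if x <= 80 else 80 + (x - 80) * 0.5 for x in control]

ate = statistics.fmean(treatment) - statistics.fmean(control)
qte = quantile_treatment_effects(control, treatment)

print(f"average treatment effect: {ate:.2f}")
for decile, effect in enumerate(qte, start=1):
    print(f"QTE at {decile * 10}th percentile: {effect:.2f}")
```

In this toy setup the average effect looks small, while the per-decile view shows that the change is concentrated in the upper tail and is zero for most of the distribution.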
Pseudo-extended Markov chain Monte Carlo

Pseudo-Extended #MC for easier sampling from multimodal posteriors. Extend the target distribution and then run your favourite sampler (e.g. #HMC).

arXiv: https://arxiv.org/abs/1708.05239

#statistics
Important article in Nature about statistical significance

Scientists rise up against statistical significance: a call to move away from the widespread use and quoting of statistical significance toward confidence intervals.

Link: https://www.nature.com/articles/d41586-019-00857-9

#statistics #statsignificance #nature #science
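As a rough illustration of reporting an interval instead of a significant / not significant verdict, here is a small sketch. The two samples are made up, `mean_diff_ci` is a hypothetical helper, and the interval uses a normal approximation, which is crude for samples this small.

```python
import math
import statistics


def mean_diff_ci(a, b, confidence=0.95):
    """Normal-approximation confidence interval for mean(b) - mean(a)."""
    diff = statistics.fmean(b) - statistics.fmean(a)
    # Standard error of the difference, with unequal variances allowed.
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return diff, (diff - z * se, diff + z * se)


# Made-up measurements for a control group and a treated group.
control_group = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 4.7, 5.3]
treated_group = [5.2, 5.4, 5.1, 5.6, 5.3, 5.5, 5.0, 5.7]

diff, (low, high) = mean_diff_ci(control_group, treated_group)
print(f"difference in means: {diff:.2f}, 95% CI: [{low:.2f}, {high:.2f}]")
```

The interval shows both the size of the effect and how uncertain it is, which a bare p-value threshold hides.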
Ranking Items With Star Ratings and How Not To Sort By Average Rating

Two absolute must-read articles on sorting items properly. Sorting items by average score alone is wrong, because an average ignores how many ratings support it, and there is a good classic statistical explanation of why.

Link: https://www.evanmiller.org/ranking-items-with-star-ratings.html
Link2: https://www.evanmiller.org/how-not-to-sort-by-average-rating.html

#Statistics #rating #scoring #ranking
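A minimal sketch of the up/down-vote case: rank by the lower bound of the Wilson score interval rather than by the raw share of positive votes. The item names and vote counts below are made up, and `wilson_lower_bound` is a hypothetical helper name; z = 1.96 corresponds to roughly 95% confidence.

```python
import math


def wilson_lower_bound(positive, total, z=1.96):
    """Lower bound of the Wilson score interval for a share of positive votes."""
    if total == 0:
        return 0.0
    phat = positive / total
    denominator = 1 + z * z / total
    centre = phat + z * z / (2 * total)
    margin = z * math.sqrt((phat * (1 - phat) + z * z / (4 * total)) / total)
    return (centre - margin) / denominator


# Made-up items: (name, positive votes, total votes).
items = [("two perfect votes", 2, 2), ("90 of 100 positive", 90, 100)]

by_average = sorted(items, key=lambda item: item[1] / item[2], reverse=True)
by_wilson = sorted(items, key=lambda item: wilson_lower_bound(item[1], item[2]), reverse=True)

print("by average:", [name for name, _, _ in by_average])
print("by Wilson lower bound:", [name for name, _, _ in by_wilson])
```

The raw average puts the item with two votes on top, while the Wilson bound rewards the item whose high score is backed by far more evidence.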
📚 Guest post on a great example of book abandonment at GoodReads

An excellent new article from Gwern analyzing abandoned (hard to finish, hard to read) books on Goodreads. The write-up includes step-by-step instructions with source code, including how he parsed the data from the website without an API.

It's a shame that analysis like this does not come from an online book subscription service like Bookmate or MyBook. They have vastly superior datasets and many able data scientists. I am quite sure the Amazon Kindle team prepares internal reports like that for some evil business purposes, but that's a whole different story.

During my time at the video game database company RAWG.io, we compiled 'most abandoned' and 'most addictive' reports for video games.

Do you make a popular service with valuable user behavior data? Funny data analysis reports are a good way to get some attention to your product. Take a lead from Pornhub, they are great at publicizing their data.

Link: https://www.gwern.net/GoodReads
Pornhub Insights: https://www.pornhub.com/insights/

--
This is a guest post by Samat Galimov, who writes about technology, programming and management in Russian on @ctodaily.


#DataAnalysis #GoodReads #statistics #greatstats #talkingnumbers