Data Science by ODS.ai 🦜
First Telegram Data Science channel. Covering all technical and popular stuff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of the former. To reach the editors, contact: @malev
News on the new MacBook Pro 13:

* Apple M1 chip with built-in stuff for ML - but you won't build models on a laptop anyway
* Max 16 GB RAM - so you won't be able to open more tabs in Chrome / Safari
* 100% recycled aluminum - good for nature
* Improved microphones and camera - colleagues will see a better picture of you and hear your cats meowing more clearly

And still no reason to upgrade if you are doing any DS.

#Apple
Benford's Law, DS and the 2020 Election

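This law can be used as a very basic check on whether data was artificially generated or not. It states that in many naturally occurring datasets, lower leading digits have a higher probability of occurring: about 30% of values start with 1, while fewer than 5% start with 9.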
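
A minimal sketch of such a check, assuming the usual statement of the law, P(d) = log10(1 + 1/d) for leading digit d. The function names and the toy data (powers of 2) are illustrative only and are not taken from the tutorial or the repo:

```python
import math
from collections import Counter

def benford_expected(d):
    """Benford's Law probability of leading digit d (1-9)."""
    return math.log10(1 + 1 / d)

def leading_digit(n):
    """First significant digit of a non-zero integer."""
    return int(str(abs(int(n)))[0])

def leading_digit_shares(values):
    """Observed share of each leading digit among the non-zero values."""
    digits = [leading_digit(v) for v in values if int(v) != 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}

# Toy data (an assumption for illustration): powers of 2,
# a sequence known to follow Benford's Law closely.
sample = [2 ** k for k in range(1, 201)]
observed = leading_digit_shares(sample)

# Print digit, observed share, expected share side by side.
for d in range(1, 10):
    print(d, round(observed[d], 3), round(benford_expected(d), 3))
```

Keep in mind that the law fits best when values span several orders of magnitude, so a deviation on its own is a reason to look closer, not proof of manipulation.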

And there is hardly a better way to promote the #reproducibleresearch concept than #openresearch on poll data, because it shows that such data can and should be transparent and open.

With the help of the repo below, anyone can check whether poll results comply with #BenfordsLaw using unofficial data (or official data, if you are able to get it).

KDnuggets tutorial: https://www.kdnuggets.com/2020/09/diy-election-fraud-analysis-benfords-law.html
GitHub repo with examples on unofficial US election data: https://github.com/cjph8914/2020_benfords

#statistics
Three-dimensional residual channel attention networks denoise and sharpen fluorescence microscopy image volumes

#3DRCAN for denoising, super resolution and expansion microscopy.

GitHub: https://github.com/AiviaCommunity/3D-RCAN
bioRxiv: https://www.biorxiv.org/content/10.1101/2020.08.27.270439v1

#biolearning #cv #dl
Tutorial on Generative Adversarial Networks (GANs) with Keras and TensorFlow

Nice tutorial with enough theory to understand what you are doing and code to get it done.

Link: https://www.pyimagesearch.com/2020/11/16/gans-with-keras-and-tensorflow/

#Keras #TensorFlow #tutorial #wheretostart #GAN
DeepMind significantly (+100%) improved protein folding modelling

Why this is important: protein folding = protein structure = protein function = how a protein works in a living organism and what it does.
What this means: better vaccines, better meds, more curable diseases and more conditions eased by medication or better understanding.

Dataset: ~170000 available protein structures from PDB
Hardware: 128 TPUv3 cores (roughly equivalent to ~100-200 GPUs)

Link: https://deepmind.com/blog/article/alphafold-a-solution-to-a-50-year-old-grand-challenge-in-biology

#DL #NLU #proteinmodelling #bio #biolearning #insilico #deepmind #AlphaFold
Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes

This technology lets you move the camera slightly in any video, slow down time, or do both. A great application for video producers and motion designers.

Website: http://www.cs.cornell.edu/~zl548/NSFF/
arXiv: https://arxiv.org/abs/2011.13084
YouTube: https://youtu.be/qsMIH7gYRCc

#Nerf #videointerpolation #DL
👩‍🎓 Online lectures on Special Topics in AI: Deep Learning

A fresh, free and open playlist on special topics in #DL from the University of Wisconsin-Madison. Topics cover reliable deep learning, generalization, learning with less supervision, lifelong learning, deep generative models and more.

Overview Lecture: https://www.youtube.com/watch?v=6LSErxKe634&list=PLKvO2FVLnI9SYLe1umkXsOfIWmEez04Ii
YouTube Playlist: https://www.youtube.com/playlist?list=PLKvO2FVLnI9SYLe1umkXsOfIWmEez04Ii
Syllabus: http://pages.cs.wisc.edu/~sharonli/courses/cs839_fall2020/schedule.html

#wheretostart #lectures #YouTube
Opinion: Remote jobs are here to stay even after the pandemic

As Packy McCormick (author of the Not Boring blog) writes in his recent post: we are never going back. The pandemic has catalyzed the global switch to remote jobs and their acceptance (including consideration of stock options for remote workers).

We as the @opendatascience community admit that these are only opinions and that the future is more complex and unpredictable. Still, we are posting a list of remote job aggregators so every reader can explore those opportunities if needed.

Blog entry: https://notboring.substack.com/p/were-never-going-back
Audio version: https://open.spotify.com/show/6k1YLBvORRMyosKy3x1xIl?si=_Z7mdecqTSSYrwhHexwGEA

Telegram bots:

@sixnomads_bot - a bot that connects you with relevant remote and full-time jobs that fit your tech stack, desired time zone and salary.
@remotejobss - a channel that posts new remote opportunities daily.
@datasciencejobseeker - a chat for jobs in data science.
@remotejobpositions - a channel with interesting remote jobs for developers.

Here are some websites that might be a good alternative for you:

remoteok.io - a colorful job board with remote jobs in tech companies (from the creators of Nomadlist).
weworkremotely.com - another job board with new remote opportunities updated every day.
remote.co - a hub to learn new tips on working remotely and find your new remote job.

#hr #career #job #remote
Tool for restoration of pixelated images

The tool uses a De Bruijn sequence to restore the original information.
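
For readers unfamiliar with the term: a De Bruijn sequence over an alphabet of size k with window length n contains every possible length-n string exactly once when read cyclically. Below is a minimal sketch of the classic Lyndon-word construction. It only illustrates the concept and is not taken from the Depix code; the function name `de_bruijn` is illustrative.

```python
def de_bruijn(alphabet, n):
    """Build a cyclic De Bruijn sequence over `alphabet` for windows of length n."""
    k = len(alphabet)
    a = [0] * (k * n)   # working buffer of symbol indices
    sequence = []       # collected symbol indices of the result

    def db(t, p):
        if t > n:
            # Append the current word only if its period p divides n.
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in sequence)

# Every 3-character string over {a, b} appears exactly once in the cyclic sequence.
s = de_bruijn("ab", 3)
windows = [(s + s[:2])[i:i + 3] for i in range(len(s))]
print(s, sorted(windows))
```

Such a sequence packs all character combinations into the shortest possible string, which is handy when you need reference material covering every combination.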

GitHub: https://github.com/beurtschipper/Depix

#pixelization #github
MPG: A Multi-ingredient Pizza Image Generator with Conditional StyleGANs

Work on conditional image generation

arXiv: https://arxiv.org/abs/2012.02821

#GAN #DL #food2vec
Yandex Team Talk at NeurIPS. The talk will be most interesting for those who work on critical aspects of successful data collection and labeling.

The moderation team will focus on:
- Remoteness. A discussion about the effectiveness and efficiency of remote work on crowdsourcing platforms.
- Fairness. How the working environment (e.g., a crowdsourcing platform) may help give performers flexibility in choosing/switching tasks and working hours.
- Mechanisms. A discussion of bilateral mechanisms that not only provide flexibility to the performers, but also guarantee the quality of the result and the efficiency of the process to the customers.

Toloka's workshop info: https://clck.ru/SNwi3

#NeurIPS2020 #labeling #Yandex
Supporting content decision makers with machine learning

#Netflix shared a post on how they research and prepare data to support decisions about producing new titles.

Link: https://netflixtechblog.com/supporting-content-decision-makers-with-machine-learning-995b7b76006f

#NLU #NLP #recommendation #embeddings
🔥 Everything You Always Wanted To Know About GitHub (But Were Afraid To Ask)

The ClickHouse team published extensive statistics on GitHub, including but not limited to the distribution of repositories by star count, top repositories by stars, affinity list, top labels etc.

All the data is available for download, with instructions for importing it into ClickHouse.

Link: https://gh.clickhouse.tech/explorer/

#GitHub #ClickHouse #Yandex #statistics #EDA #engineerketing
Do you need any more proof that GitHub is the best social network ever?