Data Science by ODS.ai 🦜
First Telegram Data Science channel. Covering all technical and popular stuff about anything related to Data Science: AI, Big Data, Machine Learning, Statistics, general Math and the applications of the former. To reach the editors, contact: @malev
Using ‘radioactive data’ to detect if a data set was used for training

The authors have developed a new technique to mark the images in a data set so that researchers can determine whether a particular machine learning model has been trained using those images. This can help researchers and engineers to keep track of which data set was used to train a model so they can better understand how various data sets affect the performance of different neural networks.

The key points:
- the marks are harmless and have no impact on the classification accuracy of models, but are detectable with high confidence in a neural network;
- the image features are moved in a particular direction (the carrier) that has been sampled randomly and independently of the data;
- after a model is trained on such data, its classifier aligns with the direction of the carrier (see the sketch below);
- the method works in such a way that it is difficult both to detect whether a data set is radioactive and to remove the marks from the trained model.
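
A minimal illustration of the detection side (not the authors' full method, which also propagates the mark back into pixel space): mark features by nudging them along a random carrier, then test whether a classifier's weight vector is unusually aligned with that carrier. All names, dimensions and thresholds here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

feature_dim = 512  # assumed dimensionality of the marking network's features

# The carrier: a random unit vector, sampled independently of the data.
carrier = rng.normal(size=feature_dim)
carrier /= np.linalg.norm(carrier)

def mark_features(features: np.ndarray, strength: float = 0.1) -> np.ndarray:
    """Shift image features a small step along the carrier direction."""
    return features + strength * carrier

def carrier_alignment(classifier_weights: np.ndarray) -> float:
    """Cosine similarity between a class weight vector and the carrier.
    For a model trained on marked data this is expected to be unusually high."""
    w = classifier_weights / np.linalg.norm(classifier_weights)
    return float(w @ carrier)

# Mark a batch of (synthetic) features.
marked = mark_features(rng.normal(size=(4, feature_dim)))

# Null distribution: alignment of random directions with the carrier,
# used to pick a detection threshold (e.g. p < 0.001).
null = [carrier_alignment(rng.normal(size=feature_dim)) for _ in range(10_000)]
threshold = np.quantile(null, 0.999)
print(f"detection threshold on cosine similarity: {threshold:.4f}")
```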

blogpost: https://ai.facebook.com/blog/using-radioactive-data-to-detect-if-a-data-set-was-used-for-training/
paper: https://arxiv.org/abs/2002.00937

#cv #cnn #datavalidation #image #data
Are Pre-trained Convolutions Better than Pre-trained Transformers?

In this paper, authors from Google Research investigate whether CNN architectures can be competitive with Transformers on NLP problems. It turns out that pre-trained CNN models outperform pre-trained Transformers on some tasks; they also train faster and scale better to longer sequences.
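
One intuition behind the speed and length claims: a convolution over a token sequence costs O(n·k·d), versus O(n²·d) for self-attention. The sketch below is a generic depthwise-separable convolution block for sequences, not the exact lightweight/dilated convolutions studied in the paper; the dimensions are made up for illustration.

```python
import torch
import torch.nn as nn

class DepthwiseConvBlock(nn.Module):
    """A minimal depthwise-separable 1D convolution over a token sequence.
    Cost grows linearly with sequence length, unlike self-attention."""
    def __init__(self, d_model: int = 256, kernel_size: int = 7):
        super().__init__()
        self.depthwise = nn.Conv1d(
            d_model, d_model, kernel_size,
            padding=kernel_size // 2, groups=d_model,  # one filter per channel
        )
        self.pointwise = nn.Conv1d(d_model, d_model, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); Conv1d expects (batch, d_model, seq_len)
        y = self.pointwise(self.depthwise(x.transpose(1, 2)))
        return torch.relu(y).transpose(1, 2)

tokens = torch.randn(8, 1024, 256)         # a long sequence, no attention needed
print(DepthwiseConvBlock()(tokens).shape)  # torch.Size([8, 1024, 256])
```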

Overall, the findings outlined in this paper suggest that conflating pre-training and architectural advances is misguided and that both advances should be considered independently. The authors believe their research paves the way for a healthy amount of optimism in alternative architectures.

Paper: https://arxiv.org/abs/2105.03322

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-cnnbettertransformers

#nlp #deeplearning #cnn #transformer #pretraining
InceptionNeXt: When Inception Meets ConvNeXt

Large-kernel convolutions, such as those employed in ConvNeXt, can improve model performance but often come at the cost of efficiency due to high memory access costs. Although reducing kernel size may increase speed, it often leads to significant performance degradation.

To address this issue, the authors propose InceptionNeXt, which decomposes large-kernel depthwise convolution into four parallel branches along the channel dimension. The resulting Inception depthwise convolution yields networks with high throughput and competitive performance. For example, InceptionNeXt-T achieves 1.6x higher training throughput than ConvNeXt-T and a 0.2% top-1 accuracy improvement on ImageNet-1K. InceptionNeXt has the potential to serve as an economical baseline for future architecture design, helping to reduce the carbon footprint.
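
A rough sketch of the Inception depthwise convolution described above: channels are split into an identity branch, a small square-kernel branch, and two orthogonal band-kernel branches. The kernel sizes (3 and 11) and the 1/8 branch ratio follow the paper's stated defaults, but treat this as an approximation and refer to the official code linked below for the exact implementation.

```python
import torch
import torch.nn as nn

class InceptionDWConv2d(nn.Module):
    """Inception depthwise convolution (sketch): four parallel branches
    along the channel dimension -- identity, a 3x3 depthwise conv, and
    1xk / kx1 depthwise band convs."""
    def __init__(self, channels: int, square_kernel: int = 3,
                 band_kernel: int = 11, branch_ratio: float = 0.125):
        super().__init__()
        gc = int(channels * branch_ratio)  # channels per conv branch
        self.dwconv_hw = nn.Conv2d(gc, gc, square_kernel,
                                   padding=square_kernel // 2, groups=gc)
        self.dwconv_w = nn.Conv2d(gc, gc, (1, band_kernel),
                                  padding=(0, band_kernel // 2), groups=gc)
        self.dwconv_h = nn.Conv2d(gc, gc, (band_kernel, 1),
                                  padding=(band_kernel // 2, 0), groups=gc)
        self.split_sizes = (channels - 3 * gc, gc, gc, gc)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x_id, x_hw, x_w, x_h = torch.split(x, self.split_sizes, dim=1)
        return torch.cat(
            (x_id, self.dwconv_hw(x_hw), self.dwconv_w(x_w), self.dwconv_h(x_h)),
            dim=1,
        )

feats = torch.randn(2, 96, 56, 56)         # e.g. an early-stage feature map
print(InceptionDWConv2d(96)(feats).shape)  # torch.Size([2, 96, 56, 56])
```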

A detailed unofficial overview of the paper: https://andlukyane.com/blog/paper-review-inceptionnext

Paper link: https://arxiv.org/abs/2303.16900

Code link: https://github.com/sail-sg/inceptionnext

#cnn #deeplearning #computervision