Neural nets are terrible at arithmetic and counting. If you train one on numbers from 1 to 10, it will do okay on 3 + 5 but fail miserably on 1000 + 3000. To address this, «Neural Arithmetic Logic Units» can track time, do arithmetic on images of numbers, and extrapolate beyond the training range, providing better results than other architectures.
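To make the idea concrete, here is a minimal pure-Python sketch of a single NALU cell, assuming the formulation from the linked paper: effective weights tanh(W_hat) * sigmoid(M_hat), an additive path, a multiplicative path computed in log space, and a gate that mixes the two. The function names and the hand-set parameter values are illustrative assumptions; a real model would learn them by gradient descent.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function (avoids overflow for large |z|).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def nac_weights(w_hat, m_hat):
    # Effective weights are pushed toward -1, 0 or 1 by the tanh * sigmoid product.
    return [math.tanh(w) * sigmoid(m) for w, m in zip(w_hat, m_hat)]


def nalu(x, w_hat, m_hat, g, eps=1e-7):
    """Forward pass of one NALU output unit for an input vector x."""
    w = nac_weights(w_hat, m_hat)
    # Additive path: a plain weighted sum (addition / subtraction).
    add = sum(wi * xi for wi, xi in zip(w, x))
    # Multiplicative path: the same weights applied in log space.
    mul = math.exp(sum(wi * math.log(abs(xi) + eps) for wi, xi in zip(w, x)))
    # The gate decides how much of each path reaches the output.
    gate = sigmoid(sum(gi * xi for gi, xi in zip(g, x)))
    return gate * add + (1.0 - gate) * mul


# Hand-set (not trained) parameters, chosen so the effective weights are ~1.
W_HAT = [20.0, 20.0]
M_HAT = [20.0, 20.0]
ADD_GATE = [10.0, 10.0]    # gate saturates at 1: additive path
MUL_GATE = [-10.0, -10.0]  # gate saturates at 0: multiplicative path

print(nalu([3.0, 5.0], W_HAT, M_HAT, ADD_GATE))
print(nalu([1000.0, 3000.0], W_HAT, M_HAT, ADD_GATE))
print(nalu([3.0, 5.0], W_HAT, M_HAT, MUL_GATE))
```

Because the cell computes a weighted sum with weights close to 1 rather than a squashed nonlinearity, the same parameters that add 3 + 5 also add 1000 + 3000, which is the kind of extrapolation the post refers to.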
https://arxiv.org/pdf/1808.00508.pdf
#nn #architecture #concept #deepmind #arithmetic
Mastermind: Using Uber Engineering to Combat Fraud in Real Time
Article on general aspects of how #Uber's fraud prevention engine works.
Link: https://eng.uber.com/mastermind/
#architecture
TResNet: High Performance GPU-Dedicated Architecture
An alternative design of the ResNet architecture that makes better use of GPU structure and resources.
Modern neural net architectures provide high accuracy, but often at the expense of actual GPU throughput, even when their FLOPs count is low.
The authors of this paper suggest various design and optimization improvements that achieve both higher accuracy and higher efficiency.
There are three variants of the architecture: TResNet-M, TResNet-L, and TResNet-XL. These three models vary only in depth and the number of channels.
The refinements of the architecture:
- SpaceToDepth stem
- Anti-Alias downsampling
- In-Place Activated BatchNorm
- Blocks selection
- SE layers
They also use JIT compilation for layers without learnable parameters and a custom implementation of average pooling that is up to 5 times faster.
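As an illustration of the SpaceToDepth stem listed above, here is a small pure-Python sketch of the general space-to-depth rearrangement: each block of neighbouring pixels is moved into the channel dimension, so spatial resolution drops without discarding any values. The function name, the block size, and the output channel ordering are illustrative assumptions, not taken from the paper or the repository.

```python
def space_to_depth(x, block=2):
    """Rearrange a [C][H][W] nested list into [C*block*block][H//block][W//block]."""
    c, h, w = len(x), len(x[0]), len(x[0][0])
    if h % block or w % block:
        raise ValueError("height and width must be divisible by block")
    out = []
    # Each (dy, dx) offset inside a block becomes its own group of channels.
    for dy in range(block):
        for dx in range(block):
            for ch in range(c):
                out.append([
                    [x[ch][i * block + dy][j * block + dx] for j in range(w // block)]
                    for i in range(h // block)
                ])
    return out


# One channel, 4x4 image holding the values 0..15.
image = [[[row * 4 + col for col in range(4)] for row in range(4)]]
stacked = space_to_depth(image, block=2)
print(len(stacked), len(stacked[0]), len(stacked[0][0]))
```

Every input value appears exactly once in the output, which is why such a stem can replace strided convolutions and pooling at the input without losing information.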
Paper: https://arxiv.org/abs/2003.13630
GitHub: https://github.com/mrT23/TResNet
#deeplearning #architecture #optimization
Forwarded from Binary Tree
Diagrams lets you draw cloud system architecture in Python code. It was created for prototyping a new system architecture design without any design tools. You can also use it to describe or visualize an existing system architecture. Diagrams currently supports the major providers, including AWS, Azure, GCP, Kubernetes, Alibaba Cloud, and Oracle Cloud. It also supports on-premise nodes, SaaS, and major programming frameworks and languages.
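A minimal sketch of what this looks like in practice, assuming the `diagrams` package and Graphviz are installed. The function name `draw_web_service`, the node labels, and the output filename are made up for illustration.

```python
def draw_web_service(filename="web_service"):
    # Imports are kept local so the function can be defined
    # even where the diagrams package is not installed.
    from diagrams import Diagram
    from diagrams.aws.compute import EC2
    from diagrams.aws.database import RDS
    from diagrams.aws.network import ELB

    # show=False writes the image to disk without opening a viewer.
    with Diagram("Web Service", filename=filename, show=False):
        # The >> operator draws a left-to-right edge between nodes.
        ELB("lb") >> EC2("web") >> RDS("userdb")
```

Calling `draw_web_service()` should render a load balancer, a web server, and a database connected in sequence, written as an image file in the working directory.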
#python, #diagram, #drawing, #prototyping, #architecture