Machine Learning Projects
Flow-Guided Transformer for Video Inpainting

In particular, in the spatial transformer, we design a dual-perspective spatial MHSA, which integrates global tokens into the window-based attention.
https://github.com/hitachinsk/fgt
LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale

📝We develop a procedure for Int8 matrix multiplication for feed-forward and attention projection layers in transformers, which cuts the memory needed for inference in half while retaining full-precision performance.
https://github.com/timdettmers/bitsandbytes
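The core idea can be illustrated with a minimal NumPy sketch of vector-wise absmax Int8 matrix multiplication. This omits the paper's mixed-precision outlier decomposition, and the function name is illustrative rather than the bitsandbytes API:

```python
import numpy as np

def int8_matmul(A, B):
    """Vector-wise absmax Int8 matmul sketch (no outlier decomposition).

    Quantize each row of A and each column of B to int8 with its own
    absmax scale, multiply in integer arithmetic, then dequantize.
    """
    sa = np.maximum(np.abs(A).max(axis=1, keepdims=True), 1e-8) / 127.0  # per-row scales
    sb = np.maximum(np.abs(B).max(axis=0, keepdims=True), 1e-8) / 127.0  # per-column scales
    Ai = np.round(A / sa).astype(np.int8)
    Bi = np.round(B / sb).astype(np.int8)
    # Accumulate in int32, as real Int8 kernels do, then rescale to float.
    C32 = Ai.astype(np.int32) @ Bi.astype(np.int32)
    return C32 * sa * sb
```

The int8 operands take a quarter of the memory of fp32 weights; the per-vector scales keep the dequantized product close to the full-precision result.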
KeypointNeRF: Generalizing Image-based Volumetric Avatars using Relative Spatial Encoding of Keypoints

📝In this work, we investigate common issues with existing spatial encodings and propose a simple yet highly effective approach to modeling high-fidelity volumetric humans from sparse views.
https://github.com/facebookresearch/KeypointNeRF
StyleFaceV: Face Video Generation via Decomposing and Recomposing Pretrained StyleGAN3

📝Notably, StyleFaceV is capable of generating realistic 1024×1024 face videos even without high-resolution training videos.
https://github.com/arthur-qiu/stylefacev
Multi-instrument Music Synthesis with Spectrogram Diffusion

📝An ideal music synthesizer should be both interactive and expressive, generating high-fidelity audio in realtime for arbitrary combinations of instruments and notes.

https://github.com/magenta/music-spectrogram-diffusion
YOLOPv2: Better, Faster, Stronger for Panoptic Driving Perception

📝Over the last decade, multi-task learning approaches have achieved promising results in solving panoptic driving perception problems, providing both high-precision and high-efficiency performance.

https://github.com/CAIC-AD/YOLOPv2
Online Decision Transformer

📝Recent work has shown that offline reinforcement learning (RL) can be formulated as a sequence modeling problem (Chen et al., 2021; Janner et al., 2021) and solved via approaches similar to large-scale language modeling.

https://github.com/facebookresearch/online-dt
PeRFception: Perception using Radiance Fields

📝The recent progress in implicit 3D representation, i.e., Neural Radiance Fields (NeRFs), has made accurate and photorealistic 3D reconstruction possible in a differentiable manner.

https://github.com/POSTECH-CVLab/PeRFception
DreamBooth: Fine Tuning Text-to-Image Diffusion Models for Subject-Driven Generation

📝Once the subject is embedded in the output domain of the model, the unique identifier can then be used to synthesize fully-novel photorealistic images of the subject contextualized in different scenes.

https://github.com/XavierXiao/Dreambooth-Stable-Diffusion
A Gentle Introduction to Conformal Prediction and Distribution-Free Uncertainty Quantification

📝Conformal prediction is a user-friendly paradigm for creating statistically rigorous uncertainty sets/intervals for the predictions of such models.
https://github.com/aangelopoulos/conformal-prediction
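The basic split-conformal recipe behind those rigorous sets is short enough to sketch. The following uses the standard calibration-quantile construction with a simple "one minus true-class probability" score; the function name is illustrative, not the linked repo's API:

```python
import numpy as np

def conformal_prediction_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Split conformal prediction for classification.

    cal_probs:  (n, K) softmax scores on a held-out calibration set
    cal_labels: (n,) true labels for the calibration set
    test_probs: (m, K) softmax scores on test inputs
    Returns a boolean (m, K) mask of prediction sets with >= 1-alpha
    marginal coverage (requires n large enough that q_level <= 1).
    """
    n = len(cal_labels)
    # Nonconformity score: 1 - model probability of the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample-corrected quantile of the calibration scores.
    q_level = np.ceil((n + 1) * (1 - alpha)) / n
    qhat = np.quantile(scores, q_level, method="higher")
    # Include every class whose score falls below the threshold.
    return 1.0 - test_probs <= qhat
```

The guarantee is distribution-free: it only requires the calibration and test points to be exchangeable, not that the model's probabilities are well calibrated.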
Transformers are Sample Efficient World Models

📝Deep reinforcement learning agents are notoriously sample inefficient, which considerably limits their application to real-world problems.
https://github.com/eloialonso/iris
Behavior Trees in Robotics and AI: An Introduction

📝A Behavior Tree (BT) is a way to structure the switching between different tasks in an autonomous agent, such as a robot or a virtual entity in a computer game.
https://github.com/BehaviorTree/BehaviorTree.CPP
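The task-switching structure a BT provides comes from two classic control nodes, Sequence and Fallback, which tick their children in order. A minimal Python illustration (not the BehaviorTree.CPP API; the battery scenario is a made-up example):

```python
SUCCESS, FAILURE, RUNNING = "SUCCESS", "FAILURE", "RUNNING"

class Sequence:
    """Ticks children left to right; fails fast, succeeds only if all succeed."""
    def __init__(self, children):
        self.children = children
    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != SUCCESS:
                return status
        return SUCCESS

class Fallback:
    """Ticks children left to right; succeeds fast, fails only if all fail."""
    def __init__(self, children):
        self.children = children
    def tick(self):
        for child in self.children:
            status = child.tick()
            if status != FAILURE:
                return status
        return FAILURE

class Action:
    """Leaf node wrapping a callable that returns a status."""
    def __init__(self, fn):
        self.fn = fn
    def tick(self):
        return self.fn()
```

Composing these nodes yields reactive switching: a Fallback can try "battery ok?" and fall back to "recharge" before a Sequence proceeds to the actual task.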
FedBN: Federated Learning on Non-IID Features via Local Batch Normalization

📝The emerging paradigm of federated learning (FL) strives to enable collaborative training of deep models on the network edge without centrally aggregating raw data, thereby improving data privacy.
https://github.com/adap/flower
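FedBN's key idea, averaging everything except batch-normalization parameters so that BN layers stay client-local and absorb the non-IID feature shift, can be sketched in a few lines. The parameter-dict shape and the "bn" naming convention here are assumptions for illustration, not the Flower API:

```python
def fedbn_aggregate(client_states, is_bn=lambda name: "bn" in name):
    """FedBN-style aggregation sketch.

    client_states: list of {param_name: value} dicts, one per client.
    Averages non-BN parameters across clients; BN parameters are kept
    local, so each client retains its own normalization statistics.
    """
    n = len(client_states)
    shared_names = [k for k in client_states[0] if not is_bn(k)]
    # Server average over the shared (non-BN) parameters only.
    global_update = {
        name: sum(cs[name] for cs in client_states) / n
        for name in shared_names
    }
    # Each client adopts the global shared weights, keeps its own BN params.
    new_states = []
    for cs in client_states:
        merged = dict(cs)          # local BN parameters preserved
        merged.update(global_update)
        new_states.append(merged)
    return new_states
```

Compared with plain FedAvg, the only change is the exclusion filter, yet it lets each client normalize with statistics matched to its own feature distribution.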