Data Science Jupyter Notebooks

🔖 ImageBind: One Embedding Space To Bind Them All

📝 ImageBind, from Meta AI, learns a single joint embedding space that connects information from diverse sources such as images, text, audio, video, and even motion sensor data.

⚙️ Supports 6 Modalities:

📷 Image
📝 Text
🔈 Audio
🎥 Video
🦴 IMU sensor data (e.g., accelerometer)
🙄 Depth/Thermal & 3D data

Interestingly, paired training data existed for only some combinations of modalities, yet ImageBind learned to align all of them through self-supervised learning.
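
A minimal sketch of how this kind of alignment can be trained, assuming a symmetric InfoNCE-style contrastive loss between image embeddings and the embeddings of one other modality. The function names, the toy vectors, and the temperature of 0.07 are illustrative assumptions, not code from the ImageBind repo.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(anchors, positives, temperature=0.07):
    """Mean InfoNCE loss where anchors[i] is paired with positives[i].

    Every other item in the batch acts as a negative for anchor i.
    The temperature value is an assumed default, not taken from the source.
    """
    a = [l2_normalize(v) for v in anchors]
    p = [l2_normalize(v) for v in positives]
    total = 0.0
    for i, query in enumerate(a):
        logits = [sum(x * y for x, y in zip(query, key)) / temperature for key in p]
        # Log-sum-exp with the max subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / len(a)


def symmetric_info_nce(image_emb, other_emb, temperature=0.07):
    """Sum of the image-to-modality and modality-to-image losses."""
    return (info_nce(image_emb, other_emb, temperature)
            + info_nce(other_emb, image_emb, temperature))


# Toy 2-D embeddings (hypothetical): row i of each list comes from the same clip.
image_emb = [[1.0, 0.0], [0.0, 1.0]]
audio_aligned = [[0.9, 0.1], [0.1, 0.9]]
audio_shuffled = [[0.1, 0.9], [0.9, 0.1]]

print(symmetric_info_nce(image_emb, audio_aligned))   # small: pairs agree
print(symmetric_info_nce(image_emb, audio_shuffled))  # large: pairs disagree
```

Minimizing this loss pulls each image toward its own paired sample and pushes it away from the other samples in the batch, which is what lets modalities that were never paired with each other end up comparable.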

💫 Key Features:

• No need for every combination of paired data: image-paired data is enough (e.g., audio and depth never have to be paired with each other)
• Uses contrastive learning to learn the joint embedding space
• Competes with CLIP and AudioCLIP, with reported gains in accuracy and modality coverage
• Enables zero-shot retrieval (e.g., finding a relevant video using just a sentence)
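
To make the zero-shot retrieval point concrete, here is a small sketch of retrieval in a shared embedding space using cosine similarity. The embeddings are hand-written stand-ins and `retrieve` is a hypothetical helper; in practice the vectors would come from a trained model such as ImageBind.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def retrieve(query_emb, gallery, top_k=1):
    """Return the names of the top_k gallery items closest to the query.

    gallery maps an item name to its embedding in the shared space.
    """
    ranked = sorted(gallery.items(),
                    key=lambda item: cosine(query_emb, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:top_k]]


# Toy stand-ins (hypothetical): pretend these came from the text and video encoders.
text_query = [0.9, 0.1, 0.0]  # e.g. the sentence "a dog barking"
video_gallery = {
    "dog_clip": [1.0, 0.2, 0.0],
    "ocean_clip": [0.0, 0.1, 1.0],
    "traffic_clip": [0.1, 1.0, 0.1],
}

print(retrieve(text_query, video_gallery, top_k=2))  # ['dog_clip', 'traffic_clip']
```

Because every modality lives in the same space, the same helper would work for an audio query against an image gallery without retraining anything.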

📌 Repo: https://github.com/facebookresearch/ImageBind

🔍 By: https://yangx.top/DataScienceN

🌟

#ImageBind #MultimodalAI #MetaAI #DeepLearning #SelfSupervised

💃 GENMO: Generalist Human Motion by NVIDIA 💃

NVIDIA introduces GENMO, a unified generalist model for human motion that seamlessly combines motion estimation and generation within a single framework. GENMO supports conditioning on videos, 2D keypoints, text, music, and 3D keyframes, enabling highly versatile motion understanding and synthesis.

Currently, no official code release is available.

Review:
https://t.ly/Q5T_Y

Paper:
https://lnkd.in/ds36BY49

Project Page:
https://lnkd.in/dAYHhuFU

#NVIDIA #GENMO #HumanMotion #DeepLearning #AI #ComputerVision #MotionGeneration #MachineLearning #MultimodalAI #3DReconstruction

✉️ Our Telegram channels: https://yangx.top/addlist/0f6vfFbEMdAwODBk

📱 Our WhatsApp channel: https://whatsapp.com/channel/0029VaC7Weq29753hpcggW2A
