Data Science | Machine Learning with Python for Researchers
31.2K subscribers
1.44K photos
102 videos
22 files
1.72K links
Admin: @HusseinSheikho

The Data Science and Python channel is for researchers and advanced programmers

Buy ads: https://telega.io/c/dataScienceT
🔹 Title: Discrete Diffusion VLA: Bringing Discrete Diffusion to Action Decoding in Vision-Language-Action Policies

🔹 Publication Date: Published on Aug 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20072
• PDF: https://arxiv.org/pdf/2508.20072

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

For more data science resources:
https://yangx.top/DataScienceT
🔹 Title: Predicting the Order of Upcoming Tokens Improves Language Modeling

🔹 Publication Date: Published on Aug 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.19228
• PDF: https://arxiv.org/pdf/2508.19228
• Github: https://github.com/zaydzuhri/token-order-prediction

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation

🔹 Publication Date: Published on Aug 1

🔹 Abstract: IGL-Nav uses an incremental 3D Gaussian representation for efficient and accurate image-goal navigation in 3D space, outperforming existing methods and remaining applicable in real-world settings. AI-generated summary: Visual navigation with an image as the goal is a fundamental and challenging problem. Conventional methods either rely on end-to-end RL or on modular policies with a topological graph or BEV map as memory, which cannot fully model the geometric relationship between the explored 3D environment and the goal image. To localize the goal image in 3D space efficiently and accurately, we build our navigation system upon the renderable 3D Gaussian splatting (3DGS) representation. However, due to the computational intensity of 3DGS optimization and the large search space of 6-DoF camera poses, directly leveraging 3DGS for image localization during agent exploration is prohibitively inefficient. To this end, we propose IGL-Nav, an Incremental 3D Gaussian Localization framework for efficient and 3D-aware image-goal navigation. Specifically, we incrementally update the scene representation as new images arrive using feed-forward monocular prediction. We then coarsely localize the goal by leveraging geometric information for discrete-space matching, which is equivalent to an efficient 3D convolution. When the agent is close to the goal, we solve for the fine target pose by optimization via differentiable rendering. The proposed IGL-Nav outperforms existing state-of-the-art methods by a large margin across diverse experimental configurations. It can also handle the more challenging free-view image-goal setting and be deployed on a real-world robotic platform, using a cellphone to capture the goal image at an arbitrary pose. Project page: https://gwxuan.github.io/IGL-Nav/
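The coarse localization step described in the abstract, discrete-space matching of a goal feature against a voxel grid, can be sketched as a dot-product search (a 1x1x1 3D cross-correlation). The grid size, feature dimension, and planted goal voxel below are purely illustrative, not the paper's actual implementation:

```python
import numpy as np

# Hypothetical setup: a C-dim feature per voxel in a D x H x W grid,
# incrementally built from monocular predictions (per the abstract).
rng = np.random.default_rng(0)
D, H, W, C = 8, 8, 8, 16
voxel_feats = rng.normal(size=(D, H, W, C))

# Plant one voxel that strongly matches the goal-image feature.
goal_feat = rng.normal(size=C)
voxel_feats[3, 5, 2] = 10.0 * goal_feat

# Coarse localization: score every voxel by its dot product with the
# goal feature, which is a 1x1x1 3D cross-correlation over the grid.
scores = voxel_feats @ goal_feat            # shape (D, H, W)
coarse_cell = tuple(int(i) for i in np.unravel_index(scores.argmax(), scores.shape))
```

Here `coarse_cell` recovers the planted voxel; the paper then refines this coarse estimate into a fine 6-DoF pose via differentiable rendering, which this sketch does not cover.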

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.00823
• PDF: https://arxiv.org/pdf/2508.00823
• Project Page: https://gwxuan.github.io/IGL-Nav/

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Diffusion Language Models Know the Answer Before Decoding

🔹 Publication Date: Published on Aug 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.19982
• PDF: https://arxiv.org/pdf/2508.19982

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Taming the Chaos: Coordinated Autoscaling for Heterogeneous and Disaggregated LLM Inference

🔹 Publication Date: Published on Aug 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.19559
• PDF: https://arxiv.org/pdf/2508.19559

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: DeepScholar-Bench: A Live Benchmark and Automated Evaluation for Generative Research Synthesis

🔹 Publication Date: Published on Aug 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20033
• PDF: https://arxiv.org/pdf/2508.20033

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: MotionFlux: Efficient Text-Guided Motion Generation through Rectified Flow Matching and Preference Alignment

🔹 Publication Date: Published on Aug 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.19527
• PDF: https://arxiv.org/pdf/2508.19527

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Web-CogReasoner: Towards Knowledge-Induced Cognitive Reasoning for Web Agents

🔹 Publication Date: Published on Aug 3

🔹 Abstract: A framework for web agents decomposes their capabilities into knowledge content learning and cognitive processes, using a structured dataset and a novel reasoning framework to enhance generalization and performance. AI-generated summary: Multimodal large-scale models have significantly advanced the development of web agents, enabling perception of and interaction with digital environments akin to human cognition. In this paper, we argue that web agents must first acquire sufficient knowledge to effectively engage in cognitive reasoning. Therefore, we decompose a web agent's capabilities into two essential stages: knowledge content learning and cognitive processes. To formalize this, we propose the Web-CogKnowledge Framework, categorizing knowledge as Factual, Conceptual, and Procedural. In this framework, knowledge content learning corresponds to the agent's processes of Memorizing and Understanding, which rely on the first two knowledge types and represent the "what" of learning. Conversely, cognitive processes correspond to Exploring, grounded in Procedural knowledge, defining the "how" of reasoning and action. To facilitate knowledge acquisition, we construct the Web-CogDataset, a structured resource curated from 14 real-world websites, designed to systematically instill the core knowledge necessary for web agents. This dataset serves as the agent's conceptual grounding (the "nouns" upon which comprehension is built) as well as the basis for learning how to reason and act. Building on this foundation, we operationalize these processes through a novel knowledge-driven Chain-of-Thought (CoT) reasoning framework, developing and training our proposed agent, the Web-CogReasoner. Extensive experimentation reveals its significant superiority over existing models, especially in generalizing to unseen tasks where structured knowledge is decisive.
To enable rigorous evaluation, we introduce Web-CogBench, a comprehensive evaluation suite designed to assess and compare agent performance across the delineated knowledge domains and cognitive capabilities. Our code and data are open-sourced at https://github.com/Gnonymous/Web-CogReasoner
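The stage-to-knowledge decomposition in the abstract (Memorizing and Understanding drawing on Factual and Conceptual knowledge, Exploring drawing on Procedural knowledge) can be sketched as a small lookup table. The names below are illustrative, not the paper's API:

```python
from enum import Enum

class Knowledge(Enum):
    FACTUAL = "factual"
    CONCEPTUAL = "conceptual"
    PROCEDURAL = "procedural"

# Stage -> knowledge types it relies on, per the abstract:
# Memorizing/Understanding cover the "what", Exploring covers the "how".
STAGE_KNOWLEDGE = {
    "memorizing": {Knowledge.FACTUAL, Knowledge.CONCEPTUAL},
    "understanding": {Knowledge.FACTUAL, Knowledge.CONCEPTUAL},
    "exploring": {Knowledge.PROCEDURAL},
}

def knowledge_for(stage: str) -> set:
    """Return the knowledge types a given stage relies on."""
    return STAGE_KNOWLEDGE[stage.lower()]
```

A dataset curator could use such a mapping to tag Web-CogDataset examples by the stage of training they support.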

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.01858
• PDF: https://arxiv.org/pdf/2508.01858
• Project Page: https://eohan.me/Web-CogReasoner
• Github: https://github.com/Gnonymous/Web-CogReasoner

🔹 Datasets citing this paper:
https://huggingface.co/datasets/Gnonymous/Web-CogDataset

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Beyond Transcription: Mechanistic Interpretability in ASR

🔹 Publication Date: Published on Aug 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.15882
• PDF: https://arxiv.org/pdf/2508.15882

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Gaze into the Heart: A Multi-View Video Dataset for rPPG and Health Biomarkers Estimation

🔹 Publication Date: Published on Aug 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.17924
• PDF: https://arxiv.org/pdf/2508.17924
• Project Page: https://huggingface.co/datasets/kyegorov/mcd_rppg
• Github: https://github.com/ksyegorov/mcd_rppg

🔹 Datasets citing this paper:
https://huggingface.co/datasets/kyegorov/mcd_rppg

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: SEAM: Semantically Equivalent Across Modalities Benchmark for Vision-Language Models

🔹 Publication Date: Published on Aug 25

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.18179
• PDF: https://arxiv.org/pdf/2508.18179
• Project Page: https://lilv98.github.io/SEAM-Website/
• Github: https://github.com/CSSLab/SEAM

🔹 Datasets citing this paper:
https://huggingface.co/datasets/lilvjosephtang/SEAM-Benchmark

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Analysing Chain of Thought Dynamics: Active Guidance or Unfaithful Post-hoc Rationalisation?

🔹 Publication Date: Published on Aug 27

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.19827
• PDF: https://arxiv.org/pdf/2508.19827

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Training a Foundation Model for Materials on a Budget

🔹 Publication Date: Published on Aug 22

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.16067
• PDF: https://arxiv.org/pdf/2508.16067
• Github: https://github.com/atomicarchitects/nequix

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: rStar2-Agent: Agentic Reasoning Technical Report

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20722
• PDF: https://arxiv.org/pdf/2508.20722

🔹 Datasets citing this paper:
https://huggingface.co/datasets/rstar2-reproduce/rStar2-Agent-RL-Data

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20453
• PDF: https://arxiv.org/pdf/2508.20453
• Github: https://github.com/Accenture/mcp-bench

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Mixture of Contexts for Long Video Generation

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21058
• PDF: https://arxiv.org/pdf/2508.21058
• Project Page: https://primecai.github.io/moc/

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

🔹 Publication Date: Published on Aug 28

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20751
• PDF: https://arxiv.org/pdf/2508.20751
• Project Page: https://codegoat24.github.io/UnifiedReward/Pref-GRPO
• Github: https://github.com/CodeGoat24/UniGenBench

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================

🔹 Title: ROSE: Remove Objects with Side Effects in Videos

🔹 Publication Date: Published on Aug 26

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.18633
• PDF: https://arxiv.org/pdf/2508.18633
• Project Page: https://rose2025-inpaint.github.io/
• Github: https://github.com/Kunbyte-AI/ROSE

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
https://huggingface.co/spaces/Kunbyte/ROSE
==================================

🔹 Title: Collaborative Multi-Modal Coding for High-Quality 3D Generation

🔹 Publication Date: Published on Aug 21

🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.15228
• PDF: https://arxiv.org/pdf/2508.15228

🔹 Datasets citing this paper:
No datasets found

🔹 Spaces citing this paper:
No spaces found
==================================
