🔹 Title: USO: Unified Style and Subject-Driven Generation via Disentangled and Reward Learning
🔹 Publication Date: Published on Aug 26
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.18966
• PDF: https://arxiv.org/pdf/2508.18966
• Project Page: https://bytedance.github.io/USO/
• Github: https://bytedance.github.io/USO/
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
• https://huggingface.co/spaces/bytedance-research/USO
• https://huggingface.co/spaces/bep40/USO
==================================
For more data science resources:
✓ https://yangx.top/DataScienceT
🔹 Title: AWorld: Orchestrating the Training Recipe for Agentic AI
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20404
• PDF: https://arxiv.org/pdf/2508.20404
• Github: https://github.com/inclusionAI/AWorld/tree/main
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: TCIA: A Task-Centric Instruction Augmentation Method for Instruction Finetuning
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20374
• PDF: https://arxiv.org/pdf/2508.20374
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Dress&Dance: Dress up and Dance as You Like It - Technical Preview
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21070
• PDF: https://arxiv.org/pdf/2508.21070
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: OnGoal: Tracking and Visualizing Conversational Goals in Multi-Turn Dialogue with Large Language Models
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21061
• PDF: https://arxiv.org/pdf/2508.21061
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: FakeParts: a New Family of AI-Generated DeepFakes
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21052
• PDF: https://arxiv.org/pdf/2508.21052
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: CogVLA: Cognition-Aligned Vision-Language-Action Model via Instruction-Driven Routing & Sparsification
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21046
• PDF: https://arxiv.org/pdf/2508.21046
• Github: https://github.com/JiuTian-VL/CogVLA
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Provable Benefits of In-Tool Learning for Large Language Models
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20755
• PDF: https://arxiv.org/pdf/2508.20755
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: LaTCoder: Converting Webpage Design to Code with Layout-as-Thought
🔹 Publication Date: Published on Aug 5
🔹 AI-generated summary: LaTCoder enhances layout preservation in design-to-code tasks by dividing webpage designs into blocks and using Chain-of-Thought reasoning with MLLMs, achieving significant improvements in metrics and human preference.
🔹 Abstract: Converting webpage designs into code (design-to-code) plays a vital role in User Interface (UI) development for front-end developers, bridging the gap between visual design and functional implementation. While recent Multimodal Large Language Models (MLLMs) have shown significant potential in design-to-code tasks, they often fail to accurately preserve the layout during code generation. Drawing inspiration from Chain-of-Thought (CoT) reasoning in human cognition, we propose LaTCoder, a novel approach that enhances layout preservation during code generation with Layout-as-Thought (LaT). Specifically, we first introduce a simple yet efficient algorithm to divide the webpage design into image blocks. Next, we prompt MLLMs with a CoT-based approach to generate code for each block. Finally, we apply two assembly strategies, absolute positioning and an MLLM-based method, followed by dynamic selection to determine the optimal output. We evaluate the effectiveness of LaTCoder using multiple backbone MLLMs (i.e., DeepSeek-VL2, Gemini, and GPT-4o) on both a public benchmark and a newly introduced, more challenging benchmark (CC-HARD) that features complex layouts. The results on automatic metrics demonstrate significant improvements: TreeBLEU scores increased by 66.67% and MAE decreased by 38% with DeepSeek-VL2, compared to direct prompting. Moreover, in the human preference evaluation, annotators favored the webpages generated by LaTCoder in over 60% of cases, providing strong evidence of the effectiveness of our method.
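The divide-prompt-assemble pipeline in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's algorithm: the block division here is a naive uniform grid, and `query_mllm` is a hypothetical stand-in for a real MLLM call (e.g. to GPT-4o).

```python
# Illustrative sketch of a Layout-as-Thought pipeline:
# 1) divide the design into blocks, 2) generate code per block,
# 3) assemble with absolute positioning.

def divide_into_blocks(width, height, rows=2, cols=2):
    """Split a design canvas into a grid of (x, y, w, h) blocks."""
    bw, bh = width // cols, height // rows
    return [(c * bw, r * bh, bw, bh)
            for r in range(rows) for c in range(cols)]

def query_mllm(block):
    """Hypothetical MLLM call: returns an HTML fragment for one block."""
    x, y, w, h = block
    return f'<div class="block">content at ({x},{y})</div>'

def assemble_absolute(blocks, fragments):
    """Absolute-positioning assembly: pin each fragment at its block origin."""
    parts = []
    for (x, y, w, h), html in zip(blocks, fragments):
        parts.append(
            f'<div style="position:absolute;left:{x}px;top:{y}px;'
            f'width:{w}px;height:{h}px">{html}</div>')
    return '<body style="position:relative">' + "".join(parts) + "</body>"

blocks = divide_into_blocks(800, 600)          # 4 image blocks
fragments = [query_mllm(b) for b in blocks]    # per-block code generation
page = assemble_absolute(blocks, fragments)    # final assembled page
```

The paper additionally assembles with an MLLM-based strategy and dynamically selects the better of the two outputs; only the absolute-positioning path is shown here.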
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.03560
• PDF: https://arxiv.org/pdf/2508.03560
🔹 Datasets citing this paper:
• https://huggingface.co/datasets/xcodemind/CC-HARD
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Turning the Spell Around: Lightweight Alignment Amplification via Rank-One Safety Injection
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.20766
• PDF: https://arxiv.org/pdf/2508.20766
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Multi-View 3D Point Tracking
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21060
• PDF: https://arxiv.org/pdf/2508.21060
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Persuasion Dynamics in LLMs: Investigating Robustness and Adaptability in Knowledge and Safety with DuET-PD
🔹 Publication Date: Published on Aug 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.17450
• PDF: https://arxiv.org/pdf/2508.17450
• Github: https://github.com/Social-AI-Studio/DuET-PD
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: OneReward: Unified Mask-Guided Image Generation via Multi-Task Human Preference Learning
🔹 Publication Date: Published on Aug 28
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.21066
• PDF: https://arxiv.org/pdf/2508.21066
• Project Page: https://one-reward.github.io/
• Github: https://one-reward.github.io/
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Social-MAE: A Transformer-Based Multimodal Autoencoder for Face and Voice
🔹 Publication Date: Published on Aug 24
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.17502
• PDF: https://arxiv.org/pdf/2508.17502
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: Learning an Efficient Multi-Turn Dialogue Evaluator from Multiple Judges
🔹 Publication Date: Published on Aug 1
🔹 AI-generated summary: An efficient multi-turn dialogue evaluator aggregates multiple LLM judgments into a single model to assess dialogue quality with reduced computational cost.
🔹 Abstract: Evaluating the conversational abilities of large language models (LLMs) remains a challenging task. Current mainstream approaches primarily rely on the "LLM-as-a-judge" paradigm, where an LLM is prompted to serve as an evaluator of dialogue quality. However, such methods often suffer from various biases, which undermine the reliability and consistency of the evaluation results. To mitigate these biases, recent methods employ multiple LLMs as judges and aggregate their judgments to select the optimal assessment. Although effective, this multi-judge approach incurs significant computational overhead during inference. In this paper, we propose an efficient multi-turn dialogue evaluator that captures the collective wisdom of multiple LLM judges by aggregating their preference knowledge into a single model. Our approach preserves the advantages of diverse multi-judge feedback while drastically reducing the evaluation cost, enabling fast and flexible dialogue quality assessment. Extensive experiments on seven single-rating and pairwise-comparison dialogue evaluation benchmarks demonstrate that our method outperforms existing baselines across diverse scenarios, showcasing its efficiency and robustness.
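The core idea, collapsing several judges' preference signals into one training target for a single student model, can be sketched as follows. The judge count, the 1-5 rating scale, and plain averaging are illustrative assumptions, not the paper's exact aggregation scheme.

```python
# Toy sketch of multi-judge aggregation: several LLM judges each rate
# the same dialogues; their ratings are averaged into soft targets that
# a single evaluator model could be trained to reproduce.

def aggregate_ratings(judge_scores):
    """Average per-judge ratings into one soft target per dialogue."""
    return [sum(scores) / len(scores) for scores in zip(*judge_scores)]

def pairwise_label(target_a, target_b):
    """Turn two aggregated ratings into a pairwise-comparison label."""
    if target_a > target_b:
        return "A"
    if target_b > target_a:
        return "B"
    return "tie"

# Three hypothetical judges, each rating two dialogues on a 1-5 scale.
judges = [
    [4, 2],   # judge 1: dialogue A = 4, dialogue B = 2
    [5, 3],   # judge 2
    [4, 4],   # judge 3
]
targets = aggregate_ratings(judges)   # soft targets for a single student model
label = pairwise_label(*targets)      # pairwise supervision: "A" preferred
```

Once the student is trained on such targets, a single forward pass replaces querying every judge at inference time, which is the cost reduction the abstract claims.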
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.00454
• PDF: https://arxiv.org/pdf/2508.00454
• Github: https://github.com/James-TYQ/MTDEval
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================
🔹 Title: LeanK: Learnable K Cache Channel Pruning for Efficient Decoding
🔹 Publication Date: Published on Aug 4
🔹 AI-generated summary: LeanK, a learning-based method, prunes unimportant key-cache channels in large language models to reduce memory usage and accelerate decoding without sacrificing accuracy.
🔹 Abstract: Large language models (LLMs) enable long-context tasks but face efficiency challenges due to the growing key-value (KV) cache. We propose LeanK, a learning-based method that prunes unimportant key (K) cache channels by leveraging static channel sparsity. With a novel two-stage training process, LeanK learns a channel-wise static mask that satisfies a specific sparsity ratio and hardware-alignment requirements. LeanK reduces GPU memory and accelerates decoding without sacrificing accuracy. Experiments demonstrate up to 70% K-cache and 16%-18% V-cache memory reduction. A custom decoding kernel enables a 1.3x speedup for attention computation. We also provide insights into model channels and attention heads during long-context inference by analyzing the learned importance distribution. Our code is available at https://aka.ms/LeanK.
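The mechanism, a fixed per-channel mask that drops low-importance key channels from the cache, can be illustrated with a toy example. The importance scores, dimensions, and top-k selection below are made up for illustration; LeanK learns its mask with a two-stage training process, not a simple top-k cut.

```python
# Toy sketch of static K-cache channel pruning: a mask chosen once
# (offline, from learned importance scores) is reused at every decoding
# step to shrink each cached key vector.

def build_static_mask(importance, keep_ratio):
    """Keep the top `keep_ratio` fraction of channels by importance."""
    k = max(1, int(len(importance) * keep_ratio))
    keep = sorted(range(len(importance)),
                  key=lambda i: importance[i], reverse=True)[:k]
    return sorted(keep)   # static: the same indices for all tokens

def prune_k_cache(k_cache, mask):
    """Drop masked-out channels from every cached key vector."""
    return [[vec[i] for i in mask] for vec in k_cache]

importance = [0.9, 0.1, 0.7, 0.05, 0.8, 0.2, 0.6, 0.3]    # learned offline
mask = build_static_mask(importance, keep_ratio=0.5)        # keeps 4 of 8 channels
k_cache = [[float(c) for c in range(8)] for _ in range(3)]  # 3 tokens, 8 channels
pruned = prune_k_cache(k_cache, mask)                       # 3 tokens, 4 channels
```

Because the mask is static, the pruned cache layout is known ahead of time, which is what lets a custom decoding kernel exploit it for the reported attention speedup.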
🔹 Paper Links:
• arXiv Page: https://arxiv.org/abs/2508.02215
• PDF: https://arxiv.org/pdf/2508.02215
• Project Page: https://aka.ms/LeanK
🔹 Datasets citing this paper:
No datasets found
🔹 Spaces citing this paper:
No spaces found
==================================