Daily AI Research Papers - July 28, 2025
🔑 Keywords: diffusion models, test-time adaptation, LLM quantization, GUI agents, multimodal compression, error analysis, prompt evolution, autonomous driving, AI risk management, video communication, semantic segmentation
1. Deep Researcher with Test-Time Diffusion
🔗 Read Paper
📋 Summary: This paper introduces a test-time diffusion approach for deep research that adapts during inference. The authors report significant improvements in inference-time performance. The paper has 18 authors.
2. The Geometry of LLM Quantization: GPTQ as Babai’s Nearest Plane Algorithm
🔗 Read Paper
📋 Summary: This research examines the mathematical foundations of LLM quantization and shows that GPTQ can be read as an implementation of Babai’s nearest plane algorithm, which gives the quantization procedure a geometric interpretation. The paper has 3 authors.
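For readers unfamiliar with the algorithm named in the summary above, here is a minimal sketch of the textbook form of Babai's nearest plane algorithm in pure Python. It is not the paper's code and it is not GPTQ itself. The function names (`dot`, `gram_schmidt`, `babai_nearest_plane`) and the toy 2D basis are assumptions chosen for illustration only. Given a lattice basis and a target vector, the algorithm walks the basis from last vector to first, rounding the target's coordinate along each Gram-Schmidt direction to the nearest integer.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(basis):
    """Return the (unnormalized) Gram-Schmidt vectors of the basis rows."""
    ortho = []
    for b in basis:
        v = list(b)
        for u in ortho:
            mu = dot(b, u) / dot(u, u)
            v = [vi - mu * ui for vi, ui in zip(v, u)]
        ortho.append(v)
    return ortho


def babai_nearest_plane(basis, target):
    """Approximate the lattice point closest to `target`.

    `basis` is a list of linearly independent row vectors.
    Returns (integer coefficients, lattice point).
    """
    ortho = gram_schmidt(basis)
    residual = list(target)
    coeffs = [0] * len(basis)
    # Process basis vectors from last to first.
    for j in reversed(range(len(basis))):
        # Coordinate of the residual along the j-th Gram-Schmidt direction,
        # rounded to the nearest integer (Python rounds exact .5 ties to even).
        c = round(dot(residual, ortho[j]) / dot(ortho[j], ortho[j]))
        coeffs[j] = c
        residual = [r - c * bj for r, bj in zip(residual, basis[j])]
    # Rebuild the lattice point from the integer coefficients.
    point = [sum(c * b[i] for c, b in zip(coeffs, basis))
             for i in range(len(target))]
    return coeffs, point


if __name__ == "__main__":
    toy_basis = [[2, 0], [1, 3]]   # hypothetical 2D lattice basis
    toy_target = [3.4, 4.2]        # hypothetical target vector
    print(babai_nearest_plane(toy_basis, toy_target))
```

In a quantization setting, the integer coefficients play the role of quantized values and the target plays the role of the full-precision weights. How the paper maps GPTQ's steps onto this procedure is not stated in the summary, so that correspondence is left out of the sketch.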
3. MMBench-GUI: Hierarchical Multi-Platform Evaluation Framework for GUI Agents
🔗 Read Paper
📋 Summary: This paper presents MMBench-GUI, a hierarchical evaluation framework for GUI agents that spans multiple platforms and is intended to standardize how agent performance is assessed. The paper has 28 authors.
4. When Tokens Talk Too Much: A Survey of Multimodal Long-Context Token Compression across Images, Videos, and Audios
🔗 Read Paper
📋 Summary: This survey examines token compression techniques for images, videos, and audio in multimodal long-context settings, where excessive token counts become a bottleneck. The paper has 10 authors.
5. CLEAR: Error Analysis via LLM-as-a-Judge Made Easy
🔗 Read Paper
📋 Summary: This paper introduces CLEAR, a framework that uses an LLM as a judge to make error analysis simpler and more accessible. The paper has 5 authors.
6. GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning
🔗 Read Paper
📋 Summary: This research introduces GEPA, a prompt optimization method based on reflective prompt evolution, and reports that it can outperform reinforcement learning approaches. The paper has 17 authors.
7. Specification Self-Correction: Mitigating In-Context Reward Hacking Through Test-Time Refinement
🔗 Read Paper
📋 Summary: This paper addresses in-context reward hacking with specification self-correction, a test-time refinement technique aimed at improving model reliability. The paper has a single author.
8. PRIX: Learning to Plan from Raw Pixels for End-to-End Autonomous Driving
🔗 Read Paper
📋 Summary: This research introduces PRIX, an end-to-end autonomous driving approach that learns to plan directly from raw pixel inputs. The paper has 4 authors.
9. Frontier AI Risk Management Framework in Practice: A Risk Analysis Technical Report
🔗 Read Paper
📋 Summary: This technical report presents a practical framework for managing risks in frontier AI systems, covering risk analysis methodologies and implementation guidelines. The report has 37 authors.