MODELYST
Literature

Papers

Showing 1–19 of 19 notable papers
| Paper | Topic | Authors | Published | HF ▲ |
| --- | --- | --- | --- | --- |
| Humanoid-GPT: Scaling Data and Structure for Zero-Shot Motion Tracking | Data & benchmarks | Zekun Qi +12 | Jun 2, 2026 | 38 |
| SCOPE: Self-Play via Co-Evolving Policies for Open-Ended Tasks | Data & benchmarks | Wai-Chung Kwan +3 | May 29, 2026 | 28 |
| Rethinking RAG in Long Videos: What to Retrieve and How to Use It? | Data & benchmarks | Yuho Lee +7 | Jun 11, 2026 | 20 |
| UnpredictaBench: A Benchmark for Evaluating Distributional Randomness in LLMs | Data & benchmarks | Amirhossein Abaskohi +6 | Jun 4, 2026 | 20 |
| ComBench: A Benchmark for Rigorous Proof Reasoning and Constructive Realization in Olympiad-Level Combinatorics | Data & benchmarks | Shunkai Zhang +17 | Jun 9, 2026 | 18 |
| PRISM: A Multi-Dimensional Benchmark for Evaluating LLM Peer Reviewers | Data & benchmarks | Ngoc Phan Phuoc Loc +10 | May 26, 2026 | 16 |
| RankJudge: A Multi-Turn LLM-as-a-Judge Synthetic Benchmark Generator | Data & benchmarks | Zhenwei Tang +5 | May 20, 2026 | 16 |
| REPOT: Recoverable Program-of-Thought via Checkpoint Repair | Data & benchmarks | Parsa Mazaheri | May 28, 2026 | 10 |
| LLMs Can Leak Training Data But Do They Want To? A Propensity-Aware Evaluation of Memorization in LLMs | Data & benchmarks | Gianluca Barmina +2 | Jun 4, 2026 | 8 |
| RayDer: Scalable Self-Supervised Novel View Synthesis from Real-World Video | Data & benchmarks | Ulrich Prestel +3 | May 29, 2026 | 7 |
| PRISM: Position-encoded Regressive Inverse Spectral Model for Multilayer Thin-Film Design | Data & benchmarks | Runtian Wang +3 | May 26, 2026 | 7 |
| SurGe: Improved Surface Geometry in Point Maps | Data & benchmarks | Karim Knaebel +6 | May 29, 2026 | 6 |
| Breaking the Chains of Probability: Neutrosophic Logic as a New Framework for Epistemic Uncertainty in Large Language Models | Data & benchmarks | Maikel Yelandi Leyva-Vázquez +1 | May 22, 2026 | 5 |
| Time-Series Foundation Model Embeddings for Remaining Useful Life Estimation | Data & benchmarks | Amir El-Ghoussani +3 | Jun 10, 2026 | 3 |
| On the Limits of LLM-as-Judge for Scientific Novelty Assessment | Data & benchmarks | Soumitra Sinhahajari +2 | Jun 10, 2026 | 3 |
| Chiaroscuro Attention: Spending Compute in the Dark | Data & benchmarks | Prateek Kumar Sikdar | Jun 6, 2026 | 1 |
| Geometric Latent Reasoning Induces Shorter Generations in LLMs | Data & benchmarks | Shashi Kumar +4 | Jun 1, 2026 | 1 |
| Model-Based Quality Assessment for Massively Multilingual Parallel Data | Data & benchmarks | Abdelaziz M. A. Ibrahim +3 | May 29, 2026 | 1 |
| The Chain Holds, the Answer Folds: Trace-Answer Dissociation in Reasoning Models Under Adversarial Pressure | Data & benchmarks | Yubo Li +2 | May 27, 2026 | 1 |