Papers

Showing 1–29 of 29 notable papers
| Paper | Topic | Authors | Published | HF ▲ |
| --- | --- | --- | --- | --- |
| On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters | Efficiency & systems | Mind Lab +39 | Jun 1, 2026 | 224 |
| OmniRetrieval: Unified Retrieval across Heterogeneous Knowledge Sources | Efficiency & systems | Jinheon Baek +7 | May 28, 2026 | 77 |
| Trust-Region Behavior Blending for On-Policy Distillation | Efficiency & systems | Daniil Plyusov +6 | May 29, 2026 | 65 |
| NITP: Next Implicit Token Prediction for LLM Pre-training | Efficiency & systems | Xiangdong Zhang +5 | May 24, 2026 | 35 |
| UniSteer: Text-Guided Flow Matching in Activation Space for Versatile LLM Steering | Efficiency & systems | Yingdong Shi +6 | May 28, 2026 | 26 |
| KletterMix: Climbing Toward High-Quality German Pretraining Data | Efficiency & systems | Maurice Kraus +6 | Jun 2, 2026 | 17 |
| MobileMoE: Scaling On-Device Mixture of Experts | Efficiency & systems | Yanbei Chen +7 | May 26, 2026 | 15 |
| Do Language Models Need Sleep? Offline Recurrence for Improved Online Inference | Efficiency & systems | Sangyun Lee +3 | May 25, 2026 | 12 |
| Reflective Prompt Tuning through Language Model Function-Calling | Efficiency & systems | Farima Fatahi Bayat +3 | May 20, 2026 | 9 |
| Skip a Layer or Loop It? Learning Program-of-Layers in LLMs | Efficiency & systems | Ziyue Li +2 | Jun 4, 2026 | 8 |
| Value-Aware Stochastic KV Cache Eviction for Reasoning Models | Efficiency & systems | Ting-Yun Chang +5 | Jun 2, 2026 | 8 |
| PEFT-Arena: Understanding Parameter-Efficient Finetuning from a Stability-Plasticity Perspective | Efficiency & systems | Yangyi Huang +6 | May 27, 2026 | 8 |
| Breaking the Bubble: Asynchronous Pipeline Parallel Training with Bounded Weight Inconsistency | Efficiency & systems | Itay Elam +3 | Jun 5, 2026 | 7 |
| The Shadow Price of Reasoning: Economic Perspective on Optimal Budget Allocation for LLMs | Efficiency & systems | Xu Wan +6 | Jun 2, 2026 | 7 |
| αDepth: Learning Single-Pass Soft Boundary Decomposition for Stereo Conversion | Efficiency & systems | Xiang Zhang +5 | May 29, 2026 | 7 |
| Adapting Multilingual Embedding Models to Turkish via Cross-Lingual Tokenizer Surgery and Offline Distillation | Efficiency & systems | M. Ali Bayram +2 | May 28, 2026 | 6 |
| The Hidden Power of Scaling Factor in LoRA Optimization | Efficiency & systems | Zicheng Zhang +12 | Jun 11, 2026 | 5 |
| Dynamic Linear Attention | Efficiency & systems | Xin Wang +9 | Jun 9, 2026 | 5 |
| Flash-GMM: A Memory-Efficient Kernel for Scalable Soft Clustering | Efficiency & systems | Gal Bloch +4 | Jun 9, 2026 | 3 |
| When Gradients Collide: Failure Modes of Multi-Objective Prompt Optimization for LLM Judges | Efficiency & systems | Parth Darshan +1 | May 25, 2026 | 3 |
| STRIDE: Training Data Attribution via Sparse Recovery from Subset Perturbations | Efficiency & systems | Rishit Dagli +6 | Jun 3, 2026 | 3 |
| Lius: Translation Model Based Instructional Lingustic Using Continual Instruction Tuning In Kupang Malay | Efficiency & systems | Joanito Agili Lopo +2 | Jun 10, 2026 | 2 |
| Compress-Distill: Reasoning Trace Compression for Efficient Knowledge Distillation | Efficiency & systems | Maxime Griot +2 | Jun 4, 2026 | 2 |
| Can Predicted Dynamics Exist in the Physical World? | Efficiency & systems | Barak Or | May 23, 2026 | 2 |
| The Good, the Bad, and the Ugly of Markov Boundary for Tabular Prediction | Efficiency & systems | Shu Wan +3 | May 28, 2026 | 2 |
| SigmaScale: LLM Compression with SVD-based Low-Rank Decomposition and Learned Scaling Matrices | Efficiency & systems | Ernests Lavrinovics +5 | Jun 5, 2026 | 1 |
| Pruning and Distilling Mixture-of-Experts into Dense Language Models | Efficiency & systems | Junhyuck Kim +5 | May 27, 2026 | 1 |
| Deep Embedded Multiplicative DMD for Algebra-Preserving Koopman Learning | Efficiency & systems | Kelan Gray +3 | Jun 3, 2026 | 1 |
| The Hamilton-Jacobi Theory of Deep Learning | Efficiency & systems | Jose Marie Antonio Miñoza +2 | May 27, 2026 | 1 |