An implementation of the MoDeGPT LLM compression method from the ICLR 2025 paper "Modular Decomposition for Large Language Model Compression".
pruning llama lora matrix-decomposition llm llama2 llm-compression llama3 iclr-2025 weight-decomposition sparsity-allocation iclr-2025-oral
Updated Dec 16, 2025 - Python