[1] Y. Wang, “Structured Compression of Large Language Models with Sensitivity-aware Pruning Mechanisms”, JCTS, vol. 3, no. 9, Dec. 2024.