[1] Wang, Y. 2024. Structured Compression of Large Language Models with Sensitivity-aware Pruning Mechanisms. Journal of Computer Technology and Software. 3, 9 (Dec. 2024). DOI:https://doi.org/10.5281/zenodo.15851638.