Novel tensor norm optimization for neural network training acceleration

dc.contributor.author: Banik, Mridul, author
dc.contributor.author: ACM, publisher
dc.date.accessioned: 2025-12-22T19:09:11Z
dc.date.available: 2025-12-22T19:09:11Z
dc.date.issued: 2025-12-09
dc.description.abstract: This paper introduces an optimization algorithm designed to improve the training efficiency of neural networks, with a particular focus on the large weight matrices prevalent in large language models. Diverging from prior spectral-norm-based approaches, our method leverages the nuclear norm to formulate a novel update rule, yielding a distinct optimizer called Neon. We provide rigorous theoretical guarantees on its convergence properties via convex optimization and the Karush-Kuhn-Tucker conditions. Performance evaluations across multilayer perceptrons, convolutional neural networks, and generative models such as NanoGPT demonstrate computational advantages over existing optimizers, including Muon and AdamW. The Frobenius-based Neon variant achieves comparable or superior convergence while incurring significantly lower per-iteration overhead: O(mn) FLOPs versus Muon's O(mn · min{m, n}) for m × n matrices. This work advances more robust and faster training methodologies for complex AI systems.
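The complexity claim in the abstract can be made concrete with a rough sketch. The paper's exact Neon update rule is not reproduced here; the function below is a hypothetical illustration of why a Frobenius-norm-based update costs only O(mn) for an m × n gradient matrix (one elementwise scaling plus one reduction), whereas Muon-style spectral orthogonalization requires matrix-matrix products costing O(mn · min{m, n}). The function name, learning rate, and epsilon are illustrative assumptions, not values from the paper.

```python
import numpy as np

def frobenius_normalized_step(W, G, lr=0.02, eps=1e-8):
    """Hypothetical sketch of a Frobenius-normalized update (not the
    paper's exact Neon rule): scale the raw gradient G by its Frobenius
    norm before stepping.  Only elementwise operations and a single
    reduction over the m*n entries are needed, hence O(mn) FLOPs."""
    return W - lr * G / (np.linalg.norm(G) + eps)

# A Muon-style update would instead orthogonalize G (e.g. via
# Newton-Schulz iterations), each iteration multiplying m x n matrices
# and costing O(mn * min(m, n)) FLOPs.
```

Regardless of G's magnitude, the step taken by this sketch always has Frobenius norm ≈ lr, which is the sense in which norm-based updates decouple step size from gradient scale.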
dc.format.medium: born digital
dc.format.medium: articles
dc.identifier.bibliographicCitation: Mridul Banik. 2025. Novel Tensor Norm Optimization for Neural Network Training Acceleration. In 2025 International Conference on Artificial Intelligence and its Applications (ICARTI 2025), December 09-10, 2025, Port Louis, Mauritius. ACM, New York, NY, USA, 7 pages. https://doi.org/10.1145/3774791.3774805
dc.identifier.doi: https://doi.org/10.1145/3774791.3774805
dc.identifier.uri: https://hdl.handle.net/10217/242556
dc.language: English
dc.language.iso: eng
dc.publisher: Colorado State University. Libraries
dc.relation.ispartof: Publications
dc.relation.ispartof: ACM DL Digital Library
dc.rights.license: This work is licensed under a Creative Commons Attribution 4.0 International License.
dc.rights.uri: https://creativecommons.org/licenses/by/4.0
dc.subject: neural network optimization
dc.subject: nuclear norm
dc.subject: low-rank updates
dc.subject: gradient descent
dc.subject: deep learning
dc.title: Novel tensor norm optimization for neural network training acceleration
dc.type: Text

Files

Original bundle

Name: FACF_ACMOA_3774791.3774805.pdf
Size: 818.91 KB
Format: Adobe Portable Document Format