International Journal on Science and Technology
E-ISSN: 2229-7677
Impact Factor: 9.88
A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal
Memory-Efficient LLM Training and Inference: Balancing Capacity, Speed, and Environmental Impact
| Author(s) | Smitha Shivashankaraiah |
|---|---|
| Country | United States |
| Abstract | The memory requirements of large language models (LLMs) present a critical bottleneck for both training and inference. While numerous techniques exist to reduce memory usage, most surveys organize them by mechanism or by the specific bottleneck addressed. This paper takes a different approach. We argue that real-world LLM deployment must balance three fundamental constraints: memory capacity, memory speed, and environmental impact. Capacity-focused techniques such as CXL memory pooling prioritize fitting large models and long contexts. Speed-focused techniques such as near-memory compute prioritize low latency for real-time and agentic workloads. Environmental factors (carbon, water, noise, vibration, and e-waste) impose social and regulatory constraints that can override technical advantages. We present a comparative analysis of these three factors, a decision matrix for different workloads, and recommendations for engineers designing LLM infrastructure. Our conclusion is that memory efficiency is not merely a technical problem but a systems problem requiring trade-offs across capacity, speed, and sustainability. |
| Keywords | Memory efficiency; Large language models; GPU memory; CXL; HBM; Environmental sustainability; AI infrastructure |
| Field | Engineering |
| Published In | Volume 17, Issue 2, April-June 2026 |
| Published On | 2026-05-12 |
| DOI | https://doi.org/10.71097/IJSAT.v17.i2.11323 |
A CrossRef DOI is assigned to each research paper published in this journal. The IJSAT DOI prefix is 10.71097/IJSAT.
All research papers published on this website are licensed under Creative Commons Attribution-ShareAlike 4.0 International License, and all rights belong to their respective authors/researchers.