1. Entropy-Guided KV Caching for Efficient LLM Inference
Published 2025-07-01. “…However, their practical deployment—especially in long-context scenarios—is often hindered by the computational and memory costs associated with managing the key–value (KV) cache during inference. Optimizing this process is therefore crucial for improving LLM efficiency and scalability. …”
Article
2. AsymGroup: Asymmetric Grouping and Communication Optimization for 2D Tensor Parallelism in LLM Inference
Published 2025-01-01. “…Recent advances in Large Language Models (LLMs), such as GPT and LLaMA, have demonstrated remarkable capabilities across a wide array of natural language processing tasks. …”
Article
3. Efficient Management of Safety Documents Using Text-Based Analytics to Extract Safety Attributes From Construction Accident Reports
Published 2025-01-01. “…Future work should focus on API creation, secure machine learning pipelines, and optimized deployment of LLMs, particularly in complex contexts. …”
Article
4. Generative Artificial Intelligence-Enabled Facility Layout Design Paradigm
Published 2025-05-01. “…The convolutional knowledge graph embedding (ConvE) method is employed for link prediction, converting entities and relationships into low-dimensional vectors to infer optimal spatial arrangements while addressing data sparsity through negative sampling. …”
Article