Showing 1-4 of 4 results for search 'llms inference optimization' (query time: 0.07s)
  1.

    Entropy-Guided KV Caching for Efficient LLM Inference by Heekyum Kim, Yuchul Jung

    Published 2025-07-01
    “…However, their practical deployment—especially in long-context scenarios—is often hindered by the computational and memory costs associated with managing the key–value (KV) cache during inference. Optimizing this process is therefore crucial for improving LLM efficiency and scalability. …”
    Article
  2.

    AsymGroup: Asymmetric Grouping and Communication Optimization for 2D Tensor Parallelism in LLM Inference by Ki Tae Kim, Seok-Ju Im, Eui-Young Chung

    Published 2025-01-01
    “…Recent advances in Large Language Models (LLMs), such as GPT and LLaMA, have demonstrated remarkable capabilities across a wide array of natural language processing tasks. …”
    Article
  3.

    Efficient Management of Safety Documents Using Text-Based Analytics to Extract Safety Attributes From Construction Accident Reports by Vedat Togan, Fatemeh Mostofi, Onur Behzat Tokdemir, Fethi Kadioglu

    Published 2025-01-01
    “…Future work should focus on API creation, secure machine learning pipelines, and optimized deployment of LLMs, particularly in complex contexts.…”
    Article
  4.

    Generative Artificial Intelligence-Enabled Facility Layout Design Paradigm by Fuwen Hu, Chun Wang, Xuefei Wu

    Published 2025-05-01
    “…The convolutional knowledge graph embedding (ConvE) method is employed for link prediction, converting entities and relationships into low-dimensional vectors to infer optimal spatial arrangements while addressing data sparsity through negative sampling. …”
    Article