Showing 1-4 of 4 results for search 'llms inference optimization' (query time: 0.07s)
  1.

    Entropy-Guided KV Caching for Efficient LLM Inference by Heekyum Kim, Yuchul Jung

    Published 2025-07-01
    “…However, their practical deployment—especially in long-context scenarios—is often hindered by the computational and memory costs associated with managing the key–value (KV) cache during inference. Optimizing this process is therefore crucial for improving LLM efficiency and scalability. …”
    Article
  2.

    AsymGroup: Asymmetric Grouping and Communication Optimization for 2D Tensor Parallelism in LLM Inference by Ki Tae Kim, Seok-Ju Im, Eui-Young Chung

    Published 2025-01-01
    “…Recent advances in Large Language Models (LLMs), such as GPT and LLaMA, have demonstrated remarkable capabilities across a wide array of natural language processing tasks. …”
    Article
  3.

    Efficient Management of Safety Documents Using Text-Based Analytics to Extract Safety Attributes From Construction Accident Reports by Vedat Togan, Fatemeh Mostofi, Onur Behzat Tokdemir, Fethi Kadioglu

    Published 2025-01-01
    “…Future work should focus on API creation, secure machine learning pipelines, and optimized deployment of LLMs, particularly in complex contexts.…”
    Article
  4.

    Generative Artificial Intelligence-Enabled Facility Layout Design Paradigm by Fuwen Hu, Chun Wang, Xuefei Wu

    Published 2025-05-01
    “…The convolutional knowledge graph embedding (ConvE) method is employed for link prediction, converting entities and relationships into low-dimensional vectors to infer optimal spatial arrangements while addressing data sparsity through negative sampling. …”
    Article