Exploring the impact of fixed theta values in RoPE on character-level language model performance and efficiency

Rotary Positional Embedding (RoPE) is a widely used technique in Transformers, influenced by the hyperparameter theta (θ). However, the impact of varying *fixed* theta values, especially the trade-off between performance and efficiency on tasks like character-level modeling, remains under-explored....

Bibliographic Details
Main Authors:	Zhigao Huang, Musheng Chen, Shiyan Zheng
Format:	Article
Language:	English
Published:	Frontiers Media S.A. 2025-08-01
Series:	Frontiers in Computer Science
Subjects:	transformer; positional encoding; rotary positional embedding (RoPE); language modeling; character-level models; hyperparameter tuning
Online Access:	https://www.frontiersin.org/articles/10.3389/fcomp.2025.1626899/full

Similar Items