Policy Similarity Measure for Two-Player Zero-Sum Games
| Main Authors: | , , |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | MDPI AG, 2025-03-01 |
| Series: | Applied Sciences |
| Subjects: | |
| Online Access: | https://www.mdpi.com/2076-3417/15/5/2815 |
| Summary: | Policy space response oracles (PSRO) is an important algorithmic framework for approximating Nash equilibria in two-player zero-sum games. Enhancing policy diversity has been shown to improve the performance of PSRO in this approximation process significantly. However, existing diversity metrics are often prone to redundancy, which can hinder optimal strategy convergence. In this paper, we introduce the policy similarity measure (PSM), a novel approach that combines Gaussian and cosine similarity measures to assess policy similarity. We further incorporate the PSM into the PSRO framework as a regularization term, effectively fostering a more diverse policy population. We demonstrate the effectiveness of our method in two distinct game environments: a non-transitive mixture model and Leduc poker. The experimental results show that the PSM-augmented PSRO outperforms baseline methods in reducing exploitability by approximately 7% and exhibits greater policy diversity in visual analysis. Ablation studies further validate the benefits of combining Gaussian and cosine similarities in cultivating more diverse policy sets. This work provides a valuable method for measuring and improving the policy diversity in two-player zero-sum games. |
| ISSN: | 2076-3417 |