Deduplication algorithm based on condensed nearest neighbor rule for deduplication metadata