Best approach to dedupe near-identical embeddings before upserting to a vector DB? Cosine threshold feels brittle across domains.
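
For reference, the cosine-threshold baseline in question can be sketched roughly as below. This is a minimal, pure-Python illustration, not any vector DB's API: the names `dedupe_by_cosine` and `_normalize` and the default threshold of 0.95 are assumptions made up for this example. It greedily keeps a vector only if its cosine similarity to every already-kept vector is below the threshold.

```python
import math
from typing import List, Sequence


def _normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def dedupe_by_cosine(vectors: Sequence[Sequence[float]],
                     threshold: float = 0.95) -> List[int]:
    """Greedy near-duplicate filter (hypothetical helper).

    Returns the indices of vectors to keep. A vector is dropped when its
    cosine similarity to any previously kept vector is >= threshold.
    The threshold value is an assumption and would need tuning per domain.
    """
    kept_idx: List[int] = []
    kept_unit: List[List[float]] = []
    for i, vec in enumerate(vectors):
        unit = _normalize(vec)
        # For unit vectors, the dot product equals the cosine similarity.
        is_dup = any(
            sum(a * b for a, b in zip(unit, k)) >= threshold
            for k in kept_unit
        )
        if not is_dup:
            kept_idx.append(i)
            kept_unit.append(unit)
    return kept_idx


if __name__ == "__main__":
    batch = [[1.0, 0.0], [0.999, 0.01], [0.0, 1.0]]
    print(dedupe_by_cosine(batch))
```

The sketch compares every candidate against every kept vector, so it is quadratic in the batch size, and the result depends on input order. The single global `threshold` is exactly the part that feels brittle: the same value can over-merge in one domain and under-merge in another.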