bwang0911 committed on
Commit
0aaf6db
1 Parent(s): ea6cbd8

Update README.md

  1. README.md +28 -0
README.md CHANGED
@@ -66,6 +66,34 @@ print(cos_sim(text_embeddings[0], text_embeddings[1])) # text embedding similari
  print(cos_sim(text_embeddings[0], image_embeddings[0])) # text-image cross-modal similarity
  ```
 
+ **Note: our empirical study shows that text-text cosine similarity is typically higher than text-image cosine similarity.**
+
+ If you want to merge the two scores, we recommend one of two approaches:
+
+ 1. Take a weighted average of the text-text similarity and the text-image similarity:
+
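+ ```python
+ # pseudo code
+ # the two weights sum to 1, so the result is a weighted average
+ alpha = 0.6 # weight for text search (text-text similarity)
+ beta = 0.4 # weight for cross-modal search (text-image similarity)
+
+ combined_scores = alpha * sim(query, document) + beta * sim(text, image)
+ ```
+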
+ 2. Apply z-score normalization to each set of scores before merging them:
+
+ ```python
+ # pseudo code
+ import numpy as np
+
+ # mean and standard deviation of each score distribution
+ query_document_sim_mean = np.mean(cos_sim_query_documents)
+ query_document_sim_std = np.std(cos_sim_query_documents)
+ text_image_sim_mean = np.mean(cos_sim_text_images)
+ text_image_sim_std = np.std(cos_sim_text_images)
+
+ # z-score normalization puts both score sets on a comparable scale
+ query_document_sim_normalized = (cos_sim_query_documents - query_document_sim_mean) / query_document_sim_std
+ text_image_sim_normalized = (cos_sim_text_images - text_image_sim_mean) / text_image_sim_std
+ # sum normalized scores
+ combined_scores = query_document_sim_normalized + text_image_sim_normalized
+ ```
+
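Both merging strategies can be tried end to end with plain NumPy. The following is only a minimal sketch and not part of the model's API: the helper names `weighted_merge`, `zscore`, and `zscore_merge`, the made-up score arrays, and the small `eps` guard against a zero standard deviation are all assumptions for illustration.

```python
import numpy as np


def weighted_merge(text_text_sim, text_image_sim, alpha=0.6, beta=0.4):
    """Weighted average of two score arrays (hypothetical helper)."""
    text_text_sim = np.asarray(text_text_sim, dtype=float)
    text_image_sim = np.asarray(text_image_sim, dtype=float)
    return alpha * text_text_sim + beta * text_image_sim


def zscore(scores, eps=1e-12):
    """Standardize scores to zero mean and unit variance.

    eps is an assumed guard so identical scores do not divide by zero.
    """
    scores = np.asarray(scores, dtype=float)
    return (scores - scores.mean()) / (scores.std() + eps)


def zscore_merge(text_text_sim, text_image_sim):
    """Sum of the z-score-normalized score arrays (hypothetical helper)."""
    return zscore(text_text_sim) + zscore(text_image_sim)


# Made-up cosine similarities for one query against three candidates.
text_text_sim = [0.82, 0.75, 0.60]
text_image_sim = [0.21, 0.30, 0.15]

weighted = weighted_merge(text_text_sim, text_image_sim)
merged = zscore_merge(text_text_sim, text_image_sim)

# Candidate indices sorted from best to worst under the z-score merge.
ranking = np.argsort(-merged)
```

Because each score set is standardized first, the larger raw text-text similarities do not dominate the z-score sum the way they can in an unweighted sum of raw scores.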
  ## Performance
 
  ### Text-Image Retrieval