Thanks for this benchmark set, and thanks for your writeup.
My retrieval method returns a set of top-scoring items for each query, but the number of items returned varies between queries and they come back in no particular order.
If I pad all of my returned indices with -1s and change this line to:

```python
aps.append(get_average_precision_score(
    np.logical_and(categories_DB[indices[i]] == categories_Q[i],
                   indices[i] != -1), k))
```

and then run gpr_evaluate on those indices (an n_queries by n_samples_DB array of indices padded with -1s), I end up getting a high retrieval score of about 0.87.
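In case it helps to see concretely what I mean by the padding and the mask that modified line builds, here is a minimal, self-contained sketch; `pad_with_minus_ones` is just a helper name I made up, and the toy arrays are only for illustration, not my real data:

```python
import numpy as np

def pad_with_minus_ones(retrieved_per_query, n_samples_DB):
    """Stack variable-length lists of retrieved DB indices into a fixed-size
    (n_queries, n_samples_DB) array, filling unused slots with -1."""
    padded = -np.ones((len(retrieved_per_query), n_samples_DB), dtype=int)
    for i, idx in enumerate(retrieved_per_query):
        padded[i, :len(idx)] = idx
    return padded

# Example: 2 queries, a DB of 5 items, a variable number of returned indices.
retrieved = [[3, 0], [4, 1, 2]]
indices = pad_with_minus_ones(retrieved, n_samples_DB=5)
# indices == [[ 3,  0, -1, -1, -1],
#             [ 4,  1,  2, -1, -1]]

# The relevance mask my modified line builds for query i:
categories_DB = np.array([0, 1, 1, 0, 2])   # toy DB labels
categories_Q = np.array([0, 1])             # toy query labels
i = 0
relevant = np.logical_and(categories_DB[indices[i]] == categories_Q[i],
                          indices[i] != -1)
# The -1 padding still indexes categories_DB (it picks the last element),
# but the `indices[i] != -1` term zeroes those positions out in the AND.
```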
I've also tried sklearn's `precision_recall_fscore_support(targets, preds, average='binary')`, where `targets` is a 1D boolean array of length 12000 that is True where the DB category matches the query category, and `preds` is a 1D boolean array of length 12000 that is True at the indices returned by my retrieval method.
Using sklearn's precision_recall_fscore_support that way and averaging over all queries, my mAP is about 0.05, my mean precision is 0.388, and my mean recall is 0.114.
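For reference, this is how I'm calling sklearn for a single query, with made-up arrays just to show the shapes and dtypes (the numbers above come from my real data, not from this toy example):

```python
import numpy as np
from sklearn.metrics import precision_recall_fscore_support

# Toy stand-in for one query against a 12000-item DB.
n_samples_DB = 12000
rng = np.random.default_rng(0)

# targets: True where the DB item's category matches this query's category.
targets = rng.random(n_samples_DB) < 0.1

# preds: True at the indices my retrieval method returned for this query.
returned = rng.choice(n_samples_DB, size=300, replace=False)
preds = np.zeros(n_samples_DB, dtype=bool)
preds[returned] = True

precision, recall, fscore, _ = precision_recall_fscore_support(
    targets, preds, average='binary')
# I then average precision and recall over all queries.
```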
I want to compare my results to the ones you obtain from nearest-neighbor indices, but the scores from my modified version of your code seem too high compared to the retrieval scores I get based on binary AP.