[Datahub]: Change related datasets suggestions to be more relevant #1082
],
like: [
  {
    _index: 'gn-records',
Not sure if this should stay hardcoded here. I found it in the Docker file and somewhere else, but not as an environment variable.
This can vary across deployments and should not be hardcoded; unfortunately, the gn-ui apps don't really have access to this information :/
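One way to at least keep the literal out of the query builder is to pass the index name in and fall back to the current value. This is only a sketch: `resolveRecordsIndex` and `DEFAULT_INDEX_NAME` are hypothetical names, and it does not solve the underlying problem that the apps have no way to discover the real index name.

```typescript
// Fallback taken from the diff above; whether it suits a given deployment is an assumption.
const DEFAULT_INDEX_NAME = 'gn-records'

// Hypothetical helper: prefer an explicitly configured index name when one is provided,
// otherwise fall back to the default.
function resolveRecordsIndex(configured?: string | null): string {
  const trimmed = configured?.trim()
  return trimmed ? trimmed : DEFAULT_INDEX_NAME
}
```

The query builder would then take the resolved name as an argument instead of embedding `'gn-records'` directly.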
1ea7aa3 to 18f48e5
dataset detail page to use more similar datasets. Now ES is based on title, abstract and keywords.
more like this
suggestions for e2e
4c26e6d to 6d1db07
  default: record.abstract,
},
allKeywords: record.keywords.map(
  (keyword) => keyword.label
Does the order here matter? I tested with the MEL instance (only around 50 records; I don't know if that's enough). Some records have relevant suggestions IMO, but others are too biased by the title. For instance, a record about wind turbines in Roubaix had suggestions like "women history in Roubaix", "trees in Roubaix" and "citizen crowdfunding in Roubaix". Their only common point is Roubaix: I checked their keywords and abstracts and they have nothing else in common. What if a datahub only has datasets with city names in their titles? Will it suggest unrelated records like this all the time?
I don't know much about Elasticsearch, but could it check that the title matches AND something else matches, not just the title?
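One possible way to express "title AND something else" is a `bool` query with one `more_like_this` clause per field and `minimum_should_match: 2`, so a candidate has to resemble the current record on at least two of the three fields. This is a sketch, not what the PR implements: the `RecordLike` shape and the `resourceTitleObject.default` / `resourceAbstractObject.default` field names are assumptions (only `allKeywords` appears in the diff), and the term thresholds are illustrative.

```typescript
// Hypothetical minimal record shape, for illustration only.
interface RecordLike {
  title: string
  abstract: string
  keywords: { label: string }[]
}

// One more_like_this clause restricted to a single field.
function similarOn(field: string, text: string) {
  return {
    more_like_this: {
      fields: [field],
      like: text,
      min_term_freq: 1, // illustrative; short texts rarely repeat terms
      min_doc_freq: 1, // illustrative; small catalogs have few documents per term
    },
  }
}

// Requires similarity on at least two of title, abstract and keywords.
function buildStricterRelatedQuery(record: RecordLike) {
  return {
    bool: {
      should: [
        similarOn('resourceTitleObject.default', record.title), // assumed field name
        similarOn('resourceAbstractObject.default', record.abstract), // assumed field name
        similarOn(
          'allKeywords',
          record.keywords.map((keyword) => keyword.label).join(' ')
        ),
      ],
      minimum_should_match: 2,
    },
  }
}
```

With this shape, a record that shares only "Roubaix" in its title would satisfy a single clause and be excluded. Whether that is too strict for small catalogs would need testing against real data.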
Description
This PR changes the Elasticsearch service so that related dataset suggestions are based on the title, abstract and keywords of the current dataset.
This makes sure that the suggestions will be more relevant and more similar to the current dataset.
[For Review]: Please make sure you visit different datasets and verify that the suggestions are relevant to the dataset. I tested it with the dev.geo2france backend.
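For reviewers less familiar with Elasticsearch, the kind of query described above can be sketched as a single `more_like_this` query fed with an artificial document built from the current record. This is an approximation assembled from the diff fragments, not the exact code in the PR: the `RecordLike` shape, the `resourceTitleObject` / `resourceAbstractObject` field names and the `max_query_terms` value are assumptions.

```typescript
// Hypothetical minimal record shape, for illustration only.
interface RecordLike {
  title: string
  abstract: string
  keywords: { label: string }[]
}

// Builds a more_like_this query comparing indexed records against an
// artificial document made of the current record's title, abstract and keywords.
function buildRelatedQuery(record: RecordLike, indexName: string) {
  return {
    more_like_this: {
      fields: [
        'resourceTitleObject.default', // assumed field name
        'resourceAbstractObject.default', // assumed field name
        'allKeywords',
      ],
      like: [
        {
          _index: indexName,
          doc: {
            resourceTitleObject: { default: record.title },
            resourceAbstractObject: { default: record.abstract },
            allKeywords: record.keywords.map((keyword) => keyword.label),
          },
        },
      ],
      min_term_freq: 1, // illustrative value
      max_query_terms: 12, // illustrative value
    },
  }
}
```

Calling `buildRelatedQuery(record, 'gn-records')` yields an object that can be placed under `query` in a search request body.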
Quality Assurance Checklist
breaking change label
backport <release branch> label