@aolieman

Added a types argument to annotate() and candidates(), which enables server-side filtering of resources by type. It also complements the existing policy parameter.

I've tested it on both kinds of backends, but it only works properly with the Lucene-backed web service. This is, however, not a bug in pyspotlight and seems to be an unnoticed bug in Spotlight's statistical backend. It will be discussed in DBpS issue #251.

@aolieman
Author

The problem with using a types filter on the statistical backend turned out to be a matter of missing documentation. Besides setting the types parameter, coreferenceResolution=false needs to be passed to the API for the filter to take effect. Because this behavior might change in the near future (see #251), I would not suggest changing the signature of annotate() solely to accommodate it.

But I would still like this to work asap ;-). My suggestion is to collect all filter-related parameters in a single filters argument, which accepts a dictionary with any optional filters. I'm not sure if it's necessary, but I've included policy=whitelist as a default in the filter_kwargs dictionary that is merged into the payload, to ensure that existing usage of pyspotlight is not disturbed.

Usage example:

import spotlight

only_person_filter = {
    'policy': "whitelist",
    'types': "DBpedia:Person",
    'coreferenceResolution': False
}

spotlight.annotate("http://localhost:2223/rest/annotate",
                   "Komen Albert Verlinde en Metallica elkaar wel eens tegen in de showbizz?",
                   filters=only_person_filter)
# [{u'similarityScore': 0.9999999700393123, u'surfaceForm': u'Albert Verlinde', u'support': 76, u'offset': 6, u'URI': u'http://nl.dbpedia.org/resource/Albert_Verlinde', u'percentageOfSecondRank': 0.0, u'types': u'DBpedia:Agent,Schema:Person,DBpedia:Http://xmlns.com/foaf/0.1/Person,DBpedia:Person,DBpedia:Presenter'}]
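
For illustration, the way the default is merged with user-supplied filters could be sketched roughly as follows. This is a simplified sketch and not the actual code from the pull request: build_payload is a hypothetical helper, and the confidence and support keys are only assumed stand-ins for the other request parameters. Only the filter_kwargs default of policy=whitelist is taken from the description above.

```python
def build_payload(text, confidence=0.0, support=0, filters=None):
    """Hypothetical helper: merge optional filters into the request payload."""
    # Default that keeps existing behaviour when no filters are passed.
    filter_kwargs = {'policy': 'whitelist'}
    # User-supplied filters override or extend the default.
    filter_kwargs.update(filters or {})

    # Assumed base parameters; the real payload may contain other keys.
    payload = {'text': text, 'confidence': confidence, 'support': support}
    payload.update(filter_kwargs)
    return payload


only_person_filter = {
    'policy': "whitelist",
    'types': "DBpedia:Person",
    'coreferenceResolution': False,
}
payload = build_payload("Albert Verlinde", filters=only_person_filter)
```

With this shape, callers that pass no filters still send policy=whitelist, while any key in the filters dictionary is forwarded to the API unchanged, so new filter parameters would not require another signature change.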

@originell
Contributor

That sounds great :D Are you still using it this way? =)

@aolieman
Author

Yes, I am. By using a single filters argument, the signature of annotate and candidates only needs to change once. I think there are still plans to change the filter parameters in DBp Spotlight, but I'm not sure what the implementation status is.
Would you like to incorporate my changes into pyspotlight?

@aolieman
Author

aolieman commented May 1, 2014

Hi @originell,
Adding the filters argument is still relevant. Would you mind merging this pull request and publishing an updated release on PyPI?
Or, if you are not interested in maintaining pyspotlight on PyPI, would you consider letting me submit new releases there for the time being? This is a nice wrapper and I would like to use it in many projects. In some cases, however, it is essential to use the version from PyPI.

Thanks!
