Counting spurious entities #15

Open
anjiefang opened this issue Sep 17, 2019 · 5 comments

anjiefang commented Sep 17, 2019

Hi,

I found an issue when counting spurious entities. In lines 317-322 of ner_eval.py, when a spurious entity is found, the spurious count is incremented for every entity type present in the example. Should it only be incremented for the type of the spurious prediction, i.e. should this:

for true in tags:
    evaluation_agg_entities_type[true]['strict']['spurious'] += 1
    evaluation_agg_entities_type[true]['ent_type']['spurious'] += 1
    evaluation_agg_entities_type[true]['partial']['spurious'] += 1
    evaluation_agg_entities_type[true]['exact']['spurious'] += 1

be changed to this

evaluation_agg_entities_type[pred.e_type]['strict']['spurious'] += 1
evaluation_agg_entities_type[pred.e_type]['ent_type']['spurious'] += 1
evaluation_agg_entities_type[pred.e_type]['partial']['spurious'] += 1 
evaluation_agg_entities_type[pred.e_type]['exact']['spurious'] += 1

?
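
To make the difference concrete, here is a small standalone sketch (the per-type counters are simplified stand-ins, not the actual ner_eval.py structures) showing how the two strategies diverge for a single spurious PER prediction in an example whose gold tags are PER and LOC:

from collections import defaultdict

# Simplified per-type counters; only the 'spurious' field is modelled.
current = defaultdict(lambda: {'spurious': 0})   # current behaviour
proposed = defaultdict(lambda: {'spurious': 0})  # proposed behaviour

tags = ['PER', 'LOC']   # gold entity types found in this example
pred_e_type = 'PER'     # type of the spurious prediction

# Current code: every gold tag is charged for the spurious prediction.
for true in tags:
    current[true]['spurious'] += 1

# Proposed change: only the predicted type is charged.
proposed[pred_e_type]['spurious'] += 1

print(dict(current))    # {'PER': {'spurious': 1}, 'LOC': {'spurious': 1}}
print(dict(proposed))   # {'PER': {'spurious': 1}}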

Thanks.
Andy.

ivyleavedtoadflax commented Sep 20, 2019

Hi Andy, thanks very much for taking the time to create an issue. Looking at the code, it seems I was unsure about this too, as I left this comment:

                # or when it simply does not appear in the test set, then it is
                # spurious, but it is not clear where to assign it at the tag
                # level. In this case, it is applied to all target_tags
                # found in this example. This will mean that the sum of the
                # evaluation_agg_entities will not equal evaluation.

What do you think about it @davidsbatista?

@ivyleavedtoadflax

Also @anjiefang, you may be interested to see that we have started to convert this code into a module here: https://github.com/ivyleavedtoadflax/nervaluate, although we haven't got much further with it yet. I have a task coming up for which I will need to use it, so I hope to get more time to develop it in the near future.

@amlarraz

I'm working with the library and I've found what I think is a mistake in this part of the code.
When the predicted entity is not in the list of true entities, its offsets do not exactly match any of the true entities, and it has no overlap with any of them, the code adds 1 to the 'spurious' field of every label. This follows from the note:

NOTE: when pred.e_type is not found in tags
or when it simply does not appear in the test set, then it is
spurious, but it is not clear where to assign it at the tag
level. In this case, it is applied to all target_tags
found in this example. This will mean that the sum of the
evaluation_agg_entities will not equal evaluation

but there is no check to ensure that the predicted label is not in the label set.

Maybe it is necessary to add if pred.e_type not in tags: before the for true in tags: loop here?
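
A rough sketch of what that guard could look like around the existing block (this is only an illustration of the suggestion, not the exact code in ner_eval.py):

# Fall back to charging every gold tag only when the predicted type is
# genuinely absent from this example's gold tags; otherwise the spurious
# count can be attributed to pred.e_type directly.
if pred.e_type not in tags:
    for true in tags:
        evaluation_agg_entities_type[true]['strict']['spurious'] += 1
        evaluation_agg_entities_type[true]['ent_type']['spurious'] += 1
        evaluation_agg_entities_type[true]['partial']['spurious'] += 1
        evaluation_agg_entities_type[true]['exact']['spurious'] += 1
else:
    evaluation_agg_entities_type[pred.e_type]['strict']['spurious'] += 1
    evaluation_agg_entities_type[pred.e_type]['ent_type']['spurious'] += 1
    evaluation_agg_entities_type[pred.e_type]['partial']['spurious'] += 1
    evaluation_agg_entities_type[pred.e_type]['exact']['spurious'] += 1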

@ivyleavedtoadflax

Hi @amlarraz. Many thanks for your comment. Could you possibly open a PR for this?

@amlarraz

No problem, I've just created the pull request.
Many thanks for your work!
