This issue was moved to a discussion.

Bad entity recognition #13480

Closed
ax-va opened this issue May 6, 2024 · 1 comment
Labels: feat / ner (Feature: Named Entity Recognizer), lang / en (English language data and models), perf / accuracy (Performance: accuracy)

ax-va commented May 6, 2024

How to reproduce the behaviour

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Elon Musk, owner of Tesla and SpaceX, has offered to buy Twitter (now X) for $21 billion of his own money.")

# For each entity print the text and the entity label
for entity in doc.ents:
    print(entity.text, entity.label_, sep=",")
# Elon Musk,PERSON
# Tesla,ORG
# Twitter,PERSON
# $21 billion,MONEY

Twitter is labeled PERSON although it is not a person, and neither SpaceX nor X is recognized as an entity at all.
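To make the gap explicit, here is a minimal sketch that compares the output printed above with a set of expected labels. The `expected` mapping and the `compare_entities` helper are assumptions added for illustration: they are not part of spaCy, and the ORG labels for SpaceX, Twitter, and X are only what a reader would plausibly expect, not a gold annotation.

```python
# Entities as printed by the snippet above (copied from the reported output).
predicted = {
    "Elon Musk": "PERSON",
    "Tesla": "ORG",
    "Twitter": "PERSON",
    "$21 billion": "MONEY",
}

# ASSUMPTION: labels one would plausibly expect for this sentence.
# This mapping is illustrative and not taken from any gold annotation.
expected = {
    "Elon Musk": "PERSON",
    "Tesla": "ORG",
    "SpaceX": "ORG",
    "Twitter": "ORG",
    "X": "ORG",
    "$21 billion": "MONEY",
}


def compare_entities(predicted, expected):
    """Return (missed, mislabeled) relative to the expected labels.

    Hypothetical helper, not a spaCy function.
    """
    # Expected entities the pipeline did not produce at all.
    missed = sorted(text for text in expected if text not in predicted)
    # Entities that were found but got a different label: (predicted, expected).
    mislabeled = {
        text: (predicted[text], label)
        for text, label in expected.items()
        if text in predicted and predicted[text] != label
    }
    return missed, mislabeled


missed, mislabeled = compare_entities(predicted, expected)
print("missed:", missed)
print("mislabeled:", mislabeled)
```

Under these assumed labels, the report comes down to two missed entities (SpaceX and X) and one wrong label (Twitter).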

Your Environment

  • Operating System: Ubuntu 22.04.4 LTS
  • Python Version Used: 3.11
  • spaCy Version Used: 3.7.4
  • en-core-web-sm: 3.7.1
  • Environment Information: virtual environment with the following packages installed:
annotated-types==0.6.0
asttokens==2.4.1
beautifulsoup4==4.12.3
blis==0.7.11
bs4==0.0.2
catalogue==2.0.10
certifi==2024.2.2
charset-normalizer==3.3.2
click==8.1.7
cloudpathlib==0.16.0
confection==0.1.4
contourpy==1.2.0
cramjam==2.8.3
cycler==0.12.1
cymem==2.0.8
decorator==5.1.1
en-core-web-sm @ https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl#sha256=86cc141f63942d4b2c5fcee06630fd6f904788d2f0ab005cce45aadb8fb73889
executing==2.0.1
fastavro==1.9.4
fastparquet==2024.2.0
fonttools==4.50.0
fsspec==2024.3.1
idna==3.6
ipython==8.22.2
jedi==0.19.1
Jinja2==3.1.4
joblib==1.3.2
kiwisolver==1.4.5
langcodes==3.4.0
language_data==1.2.0
lxml==5.2.1
marisa-trie==1.1.0
MarkupSafe==2.1.5
matplotlib==3.8.3
matplotlib-inline==0.1.6
murmurhash==1.0.10
nltk==3.8.1
numpy==1.26.4
packaging==24.0
pandas==2.2.1
pandavro==1.8.0
parso==0.8.3
pexpect==4.9.0
pillow==10.2.0
preshed==3.0.9
prompt-toolkit==3.0.43
ptyprocess==0.7.0
pure-eval==0.2.2
pydantic==2.7.1
pydantic_core==2.18.2
Pygments==2.17.2
pyparsing==3.1.2
python-dateutil==2.9.0.post0
pytz==2024.1
regex==2024.4.28
requests==2.31.0
scikit-learn==1.4.1.post1
scipy==1.12.0
six==1.16.0
smart-open==6.4.0
soupsieve==2.5
spacy==3.7.4
spacy-legacy==3.0.12
spacy-loggers==1.0.5
srsly==2.4.8
stack-data==0.6.3
thinc==8.2.3
threadpoolctl==3.4.0
tqdm==4.66.2
traitlets==5.14.2
typer==0.9.4
typing_extensions==4.11.0
tzdata==2024.1
urllib3==2.2.1
wasabi==1.1.2
wcwidth==0.2.13
weasel==0.3.4

svlandeg added the labels feat / ner (Feature: Named Entity Recognizer), perf / accuracy (Performance: accuracy), and lang / en (English language data and models) on May 15, 2024

svlandeg (Member) commented

Hi! Let me transfer this thread to the discussion forum, as we like to keep the issue tracker focused on bug reports.

explosion locked and limited conversation to collaborators on May 15, 2024
svlandeg converted this issue into discussion #13497 on May 15, 2024
