April 25, 2016

How to Approximate Named-Entity Recognition

I just added named-entity recognition to the news summariser. Named-entity recognition is the identification of the proper nouns in a piece (the names of specific things), followed by classifying them. In this case, I've classified them into locations, people, and organisations (or rather, the Stanford NLP group has). It involved downloading the Stanford named-entity package, trimming the fat out of it, and putting what was left into my filesystem. I then activated it in nltk and got some results. I would like a better training set, as all artificial intelligence of this sort is only as accurate as the data it was trained on. No demo, but you can download the linked script, install the dependencies, and run it yourself to see how great (or horrible) the training set is. If you find a better one, please leave it in the comments. Thanks!
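For anyone who wants a feel for the steps before downloading anything, here is a rough sketch of what "activating it in nltk" might look like. It is not the linked script. The file paths and the helper names `tag_tokens` and `group_entities` are my own assumptions for illustration, and the sample at the bottom is hand-written rather than real tagger output. The 3-class Stanford model labels each token as PERSON, ORGANIZATION, LOCATION, or O (for "not an entity"), so the second function just glues neighbouring tokens with the same label back together.

```python
from itertools import groupby

# Hypothetical paths: point these at wherever you unpacked the Stanford NER download.
MODEL_PATH = "stanford-ner/classifiers/english.all.3class.distsim.crf.ser.gz"
JAR_PATH = "stanford-ner/stanford-ner.jar"


def tag_tokens(tokens, model_path=MODEL_PATH, jar_path=JAR_PATH):
    """Tag a list of tokens with NLTK's Stanford NER wrapper.

    Needs nltk, Java, and the two files above, so it is not called below.
    """
    # Imported lazily so the grouping code runs even without nltk installed.
    from nltk.tag import StanfordNERTagger

    tagger = StanfordNERTagger(model_path, jar_path)
    return tagger.tag(tokens)  # list of (token, label) pairs


def group_entities(tagged):
    """Merge consecutive tokens that share a label into one entity name."""
    entities = {}
    for label, run in groupby(tagged, key=lambda pair: pair[1]):
        if label == "O":  # "O" marks tokens outside any entity
            continue
        name = " ".join(token for token, _ in run)
        entities.setdefault(label, []).append(name)
    return entities


# Hand-written sample in the same shape the tagger returns.
sample = [
    ("Barack", "PERSON"), ("Obama", "PERSON"), ("visited", "O"),
    ("Google", "ORGANIZATION"), ("in", "O"), ("California", "LOCATION"),
    (".", "O"),
]
print(group_entities(sample))
```

With the real files in place, you would feed `tag_tokens(article_text.split())` into `group_entities` instead of the sample. One known weakness of this grouping: two different people named back to back get merged into a single entity, since the 3-class labels carry no begin/end markers.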
