BookNLP is a natural language processing pipeline for books

One of the most challenging things about natural language processing (NLP) is how broad the field is. Where do you start? One of the best things about NLP is that you can narrow it down with tools like trained models and pipelines.

BookNLP is an NLP pipeline that works specifically on books and long-form documents in English. Features include:

Part-of-speech (POS) tagging
Dependency parsing
Named entity recognition (NER)
Character name clustering and coreference resolution
Event tagging
Referential gender inference

BookNLP comes with small and large BERT language models for different needs. The small model is the better fit for personal computers, while the large model works best on more powerful machines and bigger projects.
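To make the choice between the two models concrete, here is a minimal sketch of how a run might be set up in Python. The import path, the "pipeline" and "model" keys, and the "small"/"big" values are written as I recall them from the BookNLP README, so check them against the version you install. The helper names build_model_params and run_booknlp, and the file paths in the usage note, are hypothetical and only there for illustration.

```python
# Assumption: BookNLP reads its settings from a dict with a comma-separated
# "pipeline" string and a "model" size. Verify against the project README.
DEFAULT_PIPELINE = ("entity", "quote", "supersense", "event", "coref")
KNOWN_MODEL_SIZES = ("small", "big")  # assumed names for the two models


def build_model_params(model_size="small", pipeline=DEFAULT_PIPELINE):
    """Build the settings dict (hypothetical helper, not part of BookNLP)."""
    if model_size not in KNOWN_MODEL_SIZES:
        raise ValueError(f"model_size must be one of {KNOWN_MODEL_SIZES}")
    return {"pipeline": ",".join(pipeline), "model": model_size}


def run_booknlp(input_file, output_directory, book_id, model_size="small"):
    """Run the pipeline on one plain-text book (hypothetical wrapper)."""
    # Imported here so the settings helper works even without BookNLP installed.
    from booknlp.booknlp import BookNLP

    booknlp = BookNLP("en", build_model_params(model_size))
    booknlp.process(input_file, output_directory, book_id)
```

On a laptop you might call run_booknlp("books/my_novel.txt", "output/my_novel/", "my_novel") and leave the model at "small", then switch to model_size="big" on a machine with a GPU or more cores.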


LOGiCFACE — A STEM blog focussing on Black people and POCs.
