One of the biggest issues in natural language classification is labelling, whether that means finding labellers to do the work or naively assuming AI can do that job too. Then there’s the problem of exploitative labour, which diminishes not only the labellers’ wellbeing but also the quality of the product you’re making. But this blog isn’t about that part (although you should always think about that!).
In the above video, Explosion’s Vincent Warmerdam explores bad labelling and how you can identify it in text classification data using Jupyter and Prodigy, the annotation tool from the team behind spaCy.
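If you want a feel for the general idea before watching, here is a minimal sketch of one common heuristic: flag examples where a model gives the assigned label a very low probability. This is not taken from the video. The function name `flag_suspect_labels`, the `threshold` value, and the toy data are all made up for illustration, and the probabilities are assumed to come from whatever classifier you already have (ideally from predictions on examples it was not trained on).

```python
def flag_suspect_labels(labels, probas, threshold=0.2):
    """Return indices of examples whose assigned label looks doubtful.

    labels: list of assigned label strings (hypothetical data).
    probas: list of dicts mapping label -> predicted probability.
    threshold: assumed cut-off; tune it for your own data.
    """
    suspects = []
    for index, (label, distribution) in enumerate(zip(labels, probas)):
        # Probability the model gives to the label a human assigned.
        p_assigned = distribution.get(label, 0.0)
        if p_assigned < threshold:
            suspects.append((p_assigned, index))
    # Most doubtful examples first (lowest probability for the assigned label).
    return [index for _, index in sorted(suspects)]


# Toy data, invented for this example.
labels = ["pos", "neg", "pos"]
probas = [
    {"pos": 0.90, "neg": 0.10},
    {"pos": 0.95, "neg": 0.05},
    {"pos": 0.15, "neg": 0.85},
]

suspect_indices = flag_suspect_labels(labels, probas)
```

The flagged indices are only candidates for a second look. A disagreement can just as easily mean the model is wrong, which is why a human review step still matters.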