natural language processing
Popular uses of machine learning involving text are generally referred to as natural language processing
(or NLP for short) tasks.
A figurative illustration of document clustering - described below.
These tasks include popular problems like sentiment analysis, spam detection, and document clustering that automate (at very large scale) simple cognitive tasks involving the analysis and understanding of text documents.
Spam detection is a classic example. Built into every modern email product, a spam detector uses simple concepts from AI to automatically distinguish between real and spam email messages sent to you. After passing through the detector, each new email is routed either to your inbox (if the message is real) or to your spam folder (if it is not).
Sentiment analysis is a more recent popular NLP problem. By scanning a large body of tweets or other social media messages, businesses can better gauge the aggregate feelings of large swaths of customers toward their products and services.
Two ‘Bag of Words’ (or BoW for short) vector representations of two short reviews of a controversial comedy movie, one with a positive opinion and the other with a negative one. The polar opposite sentiment of these two reviews is reflected in their BoW representations, which are perpendicular to each other. Image taken from our book, Machine Learning Refined.
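To make the Bag of Words idea concrete, here is a minimal Python sketch. The two reviews, the vocabulary, and the helper names `bag_of_words` and `cosine_similarity` are all invented for illustration and are not taken from the book. It assumes the vocabulary is restricted to sentiment-bearing words, with common words like "a" and "and" left out; under that assumption the two count vectors share no nonzero entries, so their inner product is zero and they are perpendicular.

```python
import math
import re
from collections import Counter


def bag_of_words(text, vocabulary):
    """Return a count vector: one entry per vocabulary word."""
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    return [counts[word] for word in vocabulary]


def cosine_similarity(u, v):
    """Cosine of the angle between two nonzero vectors (0 means perpendicular)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical reviews, made up for this example.
positive_review = "Funny, clever, and a joy to watch. Funny from start to finish."
negative_review = "Boring, offensive, and dull. A waste of time."

# Hypothetical vocabulary of sentiment words only (an assumption of this sketch).
vocabulary = ["boring", "clever", "dull", "funny", "joy", "offensive", "waste"]

positive_vector = bag_of_words(positive_review, vocabulary)
negative_vector = bag_of_words(negative_review, vocabulary)

print(positive_vector)  # [0, 1, 0, 2, 1, 0, 0]
print(negative_vector)  # [1, 0, 1, 0, 0, 1, 1]
print(cosine_similarity(positive_vector, negative_vector))  # 0.0
```

Real reviews rarely separate this cleanly. If the vocabulary included shared words, the vectors would no longer be exactly perpendicular, only far apart in angle.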
Document clustering is a method for automatically organizing large collections of documents based on the similarity of their content. This style of ‘unsupervised’ NLP, which needs no human-provided labels, is often used to relieve business bottlenecks where humans would otherwise have to analyze large collections of documents by hand.
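As a rough illustration of grouping documents by content similarity, the sketch below uses one very simple greedy scheme: each document joins the first existing cluster whose first member is similar enough to it, and otherwise starts a new cluster. This is only one of many possible clustering approaches and is not claimed to be the method used in practice. The documents, the `threshold` value of 0.3, and the function names are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def word_counts(text):
    """Bag of Words representation stored as a word -> count mapping."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two word-count mappings."""
    dot = sum(a[word] * b[word] for word in a if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_documents(documents, threshold=0.3):
    """Greedily group documents; returns lists of document indices."""
    leaders = []   # word counts of the first document in each cluster
    clusters = []  # indices of the documents assigned to each cluster
    for index, text in enumerate(documents):
        counts = word_counts(text)
        for leader, members in zip(leaders, clusters):
            if cosine_similarity(counts, leader) >= threshold:
                members.append(index)
                break
        else:
            # No existing cluster was similar enough, so start a new one.
            leaders.append(counts)
            clusters.append([index])
    return clusters


# Hypothetical documents, made up for this example.
documents = [
    "Invoice overdue: please send payment for the invoice",
    "The soccer match ended with a late goal",
    "Payment reminder for your invoice",
    "A goal in the final minute won the soccer match",
]

clusters = cluster_documents(documents)
print(clusters)  # [[0, 2], [1, 3]]
```

No labels are supplied anywhere: the billing messages and the soccer reports end up in separate groups purely because of the words they share.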