

Case Study for Text Analytics (NLP) with Machine Learning:

Situation:
Vinod works at an AI company. His client, a librarian, needs to analyze an online book and obtain data on the words it contains and their characteristics. These word-level features include sentiment, part of speech, and so on. By processing the sentences with natural language processing (NLP), the client wants to get the features of each word.

Solution:
Since the book runs to more than 300 pages, it is necessary to organize the word data so that the model can work with it effectively. To categorize the text with an NLP approach, we can apply statistical techniques for analyzing sentiment and other aspects of the text. These techniques rest on the two kinds of learning in the machine learning approach: supervised and unsupervised learning.

Supervised learning: Here the model is trained on labeled datasets, so it learns to produce tagged text matching the user's choices, and various algorithms then extract data from the tagged texts. NLP offers several features for processing the text documents. Tokenization is one of them: the text of the book is separated into pieces that the machine can work out easily. Another feature is identifying and assigning each word's part of speech. The next feature is named-entity recognition, which extracts the named entities in the text, such as places, people, and titles. The next method is sentiment analysis, which determines whether a piece of text is positive, negative, or neutral and lets us extract text by category. The final method is classification, which categorizes the data to produce accurate results quickly.
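
A minimal sketch of these supervised steps, assuming spaCy for the linguistic features and scikit-learn for the trained classifier (the case study names no specific libraries, and the training data here is a toy stand-in):

# Setup: pip install spacy scikit-learn
#        python -m spacy download en_core_web_sm
import spacy
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

nlp = spacy.load("en_core_web_sm")
doc = nlp("Vinod reviewed the novel by Jane Austen in London last spring.")

# Tokenization: the text is separated into pieces the machine can handle.
print([token.text for token in doc])

# Part-of-speech tagging: each token is assigned a grammatical category.
print([(token.text, token.pos_) for token in doc])

# Named-entity recognition: people, places, titles, and so on.
print([(ent.text, ent.label_) for ent in doc.ents])

# Sentiment classification: train on a small labeled set (toy data for
# illustration only), then predict the category of new text.
train_texts = ["a wonderful, moving story", "dull and badly written",
               "I loved every chapter", "a tedious waste of time"]
train_labels = ["positive", "negative", "positive", "negative"]
classifier = make_pipeline(TfidfVectorizer(), LogisticRegression())
classifier.fit(train_texts, train_labels)
print(classifier.predict(["a wonderful book that I loved"]))  # ['positive']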

Unsupervised learning: In this method the data comes with no labeled set; the model finds structure by grouping. Since the word data can be organized hierarchically, one method is clustering, which groups the documents so they can then be sorted; another method is latent semantic indexing, which identifies and searches for words and phrases that are frequently used with each other. We can explore the data with another unsupervised learning technique called matrix factorization: the text data is combined into one large matrix, which is then decomposed into two smaller matrices whose product approximates the original; the hidden dimensions found this way are called latent factors, and they make it possible to measure how similar documents are. Through this matrix-based approach we can analyze the data by its structure and recall similar sentences later.
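
A short sketch of these unsupervised steps, assuming scikit-learn (again a library choice of ours, with a toy corpus standing in for the book's pages):

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.cluster import KMeans

pages = [
    "the detective searched the old library for clues",
    "the librarian catalogued rare books and manuscripts",
    "the detective questioned a suspect about the crime",
    "rare manuscripts were stored in the library archive",
]

# Build the document-term matrix (rows are pages, columns are word weights).
vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(pages)

# Latent semantic indexing: factorize the matrix into a small number of
# latent factors that capture words frequently used with each other.
svd = TruncatedSVD(n_components=2, random_state=0)
X_latent = svd.fit_transform(X)

# Clustering: group the pages in the latent space so they can be sorted.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X_latent)
print(labels)  # e.g. [0, 1, 0, 1]: detective pages vs. library pages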

To extract exact information from the sentences in the book, NLP classifies the information at several levels: semantic information gives the correct meaning of a word as it is used in the sentence; syntactic information describes how the sentence is structured and therefore how we read it; and finally there is the context of the sentence as a whole. Based on these criteria we need to pick a good machine learning model to build the application. A hybrid machine learning model system works through the following levels: the initial, low-level process runs over the raw text and turns it into structured data by tuning the model; the mid-level text function extracts the content of the text; and the final, high-level text function determines and applies sentiment to the text with an NLP approach and summarizes it, generating the text data.
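
A compact sketch of these three levels, assuming NLTK's VADER sentiment analyzer (a toolkit choice of ours; the regex sentence splitting and the pick-the-strongest-sentence summary are simplifications for illustration):

# Setup: pip install nltk, plus the one-time lexicon download below.
import re
import nltk
from nltk.sentiment import SentimentIntensityAnalyzer

nltk.download("vader_lexicon")  # sentiment word list used by VADER

raw_text = ("The plot was gripping and the characters felt alive. "
            "The final chapters, however, dragged on far too long.")

# Low level: turn the raw text into structured data (a list of sentences).
sentences = [s for s in re.split(r"(?<=[.!?])\s+", raw_text) if s]

# Mid level: extract the content of each sentence (here, its words).
contents = [re.findall(r"\w+", s.lower()) for s in sentences]

# High level: score each sentence's sentiment, then summarize by keeping
# the most strongly opinionated sentence.
sia = SentimentIntensityAnalyzer()
scores = [sia.polarity_scores(s)["compound"] for s in sentences]
summary = max(zip(sentences, scores), key=lambda pair: abs(pair[1]))[0]

for sentence, score in zip(sentences, scores):
    print(f"{score:+.2f}  {sentence}")
print("Summary:", summary)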

Results:
By taking this detailed view when choosing the machine learning approach and the NLP model, we can generate the data from the book along with its word features.
