Volume: 3 Issue: 8
ISSN: 2321-8169
5167 - 5172
_______________________________________________________________________________________________
Sujata M.Thamke
U.R.S Kalyani
IT Department, KMIT
KMIT
Hyderabad, India
upadhyayula.kalyani@gmail.com
Abstract-Use of social media such as WhatsApp, Facebook, Twitter and blogs is increasing day by day, which lets people freely comment and share
their views, opinions and suggestions on topics such as politics, business, advertising and entertainment. These comments can be positive,
negative or neutral, and may express likes, dislikes, approval, disapproval or emotions, all of which are types of sentiment. Sentiment analysis
judges these sentiments to determine whether a given sentiment is positive, negative or neutral. In this paper we discuss the concept of
polarity in sentiment analysis using the polarity movie review dataset from Bo Pang and Lillian Lee.
Keywords-Sentiment analysis; Polarity; Natural Language Processing
__________________________________________________*****_________________________________________________
I. INTRODUCTION
II. METHODOLOGY
A. Naïve Bayes
Naïve Bayes is a simple probabilistic classifier and a
supervised machine learning approach. It applies Bayes' theorem
with strong independence assumptions and is taken from Bayesian
statistics. A Naïve Bayes classifier requires only a small
amount of training data to estimate the means and variances of
the variables necessary for classification. Because the
variables are assumed to be independent, only the variances of
the variables need to be determined. In RapidMiner, the data
table is used as the training data set for the Naïve Bayes
operator. The Laplace correction parameter, whose range is
Boolean, prevents zero probabilities from having a high
influence, which is the main advantage of the Naïve Bayes
classifier.
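For reference, the classifier can be written in its usual multinomial form for text. This is a standard textbook formulation offered as a sketch, not a description of RapidMiner's internals. The symbols are notation introduced here: c is a class, d a document with words w_1 ... w_n, N_c the number of training documents in class c, N the total number of training documents, V the vocabulary, and count(w, c) the number of times word w occurs in the training documents of class c. Add-one smoothing is assumed as the form of the Laplace correction.

```latex
\begin{align}
P(c \mid d) &\propto P(c) \prod_{i=1}^{n} P(w_i \mid c), \\
\hat{P}(c) &= \frac{N_c}{N}, \\
\hat{P}(w \mid c) &= \frac{\operatorname{count}(w, c) + 1}{\sum_{w' \in V} \operatorname{count}(w', c) + |V|}.
\end{align}
```

The +1 in the numerator and |V| in the denominator keep a word that never occurs in a class from forcing the whole product to zero.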
Set    Words                                Class
Train  Chinese Beijing Chinese              Chinese
Train  Chinese Chinese Shilong              Chinese
Train  Chinese Mizoram                      Chinese
Train  Tokyo Japan Chinese                  Japan
Test   Chinese Chinese Chinese Tokyo Japan  ?

b) Conditional Probabilities
IJRITCC | August 2015, Available @ http://www.ijritcc.org
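c) Testing Document
Choosing a Class
P(c|d5) ∝ P(c) * P(Chinese|c)^3 * P(Tokyo|c) * P(Japan|c)
P(Chinese|d5) ∝ 3/4 * 3/7 * 3/7 * 3/7 * 1/14 * 1/14
              ≈ 0.0003
P(Japan|d5) ∝ 1/4 * 2/9 * 2/9 * 2/9 * 2/9 * 2/9
            ≈ 0.0001
Since 0.0003 > 0.0001, d5 is assigned to the class Chinese.
Document 5 (d5) is common for both training and
testing.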
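The worked example can be reproduced with a few lines of Python. This is a minimal sketch, assuming add-one (Laplace) smoothing and the grouping of words into documents implied by the fractions in the calculation (3/4, 3/7, 1/14, 2/9). The names `TRAIN`, `TEST_DOC`, `train`, `cond_prob` and `score` are illustrative and do not come from the paper or from RapidMiner.

```python
from collections import Counter
from fractions import Fraction

# Toy corpus from the worked example: (words, class) pairs for training.
TRAIN = [
    ("Chinese Beijing Chinese", "Chinese"),
    ("Chinese Chinese Shilong", "Chinese"),
    ("Chinese Mizoram", "Chinese"),
    ("Tokyo Japan Chinese", "Japan"),
]
TEST_DOC = "Chinese Chinese Chinese Tokyo Japan"  # document d5


def train(docs):
    """Return class priors, per-class word counts, and the vocabulary."""
    class_docs = Counter(label for _, label in docs)
    word_counts = {label: Counter() for label in class_docs}
    for text, label in docs:
        word_counts[label].update(text.split())
    vocab = {w for counts in word_counts.values() for w in counts}
    priors = {c: Fraction(n, len(docs)) for c, n in class_docs.items()}
    return priors, word_counts, vocab


def cond_prob(word, label, word_counts, vocab):
    """Add-one (Laplace) smoothed estimate of P(word | label)."""
    counts = word_counts[label]
    return Fraction(counts[word] + 1, sum(counts.values()) + len(vocab))


def score(text, label, priors, word_counts, vocab):
    """Unnormalised P(label | text): the prior times the word likelihoods."""
    result = priors[label]
    for word in text.split():
        result *= cond_prob(word, label, word_counts, vocab)
    return result


priors, word_counts, vocab = train(TRAIN)
scores = {c: score(TEST_DOC, c, priors, word_counts, vocab) for c in priors}
predicted = max(scores, key=scores.get)

for label, value in scores.items():
    print(f"{label}: {float(value):.4f}")
print("predicted class:", predicted)
```

Exact fractions are used so that the intermediate conditional probabilities can be compared directly with the hand calculation; the scores are unnormalised, so only their ranking matters for choosing the class.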
III. WORK CONTRIBUTION
b) Performance
The Performance operator is used for performance evaluation. It
delivers a list of performance criteria values. These criteria
are determined automatically in order to fit the learning task
type.
C. Store Operators
The Store operator stores an IO Object in the data repository at
the location specified by the repository entry.
IV.
B. X-Validation
operator is used to store the data present in Process Documents
from Files.
Connect the output port mod of the Validation operator to the
input port inp of the Store (2) operator.
V. RESULTS
Accuracy = (Sensitivity+Specificity)/2
= (0.7059+0.71428)/2
= 0.71
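The accuracy figure used in this section is the mean of two ratios read off the confusion matrix. The sketch below follows the formulas as the paper writes them; the function names `row_ratios` and `averaged_accuracy` are illustrative, and the second call uses the counts from the 50-validation confusion matrix in subsection B. Note that both ratios are taken along the rows of the matrix (the predicted classes), which RapidMiner's performance vector reports as class precision; the recall-based definitions of sensitivity and specificity divide by the column totals instead and would give different values.

```python
def row_ratios(pp_tp, pp_tn, pn_tp, pn_tn):
    """Return the two row-wise ratios the paper calls sensitivity and specificity.

    Each argument is a confusion-matrix cell named row-then-column:
    pp_tn is the count in row Predicted Positive, column True Negative.
    """
    # Share of predicted positives that are truly positive.
    sensitivity = pp_tp / (pp_tp + pp_tn)
    # Share of predicted negatives that are truly negative.
    specificity = pn_tn / (pn_tp + pn_tn)
    return sensitivity, specificity


def averaged_accuracy(sensitivity, specificity):
    """Accuracy as defined in this section: the mean of the two ratios."""
    return (sensitivity + specificity) / 2


# Values quoted in the calculation just above.
print(round(averaged_accuracy(0.7059, 0.71428), 2))

# Counts from the 50-validation confusion matrix in subsection B.
sens, spec = row_ratios(12, 4, 8, 16)
print(round(sens, 3), round(spec, 3), round(averaged_accuracy(sens, spec), 3))
```

Keeping the ratio computation separate from the averaging step makes it easy to swap in column-wise (recall-based) ratios for comparison.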
B. Calculation of Accuracy for 50 validations
Confusion Matrix
                         True Positive (TP)  True Negative (TN)
Predicted Positive (PP)  12                  4
Predicted Negative (PN)  8                   16
Sensitivity = PPTP / (PPTP + PPTN)
            = 12 / (12+4)
            = 0.75
Specificity = PNTN / (PNTP + PNTN)
            = 16 / (8+16)
            = 0.667
Accuracy = (Sensitivity+Specificity)/2
         = (0.75+0.667)/2
         = 0.708
Figure 11. Naïve Bayes Performance Vector Output for 50 Validations
VI. CONCLUSION