SENTIMENT ANALYSIS USING LSTM
AIM:
To write a Python program to implement sentiment analysis using LSTM.
ALGORITHM:
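1. Import the required libraries and load the tweet dataset, keeping only the text and sentiment columns.
2. Tokenize the text, convert each tweet into an integer sequence, and pad the sequences to equal length.
3. Build a Sequential model with Embedding, SpatialDropout1D, LSTM, and a softmax Dense output layer, and compile it with categorical cross-entropy loss and the Adam optimizer.
4. One-hot encode the sentiment labels and split the data into training and test sets.
5. Train the model, then evaluate it on the test set and on a held-out validation slice to obtain class-wise accuracy.
6. Use the trained model to predict the sentiment of a new sentence.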
PROGRAM:
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, SpatialDropout1D, LSTM, Dense

data = pd.read_csv('C:\\Users\\Sentiment.csv')
# Keeping only the necessary columns
data = data[['text', 'sentiment']]

# Tokenize the tweets and pad the integer sequences to a common length
max_features = 2000
tokenizer = Tokenizer(num_words=max_features, split=' ')
tokenizer.fit_on_texts(data['text'].values)
X = tokenizer.texts_to_sequences(data['text'].values)
X = pad_sequences(X)
# LSTM model: Embedding -> SpatialDropout1D -> LSTM -> softmax output
embed_dim = 128
lstm_out = 196

model = Sequential()
model.add(Embedding(max_features, embed_dim, input_length=X.shape[1]))
model.add(SpatialDropout1D(0.4))
model.add(LSTM(lstm_out, dropout=0.2, recurrent_dropout=0.2))
model.add(Dense(2, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
Model: "sequential"
_________________________________________________________________
 Layer (type)                          Output Shape         Param #
=================================================================
 embedding (Embedding)                 (None, 28, 128)       256000
 spatial_dropout1d (SpatialDropout1D)  (None, 28, 128)       0
 lstm (LSTM)                           (None, 196)           254800
 dense (Dense)                         (None, 2)              394
=================================================================
Total params: 511194 (1.95 MB)
Trainable params: 511194 (1.95 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
None
Y = pd.get_dummies(data['sentiment']).values
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.33, random_state=42)
print(X_train.shape, Y_train.shape)
print(X_test.shape, Y_test.shape)

batch_size = 32
model.fit(X_train, Y_train, epochs=7, batch_size=batch_size, verbose=2)
Epoch 1/7
225/225 - 49s - loss: 0.4306 - accuracy: 0.8172 - 49s/epoch - 217ms/step
Epoch 2/7
225/225 - 47s - loss: 0.3136 - accuracy: 0.8688 - 47s/epoch - 210ms/step
Epoch 3/7
225/225 - 51s - loss: 0.2783 - accuracy: 0.8854 - 51s/epoch - 226ms/step
Epoch 4/7
225/225 - 56s - loss: 0.2525 - accuracy: 0.8961 - 56s/epoch - 251ms/step
Epoch 5/7
225/225 - 54s - loss: 0.2262 - accuracy: 0.9065 - 54s/epoch - 241ms/step
Epoch 6/7
225/225 - 54s - loss: 0.2035 - accuracy: 0.9204 - 54s/epoch - 240ms/step
Epoch 7/7
225/225 - 55s - loss: 0.1842 - accuracy: 0.9265 - 55s/epoch - 242ms/step
<keras.src.callbacks.History at 0x159323d58d0>
validation_size = 1500
X_validate = X_test[-validation_size:]
Y_validate = Y_test[-validation_size:]
X_test = X_test[:-validation_size]
Y_test = Y_test[:-validation_size]

score, acc = model.evaluate(X_test, Y_test, verbose=2, batch_size=batch_size)
print("score: %.2f" % (score))
print("acc: %.2f" % (acc))
# Measure class-wise accuracy on the validation slice
pos_cnt, neg_cnt, pos_correct, neg_correct = 0, 0, 0, 0
for x in range(len(X_validate)):
    result = model.predict(X_validate[x].reshape(1, X_test.shape[1]), batch_size=1, verbose=2)[0]
    if np.argmax(result) == np.argmax(Y_validate[x]):
        if np.argmax(Y_validate[x]) == 0:
            neg_correct += 1
        else:
            pos_correct += 1
    if np.argmax(Y_validate[x]) == 0:
        neg_cnt += 1
    else:
        pos_cnt += 1
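The padded token sequence, prediction log, and label printed below belong to a final step that classifies a single new sentence, but the code for that step is missing from this record. The lines below are a minimal sketch of such a step, assuming a hypothetical sample sentence (sample_text is not from the original) and reusing the tokenizer, padding length, and model defined above.

# Sketch of the missing prediction step (sample sentence is hypothetical)
sample_text = ['A sample tweet to classify']
seq = tokenizer.texts_to_sequences(sample_text)
# Pad to the same length the model was trained on (X.shape[1])
seq = pad_sequences(seq, maxlen=X.shape[1], dtype='int32', value=0)
print(seq)
sentiment = model.predict(seq, batch_size=1, verbose=2)[0]
if np.argmax(sentiment) == 0:
    print("negative")
else:
    print("positive")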
[[   0    0    0    0    0    0    0    0    0    0    0    0    0    0
     0    0    0  206  633    6  150    5   55 1055   55   46    6  150]]
1/1 - 0s - 289ms/epoch - 289ms/step
negative