Mini Project - Merged
Submitted by
of
BACHELOR OF TECHNOLOGY
in
MAY 2024
TRADING BOT
A MINI PROJECT-I REPORT
ANNA UNIVERSITY: CHENNAI 600 025
BONAFIDE CERTIFICATE
Certified that this project report “TRADING BOT” is the bonafide work of “MURAJIT
VARUN (22110139), GOWTHAMS (2211021)” who carried out the project work
under my supervision.
SIGNATURE
Dr. V. Karpagam
HEAD OF THE DEPARTMENT
Professor,
Department of Artificial Intelligence and Data Science,
Sri Ramakrishna Engineering College,
Coimbatore-641022.

SIGNATURE
Mrs. C. Kavitha
SUPERVISOR
Assistant Professor,
Department of Artificial Intelligence and Data Science,
Sri Ramakrishna Engineering College,
Coimbatore-641022.
ACKNOWLEDGEMENT
With immense pleasure, we express our hearty thanks to the Head of the
Department, Dr. V. Karpagam, Department of Artificial Intelligence and Data
Science for her encouragement towards the completion of this project.
We convey our thanks to all the teaching and non-teaching staff members
of our department who rendered their co-operation by all means for
completion of this project.
ABSTRACT
This project presents a trading strategy for financial markets based on sentiment
analysis, using natural language processing (NLP) techniques and machine learning models.
The strategy uses sentiment analysis of news headlines to decide whether to buy or sell
a given financial instrument. The implementation uses the Lumibot framework for broker
integration and backtesting, along with the Alpaca API for accessing financial data and
executing trades.
The sentiment analysis component uses the FinBERT model, a pre-trained
transformer-based model tuned for financial sentiment analysis. News headlines are
processed using the FinBERT model to predict sentiment (positive, negative or neutral) and
associated probabilities. The trading strategy dynamically adjusts position size and trade
execution based on the results of sentiment analysis in order to take advantage of sentiment-
driven market movements.
The script includes functions to estimate sentiment for a given set of news headlines,
check GPU availability for accelerated computation using CUDA, and backtest a trading
strategy based on sentiment analysis using historical financial data. The backtest evaluates
the performance of the trading strategy over a chosen period and provides insight into its
profitability and effectiveness. Overall, this project demonstrates the potential of sentiment
analysis techniques in developing algorithmic trading strategies for financial
markets that offer opportunities for automated decision making and improved risk
management.
TABLE OF CONTENTS
ABSTRACT
LIST OF ABBREVIATIONS
1 INTRODUCTION
PROBLEM STATEMENT
OBJECTIVE
2 LITERATURE SURVEY
3.1 EXISTING SYSTEMS
4 WORKING PRINCIPLE
5.3 MODULES
APPENDICES
APPENDIX 1
APPENDIX 2
SCREEN SHOTS
REFERENCES
LIST OF ABBREVIATIONS
ABBREVIATION    EXPANSION
API             Application Programming Interface
REST            Representational State Transfer
API_KEY         Application Programming Interface Key
API_SECRET      Application Programming Interface Secret
SQL             Structured Query Language
EM              Expectation-Maximization
PT              PyTorch
CHAPTER 1
INTRODUCTION
In today's fast-moving financial markets, the use of cutting-edge technology has become
indispensable for traders looking for an edge. Enter the world of AI trading bots – sophisticated
algorithms designed to analyze market data, identify patterns and execute trades with accuracy
and speed. Powered by artificial intelligence and machine learning, these bots have
revolutionized the way investors navigate the complexities of trading.
At its core, an AI trading bot combines advanced mathematical models with computing
power and can process massive amounts of data in real time. Using historical
market data, news sentiment analysis and technical indicators, these bots can identify trading
opportunities that human traders might miss. Through iterative learning processes, they
adapt and optimize their strategies with the aim of outperforming traditional methods and
generating consistent returns.
The benefits of AI trading bots are numerous. They operate 24 hours a day, 7 days a
week, unaffected by human emotion or fatigue, and ensure continuous vigilance over market
movements. Their lightning-fast execution capabilities allow them to take advantage of
fleeting opportunities and respond immediately to market changes. Additionally, their ability
to analyze multiple variables simultaneously gives them a holistic view of market dynamics,
allowing them to make informed decisions with greater accuracy.
However, it is essential to realize that AI trading bots are not infallible. Market
conditions can be unpredictable and even the most sophisticated algorithms can run into
unexpected problems. In addition, regulatory scrutiny and ethical considerations regarding
algorithmic trading underscore the importance of responsible use and oversight.
PROBLEM STATEMENT
Give traders a tool that improves decision making by leveraging sentiment analysis
insights.
CHAPTER 2
LITERATURE SURVEY
2.1 Jerry Joy, Aparna Kannan, Shreya Ram, “Speech emotion recognition using
neural network and MLP classifier”, ISSN 2321 3361 © 2020 IJESC
The authors propose a deep learning model whose hyperparameters are optimized to
obtain more accurate recognition results. The model consists of a convolutional neural
network (CNN) layer and four local feature-learning blocks that learn short- and long-term
correlations in the log Mel-spectrogram of the input speech samples. To show the
effectiveness of the method, the experiments cover four speech emotion datasets: IEMOCAP,
Emo-DB, RAVDESS, and SAVEE. The reported recognition accuracies on these four datasets
are 98.13%, 99.76%, 99.47%, and 99.50%, respectively.
2.2 Felicia Andayani, Lau Bee Theng, Mark Teekit Tsun and Caslon Chua, “Hybrid
LSTM-Transformer Model for Emotion Recognition from Speech Audio Files”, IEEE
Access, DOI 10.1109/ACCESS.2022.3163856
This research introduced a hybrid Long Short-Term Memory (LSTM) network and
Transformer encoder to analyze the long-term relationships in voice signals and discern
emotions. Mel Frequency Cepstral Coefficients (MFCC) are used to extract speech
features, which are then fed into the proposed hybrid LSTM-Transformer classifier. Several
performance evaluations of the proposed LSTM-Transformer model were conducted. The
results indicate that its recognition performance exceeds that of the models reported in
earlier published studies. The hybrid model achieved recognition success rates of 75.62%,
85.55%, and 72.49% on the RAVDESS, Emo-DB, and language-independent datasets,
respectively.
2.3 Taiba Majid Wani, Teddy Surya Gunawan, Syed Asif Ahma, “A Comprehensive Review
of Speech Emotion and Recognition Systems”, IEEE Xplore 9383000
In the literature on speech emotion recognition (SER), numerous techniques, including
many well-known speech analysis and classification techniques, have been employed to
extract emotions from signals. Deep learning methodologies have recently been
presented as a replacement for traditional SER procedures. Some of the acoustic features
used are MFCCs, PLPs, and FBANKs. Combining the three acoustic features
resulted in better performance, with an identification rate of 92.3% compared with 92.1%
when using only one acoustic feature.
CHAPTER 3
3.1 EXISTING SYSTEMS
TradeStation is a brokerage platform that offers a suite of trading tools and analytics
for active traders. It provides a language called EasyLanguage for developing custom trading
strategies and indicators. TradeStation also offers a marketplace where users can purchase
or sell trading strategies developed by other users.
3.2 PROPOSED SYSTEM
The proposed system aims to integrate sentiment analysis techniques into algorithmic trading
strategies to improve decision making and trading performance.
Using sentiment analysis from textual data sources such as financial news headlines and
social media posts, the system aims to identify sentiment-driven market trends and capitalize
on trading opportunities.
The system will include an automated sentiment analysis pipeline to preprocess text data,
extract sentiment features, and classify text into sentiment categories.
3.3 SENTIMENT ANALYSIS ON FINANCIAL
3.4 ALGORITHMIC TRADING STRATEGIES
3.5 NLP AND MACHINE LEARNING MODELS
Role in Sentiment Analysis:
Natural language processing (NLP) and machine learning models play a key role in sentiment
analysis by enabling automated sentiment extraction and analysis from textual data.
NLP techniques are used to preprocess text data, tokenize words, remove stop words, and
extract features for sentiment analysis.
Machine learning models are trained on labeled data to classify text as positive, negative, or
neutral based on sentiment indicators, enabling automated sentiment analysis.
Commonly used model families include:
Supervised learning models: such as Support Vector Machines (SVM), Naive Bayes, and
Logistic Regression, which learn to classify text based on labeled training data.
Deep learning models: such as recurrent neural networks (RNNs), convolutional neural
networks (CNNs), and transformer models, which learn to capture complex patterns and
relationships in text data for sentiment analysis.
Ensemble methods: such as Random Forests and Gradient Boosting Machines (GBM),
which combine multiple underlying models to improve prediction accuracy.
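As an illustration of the supervised approach, the following is a minimal sketch of a multinomial Naive Bayes text classifier with add-one smoothing, written with the standard library only. The class name NaiveBayes and the four training headlines are hypothetical and serve only to show the mechanics; this is not the model used by the project, which relies on FinBERT.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # word frequencies per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # log prior + sum of smoothed log likelihoods
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for word in text.lower().split():
                score += math.log((self.word_counts[label][word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labeled headlines, for illustration only.
train_texts = [
    "profits surge on strong demand",
    "shares rally after upbeat forecast",
    "stock plunges on weak earnings",
    "losses widen as sales fall",
]
train_labels = ["positive", "positive", "negative", "negative"]
clf = NaiveBayes().fit(train_texts, train_labels)
```

With such a small training set the classifier only reacts to words it has seen; a realistic model would need far more labeled data.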
Feature engineering:
Feature engineering plays a key role in training machine learning models for sentiment
analysis.
Features such as word embeddings, n-grams, parts of speech, and syntactic or semantic
features are extracted from text data to capture sentiment-related information.
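Of the features listed above, n-grams are the simplest to compute. The sketch below counts word unigrams and bigrams in a headline; the function name ngram_features and the sample headline are assumptions made for illustration, and whitespace splitting stands in for a real tokenizer.

```python
from collections import Counter


def ngram_features(text: str, n_max: int = 2) -> Counter:
    """Count word n-grams of length 1..n_max in lowercased, whitespace-split text."""
    tokens = text.lower().split()
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Hypothetical headline used only to show the output shape.
features = ngram_features("Profits rise as profits beat forecasts")
```

The resulting counter maps each n-gram to its frequency and can be fed to a classifier as a sparse feature vector.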
CHAPTER 4
WORKING PRINCIPLE
1. Text cleaning:
• Text cleaning involves removing unnecessary characters, symbols and formatting from text
data.
• Common cleaning techniques include removing special characters, punctuation, HTML
tags and URLs from text.
2. Tokenization:
• Tokenization is the process of dividing text into individual words or tokens.
• Text is tokenized based on spaces or punctuation, creating a list of words or tokens for further
analysis.
3. Normalization:
• Normalization techniques are used to standardize textual data and reduce variation.
• This may include converting text to lowercase, removing accents or diacritics, and
expanding contractions (e.g. converting "can't" to "cannot").
4. Stopword removal:
• Stopwords are common words that do not carry significant meaning when analyzing the text.
• Stopword removal involves filtering out the stopwords from the text to focus on the
meaningful content.
• Common stopwords include articles, conjunctions, and prepositions.
5. Lemmatization and stemming:
• Lemmatization and stemming are techniques used to reduce words to their root form.
• Lemmatization maps words to their base or dictionary form (lemma), while stemming removes
prefixes and suffixes to form the root or stem of a word.
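The five steps above can be sketched as a small standard-library pipeline. Everything here is an illustrative assumption: the stopword and contraction lists are deliberately tiny, the suffix-stripping stemmer is far cruder than a real stemmer or lemmatizer, and normalization is applied before tokenization so that contractions are expanded while the apostrophe is still present.

```python
import re
import unicodedata

# Tiny illustrative lists; a real system would use much larger resources.
STOPWORDS = {"a", "an", "the", "and", "or", "but", "of", "in", "on", "to", "for"}
CONTRACTIONS = {"can't": "cannot", "won't": "will not", "it's": "it is"}


def clean(text):
    """Step 1: strip HTML tags and URLs."""
    text = re.sub(r"<[^>]+>", " ", text)
    return re.sub(r"https?://\S+", " ", text)


def normalize(text):
    """Step 3: lowercase, expand contractions, remove diacritics."""
    text = text.lower()
    for short, full in CONTRACTIONS.items():
        text = text.replace(short, full)
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def tokenize(text):
    """Step 2: split into alphanumeric tokens, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text)


def stem(token):
    """Step 5: naive suffix stripping (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[:-len(suffix)]
    return token


def preprocess(text):
    tokens = tokenize(normalize(clean(text)))
    # Step 4: drop stopwords, then stem what remains.
    return [stem(t) for t in tokens if t not in STOPWORDS]


tokens = preprocess("<b>Stocks</b> can't stop climbing after the caf\u00e9 deal! https://example.com/news")
```

Note that transformer models such as FinBERT ship with their own tokenizer, so this kind of manual preprocessing mainly applies to classical feature-based models.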
4.3 SENTIMENT ANALYSIS
Definition:
Sentiment analysis, also known as opinion mining, is the process of computationally
identifying and categorizing opinions, attitudes, and emotions expressed in textual data.
Objective:
The primary goal of sentiment analysis is to determine the sentiment polarity of a text and
classify it as positive, negative, or neutral based on the sentiment expressed.
Application:
Sentiment analysis has various applications in different domains, including:
Marketing: Analyzing customer feedback, reviews and social media posts to understand
customer sentiment towards products and brands.
Finance: Analyzing financial news headlines, earnings call transcripts and social media
discussions to gauge market sentiment and make investment decisions.
Customer Service: Monitoring customer feedback and sentiment on social media platforms
to identify issues, trends and opportunities for improvement.
Politics: Analyzing public opinion and sentiment toward political candidates, parties, and
policies during elections or campaigns.
Techniques:
Sentiment analysis techniques can generally be divided into:
Rule-based approaches: Using predefined rules and lexicons to identify sentiment
indicators and classify text based on predefined criteria.
Machine learning approaches: Training machine learning models on labeled data to
automatically learn and classify text based on sentiment features.
Hybrid approaches: Combining rule-based and machine learning techniques to leverage
the strengths of both approaches for more accurate sentiment analysis.
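A rule-based approach can be illustrated with a small lexicon and a single negation rule. The word lists below are hypothetical and far smaller than any published sentiment lexicon; the sketch only shows how predefined rules turn word matches into a polarity label.

```python
# Hypothetical mini-lexicons for illustration only.
POSITIVE = {"gain", "gains", "surge", "beat", "beats", "growth", "upgrade", "profit"}
NEGATIVE = {"loss", "losses", "drop", "miss", "misses", "downgrade", "lawsuit", "decline"}
NEGATIONS = {"no", "not", "never"}


def rule_based_sentiment(text):
    """Sum word polarities; a negation word flips the polarity of the next word."""
    score = 0
    negate = False
    for raw in text.lower().split():
        word = raw.strip(".,!?;:\"'")
        if word in NEGATIONS:
            negate = True
            continue
        polarity = (word in POSITIVE) - (word in NEGATIVE)  # +1, -1 or 0
        score += -polarity if negate else polarity
        negate = False
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"
```

For example, rule_based_sentiment("Shares drop after downgrade") yields "negative", while a headline with no lexicon words is labeled "neutral".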
CHAPTER 5
5.1 ALGORITHMIC TRADING STRATEGIES
Definition and overview:
Algorithmic trading strategies are automated trading strategies that use computer algorithms
to execute trades in the financial markets.
The goal of these strategies is to exploit market inefficiencies, price discrepancies and
trading opportunities using quantitative analysis and mathematical models.
Types of Algorithmic Trading Strategies:
There are different types of algorithmic trading strategies, including:
Momentum strategy: Trading based on the momentum of price movements.
Mean reversion strategy: Trading based on the tendency of prices to revert to their mean
or average value.
Arbitrage strategy: Exploiting price differences between related assets or markets.
Market making strategy: Providing liquidity by constantly quoting bid and ask prices.
Sentiment-based strategy: Trading based on market sentiment derived from sentiment
analysis of news headlines and social media data.
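To make the first two strategy types concrete, the sketch below derives a signal from a list of closing prices. The function names, the lookback and window lengths, and the z-score threshold are illustrative assumptions rather than tuned values.

```python
from statistics import mean, pstdev


def momentum_signal(prices, lookback=3):
    """Buy if the price rose over the lookback window, sell if it fell."""
    if len(prices) <= lookback:
        return "hold"
    change = prices[-1] - prices[-1 - lookback]
    if change > 0:
        return "buy"
    if change < 0:
        return "sell"
    return "hold"


def mean_reversion_signal(prices, window=5, z_entry=1.0):
    """Sell when the last price is far above the recent mean, buy when far below."""
    recent = prices[-window:]
    if len(recent) < window:
        return "hold"
    mu, sigma = mean(recent), pstdev(recent)
    if sigma == 0:
        return "hold"
    z = (prices[-1] - mu) / sigma  # distance from the mean in standard deviations
    if z > z_entry:
        return "sell"
    if z < -z_entry:
        return "buy"
    return "hold"


prices = [100, 101, 102, 103, 110]
```

On this sample series the two rules disagree: momentum reads the jump to 110 as a reason to buy, whereas mean reversion reads it as an overextension and signals a sale.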
Advantages of algorithmic trading strategies:
Algorithmic trading strategies offer several advantages over manual trading, including:
Speed: Algorithms can execute trades much faster than human traders, allowing for rapid
response to market events and opportunities.
Efficiency: Algorithms can analyze large data sets and execute trades with accuracy and
consistency, reducing the impact of human emotions and biases.
Scalability: Algorithmic trading strategies can be scaled to trade multiple markets and assets
simultaneously, increasing trading volume and liquidity.
5.2 BACKTESTING FRAMEWORK
Definition:
A backtesting framework is a systematic approach used to evaluate the performance of
trading strategies using historical market data.
It includes the simulation of trade execution based on predefined rules and parameters to
assess the effectiveness and profitability of the strategy.
Purpose:
The primary purpose of the backtesting framework is to validate trading strategies and
evaluate their performance under historical market conditions.
It helps traders identify strengths and weaknesses and areas for improvement in their
strategies before deploying them in a live trading environment.
Backtesting Framework components:
A backtesting framework typically consists of the following components:
Historical Data: Market data over a period of time used for backtesting.
Strategy Implementation: The code or algorithms that define the rules, conditions, and
parameters of a trading strategy.
Simulation Engine: Software or a platform that simulates trade execution based on the
trading strategy and historical data.
Performance Metrics: Metrics used to evaluate strategy performance, such as profitability,
risk-adjusted returns, drawdowns, and Sharpe ratio.
Visualization and Reporting: Tools to visualize backtesting results and generate
performance reports to analyze and interpret strategy performance.
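Two of the performance metrics named above can be computed directly from a return series and an equity curve. This is a minimal sketch assuming daily returns and 252 trading periods per year; the function names are hypothetical, and backtesting frameworks typically report these figures themselves.

```python
from math import sqrt
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free_rate=0.0, periods_per_year=252):
    """Annualized Sharpe ratio from per-period returns (needs at least two returns)."""
    excess = [r - risk_free_rate / periods_per_year for r in returns]
    sd = stdev(excess)
    if sd == 0:
        return 0.0
    return mean(excess) / sd * sqrt(periods_per_year)


def max_drawdown(equity):
    """Largest peak-to-trough decline of an equity curve, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


drawdown = max_drawdown([100, 120, 90, 110, 130])  # fall from 120 to 90
```

Here the curve falls from a peak of 120 to 90 before recovering, so the maximum drawdown is 30/120, or 25%.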
Backtesting Process:
The backtesting process includes the following steps:
Data Preparation: Cleaning, pre-processing and formatting historical market data for
analysis.
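Once the data is prepared, the core of a backtest is a loop that replays signals over historical prices and tracks portfolio value. The sketch below is a deliberately simplified, hypothetical engine: it goes all-in on "buy", sells everything on "sell", trades at the same bar's price, and ignores fees and slippage, all of which a real framework such as Lumibot handles in more detail.

```python
def backtest(prices, signals, starting_cash=10_000.0):
    """Replay buy/sell/hold signals over prices and return the equity curve."""
    cash, shares = starting_cash, 0
    equity_curve = []
    for price, signal in zip(prices, signals):
        if signal == "buy" and shares == 0:
            shares = int(cash // price)   # whole shares only
            cash -= shares * price
        elif signal == "sell" and shares > 0:
            cash += shares * price
            shares = 0
        equity_curve.append(cash + shares * price)  # mark to market
    return equity_curve


curve = backtest([100, 110, 105, 120], ["buy", "hold", "sell", "buy"], starting_cash=1000.0)
```

The returned equity curve is the input for performance metrics such as total return or drawdown.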
5.3 MODULES
Sentiment Analysis Module:
The sentiment analysis module is responsible for analyzing textual data from sources such
as financial news headlines and social media posts for sentiment indicators.
Natural language processing (NLP) techniques and machine learning models are used to
classify text into positive, negative, or neutral sentiment categories.
The module preprocesses text data, tokenizes words, removes stopwords, and applies
sentiment analysis algorithms to generate sentiment scores or labels.
Trading Strategy Module:
The trading strategy module designs and implements algorithmic trading strategies that
incorporate insights from sentiment analysis.
It defines the rules, conditions and parameters for making trading decisions based on the
outputs of sentiment analysis, market data and risk management principles.
Machine learning models can be used to develop predictive trading signals and optimize
strategy performance.
Backtesting module:
The Backtesting module evaluates the performance of trading strategies using historical
market data.
It simulates the execution of trades based on predefined strategy rules and parameters to
assess profitability, risk and robustness.
Performance metrics such as Sharpe ratio, maximum drawdown and win-loss ratio are calculated
to analyze the strategy's performance under different market conditions.
Data collection module:
The data collection module acquires and pre-processes the market data and text data needed
for sentiment analysis and backtesting.
It integrates with financial APIs, market exchanges and data vendors to access real-time and
historical market data streams.
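The hand-off between the sentiment analysis module and the trading strategy module can be sketched as a pure decision function. The function name decide_trade and the action labels are hypothetical; the 0.999 confidence threshold and the rule of closing an opposite position before entering a new one follow the strategy code in Appendix 2.

```python
def decide_trade(sentiment, probability, last_trade, threshold=0.999):
    """Map a sentiment label and its probability to a list of actions.

    Returns (actions, new_last_trade). No action is taken unless the
    sentiment is positive or negative with probability above the threshold.
    """
    actions = []
    if probability <= threshold or sentiment not in ("positive", "negative"):
        return actions, last_trade
    side = "buy" if sentiment == "positive" else "sell"
    opposite = "sell" if side == "buy" else "buy"
    if last_trade == opposite:
        actions.append("close_all")  # exit the opposite position first
    actions.append(side)
    return actions, side
```

Keeping the decision logic free of broker calls makes it easy to unit test separately from order submission.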
CHAPTER 6
6.1 Conclusion:
The proposed system demonstrates the feasibility and effectiveness of integrating sentiment
analysis techniques into algorithmic trading strategies.
In backtesting and performance evaluation, the system showed promising results in
generating profits and managing risk.
Sentiment analysis provides insight into market sentiment and helps traders
make informed decisions based on sentiment-driven trends.
6.2 Future improvements:
Several avenues for future enhancements and improvements have been identified:
Integrating Alternative Data Sources: Incorporating additional data sources such as
social media sentiment and macroeconomic indicators to increase predictive power and
robustness.
Advanced Machine Learning Techniques: Exploring advanced machine learning models
such as deep learning architectures and reinforcement learning algorithms for more accurate
sentiment analysis and predictive modeling.
Real-Time Sentiment Analysis: Developing real-time sentiment analysis capabilities to
capture dynamic changes in market sentiment and adjust trading strategies accordingly.
Dynamic Strategy Optimization: Implementing dynamic strategy optimization
techniques that adjust strategy parameters in response to evolving market conditions and
sentiment signals.
Compliance and Risk Management: Strengthening compliance measures and risk
management protocols to meet regulatory requirements and mitigate
potential legal and financial risks.
APPENDIX 1
User interface
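import tkinter as tk
from tkinter import ttk


class TradingBotUI:
    def __init__(self, master):
        self.master = master
        master.title("Trading Bot")

        # NOTE: the report omits the lines that build the frame, labels,
        # API key entry and submit button. The widgets below are a minimal
        # reconstruction inferred from the names used elsewhere in the class.
        self.main_frame = ttk.Frame(master, padding=10)
        self.main_frame.grid(row=0, column=0, sticky="nsew")

        ttk.Label(self.main_frame, text="API Key:").grid(row=1, column=0, sticky="e")
        self.apikey_entry = ttk.Entry(self.main_frame, font=("Arial", 12))
        self.apikey_entry.grid(row=1, column=1, padx=10, pady=5, sticky="w")

        ttk.Label(self.main_frame, text="API Secret:").grid(row=2, column=0, sticky="e")
        self.apisecret_entry = ttk.Entry(self.main_frame, font=("Arial", 12))
        self.apisecret_entry.grid(row=2, column=1, padx=10, pady=5, sticky="w")

        ttk.Button(self.main_frame, text="Submit", command=self.submit).grid(
            row=3, column=1, pady=10, sticky="w")

    def submit(self):
        apikey = self.apikey_entry.get()
        apisecret = self.apisecret_entry.get()
        # Here you would call your trading bot function with the provided parameters
        # Example:
        # trading_bot(apikey, apisecret)


def main():
    root = tk.Tk()
    root.geometry("300x200")
    trading_bot_ui = TradingBotUI(root)
    root.mainloop()


if __name__ == "__main__":
    main()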
APPENDIX 2
Trading bot:
import os
from datetime import datetime, timedelta

import torch
from alpaca_trade_api import REST
from lumibot.backtesting import YahooDataBacktesting
from lumibot.brokers import Alpaca
from lumibot.strategies.strategy import Strategy
from lumibot.traders import Trader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Use the GPU when CUDA is available, otherwise fall back to the CPU.
device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Credentials are read from the environment instead of being hard-coded.
# The variable names ALPACA_API_KEY and ALPACA_API_SECRET are placeholders.
API_KEY = os.environ["ALPACA_API_KEY"]
API_SECRET = os.environ["ALPACA_API_SECRET"]
BASE_URL = "https://paper-api.alpaca.markets"
ALPACA_CREDS = {
    "API_KEY": API_KEY,
    "API_SECRET": API_SECRET,
    "PAPER": True,
}

tokenizer = AutoTokenizer.from_pretrained("ProsusAI/finbert")
model = AutoModelForSequenceClassification.from_pretrained("ProsusAI/finbert").to(device)
labels = ["positive", "negative", "neutral"]


def estimate_sentiment(news):
    """Return (probability, sentiment) for a list of news headlines."""
    if news:
        tokens = tokenizer(news, return_tensors="pt", padding=True).to(device)
        # NOTE: the rest of this function is missing from the report. The lines
        # below are a reconstruction: sum the logits over all headlines, apply
        # softmax, and return the most likely label with its probability.
        with torch.no_grad():
            logits = model(tokens["input_ids"],
                           attention_mask=tokens["attention_mask"]).logits
        result = torch.nn.functional.softmax(torch.sum(logits, 0), dim=-1)
        probability = result[torch.argmax(result)]
        sentiment = labels[torch.argmax(result)]
        return probability, sentiment
    # Assumed fallback when there are no headlines: neutral with zero confidence.
    return 0, labels[-1]


class MLTrader(Strategy):
    def initialize(self, symbol: str = "SPY", cash_at_risk: float = .5):
        self.symbol = symbol
        self.sleeptime = "24H"
        self.last_trade = None
        self.cash_at_risk = cash_at_risk
        self.api = REST(base_url=BASE_URL, key_id=API_KEY, secret_key=API_SECRET)

    def position_sizing(self):
        # Number of shares affordable with the fraction of cash put at risk.
        cash = self.get_cash()
        last_price = self.get_last_price(self.symbol)
        quantity = round(cash * self.cash_at_risk / last_price, 0)
        return cash, last_price, quantity

    def get_dates(self):
        today = self.get_datetime()
        three_days_prior = today - timedelta(days=3)
        return today.strftime('%Y-%m-%d'), three_days_prior.strftime('%Y-%m-%d')

    def get_sentiment(self):
        # Score the headlines published for the symbol over the last three days.
        today, three_days_prior = self.get_dates()
        news = self.api.get_news(symbol=self.symbol,
                                 start=three_days_prior,
                                 end=today)
        news = [ev.__dict__["_raw"]["headline"] for ev in news]
        probability, sentiment = estimate_sentiment(news)
        return probability, sentiment

    def on_trading_iteration(self):
        cash, last_price, quantity = self.position_sizing()
        probability, sentiment = self.get_sentiment()

        # NOTE: the opening of the buy branch is missing from the report; the
        # two conditions below are reconstructed to mirror the sell branch.
        if sentiment == "positive" and probability > .999:
            if self.last_trade == "sell":
                self.sell_all()
            order = self.create_order(
                self.symbol,
                quantity,
                "buy",
                type="bracket",
                take_profit_price=last_price * 1.20,
                stop_loss_price=last_price * .95
            )
            self.submit_order(order)
            self.last_trade = "buy"
        elif sentiment == "negative" and probability > .999:
            if self.last_trade == "buy":
                self.sell_all()
            order = self.create_order(
                self.symbol,
                quantity,
                "sell",
                type="bracket",
                take_profit_price=last_price * .8,
                stop_loss_price=last_price * 1.05
            )
            self.submit_order(order)
            self.last_trade = "sell"


start_date = datetime(2020, 1, 1)
end_date = datetime(2023, 12, 31)

broker = Alpaca(ALPACA_CREDS)
strategy = MLTrader(name='mlstrat', broker=broker,
                    parameters={"symbol": "SPY",
                                "cash_at_risk": .5})
strategy.backtest(
    YahooDataBacktesting,
    start_date,
    end_date,
    parameters={"symbol": "SPY", "cash_at_risk": .5}
)
# trader = Trader()
# trader.add_strategy(strategy)
# trader.run_all()
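The position-sizing and bracket-order arithmetic used by the strategy can be checked without a broker connection. The sketch below restates it as two standalone functions; the names position_size and bracket_prices are hypothetical, while the multipliers (1.20 and 0.95 for a buy, 0.80 and 1.05 for a sell) and the default cash_at_risk of 0.5 are taken from the listing above.

```python
def position_size(cash, last_price, cash_at_risk=0.5):
    """Number of shares bought with the fraction of cash put at risk."""
    return round(cash * cash_at_risk / last_price, 0)


def bracket_prices(last_price, side):
    """Return (take_profit_price, stop_loss_price) for a bracket order."""
    if side == "buy":
        # Long: take profit 20% above entry, stop loss 5% below.
        return last_price * 1.20, last_price * 0.95
    if side == "sell":
        # Short: take profit 20% below entry, stop loss 5% above.
        return last_price * 0.80, last_price * 1.05
    raise ValueError(f"unknown side: {side!r}")


quantity = position_size(100_000, 400.0)             # half of the cash at 400 per share
take_profit, stop_loss = bracket_prices(400.0, "buy")
```

With 100,000 in cash and a last price of 400, half the cash buys 125 shares, and the long bracket exits near 480 (profit) or 380 (loss).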
SCREEN SHOTS