ML Report 9 PDF

The document describes a project to build a password strength checker using machine learning in Python. It discusses a dataset of 700,000 passwords labeled as weak, medium, or strong based on three commercial password checking algorithms. The project aims to develop a machine learning model that can more accurately evaluate password strength by analyzing various factors. Python is chosen as the programming language due to its versatility and machine learning libraries. The methodology involves analyzing the dataset using data visualization tools in Python before developing machine learning algorithms to classify new passwords by strength.

Uploaded by

Ronak Shaik


PASSWORD STRENGTH CHECKER

USING MACHINE LEARNING

A PROJECT REPORT
Submitted by
SASWAT KUMAR PANDEY (220301120423)
SOUMYA RANJAN BEHERA (220301130015)
N DIBYANSU DIBYARANJAN (220301120404)
PRIYANSU BARIK (220301120419)
in partial fulfilment for the award of the degree
of
BACHELOR OF TECHNOLOGY
In
COMPUTER SCIENCE ENGINEERING

CENTURION UNIVERSITY OF TECHNOLOGY AND MANAGEMENT
BHUBANESWAR, ODISHA
APRIL 2023-MAY 2023
DEPARTMENT OF COMPUTER SCIENCE
ENGINEERING
CENTURION UNIVERSITY OF TECHNOLOGY & MANAGEMENT
BHUBANESWAR 752050

BONAFIDE CERTIFICATE
Certified that this project report PASSWORD STRENGTH CHECKER USING MACHINE
LEARNING IN PYTHON is the bona fide work of SASWAT KUMAR PANDEY (220301120423),
SOUMYA RANJAN BEHERA (220301130015), N DIBYANSU DIBYARANJAN (220301120404),
PRIYANSU BARIK (220301120419), who carried out the project work under my supervision. This is
to further certify, to the best of my knowledge, that this project has not been carried out earlier in this
institute and the university.

SIGNATURE
(Prof SWARNA PRABHA JENA)
Professor of ECE Engg.
Certified that the above-mentioned project has been duly carried out as per the norms of the college
and statutes of the university.

SIGNATURE
(DR. RAJKUMAR MAHANTA)
Professor of CSE Engg.
ACKNOWLEDGEMENTS

We wish to express our profound and sincere gratitude to Prof. SWARNA PRABHA JENA,
Department of ELECTRONICS AND COMMUNICATION Engineering, CUTM, BHUBANESWAR,
who guided us through the intricacies of this project with matchless magnanimity.

We thank DR. RAJKUMAR MAHANTA, Head of the Dept. of ELECTRONICS AND
COMMUNICATION Engineering, CUTM, BHUBANESWAR, and DR. SUJATA CHAKRAVARTY,
DEAN, SOET CUTM, for extending their support during the course of this investigation.

We would be failing in our duty if we did not acknowledge the co-operation rendered during various
stages of image interpretation by Prof. SWARNA PRABHA JENA.

We are highly grateful to Prof. SWARNA PRABHA JENA, who showed keen interest and gave invaluable
support in the progress and successful completion of our project work.

SASWAT KUMAR PANDEY (220301120423)

SOUMYA RANJAN BEHERA (220301130015)
N DIBYANSU DIBYARANJAN (220301120404)
PRIYANSU BARIK (220301120419)
Table of Contents

Chapter 1. INTRODUCTION

1.i- Background

Chapter 2. LITERATURE REVIEW

Chapter 3. METHODOLOGY

3.i- Dataset

3.ii- Data analysis

3.iii- Libraries used

3.iv- Algorithms used

Chapter 4. RESULT

Chapter 5. CONCLUSION

REFERENCE
CHAPTER 1. INTRODUCTION
1.i- Background
A password strength checker is a tool that determines the security level of a password.
Passwords are used to protect sensitive information, and a strong password is essential for
keeping that information secure. Password strength is typically evaluated based on various
factors such as length, complexity, and uniqueness. A password strength checker can help users
to create strong passwords by providing feedback on the strength of their current passwords or
suggesting more secure alternatives. Password strength checkers use various techniques such
as rule-based systems, pattern matching, and machine learning algorithms to determine the
strength of a password. Machine learning-based password strength checkers use a dataset of
passwords and their corresponding strengths to train a model that can accurately predict the
strength of new passwords. These models can take into account a wide range of factors that
contribute to password strength, including character types, patterns, and length. Overall, a
password strength checker is an important tool for ensuring the security of sensitive
information. By providing users with feedback on the strength of their passwords, it can
encourage the use of stronger passwords that are less susceptible to hacking attempts.
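To make the factors mentioned above concrete (length, character types, uniqueness), the following is a minimal rule-based sketch. The function names, the chosen features, and the scoring rule are illustrative assumptions and are not taken from the project; they only show what a simple rule-based checker might compute before any machine learning is applied.

```python
import string


def password_features(password: str) -> dict:
    """Return simple, hand-picked features of a password (illustrative only)."""
    return {
        "length": len(password),
        "has_lower": any(c.islower() for c in password),
        "has_upper": any(c.isupper() for c in password),
        "has_digit": any(c.isdigit() for c in password),
        "has_symbol": any(c in string.punctuation for c in password),
        "unique_chars": len(set(password)),
    }


def rule_based_score(password: str) -> int:
    """Toy rule (an assumption): one point per character class used,
    plus one point if the password has at least 12 characters."""
    f = password_features(password)
    classes = sum([f["has_lower"], f["has_upper"], f["has_digit"], f["has_symbol"]])
    return classes + (1 if f["length"] >= 12 else 0)


print(rule_based_score("password"))         # only lowercase letters, short
print(rule_based_score("C0rrect-Horse-9"))  # all four classes, 15 characters
```

A machine learning checker replaces the hand-written rule in `rule_based_score` with a model trained on labelled passwords.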
CHAPTER 2: LITERATURE REVIEW
The motivation for building a password strength checker using Python and machine learning
is to create a more effective and efficient tool for evaluating the strength of passwords.
Password strength is an essential aspect of cybersecurity, and weak passwords can be easily
compromised, leading to significant security breaches. By using machine learning, a password
strength checker can analyse various factors that contribute to the strength of a password, such
as length, complexity, use of special characters, and other patterns. The machine learning model
can learn from a large dataset of password examples, which can include both strong and weak
passwords, to identify patterns and correlations that can help it accurately evaluate the strength
of a password.

Python is a popular programming language for machine learning due to its
simplicity, versatility, and wide range of libraries and frameworks that support machine
learning tasks. Using Python, developers can build a password strength checker that can
evaluate passwords in real time, providing immediate feedback to users and helping them to
choose stronger and more secure passwords. Overall, the motivation for building a password
strength checker using Python and machine learning is to enhance cybersecurity and protect
against potential security breaches caused by weak passwords.

A password strength checker using Python and machine learning could make a significant
contribution to enhancing the security of online accounts and reducing the likelihood of data
breaches. Here are a few ways in which such a tool could be beneficial:

- Improved password strength assessment: Machine learning models could be trained on a large
  dataset of passwords to identify common patterns and characteristics of weak and strong
  passwords. This information could then be used to develop a more accurate password strength
  checker that can evaluate the strength of a password based on various factors such as length,
  complexity, and uniqueness.
- Real-time password strength feedback: With a machine learning-based password strength
  checker, users could receive real-time feedback on the strength of their password as they type
  it in. This feedback could include suggestions for how to make the password stronger, such as
  adding special characters or increasing the length.
- Customized password recommendations: A machine learning-based password strength checker
  could analyse a user's previous password choices and provide customized recommendations for
  creating strong passwords that are more likely to be remembered by the user.
- Security alerts: Machine learning algorithms could be trained to recognize patterns of
  suspicious activity, such as multiple failed login attempts or attempts to log in from unusual
  locations. If such activity is detected, the system could alert the user to change their
  password to a stronger one.

Overall, a password strength checker using Python and machine learning could help improve the
security of online accounts and reduce the risk of data breaches by providing users with more
accurate and personalized feedback on the strength of their passwords.
CHAPTER 3: METHODOLOGY
3.i- Dataset
The passwords used in our analysis are from the 000webhost leak, which is available online. How did
we figure out which passwords were stronger and which were weaker? There is a tool
called PARS, from Georgia Tech, which has all the commercial password meters
integrated into it. We gave that tool all the passwords, and it produced a new file for
each commercial password strength meter. Each file contained the passwords with one additional
column, i.e. their strength according to that commercial password strength meter. The commercial
password strength algorithms we used are those of Twitter, Microsoft and Battle. How is this algorithm
different from these strength meters? First, it is based entirely on machine learning rather
than on rules. Secondly, we kept only those passwords that were given the same label (weak, medium or
strong) by all three strength meters. This means that every retained password is weak,
medium or strong according to all three meters.

We had a total of 3 million passwords, but after taking the intersection of the classifications of
the commercial meters, we were left with 0.7 million passwords. The reduction occurred because
we only used passwords that were placed in the same category by all three algorithms.
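The intersection step described above can be sketched as follows. This is a minimal illustration, assuming each meter's output has been loaded as a mapping from password to label; the function name `agreed_labels` and the sample passwords are hypothetical, and the actual PARS file format is not shown in this report.

```python
def agreed_labels(*meters: dict) -> dict:
    """Keep only the passwords that every meter labelled identically."""
    first, rest = meters[0], meters[1:]
    agreed = {}
    for password, label in first.items():
        # A password missing from any meter is dropped as well.
        if all(m.get(password) == label for m in rest):
            agreed[password] = label
    return agreed


# Hypothetical meter outputs: password -> 0 (weak), 1 (medium), 2 (strong)
meter_a = {"abc123": 0, "Summer2020": 1, "x9$Lq!27vTz": 2}
meter_b = {"abc123": 0, "Summer2020": 2, "x9$Lq!27vTz": 2}
meter_c = {"abc123": 0, "Summer2020": 1, "x9$Lq!27vTz": 2}

print(agreed_labels(meter_a, meter_b, meter_c))
```

Here "Summer2020" is discarded because one meter disagrees, which mirrors how the dataset shrank from 3 million to 0.7 million passwords.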
3.ii- Data analysis

After plotting the data as a bar graph, we found that the dataset contains more than six lakh
(600,000) passwords, grouped into three strength levels shown as bars:
0 indicates an easy (weak) password
1 indicates a medium password
2 indicates a strong password
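As a small illustration of this analysis, the sketch below counts how many passwords fall into each class and prints a text bar chart. It uses only the standard library instead of Matplotlib so that it runs anywhere; the sample labels and the helper names are made up for illustration.

```python
from collections import Counter

LABEL_NAMES = {0: "weak", 1: "medium", 2: "strong"}


def class_counts(labels):
    """Count passwords per strength class, in label order 0, 1, 2."""
    counts = Counter(labels)
    return {LABEL_NAMES[k]: counts.get(k, 0) for k in sorted(LABEL_NAMES)}


def text_bar_chart(counts, width=40):
    """Render the counts as horizontal bars scaled to the largest class."""
    largest = max(counts.values()) or 1
    lines = []
    for name, n in counts.items():
        bar = "#" * round(width * n / largest)
        lines.append(f"{name:<7}{bar} {n}")
    return "\n".join(lines)


sample_labels = [1, 1, 1, 0, 2, 1, 0, 1]  # hypothetical labels
counts = class_counts(sample_labels)
print(text_bar_chart(counts))
```

A chart like this makes class imbalance visible, which matters when interpreting a single accuracy figure.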
3.iii- Libraries used
- NumPy
- Pandas
- Matplotlib

3.iv- Algorithms used

i- RandomForestClassifier- Random Forest Classifier is a popular machine
learning algorithm used for classification tasks. It is an ensemble learning method
that combines multiple decision trees and typically produces a more accurate and stable
prediction than a single decision tree.
In a random forest classifier, a set of decision trees is built, each on a different random
sample of the training data and using random subsets of the features. Each tree makes a
prediction, and the final prediction is made by taking the majority vote of all the trees. This
approach reduces overfitting and increases the accuracy and robustness of the model.
The random forest classifier is commonly used for a wide range of applications, such as in
finance for fraud detection, in medicine for disease diagnosis, in image recognition, and in
many other fields. It is a powerful and flexible algorithm that can handle both binary and
multi-class classification problems.
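The two ideas in this description, training each tree on a random sample and combining trees by majority vote, can be sketched in a few lines. This is a conceptual illustration with hypothetical helper names, not the library's implementation; scikit-learn's RandomForestClassifier combines its trees by averaging their predicted class probabilities rather than counting hard votes.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement, one sample per tree."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(tree_predictions):
    """Return the most common label among the trees' predictions.
    On a tie, the label that was seen first wins."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(0)
rows = [("abc123", 0), ("Summer2020", 1), ("x9$Lq!27vTz", 2)]  # hypothetical data
print(bootstrap_sample(rows, rng))

# Suppose five trees predicted these classes for one password:
print(majority_vote([2, 1, 2, 0, 2]))
```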
ii- TfidfVectorizer- The TF-IDF (Term Frequency-Inverse Document Frequency)
vectorizer is a technique used to transform text data into a numerical format that
can be used by machine learning algorithms. It is a commonly used technique for
text classification, information retrieval, and natural language processing.
The TF-IDF vectorizer assigns a weight to each word in a document based on how
frequently it appears in that document (term frequency) and how rare it is across all
documents in the corpus (inverse document frequency). This weighting scheme helps to identify
the most important words in a document and reduces the importance of common words like "the" or
"and".
The TF-IDF vectorizer creates a vector for each document where the length of the vector is
the total number of unique words in the corpus, and each entry in the vector corresponds to
the TF-IDF weight of the corresponding word in the document. This vector representation
can then be used as input to machine learning algorithms.

The TF-IDF vectorizer is a widely used technique and is available in many popular machine
learning libraries like Scikit-learn and TensorFlow. It is particularly useful for tasks like
sentiment analysis, text classification, and topic modelling.
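The weighting can be worked through by hand on a tiny example. The sketch below treats each password as a document and each character as a term, which is an assumption, since the report does not state how passwords were tokenized. It uses the textbook formula tf = count / length and idf = ln(N / df); scikit-learn's TfidfVectorizer uses a smoothed idf and normalizes each vector by default, so its numbers will differ.

```python
import math
from collections import Counter


def char_tfidf(passwords):
    """Textbook TF-IDF over characters: tf = count / length, idf = ln(N / df)."""
    n_docs = len(passwords)
    # Document frequency: in how many passwords does each character occur?
    df = Counter(ch for pw in passwords for ch in set(pw))
    vectors = []
    for pw in passwords:
        counts = Counter(pw)
        vectors.append({
            ch: (n / len(pw)) * math.log(n_docs / df[ch])
            for ch, n in counts.items()
        })
    return vectors


vectors = char_tfidf(["aab", "abc", "a1!"])
for vec in vectors:
    print(vec)
```

The character "a" occurs in every password, so its idf is ln(1) = 0 and it carries no weight, while rarer characters such as "!" receive a positive weight.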

iii- train_test_split- Train-test splitting is a technique used in machine learning to
evaluate the performance of a model. It involves splitting a dataset into two
separate sets: a training set and a testing set.
The training set is used to train the model, while the testing set is used to evaluate the model's
performance on unseen data. The goal is to build a model that can generalize well to new,
unseen data, and not just memorize the training data.
The train-test splitting process involves randomly dividing the dataset into two parts: the
training set and the testing set. The most common split is 80/20 or 70/30, where the training
set contains 70-80% of the data, and the testing set contains the remaining 20-30%.
The training set is used to fit the model to the data by optimizing the model parameters. The
testing set is then used to evaluate the performance of the model by calculating metrics such
as accuracy, precision, recall, and F1-score. These metrics provide insight into how well the
model is performing on unseen data.
It's important to note that the test set should only be used for evaluation purposes, and should
not be used for model training or parameter tuning. If the test set is used for these purposes,
the model can overfit to it, and the evaluation will overestimate the model's performance on new,
unseen data.
In summary, train-test splitting is a crucial step in the machine learning pipeline, as it allows
us to estimate how well our model will perform on new, unseen data.
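The splitting procedure itself is simple enough to write out. The following is a minimal standard-library sketch of an 80/20 split; the function name, the seed, and the sample rows are illustrative assumptions, and in practice scikit-learn's `train_test_split` does the same job with more options.

```python
import random


def split_train_test(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(rows)
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical (password, label) rows
data = [(f"pw{i}", i % 3) for i in range(10)]
train, test = split_train_test(data)
print(len(train), "training rows,", len(test), "testing rows")
```

Every row lands in exactly one of the two sets, so the test rows are never seen during training.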
CHAPTER 4: RESULT

Sl No.   Algorithm                Accuracy
1.       RandomForestClassifier   95.5%
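For reference, accuracy is the fraction of test passwords whose predicted class matches the true class. The sketch below computes it, together with per-class precision, recall and F1, on made-up labels; it does not reproduce the 95.5% figure, and the helper names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def precision_recall_f1(y_true, y_pred, label):
    """Precision, recall and F1 for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical true and predicted classes for eight test passwords
y_true = [0, 1, 1, 2, 1, 0, 2, 1]
y_pred = [0, 1, 1, 2, 0, 0, 1, 1]

print("accuracy:", accuracy(y_true, y_pred))
print("class 1 precision, recall, F1:", precision_recall_f1(y_true, y_pred, 1))
```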
CHAPTER 5: CONCLUSION
We evaluated the model using 80% of the data for training and 20% for testing. On the dataset of
more than six lakh passwords, the RandomForestClassifier achieved an accuracy of 95.5%.
The predicted strength of a password depends on various factors such as the quality
of the password, its length, and the choice of machine learning model. In general, the
accuracy of such a model can vary from 60% to 90%, depending on the complexity of
the problem and the quality of the data.
A password strength checker using machine learning in Python can be a useful tool for
evaluating the strength of passwords. By analyzing the features of a password, such as length,
character types, and patterns, a machine learning model can make predictions about the
password's strength.
To build a password strength checker using machine learning in Python, you would first need
to collect a dataset of password samples with known strengths. You could then pre-process the
data and extract relevant features before training a machine learning algorithm, such as a
decision tree or a neural network.
Once the model is trained, you can use it to predict the strength of new passwords that are
entered into the system. You could also use the model to suggest ways to improve weak
passwords or to enforce password strength requirements in your application or system.
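The workflow described above can be sketched end to end with scikit-learn. This is a minimal illustration under stated assumptions, not the project's actual code: the twelve passwords and their labels are made up, passwords are tokenized character by character, and the hyperparameters are arbitrary. It requires scikit-learn to be installed.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: 0 = weak, 1 = medium, 2 = strong
passwords = [
    "123456", "password", "qwerty", "abc123",
    "Summer2020", "football99", "Dragon2015", "monkey1234",
    "x9$Lq!27vTzA", "G#7kp@Zw3!rXe", "T8&mN^4sQ*1bY", "u!5Vd$0Jc%9Hp",
]
labels = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]

# Hold out 25% of the rows, keeping the class proportions equal in both sets
X_train, X_test, y_train, y_test = train_test_split(
    passwords, labels, test_size=0.25, stratify=labels, random_state=0
)

# Character-level TF-IDF features (case preserved) feeding a random forest
model = make_pipeline(
    TfidfVectorizer(analyzer="char", lowercase=False),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
model.fit(X_train, y_train)

print("test accuracy:", model.score(X_test, y_test))
print("predicted class for 'hello':", model.predict(["hello"])[0])
```

With so few samples the reported accuracy is not meaningful; the sketch only shows how the pieces fit together.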
It's important to note that no password strength checker can guarantee the security of a
password. However, a password strength checker can be a useful tool for encouraging users to
choose stronger passwords and for identifying weak passwords that may be vulnerable to
attacks.
REFERENCE
1. https://www.kaggle.com/datasets/bhavikbb/password-strength-classifier-dataset
2. https://youtu.be/BzManFSX5lg
3. https://youtu.be/x3GfMmzHJa8
