Stack Overflow

This project report summarizes a project to clone the Stack Overflow website. It includes an introduction, literature review, design flow/process, results analysis and validation, and conclusion. The report was submitted by two students, Shashank Singh and Nomula Bala Sai Vighnesh Guptha, under the supervision of Disha Sharma to fulfill the requirements for a Bachelor of Engineering degree.


Stack Overflow Clone Website

A PROJECT REPORT

Submitted By:
Shashank Singh (20BCS9109)
Nomula Bala Sai Vighnesh Guptha (21BCS8054)

in partial fulfilment for the award of the degree of


BACHELOR OF ENGINEERING
(Computer Science & Engineering)
BONAFIDE CERTIFICATE

Certified that this project report “Stack Overflow Clone Website” is the bonafide
work of “Shashank Singh and Nomula Bala Sai Vighnesh Guptha” who carried out
the project work under my/our supervision.

SIGNATURE SIGNATURE

DISHA SHARMA

SUPERVISOR

The project Viva–Voce Examination of__________________ has been held on


____________ and accepted.
ACKNOWLEDGEMENT

We take this opportunity to express our gratitude to all those who have been directly or indirectly
with us during the completion of this project. We offer special thanks to our guide, Disha Sharma, who has
given us guidance and light during this training. We acknowledge here our debt to those who contributed
significantly to one or more steps. We take full responsibility for any remaining sins of omission and
commission.
TABLE OF CONTENTS
1. List of Figures
2. Abstract
3. Chapter 1- Introduction
4. Chapter 2- Literature Review/Background Study
5. Chapter 3- Design Flow/Process
6. Chapter 4- Results Analysis and Validation
7. Chapter 5- Conclusion and Future Work
8. References
List of Figures

1. Use Case Diagram for Stack Overflow

2. Class Diagram for Stack Overflow

3. UML for Stack Overflow

4. Activity Diagram for Stack Overflow

5. Sequence Diagram for Stack Overflow


ABSTRACT

When programmers look for how to achieve certain programming tasks, Stack Overflow is a popular
destination in search engine results. Over the years, Stack Overflow has accumulated an impressive knowledge
base of snippets of code that are amply documented. We are interested in studying how programmers use these
snippets of code in their projects. Can we find Stack Overflow snippets in real projects? When snippets are
used, are they copied literally or adapted? And are these adaptations specializations required by
the idiosyncrasies of the target artifact, or are they motivated by specific requirements of the programmer?
The large-scale study presented in this paper analyzes 909k non-fork Python projects hosted on GitHub, which
contain 290M function definitions, and 1.9M Python snippets captured in Stack Overflow. Results are
presented as quantitative analysis of block-level code cloning intra and inter Stack Overflow and GitHub, and
as an analysis of programming behaviors through the qualitative analysis of our findings.
CHAPTER 1- INTRODUCTION:
This work presents a systematic mapping study and quality evaluation of publications that are written about
Stack Overflow. Stack Overflow (available via stackoverflow.com) is the leading Community Question and
Answer (CQA) platform for programmers, with more than 10 million users, contributing some 16 million
questions as of 2019. This platform’s success is evident from the fact that 92% of questions posted are
answered, with a median answering time of 11 minutes. Parnin and Treude found that 84.4% of general web
searches for jQuery API on Google resulted in at least one Stack Overflow outcome returned on the first page,
showing that programmers can turn to Stack Overflow for questions already asked. Chen and Xing also found
data from Stack Overflow tags to be useful for creating an overall view of technology trends. Software
engineering has a particular reliance on crowdsourced knowledge (and CQAs), given the community’s general
drive and emphasis on knowledge reuse. CQA can benefit software practitioners seeking information, as it is
likely that other practitioners have faced a similar problem, and so, a relevant question may have already been
asked that has invoked a suitable answer. On the other hand, new questions may be created, and experts have
the opportunity to lend their particular experience which allows them to solve a problem and gain the respect
of their peers in the community. Beyond Stack Overflow, Yahoo! Answers, TopCoder and Bountify all
provide crowd-based support for software practitioners’ challenges. However, Stack Overflow tends to be the
preferred choice for developers. While few would doubt the utility of Stack Overflow to software
practitioners, questions regarding the quality of the responses generated to questions abound. For instance, Jin,
et al. studied how the need to ‘win’ reputation rewards can influence answer quality, or the drive and
willingness to underestimate the quality of answers provided by others. Anand and Ravichandran identified a
need for separating answer quality and popularity, as the reward system (contributors on Stack Overflow are
rewarded points for their contributions) provided by Stack Overflow can sometimes conflict with quality
answers. These works suggest that there is a need for caution when approaching Stack Overflow for
recommended solutions to software engineering challenges. However, Stack Overflow code reuse is
ubiquitous. For instance, Abdalkareem, et al. investigated code reused from Stack Overflow in mobile apps
and found that 1.3% of the apps they sampled were constructed from Stack Overflow posts. They also
discovered that mid-aged and older apps contained Stack Overflow code introduced later in their lifetime. An,
et al. also investigated Android apps and found that 62 out of 399 (15.5%) apps contained exact code clones;
and of the 62 apps, 60 had potential license violations. In terms of Stack Overflow, they discovered that 1,226
posts contained code found in 68 apps. Furthermore, 126 Stack Overflow snippets were involved in code
migration, where 12 cases of migration involved apps published under different licenses. Yang, et al. noted
that, in terms of Python projects, over 1% of code blocks in their token form exist in both GitHub and Stack
Overflow. At an 80% similarity threshold, over 1.1% of code blocks in GitHub were similar to those in Stack
Overflow, and 2% of Stack Overflow code blocks were similar to those in GitHub.
CHAPTER 2- LITERATURE REVIEW/BACKGROUND STUDY
Platforms such as Stack Overflow are available for software practitioners to solicit solutions to their challenges
and knowledge needs. This community’s practices have in recent times however triggered quality-related
concerns. This is a noteworthy issue when considering that the Stack Overflow platform is used by numerous
software developers. Academic research tends to provide validation for the practices and processes employed
by Stack Overflow and other such forums. However, previous work did not review the scale of scientific
attention that is given to this cause. Evidence resulting from such an analysis could be useful for understanding
the focus of academic studies when considering Stack Overflow and how research is conducted on this forum.
Furthermore, pertinent research quality evaluations may direct replication studies.

PROBLEM DEFINITION:
Stack Overflow (SO) is an online, collaborative platform where developers post programming questions,
answer existing questions, and find solutions to difficulties they face while programming. A
developer must add tags when posting a question to help other users find out what the question is about.
If an answer solves the questioner's problem, the questioner can select it; this is called the
accepted answer to that question. Members of the site can vote on questions and answers. Positive
votes are called upvotes and negative votes are called downvotes; together they show how helpful a
question or answer was to other users. The score of a question or answer is the number of upvotes
minus the number of downvotes. Users build reputation through their activities on Stack Overflow,
such as posting questions or answers, voting on them, and posting comments.
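
The scoring rule described above can be illustrated with a minimal Python sketch. The class name `Post` and its methods are hypothetical names chosen for this example; they are not taken from the project's code.

```python
from dataclasses import dataclass


@dataclass
class Post:
    """A question or an answer that members can vote on (hypothetical name)."""
    upvotes: int = 0
    downvotes: int = 0

    def upvote(self) -> None:
        self.upvotes += 1

    def downvote(self) -> None:
        self.downvotes += 1

    @property
    def score(self) -> int:
        # The score is the number of upvotes minus the number of downvotes.
        return self.upvotes - self.downvotes


post = Post()
for _ in range(5):
    post.upvote()
post.downvote()
post.downvote()
print(post.score)  # 5 upvotes - 2 downvotes = 3
```

A post with more downvotes than upvotes simply ends up with a negative score under this rule.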

OBJECTIVE:
This website serves as a platform for users to ask and answer questions and, through membership and active
participation, to vote questions and answers up and down, similar to Reddit, and to edit questions and answers in
a fashion similar to a wiki.
CHAPTER 3- DESIGN FLOW/PROCESS

Stack Overflow is a question and answer website for professional and enthusiast programmers. It is the
flagship site of the Stack Exchange Network. It was created in 2008 by Jeff Atwood and Joel Spolsky. It
features questions and answers on a wide range of topics in computer programming. It was created to be a
more open alternative to earlier question and answer websites such as Experts-Exchange. Stack Overflow was
sold to Prosus, a Netherlands-based consumer internet conglomerate, on 2 June 2021 for $1.8 billion.
The website serves as a platform for users to ask and answer questions, and, through membership and active
participation, to vote questions and answers up or down similar to Reddit and edit questions and answers in a
fashion similar to a wiki. Users of Stack Overflow can earn reputation points and "badges"; for example, a
person is awarded 10 reputation points for receiving an "up" vote on a question or an answer to a question, and
can receive badges for their valued contributions, which represents a gamification of the traditional Q&A
website. Users unlock new privileges with an increase in reputation like the ability to vote, comment, and even
edit other people's posts.
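
The reputation mechanism can be sketched as follows. The reward of 10 points per upvote comes from the text above; the privilege thresholds and the names `Member`, `UPVOTE_REWARD`, and `PRIVILEGE_THRESHOLDS` are assumptions made only for illustration.

```python
from dataclasses import dataclass
from typing import List

UPVOTE_REWARD = 10  # stated in the text: 10 points per upvote received

# Hypothetical thresholds, chosen only to illustrate the idea of
# unlocking privileges as reputation grows.
PRIVILEGE_THRESHOLDS = {
    "vote": 15,
    "comment": 50,
    "edit_others_posts": 2000,
}


@dataclass
class Member:
    name: str
    reputation: int = 0

    def receive_upvote(self) -> None:
        # Each upvote on a question or answer adds a fixed reward.
        self.reputation += UPVOTE_REWARD

    def privileges(self) -> List[str]:
        # A privilege is unlocked once reputation reaches its threshold.
        return sorted(
            name
            for name, needed in PRIVILEGE_THRESHOLDS.items()
            if self.reputation >= needed
        )


member = Member("alice")
for _ in range(5):
    member.receive_upvote()
print(member.reputation)    # 50
print(member.privileges())  # ['comment', 'vote']
```

Keeping the thresholds in a single table makes it easy to adjust them without touching the member logic.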
As of March 2021, Stack Overflow has over 14 million registered users, and has received over 21 million questions and
31 million answers. The site and similar programming question and answer sites have globally mostly replaced
programming books for day-to-day programming reference in the 2000s, and today are an important part of computer
programming. Based on the type of tags assigned to questions, the top eight most discussed topics on the site
are: JavaScript, Java, C#, PHP, Android, Python, jQuery, and HTML.

Use Case Diagram


We have five main actors in our system:

• Admin: Mainly responsible for blocking or unblocking members.
• Guest: All guests can search and view questions.
• Member: Members can perform all activities that guests can, in addition to which they can
add/remove questions, answers, and comments. Members can delete and un-delete their questions,
answers or comments.
• Moderator: In addition to all the activities that members can perform, moderators can
close/delete/undelete any question.
• System: Mainly responsible for sending notifications and assigning badges to members.
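
One way to express the actors above is a role-to-permission table. This is a sketch under assumptions: the action names are invented for the example, the System actor is left out because it is not a user account, and the admin is given only the blocking actions that the list mentions.

```python
from enum import Enum


class Role(Enum):
    GUEST = "guest"
    MEMBER = "member"
    MODERATOR = "moderator"
    ADMIN = "admin"


# Action names are hypothetical labels for the activities in the list.
GUEST_ACTIONS = {"search_questions", "view_questions"}
MEMBER_ACTIONS = GUEST_ACTIONS | {
    "add_question", "add_answer", "add_comment",
    "delete_own_post", "undelete_own_post",
}
MODERATOR_ACTIONS = MEMBER_ACTIONS | {
    "close_any_question", "delete_any_question", "undelete_any_question",
}
ADMIN_ACTIONS = {"block_member", "unblock_member"}

PERMISSIONS = {
    Role.GUEST: GUEST_ACTIONS,
    Role.MEMBER: MEMBER_ACTIONS,
    Role.MODERATOR: MODERATOR_ACTIONS,
    Role.ADMIN: ADMIN_ACTIONS,
}


def can(role: Role, action: str) -> bool:
    """Return True if the given role is allowed to perform the action."""
    return action in PERMISSIONS[role]


print(can(Role.GUEST, "add_question"))            # False
print(can(Role.MODERATOR, "close_any_question"))  # True
```

Building each set from the previous one mirrors the wording of the list, where members can do everything guests can and moderators can do everything members can.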

Here are the top use cases for Stack Overflow:

1. Search questions.
2. Create a new question with bounty and tags.
3. Add/modify answers to questions.
4. Add comments to questions or answers.
5. Moderators can close, delete, and un-delete any question.
Fig 1: Use Case Diagram for Stack Overflow
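
The first use case, searching questions, might look like the sketch below. It assumes an in-memory list and a simple substring match; the names `Question` and `search_questions` are hypothetical, and a real site would query a database or a search index instead.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Question:
    title: str
    description: str
    tags: List[str] = field(default_factory=list)


def search_questions(questions: List[Question], keyword: str,
                     tag: Optional[str] = None) -> List[Question]:
    """Return questions whose title or description contains the keyword.

    If a tag is given, only questions carrying that tag are returned.
    """
    needle = keyword.lower()
    matches = []
    for question in questions:
        haystack = f"{question.title} {question.description}".lower()
        if needle in haystack and (tag is None or tag in question.tags):
            matches.append(question)
    return matches


questions = [
    Question("Reverse a list in Python", "How do I reverse a list?", ["python"]),
    Question("Reverse a string in Java", "Need to reverse a String.", ["java"]),
    Question("Sort a dictionary", "Sorting by value.", ["python"]),
]
print([q.title for q in search_questions(questions, "reverse")])
print([q.title for q in search_questions(questions, "reverse", tag="java")])
```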
Class Diagram
Here are the main classes of Stack Overflow System:

• Question: This class is the central part of our system. It has attributes like Title and Description to
define the question. In addition to this, we will track the number of times a question has been viewed
or voted on. We should also track the status of a question, as well as closing remarks if the question
is closed.
• Answer: The most important attributes of any answer will be the text and the view count. In addition
to that, we will also track the number of times an answer is voted on or flagged. We should also track
if the question owner has accepted an answer.
• Comment: Similar to answer, comments will have text, and view, vote, and flag counts. Members
can add comments to questions and answers.
• Tag: Tags will be identified by their names and will have a field for a description to define them. We
will also track daily and weekly frequencies at which tags are associated with questions.
• Badge: Similar to tags, badges will have a name and description.
• Photo: Questions or answers can have photos.
• Bounty: Each member, while asking a question, can place a bounty to draw attention. Bounties will
have a total reputation and an expiry date.
• Account: We will have four types of accounts in the system, guest, member, admin, and moderator.
Guests can search and view questions. Members can ask questions and earn reputation by answering
questions and from bounties.
• Notification: This class will be responsible for sending notifications to members and assigning
badges to members based on their reputations.
Fig 2: Class Diagram for Stack Overflow
Fig 3: UML for Stack Overflow
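
The main classes listed above can be sketched with Python dataclasses. This is an illustrative model, not the project's implementation: the field names, the status values in `QuestionStatus`, and the rule that only one answer is accepted at a time are assumptions.

```python
from dataclasses import dataclass, field
from datetime import datetime
from enum import Enum
from typing import List, Optional


class QuestionStatus(Enum):
    # The text says a status is tracked; these values are assumed.
    OPEN = "open"
    CLOSED = "closed"
    DELETED = "deleted"


@dataclass
class Tag:
    name: str
    description: str = ""


@dataclass
class Comment:
    text: str
    vote_count: int = 0
    flag_count: int = 0


@dataclass
class Bounty:
    reputation: int
    expiry: datetime


@dataclass
class Answer:
    text: str
    accepted: bool = False
    vote_count: int = 0
    flag_count: int = 0
    comments: List[Comment] = field(default_factory=list)


@dataclass
class Question:
    title: str
    description: str
    tags: List[Tag] = field(default_factory=list)
    status: QuestionStatus = QuestionStatus.OPEN
    closing_remark: Optional[str] = None
    view_count: int = 0
    vote_count: int = 0
    bounty: Optional[Bounty] = None
    answers: List[Answer] = field(default_factory=list)
    comments: List[Comment] = field(default_factory=list)

    def accept(self, answer: Answer) -> None:
        # Assumption: at most one answer is accepted at a time.
        for candidate in self.answers:
            candidate.accepted = candidate is answer

    def close(self, remark: str) -> None:
        self.status = QuestionStatus.CLOSED
        self.closing_remark = remark


question = Question("How do I reverse a list?", "Looking for an idiomatic way.",
                    tags=[Tag("python")])
first = Answer("Use reversed(xs).")
second = Answer("Use xs[::-1].")
question.answers += [first, second]
question.accept(second)
question.close("Resolved")
print([a.accepted for a in question.answers])  # [False, True]
print(question.status.value)                   # closed
```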

Activity Diagram
Post a new question: Any member or moderator can perform this activity. Here are the steps to post a
question:
Fig 4: Activity Diagram for Stack Overflow
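
Since the diagram itself is not reproduced in the text, the following sketch shows one plausible flow for posting a question. All steps are assumptions for illustration: a blocked member is rejected, the title, description, and at least one tag are required, and an optional bounty is deducted from the member's reputation. The names `Member`, `Question`, and `post_question` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Member:
    name: str
    reputation: int = 0
    blocked: bool = False


@dataclass
class Question:
    author: str
    title: str
    description: str
    tags: List[str]
    bounty: int = 0


def post_question(member: Member, title: str, description: str,
                  tags: List[str], bounty: int = 0) -> Question:
    """Validate the input and create a question (assumed steps)."""
    if member.blocked:
        raise PermissionError("blocked members cannot post questions")
    if not title.strip() or not description.strip():
        raise ValueError("title and description are required")
    if not tags:
        raise ValueError("at least one tag is required")
    if bounty < 0 or bounty > member.reputation:
        raise ValueError("bounty must be between 0 and the member's reputation")
    # Assumption: the bounty is paid from the asker's own reputation.
    member.reputation -= bounty
    return Question(member.name, title.strip(), description.strip(),
                    list(tags), bounty)


asker = Member("alice", reputation=100)
question = post_question(asker, "Reverse a list", "How do I reverse a list?",
                         ["python"], bounty=50)
print(question.bounty)   # 50
print(asker.reputation)  # 50
```

Raising an exception at the first failed check keeps the function short; a web handler would typically translate these exceptions into error messages for the user.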
Sequence Diagram
Following is the sequence diagram for creating a new question:

Fig 5: Sequence Diagram for Stack Overflow


CHAPTER 4- RESULTS ANALYSIS AND VALIDATION
Outcomes show that Stack Overflow has attracted increasing research interest over the years, with topics
relating to both community dynamics and human factors, and technical issues. In addition, research studies
have been largely evaluative or proposed solutions; however, the latter approach tends to lack validation. The
contributions of these studies are often techniques or answers to a specific problem. Evaluating the quality of
all studies that were dedicated to software programming (58 papers), our outcomes show that on average only
58% of the developed quality criteria were met.
CODES:
CHAPTER 5- CONCLUSION AND FUTURE WORK
In this paper we conducted a systematic mapping study and quality evaluation on the crowdsourced knowledge
platform Stack Overflow to understand and validate the research that has been conducted on this platform thus
far. Software engineering has a particular reliance on CQA platforms, and particularly with the help that is
offered by programming code included in solutions (or answers) provided by contributors. With significant
interest in CQAs, and Stack Overflow in particular, it is necessary for academic work to ensure that
information provided in answers (and questions) on Stack Overflow is relevant, up-to-date and of high quality.
Ongoing research interest may indeed address this gap; however, previous work did not review the scale of
scientific attention that is given to this cause. We have looked to do so in this systematic mapping study and
quality evaluation, exploring the level of academic interest that is given to Stack Overflow, the topics that are
researched, the approaches that are used, the contributions that are provided by research studies, and the quality
of Programming/Development papers.
The key to Stack Overflow’s future and growth is the millions of developers from around the world who find
the site useful, but who haven’t yet been welcomed into the community. We need to expand our reach and
engagement to ensure these developers join the conversation and push their own learning to new heights.
REFERENCES
[1] Acar, Y., Backes, M., Fahl, S., Kim, D., Mazurek, M.L. and Stransky, C., You Get Where You're Looking For: The Impact
Of Information Sources on Code Security. in Proceedings of the IEEE Symposium on Security and Privacy (SP), (2016),
IEEE, 289-305.
[2] Afzal, W., Torkar, R. and Feldt, R. A systematic review of search-based testing for non-functional system properties.
(2009). Information and Software Technology, 51 (6). 957-976.
[3] Anand, D. and Ravichandran, S. Investigations into the Goodness of Posts in Q&A Forums—Popularity Versus
Quality. in Information Systems Design and Intelligent Applications, Springer, (2015), 639-647.
[4] Asaduzzaman, M., Mashiyat, A.S., Roy, C.K. and Schneider, K.A., Answering questions about unanswered questions
of stack overflow. in Proceedings of the 10th Working Conference on Mining Software Repositories, (2013), IEEE Press,
97-100.
[5] Cavusoglu, H., Li, Z. and Huang, K.-W., Can Gamification Motivate Voluntary Contributions?: The Case of
StackOverflow Q&A Community. in Proceedings of the 18th ACM Conference Companion on Computer Supported
Cooperative Work & Social Computing, (2015), ACM, 171-174.
[6] Chen, C. and Xing, Z., Mining technology landscape from stack overflow. in Proceedings of the 10th ACM/IEEE
International Symposium on Empirical Software Engineering and Measurement, (2016), ACM, 14.
[7] Genc-Nayebi, N. and Abran, A., A systematic literature review: Opinion mining studies from mobile app store user
reviews. (2017). Journal of Systems and Software, 125. 207-219.
[8] Ginsca, A.L. and Popescu, A., User profiling for answer quality assessment in Q&A communities. in Proceedings of
the 2013 workshop on Data-driven user behavioral modelling and mining from social media, (2013), ACM, 25-28.
[9] Gupta, R. and Reddy, P.K., Learning from Gurus: Analysis and Modeling of Reopened Questions on Stack Overflow.
in Proceedings of the 3rd IKDD Conference on Data Science, (2016), ACM, 13.
[10] Hall, T., Beecham, S., Bowes, D., Gray, D. and Counsell, S. A systematic literature review on fault prediction
performance in software engineering. (2012). IEEE Transactions on Software Engineering, 38 (6). 1276-1304.
