Predicting the rise and fall of an open source project

Having abandoned or unmaintained components in your projects is not ideal. 

Why? Because unmaintained open source components are tech debt. Depending on projects that are no longer maintained is a serious risk and a potential black hole for resources.

Most companies are used to battling vulnerabilities and license issues, but what about making sure the open source components you use are likely not just to survive but to thrive? Is there a way of estimating, or predicting, which projects will fail?

Why do projects fail?

Maintenance practices and the lack thereof 

In order to figure out if we in fact can predict failure, we need to understand why projects fail in the first place. 

While developing Open Source Select, the Debricked R&D team read a lot of research and wrangled the data to figure out what correlates with the rise or fall of OSS projects. We asked ourselves, “Are there any distinguishing characteristics of abandoned projects compared to continuously maintained ones?”

A wish for documentation 

It turns out that projects that do not document their contribution process, and so do not make it easy for new contributors to onboard themselves, fail more often than those that do.

This documentation typically lives in the Contributing{.md/.txt/.rst} file, where developers read up on how to contribute or submit code changes to the project. Only 16% of failed projects had such a file, compared with 72% of top-performing projects and 32% of a random sample. Furthermore, a CI setup for continuous testing, building, and deployment correlates with a project's survival: 27% of failed projects had a CI setup, compared with 68% of top projects and 45% of a random project sample.
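As a rough illustration, the two signals above can be checked on a local checkout of a repository. This is only a minimal sketch, not how Open Source Select does it: the function names, the directories searched, and the list of CI configuration paths are all assumptions made for the example.

```python
from pathlib import Path

# File names counted as a contribution guide (compared case-insensitively).
CONTRIBUTING_NAMES = {
    "contributing",
    "contributing.md",
    "contributing.txt",
    "contributing.rst",
}

# Assumption: a handful of common CI configuration locations.
# The list is illustrative, not exhaustive.
CI_MARKERS = [
    ".github/workflows",
    ".gitlab-ci.yml",
    ".travis.yml",
    ".circleci/config.yml",
    "Jenkinsfile",
]


def has_contributing_file(repo: Path) -> bool:
    """Return True if a contribution guide exists in a usual location."""
    # Assumption: guides live in the root, docs/ or .github/ directory.
    for directory in (repo, repo / "docs", repo / ".github"):
        if not directory.is_dir():
            continue
        for entry in directory.iterdir():
            if entry.is_file() and entry.name.lower() in CONTRIBUTING_NAMES:
                return True
    return False


def has_ci_setup(repo: Path) -> bool:
    """Return True if any of the known CI configuration paths exists."""
    return any((repo / marker).exists() for marker in CI_MARKERS)


def health_signals(repo: Path) -> dict:
    """Collect both signals for one repository checkout."""
    return {
        "contributing_file": has_contributing_file(repo),
        "ci_setup": has_ci_setup(repo),
    }
```

Calling health_signals(Path("path/to/checkout")) returns a small dictionary with both flags. Neither signal proves anything on its own; the numbers above describe correlations, not guarantees.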

Look left and right before importing 

The truck factor of different projects 

When looking at which projects fail, it is important to analyse their Truck Factor. By Truck Factor we mean the minimum number of developers who would have to stop contributing for the project to die; those developers are the Truck Factor developers. They can be identified in several ways, for instance by analysing the commit history, code ownership, or pull request activity.
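To make the idea concrete, here is a minimal sketch of a Truck Factor estimate based on code ownership. It uses a simplified greedy heuristic: keep removing the developer who covers the most files until more than half of the files have no remaining author. The input format, the 50% threshold, and the function name are assumptions made for this example; it is not the method used by Open Source Select or by the studies referenced in this post.

```python
def estimate_truck_factor(file_authors: dict) -> int:
    """Greedy Truck Factor estimate.

    file_authors maps each file path to the set of developers considered
    its authors (how authorship is decided is left to the caller).
    """
    # Work on a copy and ignore files without any known author.
    remaining = {path: set(devs) for path, devs in file_authors.items() if devs}
    total = len(remaining)
    if total == 0:
        return 0

    removed = 0
    while True:
        abandoned = sum(1 for devs in remaining.values() if not devs)
        # Stop once more than half of the files have lost all authors.
        if abandoned * 2 > total:
            return removed

        # Count how many files each remaining developer covers.
        coverage = {}
        for devs in remaining.values():
            for dev in devs:
                coverage[dev] = coverage.get(dev, 0) + 1

        # Remove the developer covering the most files
        # (ties are broken alphabetically to keep the result deterministic).
        top = min(coverage, key=lambda dev: (-coverage[dev], dev))
        for devs in remaining.values():
            devs.discard(top)
        removed += 1


# Hypothetical example data: alice is the only author of most files.
example = {
    "core.py": {"alice"},
    "api.py": {"alice"},
    "docs.py": {"bob"},
}
print(estimate_truck_factor(example))  # 1
```

In the example, losing alice alone orphans two of the three files, so the estimate is 1, which is exactly the situation the numbers below warn about.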

In Open Source Select, we call this “Core Team Commitment”. Because it scales well, this feature analyses the activity of all maintainers with merge rights to the repository. Avelino et al. (see References) investigated the impact of the Truck Factor on failed projects and found that 66% of all failed projects had a Truck Factor of 1, compared with 57% of all repositories in their dataset.

On top of that, 16% of projects in their dataset faced a “failure event”, where a core team no longer existed in the project and activity was dead for at least a year. Luckily, in 41% of these cases, the failed projects managed to get back on their feet.
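A “failure event” of this kind can be approximated with a very small check. The sketch below assumes you already know who the core developers are and when each of them last committed; the function name, the input format, and the 365-day cutoff are assumptions for illustration and not the exact definition used in the study.

```python
from datetime import date, timedelta


def faces_failure_event(
    last_core_commits: dict,
    today: date,
    inactivity: timedelta = timedelta(days=365),
) -> bool:
    """Return True if no core developer has committed within the window.

    last_core_commits maps each core developer to the date of their most
    recent commit. An empty mapping is treated as "no core team exists".
    """
    if not last_core_commits:
        return True
    return all(today - last >= inactivity for last in last_core_commits.values())
```

For example, faces_failure_event({"alice": date(2020, 1, 1)}, today=date(2021, 6, 1)) is True, because the only core developer has been silent for more than a year.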

So now we know that projects with few Truck Factor developers and no proper documentation stand a bigger risk of failing. If a project cannot easily onboard new contributors, its supply of developers will eventually dry up. As they say, “A one-man ship can only stay afloat for so long”.

Maintainers leaving; what’s up with that? 

Life as an open source maintainer is quite the balancing act. Usually, these developers maintain projects in their spare time or during a few precious hours as part of other employment. According to Coelho and Valente (see References), the reasons maintainers leave their projects can be roughly divided into Project, Team, and Environment, where Project (37%) is the most common category.

Why maintainers leave 

Taking a closer look, Project consists of Obsolete (your project isn’t relevant anymore), Outdated Technologies, and Low Maintainability, where Obsolete is the most common. The Team (35%) category consists of Lack of time, Lack of interest, and Conflict among developers. Here, lack of time and lack of interest make up most cases. Environment (27%) covers broader factors that are harder for the maintainer to control. Usurped by a competitor (typically, a tech company releasing an internal tool that competes with the project) makes up almost all cases in Environment and is the single most common reason maintainers leave.

So, how do I find well-maintained projects?

Our tool Open Source Select digs rather deep into some of this data to help developers choose and compare open source projects. Choosing carefully can potentially save you lots of time, sweat and money in the long run. Select gives you some of these key data points and helps you make informed decisions, giving you the ability to curate a codebase of healthy, thriving projects that actually solve your problems, rather than create new ones.   

Open Source Select is only a baby, a beta baby, and there’s a lot more to be added. For example, we want to help you search in code (yes, in the code) for functionality, contextualize your searches to make sure you get results that align with your organizational policies, and much more. Stay tuned! 

This post also exists in the form of a talk; watch it here.


References 

[Why Modern Open Source Projects Fail, Jailton Coelho and Marco Tulio Valente. 2017, https://doi.org/10.1145/3106237.3106246]

[On the abandonment and survival of open source projects: An empirical investigation, Guilherme Avelino, Eleni Constantinou, Marco Tulio Valente and Alexander Serebrenik. 2019, https://arxiv.org/abs/1906.08058]

[Understanding the Factors That Impact the Popularity of GitHub Repositories, Hudson Borges, Andre Hora and Marco Tulio Valente. 2016, https://ieeexplore.ieee.org/document/7816479]
