Project Health
a baseline to start from
Learning Objectives
- Name the six dimensions of a minimum viable research software project.
- Score your own project on each dimension.
- Identify the highest-priority gaps.
Before
Three weeks after Jess started her post-doc, she got an email from Aaron at another lab, who had been trying to get the simulator running for two days without success. The setup instructions referenced a Conda environment file that no longer matched the pinned versions, and the tests didn't pass because he didn't have the dataset that Jess had downloaded to her machine.
"Healthy" for a research software project means more than "runs on my machine." It means the project is findable by people who could use it, reproducible by people who want to build on it, and maintainable by someone who isn't you. Most research software projects score well on none of these, and most researchers don't find out until someone like Aaron emails them.
What "Healthy" Means
You can use these six dimensions to assess the state of your project:
- Findability
- The project has a DOI and a CITATION.cff file, is listed in at least one relevant registry, and follows the FAIR Principles for findable, accessible, interoperable, and reusable data [Lin2020, Wilkinson2016]. Without these, two groups will independently build the same tool, and neither will find the other until years later.
- Reproducibility
- Dependencies are pinned, the project runs inside a virtual environment or container, and there is a script that re-runs the whole analysis from raw data with a fixed seed for any stochastic steps [Taschuk2017]. Without this, collaborators can't run the code or check results.
- Testability
- Automated tests exist and are run on every change via continuous integration. Without this, every fix can introduce new bugs. (You may or may not track test coverage; if you do, you monitor it so that it doesn't quietly decline.)
- Contribution Pathway
- There is a CONTRIBUTING.md that a stranger could follow, issues are labeled so newcomers can find things to do, and the process for submitting and reviewing pull requests is written down. Without this, potential contributors will quickly give up.
- Governance
- There are written rules for who decides what and how those decisions are made public. Without this, decisions get made by whoever shouts loudest or gives up last.
- Sustainability
- More than one person can make a release, the lottery factor is documented, and there is a succession plan. Without this, the project dies when the post-doc moves on.
These six points focus mostly on the software, not on the team, because that's what most participants in this workshop have the most experience with, and what they are most comfortable talking about at first. Later modules will talk more about the human aspects of management.
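As one small illustration of the reproducibility dimension above, the sketch below flags requirements that are not pinned to an exact version. It assumes a plain requirements.txt with one requirement per line; the function name `unpinned_requirements` and the example package list are made up for illustration, and the check deliberately ignores URLs, environment markers, and other parts of the requirements format.

```python
def unpinned_requirements(text: str) -> list[str]:
    """Return requirement lines that do not pin an exact version with '=='.

    Rough heuristic only: assumes one requirement per line.
    """
    loose = []
    for raw in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt" or "-e .".
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            loose.append(line)
    return loose


# Hypothetical file contents, for illustration only.
example = """\
numpy==1.26.4
pandas>=2.0
scipy
# plotting
matplotlib==3.8.2
"""

print(unpinned_requirements(example))  # ['pandas>=2.0', 'scipy']
```

An empty result is only weak evidence: pinned direct dependencies can still pull in unpinned transitive ones, which is one reason the rubric also asks for a virtual environment or container.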
Honesty is Uncomfortable
Starting with an honest audit is uncomfortable. It matters because it tells you what to fix first. Improving your lowest dimension by one point before next week is worth more than polishing a dimension that's already 4/5.
In our experience,
an LLM will usually give your project a higher score than it deserves:
it has no way to know whether your CONTRIBUTING.md has been tested recently,
whether your CI actually blocks merges,
or whether anyone other than you can cut a release.
After
Here are Jess's scores:
| dimension | score | explanation |
|---|---|---|
| Findability | 2/5 | GitHub repo exists, no DOI, not in any registry |
| Reproducibility | 1/5 | a requirements.txt exists but versions aren't pinned |
| Testability | 3/5 | a pytest suite exists, but CI is not configured |
| Contribution pathway | 1/5 | CONTRIBUTING.md is one sentence: "We welcome pull requests." |
| Governance | 0/5 | nothing is written down |
| Sustainability | 2/5 | Jess can make a release; her colleague Tahia probably could with help |
The 0 on governance was the one that surprised her. She hadn't written down how decisions were made because nobody had disagreed yet. She found out later that this pattern is nearly universal: governance feels unnecessary until the moment it isn't.
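Scores like these are easy to keep as data so that the lowest dimensions stand out. The following is a minimal sketch using the values from Jess's table; the helper name `largest_gaps` is an assumption made for this example, not part of the workshop material.

```python
# Jess's scores from the table above, each out of 5.
scores = {
    "Findability": 2,
    "Reproducibility": 1,
    "Testability": 3,
    "Contribution pathway": 1,
    "Governance": 0,
    "Sustainability": 2,
}


def largest_gaps(scores: dict[str, int], n: int = 2) -> list[str]:
    """Return the n lowest-scoring dimensions.

    sorted() is stable, so tied dimensions keep the order of the table.
    """
    return sorted(scores, key=scores.get)[:n]


print(largest_gaps(scores))  # ['Governance', 'Reproducibility']
```

Reproducibility and Contribution pathway are tied at 1/5 here, so which of them counts as the second gap is a judgment call that sorting cannot settle.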
Exercises
Self-Audit (10 min)
Using the six-dimension rubric, score your project. Include evidence for each dimension (e.g., the URL of the governance description).
LLM Audit (5 min)
Repeat the audit using an LLM, with a prompt such as:
I have a research software project called [name]. It does [one sentence]. Its repository is [URL if public]. Score it on these six dimensions and explain each score: findability, reproducibility, testability, contribution pathway, governance, sustainability.
Note any dimension where the LLM seems to have assumed something exists that you know doesn't. Note dimensions where the LLM's score differs from yours by more than one point.
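If you record both sets of scores, the comparison can be done mechanically. The sketch below is one possible way to do it: `own` reuses Jess's scores from earlier in this lesson, while the values in `llm` are invented placeholders rather than real model output, and the function name `disagreements` is likewise hypothetical.

```python
def disagreements(
    own: dict[str, int], llm: dict[str, int], threshold: int = 1
) -> dict[str, tuple[int, int]]:
    """Return dimensions whose two scores differ by more than `threshold` points."""
    return {
        dim: (own[dim], llm[dim])
        for dim in own
        if abs(own[dim] - llm[dim]) > threshold
    }


# Self-audit scores (Jess's, from this lesson).
own = {
    "Findability": 2,
    "Reproducibility": 1,
    "Testability": 3,
    "Contribution pathway": 1,
    "Governance": 0,
    "Sustainability": 2,
}

# Invented placeholder values standing in for an LLM's answer.
llm = {
    "Findability": 3,
    "Reproducibility": 3,
    "Testability": 4,
    "Contribution pathway": 3,
    "Governance": 2,
    "Sustainability": 2,
}

# Each value is (own score, LLM score).
for dim, pair in disagreements(own, llm).items():
    print(dim, pair)
```

The numbers only show where the two audits diverge; deciding which one is right still depends on the evidence you collected in the self-audit.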
Action Planning (5 min)
- Identify the two dimensions with the largest gaps.
- Write one concrete action you could take in the next two weeks to raise the score.