Research Software Design by Example

Advent 202 by Danielle Navarro

When I first began working with scientists I was puzzled by how little they knew about programming. I soon realized that they knew far more about it than I did about galaxy formation, quantum chemistry, or protein folding, and that the principal reason they didn't know more was that people like me had never taught them.

It took me longer to understand that many of the practices I was pushing them to adopt were a poor fit for their actual problems: branches in version control aren't designed to manage dozens of experimental variations on a machine learning algorithm, and it's hard to write unit tests when no one in the world knows what the correct answer is.

This tutorial is my latest attempt to teach a few of the things that I am sure are useful. All material is available under open licenses; if you have suggestions or would like to contribute, please get in touch.

Learner Persona

Maya has a master's degree in genomics. She knows enough Python to analyze data from her experiments, but is struggling to write code other people can use. These lessons will teach her how to design, build, and test large programs in less time and with less pain.

Syllabus

  1. Introduction
  2. Making Grids
  3. Testing
  4. Parsing Messy Data
  5. Synthesizing Data
  6. A Complete Scenario
  7. Conclusion

Appendices

  1. License
  2. Code of Conduct
  3. Contributing
  4. Bibliography
  5. Glossary
  6. Links