Contributing

Contributions are very welcome. Please file issues or submit pull requests in our GitHub repository. All contributors will be acknowledged.

In Brief

Logical Structure

Tags for Issues and Pull Requests

FAQ

Why SQL?
Because if you dig down far enough, almost every data science project sits on top of a relational database. (Jon Udell once called PostgreSQL “an operating system for data science”.) SQL’s relational model has also been a powerful influence on dataframe libraries like the tidyverse, Pandas, and Polars; understanding the former therefore helps people understand the latter.
Why Ark?
The first version of this tutorial used Jekyll because it is the default for GitHub Pages and because its frustrating limitations would discourage contributors from messing around with the template instead of writing content. However, those limitations proved more frustrating than anticipated: in particular, very few data scientists speak Ruby, so previewing changes locally required them to install and use yet another language framework.
Why Make?
It runs everywhere, no other build tool is a clear successor, and, like Jekyll, it’s uncomfortable enough to use that people won’t be tempted to fiddle with it when they could be writing.
Why hand-drawn figures rather than Graphviz or Mermaid?
Because it’s faster to Just Effing Draw than it is to try to tweak layout parameters for text-to-diagram systems. If you really want to make developers’ lives better, build a diff-and-merge tool for SVG: programmers shouldn’t have to use punchard-compatible data formats in the 21st Century just to get the benefits of version control.
Why make this tutorial freely available?
Because if we all give a little, we all get a lot.