Continuous Integration

Running tests and security checks automatically on every pull request

Concepts

Continuous integration (CI) is the practice of running tests and other checks automatically every time a change is proposed
- problems are caught before code is merged rather than after
A GitHub Actions workflow is a YAML file stored in .github/workflows/ that describes when to run, what machine to use, and what steps to execute in order
A workflow is triggered by an event (such as opening a pull request)
- runs one or more jobs in parallel by default
- each job runs a sequence of steps that execute commands or call pre-built actions
pip-audit scans a project's dependencies against a database of known vulnerabilities and exits with a non-zero status if any are found
- causes the CI job to fail
Ruff's S (security) rule set flags patterns that are commonly exploited: shell injection, weak hash algorithms, hardcoded passwords
A failing CI check on a pull request does not block the merge by default
- configure branch protection rules to require the check to pass before merging

Why automate?

Running tests by hand before every merge is aspirational, not realistic
- within a month of starting a project, you will forget
- within six months, your collaborators will forget too
A CI workflow runs the same checks every time, on a clean machine, without anyone having to remember to do it

A first workflow

A .github/workflows/ci.yml file that triggers on pull requests to main
Setting up the environment with astral-sh/setup-uv and uv sync
Running uv run task check and uv run pytest as separate steps so that a linting failure and a test failure produce separate, identifiable results

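name: CI

# Run on every pull request that targets main
on:
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4       # fetch the repository contents
      - uses: astral-sh/setup-uv@v5     # install uv on the runner
      - run: uv sync                    # build the environment from uv.lock
      - run: uv run task check          # the project's check task (linting)
      - run: uv run pytest              # the test suite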

Why ubuntu-latest rather than macos-latest: Linux runners are faster and cheaper on GitHub's free tier, and the application has no platform-specific code

Scanning for vulnerable dependencies

pip-audit checks the packages installed by uv sync (the versions pinned in uv.lock) against the Python Packaging Advisory Database of known vulnerabilities
Adding it as a separate job means a vulnerability in a dependency fails CI independently of whether the tests pass

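  # Goes under the existing jobs: key, alongside test
  audit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: astral-sh/setup-uv@v5
      - run: uv sync
      - run: uv run pip-audit   # assumes uv sync installs pip-audit, e.g. as a development dependency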

The difference between a vulnerability in a direct dependency (a package you chose) and a transitive dependency (a package your package depends on), and why both matter
What to do when pip-audit reports a vulnerability in a package you cannot easily upgrade
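
The direct/transitive distinction can be sketched as a walk over a dependency graph. The graph below is a hand-written illustration, not output from uv or pip-audit, and the name transitive_dependencies is hypothetical.

```python
# Illustrative graph only: each package maps to the packages it requires.
EXAMPLE_GRAPH = {
    "flask": ["werkzeug", "jinja2"],
    "jinja2": ["markupsafe"],
    "werkzeug": ["markupsafe"],
    "markupsafe": [],
}


def transitive_dependencies(graph, direct):
    """Return every package reachable from the direct dependencies, excluding them."""
    seen = set()
    stack = list(direct)
    while stack:
        package = stack.pop()
        for requirement in graph.get(package, []):
            if requirement not in seen:
                seen.add(requirement)
                stack.append(requirement)
    return seen - set(direct)
```

In this sketch a project that chose only flask still installs markupsafe, so a vulnerability there ships with the application even though nobody picked that package on purpose.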

Static security analysis

Enabling ruff's S rule set in pyproject.toml to catch dangerous patterns in the application's own code
Common violations in LLM-generated web code:
- subprocess with shell=True
- hashlib.md5 for anything security-sensitive
- SQL strings assembled by string concatenation rather than parameterized queries
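
The three patterns above, each next to a safer alternative, might look like the sketch below. The function names and the users table are hypothetical, and the rule codes in the comments (S602, S324, S608) reflect ruff's numbering as I understand it, so confirm them against your ruff version.

```python
import hashlib
import sqlite3
import subprocess
import sys


def echo(text: str) -> str:
    # Flagged form (S602): subprocess.run(f"echo {text}", shell=True)
    # With an argument list no shell parses the text, so ";" and "&&" stay literal.
    completed = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", text],
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


def fingerprint(data: bytes) -> str:
    # Flagged form (S324): hashlib.md5(data).hexdigest()
    # SHA-256 suits integrity checks; password storage needs a dedicated KDF.
    return hashlib.sha256(data).hexdigest()


def find_user(conn: sqlite3.Connection, name: str) -> list[tuple]:
    # Flagged form (S608): "SELECT id FROM users WHERE name = '" + name + "'"
    # The ? placeholder sends the name as data, never as SQL text.
    return conn.execute("SELECT id FROM users WHERE name = ?", (name,)).fetchall()
```

With the parameterized query, an input such as x' OR '1'='1 is compared as a plain string and matches no rows.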
Adding uv run ruff check --select S as a step, or folding it into check
The difference between what pip-audit finds (problems in packages you depend on) and what ruff's S rules find (problems in code you wrote)

Reading the results

Where to find workflow output: the "Actions" tab, per-job logs, and inline annotations on the pull request diff
How to read a failed step: exit code, last few lines of output, and which step failed tell you where to start
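Pushing a fix to the same branch re-runs the workflow automatically, so no new pull request is needed; the "Re-run failed jobs" button is for failures unrelated to the code, such as a network timeout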
What to do when a check fails because of a problem that is not yours to fix: suppress with a comment, open an issue, document the decision
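
Suppressing with a comment could look like the sketch below. It assumes ruff reports S603 on this call because the argument list is not made entirely of literals; the function name python_version is hypothetical.

```python
import subprocess
import sys


def python_version() -> str:
    """Return the interpreter's version banner, for example "Python 3.12.4"."""
    argv = [sys.executable, "--version"]
    # Suppression is safe here: argv is built from fixed values only, and no
    # user input reaches it. The noqa names one rule rather than silencing all.
    completed = subprocess.run(argv, capture_output=True, text=True, check=True)  # noqa: S603
    return completed.stdout.strip()
```

Naming the rule code keeps the suppression narrow, and the comment above it records the decision for the next reader.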

Check for Understanding

What is the difference between a CI job and a CI step?

A job is a collection of steps that run sequentially on the same virtual machine. Multiple jobs in a workflow run in parallel by default, each on its own machine, and each starts from a clean environment. A step is a single command or action within a job. Steps within a job share the same filesystem and environment variables; jobs do not share anything unless you explicitly pass artifacts between them.

Why run the linting check and the test suite as separate steps rather than one shell command?

When they are separate steps, GitHub shows each one as a named row with its own pass/fail indicator. If linting fails and tests are not run (because the step failed and subsequent steps are skipped by default), you know immediately that the problem is a linting violation, not a test failure. A single combined command produces a single pass/fail result, which gives you less information when something goes wrong.
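
The skip-after-failure behaviour described above can be modelled in a few lines. This is a simplified sketch of the idea, not GitHub's implementation; run_steps and exits_with are hypothetical helpers, and each step is stood in for by a child Python process that exits with a chosen status.

```python
import subprocess
import sys


def exits_with(code: int) -> list[str]:
    """Build a command that does nothing except exit with the given status."""
    return [sys.executable, "-c", f"import sys; sys.exit({code})"]


def run_steps(steps):
    """Run (name, argv) pairs in order and skip everything after the first failure."""
    results = []
    failed = False
    for name, argv in steps:
        if failed:
            results.append((name, "skipped"))
            continue
        completed = subprocess.run(argv, capture_output=True)
        if completed.returncode == 0:
            results.append((name, "passed"))
        else:
            # Any non-zero exit status counts as a failure.
            results.append((name, "failed"))
            failed = True
    return results
```

Each step keeps its own row in the result, which is the information a single combined command would lose.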

What does pip-audit check, and what does it not check?

pip-audit checks the packages installed in the environment against a database of publicly disclosed vulnerabilities, each identified by an advisory ID (PYSEC, GHSA, or CVE). It does not check for bugs in your own code, insecure coding patterns, or vulnerabilities that have not yet been publicly disclosed. It is a necessary check, but not a sufficient one, which is why ruff's S rules and code review still matter.

If a CI check fails on a pull request, does that prevent the merge?

Not by default. GitHub shows the failed check on the pull request page, but still offers a "Merge" button unless you have configured branch protection rules to require the check to pass. Go to the repository settings, select "Branches", add a branch protection rule for main, enable "Require status checks to pass before merging", and select the CI job as a required check. Without that setting, CI is advisory, not mandatory.

Exercises

Require the check to pass

Configure branch protection rules on your repository so that the CI workflow must pass before a pull request can be merged into main. Open a pull request that deliberately breaks a test and confirm that GitHub blocks the merge. Fix the test, push again, and confirm the merge becomes available once CI passes.

Add a matrix build

Modify the workflow to run the test suite on both Python 3.12 and Python 3.13 in parallel using a build matrix. Ask the LLM to show you the strategy.matrix syntax. Read the generated YAML before accepting it and verify that both Python versions appear as separate jobs in the Actions tab.

Audit the current dependencies

Run uv run pip-audit locally and examine the output. If any vulnerabilities are reported, look up the advisory identifier for one of them (a PYSEC, GHSA, or CVE ID) and read its description. Write a one-paragraph summary of what the vulnerability is, whether it affects this application's use of the package, and what the remediation options are.

Triage a ruff S violation

Enable ruff's S rule set locally by adding it to pyproject.toml and run uv run ruff check --select S. For each violation reported, decide whether it represents a real risk in this application or a false positive. For violations you suppress, write an inline comment that explains why the suppression is safe. For violations that represent real risks, fix them.