Database Foundations

Introducing SQLite for persistent, queryable storage

Concepts

Why a database?

Creating a schema

Opening a connection with sqlite3

Basic SQL

Inserting and querying records

Check for Understanding

What is SQL injection, and why does using ? placeholders prevent it?

SQL injection happens when user input is concatenated directly into a query string. For example, if site comes from a form and the query is f"select * from snails where site = '{site}'", a user who enters '; drop table snails; -- can destroy the database. Parameterized queries pass the value separately from the SQL text; the database driver escapes or quotes it automatically, so it is always treated as a literal value, never as SQL syntax.

What happens to the database if your Python script raises an exception inside a with conn: block?

The with conn: context manager calls conn.rollback() when the block exits due to an exception, undoing any changes made during the block. The database returns to the state it was in before the block started, so no partial or corrupted data is written.

Why does SQLite use a single file rather than a running server process, and what does that mean for concurrent access?

SQLite is an embedded database: the library runs inside your application process and reads and writes a single file directly. There is no separate server to start, configure, or connect to over a network. The trade-off is concurrency: SQLite allows multiple simultaneous readers but only one writer at a time, and locks the file for the duration of each write transaction. For a research tool with one or a few users this is fine; for a high-traffic site it is not.

What is the difference between a primary key and a unique constraint?

Both prevent duplicate values in a column. primary key additionally declares that the column is the canonical identifier for each row; a table can have at most one primary key, and it implicitly adds not null. unique can appear on multiple columns in the same table and allows null values (though the behavior of multiple nulls varies by database).

Exercises

Add a sites table

Prompt the LLM to add a second table for survey sites with columns for name, latitude, and longitude. Then prompt it to add a foreign key from the snail measurements table that references the sites table. Look up what a foreign key constraint does and verify that the LLM's SQL enforces it correctly.

Write a query yourself

Without prompting the LLM, write a SQL query that counts the number of measurements per site. Run it in the Python REPL against a database that has data loaded. Then prompt the LLM to write the same query and compare the two versions. Note any differences in style or correctness.

Inspect the database

Use the sqlite3 command-line tool (sqlite3 path/to/database.db) to open the database. Run .tables to list tables and .schema snails to display the table definition. Describe what the schema tells you about the constraints on the data.

Handle a migration

Add a new observer: text column to the existing snail measurements table using alter table. Before you run it, prompt the LLM what value existing rows will have for the new column. Run the migration and verify the answer is correct. Discuss whether not null would have been a reasonable constraint for this new column.