Processes
- FIXME
Program vs. Process
- A program is a set of instructions for a computer
- A process is a running instance of a program
- Code plus variables in memory plus open files plus…
- If files are nouns, processes are verbs
- Tools to manage processes were invented when most users only had a single terminal
- But are still useful for working with remote/cloud machines
Viewing Processes
- Use `ps -a -l` to see currently running processes in the terminal
  - `UID`: numeric ID of the user that the process belongs to
  - `PID`: process's unique ID
  - `PPID`: ID of the process's parent (i.e., the process that created it)
  - `CMD`: the command the process is running
ps -a -l
UID PID PPID F CPU PRI NI SZ RSS TTY TIME CMD
0 13215 83470 4106 0 31 0 408655632 9504 ttys001 0:00.02 login -pfl tut /
501 13216 13215 4006 0 31 0 408795632 5424 ttys001 0:00.04 -bash
501 13569 13216 4046 0 31 0 408895008 20864 ttys001 0:00.10 python -m http.server
0 13577 13216 4106 0 31 0 408766128 1888 ttys001 0:00.01 ps -a -l
- Use `ps -a -x` to see (almost) all processes running on the computer
  - `ps -a -x | wc` tells me there are 655 processes running on my laptop right now
Exercise: Watching Processes
- What does the `top` command do? What does `top -o cpu` do?
- What does the `pgrep` command do?
Parent and Child Processes
- Every process is created by another process
- Except the first, which is started automatically when the operating system boots up
- Terminology: child process and parent process
- `echo $$` shows the process ID of the current process
  - `$$` is a shortcut for the current process's ID because it's used so often
- `echo $PPID` (parent process ID) to get the parent
  - Yes, it's inconsistent
- `pstree $$` to see the process tree
-+= 34628 tut -bash
|--= 38255 tut emacs 03_proc/index.md
|-+= 38697 tut pstree 34628
| \--- 38699 root ps -axwwo user,pid,ppid,pgid,command
\--- 38698 tut pbcopy
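The same parent/child relationship can be checked from Python. This is a minimal sketch, assuming a Unix-like system; `child_reports_parent` is a hypothetical helper name. It starts a child process that prints its own parent's ID, which should match the ID of the process that started it.

```python
import os
import subprocess
import sys


def child_reports_parent():
    """Start a child Python process; return (our PID, the PPID the child sees)."""
    # The child prints its parent's process ID, which should be our own PID.
    result = subprocess.run(
        [sys.executable, "-c", "import os; print(os.getppid())"],
        capture_output=True,
        text=True,
        check=True,
    )
    return os.getpid(), int(result.stdout.strip())


me, seen_by_child = child_reports_parent()
print(f"this process: {me}, its parent: {os.getppid()}")
print(f"child says its parent is: {seen_by_child}")
```

`os.getpid()` and `os.getppid()` play the same roles as `$$` and `$PPID` in the shell.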
Signals
- Can send a signal to a process
- "Something extraordinary happened, please deal with it immediately"
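- Each signal has a number and a name; the table shows some standard signals (numbers as on macOS; a few, such as SIGSTOP and SIGSYS, have different numbers on Linux)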
| Number | Name | Default Action | Description |
|---|---|---|---|
| 1 | SIGHUP | terminate process | terminal line hangup |
| 2 | SIGINT | terminate process | interrupt program |
| 3 | SIGQUIT | create core image | quit program |
| 4 | SIGILL | create core image | illegal instruction |
| 8 | SIGFPE | create core image | floating-point exception |
| 9 | SIGKILL | terminate process | kill program |
| 11 | SIGSEGV | create core image | segmentation violation |
| 12 | SIGSYS | create core image | non-existent system call invoked |
| 14 | SIGALRM | terminate process | real-time timer expired |
| 15 | SIGTERM | terminate process | software termination signal |
| 17 | SIGSTOP | stop process | stop (cannot be caught or ignored) |
| 24 | SIGXCPU | terminate process | CPU time limit exceeded |
| 25 | SIGXFSZ | terminate process | file size limit exceeded |
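Because some signal numbers vary between operating systems, it can be safer to look them up by name. The sketch below uses Python's standard `signal` module to report the numbers on whatever machine runs it; `signal_table` is a hypothetical helper, and the list of names is just a sample taken from the table above.

```python
import signal


def signal_table(names):
    """Map each signal name to its number on the current platform."""
    return {
        name: int(getattr(signal, name))
        for name in names
        if hasattr(signal, name)  # skip names this platform does not define
    }


table = signal_table(["SIGHUP", "SIGINT", "SIGKILL", "SIGTERM", "SIGSTOP", "SIGSYS"])
for name, number in sorted(table.items(), key=lambda item: item[1]):
    print(f"{number:>3} {name}")
```

Going the other way, `signal.Signals(number).name` turns a number back into a name.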
- Create a callback function in Python to handle the signal
- Unsurprisingly called a signal handler
import signal
import sys
COUNT = 0
def handler(sig, frame):
    global COUNT
    COUNT += 1
    print(f"interrupt {COUNT}")
    if COUNT >= 3:
        sys.exit(0)
signal.signal(signal.SIGINT, handler)
print("use Ctrl-C three times")
while True:
    signal.pause()
python src/catch_interrupt.py
use Ctrl-C three times
^Cinterrupt 1
^Cinterrupt 2
^Cinterrupt 3
- `^C` shows where the user typed Ctrl-C
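A handler can also be exercised without the keyboard by having a process send a signal to itself. This is a small sketch, assuming a Unix-like system where `SIGUSR1` is available and that the code runs in the main thread (Python only allows handlers to be installed there). The `received` list is an illustrative name.

```python
import os
import signal

received = []


def handler(sig, frame):
    # Record which signal arrived rather than printing, so it can be inspected later.
    received.append(signal.Signals(sig).name)


# SIGUSR1 is reserved for application-defined purposes on Unix-like systems.
previous = signal.signal(signal.SIGUSR1, handler)
os.kill(os.getpid(), signal.SIGUSR1)     # send the signal to ourselves
signal.signal(signal.SIGUSR1, previous)  # restore the earlier handler
print(received)
```

Despite its name, `os.kill` just sends a signal; whether the process dies depends on the signal and on how it is handled.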
Exit Status
- Every process reports an exit status when it finishes
- An integer that tells the parent process (usually the shell) what happened
- By convention: 0 means success, any non-zero value means failure
- This is the opposite of how Python treats integers as booleans
- The reason: there is exactly one way to succeed, but many ways to fail
- Non-zero values can therefore encode which error occurred
- Common non-zero codes (specific meanings vary by program):
- 1: general error
- 2: misuse of shell built-in or incorrect arguments
- 126: command found but not executable
- 127: command not found
- 128+N: terminated by signal number N (e.g., 130 = terminated by Ctrl-C, which is SIGINT=2)
- The shell stores the exit status of the last command in `$?`
$ python -c "import sys; sys.exit(0)"
$ echo $?
0
$ python -c "import sys; sys.exit(1)"
$ echo $?
1
$ python -c "import sys; sys.exit(42)"
$ echo $?
42
- In Python, call `sys.exit(code)` to set the exit status
  - `sys.exit()` with no argument is equivalent to `sys.exit(0)`
  - Raising an uncaught exception causes Python to exit with status 1
  - Passing a string to `sys.exit()` prints it to stderr and exits with status 1
import sys
if len(sys.argv) < 2:
    print("Usage: exit_example.py number", file=sys.stderr)
    sys.exit(1)
value = int(sys.argv[1])
if value < 0:
    print(f"error: {value} is negative", file=sys.stderr)
    sys.exit(2)
print(f"value is {value}")
sys.exit(0)
- Use `subprocess` to run another program and capture its exit status
import subprocess
result = subprocess.run(["ls", "/no/such/directory"])
print(f"exit status: {result.returncode}")
# exit status: 1 (or 2 on some systems)
result = subprocess.run(["ls", "/tmp"])
print(f"exit status: {result.returncode}")
# exit status: 0
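The different forms of `sys.exit` described earlier can be checked the same way. The sketch below runs short Python snippets in child processes and records each exit status along with anything written to stderr; `run_python` is a hypothetical helper name.

```python
import subprocess
import sys


def run_python(snippet):
    """Run a Python snippet in a child process; return (exit status, stderr text)."""
    result = subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stderr


status_none, _ = run_python("import sys; sys.exit()")
status_code, _ = run_python("import sys; sys.exit(3)")
status_msg, err_msg = run_python("import sys; sys.exit('something went wrong')")
print(status_none, status_code, status_msg)
print(err_msg, end="")
```

Using `sys.executable` rather than a bare `python` runs the child with the same interpreter as the parent.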
- Exit status is the basis for shell conditionals and the `&&` and `||` operators
$ ls /tmp && echo "directory exists"
…files in /tmp…
directory exists
$ ls /no/such/dir && echo "this will not print"
ls: /no/such/dir: No such file or directory
$ ls /no/such/dir || echo "ls failed, running fallback"
ls: /no/such/dir: No such file or directory
ls failed, running fallback
- `&&` runs the right side only if the left side succeeded (exit status 0)
- `||` runs the right side only if the left side failed (exit status non-zero)
- Git hooks rely on the same convention: a hook such as `pre-commit` aborts the operation by exiting with a non-zero status
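The short-circuit behavior of `&&` and `||` can be imitated in Python by checking `returncode`. This is only a sketch of the idea, not how the shell is implemented; `py_exit`, `and_then`, and `or_else` are hypothetical names made up for illustration.

```python
import subprocess
import sys


def py_exit(code):
    """Build a command that does nothing except exit with the given status."""
    return [sys.executable, "-c", f"import sys; sys.exit({code})"]


def and_then(first, second):
    """Mimic `first && second`: run second only if first exits with status 0."""
    status = subprocess.run(first).returncode
    if status == 0:
        status = subprocess.run(second).returncode
    return status


def or_else(first, second):
    """Mimic `first || second`: run second only if first exits with a non-zero status."""
    status = subprocess.run(first).returncode
    if status != 0:
        status = subprocess.run(second).returncode
    return status


print(and_then(py_exit(0), py_exit(5)))  # second command runs, so its status is reported
print(or_else(py_exit(3), py_exit(0)))   # first fails, so the fallback runs
```

As in the shell, the overall status is that of the last command that actually ran.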
Exercise: Exit Status of Python Scripts
- What exit status does Python use when a script raises an uncaught `ValueError`? What about `KeyboardInterrupt`?
- Write a shell one-liner using `&&` that runs your test suite and only prints "all tests passed" if the tests succeed.
Background Processes
- Can run a process in the background
- Only difference is that it isn't connected to the keyboard (stdin)
- Can still print to the screen (stdout and stderr)
import time

for i in range(3):
    print(f"loop {i}")
    time.sleep(1)
print("loop finished")
python src/show_timer.py &
ls site
$ src/show_timer.sh
birds.csv cert_authority.srl sandbox server.pem species.csv
cert_authority.key motto.json server.csr server_first_cert.pem yukon.db
cert_authority.pem motto.txt server.key server_first_key.pem
loop 0
$ loop 1
loop 2
loop finished
- `&` at the end of a command means "run in the background"
- So the `ls` command executes immediately
- But `show_timer.py` keeps running until it finishes
  - Or needs keyboard input
- Can also start a process and then suspend it with Ctrl-Z
  - Sends `SIGTSTP` (the catchable counterpart of `SIGSTOP`) instead of `SIGINT`
  - It's up to the receiving program to handle this correctly
- Use `jobs` to see the shell's background and suspended processes
- Then `bg %num` to resume in the background
- Or `fg %num` to foreground the process to resume its execution
$ python src/show_timer.py
loop 0
^Z
[1]+ Stopped python src/show_timer.py
$ jobs
[1]+ Stopped python src/show_timer.py
$ bg
[1]+ python src/show_timer.py &
loop 1
$ loop 2
loop finished
[1]+ Done python src/show_timer.py
- Note that input and output are mixed together
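A Python program can run a child in the background in a similar way with `subprocess.Popen`, which returns as soon as the child has started. The sketch below is an assumption-laden miniature of `show_timer.py`: the child program is embedded as a string (`CHILD`, a hypothetical name) and its delays are shortened so the example finishes quickly.

```python
import subprocess
import sys

# Child program: prints three lines, pausing briefly between them.
CHILD = (
    "import time\n"
    "for i in range(3):\n"
    "    print(f'loop {i}', flush=True)\n"
    "    time.sleep(0.1)\n"
)

proc = subprocess.Popen([sys.executable, "-c", CHILD], stdout=subprocess.PIPE, text=True)
# Popen does not wait, so the parent can keep working while the child runs.
still_running = proc.poll() is None  # poll() returns None until the child has exited
output, _ = proc.communicate()       # now wait for the child and collect its output
print(f"was still running: {still_running}, exit status: {proc.returncode}")
print(output, end="")
```

Capturing the child's output through a pipe, as here, is one way to avoid the mixed-together output seen in the terminal session.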
Why bother with backgrounding and foregrounding programs instead of opening another window?
- Opening another window is often the better solution, particularly if your fingers know the keyboard shortcuts to cycle between windows.
- But if you're working on a remote computer, it might be simpler to run several programs simultaneously in the same terminal window.
- And talking about this is an excuse to introduce some more ideas about processes.
Killing Processes
- Use `kill` to send a signal to a process
$ python src/show_timer.py
loop 0
^Z
[1]+ Stopped python src/show_timer.py
$ kill %1
[1]+ Terminated: 15 python src/show_timer.py
- By default, `kill` sends `SIGTERM` (terminate process)
- Variations:
  - Give a process ID: `kill 1234`
  - Send a different signal: `kill -s INT %1`
$ python src/show_timer.py
loop 0
^Z
[1]+ Stopped python src/show_timer.py
$ kill -s INT %1
[1]+ Stopped python src/show_timer.py
$ fg
python src/show_timer.py
Traceback (most recent call last):
  File "/tut/sys/src/show_timer.py", line 5, in <module>
    time.sleep(1)
KeyboardInterrupt
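The Python counterpart of `kill` is `Popen.send_signal` (or `os.kill` with a process ID). This minimal sketch, assuming a Unix-like system, starts a child that would otherwise sleep for a minute and ends it with `SIGTERM`.

```python
import signal
import subprocess
import sys

# Child that would sleep for a minute if left alone.
proc = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
proc.send_signal(signal.SIGTERM)  # the same signal `kill` sends by default
proc.wait(timeout=10)             # collect the child's exit status
print(proc.returncode)
```

`subprocess` reports death by signal N as a return code of -N, whereas the shell reports it as 128+N.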
Fork
- Fork creates a duplicate of a process
- Creator (parent) gets process ID of child as return value
- Child gets 0 as return value (but has something else as its process ID)
import os
print(f"starting {os.getpid()}")
pid = os.fork()
if pid == 0:
    print(f"child got {pid} is {os.getpid()}")
else:
    print(f"parent got {pid} is {os.getpid()}")
starting 41618
parent got 41619 is 41618
child got 0 is 41619
- Output shown above comes from running the program interactively
- When run as `python fork.py > temp.out`, the "starting" line may be duplicated
  - Programs don't write directly to the screen
  - Instead, they hand text to an I/O library that buffers it in the process's memory before passing it to the operating system
  - So the "starting" message may still be sitting in that buffer when `fork` happens
  - In which case the child inherits a copy of the buffer, and both parent and child eventually send it to the operating system
- The I/O library and the operating system decide how much to buffer and when to actually display it
  - Output to a terminal is usually flushed at the end of each line, while output to a file is held until the buffer fills or the program exits
  - The operating system's decisions can be affected by what else it is doing
  - So running the same program several times can produce different outputs
  - Because your program is only part of a larger sequence of operations
- Dealing with issues like these is part of what distinguishes systems programming from "regular" programming
Flushing I/O
- Can force buffered output to be written right now by flushing the buffer
import os
import sys
print(f"starting {os.getpid()}")
sys.stdout.flush()
pid = os.fork()
if pid == 0:
    print(f"child got {pid} is {os.getpid()}")
else:
    print(f"parent got {pid} is {os.getpid()}")
starting 41536
parent got 41537 is 41536
child got 0 is 41537
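The effect of flushing can be measured rather than observed by eye. The sketch below, assuming a Unix-like system, runs a small forking script twice with its output captured through a pipe (which is buffered like a file) and counts how often "starting" appears. `SCRIPT` and `count_starting` are hypothetical names, and `PYTHONUNBUFFERED` is removed from the child's environment because it would disable the buffering being demonstrated.

```python
import os
import subprocess
import sys

# Illustrative child script: prints, optionally flushes, then forks.
SCRIPT = """
import os, sys
print("starting")
if sys.argv[1] == "flush":
    sys.stdout.flush()
pid = os.fork()
if pid != 0:
    os.waitpid(pid, 0)  # parent waits for the child to finish
"""


def count_starting(mode):
    """Run SCRIPT with stdout sent to a pipe; count occurrences of 'starting'."""
    env = {k: v for k, v in os.environ.items() if k != "PYTHONUNBUFFERED"}
    result = subprocess.run(
        [sys.executable, "-c", SCRIPT, mode],
        capture_output=True,
        text=True,
        check=True,
        env=env,
    )
    return result.stdout.count("starting")


print(count_starting("noflush"), count_starting("flush"))
```

Without the flush, parent and child each write their own copy of the buffered line when they exit; with it, the line is written once before the fork.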
Exec
- The `exec` family of functions in `os` executes a new program inside the calling process
  - Replace existing program and start a new one
  - One of the reasons we need to distinguish "process" from "program"
- Use `fork`/`exec` to create a new process and then run a program in it
import os
import sys
print(f"starting {os.getpid()}")
sys.stdout.flush()
pid = os.fork()
if pid == 0:
    os.execl("/bin/echo", "echo", f"child echoing {pid} from {os.getpid()}")
else:
    print(f"parent got {pid} is {os.getpid()}")
starting 46713
parent got 46714 is 46713
child echoing 0 from 46714
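A parent that uses `fork`/`exec` usually also wants to know how the child finished. This sketch, assuming a Unix-like system, adds `os.waitpid` to the pattern above so the parent can collect the child's exit status. `run_and_wait` is a hypothetical helper, and returning 127 when `exec` fails is a choice borrowed from the shell's "command not found" status rather than anything the `os` module requires.

```python
import os
import sys


def run_and_wait(args):
    """Fork, exec `args` in the child, and return the child's exit status."""
    sys.stdout.flush()  # avoid duplicating buffered output in the child
    pid = os.fork()
    if pid == 0:
        try:
            os.execv(args[0], args)  # replaces the child's program; returns only on failure
        except OSError:
            os._exit(127)  # exec failed: leave the child at once without cleanup
    _, status = os.waitpid(pid, 0)  # block until the child finishes
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # killed by a signal: negative, as subprocess reports it


code = run_and_wait([sys.executable, "-c", "import sys; sys.exit(7)"])
print(f"child exit status: {code}")
```

This is roughly what `subprocess.run` does on a Unix-like system, which is why everyday code rarely needs to call `fork` and `exec` directly.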
Exercise: Different Kinds of exec
- What are the differences between `os.execl`, `os.execlp`, and `os.execv`? When and why would you use each?