Missing Semester Lecture 5 - Command-line Environment
MIT, The Missing Semester of Your CS Education, Lecture 5 - Command-line Environment
Job Control
In some cases you will need to interrupt a job while it is executing, for instance if a command is taking too long to complete (such as a find over a very large directory structure). Most of the time, you can press Ctrl-C and the command will stop. But how does this actually work, and why does it sometimes fail to stop the process?
Killing a process
Your shell is using a UNIX communication mechanism called a signal to communicate information to the process. When a process receives a signal it stops its execution, deals with the signal and potentially changes the flow of execution based on the information that the signal delivered. For this reason, signals are software interrupts.
In our case, typing Ctrl-C prompts the shell to deliver a SIGINT signal to the process.
Here’s a minimal example of a Python program that captures SIGINT and ignores it, so it no longer stops. To kill this program we can now use the SIGQUIT signal instead, by typing Ctrl-\.
[Listing sigint.py: contents missing from this copy]
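The same idea can be sketched in bash. This is a rough shell analogue under stated assumptions, not the Python program the text refers to: the handler name on_sigint and the counter caught are made up for illustration, and the script signals itself so that it finishes without anyone pressing Ctrl-C.

```shell
# Hypothetical shell analogue: catch SIGINT and keep running.
caught=0

on_sigint() {
    caught=$((caught + 1))
    printf '\nI got a SIGINT, but I am not stopping\n'
}

# Install the handler. SIGQUIT is left alone, so Ctrl-\ would still work
# for a script that replaces the lines below with a long-running loop.
trap on_sigint INT

# Send SIGINT to this shell twice instead of waiting for Ctrl-C.
# ($$ is the pid of the shell running the script.)
kill -INT "$$"
kill -INT "$$"

echo "still running after $caught SIGINT(s)"
```

Because the handler replaces the default action, the script reaches the final echo instead of stopping at the first signal.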
Here’s what happens if we send SIGINT twice to this program, followed by SIGQUIT. Note that ^ is how Ctrl is displayed when typed in the terminal.
    $ python sigint.py
While SIGINT and SIGQUIT are both usually associated with terminal-related requests, a more generic signal for asking a process to exit gracefully is the SIGTERM signal. To send this signal we can use the kill command, with the syntax kill -TERM <PID>.
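What "exit gracefully" can look like is sketched below in bash. The worker function and the scratch file are hypothetical; the point is only that a process may catch SIGTERM, clean up, and then exit on its own terms.

```shell
# Hypothetical worker that cleans up a scratch file when asked to terminate.
scratch=$(mktemp)

worker() {
    # On SIGTERM: remove the scratch file, then exit with status 0.
    trap 'rm -f "$scratch"; exit 0' TERM
    while true; do
        sleep 0.1
    done
}

worker &            # run the worker in the background
pid=$!              # pid of the background worker

sleep 0.5           # give the worker time to install its trap
kill -TERM "$pid"   # politely ask it to exit
wait "$pid"         # collect its exit status
status=$?

echo "worker exited with status $status"
```

Without the trap, SIGTERM would still stop the worker, but the scratch file would be left behind.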
Pausing and backgrounding processes
Signals can do other things beyond killing a process. For instance, SIGSTOP pauses a process. In the terminal, typing Ctrl-Z will prompt the shell to send a SIGTSTP signal, short for Terminal Stop (i.e. the terminal’s version of SIGSTOP).
We can then continue the paused job in the foreground or in the background using fg or bg, respectively.
The jobs command lists the unfinished jobs associated with the current terminal session. You can refer to those jobs using their pid (you can use pgrep to find that out). More intuitively, you can also refer to a process using the percent symbol followed by its job number (displayed by jobs). To refer to the last backgrounded job you can use the $! special parameter, which expands to its pid.
One more thing to know is that the & suffix in a command will run the command in the background, giving you the prompt back, although it will still write to the shell’s STDOUT, which can be annoying (use shell redirections in that case).
To background an already running program you can press Ctrl-Z followed by bg. Note that backgrounded processes are still child processes of your terminal and will die if you close the terminal (this will send yet another signal, SIGHUP). To prevent that from happening you can run the program with nohup (a wrapper to ignore SIGHUP), or use disown if the process has already been started. Alternatively, you can use a terminal multiplexer as we will see in the next section.
Below is a sample session to showcase some of these concepts.
    $ sleep 1000
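The interactive pieces (Ctrl-Z, fg, bg) need a terminal, but the signals underneath them can be tried from a script. The following bash sketch is an illustration rather than a transcript of the session above, and the variable names are made up for the example.

```shell
sleep 1000 &          # the & suffix starts the command in the background
pid=$!                # $! expands to the pid of the last backgrounded job

kill -STOP "$pid"     # pause it; unlike SIGTSTP, SIGSTOP cannot be caught
kill -CONT "$pid"     # resume it; bg and fg send this signal to a stopped job

# kill -0 sends no signal; it only checks that the process exists.
if kill -0 "$pid" 2>/dev/null; then
    alive_before=yes
else
    alive_before=no
fi

kill -TERM "$pid"     # ask the job to exit
wait "$pid"           # reap it and collect its status
status=$?             # by shell convention, 128 + the signal number

echo "alive before SIGTERM: $alive_before, wait status: $status"
```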
A special signal is SIGKILL, since it cannot be captured by the process and will always terminate it immediately. However, it can have bad side effects, such as leaving orphaned child processes.
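The difference between a signal that can be ignored and SIGKILL can be seen in a short bash sketch. The stubborn function is hypothetical; it ignores SIGTERM, so only SIGKILL ends it.

```shell
# Hypothetical process that refuses to exit on SIGTERM.
stubborn() {
    trap '' TERM      # an empty handler string means "ignore this signal"
    while true; do
        sleep 0.1
    done
}

stubborn &
pid=$!
sleep 0.3             # give it time to install the trap

kill -TERM "$pid"     # ignored by the process
sleep 0.3
if kill -0 "$pid" 2>/dev/null; then
    survived_term=yes
else
    survived_term=no
fi

kill -KILL "$pid"     # cannot be caught or ignored
wait "$pid"
status=$?             # by shell convention, 128 + the signal number

echo "survived SIGTERM: $survived_term, wait status: $status"
```

Note that the sleep child running at the moment of the SIGKILL is briefly orphaned, which is a small instance of the side effect mentioned above.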
Aliases
It can become tiresome typing long commands that involve many flags or verbose options. For this reason, most shells support aliasing. A shell alias is a short form for another command that your shell will replace automatically for you. For instance, an alias in bash has the following structure:
    alias alias_name="command_to_alias arg1 arg2"
Note that there are no spaces around the equal sign =, because alias is a shell command that takes a single argument.
Aliases have many convenient features:
    # Make shorthands for common flags
Note that aliases do not persist across shell sessions by default. To make an alias persistent you need to include it in shell startup files, like .bashrc or .zshrc.
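As an illustration, lines like the following could go into a startup file such as .bashrc. The alias names here (ll, la, gs) are examples chosen for this sketch, not a recommended set. The sketch only defines and inspects aliases, since bash expands aliases in interactive shells by default and not in scripts.

```shell
# Shorthands for common flags
alias ll="ls -lh"
alias la="ls -A"

# Save typing on a frequently used command
alias gs="git status"

# Print the definition of a single alias
alias ll

# Remove an alias from the current session
unalias gs
```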
Remote Machines
It has become more and more common for programmers to use remote servers in their everyday work. If you need to use remote servers in order to deploy backend software or you need a server with higher computational capabilities, you will end up using a Secure Shell (SSH). As with most tools covered, SSH is highly configurable so it is worth learning about it.
To ssh into a server you execute a command as follows:
    ssh foo@bar.mit.edu
Here we are trying to ssh as user foo into server bar.mit.edu. The server can be specified with a hostname (like bar.mit.edu) or an IP address (something like foobar@192.168.1.42). Later we will see that if we modify the ssh config file, we can connect using just something like ssh bar.
Executing commands
An often overlooked feature of ssh is the ability to run commands directly. ssh foobar@server ls will execute ls in the home folder of foobar. It works with pipes, so ssh foobar@server ls | grep PATTERN will grep locally the remote output of ls, and ls | ssh foobar@server grep PATTERN will grep remotely the local output of ls.
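The two pipelines can be wrapped in small bash functions to make clear which side of the pipe runs where. The function names are made up for this sketch, and foobar@server is the placeholder host from the text, so nothing here connects anywhere until you call a function with a real host configured.

```shell
# Placeholder host taken from the text; replace with a real user@host.
remote=foobar@server

# ls runs on the remote machine; grep runs locally on its output.
remote_ls_local_grep() {
    ssh "$remote" ls | grep "$1"
}

# ls runs locally; grep runs on the remote machine and reads the piped input.
local_ls_remote_grep() {
    ls | ssh "$remote" grep "$1"
}
```

One caveat with the second form: the pattern passes through the remote shell as well, so patterns containing spaces or shell metacharacters would need extra quoting.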
Port Forwarding
In many scenarios you will run into software that listens on specific ports on the machine. When this happens on your local machine you can type localhost:PORT or 127.0.0.1:PORT, but what do you do with a remote server that does not have its ports directly available through the network/internet?
The technique for this is called port forwarding and it comes in two flavors: Local Port Forwarding and Remote Port Forwarding.
[Figure: Local Port Forwarding]
[Figure: Remote Port Forwarding]
The most common scenario is local port forwarding, where a service on the remote machine listens on a port and you want a port on your local machine to forward to that remote port. For example, suppose we run jupyter notebook on the remote server and it listens on port 8888. To forward that to local port 9999, we would run ssh -L 9999:localhost:8888 foobar@remote_server and then navigate to localhost:9999 on our local machine.
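The same command can be parameterized in a small bash helper. The function name forward_port is hypothetical, and adding -N (an ssh option that skips running a remote command, so the session only forwards ports) is a choice made for this sketch rather than something the text prescribes.

```shell
# usage: forward_port LOCAL_PORT REMOTE_PORT USER@HOST
forward_port() {
    # -L LOCAL:localhost:REMOTE forwards the local port to the port on the
    # remote machine; "localhost" is resolved on the remote side.
    # -N: do not run a remote command, only forward ports.
    ssh -N -L "$1:localhost:$2" "$3"
}

# Example from the text (left commented out so nothing connects by default):
# forward_port 9999 8888 foobar@remote_server
```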
Exercises
Job control
- From what we have seen, we can use some ps aux | grep commands to get our jobs’ pids and then kill them, but there are better ways to do it. Start a sleep 10000 job in a terminal, background it with Ctrl-Z and continue its execution with bg. Now use pgrep to find its pid and pkill to kill it without ever typing the pid itself. (Hint: use the -af flags).
    sleep 10000
- Say you don’t want to start a process until another completes. How would you go about it? In this exercise, our limiting process will always be sleep 60 &. One way to achieve this is to use the wait command. Try launching the sleep command and having an ls wait until the background process finishes.
    sleep 60 &
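One way to finish this exercise is sketched below in bash. The duration is shortened from 60 to 2 seconds purely so the sketch completes quickly; the variable names are illustrative.

```shell
sleep 2 &        # stand-in for the exercise's "sleep 60 &"
pid=$!           # pid of the background sleep

wait "$pid"      # block until that child process exits
status=$?        # exit status of the sleep

ls               # runs only after the background process has finished
```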
However, this strategy will fail if we start in a different bash session, since wait only works for child processes. One feature we did not discuss in the notes is that the kill command’s exit status will be zero on success and nonzero otherwise. kill -0 does not send a signal but will give a nonzero exit status if the process does not exist. Write a bash function called pidwait that takes a pid and waits until the given process completes. You should use sleep to avoid wasting CPU unnecessarily.
    pidwait () {
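The listing above is cut off, so the following is one possible pidwait written from the exercise description, not necessarily the original solution. The one-second polling interval is an arbitrary choice.

```shell
# usage: pidwait PID
pidwait() {
    # kill -0 sends no signal; it succeeds only while the process exists
    # (and we are allowed to signal it).
    while kill -0 "$1" 2>/dev/null; do
        sleep 1      # poll once per second instead of spinning on the CPU
    done
}
```

A typical use would be pidwait 12345; ls, where 12345 stands for the pid of a process started in another session. Note that kill -0 also fails when the process exists but belongs to another user, in which case this sketch returns immediately.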
Aliases
- Create an alias dc that resolves to cd for when you type it wrongly.
    alias dc="cd"
- Run history | awk '{$1="";print substr($0,2)}' | sort | uniq -c | sort -n | tail -n 10 to get your top 10 most used commands and consider writing shorter aliases for them. Note: this works for Bash; if you’re using ZSH, use history 1 instead of just history.
    alias recent="history | awk '{\$1=\"\";print substr(\$0,2)}' | sort | uniq -c | sort -n | tail -n 10"