Introduction to BASH scripting 🙋

A script is a file of commands or instructions that execute as though you typed them on the command line.

BASH scripts are good at:

  • Running other programs
  • Manipulating files
  • Building pipelines
  • Reproducing work

BASH scripts are not good at:

  • floating point arithmetic
  • data analysis
  • serious computation
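
To see why floating point arithmetic is on this list, here is a minimal sketch. It assumes the awk program is installed (it is on most UNIX/linux systems); the variable names half and exact are made up for illustration.

```bash
# Bash arithmetic expansion $(( )) works on integers only,
# so the fractional part of a division is thrown away.
half=$(( 7 / 2 ))
echo "7 / 2 in bash is $half"

# For a decimal answer, hand the work to another program, such as awk.
exact=$(awk 'BEGIN { printf "%.1f", 7 / 2 }')
echo "7 / 2 in awk is $exact"
```

The first line reports 3 and the second 3.5. This is the usual pattern: bash runs the other program and collects the result rather than doing the computation itself.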

Outline of scripting

  • Simple programs. Execution.
  • Variables (and constants).
  • Using arguments
  • Flow control: conditionals, if statements
  • Dealing with errors, input
  • Flow control: more advanced conditionals, loops
  • Scripted file manipulation
  • More loops and case-statement
  • Sed and grep

Before we begin

Can everybody log in to summit?

For me it looks like

$ ssh -l dcking@colostate.edu login.rc.colorado.edu
Password:
... ACCEPTABLE USE ...
... COMPILING CODE ...
...
[login11 ~]$

Then, immediately ssh to scompile:

[login11 ~]$ ssh scompile
...
[shas0137 ~]$

Also, we will be writing scripts in a text editor. We need to be able to edit, save, and, if editing locally rather than directly on summit, upload the working file.

Text Editor/ File Transfer

nano - Text editor for Linux

If you are comfortable with nano, you can use it to directly edit scripts without having to do any file transferring.

Cyberduck - File transfer for Mac and Windows

Download the installer from here. On my Mac, the setup screen is like this.

It should automatically transfer changes to your file after you save it.

TextEdit - Text editor for Mac

Make sure you switch to Plain Text under the Format menu.

Notepad++ - Text editor for Windows

Notepad++ instructions.

Interactive exercise

1. Log on to summit from the command line

1b. (If you're not using nano) in a separate window, log onto summit with your file transfer program.

2. On summit, create a new directory

  • Create a new directory called dev.
  • cd into dev.
  • Run the command touch on helloWorld.sh (this creates an empty file with that name).
$ mkdir dev
$ cd dev
$ touch helloWorld.sh

3. Edit helloWorld.sh in your editor.

If using nano, just do
$ nano helloWorld.sh
If using Cyberduck,

click Refresh. The new file is there. Select it, but don't double click.

Go to File→Edit with → TextEdit. This will open the file in TextEdit.

Once you are editing your file, add this line:

echo "Hello World"

and save it. In nano, type CTRL-X, then Y to confirm the save, then Enter to accept the file name. In Cyberduck, it will give you a notification that it saved and resent the file. If it has been too long, you have to revalidate using DUO.

Try echo on the command line.

echo is a utility to print to the screen.

$ echo "hi"
hi

Run the script!

Now in the command line, do

$ bash helloWorld.sh
Hello World

If you did not see that output, raise your hand.

Otherwise…

Congratulations! You've just created and run your first script!!


Running programs (the right way)

So far we have been feeding the script to bash as an argument. The script is not itself “executable” until we grant it execute permission with chmod (change mode).

$ chmod a+x helloWorld.sh

This grants everyone (a = all) the execute permission (x = execute). You can see that the permissions have changed by using:

$ ls -l helloWorld.sh
-rwxr-xr-x  1 dcking@colostate.edu dckingpgrp@colostate.edu   19 Aug 31 12:46 helloWorld.sh

                    File Mode           Number of links  Owner      Group       Size   Last Modified  File Name
Description         permissions string  # hard links     owner      group       bytes  date           name
Executable example  -rwxr-xr-x          1                dcking     dckingpgrp  19     Aug 31 12:46   helloWorld.sh
Read-only example   -rw-r--r--          1                fileowner  filegroup   62     Jul 16 12:32   file.txt

The permissions string is -rwxr-xr-x. The presence of the x's shows it is executable.
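
You can also ask bash directly whether a file is executable. The following is a sketch only: it works in a throwaway directory made by mktemp, and demo.sh is a made-up file name, so your own helloWorld.sh is not touched.

```bash
# Work in a throwaway directory so nothing of yours is changed.
workdir=$(mktemp -d)
script="$workdir/demo.sh"
echo 'echo "Hello World"' > "$script"

# [ -x FILE ] is true when you have execute permission on FILE.
# A freshly created file normally does not have it.
if [ -x "$script" ]; then before="executable"; else before="not executable"; fi

chmod a+x "$script"

if [ -x "$script" ]; then after="executable"; else after="not executable"; fi

echo "before chmod: $before"
echo "after chmod:  $after"
ls -l "$script"
```

The ls -l line at the end shows the same permissions string as above, now with the x's present.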

But you still can't do this:

$ helloWorld.sh
-bash: helloWorld.sh: command not found
$ which helloWorld.sh
/usr/bin/which: no helloWorld.sh found in....

Why not? It's not in our PATH.

Until we put it there, we have to do one of the following:

$ bash helloWorld.sh
Hello World
$ ./helloWorld.sh
Hello World

which

To see if a command is actually a program in your path, do

$ which programname
$ which ls
/usr/bin/ls

Try it with some other commands.

You'll see that they tend to be in only a handful of directories. When programs are installed on a computer, they are automatically put in a central location. These directories serve that purpose on UNIX/linux.

$PATH is an environment variable, like $USER, that is set during login. Some important environment variables are:

Environment variable name  Purpose
HOME                       Your home directory
HOSTNAME                   Name of your computer
PATH                       Colon-delimited list of directories that hold executable programs
PS1                        Pattern to use as your prompt
PWD                        Your current directory (as returned by pwd)
USER                       Who you are logged in as (same as returned by whoami)

These variables can be used throughout your login session for various purposes. Sometimes it's appropriate to change them.
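
As a quick way to look at several of the variables listed above, here is a small sketch. The function name show_var is made up for illustration, and it relies on a bash-specific feature (indirect expansion), so it will not work in other shells.

```bash
# show_var NAME prints "NAME = value" for the variable called NAME.
show_var() {
    # ${!1} is bash "indirect expansion": the value of the variable
    # whose name was given as the first argument.
    echo "$1 = ${!1}"
}

for name in HOME PWD USER PATH; do
    show_var "$name"
done
```

For a single variable, echo $HOME does the same job with less typing.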

The PATH variable

Most “commands” on UNIX/linux are actually programs. For example, ls is not really a command, but a program that gets run. There are two ways to run a program:

  • Give a valid path: directory/programname
  • Give just the program name, where the directory is in a list stored by the variable $PATH

The second method makes it easier to run programs; otherwise you'd have to type the full path of ls every time you needed to run it. Requiring an explicit path for everything else is so you don't accidentally run malicious or unintended code.
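
Because PATH is one long string, it can be hard to read. The sketch below splits it on the colons and prints one directory per line, in the order the shell searches them. The function name list_path_dirs is made up for illustration.

```bash
# Print each directory in PATH on its own line.
list_path_dirs() {
    local dirs dir
    # Split the PATH string on ":" into the array "dirs".
    IFS=':' read -r -a dirs <<< "$PATH"
    for dir in "${dirs[@]}"; do
        echo "$dir"
    done
}

list_path_dirs
```

When you type a bare program name, the shell looks through these directories from top to bottom and runs the first match it finds.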

Amending the PATH variable

For security reasons, only trusted directories should be in the path, and those directories should be designated for holding programs only. Since we want to create executable scripts, we are going to set up one such directory for our own use.

1. Create the directory

$ cd
$ mkdir bin

The directory name bin, short for “binary”, is the conventional name for a directory with executable programs, even if they are scripts and not binary.

2. Add it to our PATH

$ PATH=$PATH:$HOME/bin

Remember, the PATH variable has to contain full directory names. So we use the $HOME variable to identify where our bin directory is.

3. Put your executable program there

We just made one. We're going to put a copy of it there, instead of moving it.

Go back to where you wrote helloWorld.sh. Was it your home directory? We created it in dev, one level below.

$ cd
$ cd dev
$ cp helloWorld.sh ../bin

4. Run it

Now, instead of getting “command not found”, your program will run.

$ helloWorld.sh
Hello World

What happened is that the shell searched the directories listed in PATH, found helloWorld.sh in $HOME/bin, and filled in the directory part of the program location for you.

$ which helloWorld.sh
/home/dcking@colostate.edu/bin/helloWorld.sh

Best practice - executable scripts

There is one final thing you have to do to make this “correct”. Since this is a BASH script, it is running through the bash executable.

$ which bash
/usr/bin/bash

That's right, bash is a binary program, just like ls. To make our script properly executable, we use the #! convention (known as she-bang).

On the first line of your script, write:

#!/usr/bin/bash

This tells the system to run the script using /usr/bin/bash. An even better practice (Best-er?) is to use the program env, giving it bash as an argument.

#!/usr/bin/env bash

This will make your script more portable, because not every computer puts bash in the same place.
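
To get a feel for what the env lookup does, here is a small sketch. It assumes bash is somewhere on your PATH. The variable name bash_location is made up, and command -v is used here only as a stand-in for the PATH search that env performs.

```bash
#!/usr/bin/env bash
# Find the first bash on the PATH, which is the one env would start.
bash_location=$(command -v bash)
echo "env would start: $bash_location"

# $BASH holds the file name used to start the bash that is running right now.
echo "this script is running under: $BASH"
```

On summit both lines should agree with the output of which bash shown earlier; on another computer the location may differ, which is exactly why the env form is more portable.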

Persistent changes to your environment

The change to PATH made above lasts only until you log out. To make it stick, we have to add it to a configuration file, .bash_profile, in your home directory. Add the command, as you typed it, to the bottom of .bash_profile.

...
PATH=$PATH:$HOME/bin

This will be executed every time you log in.
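
One side effect is that each time the file is run again, another copy of the directory is appended to PATH. A possible guard, sketched below, only appends a directory that is not already listed. The function name add_to_path is made up for illustration; it is not a standard command.

```bash
# add_to_path DIR appends DIR to PATH unless it is already there.
add_to_path() {
    # Wrapping both sides in colons lets us match whole entries only.
    case ":$PATH:" in
        *":$1:"*) ;;                # already listed: do nothing
        *) PATH="$PATH:$1" ;;       # not listed yet: add to the end
    esac
}

add_to_path "$HOME/bin"
```

Running add_to_path "$HOME/bin" a second time leaves PATH unchanged. The plain PATH=$PATH:$HOME/bin line works too; the guard just keeps PATH tidy.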

source

In general, to execute commands from a file, such as .bash_profile, in your current shell, do

$ source .bash_profile

This reruns (re-sources) your configuration file, and prints out messages as though you have logged in again. You can't run a script to affect your current environment, because a script runs in its own separate process. You have to source it.
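
The difference between running and sourcing can be seen in a small experiment. This is a sketch: settings.sh and the variable GREETING are made-up names, and the file is written to a throwaway directory.

```bash
workdir=$(mktemp -d)
echo 'GREETING="set by the file"' > "$workdir/settings.sh"

GREETING="original"

# Running the file starts a separate bash process;
# its variables vanish when that process ends.
bash "$workdir/settings.sh"
after_run=$GREETING
echo "after running:  $after_run"

# Sourcing reads the commands into the current shell, so the change sticks.
source "$workdir/settings.sh"
after_source=$GREETING
echo "after sourcing: $after_source"
```

After running, GREETING is still original. After sourcing, it has the value from the file.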

EXTRA

But what is a script? How does it differ from a command? or a program?

What is a command?

BASH scripts can act as (or in place of) commands. In fact, many of the “commands” you use are already scripts, binary programs, or aliases. These are to be contrasted with shell built-in commands.

  • cd - builtin command
  • ls - executable binary
  • zless - executable script
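
Bash has a built-in command of its own, type, that reports which of these categories a name falls into. A minimal sketch:

```bash
# type -t prints one word describing how bash would treat a name:
# alias, keyword, function, builtin, or file.
for name in cd ls echo; do
    echo "$name: $(type -t "$name")"
done
```

cd is reported as builtin. What you see for ls depends on your setup: in a login session where ls is aliased it is reported as alias, otherwise as file. Note that type does not distinguish between a binary program and a script; both are file.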

★ Exercise: inspect shell commands

We are going to use some utilities: which, file and alias to inspect the source of a given command.

This is what mine looks like on summit:

$ which ls
alias ls='ls --color=auto'
  /usr/bin/ls

This tells me that ls is an alias to ls --color=auto, which is actually run when I type ls. The second line, /usr/bin/ls, is the executable program that resides in the system bin directory.

You can see all of your aliases with the command alias without arguments, or inspect an individual command.

$ alias
$ alias ls

Let's go back to /usr/bin/ls, and look at what else is in the system bin directory.

$ ls /usr/bin

Look at all those programs!!! You probably recognize some of these as “commands”. Some are binary programs:

$ file /usr/bin/ls
/usr/bin/ls: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=6129e7403942b90574b8c28439d128ff5515efeb, stripped
$ file /usr/bin/rm
/usr/bin/rm: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 2.6.32, BuildID[sha1]=8919cbb6bc106ef6b94dea3598c8b6978c9586bd, stripped

This is an unusually long description for an executable binary format. See https://en.wikipedia.org/wiki/Executable_and_Linkable_Format if you are interested in the format of binary compiled programs.

Some commands are actually shell scripts:

$ file /usr/bin/cd
/usr/bin/cd: POSIX shell script, ASCII text executable
$ file /usr/bin/zless
/usr/bin/zless: POSIX shell script, ASCII text executable

The command cd is a shell built-in command. On Summit, the “command” cd is wrapped inside a script… Let's take a look at it.

$ cat /usr/bin/cd
#!/bin/sh
builtin cd "$@"

It's a script that passes arguments (with $@) to the built-in shell command. You can explicitly run built-in commands with the keyword builtin.

$ builtin cd
$ builtin ls
-bash: builtin: ls: not a shell builtin

We already knew that ls was a program, not a builtin.
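
The cd wrapper above writes "$@" inside double quotes, and the quotes matter. The sketch below shows the difference; count_args, pass_quoted and pass_unquoted are made-up function names for illustration.

```bash
# count_args reports how many arguments it received.
count_args() {
    echo "$#"
}

# "$@" hands every argument on unchanged, even ones containing spaces.
pass_quoted() {
    count_args "$@"
}

# Unquoted $@ gets split again wherever there is a space.
pass_unquoted() {
    count_args $@
}

pass_quoted "my file.txt" notes.txt      # prints 2
pass_unquoted "my file.txt" notes.txt    # prints 3
```

With the quotes, "my file.txt" arrives as one argument. Without them it is split into my and file.txt, which is almost never what you want.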

The “command” zless is actually a script that runs like less on a gzipped file. You can read this script, but not edit it:

$ less /usr/bin/zless

This is some daunting code for a command that just pipes gzip -c into less. This is a professional-level script that demonstrates some good aspects:

  • A lot of documentation ('#' comments),
  • a thorough usage statement
  • multiple checks on the input and the environment.

We'll use some of these approaches in this course, but at a simpler level.
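
As a taste of that simpler level, here is a sketch with the same three ingredients: comments, a usage statement, and a check on the input. It is not taken from zless; greet.sh and its functions are hypothetical names.

```bash
#!/usr/bin/env bash
# greet.sh (hypothetical): print a greeting for the name given as an argument.

# Usage statement: tells the user how to call the script.
usage() {
    echo "Usage: greet.sh NAME" >&2
}

greet() {
    # Check the input before doing any work: exactly one argument is required.
    if [ "$#" -ne 1 ]; then
        usage
        return 1
    fi
    echo "Hello $1"
}

greet World
greet || echo "(no name given, so greet refused to run)"
```

The first call prints Hello World. The second prints the usage message and fails, instead of carrying on with missing input.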

Finally: the PATH, Trojan Horses, and the "./"

The PATH variable indicates where to look for trusted programs. We've visited one such directory already in /usr/bin. You can see your list of trusted directories:

$ echo $PATH

The current directory should never be in the PATH because it is not a trusted directory.

The point of a Trojan Horse is to masquerade as something trusted but to perform something malicious. There are many forms in modern computing, but one of the earliest comes from the command line interface and how programs are run on UNIX/linux.

The idea is to give your malicious program the same name as a command, such as ls or ssh.

Say an attacker gets you to cd into a directory they have access to, an action that is habitually followed by ls. If you are set up to run programs in your current directory, then the attacker could put a program there called ls, which acts as the real one, but also puts another Trojan Horse in your home directory, this time calling it ssh.

The next time you ran ssh while in your home directory, you'd actually be running the Trojan Horse. It would capture the login credentials and then pass control off to the real ssh. The captured information is stored and delivered to the attacker somehow. The attacker can now log in to that account as you.

Therefore, we never set up our account to run programs in our current directory.

  • You must run them explicitly by giving a valid path: ./program
  • You can modify your PATH, but never add the current directory, and only add trusted directories to the end
  • If you make a replacement for a common command, give it a different name e.g. myls
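
To check whether your own account follows these rules, you can test PATH for the current directory. The sketch below uses a made-up function name, path_has_current_dir. It looks for an entry of "." and also for an empty entry (a leading, trailing, or doubled colon), which the shell treats as the current directory too.

```bash
# path_has_current_dir STRING succeeds if the PATH-style STRING
# includes the current directory.
path_has_current_dir() {
    # Wrapping the string in colons turns a leading or trailing
    # colon into "::", so one pattern catches every empty entry.
    case ":$1:" in
        *":.:"*|*"::"*) return 0 ;;
        *) return 1 ;;
    esac
}

if path_has_current_dir "$PATH"; then
    echo "Warning: your PATH includes the current directory"
else
    echo "OK: the current directory is not in your PATH"
fi
```

If you see the warning, look through .bash_profile for the line that adds the offending entry.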

Now let's create our first useful script.

Basic Scripting 🙆

wiki/2018introduction_to_scripting.txt · Last modified: 2018/09/04 08:58 by david