Syncing using rsync

Scratch space on summit is deleted every 90 days. If you would like to avoid this, please work in your project space (250 GB limit) or request petabyte storage space (for a fee).

To avoid having your projects disappear with no warning, it is important to save your work locally.

There are many strategies to do this. I'll show you my strategy. It is probably not the best. Consult with your lab and/or the summit help people for advice.

→ I write a shell script in a local directory on my office computer.

→ I run the shell script anytime I am working on summit. I typically try to run the script in the morning and at night.

→ My script uses rsync. Rsync is a powerful utility for synchronizing and backing up lots of data in complex ways.

rsync usage:

rsync -auvz -e 'ssh -p 22' <source> <target> 
rsync -auvz -e 'ssh -p 22' <source> .

This command will make a copy of all the data in <source> to <target>. <source> and <target> can be different locations on the same computer or locations on remote computers (servers, cloud locations, supercomputers, etc).

Let's break it down…

rsync            this is the command
-a               archive option. This is a shortcut for many options bundled
                 into one. It syncs directories recursively and preserves
                 information such as modification times, permissions, etc.
-u               update option. Skip any file that is newer at the target than
                 at the source, so more recent copies are not overwritten.
-v               verbose option. This makes rsync print a lot of commentary
                 on what it is doing.
-z               compress option. This compresses the data during transit.
-e 'ssh -p 22'   This specifies that you want to connect to the remote server
                 using secure shell (ssh) on port 22.
<source>         source. This is the summit source directory. In the form:
                 eID\@colostate.edu@login.rc.colorado.edu:/scratch/summit/location
<target>         target. This is your local space. I just use a dot (the
                 current directory).
                         
Other helpful options:
-n                       dry run option. When you add -n, nothing is actually moved
                         or copied; rsync only reports what it would do.
--exclude <filesOrDirs>  an option that excludes certain files or directories from
                         the sync. This can be useful if there is a giant input
                         folder where you often unzip and re-zip files.
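To see -n and --exclude in action without touching summit, here is a minimal sketch that runs rsync between two throwaway local directories. The directory and file names are made up for the demo; only the rsync options come from the list above.

```shell
#!/bin/bash
# Sketch: preview a sync with -n, then run it for real, skipping 01_input.
# Every path here is a throwaway local directory (an assumption for the demo).
demo=$(mktemp -d)
mkdir -p "$demo/project/01_input" "$demo/project/02_output" "$demo/backup"
echo "raw reads" > "$demo/project/01_input/reads.txt"
echo "counts"    > "$demo/project/02_output/counts.txt"

# Dry run: rsync lists what it would copy, but copies nothing.
rsync -auvzn --exclude '01_input' "$demo/project" "$demo/backup"
after_dry_run=$(ls -A "$demo/backup")   # still empty

# Real run: the same command without -n.
rsync -auvz --exclude '01_input' "$demo/project" "$demo/backup"
ls -R "$demo/backup"
```

Running the dry run first is a cheap way to check that an --exclude pattern matches what you expect before any data moves.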

Here is an example of a script called rsync_summit.sh

#!/bin/bash

#Pull work from summit to the local directory
rsync -auvz -e 'ssh -p 22' --exclude '01_input' erinnish\@colostate.edu@login.rc.colorado.edu:/scratch/summit/erinnish@colostate.edu/EOP229_RNAseq_pipeline_mouse .

:!: Exercise: Make a shell script on your local computer inside the directory where you want to copy/backup your /scratch/summit files.

:!: Common pitfall: Be aware that directory and directory/ (with a trailing slash) behave differently as an rsync source. directory copies the whole directory into the target. directory/ with the trailing slash copies only the contents of directory.

:!: Common pitfall: This may be trickier for PC people than for Mac people.

:!: Common pitfall: Match the shebang line to your own configuration. Run which bash in your terminal and use the path it prints.
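The trailing-slash pitfall is easiest to understand by trying it. The sketch below uses throwaway local directories (the names are invented for the demo) and copies the same source twice, once without and once with the trailing slash.

```shell
#!/bin/bash
# Sketch: compare "directory" and "directory/" as an rsync source.
# All paths are temporary local directories made up for this demo.
slash_demo=$(mktemp -d)
mkdir -p "$slash_demo/directory" "$slash_demo/no_slash" "$slash_demo/with_slash"
echo "hello" > "$slash_demo/directory/file.txt"

# No trailing slash: the directory itself is copied into the target.
rsync -a "$slash_demo/directory" "$slash_demo/no_slash"

# Trailing slash: only the contents of the directory are copied.
rsync -a "$slash_demo/directory/" "$slash_demo/with_slash"

ls "$slash_demo/no_slash"      # shows: directory
ls "$slash_demo/with_slash"    # shows: file.txt
```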


Moving files using an ftp client

To move files to and from summit, you can use an FTP client that supports SFTP, such as Cyberduck or sFTP Client.

We will use an interactive FTP client to transfer and edit files on remote servers. We recommend Cyberduck. Cyberduck is available on MacOS and PC. The version in the Mac App Store costs money, and we don't recommend it. Just obtain the free version from the Cyberduck website.

:!: Tip: Mac users… Cyberduck not working? Try removing your Cyberduck password from your Mac keychain: Removing passwords from Mac Keychain

Cyberduck isn't working for me. What should I do?

Other options to try are

sFTP Client (also nice)

or

Filezilla free client (trickier)

FTP Client settings

Connect Using: SFTP
Connect to: login.rc.colorado.edu
Login: Your full eID@colostate.edu e-mail
Password: Try using password,push
Port: 22

Secure CoPy

Another option is to use

scp - Secure CoPy
scp <sourcefile> <target>
scp <user>@<host>:/path/to/file/file.txt .

I typically use secure copy on my local computer to pull things from summit or push things to summit.
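As a rough sketch of pulling and pushing with scp, the two helper functions below wrap the command for each direction. The function names, the SCP_CMD variable, and the eID placeholder are assumptions for illustration; the host, port, and login form follow the settings shown earlier on this page. Note that scp uses a capital -P for the port.

```shell
#!/bin/bash
# Sketch: small wrappers around scp for pulling from and pushing to summit.
# SUMMIT_USER is a placeholder; replace eID with your own eID.
SUMMIT_USER='eID@colostate.edu'
SUMMIT_HOST='login.rc.colorado.edu'
SCP_CMD=${SCP_CMD:-scp}   # hypothetical override hook, handy for testing

pull_from_summit() {
    # $1 = file on summit, $2 = local destination (defaults to the current directory)
    "$SCP_CMD" -P 22 "${SUMMIT_USER}@${SUMMIT_HOST}:$1" "${2:-.}"
}

push_to_summit() {
    # $1 = local file, $2 = destination path on summit
    "$SCP_CMD" -P 22 "$1" "${SUMMIT_USER}@${SUMMIT_HOST}:$2"
}
```

After sourcing the file on your local computer, usage would look something like pull_from_summit /scratch/summit/location/file.txt or push_to_summit file.txt /scratch/summit/location/ (both paths are placeholders).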


wiki/syncing.txt · Last modified: 2018/12/04 04:46 by erin