Jonas

rachel.psc.edu and jonas.psc.edu

Jonas was decommissioned on July 1, 2008.

Jonas users with active accounts were transitioned to salk.psc.edu.

Contact PSC User Services with any questions.

Additional information

  • Storing Files
  • Transferring Files

    Storing Files

    File Systems

    File systems are file storage spaces directly connected to a system. There are currently two such areas available to you on jonas.

    /usr/users/n/username

    This is your home directory. The numeral 'n' will be replaced by an integer and 'username' will be replaced by your userid. You can also refer to this directory as $HOME. You have a 1 Gbyte quota for your home directory. Thus you will probably not be able to store your data files on $HOME. Your home directory is backed up. $HOME is visible to all of the SMP machines, but through a relatively slow connection.

    $LOCAL

    Each of the SMP machines has 6 Tbytes of local disk space. The local space for a machine is not visible to the other SMP machines. The local spaces for all of the SMP machines are, however, visible to the front end node, though only through a very slow connection.

    When you run a batch job you cannot determine on which SMP machine your job will run. Thus, you cannot ensure that it will run on the same SMP machine with the same local disk system as any of your prior runs. Therefore, you should consider this local space only as working disk space. In other words, you should copy your data files between golem and this local disk system at the beginning and end of your batch jobs with the far command. The file archiver golem is discussed below.

    Within a job you can refer to the local space assigned to that job on its SMP machine as $LOCAL. You should refer to it with the variable name since we could change the implementation of $LOCAL for performance reasons. See the sample batch job below for an example of how to use $LOCAL and far in a batch job.

    Files on a $LOCAL are, however, accessible to you on jonas's front end both while the job is running and after the job ends, either with the tcscp command or with standard Unix commands. To use either type of command you need to know two things: your job's PBS jobid, which is given when you submit your job and in your job's .o output, and the SMP machine on which your job ran, which is shown in the exec_host field of qstat -f output for a running job and in the .o output of a finished job. These two pieces of information are used to refer to your files on a $LOCAL. For example, if your job is running or ran on salk64a and the PBS jobid is or was 15786 then the local disk space for that job can be referred to as

    /salk64a/local/15786
    

    in front end Unix commands and as

    salk64a:/local/15786
    

    in tcscp commands.
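
    As a minimal sketch of the naming rule above, the two shell functions below build both forms from a machine name and a jobid. The function names frontend_path and tcscp_path are hypothetical, and the sketch assumes you pass the bare machine name and the numeric jobid exactly as they appear in the example above.

```shell
#!/bin/sh
# Build the two names for a job's local disk space.
# Arguments: SMP machine name, PBS jobid (both assumed to be passed
# in the bare form used in the text, e.g. "salk64a" and "15786").

# Path form for standard Unix commands issued on the front end.
frontend_path() {
    printf '/%s/local/%s\n' "$1" "$2"
}

# host:path form for tcscp commands.
tcscp_path() {
    printf '%s:/local/%s\n' "$1" "$2"
}

frontend_path salk64a 15786   # prints /salk64a/local/15786
tcscp_path salk64a 15786      # prints salk64a:/local/15786
```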

    Files on a local disk system left by a batch job will remain in place for a week after the job ends. However, they are not backed up and if disk space is low on a local disk system they can be deleted at any time to allow currently running jobs to continue. Thus, you should move any files that you want a permanent copy of from their local disk system to golem as soon as possible.

    You must use tcscp to copy files from a local disk system to golem after a job ends. Once a job ends you cannot use far to perform this transfer. A tcscp copy will use a very fast path between the SMP machines and golem. See the discussion below of tcscp for an example of how to use tcscp.

    You can also access your files on any of the local file systems using standard Unix commands if you use a filename similar to the one described above. For example, the command

    tail /salk64b/local/21689/output.dat
    

    can be used to examine the end of file output.dat for job 21689 on SMP machine salk64b. This command can be issued while your job is running. Other Unix commands, such as ls or rm, will work similarly.

    However, the connection between the jonas front end and the local file systems is very slow. Thus, you should not use the cp command to copy large files between $HOME and the local file systems. In general, you should limit your interactions with Unix commands between the front end and the local file systems on the SMP machines to essential operations.

    File Repositories

    File repositories are file storage spaces which are not directly connected to a front end node or compute processors. You cannot, for example, open a file that resides in a file repository from within a program. You must use explicit file copy commands to move files to and from the repository. You currently have one file repository available to you on jonas: golem, PSC's file archiver.

    golem

    Golem runs Cray's DMF file archival system. It is a combination tape-and-disk archival system.

    The far program and the tcscp program can be used to transfer files between golem and jonas.

    You can use kftp, gridftp, scp or sftp to transfer files between golem and your remote machine. We strongly recommend you use kftp instead of scp for remote file transfer if kftp is available. We recommend against using sftp. See the golem Web page for more information.

    You should store your data files on golem rather than in your home directory because your home directory space is limited. At the beginning of your batch jobs you should transfer your data files to $LOCAL and then at the end of your batch jobs copy files that you want a permanent copy of back to golem.

    If you need to store a file to golem that is 2 Tbytes or larger please first contact User Services so that special arrangements can be made to store your file.

    Transferring Files

    Kftp

    Jonas is running Kerberos 5 (K5) client and server software. If your local site also has K5 client/server software installed, you can transfer files to and from jonas whether you are logged into jonas or your local machine. The examples below assume that you are logged into your local machine.

    Before you can use kftp to transfer files, you must authenticate yourself to jonas. To do this use the kinit command.

    kinit username@PSC.EDU
    

    For 'username' substitute your PSC userid. PSC.EDU is PSC's Kerberos realm name.

    After you enter this command you are prompted for your PSC Kerberos password, which is the password you use to login to jonas.

    Once you are authenticated you can use the kftp command to actually perform your file transfers.

    kftp jonas.psc.edu
    

    The kftp command functions like the ftp command.

    You should not use kftp to transfer files to $LOCAL.

    You should verify that the Kerberos commands operate on your local system as described here. Some installations of Kerberized ftp differ in their implementation.
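
    A first step in that verification is to confirm that the commands are installed at all. The sketch below checks only that kinit and kftp can be found on your PATH; it does not check how they behave. The helper name have_cmds is hypothetical, and your site may install its Kerberized ftp client under a different name.

```shell
#!/bin/sh
# Return success only if every named command is found on PATH.
have_cmds() {
    for c in "$@"; do
        command -v "$c" >/dev/null 2>&1 || return 1
    done
    return 0
}

# Report whether the Kerberos client commands used in this section exist.
if have_cmds kinit kftp; then
    echo "kinit and kftp found"
else
    echo "kinit or kftp not found on PATH"
fi
```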

    Man pages for kinit and kftp are available on jonas.

    A Unix kftp client is available at http://www.pdc.kth.se/heimdal. A Windows kftp client is available at http://web.mit.edu/network/kerberos-form.html.

    For file transfers, kftp will be much faster than either scp (discussed below) or sftp.

    scp

    The scp program can be used to transfer files between your remote machine and your jonas home directory. You should not use it to transfer files to $LOCAL.

    The format for the scp command is

    scp source-filename target-filename
    

    where the filename on the remote system, whether it is the target or the source, must be specified as

    username@system:filename
    

    For example, to copy a file to your home directory on jonas when you are logged in to your home system use a command such as

    scp filename username@jonas.psc.edu:/usr/users/n/username/filename
    

    If you are logged in to jonas and you want to copy over a file from your home system to jonas, use a command such as

    scp username@remote-system:filename  filename
    

    The first time you use scp to or from jonas, you will receive a message similar to

    Host key not found from list of known hosts.  Are you sure 
    you want to continue connecting?
    

    Answer 'yes' to make the connection. You should not receive this message on subsequent connections.

    You will be prompted next for your password on the remote system. For jonas you should use your PSC Kerberos password.

    You may be able to improve your scp transfer rate by using the blowfish encryption method rather than the default method, if your version of scp supports it. To use this method issue your scp command as

    scp -c blowfish source-filename target-filename
    

    For more information on the scp command, see the scp man page.

    Scp is part of the ssh distribution.

    We strongly recommend that you use kftp rather than scp for remote file transfers if kftp is available.

    Far

    You can use the far program to move files between jonas and golem.

    Tcscp

    The tcscp command, created by PSC, allows you to copy files from a local file system on an SMP machine to golem. Standard Unix file protections are used to determine which files you can copy with tcscp. Thus, other users will not be able to copy your files on a local file system unless you set the file permissions to allow this.

    The format of the command is based on the cp command, with the addition of the ability to specify source and target machines as well as source and target filenames. For example, the command

    tcscp salk64a:/local/15786/output.dat golem:output.dat
    

    copies output.dat from your directory /local/15786 on SMP machine salk64a to your golem home directory. You can get the name of the machine your job ran on from your .o output or from qstat -f for a running job. The PBS jobid is available when you submit your job or from your .o output. You issue the tcscp command interactively while logged into one of jonas's front end nodes.

    The wildcard characters '*' and '?' are permitted in source filename specifications and are treated as the shell treats them.

    Just as with the cp command, you can specify multiple source filenames

    tcscp salk64a:/local/15786/output1.dat  \
      salk64a:/local/15786/output2.dat golem:
    

    This command will copy output1.dat and output2.dat from your directory /local/15786 on salk64a to your golem home directory. When you use this form of the command the last file specification is the target specification and must be a directory.
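
    When a job leaves several files behind, typing each full source specification is error prone. The sketch below assembles the multiple-source form of the command shown above and prints it rather than running it, since tcscp exists only on the PSC systems. The function name tcscp_cmd is hypothetical, and the sketch assumes filenames without spaces.

```shell
#!/bin/sh
# Print (do not run) a tcscp command that copies several files from one
# job's local directory to the golem home directory.
# Usage: tcscp_cmd HOST JOBID FILE...
tcscp_cmd() {
    host=$1
    jobid=$2
    shift 2
    cmd="tcscp"
    # Prefix every filename with the machine and the job's local directory.
    for f in "$@"; do
        cmd="$cmd $host:/local/$jobid/$f"
    done
    # The last specification is the target; "golem:" is the golem home directory.
    printf '%s golem:\n' "$cmd"
}

tcscp_cmd salk64a 15786 output1.dat output2.dat
```

    Review the printed command before pasting it into a shell on one of the front end nodes.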

    The tcscp command has several options. The -r option recursively copies directories and their contents, just like cp. The -v option runs the command in verbose mode, which shows the fully expanded filenames used in the copy along with timing data about the transfer. By default tcscp overwrites existing files; the -no option tells it to skip any target file that already exists instead. The -nk option causes tcscp to delete its source files after it successfully copies them. Finally, the -h option provides help information for tcscp.

    Tar

    Whether you are transferring files between jonas and golem or between jonas and your remote system, if you have many files (1000 or more) it is much more efficient to tar them up into one file and then transfer this single tar file. This is especially true if the files are small, 64 Kbytes or smaller.
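
    The bundling step can be sketched as follows. The function names bundle_dir and count_entries and the directory name results are hypothetical, and the sketch assumes a tar that accepts the -C option, as GNU tar does.

```shell
#!/bin/sh
# Bundle a directory of many small files into one tar file.
# Usage: bundle_dir DIR TARFILE
bundle_dir() {
    dir=$1
    tarfile=$2
    # -C changes to the parent directory so the archive stores relative paths.
    tar -cf "$tarfile" -C "$(dirname "$dir")" "$(basename "$dir")"
}

# Count the entries in a tar file as a quick check before transferring it.
count_entries() {
    tar -tf "$1" | wc -l
}

# Demonstration on a throwaway directory holding three small files.
demo=$(mktemp -d)
mkdir "$demo/results"
for i in 1 2 3; do
    echo "data $i" > "$demo/results/part$i.dat"
done
bundle_dir "$demo/results" "$demo/results.tar"
count_entries "$demo/results.tar"
rm -rf "$demo"
```

    The single file results.tar is then what you would transfer, in place of the individual files.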

    Tru64 tar (located at /bin/tar) can only create tar files up to 8 Gbytes. GNU tar (located at /usr/psc/gnu/bin/tar) can create tar files larger than 8 Gbytes. However, a file created by GNU tar that is larger than 8 Gbytes cannot be read by Tru64 tar.

    You should first contact User Services if you are going to create a tar file that is 50 Gbytes or larger. You should move your tar file to golem or to your remote system as soon as you can after you create it.
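
    The size limits above can be summarized in a small helper that classifies a planned tar file by its size in bytes. This is only a sketch: the function name tar_size_advice is hypothetical, and it assumes that 1 Gbyte means 1024^3 bytes and that your shell uses 64-bit arithmetic.

```shell
#!/bin/sh
# Classify a planned tar file size (in bytes) against the limits in the text.
# Assumption: 1 Gbyte = 1024^3 bytes.
GBYTE=1073741824

tar_size_advice() {
    bytes=$1
    if [ "$bytes" -ge $((50 * GBYTE)) ]; then
        # 50 Gbytes or larger: talk to User Services before creating it.
        echo "contact User Services first"
    elif [ "$bytes" -gt $((8 * GBYTE)) ]; then
        # Larger than 8 Gbytes: beyond what Tru64 tar can create.
        echo "use GNU tar"
    else
        echo "either tar works"
    fi
}

tar_size_advice 10737418240   # 10 Gbytes under the assumption above
```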