Running jobs on elm

elm runs Solaris 10, which provides the multi-threading support necessary to obtain peak performance on parallel Sparc machines. elm contains two 2.4 GHz AMD Opteron processors and operates in a shared-memory configuration. The compilers now offer options for automatic parallelization of source code, as well as direct access to the multi-thread libraries for serious hand-coding of parallel implementations. See the man pages for f90, f77, and cc for more information.

Jobs that are disk I/O intensive should not be run directly from your account's home directory. Doing so causes the system to write the data across the network instead of taking advantage of the fast SCSI system local to the compute server (a difference of 1.25Mb/s vs. 10+Mb/s!). Any job that will write a large amount of output and/or temporary files (greater than 10Mb) should therefore be run on the local disk. A scratch space has been set up for exactly this purpose. To use it, copy your input file to the scratch space and run the job from there:

% cd /scratch/elm
% mkdir your-login
% cd your-login
% cp $HOME/input .
% command < input > output &

Submitting the job in this manner will write the output of the job to /scratch/elm/your-login, taking advantage of the higher transfer rates.

The /scratch disk space has a capacity of approximately 13GB on elm and is temporary file space: files left here will not be backed up and, if not accessed for long periods of time, are subject to removal. When you have completed your job, please transfer the useful output files to your account and remove all remaining files. Everybody must share this file space, so please be courteous and clean up your files regularly.
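
The scratch workflow above (copy the input in, run the job, copy the results home, clean up) can be collected into a small Bourne-shell function. This is only a sketch: the function name run_in_scratch and the SCRATCH_ROOT variable are made up for illustration, and it assumes your job reads standard input and writes standard output as in the example above.

```shell
#!/bin/sh
# Sketch: run a job from local scratch space, then copy the result home.
# run_in_scratch and SCRATCH_ROOT are illustrative names, not system features.

run_in_scratch() {
    # usage: run_in_scratch INPUT_FILE OUTPUT_DEST COMMAND [ARGS...]
    input=$1
    dest=$2
    shift 2

    scratch_root=${SCRATCH_ROOT:-/scratch/elm}
    user=${LOGNAME:-$(id -un)}
    workdir=$scratch_root/$user

    # Create your personal scratch directory and stage the input file
    mkdir -p "$workdir" || return 1
    cp "$input" "$workdir/input" || return 1

    # Run the job with the scratch directory as its working directory
    ( cd "$workdir" && "$@" < input > output ) || return 1

    # Copy the useful output back, then remove what was left in scratch
    cp "$workdir/output" "$dest" || return 1
    rm -f "$workdir/input" "$workdir/output"
}
```

For example, "run_in_scratch $HOME/input $HOME/output command" would run command in /scratch/elm/your-login and leave only the final output in your account. Note that the function waits for the job to finish before copying anything back.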

elm should have little to no interactive use, so jobs run on elm do not need to be submitted with the nice command described below.

UNIX systems are intended to be multi-tasking systems. As such, they are capable of responding to interactive use while, at the same time, running large computational programs. Typically, large jobs are submitted and then the user logs off; this is called running "in the background". You can submit background jobs by simply appending an ampersand, "&", to your command line:

% command < input > output &

Note that the input and output redirection are optional, and are shown for the sake of generality. This will launch the job and then return you to the interactive prompt.
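
If you launch jobs from a Bourne-shell script rather than at the prompt, the same ampersand syntax applies, and the shell records the process ID of the new job in "$!". The sketch below wraps this in a helper; the name start_background and the variable BG_PID are hypothetical and chosen only for this example.

```shell
#!/bin/sh
# Sketch: start a job in the background and remember its process ID.
# start_background and BG_PID are illustrative names, not system commands.

start_background() {
    # usage: start_background INPUT OUTPUT COMMAND [ARGS...]
    in=$1
    out=$2
    shift 2

    # The trailing & returns control immediately, as at the prompt
    "$@" < "$in" > "$out" &

    # $! holds the PID of the most recently started background job
    BG_PID=$!
}
```

A script can later call "wait $BG_PID" to block until the job finishes, or pass the saved PID to other commands.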

If you forget to launch a job in the background (and have lost your interactive command prompt), type "CTRL-Z" (control Z) to suspend the job. You can then put the job in the background using the "bg" command:

% command < input > output
CTRL-Z
Suspended
% bg
[1] command < input > output &

Since all other UNIX machines are available for interactive use, you should submit any long-running jobs (greater than 10 CPU minutes) using the nice command:

% nice +15 command < input > output &

If you submit a long-running job without the nice command, all jobs submitted properly (with the nice command) will cease to collect CPU time -- an unfair situation. Please cooperate so that everybody has equal access to the CPU cycles.
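
The "nice +15" form shown above is the syntax of the nice built into csh and tcsh, which the "%" prompt in these examples suggests (an assumption on our part). In a Bourne-shell script the standalone nice utility is used instead, and it takes the increment with the -n option. A minimal sketch, with the hypothetical helper name submit_niced:

```shell
#!/bin/sh
# Sketch: Bourne-shell equivalent of "nice +15 command < input > output &".
# submit_niced and NICED_PID are illustrative names, not system commands.

submit_niced() {
    # usage: submit_niced INPUT OUTPUT COMMAND [ARGS...]
    in=$1
    out=$2
    shift 2

    # -n 15 adds 15 to the niceness, lowering the job's scheduling priority
    nice -n 15 "$@" < "$in" > "$out" &
    NICED_PID=$!
}
```

If you are unsure which form your shell expects, check "man nice" and your shell's own man page before submitting a long job.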

If you forget to nice a job or discover a "quick" job that has run past 10 CPU minutes, you can renice the job:

% renice 15 PID

where PID is the process ID obtained from the "top" command. You can also renice a job from within top.
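
When a job was started from a Bourne-shell script, its process ID is already available in "$!", so it can be reniced without looking it up in top. The sketch below uses "sleep 2" as a stand-in for a real computation; the variable names are illustrative only.

```shell
#!/bin/sh
# Sketch: renice a job that was started without the nice command.
# "sleep 2" is a placeholder for a real long-running program.

sleep 2 &
job_pid=$!                  # PID of the background job just started

# Same form as the renice command above: priority first, then the PID
if renice 15 "$job_pid"; then
    reniced=yes
else
    reniced=no
fi

wait "$job_pid"             # block until the job has finished
```

You can always lower the priority of your own jobs this way; raising it again normally requires administrator privileges.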