The Shared Computing Cluster (SCC) is a multi-user environment, and it is important for all users to share the cluster respectfully to ensure the best possible experience for everyone. The guidelines below help with this:
- If you have never used a shared cluster, we highly recommend that you attend one or more of our introductory tutorials (you can also view the slides on your own) or meet with one of the Research Computing Services (RCS) staff members.
- Do not run long processes on the login nodes. The login nodes are intended for file editing, file transfer, compilation, and light debugging. Any other CPU-intensive process run there will be aborted.
- Use the Project Space for production work, not your home directory. All home directories have a 10GB quota, and when it is exceeded you will not be able to transfer files, run jobs, or perform many other tasks. Additional information on File System Structure is available.
- Before submitting a large number of jobs (or tasks), run a small test case to make sure it works as expected.
- Request a run time limit of 12 hours or less when possible. There are many more compute nodes available to run 12-hour jobs than longer jobs, so your job will spend less time in the queue waiting to start if it does not request more than 12 hours.
- Use checkpoints! If there are breaks in your workflow, save the current state to a checkpoint file. This will allow you to restart your job from the last successful state instead of rerunning it from the beginning if the job fails due to an error or does not complete before the hard time limit.
- Use a job array instead of submitting many individual jobs when possible.
- Do not submit many short jobs. If your jobs take just a few minutes, combine them into a single script. This reduces the workload on the system and helps avoid generating large numbers of tiny files, which are a problem for the system and generally annoying for the user. You can contact us for help in doing this.
- Try to avoid creating many (thousands+) tiny files when possible. It is better for the system to have a smaller number of larger files. If you do need a very large number of files, we recommend storing them in a single archive file in /projectnb or /restricted/projectnb to reduce the cost to the system of backing them up.
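As an illustration of the archive recommendation above, the following is a minimal sketch that bundles a directory of small files into one compressed tar archive using Python's standard `tarfile` module. The function names `archive_small_files` and `list_archive` are hypothetical and chosen for this example only; the directory and archive paths are supplied by the caller.

```python
import tarfile
from pathlib import Path


def archive_small_files(source_dir, archive_path):
    """Bundle every regular file under source_dir into one .tar.gz archive.

    Returns the number of files added. (Hypothetical helper for illustration.)
    """
    source = Path(source_dir)
    archive = Path(archive_path).resolve()
    count = 0
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in sorted(source.rglob("*")):
            # Skip directories, and skip the archive itself if it lives in source_dir.
            if not path.is_file() or path.resolve() == archive:
                continue
            # Store paths relative to source_dir so the archive unpacks cleanly.
            tar.add(path, arcname=str(path.relative_to(source)))
            count += 1
    return count


def list_archive(archive_path):
    """Return the member names of the archive without extracting anything."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

Once the archive has been verified with `list_archive`, the original small files can be removed so that only the single archive file remains in the project space.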
Please contact us (the RCS Staff) if you have questions about any of these guidelines. We are here to help and make sure everyone has the best possible experience using the SCC!
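To make the checkpoint guideline above more concrete, here is a minimal sketch of one way a Python job might save and resume its state. It is not an SCC-specific recipe: the function names, the JSON state layout, and the `stop_after` parameter (which only simulates a job being killed partway through) are all assumptions made for this example, and the summing loop stands in for real work.

```python
import json
import os
import tempfile


def load_checkpoint(path):
    """Return the saved state, or a fresh state if no checkpoint exists yet."""
    if os.path.exists(path):
        with open(path) as f:
            return json.load(f)
    return {"next_step": 0, "total": 0}


def save_checkpoint(path, state):
    """Write to a temporary file, then rename it over the checkpoint.

    An interrupted write therefore never leaves a half-written checkpoint.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def run(path, n_steps, checkpoint_every=10, stop_after=None):
    """Sum the integers 0..n_steps-1, resuming from the checkpoint if present.

    stop_after simulates an interruption and exists for demonstration only.
    """
    state = load_checkpoint(path)
    for step in range(state["next_step"], n_steps):
        if stop_after is not None and step >= stop_after:
            return None  # simulated interruption; the last checkpoint is kept
        state["total"] += step  # placeholder for the real work of one step
        state["next_step"] = step + 1
        if state["next_step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
    save_checkpoint(path, state)
    return state["total"]
```

Running the same function again with the same checkpoint path picks up from the last saved step rather than from step 0, so only the work done since the last checkpoint is repeated.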