Intro to Cluster Computing

What is a Cluster?

A computing cluster is a group of closely linked computers, called nodes, that work together as a single computer. This configuration gives better performance than any single node is able to achieve alone. Nodes are divided into three types.

Compute Nodes: These are what actually carry out the computations.

Login Nodes: These handle login requests and allow users to interact with their data and submit jobs to the back-end compute nodes.

Head Node: This controls the cluster and contains the configuration information for all the other nodes in the cluster.

Common Terms:

cpu time- The amount of time a CPU is in use, counted separately for each CPU and then added together (so a 4-processor job that runs for 15 minutes takes 1 hour of cpu time).

job- Any user-submitted program executed on the cluster.

job file- A file that describes a job; the resource manager and the scheduler use this information to schedule the job.

queue- An ordered group of jobs waiting to run.

partition- A virtual container of resources the scheduler can assign jobs to.

wall-clock time- The amount of real time a job is expected to run, or has run, on the cluster (a 4-processor job that runs for 15 minutes takes 15 minutes of wall-clock time).
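
The difference between cpu time and wall-clock time in the definitions above can be checked with a little arithmetic. The following is a minimal sketch, not part of any cluster tool: the function name cpu_time_minutes is made up for illustration, and it assumes every processor stays busy for the entire run.

```python
def cpu_time_minutes(processors: int, wall_clock_minutes: float) -> float:
    """Return total cpu time in minutes.

    Assumption: all processors are in use for the whole wall-clock
    duration, so cpu time = processors * wall-clock time.
    """
    return processors * wall_clock_minutes


# The example from the definitions: a 4-processor job that runs for 15 minutes.
processors = 4
wall_clock = 15  # minutes of real elapsed time

cpu_time = cpu_time_minutes(processors, wall_clock)

print(f"wall-clock time: {wall_clock} minutes")
print(f"cpu time: {cpu_time} minutes ({cpu_time / 60:g} hour)")
```

With these inputs the job uses 60 minutes (1 hour) of cpu time while only 15 minutes pass on the clock. A real job may keep some processors idle part of the time, in which case its cpu time would be lower than this estimate.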

Cluster Hosting Services

S&T students and researchers who need high-performance computing (HPC) have access to consulting and hosting provided by ITRSS. Historically, ITRSS has been able to provide some of the industry's newest gear designed for high-performance computing to S&T students at no cost to them. ITRSS also provides consulting services for research faculty who need HPC equipment. Specialized equipment is selected to provide the maximum benefit to the researcher and the computing environment as a whole.