OrangeGrid (OG) | HTCondor Support

OrangeGrid is Syracuse University Research Computing's high-throughput computing (HTC) cluster. OrangeGrid is a non-interactive Linux environment intended to run analyses that are too large or complex to run on a single server.

IDEs and Development

OrangeGrid is not intended to be used as a development environment. Activities on the cluster should be limited to submitting jobs, doing light editing with a text editor such as nano or vim, and running small tests that use a single core for no more than a few minutes. The use of IDEs such as Jupyter, Spyder, and VSCode is prohibited, as these programs can interfere with other users of the system or, in the worst case, impact the system as a whole. If you need a development environment, please contact us and other, more appropriate resources can be provided.

Looking for Zest? While similar, the OrangeGrid and Zest clusters are separate, unique environments. Information about Zest is available on the Zest | Slurm home page.

Accessing OrangeGrid

To access OrangeGrid, simply make an SSH connection using your NetID, specifying the login node you have been assigned. The example below uses 'its-og-login3.syr.edu'; refer to the access email you received from Research Computing staff for your node information. The cluster supports connections from the Windows command prompt (CMD) and programs like PuTTY, as well as file transfer via SCP and WinSCP.

Example SSH Connection
ssh netid@its-og-login3.syr.edu
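Files can be transferred in the same way with SCP. The sketch below copies a hypothetical file to your home directory on the login node; substitute your own file name and assigned login node.

Example SCP File Transfer
scp mydata.tar.gz netid@its-og-login3.syr.edu:~/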


Campus Wi-Fi or Off-Campus?

Campus Wi-Fi and off-campus networks cannot access the cluster directly. Users off campus need to connect to campus via Remote Desktop Services (RDS). RDS will provide you with a Windows 10 desktop that has access to your G: drive, campus OneDrive, and the research clusters. Once connected, SSH and SCP are available from the Windows command prompt, or you can use PuTTY and WinSCP, which are also installed. Full instructions and details for connecting to RDS are available on the RDS home page. Note that Azure VPN is an alternative option, but it is not available for all users. See the Azure VPN page for more details.

In rare cases where RDS is not an option, the research computing team may provide remote access via a bastion host.

Connecting Via a Bastion Host

To connect via a bastion host, first SSH to the bastion host specified by Research Computing staff. Note, however, that this connection requires a Google Authenticator passcode. If you have not already configured the Google Authenticator app, instructions are provided below.

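For example, if the bastion host you were assigned were 'its-bastion1.syr.edu' (a hypothetical name; use the host from your access email), the connection would look like:

Example Bastion Connection
ssh netid@its-bastion1.syr.edu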

Once on the bastion host, simply SSH normally to the login node you have been provided an account for. 

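Using the login node from the earlier example:

Example Login Node Connection from the Bastion Host
ssh netid@its-og-login3.syr.edu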

Steps to Set Up Google Authenticator

1) If not already installed, download and install the Google Authenticator application from the App Store (Apple) or Google Play (Android).

2) Use your SSH client to connect to its-condor-t1.syr.edu. If you need to download an SSH client, PuTTY is a good option for Windows and Unix and can be downloaded from the PuTTY website. Apple users can use the built-in application called Terminal.

[Screenshot: PuTTY settings prior to connection]

3) Maximize your SSH window (you will need a large window to display the QR code that you will scan with the Google Authenticator application).

4) When prompted, use your SU NetID and password to log in.

[Screenshot: PuTTY connection]

5) The system will display basic instructions for setting up your two-factor authentication and then wait at a prompt before continuing. Again, be sure to maximize the SSH session window before you go to the next step to make sure the barcode will be fully displayed on the screen.

[Screenshot: initial login]

[Screenshot: QR code screen]

6) Once you continue, the system will display a key and barcode. Use the Google Authenticator application you installed in step 1 to scan the barcode or enter the key.

7) This should log you in successfully. On subsequent logins, you will enter your NetID password at the Password prompt, then the 6-digit Google Authenticator one-time password at the Verification prompt. Enter the 6-digit code without any spaces, even if Google Authenticator shows a space in the number string.

[Screenshot: Google Authenticator app]


HTCondor Basics

Below are examples of the most common HTCondor commands. 

Note that you can add '| less' to most commands (for example, 'condor_q | less') for more navigable output that you can scroll with the arrow keys. Quit the 'less' viewer by pressing 'q', or search for specific text with '/'.

Common HTCondor Commands
# To check the state of the machines available in the pool.
condor_status

# To see who is using the pool.
condor_userprio

# To see the status of jobs queued in the pool.
condor_q

# To see the status of jobs for a particular user.
condor_q <netid>

# To see the status of jobs for a particular user continuously over a specified time interval.
watch -n <time/seconds> condor_q <netid>

# To submit a job described by a submit file to the pool.
condor_submit <submit file>

# To remove a submitted job by its ID.
condor_rm <JobID>

# Will give you all the options for submitting a job and creating job submission files.
man condor_submit

In the output of condor_q, the first column, ID, is a unique identification number assigned by Condor to a job. The OWNER column gives the user name of the job owner. The column labeled ST gives the job state.

The states that you will normally encounter are:

  • I for idle. Your job is in Condor’s job queue, but is not currently running.
  • R for running. Your job has been assigned to a CPU and is currently executing.
  • H for held. There is a problem with your job that requires manual intervention.
  • C for completed. Your job is finished and is ready to be removed from the queue.
  • X for exiting. Your job is leaving the queue, typically because it has been removed.
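An illustrative condor_q listing (values are examples only) shows these columns:

ID      OWNER   SUBMITTED     RUN_TIME ST PRI SIZE CMD
1234.0  netid   2/10 10:30   0+00:05:12 R  0   0.3  hello.sh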

Learn more basics with the HTCondor Quick Start Guide.

HTCondor Generic Instructions

Please note that the documentation linked above refers to a generic cluster. Some of the examples use something called Condor File Transfer to move data between the head node (where you log in) and the worker node (where the job runs). This is not needed and should not be used on OrangeGrid; omit any lines like:

should_transfer_files = Yes
when_to_transfer_output = ON_EXIT


Submitting Jobs

To submit a job in HTCondor, you will generally need a script file and a job description file, often referred to as a "submit file". Once you have these two files, use the 'condor_submit' command to submit the job to the HTCondor system.

Simple Shell Script Example

Below is a basic shell script, called hello.sh in this example, that prints a welcome message and then some information about the host that it is running on.

#!/bin/bash
# If any command fails, exit with a non-zero exit code
set -e
# Print a welcome message
/bin/echo "Hello, world!"
# Dump some information about the host that we are running on
/bin/echo -n "Host name is "
/bin/hostname
/bin/echo -n "Linux kernel version is "
uname -r
/bin/echo -n "Operating system install is "
cat /etc/issue
# Exit successfully
exit 0

We can run the script from the command line to see what it prints on the head node. After making it executable with 'chmod +x hello.sh', running './hello.sh' will print something like:

Hello, world!
Host name is its-condor-submit
Linux kernel version is 2.6.38-10-virtual
Operating system install is Ubuntu 11.04 \n \l

Simple Submit File Example

Below is a basic condor submit file using our hello.sh script. 

# Which universe should be used? The vanilla universe can be used to run any regular executable that can run from the command line.
universe = vanilla
# Point the submit file to the script you would like to use. Use an absolute path if the script is in another location.
executable = hello.sh
# These next three lines tell Condor that it must transfer the executable and any input and output files to and from the computer that will actually execute the job. Condor will automatically transfer back all files that your job creates, but if you require any input files, you must explicitly specify them here by adding 'transfer_input_files = file1,file2,...'.
# Note: per the HTCondor Generic Instructions section above, the should_transfer_files and when_to_transfer_output lines are not needed on OrangeGrid and can be omitted.
transfer_executable = true
should_transfer_files = yes
when_to_transfer_output = on_exit_or_evict
# The next three lines specify the (relative) paths of the files that will store the job's standard output, standard error, and Condor's log of job events. Notice that we have used two Condor variables in these file names: $(cluster) and $(process). These two variables make up the two parts of the Condor job ID that we saw earlier when running condor_q.
output = hello.$(cluster).$(process).out
error = hello.$(cluster).$(process).err
log = hello.$(cluster).log
# This tells Condor to submit 1 job of this type to the pool.   
queue 1
# Note: If we add an integer after 'queue', we can submit multiple identical jobs to the pool.
# Ex. 'queue 10' will submit 10 identical jobs to the pool, each with a unique process number.
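For example, if 'queue 10' placed the jobs in cluster 123 (an illustrative cluster number), the submit file above would produce output files hello.123.0.out through hello.123.9.out, one per process.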

Now that the script and submit file are created, submit the job:

# Submit the job.
condor_submit hello.sub

# Condor will respond telling you that the job was submitted and give you the cluster number, for example:
Submitting job(s).
1 job(s) submitted to cluster <JobID>

Running GPU Jobs on OrangeGrid

Specific nodes within the OrangeGrid pool are equipped with Graphics Processing Units (GPUs). Typical uses include TensorFlow and PyTorch, but other uses and tools are welcome.

To take advantage of GPUs for your jobs, please include the following line within your HTCondor submit files.

+request_gpus = 1
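For example, building on the hello.sh submit file above, a minimal GPU submit file is sketched below; the '+request_gpus' line is the only GPU-specific addition.

universe = vanilla
executable = hello.sh
transfer_executable = true
output = hello.$(cluster).$(process).out
error = hello.$(cluster).$(process).err
log = hello.$(cluster).log
+request_gpus = 1
queue 1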

OrangeGrid FAQ

Can I Use TensorFlow with OrangeGrid?

Yes, we have detailed instructions for Installing and Using TensorFlow with HTCondor.

Can I Use Docker with OrangeGrid?

The cluster doesn't support Docker directly; however, you can import Docker containers into Singularity. More information on Singularity is available here: https://docs.sylabs.io/guides/3.6/user-guide/
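As a sketch, assuming Singularity is installed and using the python:3.10 image from Docker Hub purely as an illustration, converting and running a Docker container looks like this:

# Pull a Docker Hub image and convert it to a Singularity image file (.sif)
singularity pull docker://python:3.10
# Run a command inside the resulting container
singularity exec python_3.10.sif python3 --version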

What Software is Available on OrangeGrid?

Software installation on the nodes is minimal, primarily due to the variety of software and versions needed by researchers. To get needed software onto the nodes, the research computing team utilizes containers, which allow creating a self-contained environment that includes the needed software, libraries, and configuration. The research computing team will assist with container creation in collaboration with the researcher.




Getting Help

Questions about Research Computing? Any questions about using or acquiring research computing resources or access can be directed to researchcomputing@syr.edu.