The Autograding Process
Autolab follows a multi-step procedure for running an autograding job. Jobs are executed in sandboxed containers so they can't interfere with shared resources, and they are distributed across multiple servers to handle a large volume of submissions.
Tango is Autolab's autograding backend. It handles job queueing and distribution. It's a separate service from Autolab, but they're tightly integrated. Tango's API is documented by CMU.
Sending the job to Tango
An autograding job is started when a user makes a submission to an assignment with an autograder configured or an instructor regrades an existing submission.
To begin a job, Autolab first makes a GET request to Tango's /open/<key>/<course-assessment>/ endpoint. Tango creates a directory for the assignment if it doesn't already exist. If the directory already exists, Tango returns the MD5 hash of every file in the directory so Autolab can avoid uploading duplicates (e.g., your autograder files).
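The duplicate check described above can be sketched as follows. This is a minimal illustration, not Autolab's actual code: it assumes the response from the open request can be read as a mapping from filename to MD5 hex digest, and the helper names md5_hex and files_needing_upload are hypothetical.

```python
import hashlib


def md5_hex(data: bytes) -> str:
    """Return the MD5 hex digest of a file's contents."""
    return hashlib.md5(data).hexdigest()


def files_needing_upload(local_files: dict, remote_md5s: dict) -> list:
    """Return the names of local files that are missing remotely or differ.

    local_files maps filename -> file contents (bytes).
    remote_md5s maps filename -> MD5 hex digest; this shape is an
    assumption about what the open request returns.
    """
    return [
        name
        for name, data in sorted(local_files.items())
        if remote_md5s.get(name) != md5_hex(data)
    ]
```

With this sketch, an unchanged autograde.tar is skipped, while a new submission file is always uploaded because its name and hash are not yet known to Tango.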
Next, Autolab makes one POST request to Tango's /upload/<key>/<course-assessment>/ endpoint for each file the job requires, skipping any file whose MD5 hash matches one Tango already reported.
This includes:
- The autograder: autograde.tar
- The Makefile: autograde-Makefile
- The submission: e.g., student@buffalo.edu_5_handin.py
- The submission metadata: e.g., student@buffalo.edu_5_handin.py.settings.json
Files are saved to /opt/TangoService/Tango/courselabs/<key>-<course>-<assignment>/<filename>
After uploading all required files, Autolab generates a random callback URL, which Tango will request when the job is finished.
To queue the job, Autolab makes a POST request to Tango's /addJob/<key>/<course-assessment>/ endpoint. This request contains the following information:
- The image name configured in the autograder settings
- The timeout configured in the autograder settings
- Each file required for grading
  - This contains the original filename on Tango's filesystem (e.g., student@buffalo.edu_5_handin.py) and the destination filename (e.g., handin.py), which is what the file will be named within the container when grading.
- The name of the feedback output file (generated by Autolab, ends with _autograde.txt)
- The callback URL mentioned above (generated by Autolab)
- The job name (generated by Autolab)
Tango validates the job (e.g., all required files exist and the image exists) and returns an error if validation fails. Error messages are propagated back to Autolab, which shows detailed errors to instructors and only vague errors to students.
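To make the shape of the queueing request concrete, here is a minimal sketch of how its body could be assembled. The JSON key names (image, timeout, files, localFile, destFile, output_file, callback_url, jobName) are assumptions for illustration and should be checked against Tango's API documentation. The function name build_add_job_payload is hypothetical.

```python
import json


def build_add_job_payload(image, timeout, files, output_file, callback_url, job_name):
    """Assemble the request body for queueing a job.

    files is a list of (name_on_tango, name_in_container) pairs.
    All key names below are assumptions, not confirmed by this document.
    """
    return {
        "image": image,
        "timeout": timeout,
        "files": [
            {"localFile": local_name, "destFile": dest_name}
            for local_name, dest_name in files
        ],
        "output_file": output_file,
        "callback_url": callback_url,
        "jobName": job_name,
    }


def to_request_body(payload: dict) -> str:
    """Serialize the payload as JSON for the POST request."""
    return json.dumps(payload)
```

Each file entry pairs the name stored on Tango's filesystem with the name the file takes inside the container, which is how student@buffalo.edu_5_handin.py becomes handin.py during grading.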
Tango queues the job
If the job is valid, Tango adds it to the job queue. Tango has a configured number of container instances that can run simultaneously for each image. The job will remain in the queue until there is an instance of the required image available. Usually, jobs are dequeued immediately, but during high-traffic times, such as assignment deadlines, the queue may get longer.
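The per-image instance limit can be illustrated with a small in-memory model. This is a simplified sketch under stated assumptions, not Tango's implementation: the class ImagePoolQueue and its methods are hypothetical, and real scheduling involves worker nodes and persistence that are omitted here.

```python
from collections import deque


class ImagePoolQueue:
    """Toy FIFO job queue with a cap on running instances per image."""

    def __init__(self, capacity_per_image: dict):
        self.capacity = dict(capacity_per_image)
        self.running = {image: 0 for image in self.capacity}
        self.waiting = deque()

    def add_job(self, job_name: str, image: str) -> None:
        """Queue a job; reject images that have no configured pool."""
        if image not in self.capacity:
            raise ValueError(f"unknown image: {image}")
        self.waiting.append((job_name, image))

    def dispatch(self) -> list:
        """Start every waiting job whose image has a free instance.

        Returns the names of the jobs started, in queue order. Jobs
        whose image is at capacity stay queued in their original order.
        """
        started = []
        still_waiting = deque()
        while self.waiting:
            job_name, image = self.waiting.popleft()
            if self.running[image] < self.capacity[image]:
                self.running[image] += 1
                started.append(job_name)
            else:
                still_waiting.append((job_name, image))
        self.waiting = still_waiting
        return started

    def finish(self, image: str) -> None:
        """Release one instance of the image after its job completes."""
        if self.running[image] == 0:
            raise ValueError(f"no running instance of {image}")
        self.running[image] -= 1
```

With a capacity of one instance, a second job for the same image waits until finish is called for the first, which mirrors how the queue grows near assignment deadlines.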
Running the job
When a job is ready to run, Tango:
- Picks a worker node to run the job on
- Creates a directory on the worker node, via SSH, to hold the job files
  - E.g., /docker-volumes/dev-1005-autograding_image
- Copies each job file to the worker node via SCP
  - E.g., /docker-volumes/dev-1005-autograding_image/handin.py
- Executes the command to begin autograding on the worker node via SSH (elaborated below)
The command that begins autograding is similar to the following. This example is from a development environment, but the core of the command is identical in production.
(docker run --name dev-1005-autograding_image -v /docker-volumes/dev-1005-autograding_image/:/home/mount autograding_image sh -c 'cp -r mount/* autolab/; su autolab -c "autodriver -u 100 -f 104857600 -t 20 -o 1024000 autolab > output/feedback 2>&1"; cp output/feedback mount/feedback')
Let's break that complicated command down:
- It creates and starts a new Docker container from the image you specify (e.g., autograding_image)
- The directory with the files required for the job is mounted to /home/mount within the container
- Within the container, the following command is executed:
  cp -r mount/* autolab/; su autolab -c "autodriver -u 100 -f 104857600 -t 20 -o 1024000 autolab > output/feedback 2>&1"; cp output/feedback mount/feedback
  - All the input files are copied to the /home/autolab directory
    - The final WORKDIR in your Dockerfile should always be /home. This document assumes it is.
  - It switches to the autolab user, which must be created in your Dockerfile. (Up until this point, it has been running as whichever user you specified in your Dockerfile, usually root by default.)
  - The autolab user executes autodriver -u 100 -f 104857600 -t 20 -o 1024000 autolab > output/feedback 2>&1
    - (These limits are different in production)
    - -u 100 limits the number of processes that can be started
    - -f 104857600 sets the maximum file size that can be created
    - -t 20 sets the job timeout
    - -o 1024000 limits the size of the output
    - autolab specifies the directory that will be copied into the grading user's home directory
    - > output/feedback 2>&1 saves stdout and stderr to the feedback file
    - (autodriver is explained in depth below)
  - After grading, we're back to the original user, and the feedback file is copied to /home/mount/feedback, which is shared with the host
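To show how the pieces of the command fit together, the following sketch rebuilds the development-environment example from its parameters. It only assembles a string and is not how Tango itself constructs the command. The function names build_autodriver_command and build_docker_command are hypothetical.

```python
def build_autodriver_command(max_procs: int, max_file_size: int,
                             timeout: int, max_output: int) -> str:
    """Build the autodriver invocation that runs as the autolab user."""
    return (
        f"autodriver -u {max_procs} -f {max_file_size} -t {timeout} "
        f"-o {max_output} autolab > output/feedback 2>&1"
    )


def build_docker_command(container_name: str, volume_dir: str,
                         image: str, autodriver_cmd: str) -> str:
    """Wrap the autodriver call in the copy-in, run, copy-out sequence."""
    inner = (
        "cp -r mount/* autolab/; "                 # copy job files into /home/autolab
        f'su autolab -c "{autodriver_cmd}"; '      # grade as the autolab user
        "cp output/feedback mount/feedback"        # expose feedback to the host
    )
    return (
        f"(docker run --name {container_name} "
        f"-v {volume_dir}:/home/mount {image} sh -c '{inner}')"
    )
```

Calling build_docker_command with the container name dev-1005-autograding_image, the volume directory /docker-volumes/dev-1005-autograding_image/, the image autograding_image, and the limits 100, 104857600, 20, and 1024000 reproduces the example command above. The sketch does no shell escaping, so it is only suitable for trusted, simple values like these.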
Autodriver
The autodriver configures the environment in the container before running your autograder.
It does a lot, but the most important parts are:
- It is running as the autolab user (but the submission code is NOT; keep reading)
- It moves the files from /home/autolab to /home/autograde/autolab
- It sets /home/autograde/autolab as the CWD
- It changes ownership of the /home/autograde directory recursively to the autograde user
- It forks a child process to run the instructor's autograder with limited privileges
  - The child process runs as the autograde user, but the environment is NOT updated
    - This means that running whoami prints autograde, but the environment variables are: USER: autolab and HOME: /home/autolab
  - The child process' stdout and stderr are redirected to /home/autograde/output.log
  - The child process calls Make to begin running your autograder
- It exits after Make either successfully exits or times out
The main takeaway is that the four files (your Makefile, your autograder, the submission, and the submission metadata) are in /home/autograde/autolab, and Make will be called in that directory as the autograde user.
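Because the child process keeps the autolab user's environment variables, an autograder that needs to know who it is running as should not rely on USER or HOME. The sketch below shows one way to compare the two sources on a Unix system. The function names effective_identity and environment_is_stale are hypothetical.

```python
import os
import pwd


def effective_identity() -> dict:
    """Report the user from the process's effective UID and from the environment.

    The UID-based lookup reflects the account the process really runs as
    (the same source whoami uses). USER and HOME are inherited and may
    still describe the account that launched the process.
    """
    uid = os.geteuid()
    try:
        entry = pwd.getpwuid(uid)
        user, home = entry.pw_name, entry.pw_dir
    except KeyError:
        # The UID has no password database entry (possible in some containers).
        user, home = str(uid), None
    return {
        "user": user,
        "home": home,
        "env_user": os.environ.get("USER"),
        "env_home": os.environ.get("HOME"),
    }


def environment_is_stale(identity: dict) -> bool:
    """True when USER is set but names a different account than the effective UID."""
    return identity["env_user"] is not None and identity["env_user"] != identity["user"]
```

Inside the grading container, one would expect environment_is_stale to report True for the situation described above, where whoami prints autograde but USER is autolab.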
After the job is finished
After a job finishes, Tango:
- Copies the feedback file (e.g., /docker-volumes/dev-1005-autograding_image/feedback) from the worker node to its filesystem (e.g., /opt/TangoService/Tango/courselabs/<key>-<course>-<assignment>/output/student@buffalo.edu_5_assignment_autograde.txt) via SCP
- Destroys the Docker container and the volume (e.g., /docker-volumes/dev-1005-autograding_image/) on the worker node
- Removes the job from the queue and disassociates the worker node from the job container
- Creates a temporary file combining a header (with the job history and status) and the feedback file
- Makes a POST request to Autolab's callback URL with the result file to notify Autolab that the job finished
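The step that combines a header with the feedback file can be pictured with a short sketch. The header layout used here (one "Autograder: ..." line per history entry, followed by the status) is a hypothetical format chosen for illustration, and the function name combine_header_and_feedback is likewise an assumption.

```python
def combine_header_and_feedback(history: list, status: str, feedback: str) -> str:
    """Prepend a job-history header to the autograder's feedback text.

    history is a list of short event descriptions in chronological order.
    The line format is illustrative only, not Tango's exact output.
    """
    header_lines = [f"Autograder: {entry}" for entry in history]
    header_lines.append(f"Autograder: {status}")
    return "\n".join(header_lines) + "\n" + feedback
```

The combined text is what would be sent in the POST request to the callback URL, so the submission's stored feedback shows both what happened to the job and what the autograder printed.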
Autolab assigns the feedback from Tango to the submission (in the database) and saves it to the filesystem.
Autograding is complete!
(And all of this happens in under 3 seconds for minimal autograders!)