Tutorial
Preliminaries
Ensure the aga CLI is available in your environment (which aga) and updated to the current version (aga --version). If not, you can install it with pip or a tool like poetry.
Getting Started
We’re going to write a simple autograder for a basic problem: implementing a function to square an integer.
Whenever you write an aga autograder, you start by writing a reference implementation, which aga calls the golden solution. The library is based on the idea that reference implementations have uniquely beneficial properties for autograding homework; see motivation. So, here’s our implementation:
def square(x: int) -> int:
    """Square x."""
    return x * x
The type annotations and docstring are just there because they’re good practice; as of now, aga does nothing with them. You might use the docstring for, say, the text of the problem that you’re giving to students, so it’s there for easy reference.
Now we need to tell aga to turn this into a problem. We do that with the problem decorator:
from aga import problem

@problem()
def square(x: int) -> int:
    """Square x."""
    return x * x
Aga’s API is based around decorators; if you’re not familiar with them, I suggest finding at least a brief introduction. problem will always be the first decorator you apply to any golden solution, so it sits directly above the function definition.
Now if we save this as square.py, we could run aga gen square.py in that directory, which would generate a problem.zip file. However, we’re not quite done: we haven’t given aga any test inputs yet! Let’s do that:
from aga import problem, test_case

@test_case(-2)
@test_case(2)
@problem()
def square(x: int) -> int:
    """Square x."""
    return x * x
Now re-run aga gen square.py and upload the resultant file to Gradescope.
There are a few things to know about this behavior.
First, there must be exactly one problem present in square.py. This is a limitation that will hopefully be relaxed in the future.
Second, while the student can upload any number of files, precisely one of them must contain a Python object matching the name of the reference solution; in this case, square (note that the reference solution object’s name is used even if another name is assigned to the problem itself via the name argument to the decorator). Otherwise, the solution will be rejected. It’s extremely important to communicate this restriction to students.
Third, each test case will be run against the student’s submission and the golden solution. If the outputs differ, the test will be marked as failing. By default, each test case has equal weight, so with the two test cases above each one is worth half of the problem’s total score. Modifying this default is discussed in Customizing Test Case Score.
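The comparison described in the third point can be pictured with a few lines of plain Python. This is only a sketch of the idea, not aga’s implementation; grade and student_square are hypothetical names chosen for illustration, and the submission is deliberately buggy.

```python
from typing import Any, Callable, Iterable, List


def square(x: int) -> int:
    """Golden solution."""
    return x * x


def student_square(x: int) -> int:
    """Hypothetical submission: wrong for negative inputs."""
    return x * abs(x)


def grade(
    golden: Callable[..., Any],
    submission: Callable[..., Any],
    inputs: Iterable[Any],
) -> List[bool]:
    """Return one pass/fail flag per input by comparing both outputs."""
    return [golden(x) == submission(x) for x in inputs]


results = grade(square, student_square, [2, -2])
print(results)  # [True, False]
```

The test input 2 passes because both functions return 4, while -2 fails because the submission returns -4.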
You can use a similar syntax for multiple arguments, or keyword arguments:
@test_case(2, 1)  # defaults work as expected
@test_case(2, 1, sign=False)
@test_case(-3, 4, sign=False)
@problem()
def add_or_subtract(x: int, y: int, sign: bool = True) -> int:
    """If sign, add x and y; otherwise, subtract them."""
    if sign:
        return x + y
    else:
        return x - y
As a final note, you often won’t want to upload the autograder to Gradescope just to see the output that’s given to students. You can use the aga run command to manually check a student submission on the command line.
Testing the Golden Solution
We still have a single point of failure: the golden solution. Golden tests are aga’s main tool for testing the golden solution. They work like simple unit tests; you declare an input and expected output, which aga tests against your golden solution. We expect that any cases you want to use to test your golden solution will also be good test cases for student submissions, hence the following syntax:
@test_case(-2, aga_expect=4)
@test_case(2, aga_expect=4)
@problem()
def square(x: int) -> int:
    """Square x."""
    return x * x
Note that we prefix all keyword arguments to the test_case decorator with aga_, so that you can still declare test inputs for problems with actual keyword arguments.
aga can also check the golden solution’s standard output: just add aga_expect_stdout to the test case(s). The value of aga_expect_stdout is either a str or an Iterable of str.
When a str is given, it is compared against the entire captured output. When an Iterable is given, the captured output is split using splitlines and compared line by line, meaning each string in the Iterable should contain NO \n characters.
The following examples show both forms.
@test_case(10, 20, aga_expect_stdout="the result is 30\n", aga_expect=30)
@problem()
def add(a: int, b: int) -> int:
    """Add two numbers."""
    print("the result is", a + b)
    return a + b

@test_case("Bob", aga_expect_stdout=["What is your name? ", "Hello, world! Bob!"])
@problem(script=True)
def hello_world() -> None:
    """Print 'Hello, world!'."""
    name = input("What is your name? ")
    print(f"Hello, world! {name}!")
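The difference between the two aga_expect_stdout forms can be illustrated without aga at all. The sketch below captures what add prints using the standard library and compares it both ways; stdout_matches is a hypothetical helper, and how aga itself captures output (including input() prompts) is not modelled here.

```python
import contextlib
import io
from typing import Iterable, Union


def add(a: int, b: int) -> int:
    """Add two numbers, printing the result as a side effect."""
    print("the result is", a + b)
    return a + b


def stdout_matches(captured: str, expected: Union[str, Iterable[str]]) -> bool:
    """Compare captured output to a whole string or to a sequence of lines."""
    if isinstance(expected, str):
        # A str must match the entire captured output, newlines included.
        return captured == expected
    # An iterable is compared line by line, so its items carry no "\n".
    return captured.splitlines() == list(expected)


buffer = io.StringIO()
with contextlib.redirect_stdout(buffer):
    add(10, 20)
captured = buffer.getvalue()

print(stdout_matches(captured, "the result is 30\n"))  # True
print(stdout_matches(captured, ["the result is 30"]))  # True
print(stdout_matches(captured, "the result is 30"))    # False: missing "\n"
```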
If you run aga check square, it will run all golden tests (i.e., all test cases with declared aga_expect), displaying any which fail. This also happens by default when you run aga gen square.py, so you don’t accidentally upload a golden solution which fails unit testing.
Customizing Test Case Score
By default, aga takes the problem’s total score (configured on Gradescope) and divides it evenly among the test cases. This division is weighted by a parameter, aga_weight, of test_case, which defaults to 1. If our total score is 20, and we want the 2 test case to be worth 15 and the -2 test case to be worth 5, we can do this:
@test_case(-2, aga_expect=4)
@test_case(2, aga_expect=4, aga_weight=3)
@problem()
def square(x: int) -> int:
    """Square x."""
    return x * x
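The arithmetic behind this example is a plain proportional split. The helper below, split_score, is a hypothetical name used only to check the numbers; it ignores aga_value and the other rules covered in Determining Score.

```python
from typing import List


def split_score(total: float, weights: List[float]) -> List[float]:
    """Divide total among test cases in proportion to their weights."""
    weight_sum = sum(weights)
    return [total * w / weight_sum for w in weights]


# Weight 1 for the -2 case and weight 3 for the 2 case, 20 points in total.
print(split_score(20, [1, 3]))  # [5.0, 15.0]
```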
It is also possible to directly control the value of test cases:
@test_case(-2, aga_expect=4)  # will get 100% of (total - 15) points
@test_case(2, aga_expect=4, aga_weight=0, aga_value=15)
@problem()
def square(x: int) -> int:
    """Square x."""
    return x * x
However, this is not recommended, because it can lead to strange results if there is incongruity between the values assigned via aga and the total score assigned via Gradescope.
For complete semantics of score determination, see Determining Score.
Generating Test Cases
You can check out examples/inputs_for_test_cases.py in the GitHub repo for more complete examples and comparisons.
If we want many test cases, we probably don’t want to enumerate all of them by hand. Aga therefore provides the test_cases decorator, which makes it easy to collect Python iterables (lists, range, etc.) into test cases.
Let’s start by testing an arbitrary set of inputs:
from aga import problem, test_cases

@test_cases(-3, -2, 0, 1, 2, 100)
@problem()
def square(x: int) -> int:
    """Square x."""
    return x * x
This will generate six test cases, one for each argument. Test cases generated like this must share configuration, so while you can pass e.g. aga_weight to the decorator, it will cause each test case to have that weight, rather than dividing the weight among the test cases.
The decorator @test_cases(-3, -2, 0, 1, 2, 100) is equivalent to
from aga import param, test_cases, problem

@test_cases(param(-3), param(-2), param(0), param(1), param(2), param(100))
@problem()
def square(x: int) -> int:
    """Square x."""
    return x * x
The directive param is used to wrap the parameters to a function. Each param object is treated as one test case.
Similarly, we can generate tests for all inputs from -5 to 10:
@test_cases(*range(-5, 11))
@problem()
def square(x: int) -> int:
    """Square x."""
    return x * x
This will generate 16 test cases, one for each value in the range.
Or, we can generate tests programmatically, say from a file:
from typing import Iterator

def inputs() -> Iterator[int]:
    # Read one integer per line from a local file.
    with open("INPUTS.txt", "r", encoding="UTF-8") as f:
        for s in f:
            yield int(s.strip())

@test_cases(*inputs())
@problem()
def square(x: int) -> int:
    """Square x."""
    return x * x
The generation happens when you run aga gen on your local machine, so you can rely on resources (network, files, etc.) not available in the Gradescope environment.
Multiple Arguments
Basics of Multiple Arguments
Say we want to generate inputs for multiple arguments (or keyword arguments), e.g. for a difference function. We can use the natural syntax:
@test_cases([(-3, 2), (-2, 1), (0, 0)], aga_params=True)
@problem()
def difference(x: int, y: int) -> int:
    """Compute x - y."""
    return x - y
There are several ways to specify a batch of test cases: aga_params, no flag, aga_singular_params, aga_product, and aga_zip.
aga_params
will only take one iterable object, and each element in the iterable object will be unpacked when applied to the function. The example above will generate 3 tests: difference(-3, 2), difference(-2, 1), and difference(0, 0). In the case where you want to add keyword arguments, you can use the param directive.
from aga import problem, test_cases, param

@test_cases([param(-3, y=2), param(-2, y=1), param(0, y=0)], aga_params=True)
@problem()
def difference(x: int, y: int) -> int:
    """Compute x - y."""
    return x - y
which is equivalent to
from aga import problem, test_cases, param

@test_cases([(-3, 2), (-2, 1), (0, 0)], aga_params=True)
@problem()
def difference(x: int, y: int) -> int:
    """Compute x - y."""
    return x - y
<no-flag>
Note that this is different from the one above with the aga_params flag. The example below will generate 3 tests as well, but they will be difference((-3, 2)), difference((-2, 1)), and difference((0, 0)).
@test_cases((-3, 2), (-2, 1), (0, 0))
@problem()
def difference(tp) -> int:
    """Compute x - y."""
    x, y = tp
    return x - y
aga_singular_params
works similarly to aga_params, except that each element is passed to the function as a single argument instead of being unpacked. The following code is equivalent to difference((-3, 2)), difference((-2, 1)), and difference((0, 0)). (Note that the aga_params flag is not needed.)
from typing import Tuple

from aga import problem, test_cases, param

@test_cases([(-3, 2), (-2, 1), (0, 0)], aga_singular_params=True)
@problem()
def difference(tp: Tuple[int, int]) -> int:
    """Compute x - y."""
    x, y = tp
    return x - y
It is useful when you have an iterable of things where each single thing is going to serve as a parameter.
from aga import problem, test_cases, param

@test_cases(range(5), aga_singular_params=True)
@problem()
def square(x: int) -> int:
    """Square x."""
    return x * x
The @test_cases(range(5), aga_singular_params=True) is equivalent to expanding the generator in the no-flag version, @test_cases(*range(5)). Note that @test_cases(range(5), aga_params=True) is not valid.
aga_product
will take the Cartesian product of all the arguments. In the example below, there will be 15 test cases, one for each combination of the arguments.
@test_cases([-5, 0, 1, 3, 4], [-1, 0, 2], aga_product=True)
@problem()
def difference(x: int, y: int) -> int:
    """Compute x - y."""
    return x - y
aga_zip
will take the zip of all the arguments. This short-circuits when the shortest iterable ends, so the example below will generate three test cases: (-5, -1), (0, 0), and (1, 2).
@test_cases([-5, 0, 1, 3, 4], [-1, 0, 2], aga_zip=True)
@problem()
def difference(x: int, y: int) -> int:
    """Compute x - y."""
    return x - y
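To compare the modes side by side, here is a rough sketch in plain Python of how each one might turn its arguments into positional-argument tuples. The expand function and its mode strings are hypothetical and only mirror the behaviour described above; keyword arguments via param are not modelled, and this is not how aga is implemented.

```python
import itertools
from typing import Any, List, Tuple


def expand(*args: Any, mode: str = "none") -> List[Tuple[Any, ...]]:
    """Return the positional-argument tuple for each generated test case."""
    if mode == "none":
        # Every argument is the single input of one test case.
        return [(a,) for a in args]
    if mode == "params":
        # One iterable; each element is unpacked into the call.
        (cases,) = args
        return [tuple(c) for c in cases]
    if mode == "singular_params":
        # One iterable; each element is passed as a single argument.
        (cases,) = args
        return [(c,) for c in cases]
    if mode == "product":
        return list(itertools.product(*args))
    if mode == "zip":
        return list(zip(*args))
    raise ValueError(f"unknown mode: {mode}")


pairs = [(-3, 2), (-2, 1), (0, 0)]
xs, ys = [-5, 0, 1, 3, 4], [-1, 0, 2]

print(expand(pairs, mode="params"))           # [(-3, 2), (-2, 1), (0, 0)]
print(expand(pairs, mode="singular_params"))  # [((-3, 2),), ((-2, 1),), ((0, 0),)]
print(expand(*pairs))                         # [((-3, 2),), ((-2, 1),), ((0, 0),)]
print(len(expand(xs, ys, mode="product")))    # 15
print(expand(xs, ys, mode="zip"))             # [(-5, -1), (0, 0), (1, 2)]
```

Each tuple in the result stands for one call, so (-3, 2) means difference(-3, 2) while ((-3, 2),) means difference((-3, 2)).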
Shorthands
You may find typing out aga_product and the like tedious. In that case, you can use the shorthands provided. There are two ways to write it more simply.
from aga import problem, test_cases

@test_cases([-5, 0, 1, 3, 4], [-1, 0, 2])
@problem()
def fn() -> None:
    # this is the same as @test_cases(...)
    ...

@test_cases.params([-5, 0, 1, 3, 4], [-1, 0, 2])
@problem()
def fn() -> None:
    # this is the same as @test_cases(..., aga_params=True)
    ...

@test_cases.product([-5, 0, 1, 3, 4], [-1, 0, 2])
@problem()
def fn() -> None:
    # this is the same as @test_cases(..., aga_product=True)
    ...

@test_cases.zip([-5, 0, 1, 3, 4], [-1, 0, 2])
@problem()
def fn() -> None:
    # this is the same as @test_cases(..., aga_zip=True)
    ...

@test_cases.singular_params(([-5, 0, 1, 3, 4], [-1, 0, 2]))
@problem()
def fn() -> None:
    # this is the same as @test_cases(..., aga_singular_params=True)
    ...
from aga import problem, test_cases_params, test_cases_product, test_cases_zip

@test_cases_params([-5, 0, 1, 3, 4], [-1, 0, 2])
@problem()
def fn() -> None:
    # this is the same as @test_cases(..., aga_params=True)
    ...

@test_cases_product([-5, 0, 1, 3, 4], [-1, 0, 2])
@problem()
def fn() -> None:
    # this is the same as @test_cases(..., aga_product=True)
    ...

@test_cases_zip([-5, 0, 1, 3, 4], [-1, 0, 2])
@problem()
def fn() -> None:
    # this is the same as @test_cases(..., aga_zip=True)
    ...
Note on aga_* keyword arguments
At this point, you might wonder what the aga_* keyword arguments accept. The good news is that you can pass either a singleton or an iterable. When a singleton is given, aga applies it to every test case. When an iterable is given, the number of elements must match the number of test cases, and aga will check that.
For example, if you want to set a series of tests to hidden and define a bunch of golden outputs for them, you can do
@test_cases(1, 2, 3, aga_hidden=True, aga_expect=[1, 4, 9])
@problem()
def square(x: int) -> int:
    """Square x."""
    return x * x
By contrast, the following is invalid,
@test_cases(1, 2, 3, aga_expect=[1, 1, 4, 4, 9, 9])
since the numbers don’t match.
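The singleton-versus-iterable rule amounts to a broadcast with a length check. The sketch below is an assumption about the general idea, not aga’s code: broadcast is a hypothetical helper, and it treats only lists and tuples as iterables, which may differ from how aga decides.

```python
from typing import Any, List


def broadcast(value: Any, n_cases: int) -> List[Any]:
    """Give every test case its own copy of an aga_*-style setting."""
    if isinstance(value, (list, tuple)):
        # An iterable must supply exactly one value per test case.
        if len(value) != n_cases:
            raise ValueError(f"expected {n_cases} values, got {len(value)}")
        return list(value)
    # A singleton applies to every test case.
    return [value] * n_cases


print(broadcast(True, 3))       # [True, True, True]
print(broadcast([1, 4, 9], 3))  # [1, 4, 9]

try:
    broadcast([1, 1, 4, 4, 9, 9], 3)
except ValueError as err:
    print(err)  # expected 3 values, got 6
```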
Checking Scripts
Sometimes, submissions look like Python scripts, meant to be run from the command line, as opposed to importable libraries. To test a script, provide the script=True argument to the problem decorator:
@test_case("Alice", "Bob")
@test_case("world", "me")
@problem(script=True)
def hello_name() -> None:
    """A simple interactive script."""
    listener = input("Listener? ")
    print(f"Hello, {listener}.")
    speaker = input("Speaker? ")
    print(f"I'm {speaker}.")
This has three implications:
Aga will load the student submission as a script, instead of looking for a function with a matching name.
Aga will compare the standard output of the student submission to the standard output of the golden solution.
Aga will interpret the arguments to test_case as mocked outputs of the built-in input() function. For example, for the "Alice", "Bob" test case, aga will expect this standard output:
Hello, Alice.
I'm Bob.
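You can reproduce this behaviour locally with the standard library alone, which is handy for sanity-checking a script before handing it to aga. In the sketch below, run_with_inputs is a hypothetical helper; it replaces input() with a mock that returns the given answers in order, so the prompts themselves are never printed. Whether aga mocks input() in the same way is an assumption.

```python
import contextlib
import io
from typing import Callable
from unittest import mock


def hello_name() -> None:
    """A simple interactive script."""
    listener = input("Listener? ")
    print(f"Hello, {listener}.")
    speaker = input("Speaker? ")
    print(f"I'm {speaker}.")


def run_with_inputs(script: Callable[[], None], *answers: str) -> str:
    """Run script with input() mocked and return everything it printed."""
    buffer = io.StringIO()
    # Each call to input() returns the next answer, in order.
    with mock.patch("builtins.input", side_effect=list(answers)):
        with contextlib.redirect_stdout(buffer):
            script()
    return buffer.getvalue()


output = run_with_inputs(hello_name, "Alice", "Bob")
print(output, end="")
```

Running it prints the same two lines shown above.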
Creating Pipelines
When testing against a class or an object, you can create a pipeline of functions to be called. This is useful if you want to test the same object using a different sequence of actions.
A pipeline is a sequence of functions (each sometimes referred to as a process). Each function accepts two inputs, the object under test and the result returned by the preceding function, and outputs a result. The pipeline will be run on both the golden solution and the student’s solution, and the output results will be compared individually. You can create a pipeline from any of the following directives.
from aga import test_case, param, test_cases, problem
from aga.core.utils import initializer

def fn1(obj, previous_result):
    ...

def fn2(obj, previous_result):
    ...

@test_case.pipeline(initializer, fn1, fn2)
@test_cases(param.pipeline(initializer, fn1, fn2))
@problem()
class TestProblem:
    ...
The library provides several useful functions. They can be imported from aga.core.utils, like the initializer function above. One can use initializer to initialize the class under testing. Note that if you want to initialize the class with arguments, you can ONLY use initializer.
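As a mental model for pipelines, the sketch below runs the same list of processes on a golden object and on a student object and compares the results step by step. It is not aga’s implementation: run_pipeline, Counter, StudentCounter, and the three processes are hypothetical, and the objects are constructed directly rather than through initializer.

```python
from typing import Any, Callable, List, Sequence

Process = Callable[[Any, Any], Any]


class Counter:
    """Hypothetical golden solution."""

    def __init__(self) -> None:
        self.count = 0

    def increment(self) -> int:
        self.count += 1
        return self.count


class StudentCounter(Counter):
    """Hypothetical submission that wrongly counts in steps of two."""

    def increment(self) -> int:
        self.count += 2
        return self.count


def read_count(obj: Any, previous_result: Any) -> int:
    return obj.count


def increment(obj: Any, previous_result: Any) -> int:
    return obj.increment()


def add_previous(obj: Any, previous_result: Any) -> int:
    # Uses the result handed over from the preceding process.
    return previous_result + obj.count


def run_pipeline(obj: Any, processes: Sequence[Process]) -> List[Any]:
    """Call each process with the object and the previous result."""
    results: List[Any] = []
    previous: Any = None
    for process in processes:
        previous = process(obj, previous)
        results.append(previous)
    return results


steps = [read_count, increment, add_previous]
golden = run_pipeline(Counter(), steps)
student = run_pipeline(StudentCounter(), steps)
print(golden)   # [0, 1, 2]
print(student)  # [0, 2, 4]
print([g == s for g, s in zip(golden, student)])  # [True, False, False]
```

Comparing results individually, as described above, shows exactly which step of the sequence first diverges.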
You can use the following linked list code as an example. It will generate a test case of multiple actions and outputs.
from __future__ import annotations

from aga import test_case, problem
from aga.core.utils import initializer, MethodCallerFactory, PropertyGetterFactory

prepend = MethodCallerFactory("prepend")
display = MethodCallerFactory("display")
pop = MethodCallerFactory("pop")
get_prop = PropertyGetterFactory()

# Each key is one action in the pipeline; each value is its expected result.
actions_and_outputs = {
    initializer: None,
    prepend(10): None,
    display(): None,
    prepend(20): None,
    display(): None,
    prepend(30): None,
    display(): None,
    get_prop("first.value"): 30,
    get_prop("first", "next", "value"): 20,
    get_prop("first", ".next", ".value"): 20,
    get_prop(".first", "next", "value"): 20,
    pop(): 30,
    pop(): 20,
    pop(): 10,
}

class Node:
    """A node in a linked list."""

    def __init__(self, value: int, next_node: Node | None = None) -> None:
        self.value = value
        self.next = next_node

@test_case.pipeline(
    *actions_and_outputs.keys(),
    aga_expect_stdout="< 10 >\n< 20 10 >\n< 30 20 10 >\n",
    aga_expect=list(actions_and_outputs.values()),
)
@problem()
class LL:
    """A linked list for testing."""

    def __init__(self) -> None:
        self.first: Node | None = None

    def __repr__(self) -> str:
        """Return a string representation of the list."""
        return f"< {self._chain_nodes(self.first)}>"

    def _chain_nodes(self, node: Node | None) -> str:
        if node is None:
            return ""
        else:
            return f"{node.value} {self._chain_nodes(node.next)}"

    def display(self) -> None:
        """Print the list."""
        print(self)

    def prepend(self, value: int) -> None:
        """Add a new element to the front of the list."""
        self.first = Node(value, self.first)

    def pop(self) -> int:
        """Remove the first element from the list and return it."""
        if self.first is None:
            raise IndexError("Cannot pop from an empty list")
        value = self.first.value
        self.first = self.first.next
        return value
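The get_prop calls above accept attribute names either as one dotted string or as separate parts, with optional leading dots. One plausible way to resolve such a path is sketched below; get_path is a hypothetical stand-in written for illustration and is only a guess at what PropertyGetterFactory does, based on the calls shown in the example.

```python
from functools import reduce
from types import SimpleNamespace
from typing import Any


def get_path(obj: Any, *parts: str) -> Any:
    """Follow a chain of attribute names such as ("first", ".next", "value")."""
    # Split dotted parts and drop the empty names produced by leading dots.
    names = [name for part in parts for name in part.split(".") if name]
    return reduce(getattr, names, obj)


# A stand-in for the linked list after prepending 20 and then 30.
second = SimpleNamespace(value=20, next=None)
linked = SimpleNamespace(first=SimpleNamespace(value=30, next=second))

print(get_path(linked, "first.value"))               # 30
print(get_path(linked, "first", "next", "value"))    # 20
print(get_path(linked, ".first", ".next", "value"))  # 20
```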