# Scorable

Most grading stages implement the ability to aggregate scores.

## Configuration Format

There are currently three types of score aggregation implemented by the Grader. Refer to the documentation of each stage to see which types of aggregation that stage supports.

### Total-Based Scorable

Total-based scorables are stages which use the student's score and the total score to compute the final score of the stage.

```
stage:
  score: Double?                       # The total score of this stage.
  treatDenormalScore: DenormalPolicy?  # Policy when the score evaluates to NaN.
```

- `score`
  - If `null` or not specified, this stage will not contribute to the final score.
  - The score of the student submission is normalized against this value.
    - E.g. if the student received `20/30` and `score` is `60`, the final score will be `40/60`.
- `DenormalPolicy: IGNORE | FAILURE | SUCCESS`
  - `IGNORE`: Ignores this test case entirely, and treats it as if it is not present.
  - `FAILURE`: Treats this test case as if it has failed.
  - `SUCCESS`: Treats this test case as if it has passed.
  - If `null` or not specified, defaults to `IGNORE`.
  - **Note**: Some stages may hide this field. Refer to the stage documentation for more information.

#### Example 1

Given the following:

- Pipeline stage has 40 test cases
- Student scores 25/40 cases
- Stage is configured as follows:

```yaml
stage:
  score: 100
```

Then the final score is computed as follows:

```
score = 25 / 40 * 100 = 62.5
total = 100.0
```

#### Example 2 - Denormal Case

Given the following:

- Pipeline stage has 40 **disabled** test cases
- Student scores 0/0 cases
- Stage is configured as follows:

```yaml
stage:
  score: 100
  treatDenormalScore: IGNORE
```

Then the final score is computed as follows:

```
score = null
total = 100.0
```

The `null` score in this case indicates to the `Score` stage that the score of this stage should not be used when aggregating the final score.
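
The normalization and denormal handling described above can be sketched in Kotlin as follows. This is only an illustration under stated assumptions, not the Grader's actual implementation: the names `StageScore` and `totalBasedScore` are hypothetical, and mapping `FAILURE` to zero marks and `SUCCESS` to full marks is an interpretation of "as if it has failed/passed".

```kotlin
// Hypothetical names throughout; none of these are taken from the Grader's source.
enum class DenormalPolicy { IGNORE, FAILURE, SUCCESS }

// Mirrors the `score`/`total` pair shown in the examples above.
data class StageScore(val score: Double?, val total: Double)

/**
 * Normalizes `passed / count` against the configured `score`.
 * Returns null when `configuredScore` is null, i.e. the stage is not scored.
 */
fun totalBasedScore(
    passed: Int,
    count: Int,
    configuredScore: Double?,
    policy: DenormalPolicy = DenormalPolicy.IGNORE
): StageScore? {
    val total = configuredScore ?: return null
    if (count == 0) {
        // 0/0 would evaluate to NaN, so the denormal policy decides the outcome.
        // ASSUMPTION: FAILURE means zero marks and SUCCESS means full marks.
        return when (policy) {
            DenormalPolicy.IGNORE -> StageScore(null, total)
            DenormalPolicy.FAILURE -> StageScore(0.0, total)
            DenormalPolicy.SUCCESS -> StageScore(total, total)
        }
    }
    return StageScore(passed.toDouble() / count * total, total)
}
```

Under these assumptions, `totalBasedScore(25, 40, 100.0)` reproduces Example 1 (`62.5` out of `100.0`), and `totalBasedScore(0, 0, 100.0)` reproduces Example 2 (a `null` score with a total of `100.0`).
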
### Per-Element Scorable

Per-element scorables are stages which do not have a defined "total score". Instead, these stages use an initial score and a per-element score to determine the final score of the stage.

The definition of an element in this section is intentionally abstract, because different stages define an "element" differently, which may affect how this section should be interpreted. Always refer to the stage documentation for more information.

```
stage:
  scorePolicy: ScorePolicy?
```

- `scorePolicy`
  - If `null` or not specified, this stage will not contribute to the final score.

#### `ScorePolicy`

```
ScorePolicy:
  initialScore: Double
  scorePerElem: Double
  limit: Double?
```

- `initialScore`: The initial score to start with.
- `scorePerElem`: The score to add per element encountered by the stage.
- `limit`: The upper/lower bound of the score.
  - If `null` or not specified, there is no limit to how many points this score may add or deduct.

#### Notes on the Report

Since there is no "total score" available when using the per-element accumulator, the total score is instead inferred from the parameters in the `scorePolicy` field.

The total score is computed as follows:

- If both `initialScore` and `limit` are provided, the total score is `max(initialScore, limit)`.
- Otherwise (i.e. `limit` is not provided), the total score is `initialScore`.
  - This means that the baseline score is treated as the total score of the stage.

#### Constraints

- If `limit != null` and `ScorePolicy.scorePerElem < 0`, `initialScore` must be greater than or equal to `limit`.
- If `limit != null` and `ScorePolicy.scorePerElem > 0`, `initialScore` must be less than or equal to `limit`.

#### Examples

Assume we have a pipeline stage which performs static analysis, and we would like to score student submissions based on the analysis results.

1. Bounded Negative Accumulator

   One way to perform scoring is to deduct points for each issue found.
   A possible configuration is shown below:

   ```yaml
   scorePolicy:
     initialScore: 10.0
     scorePerElem: -0.25
     limit: 0.0
   ```

   In this configuration, all submissions begin with `10.0` points. For every issue found by the static analysis, `0.25` points are *deducted* (notice the negative sign). The minimum score the stage can achieve is `0.0`.

2. Unbounded Negative Accumulator

   If the static analysis tool is also provided to the student, it may be reasonable to assume the student has already run it at least once on their submission. Therefore, it may make sense to penalize issues without a limit.

   A possible configuration for this is shown below:

   ```yaml
   scorePolicy:
     initialScore: 0.0
     scorePerElem: -0.25
   ```

   In this configuration, all submissions begin with `0.0` points. For every issue found by the static analysis, `0.25` points are *deducted* (notice the negative sign). There is no minimum score set for this stage, so the score can drop below `0.0` into the negatives.

3. Bounded Positive Accumulator

   Now, let's assume that the static analyzer can also detect good coding practices, and the marking scheme needs to reward submissions for them.

   A possible configuration is shown below:

   ```yaml
   scorePolicy:
     initialScore: 0.0
     scorePerElem: 1.0
     limit: 5.0
   ```

   In this configuration, all submissions begin with `0.0` points. For every good practice found by static analysis, `1.0` is added to the score. The maximum score a submission can get is `5.0`.

4. Unbounded Positive Accumulator

   The accumulator can also be set to have no limit on how many points to add.

   A possible configuration is shown below:

   ```yaml
   scorePolicy:
     initialScore: 0.0
     scorePerElem: 1.0
   ```

   In this configuration, all submissions begin with `0.0` points. For every good practice found by static analysis, `1.0` is added to the score. There is no upper limit on how many points can be added.
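
The four accumulator configurations above, together with the constraints and the inferred total, can be sketched as a small Kotlin model. This is a minimal sketch rather than the Grader's code: the class mirrors the config keys, but `scoreFor` and `total` are hypothetical names, and treating `limit` as a floor for negative accumulators and a ceiling for positive ones is inferred from the examples.

```kotlin
// Hypothetical model of the `scorePolicy` block; field names mirror the config keys.
data class ScorePolicy(
    val initialScore: Double,
    val scorePerElem: Double,
    val limit: Double? = null
) {
    init {
        // Constraints from the "Constraints" section.
        if (limit != null && scorePerElem < 0) {
            require(initialScore >= limit) { "initialScore must be >= limit" }
        }
        if (limit != null && scorePerElem > 0) {
            require(initialScore <= limit) { "initialScore must be <= limit" }
        }
    }

    /** Score after the stage has encountered `elements` elements. */
    fun scoreFor(elements: Int): Double {
        val raw = initialScore + scorePerElem * elements
        val bound = limit ?: return raw // unbounded accumulator
        // ASSUMPTION: the limit is a floor when deducting, a ceiling when adding.
        return when {
            scorePerElem < 0 -> maxOf(raw, bound)
            scorePerElem > 0 -> minOf(raw, bound)
            else -> raw
        }
    }

    /** Total reported for the stage, following "Notes on the Report". */
    val total: Double
        get() {
            val bound = limit ?: return initialScore
            return maxOf(initialScore, bound)
        }
}
```

For instance, under these assumptions `ScorePolicy(10.0, -0.25, 0.0).scoreFor(8)` gives `8.0`, while `scoreFor(100)` is clamped to `0.0`; the reported total is `10.0` in both cases.
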
### Weighted Scorable

Weighted scorables are similar to per-element scorables, with the added benefit of using predicates to change the score of individual test cases.

```
stage:
  scoreWeighting: ScoreWeighting?
```

- `scoreWeighting`
  - If `null` or not specified, this stage will not contribute to the final score.

#### `ScoreWeighting`

```
ScoreWeighting:
  default: Double
  limit: Double?
  overrides: [Override]?
```

- `default`: The default score of each element.
- `limit`: The upper bound of the score. A lower bound is currently unsupported.
- `overrides`: A list of score overrides, each applied to the elements matched by the predicate supplied in the override.

The score for an element is determined by the logic shown in the following pseudocode:

```kotlin
predicates.firstOrNull { it.test(targetObj) }?.score ?: default
```

In other words, predicates are evaluated sequentially against the target object (defined by the stage). If a matching predicate is found, its overriding score is used; otherwise the default score is used.

#### `Override`

```
Override:
  score: Double
  joinPolicy: JoinPolicy?
  # Predicates...
```

- `score`: The score to use for the element if the predicate is matched.
- `joinPolicy`: When multiple predicates are specified, whether to join the predicates using an `OR` or an `AND` operation.

In addition to the fields specified above, each stage has its own fields which act as predicates. Refer to the stage documentation for more information.

### Predicates

There are currently 4 types of built-in predicates available, implemented using 3 types of comparison operators.

In general, predicates are implemented with the following structure:

```
Predicate:
  value: T
  op: Op
```

- `value` is the target data to compare against.
- `op` is the operation used for comparison.

The above predicate is equivalent to `$value $op $field`, where `$field` is some data defined by the stage which can be compared against.

#### `EqualOp`

Compares the equality of two values.
```
EqualOp: [EQ | NOT_EQ]
```

- `EQ` and `NOT_EQ` represent "equal" (==) and "not equal" (!=) respectively.

This operation is used by `Predicate.Bool`.

```
Predicate.Bool:
  value: Boolean
  op: EqualOp
```

##### Example

To write `true == $field`:

```
predicate:
  value: true
  op: EQ
```

#### `CompareOp`

Compares the ordering of two values.

```
CompareOp: [EQ, NOT_EQ, LT, LT_EQ, GT, GT_EQ]
```

- `EQ` and `NOT_EQ` represent "equal" (==) and "not equal" (!=) respectively.
- `LT` and `LT_EQ` represent "less than" (<) and "less than or equal" (<=) respectively.
- `GT` and `GT_EQ` represent "greater than" (>) and "greater than or equal" (>=) respectively.

This operation is used by `Predicate.Integral` and `Predicate.FP`.

```
Predicate.Integral:
  value: Long
  op: CompareOp

Predicate.FP:
  value: Double
  op: CompareOp
```

##### Example

To write `1 > $field`:

```
predicate:
  value: 1
  op: GT
```

To write `0.0 <= $field`:

```
predicate:
  value: 0.0
  op: LT_EQ
```

#### `StrEqualOp`

Compares the equality of two string-like objects. All arguments are stringified before performing the comparison.

```
StrEqualOp: [EQ, NOT_EQ, CASE_IGNORE_EQ, CASE_IGNORE_NOT_EQ, REGEX_EQ, REGEX_NOT_EQ]
```

- `EQ` and `NOT_EQ` represent "equal" (==) and "not equal" (!=) respectively.
- `CASE_IGNORE_EQ` and `CASE_IGNORE_NOT_EQ` represent "equal" (==) and "not equal" (!=) respectively, after stringifying both operands and ignoring case.
- `REGEX_EQ` and `REGEX_NOT_EQ` represent "equal" (==) and "not equal" (!=) respectively, after stringifying both operands and using the user-provided value as a regular expression.

This operation is used by `Predicate.CharSeq`.
```
Predicate.CharSeq:
  value: String
  op: StrEqualOp
```

##### Example

To write `"abc" == $field`:

```
predicate:
  value: abc
  op: EQ
```

To write `"abc" == $field`, ignoring case:

```
predicate:
  value: abc
  op: CASE_IGNORE_EQ
```

To write `"abc" == $field`, treating `"abc"` as a regular expression:

```
predicate:
  value: abc
  op: REGEX_EQ
```

## Report

Regardless of which scorable a stage implements, all scorable stages share the same report format.

```
stage:
  - score:
      score: Double?  # Score achieved by the submission
      total: Double   # Total score of the stage
```

- The top-level `score` may be `null`, which indicates that the user has disabled scoring for the pipeline stage.
- If `score.score` is `null`, the stage was unable to calculate a meaningful score. This can be due to unexpected failures in stage execution, `DenormalPolicy.IGNORE`, or other factors.
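
To illustrate how a consumer of these reports might treat the two kinds of `null`, here is a rough Kotlin sketch. It is not the `Score` stage's actual logic: `ScoreReport` and `aggregate` are hypothetical names, and the sketch assumes that a stage without a usable score contributes to neither the achieved score nor the total.

```kotlin
// Hypothetical mirror of one stage's `score` entry in the report.
data class ScoreReport(val score: Double?, val total: Double)

/**
 * Sums only the reports that carry a usable score.
 * ASSUMPTION: a null report (scoring disabled) and a report whose `score`
 * is null (no meaningful score) are both left out of the score AND the total.
 */
fun aggregate(reports: List<ScoreReport?>): ScoreReport {
    val usable = reports.filterNotNull().filter { it.score != null }
    return ScoreReport(
        score = usable.mapNotNull { it.score }.sum(),
        total = usable.sumOf { it.total }
    )
}
```

With this reading, aggregating a `62.5/100.0` stage, a stage reporting a `null` score out of `100.0`, a stage with scoring disabled, and an `8.0/10.0` stage gives `70.5/110.0`.
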