proc freq data = ratings ;
   tables Rater1 * Rater2 / agree;
run;
Statistical significance. To get p-values for kappa and weighted kappa, use the statement:
test kappa wtkap ;
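For intuition, the normal-approximation test behind those p-values can be reproduced outside SAS. The following Python sketch (the function name kappa_z_test is ours, not SAS's) plugs in the kappa and ASE-under-H0 values that appear in Example 1's output below:

```python
from math import erf, sqrt

def kappa_z_test(kappa, ase0):
    """Normal-approximation test of H0: kappa = 0,
    given kappa and its asymptotic standard error under H0."""
    z = kappa / ase0
    p_one = 0.5 * (1.0 - erf(z / sqrt(2.0)))   # upper-tail probability
    return z, p_one, 2.0 * p_one

# kappa and ASE under H0 from Example 1 (rater1 vs. rater2)
z, p_one, p_two = kappa_z_test(0.4842, 0.1484)
# z is about 3.26 and the two-sided p about 0.0011,
# agreeing with the SAS output up to rounding
```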
If your ratings are character variables, such as Lo, Med, and Hi, SAS will assign numerical weights based on alphabetical order, like:
Hi = 1
Lo = 2
Med = 3
If the alphabetical order is different than the true order of the categories, weighted kappa will be incorrectly calculated. To avoid this, either (1) recode the character values to numbers that reflect the true ordering of categories, or (2) use a format and specify the order=formatted option for Proc freq (see Example 2).
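To see how much the ordering matters, here is a Python sketch (the helper weighted_kappa is ours; it uses linear Cicchetti-Allison weights and the rater1-by-rater2 frequencies from Example 2 below). Evaluating the same table with the categories in the true order Lo < Med < Hi versus the alphabetical order Hi < Lo < Med changes weighted kappa, even though simple kappa would be unaffected:

```python
def weighted_kappa(table):
    """Linear-weighted kappa for a square frequency table whose
    rows/columns are listed in the assumed category order."""
    k = len(table)
    n = sum(sum(row) for row in table)
    rows = [sum(row) for row in table]
    cols = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # linear agreement weights: 1 on the diagonal, shrinking with distance
    w = [[1.0 - abs(i - j) / (k - 1) for j in range(k)] for i in range(k)]
    po = sum(w[i][j] * table[i][j] for i in range(k) for j in range(k)) / n
    pe = sum(w[i][j] * rows[i] * cols[j]
             for i in range(k) for j in range(k)) / n**2
    return (po - pe) / (1.0 - pe)

# Example 2 frequencies with rows/columns in the true order Lo < Med < Hi
true_order = [[0,  0,  0],
              [5, 16,  3],
              [8, 12, 28]]
# the same table with rows/columns in alphabetical order Hi < Lo < Med
alpha_order = [[28, 8, 12],
               [0,  0,  0],
               [3,  5, 16]]

print(round(weighted_kappa(true_order), 4))   # 0.2895 (correct)
print(round(weighted_kappa(alpha_order), 4))  # 0.3944 (wrong ordering)
```

These two values match the weighted kappas SAS reports in Examples 2a and 2b below.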
If one rater never uses a category that another rater does, the crossclassification table is non-square and SAS will not calculate kappa. This is fixed by adding pseudo-observations, which supply the unused category(ies), but which are given a very small weight. This makes SAS process the table as square and calculate kappa. See Example 1 and Example 2 below.
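The effect of the pseudo-observations can be checked outside SAS. In this Python sketch (the helper simple_kappa is ours), rater 2 never uses category 1, so the rater1-by-rater2 table from Example 1 has a column of zeros; padding each diagonal cell with a negligible weight makes the table formally square while leaving kappa essentially unchanged:

```python
def simple_kappa(table):
    """Cohen's simple (unweighted) kappa for a square frequency table."""
    k = len(table)
    n = sum(sum(row) for row in table)
    rows = [sum(row) for row in table]
    cols = [sum(table[i][j] for i in range(k)) for j in range(k)]
    po = sum(table[i][i] for i in range(k)) / n          # observed agreement
    pe = sum(rows[i] * cols[i] for i in range(k)) / n**2 # chance agreement
    return (po - pe) / (1.0 - pe)

# rater1 x rater2 counts from Example 1; rater 2 never uses category 1
table = [[0, 4, 1],
         [0, 8, 0],
         [0, 1, 5]]
# add a pseudo-observation with negligible weight to each diagonal cell
for i in range(len(table)):
    table[i][i] += 1e-10

print(round(simple_kappa(table), 4))   # 0.4842, matching the SAS output
```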
Note: this is just an example. The N is too small to produce realistic standard error estimates, confidence limits, or p-values for kappa and weighted kappa.
The code for the example is as follows:
/***** Example 1: Calculate Kappa from Raw Data *****/

* input ratings by three raters ;
data raw ;
   infile datalines ;
   input rater1 rater2 rater3;
   datalines;
1 2 1
1 2 1
1 2 1
1 2 2
1 3 2
2 2 2
2 2 1
2 2 2
2 2 2
2 2 2
2 2 2
2 2 1
2 2 2
3 3 2
3 2 2
3 3 2
3 3 1
3 3 2
3 3 2
;
run;

*------------------------------------------------------------*;
* The above would produce non-square tables because Rater 2  *;
* doesn't use category 1 and Rater 3 doesn't use category 3. *;
* The next 3 data steps fix this.                            *;
*------------------------------------------------------------*;

* step 1: give all current observations a weight of 1 ;
data raw ;
   set raw ;
   wgt = 1 ;
run;

* step 2: make pseudo-records ;
data pseudo ;
   infile datalines ;
   wgt = .0000000001;
   input rater1 rater2 rater3 ;
   datalines;
1 1 1
2 2 2
3 3 3
;
run;

* step 3: concatenate the original data and pseudo-observations ;
data both ;
   set raw pseudo ;
run;

* calculate kappa and weighted kappa between all pairs of raters ;
title "Example 1: Raw Data";
proc freq data = both ;
   weight wgt ;
   tables rater1 * (rater2 rater3) / norow nocol agree ;
   tables rater2 * rater3 / norow nocol agree ;
   * include significance tests ;
   test kappa wtkap ;
run;
The following is part of the output produced by the code above:
Example 1: Raw Data

Table of rater1 by rater2

rater1    rater2

Frequency|
Percent  |       1|       2|       3|  Total
---------+--------+--------+--------+
       1 |   1E-10|       4|       1|      5
         |    0.00|   21.05|    5.26|  26.32
---------+--------+--------+--------+
       2 |       0|       8|       0|      8
         |    0.00|   42.11|    0.00|  42.11
---------+--------+--------+--------+
       3 |       0|       1|       5|      6
         |    0.00|    5.26|   26.32|  31.58
---------+--------+--------+--------+
Total       1E-10       13        6       19
             0.00    68.42    31.58   100.00

Simple Kappa Coefficient
--------------------------------
Kappa                     0.4842
ASE                       0.1380
95% Lower Conf Limit      0.2137
95% Upper Conf Limit      0.7547

Test of H0: Kappa = 0

ASE under H0              0.1484
Z                         3.2626
One-sided Pr >  Z         0.0006
Two-sided Pr > |Z|        0.0011

Weighted Kappa Coefficient
--------------------------------
Weighted Kappa            0.4701
ASE                       0.1457
95% Lower Conf Limit      0.1845
95% Upper Conf Limit      0.7558

Test of H0: Weighted Kappa = 0

ASE under H0              0.1426
Z                         3.2971
One-sided Pr >  Z         0.0005
Two-sided Pr > |Z|        0.0010
This example shows how to:
* calculate kappa when the data are crossclassification frequencies rather than raw ratings;
* handle rows or columns whose frequencies are all 0; and
* use a format to put character-valued categories in the correct order for weighted kappa.
The SAS code to input the data and make pseudo-frequencies is as follows:
/***** Example 2: Calculate Kappa from Frequency Data *****/

* input crossclassification frequencies (including 0 frequencies) ;
data rate ;
   length rater1 $3 rater2 $3 ;
   infile datalines ;
   input rater1 rater2 f ;
   datalines;
Lo  Lo   0
Lo  Med  0
Lo  Hi   0
Med Lo   5
Med Med 16
Med Hi   3
Hi  Lo   8
Hi  Med 12
Hi  Hi  28
;
run;

*----------------------------------------------*;
* If all frequencies of any row or any column  *;
* of the crossclassification table are 0, SAS  *;
* will not calculate kappa. In this case, add  *;
* the next data step.                          *;
*----------------------------------------------*;

* change the 0 frequencies to a negligible non-zero value ;
data rate ;
   set rate ;
   if f = 0 then f = .0000000001 ;
run;

For comparison, we first see what SAS reports if we don't apply category formats:
* see what happens by default ;
title  "Example 2a: Frequency Input" ;
title2 "Default: Rows/Columns Ordered by Category Values";
title3 "Correct Kappa but Incorrect Weighted Kappa!";
proc freq data = rate ;
   weight f;
   tables rater1*rater2 / agree norow nocol;
run;
Here is the output produced by the commands above:
Example 2a: Frequency Input
Default: Rows/Columns Ordered by Category Values
Correct Kappa but Incorrect Weighted Kappa!

rater1    rater2

Frequency|
Percent  |Hi      |Lo      |Med     |  Total
---------+--------+--------+--------+
Hi       |     28 |      8 |     12 |     48
---------+--------+--------+--------+
Lo       |  1E-10 |  1E-10 |  1E-10 |  3E-10
---------+--------+--------+--------+
Med      |      3 |      5 |     16 |     24
---------+--------+--------+--------+
Total          31       13       28       72

Kappa Statistics

Statistic        Value      ASE     95% Confidence Limits
------------------------------------------------------------
Simple Kappa    0.3333   0.0814     0.1738     0.4929
Weighted Kappa  0.3944   0.0917     0.2146     0.5741
Now let's do things the right way. First we create a format that assigns our categories to numbers. Then we refer to the format in proc freq:
* define category order using a format ;
proc format ;
   value $rate
   'Lo'  = 1
   'Med' = 2
   'Hi'  = 3 ;
run;

* calculate kappa and weighted kappa using formatted values ;
title  "Example 2b: Frequency Input" ;
title2 "Order Rows/Columns by Formatted Values" ;
proc freq data = rate order=formatted ;
   format rater1 rater2 $rate. ;
   weight f;
   tables rater1*rater2 / agree norow nocol;
run;

Here is the output produced by the above. Note that the value of kappa is the same, but the value of weighted kappa is now correct:
Example 2b: Frequency Input
Order Rows/Columns by Formatted Values

Table of rater1 by rater2

rater1    rater2

Frequency|
Percent  |1       |2       |3       |  Total
---------+--------+--------+--------+
1        |  1E-10 |  1E-10 |  1E-10 |  3E-10
---------+--------+--------+--------+
2        |      5 |     16 |      3 |     24
---------+--------+--------+--------+
3        |      8 |     12 |     28 |     48
---------+--------+--------+--------+
Total          13       28       31       72

Kappa Statistics

Statistic        Value      ASE     95% Confidence Limits
------------------------------------------------------------
Simple Kappa    0.3333   0.0814     0.1738     0.4929
Weighted Kappa  0.2895   0.0756     0.1414     0.4376
(c) 2000-2009 John Uebersax PhD
Last revised: 20 July 2002