[…] Check whether the agreement between two raters is greater than would be expected by chance. I've already covered it here once, to test the level of agreement among our 7 Generation Games raters when they have the […]

Let's do it the right way. First, we create a format that assigns numbers to our categories. Then we refer to the format in PROC FREQ.

Cohen's kappa statistic is a measure of agreement between the categorical variables X and Y. For example, kappa can be used to compare the ability of different raters to classify subjects into one of several groups. Kappa can also be used to assess the agreement between alternative methods of categorical assessment when new techniques are being studied. The SAS code for this example covers data entry and production of the frequency table.

If the alphabetical order differs from the actual order of the categories, weighted kappa is calculated incorrectly. To avoid this, either (1) recode the character values as numbers that reflect the actual order of the categories, or (2) use a format and specify the ORDER=FORMATTED option in PROC FREQ (see Example 2).

Adding an ODS GRAPHICS ON statement and the PLOTS=KAPPAPLOT option in your TABLES statement gives you a plot of both the agreement and the distribution of the ratings. Personally, I find kappa plots, like the example below, very useful.

The observed proportion of agreement between X and Y is defined as the sum of the proportions on the diagonal of the X-by-Y table, that is, the share of subjects that both raters place in the same category. Kappa and weighted kappa results are displayed with 95% confidence limits. Kappa usually ranges from 0 to 1, with a value of 1 indicating perfect agreement.
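To make the definition concrete, here is a minimal sketch in Python (not the SAS code from the post) that computes the observed agreement, the agreement expected by chance, and simple kappa from a square table of counts. The function name `cohen_kappa` and all three tables are made up for illustration; they are not data from the examples discussed here.

```python
def cohen_kappa(table):
    """Return (p_o, p_e, kappa) for a square table of counts.

    Rows are rater X's categories, columns are rater Y's categories.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    # Marginal proportions for each rater
    row_prop = [sum(table[i]) / n for i in range(k)]
    col_prop = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    # Observed agreement: share of subjects on the diagonal
    p_o = sum(table[i][i] for i in range(k)) / n
    # Agreement expected by chance if the two raters were independent
    p_e = sum(row_prop[i] * col_prop[i] for i in range(k))
    return p_o, p_e, (p_o - p_e) / (1 - p_e)


# Hypothetical 2x2 tables, invented for this sketch
perfect = [[20, 0], [0, 20]]            # raters always agree
mixed = [[20, 5], [10, 15]]             # raters agree more often than chance
worse_than_chance = [[5, 15], [15, 5]]  # raters mostly disagree

for name, tbl in [("perfect", perfect), ("mixed", mixed),
                  ("worse than chance", worse_than_chance)]:
    p_o, p_e, kappa = cohen_kappa(tbl)
    print(f"{name}: p_o={p_o:.2f}, p_e={p_e:.2f}, kappa={kappa:.2f}")
```

In the third table the raters agree on only 25% of subjects while 50% agreement would be expected by chance, so kappa falls below zero.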
(Negative values are possible.) The higher the value of kappa, the stronger the agreement. The simple kappa coefficient measures the level of agreement between two raters. If kappa is large (most would say .7 or more), this indicates a high degree of agreement.

SAS example (19.3_agreement_Cohen.sas): Two radiologists rated 85 patients for liver damage. The ratings were made on an ordinal scale. The weighted kappa coefficient is 0.57, and the asymptotic 95% confidence interval is (0.44, 0.70). This indicates that the agreement between the two radiologists is modest (and not as strong as the researchers had hoped).

This visual representation of the agreement shows that there was a large amount of exact agreement (dark blue shading) for the incorrect answers, scored 0, with a small percentage of partial agreement and very little with no agreement. With 3 categories, the middle category can only show exact or partial agreement. Two other points from this plot are that agreement for correct and partially correct answers is lower than for incorrect ones, and that the distribution is skewed, with a large share of the answers rated incorrect. Because it is adjusted for the agreement expected by chance, kappa is affected by the distribution across categories.
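The earlier warning about category order can be checked numerically. The sketch below, in Python, computes weighted kappa with linear (Cicchetti-Allison) weights, which are the default weights in PROC FREQ, once with the categories in their true order and once in alphabetical order. The counts and the names `weighted_kappa` and `reorder` are hypothetical, chosen only to illustrate the effect.

```python
def weighted_kappa(table):
    """Weighted kappa with linear weights w_ij = 1 - |i - j| / (k - 1).

    Assumes the rows and columns of `table` are listed in category order.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    p = [[count / n for count in row] for row in table]
    row_prop = [sum(p[i]) for i in range(k)]
    col_prop = [sum(p[i][j] for i in range(k)) for j in range(k)]
    w = [[1 - abs(i - j) / (k - 1) for j in range(k)] for i in range(k)]
    # Weighted observed and chance-expected agreement
    p_o = sum(w[i][j] * p[i][j] for i in range(k) for j in range(k))
    p_e = sum(w[i][j] * row_prop[i] * col_prop[j]
              for i in range(k) for j in range(k))
    return (p_o - p_e) / (1 - p_e)


def reorder(table, order):
    """Permute rows and columns of a square table by the same index order."""
    return [[table[i][j] for j in order] for i in order]


# Hypothetical counts; true order is incorrect (0), partial (1), correct (2)
true_order = [
    [30, 5, 1],
    [6, 10, 4],
    [2, 3, 9],
]
# Alphabetical order is correct, incorrect, partial: indices 2, 0, 1
alphabetical = reorder(true_order, [2, 0, 1])

kw_true = weighted_kappa(true_order)
kw_alpha = weighted_kappa(alphabetical)
print(f"weighted kappa, true order:         {kw_true:.3f}")
print(f"weighted kappa, alphabetical order: {kw_alpha:.3f}")
```

With these made-up counts, weighted kappa is roughly 0.58 in the true order and roughly 0.44 in alphabetical order. In the alphabetical layout "correct" and "incorrect" sit next to each other, so disagreements between them are wrongly given partial credit, while "correct" versus "partial" is treated as the largest disagreement. Simple kappa does not change, because it uses only the diagonal and the marginal totals.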