Inter-rater Reliability

Inter-Annotator Agreement

To assess the quality of the annotations produced for a task, Inter-Rater Reliability or Inter-Annotator Agreement (the two terms are equivalent) is frequently measured. It comes down to the degree of consensus among multiple annotators when they annotate the same documents, within the context of the same annotation task.
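For two annotators who assign one label per item, a chance-corrected measure such as Cohen's kappa is a common way to quantify this consensus. The sketch below is illustrative only and is not tied to the Ellogon Annotation Platform; the label lists and category names are hypothetical.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators.

    labels_a, labels_b: parallel lists, one label per annotated item.
    """
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)

    # Observed agreement: fraction of items with identical labels.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Expected agreement by chance, from each annotator's label distribution.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_expected = sum(
        (freq_a[label] / n) * (freq_b[label] / n)
        for label in set(labels_a) | set(labels_b)
    )

    return (p_observed - p_expected) / (1 - p_expected)

# Hypothetical example: two annotators labelling five sentences.
annotator_1 = ["POS", "NEG", "POS", "NEU", "POS"]
annotator_2 = ["POS", "NEG", "NEU", "NEU", "POS"]
print(cohen_kappa(annotator_1, annotator_2))  # ~0.69
```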

The Ellogon Annotation Platform offers facilities for inspecting the annotations over arbitrary sets of Documents, and for performing comparisons among different annotation sets, typically created by different annotators.

A typical workflow for measuring inter-annotator agreement among annotators is:

  1. A set of annotators annotate the same resource (e.g. a textual document). Each annotator can either privately annotate a copy of the original resource, or annotate the same, shared resource, limiting the display to his/her own annotations through the settings.
  2. When all annotators have finished their annotations, the inter-annotator agreement can be calculated by the Ellogon Annotation Platform (see the sketch after Figure 1 for how such a score can be computed over more than two annotators).
  3. A user who has access to all resources annotated by all annotators can select the Inspection menu entry of the left panel, as shown in Figure 1. There are several available options for a) comparing annotations within a single resource (Compare Annotations); b) comparing annotations between two different resources, assuming they both contain the same data, i.e. text (Compare Documents); and c) comparing annotations from arbitrary sets of documents (Compare Collections).
Figure 1: `Inspection` menu.
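When more than two annotators have annotated the same resource (step 1 above), a single agreement figure can be obtained by averaging the kappa scores over all annotator pairs. The sketch below reuses the hypothetical `cohen_kappa` function from the previous example and is not the platform's own computation.

```python
from itertools import combinations

def average_pairwise_kappa(annotations_by_annotator):
    """Mean Cohen's kappa over all annotator pairs.

    annotations_by_annotator: dict mapping an annotator name to a
    list of labels over the same items, in the same order.
    """
    scores = [
        cohen_kappa(labels_a, labels_b)
        for (_, labels_a), (_, labels_b)
        in combinations(annotations_by_annotator.items(), 2)
    ]
    return sum(scores) / len(scores)

# Hypothetical example with three annotators over the same five items.
print(average_pairwise_kappa({
    "ann1": ["POS", "NEG", "POS", "NEU", "POS"],
    "ann2": ["POS", "NEG", "NEU", "NEU", "POS"],
    "ann3": ["POS", "NEG", "POS", "NEU", "NEU"],
}))
```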

Comparing Annotations

This comparison option assumes that a single document has been annotated by multiple annotators, so that different annotation sets, created by more than one annotator, exist for the same document.
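For span-based annotation tasks, agreement between two such annotation sets can also be reported as precision, recall, and F1 over exactly matching spans, treating the first set as the reference. The (start offset, end offset, label) representation below is a hypothetical illustration, not the structure the Ellogon Annotation Platform uses internally.

```python
def span_agreement(spans_a, spans_b):
    """Precision/recall/F1 of exactly matching (start, end, label) spans,
    with spans_a treated as the reference annotation set."""
    set_a, set_b = set(spans_a), set(spans_b)
    matched = len(set_a & set_b)
    precision = matched / len(set_b) if set_b else 0.0
    recall = matched / len(set_a) if set_a else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical annotation sets over the same document text.
annotator_1_spans = [(0, 5, "PERSON"), (10, 18, "ORG"), (25, 30, "LOC")]
annotator_2_spans = [(0, 5, "PERSON"), (10, 18, "LOC"), (25, 30, "LOC")]
print(span_agreement(annotator_1_spans, annotator_2_spans))  # ~(0.67, 0.67, 0.67)
```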