Abstract
This project will develop cross-linguistic annotation protocols for exploring the content of sign language video datasets. The key advances are a) standardised lemmatisation protocols for lexicalised signs, and b) protocols for annotating partly-lexical and non-lexical (including gestural) elements. The project will demonstrate its approach using corpora of British Sign Language (BSL) and Sign Language of the Netherlands (NGT). Linguistic corpora – i.e. large, representative samples of naturalistic language use – are one of the richest types of resource for studying language structure and use. The new annotation protocols and resulting corpora will allow users to explore the content of the existing video data in depth and will enable cross-linguistic research with sign language corpora. The project thus goes well beyond the current state of the art in online sign language corpus data, where searches are restricted to a few key background details about participants recorded in the metadata.



 
  
 