CORPORA

NAME             TYPE                     SIZE               AVAILABILITY
WEB PORTALS
GIGAFIDA         written corpus           1.2 billion words  special contract
KRES             written balanced corpus  100 million words  special contract
GOS              spoken corpus            1 million words    CC BY-NC-SA, download
TEXT COLLECTIONS
ŠOLAR            learners’ corpus         1 million words    CC BY-NC-SA, download
GOS              spoken corpus            1 million words    CC BY-NC-SA, download
DATA SETS (FOR LT)
TRAINING CORPUS  manually tagged corpus   500,000 words      CC BY-NC-SA, download
ccGIGAFIDA       tagged corpus            100 million words  CC BY-NC-SA, download
ccKRES           tagged corpus            10 million words   CC BY-NC-SA, download