Corpus manager

From Wikipedia, the free encyclopedia

A corpus manager (also called a corpus browser or corpus query system) is a tool for multilingual corpus analysis that allows effective searching in corpora.[1]

A corpus manager is usually a complex tool that lets users search for language forms or sequences of forms. It may show the context of each match or allow searching by positional attributes such as lemma or tag. The resulting lists of matches shown in context are called concordances. Other features include searching for collocations, frequency statistics, and metadata about the processed text.[2] In a narrower sense, corpus manager refers only to the server side, that is, the corpus query engine, whereas the client side is simply called the user interface.
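As a rough illustration of searching by positional attribute, the following Python sketch builds concordance lines and frequency counts over a tiny in-memory corpus. The toy corpus, the attribute names (word, lemma, tag) and the helper functions concordance and frequency are assumptions made for this example; they do not reflect the implementation or API of any corpus manager listed below.

```python
from collections import Counter

# Hypothetical toy corpus: each token carries positional attributes.
CORPUS = [
    {"word": "The", "lemma": "the", "tag": "DET"},
    {"word": "cats", "lemma": "cat", "tag": "NOUN"},
    {"word": "sleep", "lemma": "sleep", "tag": "VERB"},
    {"word": ".", "lemma": ".", "tag": "PUNCT"},
    {"word": "A", "lemma": "a", "tag": "DET"},
    {"word": "cat", "lemma": "cat", "tag": "NOUN"},
    {"word": "sleeps", "lemma": "sleep", "tag": "VERB"},
    {"word": ".", "lemma": ".", "tag": "PUNCT"},
]


def concordance(tokens, attr, value, width=2):
    """Return (left context, keyword, right context) for every token
    whose positional attribute `attr` equals `value`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok[attr] == value:
            left = " ".join(t["word"] for t in tokens[max(0, i - width):i])
            right = " ".join(t["word"] for t in tokens[i + 1:i + 1 + width])
            lines.append((left, tok["word"], right))
    return lines


def frequency(tokens, attr):
    """Count how often each value of a positional attribute occurs."""
    return Counter(tok[attr] for tok in tokens)


# Searching by lemma finds both "cats" and "cat".
kwic_lines = concordance(CORPUS, "lemma", "cat")
tag_counts = frequency(CORPUS, "tag")

for left, keyword, right in kwic_lines:
    print(f"{left:>12} [{keyword}] {right}")
```

Real corpus managers typically index the corpus on the server side so that such queries stay fast on very large corpora; the linear scan here is only meant to show what a concordance line contains.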

A corpus manager can be software installed on a personal computer or it might be provided as a web service.

List of corpus managers

  • BNCweb[3] – a web-based interface for the British National Corpus
  • CQPweb[4] – a web-based interface for the study of a large variety of corpora, including the Spoken BNC2014
  • BYU-BNC[5] – a website for searching the British National Corpus and other corpora created at Brigham Young University
  • Coma[6] – a tool extension of the system EXMARaLDA for working with oral corpora on a computer
  • NoSketch Engine[7] – a free open-source corpus management system combining Manatee (back-end) and Bonito (web interface)
  • KonText[8] – an extended and modified web interface to NoSketch Engine (a Bonito replacement)
  • Sketch Engine[9][10] – text corpus management and analysis software with more than 500 corpora in 90+ languages
  • SpoCo[11] – a simple and adaptable web interface for dialect corpora
  • WordSmith Tools[12] – a software package primarily for linguists

References

  1. ^ "Korpusový manažer". Wiki Český národní korpus. Český národní korpus. 8 April 2015. Retrieved 18 April 2015.
  2. ^ Kouklakis, George; Mikros, George; Markopoulos, George; Koutsis, Ilias (2007). "Corpus Manager A Tool for Multilingual Corpus Analysis" (PDF). Proceedings from Corpus Linguistics Conference. University of Athens: 1–12.
  3. ^ Interface to the British National Corpus; more about the British National Corpus
  4. ^ CQPweb Main Page
  5. ^ BYU-BNC: BRITISH NATIONAL CORPUS interface
  6. ^ EXMARaLDA Corpus-Manager Hamburger Zentrum für Sprachkorpora
  7. ^ NoSketch Engine (an open-source project combining Manatee, Bonito and Crystal into a powerful and free corpus management system)
  8. ^ A basic query interface for working with corpora Institute of the Czech National Corpus (ICNC), Faculty of Arts, Charles University in Prague
  9. ^ The Sketch Engine homepage
  10. ^ Concordancers, Search Engines, Text-analysis Tools. Archived 15 March 2015 at the Wayback Machine. A list on the University of Wollongong website.
  11. ^ Ruprecht von Waldenfels; Michał Woźniak (2016), "SpoCo - a simple and adaptable web interface for dialect corpora", Journal for Language Technology and Computational Linguistics, 31 (1): 133-148, doi:10.21248/jlcl.31.2016.206
  12. ^ WordSmith Tools homepage