
Disclaimers and License

Pollux Political Corpora (PoliCorp) is an open resource for accessing and analysing processed political text data. PoliCorp is an integral part of the Pollux project. This demonstrator offers researchers access to extensive textual datasets (currently the official protocols of plenary debates published by the German Bundestag, GermaParl), facilitating in-depth analysis of parliamentary discourse across time. Using Pollux Political Corpora, researchers can easily generate sub-corpora for their own research.

Currently, the platform hosts a collection of the official protocols of plenary debates published by the German Bundestag, spanning 76 years of parliamentary discourse, starting September 7, 1949. Raw parliamentary speeches up to September 7, 2021, were sourced from the GermaParl corpus, a comprehensive linguistic dataset curated by the PolMine project. GermaParl covers transcripts of parliamentary debates from September 7, 1949, to September 7, 2021, and comprises 958,100 speech contributions. Raw parliamentary speeches published after September 7, 2021, were sourced from the Bundestag Open Data project. New speeches will be added to the platform monthly.

Disclaimers

  • GERMAPARL: Raw parliamentary speeches up to September 7, 2021, presented on this website, were sourced from the following publication: Blaette, Andreas (2017): GermaParl. Corpus of Plenary Protocols of the German Bundestag. The data are available as TEI files at the GermaParlTEI GitHub repository. Raw parliamentary speeches published after September 7, 2021, were sourced from the Bundestag Open Data project. The data provided herein have been utilized in accordance with the terms of use specified by the original source.
  • Data Processing: Our website utilizes experimental tools for data processing, including but not limited to named entity recognition (NER) models. While these models are designed to provide insightful automatic annotations, they are not flawless and may produce inaccurate or incomplete results. We recommend exercising caution when relying on these outputs and verifying any critical information independently. If you encounter any issues or have feedback about the annotations, please feel free to contact us.

License

The data is provided under the CLARIN PUB+BY+NC+SA license. For detailed information on the licensing terms, please click here!

GESIS: Imprint | Data protection

Project's website: Pollux

For inquiries, contact us at nina.smirnova@gesis.org | ahsan.shahid@gesis.org

PoliCorp is a service by: Pollux

Cite this project: Smirnova, N., Shahid, M. A., & Mayr, P. (2025). Political Corpora (PoliCorp): An open resource for accessing and analysing processed political text data. https://demo-pollux.gesis.org/

From: GESIS, SuUB, Qualiservice. Funded by: DFG.
The Pollux team received funding from the German Research Foundation (DFG) via grant: MA 3964/7‑3.