Statistics
The statistics presented below are based on technical and attribute-related data available in the core and expanded spreadsheets (see Corpus).
Statistics: technical data
The statistics in this section concern the technical data that document how the abstracts were identified and selected.
Authors
Over half of the articles in the corpus are by authors affiliated with institutions in Poland, but 75 other countries are also represented.
While most author affiliations are European, the authors represent five continents.
Articles in linguistics (Polish-only vs other affiliations)
Articles in law (Polish-only vs other affiliations)
Articles in literary studies (Polish-only vs other affiliations)
Journals
In linguistics, there are 819 abstracts from 32 journals.
In law, there are 987 abstracts from 32 journals.
In literary studies, there are 330 abstracts from 21 journals.
Disciplines
The corpus contains abstracts from 2136 articles in three disciplines: linguistics, law, and literary studies.
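As a quick consistency check, the per-discipline abstract counts given under Journals can be summed and turned into shares of the corpus. The snippet below is only an illustrative sketch: the counts are taken from this page, while the variable names are made up for the example and do not refer to columns in the spreadsheets.

```python
# Abstract counts per discipline, as reported in the Journals section.
# The dictionary and variable names are illustrative assumptions.
abstract_counts = {"linguistics": 819, "law": 987, "literary studies": 330}

# Total number of abstracts across the three disciplines.
total = sum(abstract_counts.values())

# Share of each discipline in the corpus, in percent, rounded to one decimal.
shares = {d: round(100 * n / total, 1) for d, n in abstract_counts.items()}

print(total)
for discipline, share in shares.items():
    print(f"{discipline}: {share}%")
```

The sum matches the 2136 articles stated above; law accounts for a little under half of the corpus and literary studies for roughly one abstract in seven.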
Dates
The corpus covers articles from 2018 to 2021, with 2021 being the least represented year.
Conventions
The abstracts mostly have the conventional "abstract" label.
The abstracts are mostly located above the articles.
Statistics: attribute-related data
Basic and elaborated abstract attributes
Elaborated abstracts: attributes by discipline.
Basic abstracts: attributes by discipline.
Attribute co-occurrence in basic and elaborated abstracts
The most common combination of attributes in elaborated abstracts differs between disciplines.
The most common combination of attributes in basic abstracts is the same in all disciplines.
Elaborated abstracts: number of positive attribute values by discipline.
Basic abstracts: number of positive attribute values by discipline.
Basic and elaborated abstract compositions
In the vast majority of cases, elaborated abstracts contain more than the basic abstracts.
In the vast majority of cases, basic abstracts contain more than the abstract texts.
Elaborated abstracts consist of 1 to 10 text-type-based parts.
Basic abstracts consist of 1 to 7 text-type-based parts.
Paralinguistic objects in basic and elaborated abstract compositions
At part borders in elaborated abstracts there is usually extra white space.
Basic abstracts are more diverse in this respect.
Most elaborated abstracts do not feature paralinguistic objects at part borders.
Basic abstracts are more diverse in this respect.
Linguistic constructions in basic and elaborated abstract compositions
The most common linguistic construction appearing in elaborated abstracts is keywords.
The most common linguistic construction appearing in basic abstracts is the title.
Sequences of linguistic constructions in basic and elaborated abstract compositions
Three sequences of linguistic constructions represent over 50% of the elaborated abstracts in the corpus.
In law, the top three sequences of linguistic constructions represent over 50% of elaborated abstracts.
In linguistics, the top four sequences of linguistic constructions represent over 50% of elaborated abstracts.
In literary studies, the single most common sequence of linguistic constructions represents over 50% of elaborated abstracts.
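The "top sequences representing over 50%" figures can be read as a cumulative-coverage count: rank the sequences by frequency and count how many are needed before their combined share exceeds half of the abstracts. The sketch below shows one way this could be computed. The function name, the tuple representation of a sequence, and the sample data are all hypothetical and are not taken from the corpus spreadsheets.

```python
from collections import Counter


def sequences_needed(sequences, threshold=0.5):
    """Return how many of the most frequent sequences are needed
    for their combined share to exceed `threshold`."""
    counts = Counter(sequences)
    total = sum(counts.values())
    covered = 0
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered / total > threshold:
            return rank
    return len(counts)


# Hypothetical sample: each abstract is a tuple of construction labels.
sample = (
    [("title", "abstract text", "keywords")] * 4
    + [("abstract text", "keywords")] * 3
    + [("title", "abstract text")] * 3
)

needed = sequences_needed(sample)
print(needed)
```

In this made-up sample the most frequent sequence covers 40% of the abstracts, so two sequences are needed to pass the 50% mark.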
For more information on Statistics, contact D. Guttfeld (contact details in the footer).