1 Introduction

Reading is a complex cognitive activity in which learners construct a meaningful understanding from verbal symbols, i.e. words and sentences; this process is called reading comprehension. In the reading process, three main factors together construct a meaningful discourse: the learner’s context knowledge, the information evoked by the text, and the reading circumstances. Previous research claims that, in academic environments, several reading and learning strategies, including intensive and extensive reading [2], spaced repetition [7], and top-down and bottom-up processes [1], play a vital role in developing students’ comprehension skills.

Intensive Reading: This is the more common approach, in which learners read passages selected from the same text or from various texts about the same subject. Content and linguistic forms recur, so learners get several chances to comprehend the meaning of the text. It is usually a classroom-based, teacher-centric approach in which students concentrate on the linguistic, grammatical, and semantic details of the text in order to retain them in memory over a long period. Students read passages carefully and thoroughly, again and again, with the aim of translating the text into a different language, learning its linguistic details, answering comprehension questions (such as objective-type and multiple-choice questions), or learning new vocabulary. The approach has some disadvantages: (a) it is slow; (b) it needs careful reading of a small amount of difficult text; (c) it requires more attention to the language and its structure, including morphology, syntax, phonetics, and semantics, than to the text itself; (d) the text may bore students because it was chosen by the teacher; and (e) because exercises and assessments are part of comprehension evaluation, students may read only to prepare for a test and not for pleasure.

Extensive Reading: In contrast, extensive reading offers more enjoyment, as students read large quantities of content that interests them. They focus on understanding the main ideas rather than the language and its structure, skip unfamiliar and difficult words, and read for summary [12]. The main aim of extensive reading is to learn a foreign language through large amounts of reading, thereby building student confidence and enjoyment. Several studies claim that extensive reading helps students improve reading comprehension, increase reading speed, gain a greater understanding of second-language grammar conventions, improve second-language writing, and become motivated to read at higher levels [10].

The findings of previous research suggest that both extensive and intensive reading approaches are beneficial, in one way or another, for improving students’ reading comprehension skills.

Psycholinguistic Factors: Psycholinguistics is a branch of cognitive science that studies language comprehension, language production, and language acquisition. It tries to explain how language is represented and processed in the brain: for example, the cognitive processes responsible for generating a grammatical and meaningful sentence from vocabulary and grammatical structures, and the processes responsible for comprehending words, sentences, etc. The linguistic areas of primary concern are phonology, morphology, syntax, semantics, and pragmatics. Researchers in this field study a reader’s capability to learn language, for example the different processes required to extract phonological, orthographic, morphological, and semantic information when reading a text.

More recent work, Coh-Metrix [5], makes it possible to investigate the cohesion of the explicit text and the coherence of the mental representation of the text. The tool provides a detailed analysis of language and cohesion features that are integral to cognitive reading processes such as decoding, syntactic parsing, and meaning construction.

2 Brief Description of Coh-Metrix Measures

Coh-Metrix is an automatic text analysis tool that takes traditional theories of reading and comprehension to the next level and can therefore play an important role in different areas of education such as teaching, readability, and learning. The tool analyses and measures features of texts written in English through hundreds of measures, all informed by previous research in disciplines such as computational linguistics, psycholinguistics, discourse processes, and cognitive science. It integrates several computational linguistics components, including lexicons, pattern classifiers, part-of-speech taggers, syntactic parsers, semantic interpreters, WordNet, and the CELEX corpus. Employing these elements, Coh-Metrix can analyze texts on multiple levels of cohesion, including co-referential cohesion, causal cohesion, density of connectives, latent semantic analysis metrics, and syntactic complexity [5].

All measures of the tool are categorized into the following broad groups:

  1. Descriptive measures: These measures describe statistical features of the text, such as the total number of paragraphs, sentences, and words, the average paragraph length with standard deviation, the average number of words with standard deviation, and the mean number of syllables per word with standard deviation.

  2. Easability components: To measure text easability, the tool provides several scores, including text narrativity, syntactic familiarity, and word concreteness.

  3. Referential Cohesion: This is a linguistic cue that helps readers make connections between different text units such as clauses and sentences. It includes noun overlap (sentences overlap in terms of nouns) and argument overlap (sentences overlap in terms of nouns and pronouns). Coh-Metrix also measures semantically similar pairs such as car/vehicle.

  4. Latent Semantic Analysis: This is used to implement semantic co-referentiality, representing deeper world knowledge based on large corpora of texts.

  5. Lexical Diversity: This is the variety of unique words (types) in a text in relation to the total number of words (tokens). The measures are variations of the type-token ratio (TTR).

  6. Connectives: These provide clues about text organization and help the reader create cohesive links between ideas and clauses. The tool measures the cohesive links between different conceptual units using different types of connectives, such as causal (because, so), logical (and, or), adversative (whereas), temporal (until), and additive (moreover). In addition, it distinguishes between positive connectives (moreover) and negative connectives (but).

  7. Situation Model: This refers to the level of the reader’s mental representation of a text when a given context is activated.

  8. Syntactic Complexity: This is measured using NP density, the mean number of high-level constituents per word, and the incidence of word classes that indicate analytical difficulty (e.g. and, or, if-then, conditionals).

  9. Syntactic Pattern Density: This refers to the density of particular syntactic patterns, word types, and phrase types. The relative density of noun phrases, verb phrases, adverbial phrases, and prepositions can affect the processing difficulty of a text, especially in relation to its other features.

  10. Word Information: This provides density scores for various parts of speech (POS), including pronouns, nouns, verbs, adjectives, adverbs, cardinal numbers, determiners, and possessives.

  11. Readability: This provides the readability formulas Flesch Reading Ease and Flesch-Kincaid Grade Level [4, 9]. Both are readability tests designed to indicate how difficult an English passage is to understand. They use word length and sentence length as core measures but weight them differently.
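
Two of the groups above reduce to simple arithmetic, which the following minimal sketch illustrates: the type-token ratio and the two Flesch formulas in their commonly published form. The function names, the whitespace tokenization, and the word, sentence, and syllable counts are assumptions made for illustration; Coh-Metrix's own tokenization and syllable counting are not described here, so its scores may differ.

```python
def type_token_ratio(tokens):
    """Number of unique word forms (types) divided by number of words (tokens)."""
    lowered = [t.lower() for t in tokens]
    return len(set(lowered)) / len(lowered)


def flesch_reading_ease(n_words, n_sentences, n_syllables):
    """Flesch Reading Ease: higher scores indicate easier text."""
    return (206.835
            - 1.015 * (n_words / n_sentences)
            - 84.6 * (n_syllables / n_words))


def flesch_kincaid_grade(n_words, n_sentences, n_syllables):
    """Flesch-Kincaid Grade Level: same two ratios, different weights."""
    return (0.39 * (n_words / n_sentences)
            + 11.8 * (n_syllables / n_words)
            - 15.59)


# Toy sentence, split on whitespace (an assumption, not Coh-Metrix's tokenizer).
tokens = "the cat saw the dog and the dog saw the cat".split()
ttr = type_token_ratio(tokens)          # 5 types over 11 tokens

# Hypothetical counts: 100 words, 5 sentences, 150 syllables.
fre = flesch_reading_ease(100, 5, 150)
fkgl = flesch_kincaid_grade(100, 5, 150)
print(round(ttr, 3), round(fre, 3), round(fkgl, 2))
```

Both readability formulas depend only on words per sentence and syllables per word, which is why they tend to rank texts similarly while reporting on different scales.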

The aim of the present work is to identify the linguistic features that can classify students into two groups, those with proficient comprehension skills and those with poor comprehension skills, from transcripts of their spoken summaries.

3 Participants and Method

This section briefly describes the participants, materials, and procedure used in this study.

Participants: Twenty undergraduate students (mean age 21.4 years, SD 0.86) majoring in information technology participated in the experimental sessions. They studied in the same batch and performed all academic activities in English only, although their primary languages differed. Students were told that they would be awarded course credits for participating in the research. Based on their academic performance in the last four semesters, the students were divided into two groups: ten proficient comprehenders and ten poor comprehenders.

Materials: The reading materials consisted of two passages. One passage (38 sentences, 686 words, mean sentence length 18.0 words, Flesch-Kincaid Grade Level 13.3) was selected from the students’ course book, whereas the other was a simple, interesting story (42 sentences, 716 words, mean sentence length 17.0 words, Flesch-Kincaid Grade Level 3.9). Both passages were written in English and had not been read by the students before the experiment. Reading the story passage simulated an extensive reading experience, and reading the course passage simulated an intensive reading experience.

Procedure: All experimental sessions were held in a research lab with groups of five students. The experiment consisted of two tests. In each test, students were instructed to read a given passage, then solve a puzzle, and lastly summarize the passage orally in as much detail as they could. The two tests were identical except for the reading material: the story passage was given in the first test and the course passage in the second. Students were told to read the passage on a computer screen as they normally would. Their speech was recorded using digital audio recorder software installed on the computer. The puzzle task served to clear the students’ short-term memory of the text they had read, to ensure that the summary came from their long-term memory.

4 Feature Analysis

Feature Extraction: The recorded audio files were transcribed in English. Brief pauses were marked with commas, and long pauses were marked with full stops (end of sentence) where their positions agreed with semantic, syntactic, and prosodic features. Repetitions, incomplete words, and incomprehensible words were not included in the transcription. Two sets of transcripts were generated: (a) story transcripts, containing the text of the story summary audio files, and (b) course transcripts, containing the text of the course summary audio files. Each set had twenty texts, ten from proficient comprehenders’ audio files and ten from poor comprehenders’ audio files.

To analyse the texts in both sets of transcripts, we used the computational tool Coh-Metrix. Coh-Metrix 3.0 (http://cohmetrix.com) provided 106 measures, which were categorized into eleven groups as described in Sect. 2.

Feature Selection: In machine learning, including too many features may cause a classifier to overfit and thus generalize poorly to new data. Therefore, only the necessary features should be selected to train classifiers.

We applied two different approaches to select the necessary features and improve the accuracy of the classifiers.

Approach-1: Coh-Metrix provides more than a hundred measures of text characteristics, and several of them are highly correlated. For example, Pearson correlations showed that the z score of narrativity was highly correlated (\(r = 0.911\), \(p < 0.001\)) with the percentile of narrativity. Of the 106 measures of the tool, 52 were selected on the basis of two criteria, applied in two steps. First, all variables that were highly correlated with other variables (\(\vert r \vert \ge 0.80\)) were discarded to handle the problem of collinearity, and the remaining measures were grouped into feature sets. After removing these redundant variables, the feature set of the story transcripts had 65 measures, whereas the feature set of the course transcripts had 67 measures. In Table 1, superscripts 1, 2, and 3 indicate measures present only in story transcripts, only in course transcripts, and in both, respectively. Thus, in the first step, measures marked with superscripts 1 and 3 were selected to classify the story transcripts, whereas measures marked with superscripts 2 and 3 were selected to classify the course transcripts. Second, we selected only those measures that were present in both feature sets. Thus, in the second step, the 52 common measures, marked with superscript 3 in Table 1, were selected for classification.
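
The two steps of Approach-1 can be sketched as follows. This is an illustration under stated assumptions, not the procedure actually used in the study: the text does not say which member of a correlated pair was discarded, so the sketch keeps a measure only if it is not highly correlated with any measure already kept. The function names, measure names, and values are all hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation of two equal-length lists (non-zero variance assumed)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def drop_collinear(features, threshold=0.80):
    """Step 1 (assumed greedy rule): keep a measure only if |r| < threshold
    with every measure kept so far. Relies on dict insertion order."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson_r(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical measure values, one entry per transcript.
story = {
    "narrativity_z": [0.1, 0.5, 0.9, 1.4, 2.0],
    "narrativity_pct": [10, 30, 55, 70, 95],   # nearly a rescaling of the z score
    "word_count": [120, 80, 150, 90, 110],
}
course = {
    "narrativity_z": [0.2, 1.1, 0.4, 1.8, 0.9],
    "word_count": [200, 180, 260, 190, 240],
    "sentence_count": [5, 9, 4, 7, 8],
}

kept_story = drop_collinear(story)
kept_course = drop_collinear(course)

# Step 2: retain only the measures present in both feature sets.
common = [m for m in kept_story if m in kept_course]
print(kept_story, kept_course, common)
```

With a greedy rule of this kind the outcome depends on the order in which measures are visited, so a fixed ordering would be needed to make the selection reproducible.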

Table 1. A comparison of proficient and poor comprehenders’ transcripts features. Values shown are mean (standard deviation).

Pairwise Comparisons: Pairwise comparisons were conducted to examine differences between proficient comprehenders’ text and poor comprehenders’ text of both sets of transcripts (story and course). These results are reported below.

  1. Descriptive measures: Coh-Metrix provided eleven descriptive measures, of which six were selected as features. Paragraph count, Paragraph length, Sentence length, and Word length differed significantly between proficient comprehenders’ text and poor comprehenders’ text in both sets of transcripts.

  2. Easability components: The tool provided sixteen easability measures, of which eleven were selected as features. Deep cohesion, Verb cohesion, Connectivity, and Temporality differed significantly between proficient comprehenders’ text and poor comprehenders’ text in both sets of transcripts.

  3. Referential Cohesion: The tool provided ten referential cohesion measures, of which nine were selected as features. The overlap measures showed that proficient comprehenders used more co-referential nouns, pronouns, or noun phrases than poor comprehenders.

  4. Latent Semantic Analysis: The tool provided eight LSA measures, and all were selected as features. The LSA overlap measures differed significantly between proficient comprehenders’ text and poor comprehenders’ text in both sets of transcripts.

  5. Lexical Diversity: The tool provided four lexical diversity measures, of which two were selected as features. MTLD and VOCD showed a more significant difference between proficient comprehenders’ text and poor comprehenders’ text in both sets of transcripts.

  6. Connectives: The tool provided nine connective measures, of which four were selected as features. The connective measures showed that proficient comprehenders used more connectives such as in other words, also, however, and although than poor comprehenders, whereas poor comprehenders used comparatively more logical operators such as and and then, as well as more temporal connectives such as when.

  7. Situation Model: The tool provided eight situation model measures, of which seven were selected as features. The causal verb measures differed significantly between proficient comprehenders’ text and poor comprehenders’ text in both sets of transcripts.

  8. Syntactic Complexity: The tool provided seven syntactic complexity measures, of which three were selected as features. Words before main verb (mean), Number of modifiers per noun phrase (mean), and Sentence syntax similarity (mean) showed a less significant difference between proficient comprehenders’ text and poor comprehenders’ text in both sets of transcripts.

  9. Syntactic Pattern Density: The tool provided eight syntactic pattern density measures, and all were selected as features. Noun phrase density, Verb phrase density, Adverbial phrase density, Preposition phrase density, Agentless passive voice density, Negation density, Gerund density, and Infinitive density showed a highly significant difference between proficient comprehenders’ text and poor comprehenders’ text in both sets of transcripts.

  10. Word Information: The tool provided twenty-two word information measures, of which twenty-one were selected as features. Noun incidence, Verb incidence, Adjective incidence, and Adverb incidence were highly significant. Poor comprehenders’ transcripts had a comparatively greater proportion of pronouns than those of proficient comprehenders.

  11. Readability: The tool provided three readability measures, of which one was selected as a feature. Flesch Reading Ease differed significantly between proficient comprehenders’ text and poor comprehenders’ text in both sets of transcripts.

Approach-2: In this approach, we selected appropriate features from all 106 Coh-Metrix measures by applying Welch’s two-tailed, unpaired t-test on each measure of both types of comprehenders’ transcripts. All features that were significant at p < 0.05 were selected for classification. Thus, the feature set of story transcripts had 15 measures (Table 2) whereas the feature set of course transcripts had 14 measures (Table 3).
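
The selection rule of Approach-2 can be sketched with the Python standard library alone. The sketch computes Welch's t statistic with the Welch-Satterthwaite degrees of freedom and obtains the two-tailed p-value by numerically integrating the Student t density (Simpson's rule), which is an implementation choice made here to avoid external dependencies. All function names, measure names, and values are hypothetical, and each group is assumed to have non-zero variance.

```python
import math
from statistics import mean, variance


def two_tailed_p(t, df, steps=2000):
    """P(|T| > |t|) for Student's t with df degrees of freedom.
    Integrates the density over [0, |t|] with Simpson's rule (steps is even)."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    upper = abs(t)
    h = upper / steps
    total = pdf(0.0) + pdf(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = total * h / 3
    return max(0.0, 1.0 - 2.0 * central)


def welch_t_test(a, b):
    """Return (t, df, p) for Welch's unpaired, two-tailed t-test."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df, two_tailed_p(t, df)


def select_significant(proficient, poor, alpha=0.05):
    """Keep the measures whose group means differ at p < alpha."""
    return [m for m in proficient
            if welch_t_test(proficient[m], poor[m])[2] < alpha]


# Hypothetical values of two measures for each group of transcripts.
proficient = {"lexical_diversity": [60, 62, 65, 63, 61],
              "word_length": [4.1, 4.3, 4.0, 4.2, 4.4]}
poor = {"lexical_diversity": [50, 52, 49, 51, 53],
        "word_length": [4.2, 4.1, 4.3, 4.0, 4.4]}

selected = select_significant(proficient, poor)
print(selected)
```

Because the test is applied to each measure separately on the same twenty transcripts, the selected set can change when the participants or the passage change.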

Table 2. A comparison of proficient and poor comprehenders’ features extracted from story transcripts. Values shown are mean (standard deviation).
Table 3. A comparison of proficient and poor comprehenders’ features extracted from course transcripts. Values shown are mean (standard deviation).

5 Classification

We examined several classification methods, namely Decision Trees, Multi-Layer Perceptron, Naïve Bayes, and Logistic Regression, using the Weka toolkit [6]. Ten-fold cross-validation was applied to train these classifiers. The results are reported in Table 4 in terms of classification accuracy and root-mean-square error (RMSE). Classification accuracy is the percentage of samples in the test dataset that are correctly classified (true positives plus true negatives). RMSE is frequently used as a measure of the differences between the values predicted by a classifier and the expected values; in this experiment, it quantified the difference between the predicted and the expected comprehension level of the students. The baseline accuracy is the accuracy that would be achieved by assigning every sample to the larger of the two classes. In this experiment, both classes had 10 samples; therefore, the baseline for poor vs. proficient comprehenders’ transcripts is obtained by assigning all samples to either one of the groups, giving a baseline accuracy of 0.5 (10/20 = 0.5).
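
The three quantities defined above can be illustrated with a short sketch. The labels and predictions are hypothetical, and the RMSE here is computed on hard 0/1 labels for simplicity; Weka may compute it from predicted class probabilities instead, so the values it reports need not match this calculation.

```python
import math
from collections import Counter


def accuracy(expected, predicted):
    """Fraction of samples whose predicted label equals the expected label."""
    correct = sum(e == p for e, p in zip(expected, predicted))
    return correct / len(expected)


def rmse(expected, predicted):
    """Root-mean-square difference between expected and predicted values."""
    squared = [(e - p) ** 2 for e, p in zip(expected, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def majority_baseline(expected):
    """Accuracy obtained by assigning every sample to the largest class."""
    return max(Counter(expected).values()) / len(expected)


# 1 = proficient, 0 = poor; ten transcripts per class, as in the experiment.
expected = [1] * 10 + [0] * 10
# Hypothetical predictions: two proficient transcripts are misclassified.
predicted = [1] * 8 + [0] * 2 + [0] * 10

acc = accuracy(expected, predicted)
err = rmse(expected, predicted)
base = majority_baseline(expected)
print(acc, round(err, 4), base)
```

With balanced classes the baseline is 0.5 regardless of which class is chosen, which matches the 10/20 figure given above.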

Table 4. Accuracies for the four classifiers.

6 Result and Discussion

Table 4 shows the accuracies for classifying poor vs. proficient comprehenders’ transcripts. The accuracies for approach-1 were not as high as those for approach-2; however, they were equal to or above the baseline for all four classifiers. Also, the common features gave better accuracies than the first-step features (story or course feature set). In this experiment, the reduced feature sets of approach-2 gave the best results for all four classifiers. However, the features selected by approach-2 depended on the participants involved in the experiment as well as on the text read, whereas the features of approach-1 were largely robust to these changes. The major findings of this study show that three indices, lexical diversity, connectives, and word information, which are common to Tables 2 and 3, played a vital role in the classification of both types of transcripts. The logistic regression classifier classified story transcripts and course transcripts with accuracies of 100% and 80% respectively.

Generally, on a first reading of a new text, a science and technology course text does not help most students to develop a mental model representing the collective conceptual relations between the scientific concepts, because they lack prior domain knowledge. In contrast, story texts carry general schemata such as names, specific places, and the chronological details of an event; these schemata help students develop a mental model by integrating the specific attributes of the event described in the story [11]. Therefore, students stored the mental model of the story text in memory in more detail than that of the course text, which was reflected in their transcripts. The story transcripts of both proficient and poor comprehenders contained more noun phrases than their course transcripts.

Poor comprehenders may not benefit as much as good comprehenders from reading a complex text, because grammatical and lexical linking within the text increases its length, density, and complexity. As a consequence, reading such a text involves creating and processing a more complex mental model. Comprehenders with low working-memory capacity experience numerous constraints on the processing of these larger mental models, resulting in lower comprehension and recall performance [8]. As a result, poor comprehenders’ transcripts contain comparatively more sentences with mixed content, reflecting the confused state of their mental models. Therefore, as shown in Table 1, the values of the situation model measures were higher in poor comprehenders’ transcripts than in proficient comprehenders’ transcripts.

The findings of this study also corroborate a previous study [3], which demonstrated that less-skilled comprehenders produced narratives that were poor in terms of both structural coherence and referential cohesion.

In short, the Coh-Metrix analysis of the transcripts reveals a number of linguistic properties of comprehenders’ narrative speech. Comprehension proficiency was characterized by greater cohesion, shorter sentences, more connectives, greater lexical diversity, and more sophisticated vocabulary. Lexical diversity, word information, LSA, syntactic pattern density, and sentence length provided the most predictive information for distinguishing proficient from poor comprehenders.

In conclusion, the current study supports the use of Coh-Metrix features to measure a comprehender’s ability.