This study investigates the reliability of the “Standards for Assessment of Intermediate-level Japanese Learners’ Essay”, a rating scale that draws on recent studies in the field of Japanese language education, the Common European Framework of Reference for Languages: Learning, Teaching, Assessment (CEFR), and the JF Standard for Japanese-Language Education. Twenty-four essays by intermediate-level Japanese learners, selected from the “JLPTUFS Writing Corpus”, were evaluated according to the “Standards for Assessment of Intermediate-level Japanese Learners’ Essay”.
The evaluators were four native-speaker teachers of Japanese working at universities. They were divided into two groups according to the scoring method employed: a holistic scoring method (Group A) and a newly developed scoring method (Group B). Correlation analysis, Cohen's kappa, significance tests of the correlation coefficients, and intra-class correlation coefficients were computed using IBM SPSS software and Microsoft Office Excel.
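As an illustration of one of the agreement statistics named above, the following is a minimal sketch of Cohen's kappa for two raters, written in plain Python. It is not the SPSS procedure used in the study. The function name `cohen_kappa` and the example scores are hypothetical and are not taken from the study's data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed proportion
    of agreement and p_e is the agreement expected by chance from each
    rater's marginal category frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of items.")

    n = len(rater_a)

    # Observed agreement: share of items given the same category by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of the two raters' marginal proportions,
    # summed over categories (a Counter returns 0 for unseen categories).
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical category for every item;
        # kappa is undefined here, so this sketch reports full agreement.
        return 1.0

    return (observed - expected) / (1 - expected)


# Hypothetical band scores for six essays from two raters (illustrative only).
scores_rater_1 = [3, 4, 2, 5, 3, 4]
scores_rater_2 = [3, 3, 2, 4, 3, 5]
kappa = cohen_kappa(scores_rater_1, scores_rater_2)
```

A kappa near 0 indicates agreement no better than chance, and a value of 1 indicates perfect agreement. Unweighted kappa treats every disagreement as equally serious, so for ordinal essay scores a weighted variant may be more appropriate; that choice is an assumption outside what the abstract specifies.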
The results are as follows. (1) Intra-group correlation analysis and Cohen's kappa showed neither a significant correlation nor a meaningful degree of agreement within either evaluator group. The results of the inter-group analysis could likewise not be deemed significant or reliable. (2) There was no significant difference between the two scoring methods. (3) All evaluators gave a similar opinion regarding the advantage of the “Standards for Assessment of Intermediate-level Japanese Learners’ Essay”: a clear categorization of evaluation items. On the other hand, “redundant items” and “inadequacy and redundancy of evaluation items” were identified as disadvantages of the rating scale.
Three factors were deemed to have contributed to the unreliable results: an insufficient number of evaluators, a lack of evaluator training, and the influence of the level of the classes the evaluators regularly teach. In conclusion, further research should consider creating a new rating scale that retains the advantages observed in the course of this study and corrects the shortcomings of the “Standards for Assessment of Intermediate-level Japanese Learners’ Essay”.