Ahmadi, A. (2019). A Study of Raters’ Behavior in Scoring L2 Speaking Performance: Using Rater Discussion as a Training Tool. Issues in Language Teaching, 8(1), 195-224.
Ahmadi, A., & Sadeghi, E. (2016). Assessing English language learners’ oral performance: A comparison of monologue, interview, and group oral test. Language Assessment Quarterly, 13, 341–358. https://doi.org/10.1080/15434303.2016.1236797
Ary, D., Jacobs, L. C., Irvine, C. K. S., & Walker, D. (2018). Introduction to research in education. Cengage Learning.
Attali, Y. (2016). A comparison of newly-trained and experienced raters on a standardized writing assessment. Language Testing, 33(1), 99-115.
Barkaoui, K. (2007). Rating scale impact on EFL essay marking: A mixed-method study. Assessing Writing, 12(2), 86-107.
Barkaoui, K. (2010). Variability in ESL essay rating processes: The role of the rating scale and rater experience. Language Assessment Quarterly, 7(1), 54-74.
Broad, B. (1997). Reciprocal authorities in communal writing assessment: Constructing textual value within a “New politics of inquiry”. Assessing Writing, 4(2), 133-167.
Broad, B. (2003). What we really value: Beyond rubrics in teaching and assessing writing. Utah: Utah State University Press.
Cambridge University Press. (2015). Cambridge IELTS 10: Authentic examination papers from Cambridge ESOL. New York, NY: Cambridge University Press.
Cambridge University Press. (2016). Cambridge IELTS 11: Authentic examination papers from Cambridge ESOL. New York, NY: Cambridge University Press.
Charney, D. (1984). The validity of using holistic scoring to evaluate writing: A critical overview. Research in the Teaching of English, 18(1), 65-81.
Clauser, B. E., Clyman, S. G., & Swanson, D. B. (1999). Components of rater error in a complex performance assessment. Journal of Educational Measurement, 36(1), 29-45.
Corbin, J., & Strauss, A. (2014). Basics of qualitative research: Techniques and procedures for developing grounded theory (4th ed.). SAGE.
Crismore, A., Markkanen, R., & Steffensen, M. S. (1993). Metadiscourse in persuasive writing: A study of texts written by American and Finnish university students. Written Communication, 10(1), 39-71.
Cumming, A. (1990). Expertise in evaluating second language compositions. Language Testing, 7(1), 31–51.
Cumming, A., Kantor, R., & Powers, D. (2001). Scoring TOEFL essays and TOEFL 2000 prototype tasks: An investigation into raters’ decision making, and development of a preliminary analytic framework. TOEFL Monograph Series, Report No. 22.
Cumming, A., Kantor, R., & Powers, D. (2002). Decision making while rating ESL/EFL writing tasks: A descriptive framework. Modern Language Journal, 86(1), 67–96.
Davis, L. (2016). The influence of training and experience on rater performance in scoring spoken language. Language Testing, 33(1), 117-135.
Ducasse, A., & Brown, A. (2009). Assessing paired orals: Raters’ orientation to interaction. Language Testing, 26(3), 423–443. https://doi.org/10.1177/0265532209104669
Eckes, T. (2005). Examining rater effects in TestDaF writing and speaking performance assessments: A many-facet Rasch analysis. Language Assessment Quarterly: An International Journal, 2(3), 197-221.
Eckes, T. (2008). Rater types in writing performance assessments: A classification approach to rater variability. Language Testing, 25(2), 155-185.
Eckes, T. (2012). Operational rater types in writing assessment: Linking rater cognition to rater behavior. Language Assessment Quarterly, 9(3), 270-292.
In’nami, Y., & Koizumi, R. (2016). Task and rater effects in L2 speaking and writing: A synthesis of generalizability studies. Language Testing, 33(3), 341-366.
Isaacs, T., & Thomson, R. I. (2013). Rater experience, rating scale length, and judgments of L2 pronunciation: Revisiting research conventions. Language Assessment Quarterly, 10(2), 135-159.
Johnson, R. L., Penny, J., Gordon, B., Shumate, S. R., & Fisher, S. P. (2005). Resolving score differences in the rating of writing samples: Does discussion improve the accuracy of scores? Language Assessment Quarterly: An International Journal, 2(2), 117-146.
Jølle, L. (2014). Pair assessment of pupil writing: A dialogic approach for studying the development of rater competence. Assessing Writing, 20, 37–52.
Kim, H. J. (2015). A qualitative analysis of rater behavior on an L2 speaking assessment. Language Assessment Quarterly, 12(3), 239-261.
Kim, S., & Lee, H. K. (2015). Exploring rater behaviors during a writing assessment discussion. English Teaching, 70(1).
Lim, J. (2019). An investigation of the text features of discrepantly-scored ESL essays: A mixed-methods study. Assessing Writing, 39, 1-13.
Lindhardsen, V. (2018). From independent ratings to communal ratings: A study of CWA raters’ decision-making behaviors. Assessing Writing, 35, 12-25.
Lumley, T. (2002). Assessment criteria in a large-scale writing test: What do they really mean to the raters? Language Testing, 19(3), 246–276.
Lumley, T. (2005). Assessing second language writing: The rater’s perspective. Frankfurt am Main: Peter Lang.
Lumley, T., & McNamara, T. F. (1995). Rater characteristics and rater bias: Implications for training. Language Testing, 12(1), 54–71.
May, L. (2009). Co-constructed interaction in a paired speaking test: The rater’s perspective. Language Testing, 26(3), 397-421.
Moss, P., Schutz, A., & Collins, K. (1998). An integrative approach to portfolio evaluation for teacher licensure. Journal of Personnel Evaluation in Education, 12(2), 139–161.
Papajohn, D. (2002). Concept mapping for rater training. TESOL Quarterly, 36(2), 219–233.
Sakyi, A. A. (2003). A study of the holistic scoring behaviors of experienced and novice ESL instructors [Unpublished doctoral dissertation]. The University of Toronto.
Smith, D. (2000). Rater judgments in the direct assessment of competency-based second language writing ability. Studies in immigrant English language assessment, 1, 159-189.
Trace, J., Janssen, G., & Meier, V. (2017). Measuring the impact of rater negotiation in writing performance assessment. Language Testing, 34(1), 3-22.
Vaughan, C. (1991). Holistic assessment: What goes on in the rater’s mind? In L. Hamp-Lyons (Ed.), Assessing second language writing in academic contexts (pp. 111–125). Ablex.