Document Type: Research Paper


1 Faculty member at Shiraz University

2 PhD candidate at Shiraz University

3 Shiraz University

4 Shiraz University


Although some piecemeal efforts have been made to investigate the validity and use of the Iranian PhD exam, no systematic project has been carried out specifically for this purpose. The current study attempted to fill this gap. To ensure a balanced focus on test interpretation and test consequences, and to track evidence from a mixed-method study on the validity of the Iranian PhD entrance exam of TEFL (IPEET), the study drew on a hybrid of two argument-based structures: Kane's (1992) argument model and Bennett's (2010) theory of action. Drawing on the network of inferences and assumptions borrowed from this hybrid framework, the study investigated the extent to which the proposed assumptions were supported by empirical evidence. It also examined the unintended consequences that might be revealed through this validity investigation. Three sources of data informed the study: (a) test score data from about 1000 PhD applicants who took the IPEET administered in 2014, (b) questionnaires completed by university professors and PhD students of TEFL, and (c) telephone interviews with university professors and focus-group interviews with PhD students of TEFL. The results of the mixed-method analysis indicated that all the inferences proposed for this study were rebutted, suggesting that unintended consequences have affected both the technical quality and the decision quality of the test, and hence that the test is invalid. The findings also provided valuable insights and suggestions for improving the present content and current policy of the IPEET in Iran.


Ary, D., Jacobs, L. C., & Sorensen, C. (2010). Introduction to research in education (8th Ed.). New York, NY: Wadsworth.

Azmoon.Net. (2014). PhD entrance examination news. Retrieved October 15, 2014, from www.Phd.Azmoon.Net. www. PhD Test.

Bennett, R. E. (2010). Cognitively based assessment of, for, and as learning: A preliminary theory of action for summative and formative assessment. Measurement: Interdisciplinary Research and Perspectives, 8, 70-91.

Cohen, L., Manion, L., & Morrison, K. (2007). Research methods in education (6th Ed.). London: Routledge.

Cronbach, L. J. (1980). Validity on parole: How can we go straight? New directions for testing and measurement: Measuring achievement over a decade. In Proceedings of the 1979 ETS Invitational Conference (pp. 99-108). San Francisco, CA: Jossey-Bass.

Douglas, D. (2014). Understanding language testing. Oxon: Hodder Education.

Dörnyei, Z. (2007). Research methods in applied linguistics: Quantitative, qualitative and mixed methodologies. Oxford: Oxford University Press.

Drasgow, F. (1987). Study of the measurement bias of two standardized psychological tests. The Journal of Applied Psychology, 72, 19–29.

Farhady, H., Jafarpur, A. J., & Birjandi, P. (2014). Testing Language Skills from Theory to Practice. Tehran: SAMT.

Glaser, B. G., & Strauss, A. (1967). The discovery of grounded theory: Strategies for qualitative research. New York: Aldine.

Green, A. (2007). Washback to the learners: Learners and teacher perspectives on IELTS preparation course expectation and outcomes. Assessing Writing, 11, 113-134.

Haertel, E. (2013). How is testing supposed to improve schooling? Measurement: Interdisciplinary Research and Perspectives, 11(1-2), 1-18.

Johnson, R. C., & Riazi, M. (2013). Assessing the assessments: Using an argument-based validity framework to assess the validity and use of an English placement system in a foreign language context. Papers in Language Testing and Assessment, 2(1), 31-58.

Kane, M. T. (1992). An argument-based approach to validity. Psychological Bulletin, 112(3), 527-535.

Kane, M., Crooks, T., & Cohen, A. (1999). Validating measures of performance. Educational Measurement: Issues and Practice, 18(2), 5-17.

Kane, M. T. (2006). Validation. Educational Measurement, 4, 17-64.

Kane, M. T. (2011). Validating score interpretations and uses. Language Testing, 29(1), 3–17.

Kane, M. T. (2013). Validating the interpretations and uses of test scores. Journal of Educational Measurement, 50(1), 1–73.

Kiany, R., Shayestefar, P., Ghafar Samar, R., & Akbari, R. (2013). High-rank stakeholders’ perspectives on high-stakes university entrance examinations reform: Priorities and problems. Higher Education, 65, 325–340.

Kline, P. (2000). The handbook of psychological testing (2nd Ed.). London: Routledge.

Kunnan, A. J. (2000). Fairness and justice for all. In A. J. Kunnan (Ed.), Fairness and Validation in Language Assessment: Selected Papers from the 19th Language Testing Research Colloquium, Orlando, Florida (pp. 1-14). Cambridge: Cambridge University Press.

Kunnan, A. J. (2003). Test fairness. In M. Milanovic & C. Weir (Eds.), Select Papers from the European Year of Languages Conference, Barcelona. Cambridge: Cambridge University Press.

Maxwell, J. A. (1996). Qualitative Research Design: An Interactive Approach. Thousand Oaks, California: Sage Publications.

Monk, T. H. (1990). The relationship of chronobiology to sleep schedules and performance demands. Work and Stress, 4(3), 227-236.

NOET. (2013). PhD entrance examination news. Retrieved December 20, 2013, from Story.aspx? gid=1&id=730

Shulman, H. C., Boster, F. J., & Carpenter, C. J. (2011). Do data collection procedures influence political knowledge test performance? Paper presented at the annual meeting of the Midwestern Political Science Association in Chicago, IL.

Sireci, S. G., & Rios, J. A. (2013). Decisions that make a difference in detecting differential item functioning. Educational Research and Evaluation, 19, 170–187. DOI: 10.1080/13803611.2013.767621.

Takala, S., & Kaftandjieva, F. (2000). Test fairness: A DIF analysis of an L2 vocabulary test. Language Testing, 17, 323–40.

Teddlie, C., & Tashakkori, A. (2003). Major issues and controversies in the use of mixed methods in the social and behavioral sciences. In A. Tashakkori & C. Teddlie (Eds.), Handbook of mixed methods in social and behavioral research. Thousand Oaks, CA: Sage.

Teddlie, C., & Tashakkori, A. (2006). A general typology of research designs featuring mixed methods. Research in Schools, 13(1), 12-28.

Weir, C. J. (2005). Language testing and validation. Hampshire: Palgrave Macmillan.

Wise, S. L., Kingsbury, G., Hauser, C., & Ma, L. (2010). An investigation of the relationship between time of testing and test-taking effort. Paper presented at the annual meeting of the National Council on Measurement in Education, Denver, CO.

Xi, X. (2010). How do we go about investigating test fairness? Language Testing, 27(2), 147-170.

Zumbo, B. D. (1999). A handbook on the theory and methods of differential item functioning (DIF): Logistic regression modeling as a unitary framework for binary and Likert-type (ordinal) item scores. Ottawa, ON: Directorate of Human Resources Research and Evaluation, Department of National Defense.

Zumbo, B. D. (2008, July). Statistical methods for investigating item bias in self-report measures. Florence Lectures on DIF and Item Bias. Lectures Conducted from Universita degli Studi di Firenze, Florence, Italy.