2012 - Volume 5, Number 2
Using Evaluation Results for Improving Teaching Practice: A Research Case Study
A substantial body of research has been conducted on the use of student evaluations of college teaching. This research has mainly centered on the dimensionality, validity, reliability, and generalizability of the ratings, and on possible biasing factors that could affect the validity of the ratings (Marsh, 1984; Braskamp & Ory, 1994; El-Hassan, 1995). Only a limited number of studies have examined the longitudinal use of student ratings and their possible influence on instructional improvement. This research case study is devoted to that purpose. It is a case study of a senior professor at a public research university in the Midwest. The study traces how the instructor used feedback from peers, his department head, and students. The research is an attempt to better understand evaluation results as feedback for instructional improvement.

1. CONCEPTUAL FRAMEWORK

Prior research has shown mixed evidence about the possible effects of feedback from student ratings on the improvement of teaching. Cohen (1991) supported the potential use of the ratings for instructional improvement, and other authors (Miller, 1987; Roush, 1983) found a strong positive effect of the ratings on instructional improvement when accompanied by consultation. Still other findings have shown only minor or limited impact of the ratings on instructional improvement when measured over time (Gray & Brandenburg, 1985). As Brinko (1996) notes, while some studies have affirmed that feedback from student ratings is valid and reliable, the research showed only a marginal effect of the ratings when used alone. More research seems to be needed on how faculty use feedback from student ratings. In addition, future studies could look at the benefits and trade-offs of using the feedback provided by the ratings.

2. ISSUES
3. METHODOLOGY

This study portrays how a male instructor, his peers, and the department head used feedback from student ratings{1}. A case study approach was employed to provide a holistic mode of inquiry consonant with the exploration of the issues addressed by the study. The method selected provided understanding of the process and events taking place and of the complexities of the context under which the evaluation was conducted (Patton, 1980). The case study was organized around main questions or issues (Stake, 1995). The criterion for selecting the professor and his course was not so much "What course and faculty member represent the totality of the instructors and disciplines taught in this university?" but rather "What instructor would help us understand the use of student feedback and its impact on teaching?" The professor was also selected because of his desire to continue improving his instruction. The case was George Alderman (a pseudonym), a senior professor who had been receiving low to moderate ratings in recent years. He had used consultation services within the Division of Instructional Development on campus. He was teaching math-related courses required for engineering students. The students who attended his course were undergraduates in their junior or senior year.

Data were collected over a period of two semesters through classroom observations and semi-structured, open-ended interviews with the faculty member, with other instructors in his Department who taught the same course, and with the Department Head. Classroom observations provided an understanding of the teaching styles, classroom techniques, and interactions between the instructor and the students. The professor was later asked how the feedback from the ratings had influenced activities taking place in the classroom. Interviews with the professor and the Department Head centered on the four issues cited.
Each of these interviews was conducted in the instructor's private office and lasted from twenty minutes to an hour. Student ratings, evaluation forms, and a variety of documentary data were available to the researcher. Because the records are considered private, access was gained after obtaining the consent of the parties involved. The data collection methods used, the length of time devoted to data collection on site, and the use of multiple sources of evidence contributed to a better understanding of the context in which teaching took place. The first author gathered the data, using traditional ways of validating her observations. She took care to use more than one medium: observation, interviews, and document review. Data collection resulted in multiple statements, testimony, documents, and observations.

Data analysis in this study, as in most qualitative studies, commenced while data collection continued. Throughout the study, data analysis helped to redefine the issues. Field note analysis was qualitative. To organize the data in a form that facilitated analysis and interpretation, the researcher used a thematic approach. She analyzed her field notes using the key questions. She showed her processed notes to the participants and confirmed the accuracy of the quotes and descriptions attributed to them.

4. FINDINGS

What follows is a description of the case study findings. First, we present a brief description of the case, followed by the sources of evaluation feedback received by the instructor. Then comes the analysis of the utility of these sources of feedback for instructional improvement.

4.1. The case and its teaching context

George Alderman was a senior professor with twenty years of experience. He obtained his Ph.D. in the early 1970s from a prestigious university in the eastern United States, and his bachelor's degree in Mathematics from a prestigious institute of technology.
Since before 1980, George had worked for his present university's Practical College. He had usually taught statistics-related courses, particularly ones required for undergraduate students. George's university had been using a campus-wide, cafeteria-style evaluation system since 1967, which used student ratings as one source for evaluating college instruction. The university required all assistant and associate professors to submit their evaluation rating results every semester and for every course (Ory & Wieties, 1991). All evaluation survey forms included two "global items" and a set of specific items to evaluate instruction. The two global items were used for summative purposes and the set of specific items for instructional improvement. Faculty members had the option to select their own specific rating items and/or to use departmentally selected core items. Instructors and departments selected their items from a catalog of more than 400 items (Ory & Wieties, 1991).

During an interview, the Department Head talked about how much the Department valued teaching. He added that they reviewed their faculty ratings in comparison with others across campus. Student ratings results were provided to administrators for making decisions about promotion, tenure, and teaching awards, as well as for providing instructors with feedback about their teaching. Faculty members were expected to use this information for improvement; if they required help, they could go to the Division of Instructional Development. At the time of this study, the university was reviewing the evaluation system as one indicator of institutional assessment, seeking more information about the use of evaluation results and their impact on instruction at the individual and departmental levels.

I met George through the Division of Instructional Development on campus. He had been in contact with this division for several semesters.
Since he had received low ratings on his teaching, he went to this office for help.

4.2. George's Class

George taught his class twice a week; one session started at 1:00 pm and the other at 3:00 pm. The class took place in a classroom in a major building of the campus. The room had good illumination and a capacity of 45 students. It was large but not deep, leaving little space for students to interact with each other. Student seats were arranged in rows facing the instructor's desk. George's style of teaching is briefly summarized below:
George traditionally received low to moderate ratings from the students, and his course had a poor reputation among them.
A semester later, only George was rated low. His ratings actually went down, partly because he had called for student teamwork, with students grading each other's participation. One of George's students formally charged him with "capricious grading," and the Department Head sent George a letter asking him to find a way to improve.{2}

4.3. Feedback sources

Next is a description of the main sources of feedback that George received for improving his teaching: student ratings, the Department Head, peers, and the instructional specialist.

4.3.1. Students

Analysis of the student ratings on the evaluation forms for George's class during the two years studied indicated that he received low to moderate ratings in relation to the campus norm. Although invited to, students seldom wrote comments on the back of the forms, comments that George might have used to learn more about his strengths and weaknesses as an instructor. During the semester in which George began to use more teamwork, he asked his students to rate each other's participation in the teamwork activities. That semester he received his lowest ratings ever, and some students wrote on the evaluation form that they disliked the teamwork activities. The following semester, George's ratings began to improve. To obtain more information for improving his teaching, George then asked his students to write about his teaching strengths and weaknesses, as well as suggestions for improvement.

It should be noted that George was teaching a required course for freshman and sophomore students, part of a general education curriculum. Some students did not have the required background for the course. In addition, some students indicated that they had had negative perceptions of the course before enrolling and that they had registered for it because it was a requirement for graduation.
It is important to note that the literature on student ratings indicates that students tend to rate required courses lower than electives (Braskamp & Ory, 1994; Stake & Cisneros-Cohernour, 2004). During a focus group, some of George's students who were majoring in liberal arts fields (e.g., History and Marketing) expressed a negative perception of the course because they lacked the course prerequisites. During that semester, George asked students to provide him with feedback anonymously. Student comments about the strengths and weaknesses of George's teaching included:
Some of these comments concerned teaching style, the way in which George communicated with students, and practical suggestions for teaching. Most suggestions assumed a teacher-centered approach. George thought that his students' comments were not entirely consistent, so he selected those he considered he could act on to improve his teaching. Before making any changes, he consulted with Alice, the instructional specialist. During classroom observations, students were using George's class materials, taking notes, and applying the concepts while solving math problems.

4.3.2. George's Department Head

In interviews, the Head spoke of a need for the improvement of teaching. She acknowledged that she was not providing faculty members with feedback on instruction nor creating mechanisms to support teaching:
4.3.3. Peers

Two other instructors taught the same course as George. One of them was Frank Edwards, a faculty member in his second year of tenure. The other was Will Wilson, a tenured professor who, like George, had twenty years of experience. During interviews, Wilson indicated that he had worked with George in the past and had attempted to help him improve his teaching. He said:
Will Wilson added that it was very important for instructors to be careful in the way they communicate with their students:
He added that, after observing George's class, he did not find a big difference between George's teaching and his own. George's other peer, Frank Edwards, said that he had worked with Wilson in planning the course but had not worked with George; he knew George liked to work independently.

4.3.4. The instructional specialist

For two years, George received feedback from Alice, a consultant from the campus Division of Instructional Development. He had voluntarily requested help in improving his teaching. Alice commented that George often received compliments from his students for being a caring, knowledgeable, and fair instructor, and for being accessible. She added that her work at that moment was to help George find ways to improve his teaching as well as his ratings:
The next semester George followed Alice's advice to include more two-person teamwork activities in his classes. That was when he encouraged students to rate each other's participation in class. Several students were upset because most Department faculty members granted credit for class participation based on class attendance. As indicated earlier, one student formally accused George of capricious grading. At the end of that semester George's ratings were very low, and his Department Head sent a letter requesting that he improve them. The situation did not discourage George; he made more changes, continuing to use teamwork activities, give handouts to his students, collect early feedback on his teaching, and increase the use of questioning in his class. In Alice's opinion, George was a person who liked to make small changes when these did not conflict with his teaching philosophy. She also mentioned that George had changed gradually over the two years, adding that change in teaching could be expected to be a gradual process and that one semester was a very short period of time in which to see any real change.

4.4. Usefulness of the feedback

During interviews with George and his peers, it was clear that in that Department student ratings were the official source of feedback. The Department Head had a positive opinion of this source of information:
Unlike the Department Head, George’s colleague Frank Edwards said:
Will Wilson, George’s other colleague, added:
Frank Edwards went further:
And George said:
George added that he preferred feedback from an external observer.
5. CONCLUSIONS

The case study focused on a professor trying to improve his teaching. He worked with feedback from his students, peers, and an instructional specialist. His Department Head used only student ratings as an official source of feedback, and concentrating on this single source made it difficult for George to improve the quality of his teaching. These student evaluation findings did not take into consideration the meaning of the scores in context, nor that student evaluations can be negatively influenced by course characteristics{3}, lack of student interest and prerequisites, the way the questionnaire is administered, the instructor, the students, the instruments, and the workload and depth of course coverage (Brodie, 1998; Ryan et al., 1980).

Student ratings were especially problematic because they were used for multiple purposes. The literature on faculty evaluation clearly holds that evaluation purposes should not be mixed. Haskell (1997), for example, states that the use of student ratings for both formative and summative purposes can lead to misuse; it can result in the violation of the instructor's academic freedom and diminish the quality of teaching. In George's case, two main trade-offs resulted from this decision. The first was that the emphasis on student ratings might discourage faculty members from innovative teaching. George experienced this problem when he began to introduce changes in his teaching: his low ratings were used to deny him a salary increase the year he obtained his lowest ratings. George said that this problem could be solved if the Department Head were willing to allow a professor to experiment with a new practice without reviewing the ratings for at least a year. This suggestion, however, required the willingness of the department administration.
If department leaders are not aware of the complexities of change in teaching, they are unlikely to maximize improvement among their weaker teachers. The second trade-off was the pressure for homogeneity, as the Department equated student ratings with effective teaching. George perceived that equating good teaching with a single numerical score could limit a professor's academic freedom. Since he had tenure, he had a certain freedom to make changes in his teaching. Unfortunately, other faculty members may not have the same freedom as George to resist pressures for standardization; untenured professors who work in departments with policies like those of George's department may feel their academic freedom restricted. Above all, relying only on student ratings can make instructors focus on improving scores rather than on reflecting upon and improving the teaching itself. Pleasing students is not the only way of teaching effectively.

The professors who participated in the study were concerned about the meaning that students gave to the construct of good teaching. Looking at the informal feedback that George received from his students, we observed that they sometimes held contradictory views. Some wanted more redundancy, some less. Some wanted topics restricted to the textbook; some wanted the instructor's personal experience. Students are experts in what they want but not in what others should have. Their suggestions focused on teaching style, favoring a teacher-centered type of instruction, and seldom indicated how they themselves could work harder. Will Wilson, George's associate, said:
Wilson identified the causality issue related to the weight given to the ratings by the department:
Wilson added that students prefer some courses over others, which influences the ratings. So when he has obtained low ratings in a course, he said, he moves to another course, one that he knows his students like:
The professors restated the advantages and disadvantages of feedback from peers, and added that different professors may prefer different types of feedback for improving their teaching; what works for some may not work for all. Finally, the case study showed that, in spite of more than sixty years of research, the evaluation of teaching continues largely to be a responsibility of the individual instructor. Few faculty members enjoy a community of practice, that is, a group of professionals, informally united, exposed to common problems, searching for common solutions, and sharing with each other a source of knowledge about teaching (Johnson-Lenz & Johnson-Lenz, 1997; Stake & Cisneros-Cohernour, 2004).
REFERENCES

Braskamp, L. A. and Ory, J. C. (1994). Assessing faculty work: Enhancing individual and institutional performance. San Francisco, CA: Jossey-Bass.

Brinko, K. (1996). The practice of giving feedback to improve teaching: What is effective? Journal of Higher Education, 74 (5), pp. 574-593.

Brodie, D. A. (1998). Do students report that easy professors are excellent teachers? The Canadian Journal of Higher Education, 23 (1), pp. 1-20.

Centra, J. A. (1993). Reflective faculty evaluation: Enhancing teaching and determining faculty effectiveness. San Francisco, CA: Jossey-Bass.

Cohen, P. A. (1991). Effectiveness of student-rating feedback and consultation for improving instruction in dental schools. Journal of Dental Education, 55 (2), pp. 45-50.

El-Hassan, K. (1995). Students' ratings of instruction: Generalizability of findings. Studies in Educational Evaluation, 21 (4), pp. 411-429.

Gray, D. M. and Brandenburg, D. C. (1985). Following student ratings over time with a catalog-based system. Research in Higher Education, 22, pp. 155-168.

Guba, E. G. and Lincoln, Y. S. (1985). Effective evaluation: Improving the usefulness of evaluation results through responsive and naturalistic approaches. San Francisco, CA: Jossey-Bass.

Johnson-Lenz, P. and Johnson-Lenz, J. (1997). Bonding by exposure to common problems. In What is a Community of Practice? Community Intelligence Labs. Available online: http://www.co-i-l.com/coil/knowledge-garden/cop/definitions.shtml.

Marsh, H. W. (1984). Student evaluations of university teaching: Dimensionality, reliability, potential biases, and utility. Journal of Educational Psychology, 76, pp. 707-754.

Miller, R. I. (1987). Evaluating faculty for promotion and tenure. San Francisco, CA: Jossey-Bass.

Ory, J. C. and Wieties, R. (1991). A longitudinal study of faculty selection of student evaluation items. Paper presented at the annual meeting of the American Educational Research Association, Chicago, April.

Ory, J. and Ryan, K. (2001). How do student ratings measure up to a new validity framework? New Directions for Institutional Research, 109. San Francisco, CA: Jossey-Bass.

Patton, M. Q. (1980). Qualitative evaluation methods. Beverly Hills, CA: Sage Publications.

Roush, D. C. (1983). Strategies for effective university teaching. Materials for teaching methodology workshops of the Fulbright Exchange Program. Latin American Scholarship Program for American Universities (LASPAU). Harvard, MA.

Ryan, J. J., Anderson, J. A. and Birchler, A. B. (1980). Student evaluations: The faculty responds. Research in Higher Education, 12, pp. 317-333.

Stake, R. (1995). The art of case study research. London: Sage Publications.

Stake, R. E. and Cisneros-Cohernour, E. J. (2004). The quality of teaching in higher education. The Quality of Higher Education (Lithuania), (1), pp. 94-107. Available online: http://skc.vdu.lt/downloads/zurnalo_arch/amk_1/094_117stake.pdf.

Whittman, N. and Weiss, E. (1982). Faculty evaluation: The use of explicit criteria for promotion, retention and tenure (Higher Education Report No. 2). Washington, DC: AAHE-ERIC.
{1} To ensure confidentiality, pseudonyms have been used to replace the names of the departments, faculty, and department head involved in this study. Some events have also been presented in a way that diminishes recognition. The study includes only information deemed necessary for understanding the case.

{2} Frank Edwards, one of George's peers, said: "In general, this course, having now been taught by a number of different faculty members, does not receive good ratings. Since the course is required and is mathematically oriented, ratings are not typically as high as in other courses. In most other courses, students have some latitude in responding to a particular assignment. More than one response might be acceptable. In an analytically oriented course like this, there is much less latitude. An answer is usually either right or wrong. There are fewer shades of gray."

{3} That is, required courses tend to receive lower ratings from students than elective courses.