       
2012 - Volume 5, Number 2
 
     
Using Evaluation Results for Improving Teaching Practice: A Research Case Study
 

Edith J. Cisneros-Cohernour and Robert E. Stake

 

A substantial body of research has been conducted on the use of student evaluations of college teaching. This research has mainly centered on the dimensionality, validity, reliability and generalizability of the ratings, and the study of possible biasing factors that could affect the validity of the ratings (Marsh, 1984; Braskamp & Ory, 1994; El-Hassan, 1995). Only a limited number of studies have been conducted on the longitudinal use of student ratings and their possible influence on instructional improvement.

This research case study is committed to that purpose. This is a case study of a senior professor in a public research university in the Midwest. The study traces how the instructor uses the feedback from peers, his department head and students. The research is an attempt to better understand the evaluation results as feedback for instructional improvement.

1. CONCEPTUAL FRAMEWORK

Other research has shown mixed evidence about the possible effects of feedback from student ratings for improving teaching. Cohen (1991) has supported the potential use of the ratings for instructional improvement, and other authors (Miller, 1987; Roush, 1983) found a high positive effect of the ratings on instructional improvement when accompanied by consultation.

Still other findings have shown only minor or limited impact of the ratings on instructional improvement when measured over time (Gray & Brandeburg, 1985). As Brinko (1996) mentions, while some studies have affirmed that the feedback from student ratings is valid and reliable, the research showed only a marginal effect of the ratings when used alone. More research seems to be needed on how faculty members use feedback from student ratings. In addition, future studies could look at the benefits and trade-offs of using the feedback provided by the ratings.

2. ISSUES
The main issues that guided the study were:

  1. Good teaching. In what ways does the professor’s perception of good teaching conflict or overlap with the department’s perceived definition of good teaching? What happens if the professor has a different teaching philosophy from that adopted in his Department?
  2. Sources of feedback. If the professor is using more than one source, such as external observers or peers, to receive feedback about his/her teaching, does the feedback from the external observer provide information similar to, or in conflict with, that provided by student ratings and other sources?
  3. Influence of the feedback. In what ways did feedback from the ratings or other sources influence instructor decisions regarding classroom activities?
  4. Homogeneity. Does the department’s emphasis on homogeneity instead of individuality raise issues regarding the academic freedom of faculty members?

3. METHODOLOGY

This study portrays how a male instructor, his peers, and the department head used feedback from student ratings{1}. A case study approach was employed to provide a holistic mode of inquiry that would be consonant with the exploration of issues addressed by the study. The method selected provided understanding of the process and events taking place and the complexities of the context under which the evaluation was conducted (Patton, 1980).

The case study was organized around main questions or issues (Stake et al, 1995). The criterion for selecting the professor and his course was not so much “What course and faculty member represent the totality of the instructors and disciplines taught in this university?” but more “What instructor would help us to understand the use of student feedback and its impact on teaching?” The professor was also selected because of his desire to continue improving his instruction.

The case was George Alderman (a pseudonym), a senior professor who had been receiving low to moderate ratings in recent years. He had used consultation services within the Division of Instructional Development on campus. He was teaching math-related courses required for engineering students. The students who attended his course were undergraduates in their junior or senior year.

Data were collected over a period of two semesters through classroom observations and semi-structured and open-ended interviews with the faculty member, with other instructors in his Department who taught the same course, and with the Department Head. Classroom observations provided an understanding of the teaching styles, classroom techniques and interactions between the instructor and the students. The professor was later asked how the feedback from the ratings had influenced activities taking place in the classroom. Interviews with the professor and the Department Head were centered on the four issues cited. Each of these interviews was conducted in the interviewee's private office and lasted from twenty minutes to an hour. Student ratings and evaluation forms and a variety of documentary data were available to the researcher. Because the records are considered private, access was gained after obtaining the consent of the parties involved. The data collection methods used, the length of time devoted to data collection on site, and the use of multiple sources of evidence contributed to a better understanding of the context under which teaching took place.

The first author gathered the data, using traditional ways of validating her observations. She took care to use more than one medium: observation, interviews, and document review. Data collection resulted in multiple statements, testimony, documents, and observations. Data analysis in this study, as in most qualitative studies, commenced as data collection continued. Throughout the study, data analysis helped to redefine the issues. Field note analysis was qualitative. To organize the data in a form that facilitated analysis and interpretation, the researcher used a thematic approach. She analyzed her field notes using the key questions. She showed her processed notes to the participants and confirmed the accuracy of the quotes and descriptions attributed to them.

4. FINDINGS

Next is a description of the case study findings. First, we present a brief description of the case, followed by the source of evaluation feedback received by the instructor. Then comes the analysis of the utility of these sources of feedback for instructional improvement.

4.1. The case and its teaching context

George Alderman was a senior professor with twenty years of experience. He obtained his Ph.D. in the early 1970s from a prestigious university in the eastern part of the U.S.A., and his bachelor’s degree in Mathematics from a prestigious institute of technology. Since before 1980, George had worked for his present university’s Practical College. He had usually taught statistics-related courses, particularly ones required for undergraduate students.

George’s university had been using a campus-wide cafeteria-style evaluation system since 1967. It used student ratings as one source for evaluating college instruction. The university required all assistant and associate professors to submit their evaluation rating results every semester and for every course (Ory & Weities, 1991). All evaluation survey forms included two “global items” and a set of specific items to evaluate instruction. The two global items were used for summative purposes and the set of specific items for instructional improvement. Many faculty members had the option to select their own specific rating items and/or to use departmentally selected core items. Instructors and departments selected their items from a catalog of more than 400 items (Ory & Weities, 1991).

During an interview with the Department Head, he talked about how much they valued teaching. He added that they reviewed their faculty ratings in comparison with those across campus. Student rating results were provided to administrators for making decisions about promotion, tenure and teaching awards, as well as for providing instructors with feedback about their teaching. It was expected that faculty members would use this information for improvement. If they required help, they could go to the Division of Instructional Development.

At the time of this study, the university was reviewing the evaluation system as one indicator of institutional assessment, seeking more information about the use of evaluation results and their impact on instruction at individual and departmental levels.

I met George through the Division of Instructional Development on campus. He had been in contact with this division for several semesters. Since he had received low ratings on his teaching, he went to this office to receive help.

4.2. George’s Class

George taught his class twice a week. One of the class sessions started at 1:00 pm and the other at 3:00 pm. His class took place in a classroom located in a major building of the campus. The room had good illumination and capacity for 45 students. The room was large but not deep. There was little space for students to interact with each other. Student seats were arranged in rows facing the instructor’s desk.

George’s style of teaching is briefly summarized below:

George arrives early in the classroom. He usually uses transparencies during his lectures and provides handouts for his students. While George is explaining, his students take notes. During the lecture George uses various examples to illustrate the main concepts. He checks for understanding and asks for answers to the problems he is presenting. George's pace is smooth. He talks slowly, allowing students to take notes. The pitch of his voice is uniform and, for some, monotonous, especially when the class takes place after lunch.
On one occasion I find two students resting their heads on their desks as if sleeping. They return to attention when George uses a joke to illustrate the concept that he is explaining. The class routine follows a similar pattern. First, George takes roll and returns assignments to his students. Then, he begins the lecture using the overhead transparencies. George asks students questions while he teaches. The students usually take notes. Sometimes, after George has finished explaining, the students work in teams solving a problem.

George traditionally received low to moderate ratings from the students. His course had a poor reputation among them.

In earlier years, George taught the same class each term to all who enrolled. The students gave him and the course low ratings. Then the Department Head divided the students among three instructors, each of whom then taught two classes, 45 students each, meeting twice a week. After one semester, one of the instructors obtained high ratings for his teaching. George and the other man got low ratings. George sought help from the Division of Instructional Development. His two peers chose to work together, apart from George.

A semester later, only George was rated low. His ratings actually went down, partly because he called for student teamwork, with students grading each other’s participation. One of George's students formally charged him with "capricious grading." The Department Head sent George a letter asking him to find a way to improve.{2}

4.3. Feedback sources

Next is a description of the main sources of feedback that George received for improving his teaching: student ratings, the Department Head, peers and the instructional specialist.

4.3.1. Students

Analysis of the student ratings on evaluation forms of George’s class during the two years studied indicated that he received low to moderate ratings in relation to the campus norm. Although invited to, students seldom wrote comments on the back of the forms, comments that George might use to learn more about his strengths and weaknesses as an instructor.

During the semester that George began to use more teamwork, he asked his students to rate each other’s participation in the teamwork activities. That semester he received the lowest ratings ever. On that occasion, some students wrote on the evaluation form that they disliked the teamwork activities. The following semester, George’s ratings began to improve.

To obtain more information for improving his teaching, George then asked his students to write about his teaching strengths and weaknesses, as well as about suggestions for improving. It was noted that George was teaching a required course for freshman and sophomore students. The course was part of a general education curriculum. Some students did not have the required background for his course. In addition, there were students who indicated they had had negative perceptions about the course before enrolling. They indicated that they registered for the course because it was a requirement for graduation. It is important to note that the literature on student ratings indicates that students tend to rate required courses lower than electives (Braskamp & Ory, 1994; Stake & Cisneros-Cohernour, 2004).

During a focus group with some of George’s students, those majoring in liberal arts (e.g., History and Marketing) expressed a negative perception of the course because they lacked the course prerequisites. During that semester, George asked students to provide him with feedback anonymously. Student comments about the strengths and weaknesses of George’s teaching included:

  1. Strengths:
    • He uses relevant examples in the class
    • Good use of questioning
    • Provided clear "hints" on how to do the homework
    • Was available
    • Helpfulness
    • Excellent handouts
    • Well prepared for the class
    • Very well organized
    • Fairly thorough in describing each problem
    • His exams are fair, yet challenging
    • Adequate coverage of content in the exams
    • Excellent pace
    • Good at explaining
    • The text is written well and easily followed
    • Solid knowledge and command over the material and it shows in his teaching
  2. Weaknesses:
    • Sometimes he covers too much detail
    • Need to practice more of the problems that will be in the exam
    • Some problems need more explanation and more detail
    • Keep repeating examples "until we get it"
    • He teaches at a slow pace
    • He needs to teach slow, so all the students can understand
    • The text is not always easy to follow
    • He is good at explaining
  3. Suggestions:
    • Update his homepage
    • "Teach to the text"
    • "Teach for the exam"
    • Put slides on the web before the sessions
    • Cover more examples than the ones included in the book
    • Do not cover anything not in the textbook
    • Be sure you do not teach over "the head of the students"
    • Teach more content in a detailed form.

Some of these comments were about teaching style, the way in which George communicated with students, and about some practical suggestions for teaching. Most suggestions reflected a teacher-centered approach. George thought that his students’ comments were not entirely consistent, so he selected what he considered he could change to improve his teaching. Before making any changes, he consulted with Alice, the instructional specialist.

During classroom observations, students were using George’s class materials, taking notes and applying the concepts while solving math problems.

4.3.2. George’s department head

In interviews, the Head spoke of a need for improvement of teaching.  She acknowledged she was not providing faculty members with feedback on instruction nor creating mechanisms to support teaching:

“Our department reviews the ratings. We pay attention when a course receives a low rating over time. We encourage faculty to improve their ratings or look for the help they need in order to improve. Our faculty members usually improve their ratings over time.”

4.3.3. Peers

Two other instructors were teaching the same course as George at the time. One of them was Frank Edwards, a faculty member in his second year of tenure. The other instructor was Will Wilson, a tenured professor who, like George, had twenty years of experience. During interviews, Wilson indicated that he had worked with George in the past and attempted to help him improve his teaching. He said:

George is an honest professor. He is the kind of instructor that says things directly. He doesn’t know how to lie; if a student is not doing well in his course George doesn’t tell a white lie. He honestly tells his students how well they are doing in the course. Some students do not value this kind of behavior on the side of the professors. I think this negatively influences his student ratings.

Will Wilson added that it was very important for instructors to be careful in the way they communicate with their students:

I have found that the way I present information to my students can influence my ratings. It is better for me to tell them that I am using extra credit quizzes than telling them that we will have a weekly exam in the class. Some professors can use pop quizzes and tell the students that they are extra-credit tests but this is not always possible for the entire faculty. Being careful in how you communicate to them and selecting words that make tests look like rewards have nothing to do with learning or teaching but they have a secondary effect: "A happy student is a good student". This is important when he rates your teaching.

He added that after observing George’s class, he did not find a big difference between George’s teaching and his own.

George’s other peer, Frank Edwards, said that he had worked with Wilson in the planning of the course but had not worked with George. He knew George liked to work independently.

4.3.4. The instructional specialist

For two years, George received feedback from Alice, a consultant from the campus Division of Instructional Development. He had voluntarily requested help for improving his teaching. Alice commented that George often received compliments from his students for being a caring, knowledgeable, fair and accessible instructor. She added that her work at that moment was to help George find ways to improve his teaching as well as his ratings:

We know students like teamwork, so he continues adding some teamwork activities to his class. We also suggested that he keep the lights on during his class because students come to class after lunch. George’s voice is so smooth that it can sound monotonous, especially at a one o’clock lecture. We advised him to use pauses to help him with this problem. George and I observed one of his peers teaching the same class and we advised George to ask more questions to involve the students more in the lecture. Finally, we suggested that he use early feedback in the semester to learn how students perceive his teaching. George is improving his communication skills and this is being reflected in his ratings.

The next semester George followed Alice’s advice to include more two-person teamwork activities during his classes. That was when he encouraged students to rate each other’s participation in class. Several students were upset because most Department faculty members granted credit for class participation based on class attendance. As indicated earlier, one student formally accused George of capricious grading. At the end of that semester George’s ratings were very low. His Department Head sent a letter requesting him to improve his ratings.

The situation did not discourage George.  He made more changes. He continued using teamwork activities, giving handouts to his students, collecting early feedback on his teaching, and increasing the use of questioning in his class.

Alice said that in her opinion George was a person who liked to make small changes when these did not conflict with his teaching philosophy. She also mentioned that George had changed gradually over two years. She added that change in teaching could be expected to be a gradual process and that one semester was a very short period of time to see any real change.

4.4. Usefulness of the feedback

During interviews with George and his peers, it was clear that in that Department student ratings were the official source of feedback. The Department Head had a positive opinion of this source of information:

We would like to use various sources such as alumni or peers to evaluate instruction in addition to current students but this is not possible. We use students because they are accessible. Using students is cost-effective. We also believe that they have the capacity to judge instructional quality. A good teacher has a good relationship with his students and satisfies their learning needs. A good teacher receives good ratings.

Unlike the Department Head, George’s colleague Frank Edwards said:

I do not consider student ratings to be a good measure of teaching effectiveness.  I advocate the use of classroom visitations by peer faculty.  More senior faculty could work with junior faculty.  Not only would this increase the rate at which junior professors improve their teaching skills, but we would also have more reliable feedback regarding teaching effectiveness.

Will Wilson, George’s other co-teacher, added:

On at least four occasions I have asked a colleague to sit in my class and provide me with feedback.  Sometimes the feedback from a peer is productive; sometimes it is not.  My experience is that the feedback from a peer is more useful when it is provided in private (if your department does not know the information).

Frank Edwards went further:

In this department, a formal mentoring program does not exist.  It is quite common, however, for informal mentoring, where one faculty member helps another, to take place. My department is trying to do something like this.  However, classroom observations are quite time consuming.  Since teaching is still not rewarded to the extent that research is, many faculty members feel that it is not in the best interest to do this. 

And George said:

The feedback from a colleague may work if several professors teach the same subject, they observe each other’s classes, and then they have an informal conversation. But this needs to be provided as consultation, not with professors writing reports for the administration.

The danger I see of having several professors providing feedback is that it may be that different philosophies of teaching and personality conflicts could influence the feedback provided by peers. For example, if two professors have different teaching styles or personalities they may also have different ideas about teaching and preferred teaching methods. It could also be a potential problem if the professors do not agree in their views of teaching and the students’ role. For example, a professor may prefer ‘open-book exams’ whereas the others do not support this type of examination form.

George added that he preferred feedback from an external observer.

Personally, I prefer to receive advice from people outside the department. Students usually provide little feedback that can be used for instructional improvement. When they make a suggestion about the class, I look at their comments and try to respond to them when they are appropriate for improving the class. However, most of the time they don’t make any comments. I consider the instructional specialists the only source of meaningful feedback to improve my teaching.

5. CONCLUSIONS

The case study focused on a professor trying to improve his teaching. He worked with feedback from his students, peers and an instructional specialist. His Department Head used only student ratings as an official source of feedback. Concentrating on this single source made it difficult for him to improve the quality of his teaching.

These student evaluation findings did not take into consideration the meaning of the scores in context, nor that student evaluations can be negatively influenced by course characteristics{3}, the lack of student interest and prerequisites, the way the student questionnaire is administered, the instructor, the students, instruments, as well as the workload and the level of depth of the course coverage (Brodie, 1998; Ryan et al., 1980).

Student ratings were especially problematic because they were used for multiple purposes. The literature on faculty evaluation clearly holds that evaluation purposes should not mix. Haskell (1997), for example, states that the use of student ratings for both formative and summative purposes can lead to misuse. It can result in violating the instructor’s academic freedom and diminishing the quality of teaching.

In George’s case, two main trade-offs resulted from this decision. The first one was that the emphasis on student ratings might discourage faculty members from innovative teaching. George experienced this problem when he began to introduce changes in his teaching. His low ratings on teaching were used to deny a salary increase the year that he obtained his lowest ratings.

George said that this problem could be solved if the department head were willing to allow the professor to experiment with a new practice without reviewing the ratings at least for a year. This suggestion, however, required the willingness of the department administration. If department leaders are not aware of the complexities of changes in teaching, they are unlikely to maximize improvements for their weaker teachers.

The second trade-off was the pressure for homogeneity, as the department equated student ratings with effective teaching. George perceived that equating good teaching with a single numerical score could result in limiting a professor’s academic freedom. Since he had tenure he had certain freedom to make changes in his teaching. Unfortunately, other faculty members may not have the same freedom as George in resisting the pressures for standardization. Untenured professors who work in departments that follow the same policies as George's department may feel their academic freedom restricted.

Above all, using only student ratings can make instructors focus too much on improving scores rather than reflecting upon and improving the teaching itself. Pleasing students is not the only way of teaching effectively.

The professors who participated in the study were concerned about the meaning that students gave to the construct of good teaching. As we looked at the informal feedback that George received from his students, we observed that they sometimes had contradictory views. Some wanted more redundancy, some less. Some wanted topics restricted to the textbook, some wanted the instructor’s personal experience. They are experts in what they want but not in what others should have. Their suggestions focused on teaching style, favoring a teacher-centered type of instruction, seldom indicating how they themselves could work harder.

Will Wilson, George’s associate, said:

My student ratings change depending on the group. A group may give me 4 or 4.5 points or even 5 points on the scale from 1 to 5. I occasionally change what I teach. One problem with the feedback from the ratings is that I do not know what I did wrong. Even if I teach the same course again there is no guarantee that I would obtain the same ratings even teaching the class the same way.

Wilson identified the causality issue related to the weight given to the ratings by the department:

The emphasis on the ratings as synonymous with instructional effectiveness could lead faculty to find only ways to improve their ratings. In the last new faculty orientation that I attended, we were advised not to give the questionnaires at the same time when we give a test to the students because this could influence the ratings in a negative way. This seems to be evidence of how faculties learn to manipulate the ratings.

Wilson added that students prefer some courses over others, influencing the ratings. So, when he has obtained low ratings in a course, he said he moves to another course, one that he knows is liked by his students:

I have a repertoire of courses that I teach. If my ratings do not improve in a course, I move to another course that I know I can teach better. I have the freedom to do this (change to another course if my ratings are low), but not everybody can do this.

The professors restated the advantages and disadvantages of feedback from peers, and added that different professors may prefer different types of feedback for improving their teaching. So, what works for some may not work for all.

Finally, the case study showed that in spite of more than sixty years of research, the evaluation of teaching continues largely to be a responsibility of the individual instructor. Few faculty members enjoy a community of practice, that is, a group of professionals, informally united, exposed to common problems, searching for common solutions and serving as a source of knowledge of teaching to share with each other (Johnson-Lenz & Johnson-Lenz, 1998; Stake & Cisneros-Cohernour, 2004).

 

REFERENCES

Braskamp, L. A. and Ory, J. C. (1994). Assessing faculty work: Enhancing individual and institutional performance. San Francisco, CA: Jossey-Bass.

Brodie, D. A. (1998). Do students report that easy professors are excellent teachers? The Canadian Journal of Higher Education, 23 (1), pp. 1-20.

Brinko, K. (1996). The practice of giving feedback to improve teaching: What is effective? Journal of Higher Education, 74 (5), pp. 574-593.

Centra, J. A. (1993). Reflective faculty evaluation: Enhancing teaching and determining faculty effectiveness. San Francisco, CA: Jossey-Bass.

Cohen, P. A. (1991). Effectiveness of student-rating feedback and consultation for improving instruction in dental schools. Journal of Dental Education, 55 (2), pp. 45-50.

El-Hassan, K. (1995). Student’s ratings of instruction: Generalizability of findings. Studies on Educational Evaluation, 21 (4), pp. 411-429.

Gray, D. M. and Brandeburg, D. C. (1985). Following student ratings over time with a catalog based system. Research in Higher Education, 22, pp. 155-168.

Guba E. G., and Lincoln, Y. S. (1985). Effective evaluation: Improving the usefulness of evaluation results through responsive and naturalistic approaches. San Francisco: Jossey-Bass.

Johnson-Lenz, P. and Johnson-Lenz, J. (1997). Bonding by exposure to common problems. In What is a Community of Practice? Community Intelligence Labs. Available online: http://www.co-i-l.com/coil/knowledge-garden/cop/definitions.shtml.

Marsh, H. W. (1984). Student evaluations of university teaching: Dimensionality, reliability, potential biases, and utility. The Journal of Educational Psychology, 76, pp. 707-754.

Miller, R. I. (1987). Evaluating faculty for promotion and tenure. San Francisco, CA: Jossey-Bass.

Ory, J. C. and Weities, R. (1991). A longitudinal study of faculty selection of Student Evaluation Items. Paper presented at the annual meeting of the American Educational Research Association, Chicago, April.

Ory, J. and Ryan, K. (2001). How do student ratings measure up to a new validity framework? New Directions in Institutional Research, 109. San Francisco, CA: Jossey-Bass.

Patton, M. Q. (1980). Qualitative Evaluation Methods. Beverly Hills, CA: Sage Publications.

Roush, D. C. (1983). Strategies for effective university teaching. Materials for teaching methodology workshops of the Fulbright Exchange Program. Latin American Scholarship Program for American Universities (LASPAU). Harvard, MA.

Ryan, J. J., Anderson, J. A. and Birchler, A. B. (1980). Student evaluations: The faculty responds. Research in Higher Education, 12 (December), pp. 317-33.

Stake, R. (1995). The Art of case study research. London: Sage publications.

Stake, R. E. and Cisneros-Cohernour, E. J. (2004). The Quality of Teaching in Higher Education. The Quality of Higher Education (Lithuania), (1), pp. 94-107. Available at: http://skc.vdu.lt/downloads/zurnalo_arch/amk_1/094_117stake.pdf.

Whittman, N. and Weiss, E. (1982). Faculty evaluation: The use of explicit criteria for promotion, retention and tenure. (Higher education report No. 2). Washington, DC: AAHE-ERIC.

 

{1} In order to ensure confidentiality, pseudonyms have been used to replace the names of the departments, faculty and department head involved in this study. Some events have also been presented in a way to diminish recognition. The study includes only information seen as likely to contribute to understanding the case.

{2} Frank Edwards, one of George’s peers, said “In general, this course, having now been taught by a number of different faculty members, does not receive good ratings.  Since the course is required and is mathematically oriented, ratings are not typically as high as in other courses.   In most other courses, students have some latitude in responding to a particular assignment. More than one response might be acceptable.  In an analytically oriented course like this, there is much less latitude.  An answer is usually either right or wrong.  There are fewer shades of gray."

{3} That is, required courses tend to receive lower scores from students than elective courses.

 
