Using Confidence Level Tables in Interpreting ClassEval Results

The following analysis and recommendations for using ClassEval results were compiled in 2010.  We provide them as a guide for current and future use because the findings still apply to the interpretation of current results.

How high a response rate is necessary to provide valid ratings?  It depends on class size and the variance among students’ responses.  It also depends on how much confidence an instructor and department head want to have that a particular rating represents the whole class.  The following information may be useful in determining whether differences between instructors and course sections are large enough to be meaningful.

Standard Reports

To aid the interpretation of results, a confidential standard report is posted that includes for each question the response rate, mean rating, the standard deviation around the mean rating, and the standard error of the mean.  The department mean and standard deviation are provided as benchmarks.
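The statistics listed above can be sketched in a few lines of Python.  This is only an illustration, not the code behind the standard report: the function name `summarize_ratings`, the sample ratings, and the enrollment figure are hypothetical, and it assumes the sample standard deviation (dividing by n - 1) and the simple SEM formula SD / sqrt(n), with no correction for class size.

```python
import math
import statistics


def summarize_ratings(ratings, enrolled):
    """Summarize one question's ratings (hypothetical helper, not the ClassEval report code).

    ratings  -- list of integer ratings (1 to 5) from the students who responded
    enrolled -- number of students enrolled in the section
    """
    n = len(ratings)
    mean = statistics.mean(ratings)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    sd = statistics.stdev(ratings) if n > 1 else 0.0
    # Assumption: plain SEM, ignoring any finite-population correction.
    sem = sd / math.sqrt(n)
    return {
        "response_rate": n / enrolled,
        "mean": mean,
        "sd": sd,
        "sem": sem,
    }


# Made-up example: 8 of 20 enrolled students responded.
summary = summarize_ratings([5, 4, 4, 5, 3, 5, 4, 5], enrolled=20)
```

With these made-up ratings the response rate is 40% and the mean is 4.375.  A smaller SEM means the respondents' mean is a steadier estimate, but it says nothing about whether the students who did not respond would have rated the course differently.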

Mean Scores

Mean scores associated with low response rates should be interpreted with great caution.  The lower the response rate, the more likely it is that the mean score may be biased by responses of students with atypical opinions.

Consult the tables linked below to find the response rate necessary to reach your desired confidence level, given your class size.  These tables will provide information like this:  “For your class of X students, of whom Y responded, you can have over W% confidence that the true class mean is within +/- Z points of the respondents’ mean.”

Confidence Level Tables for Various Class Sizes

Select the table based on the number of enrolled students (not the number of responses).
10-14 PDF Document
15-19 PDF Document
20-29 PDF Document
30-39 PDF Document
40-49 PDF Document
50-100 PDF Document

These tables were created with the Monte Carlo method, using historical ClassEval data from course sections with response rates approaching 100%.  The tables will be updated annually as more data become available, so the estimates should become more reliable over time.
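To show the general idea of a Monte Carlo estimate of this kind, here is a minimal sketch.  It is not the procedure actually used to build the tables, whose details are not given here.  The function `confidence_within`, the class data, and the parameter values are all hypothetical, and the sketch assumes respondents are a simple random subset of the class, which real nonresponse may violate.

```python
import random
import statistics


def confidence_within(full_ratings, n_respondents, margin, trials=10_000, seed=0):
    """Estimate how often a random subset's mean lands within `margin` of the class mean.

    full_ratings  -- ratings from a section where (nearly) everyone responded
    n_respondents -- size of the simulated respondent group
    margin        -- allowed distance, in rating points, from the full-class mean
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    class_mean = statistics.mean(full_ratings)
    hits = 0
    for _ in range(trials):
        # Assumption: respondents are drawn at random, without replacement.
        sample = rng.sample(full_ratings, n_respondents)
        if abs(statistics.mean(sample) - class_mean) <= margin:
            hits += 1
    return hits / trials


# Hypothetical section of 20 students with complete responses.
full_class = [5] * 10 + [4] * 6 + [3] * 2 + [2] * 1 + [1] * 1
conf = confidence_within(full_class, n_respondents=12, margin=0.25)
```

The returned fraction plays the role of W in the statement "you can have over W% confidence that the true class mean is within +/- Z points of the respondents' mean," with `margin` standing in for Z.  Repeating the calculation over many historical sections and respondent counts would yield a table for a given class size.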

While ClassEval ratings are useful in evaluating teaching effectiveness, they should not be used as a precise evaluative tool.  Users are encouraged to look for general trends and outliers and to avoid giving undue emphasis to small differences.

Variance:  Standard Deviation (SD) and Standard Error of the Mean (SEM) 

Measures of variance are of limited use with ClassEval data because the mean, standard deviation, and response rate are not independent of each other.
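One reason the mean and standard deviation are linked, offered here as an illustration rather than as the original authors' argument, is that ratings are bounded.  For any set of values on a scale from `low` to `high` with a given mean, the population variance cannot exceed (mean - low) * (high - mean), a general result known as the Bhatia-Davis inequality.  The helper name `max_sd_for_mean` below is hypothetical.

```python
import math


def max_sd_for_mean(mean, low=1, high=5):
    """Upper bound on the population SD of ratings in [low, high] with the given mean.

    Uses the Bhatia-Davis inequality: variance <= (mean - low) * (high - mean).
    The bound is reached only when every rating sits at one of the two extremes.
    """
    if not low <= mean <= high:
        raise ValueError("mean must lie within the rating scale")
    return math.sqrt((mean - low) * (high - mean))


# Illustrative values on the 1-to-5 scale.
bound_mid = max_sd_for_mean(3.0)   # 2.0: the widest spread is possible mid-scale
bound_high = max_sd_for_mean(4.5)  # about 1.32: a high mean forces a smaller SD
bound_top = max_sd_for_mean(5.0)   # 0.0: a mean of 5 means every rating is 5
```

Because most ratings are four or five, means cluster near the top of the scale, where the possible spread is compressed.  A small standard deviation in that range is therefore partly a consequence of the scale itself.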

ClassEval scores are limited to integer values from one to five, and the vast majority of ClassEval ratings are four or five.  A high response rate combined with a large standard deviation (i.e., high variability in ratings) may indicate that enough students are dissatisfied to warrant attention.

For more information contact ClassEval support at