Alan Agresti (USA), firstname.lastname@example.org
At the University of Florida, I have
developed two courses in categorical data analysis. One is designed for
master's students in statistics; the other is designed both for undergraduate
statistics majors and for graduate students in other disciplines who
have had some exposure to basic statistics, including regression. Regardless
of the level, I unify methods taught in the course by showing how they occur
as special cases of generalized linear models for categorical responses.
For instance, each inferential method results from a choice of a distribution
for the response (binomial, Poisson, ...), a link function for the mean of
the response (logit, log, ...), and a way of using the likelihood function for
inference (Wald, score, likelihood-ratio). As much as possible, I use the same generalized
linear modeling software throughout the course (e.g., PROC GENMOD in SAS
or the glm function in S-plus). Over time I have placed more emphasis on
logistic regression and less on loglinear models. This reflects the fact that most
applications have a single response variable and may include quantitative as well as
qualitative predictors. Thus, the course is not simply one in "contingency table
analysis." As part of the course, I always require students to obtain
a data set (e.g., General Social Survey results from the Web) and write a
report showing a data analysis. Even when students do well on exams, it
is humbling to see the rather naive errors students make in the modeling
process. It seems worth putting less emphasis on exams and having students
do at least two projects, even if the second only entails improving analyses
in earlier projects based on feedback from the instructor.
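The unifying idea described above (a distribution for the response, a link function, and a way of using the likelihood for inference) can be illustrated with a small sketch. The following Python code is only an illustration under stated assumptions: the course itself uses PROC GENMOD or the S-plus glm function, the dose-response data are hypothetical, and the function names are invented for this example. It fits a binomial model with logit link by Newton-Raphson and computes the Wald, score, and likelihood-ratio statistics for testing that the slope is zero.

```python
import math

# Hypothetical grouped binomial data (not from the source): y successes
# out of n trials at each level x of a quantitative predictor.
x = [0, 1, 2, 3, 4]
n = [20, 20, 20, 20, 20]
y = [2, 5, 9, 14, 17]

def log_likelihood(a, b):
    """Binomial log-likelihood for logit(pi) = a + b*x, constants dropped."""
    return sum(yi * (a + b * xi) - ni * math.log1p(math.exp(a + b * xi))
               for xi, yi, ni in zip(x, y, n))

def score_and_information(a, b):
    """Return the score vector and the 2x2 Fisher information at (a, b)."""
    u_a = u_b = i_aa = i_ab = i_bb = 0.0
    for xi, yi, ni in zip(x, y, n):
        p = 1.0 / (1.0 + math.exp(-(a + b * xi)))
        w = ni * p * (1.0 - p)  # binomial variance of the count
        u_a += yi - ni * p
        u_b += xi * (yi - ni * p)
        i_aa += w
        i_ab += w * xi
        i_bb += w * xi * xi
    return (u_a, u_b), (i_aa, i_ab, i_bb)

# Null model (b = 0): the ML fit is the pooled sample proportion.
p0 = sum(y) / sum(n)
a0 = math.log(p0 / (1.0 - p0))

# Full model: Newton-Raphson, which coincides with Fisher scoring
# for the canonical (logit) link.
a, b = a0, 0.0
for _ in range(50):
    (u_a, u_b), (i_aa, i_ab, i_bb) = score_and_information(a, b)
    det = i_aa * i_bb - i_ab ** 2
    step_a = (i_bb * u_a - i_ab * u_b) / det
    step_b = (i_aa * u_b - i_ab * u_a) / det
    a, b = a + step_a, b + step_b
    if abs(step_a) + abs(step_b) < 1e-10:
        break

# Wald: squared ratio of the estimate to its standard error at the ML fit.
_, (i_aa, i_ab, i_bb) = score_and_information(a, b)
se_b = math.sqrt(i_aa / (i_aa * i_bb - i_ab ** 2))
wald = (b / se_b) ** 2

# Score: slope of the log-likelihood at the null fit, standardized by
# the null information.
(_, u_b0), (j_aa, j_ab, j_bb) = score_and_information(a0, 0.0)
score = u_b0 ** 2 / (j_bb - j_ab ** 2 / j_aa)

# Likelihood-ratio: twice the gain in maximized log-likelihood.
lr = 2.0 * (log_likelihood(a, b) - log_likelihood(a0, 0.0))

print(f"slope = {b:.3f} (SE {se_b:.3f})")
print(f"Wald = {wald:.2f}, score = {score:.2f}, LR = {lr:.2f}")
```

Each statistic is referred to a chi-squared distribution with one degree of freedom. Swapping the distribution or the link changes only the log-likelihood and the weights, which is the sense in which the three choices generate the different methods.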
Various challenges arise in teaching such a course. First, there is an increasing variety of methods for analyzing even the most basic categorical data (e.g., single proportions, 2-by-2 tables), and the approaches that are simplest to teach sometimes perform quite poorly (e.g., the Wald confidence interval for a proportion or a difference of proportions). Second, it is difficult to provide general guidelines about when one can use large-sample inference, yet teaching small-sample methods requires careful consideration of the complicating effects of possibly substantial discreteness. Third, in practice many problems have clustered data. Methods for clustered data, such as generalized estimating equations and random effects models, have been developed relatively recently and require enough sophistication that they are not easy to incorporate in a first course on this topic.
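The remark about the Wald interval for a proportion can be made concrete with a short calculation. The sketch below is an illustration only: the comparison interval (the Wilson interval, obtained by inverting the score test) is chosen here and is not named in the text, and the sample size and true proportion are arbitrary. It computes both intervals and their exact coverage probability by summing binomial probabilities over all possible outcomes.

```python
import math
from statistics import NormalDist

Z = NormalDist().inv_cdf(0.975)  # about 1.96 for a nominal 95% interval

def wald_interval(y, n, z=Z):
    """Wald interval: estimate plus or minus z estimated standard errors."""
    p = y / n
    half = z * math.sqrt(p * (1.0 - p) / n)
    return (p - half, p + half)

def wilson_interval(y, n, z=Z):
    """Wilson interval, obtained by inverting the score test."""
    p = y / n
    denom = 1.0 + z * z / n
    center = (p + z * z / (2.0 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n))
    return (center - half, center + half)

def coverage(interval, n, pi):
    """Exact probability that the interval contains pi when Y ~ Bin(n, pi)."""
    total = 0.0
    for y in range(n + 1):
        lo, hi = interval(y, n)
        if lo <= pi <= hi:
            total += math.comb(n, y) * pi ** y * (1.0 - pi) ** (n - y)
    return total

# With no successes, the Wald interval collapses to a single point.
print("Wald,   y=0, n=10:", wald_interval(0, 10))
print("Wilson, y=0, n=10:", wilson_interval(0, 10))

# Arbitrary illustrative setting: n = 20 trials, true proportion 0.1.
cov_wald = coverage(wald_interval, 20, 0.1)
cov_wilson = coverage(wilson_interval, 20, 0.1)
print(f"coverage at n=20, pi=0.1: Wald {cov_wald:.3f}, Wilson {cov_wilson:.3f}")
```

In this setting the nominal 95% Wald interval covers the true value with probability of roughly 0.88, while the Wilson interval covers it with probability of roughly 0.96. Discreteness makes the coverage jump as n and the true proportion vary, so a single setting shows the pattern but not its full extent.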
ICOTS-6, The Sixth International Conference on Teaching Statistics - International Program Committee (IPC) Website.
Copyright © 2001 by the IASE. All rights reserved. This information is subject to change without notice. This page was last modified on July 2, 2002.