SOSC 509
STATISTICS
FOR SOCIAL SCIENCE
FALL 2006
TUESDAY
Room 3311,
INSTRUCTOR: Dr. WU Xiaogang (sowu@ust.hk)
OFFICE:
OFFICE HOURS: Monday
TEACHING ASSISTANT: Mr. Liang Yucheng (liangyc@ust.hk)
OFFICE: Room 3003,
TUTORIAL SESSION: TBA
OVERVIEW:
This course provides an
introduction to quantitative methods in social research and to how they are used
to assemble, describe, and draw inferences from bodies of numerical data. The course
is organized into two modules. The first is primarily devoted to descriptive
statistics and the fundamentals of statistical inference. Topics include frequency
distributions, probability theory, random variables and probability
distributions, estimation, hypothesis testing, the t-test, and contingency table analysis. The second is devoted
exclusively to linear regression techniques, which are widely used in social
science research. Topics such as Analysis of Variance (ANOVA) and Analysis of
Covariance (ANCOVA) are covered under the framework of linear regression.
Models for categorical variables are reserved for a more advanced course. The
course materials are explored through analyses of real data sets using
STATA.
OBJECTIVES AND REQUIREMENTS:
The course aims to develop your
skills as both a “consumer” and a “producer” of social research. As a
“consumer,” you will become a more informed and critical reader of academic
work, news accounts and advertising materials that present statistical
evidence. As a “producer,” you are expected to conduct elementary statistical
analysis and make sense of the quantitative results.
You will be assessed through 8
assignments (40% of final grade, 5% for each), a midterm exam (25% of final grade),
and a final term paper (35% of final grade).
● Assignments
are designed to help you understand the materials
presented in lectures. They are distributed on a weekly or bi-weekly basis in
lecture (Tuesday) and are usually due the following week. Most assignments will
involve using a computer to analyze several small data sets and subsets of
the data from nationally representative surveys. Starting from the second week,
there is a one-and-a-half-hour lab session every week to help you learn how to use
STATA.
● Midterm
is an IN-CLASS and CLOSED-BOOK exam covering the
material in the first seven weeks.
● Term paper is designed to help you gain some hands-on research
experience. After the midterm, you will be asked to pick a topic that
interests you, identify a data source, conduct statistical analysis with techniques
you have learned or are learning in the course, and draw conclusions. Details
will be announced later.
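As a rough illustration of how the weights stated above combine, the following minimal sketch computes a weighted course total. It is written in Python purely for illustration; the 0-100 scoring scale and the function name `final_grade` are assumptions, not part of the course's actual grading procedure.

```python
# Weights are taken from the syllabus; the 0-100 scale is an assumption.
ASSIGNMENT_WEIGHT = 0.05   # each of the 8 assignments
MIDTERM_WEIGHT = 0.25
TERM_PAPER_WEIGHT = 0.35
NUM_ASSIGNMENTS = 8


def final_grade(assignments, midterm, term_paper):
    """Return the weighted course total on the same scale as the inputs."""
    if len(assignments) != NUM_ASSIGNMENTS:
        raise ValueError("expected exactly 8 assignment scores")
    assignment_part = sum(ASSIGNMENT_WEIGHT * score for score in assignments)
    return (assignment_part
            + MIDTERM_WEIGHT * midterm
            + TERM_PAPER_WEIGHT * term_paper)
```

For example, a hypothetical student scoring 80 on every assignment, 70 on the midterm, and 90 on the term paper would total 32 + 17.5 + 31.5 = 81.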
PREREQUISITES:
Although there is no prerequisite
for the course, an undergraduate-level statistics course is strongly recommended. I
assume that you are familiar with the material in Chapters 1-3 of the Agresti and Finlay book (see
“Textbooks” below), and have some knowledge, though perhaps an imperfect understanding, of the
material in Chapters 4-6. In the first
section, we will review this material, but at a faster pace and/or in
more depth than is typical of an introductory statistics course for
undergraduate students. If you think you need a thorough review of basic
statistics, a good text I can recommend is David S. Moore (2002), The Basic Practice of Statistics.
High school algebra, either
remembered or re-learned, will also be necessary to get through the course,
though formula derivations are presented only when necessary.
I have taught this course for the
past three years, with MA, M.Phil., and Ph.D. students mixed
in one class. Starting this year, the Social Science Division offers a
separate methods and statistics course for self-financed MA students. As a
result, SOSC 509 is mainly for research students, and evaluation standards have been
raised. MA students are encouraged to take SSMA501 “Principle of Social
Science,” unless they have a strong interest in quantitative social research and
are prepared for the intellectual challenges of the course.
TEXTBOOKS:
● Required
Alan Agresti and Barbara Finlay, 1997. Statistical Methods for the Social Sciences
(3rd edition).
● Recommended
The required book is available for purchase at the HKUST
bookshop. Your IA will hold the recommended book, and you are welcome to borrow
it.
COMPUTING
You will be doing a substantial
amount of data analysis with a software package called STATA 9.2. Among the many
popular statistical packages, STATA is fast and efficient and
includes most of the procedures of interest to social scientists (http://www.stata.com).
You can either buy the software
package under the “STATA Grad Plan” program or gain access via the social science
computing lab. We will rely on course handouts and examples, online help, and
instruction in lab sessions to teach you how to use STATA step by step, enough
to put you in a position to do the exercises and conduct basic statistical analysis.
Due to the time limit, we will not be able to cover every aspect of the
software. Should you have further interest, you can learn by yourself with the
help of
Although some of you may prefer to
use SPSS, I strongly recommend STATA (for reasons, see http://www.stata.com/info/whystata/).
Homework using
You can
find the syllabus, homework assignments, data sets, and links to other learning
resources on the course website (http://lmes2.ust.hk).
You
may use your ITSC username and password to log in. The lecture notes will be distributed in class. All
copyrights, however, are reserved.
Free learning resources for STATA
can be found at http://www.ats.ucla.edu/stat/stata/webbooks/reg/default.htm
and http://www.biostat.au.dk/teaching/software/
In particular, all examples in Agresti & Finlay have been
programmed in STATA, so you can replicate the results with the package; see http://www.ats.ucla.edu/stat/examples/smss/
DATASETS: What we deal with in statistics are
not pure numbers but numbers with social meanings and contexts. Over the course I
will rely less on the data used in the textbook (which are either hypothetical
or too “American”) and more on real data from contexts you are more
familiar with (
1. The Hong Kong By-Census Micro-Data (1996). The data include 309,879 individuals nested in
93,051 households, which may be further modified for you to analyze. A subset of
the data is generated for your assignments and final paper.
2. The Chinese Life
History and Social Change Survey (1996), a multistage stratified national probability
sample of 6,090 adults aged from 20 to 69 from all regions of
3. Other data collected in
INSTRUCTION FORMAT:
The course consists of a three-hour
lecture and a one-and-a-half-hour lab session each week.
● Lectures are designed to elaborate on the
conceptual issues covered in your readings. Usually I spend 2 hours going over
the concepts and theoretical ideas, and the remaining hour illustrating
them with examples using STATA. I will try to make the STATA programs available to
you (without detailed annotations).
● Lab sessions, run by the IA, are mainly designed to give you hands-on
experience using STATA for homework problems. Time may be spent on: (a)
feedback on prior problems; (b) instruction in computing for upcoming
assignments; (c) worked examples relevant to lecture materials and upcoming
assignments. The first lab session will be held next week to introduce you to the STATA
software.
● Office hours. Please make full use of our office hours
for discussions concerning homework assignments, exams, or other matters about
the course. Additional appointments
are possible. I will be in my office most of the day, so please feel free to
stop by. Email communication and
are also good
ways for exchange.
TOPIC OUTLINE
Week 1. Introduction (5 September)
Topics:
Social science research; variables and measurement; sampling
methods
Agresti and Finlay, Chapters 1-2
Yu Xie, 2005. “Methodological Contradictions of Contemporary Sociology”
Week 2. Descriptive Statistics (12 September)
Topics:
Tabular and graphical displays of data; measures of central
tendency and dispersion; sample statistics and population parameters
Agresti and Finlay, Chapter 3
Assignment 1 due on 12 September.
Weeks 3 & 4. Probability Distributions (19 & 26 September)
Topics:
Discrete and continuous random variables; normal
distribution; sampling distribution
Agresti and Finlay, Chapter 4
Assignment 2 due on 19 September.
Week 5. Statistical Inference I: Basics (3 October)
Topics:
Central limit theorem; point estimation; confidence interval;
choice of sample size
Agresti and Finlay, Chapter 5
Assignment 3 due on 3 October.
Week 6. Statistical Inference II: Hypothesis Testing (10 October)
Topics:
Hypothesis testing; t distribution; Type I and Type II errors
Agresti and Finlay, Chapter 6
Assignment 4 due on 10 October.
Week 7. Statistical Inference
Topics:
t-test; one-way ANOVA; cross-tabulation; chi-square test of independence
Agresti and Finlay, Chapters 7:1-3, 8:1-3
Assignment 5 due on 17 October.
Week 8. Midterm (24 October)
Week 9. Simple Linear Regression (31 October)
Topics:
Simple linear regression; OLS estimates; correlation coefficient; goodness of fit
Agresti and Finlay, Chapter 9
Week 10. Special Topics: How to Write a Research Paper (7 November)
Week 11. Multiple Regression I (14 November)
Topics:
Causality; statistical control; multiple regression model estimation
Agresti and Finlay, Chapters 10; 11:1-2
Assignment 6 due on 7 November.
Week 12. Multiple Regression II (21 November)
Topics:
Inference for regression; reporting regression results;
variable transformation
Agresti and Finlay, Chapter 11.4
Week 13. Multiple Regression
Topics:
Interaction effects with dummy variables; analysis of variance and covariance
Agresti and Finlay, Chapter 12
Assignment 7 due on 21 November.
Week 14. Multiple Regression IV (5 December)
Topics:
Assumptions of linear models; non-linearity; multi-collinearity;
regression diagnostics
Agresti and Finlay, Chapters 13-14
Assignment 8 due on 5 December.
The term paper is due no later than
MERRY CHRISTMAS AND HAPPY NEW YEAR!