SOSC 509

STATISTICS FOR SOCIAL SCIENCE

FALL 2006

            TUESDAY 6:30-9:20 PM

         Room 3311, Academic Building

 

INSTRUCTOR:    Dr. WU Xiaogang (sowu@ust.hk)

OFFICE:                ROOM 3377, Academic Building  (Phone 23587827)

OFFICE HOURS: Monday 4:00 - 6:00 p.m. or by appointment.

 

TEACHING ASSISTANT: Mr. Liang Yucheng (liangyc@ust.hk)

OFFICE:                               Room 3003, Academic Building (Phone 23584488)  

TUTORIAL SESSION:      TBA

 

OVERVIEW:

This course provides an introduction to quantitative methods in social research and to how they are used to assemble, describe, and draw inferences from bodies of numerical data. The course is organized into two modules. The first is primarily devoted to descriptive statistics and the fundamentals of statistical inference. Topics include frequency distributions, probability theory, random variables and probability distributions, estimation, hypothesis testing, the t-test, and contingency table analysis. The second is devoted exclusively to linear regression techniques, which are widely used in social science research. Topics such as Analysis of Variance (ANOVA) and Analysis of Covariance (ANCOVA) are covered under the framework of linear regression. Models for categorical variables are reserved for a more advanced course. The course materials are explored through the analysis of real data sets using STATA.

 

OBJECTIVES AND REQUIREMENTS:

The course aims to develop your skills as both a “consumer” and a “producer” of social research. As a “consumer,” you will become a more informed and critical reader of academic work, news accounts and advertising materials that present statistical evidence. As a “producer,” you are expected to conduct elementary statistical analysis and make sense of the quantitative results.    

 

You will be assessed through 8 assignments (40% of final grade, 5% for each), a midterm exam (25% of final grade), and a final term paper (35% of final grade).
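
As a rough illustration of how these weights combine, the short Python sketch below computes a weighted course grade. The function name final_grade and the assumption that every component is scored on a 0-100 scale are hypothetical and are not part of the official grading procedure; only the weights (8 assignments at 5% each, midterm 25%, term paper 35%) come from this syllabus.

```python
# Illustrative sketch only: weights come from the syllabus; the 0-100 scale
# for each component and the function name are assumptions.
ASSIGNMENT_WEIGHT = 0.05   # each of the 8 assignments
N_ASSIGNMENTS = 8
MIDTERM_WEIGHT = 0.25
PAPER_WEIGHT = 0.35


def final_grade(assignments, midterm, paper):
    """Return the weighted course grade on a 0-100 scale."""
    if len(assignments) != N_ASSIGNMENTS:
        raise ValueError("expected exactly 8 assignment scores")
    return (ASSIGNMENT_WEIGHT * sum(assignments)
            + MIDTERM_WEIGHT * midterm
            + PAPER_WEIGHT * paper)


if __name__ == "__main__":
    # Hypothetical scores: 80 on every assignment, 70 on the midterm, 90 on the paper.
    print(round(final_grade([80] * 8, 70, 90), 2))
```

The weights sum to 100% (8 x 5% + 25% + 35%), so a student scoring 100 on every component would receive 100 overall.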

 

Assignments are designed to help you understand the materials presented in lectures. They are distributed on a weekly or bi-weekly basis in lecture (Tuesday) and are usually due the following week. Most assignments will involve using a computer to analyze several small data sets and subsets of data from nationally representative surveys. Starting from the second week, there is a one-and-a-half-hour lab session every week to help you learn how to use STATA.

 

The midterm is an IN-CLASS, CLOSED-BOOK exam covering the material in the first seven weeks.

 

The term paper is designed to help you gain some hands-on research experience. After the midterm, you will be asked to pick a topic that interests you, identify a data source, conduct statistical analysis with techniques you have learned or are learning in the course, and draw conclusions. Details will be announced later.

 

PREREQUISITES:

Although there is no prerequisite for the course, an undergraduate-level statistics course is strongly recommended. I assume that you are familiar with the material in Chapters 1-3 of the Agresti and Finlay book (see “TEXTBOOKS” below), and have some knowledge but an imperfect understanding of the material in Chapters 4-6. In the first part of the course, we will review this material, but at an accelerated pace and/or in more depth than is typical of an introductory statistics course for undergraduate students. If you think you need a thorough review of basic statistics, a good text I can recommend is David S. Moore (2002) The Basic Practice of Statistics. New York: W. H. Freeman and Company. The first six chapters are available in PDF via our course WebCT.

 

High school algebra, either remembered or re-learned, will also be necessary to get through the course, though formula derivations are presented only when necessary.

 

I have taught this course for the past three years, with MA, M.Phil., and Ph.D. students mixed in class. Starting this year, the Social Science Division offers a separate methods and statistics course to self-financed MA students. As a result, SOSC509 is mainly for research students, and evaluation standards have been raised. MA students are encouraged to take SSMA501 “Principle of Social Science,” unless they have a strong interest in quantitative social research and are prepared for the intellectual challenges of the course.

 

TEXTBOOKS:                                                                                        

     ● Required

Alan Agresti and Barbara Finlay, 1997. Statistical Methods for the Social Sciences (3rd edition). Upper Saddle River, NJ: Prentice Hall.

 

     ● Recommended

Lawrence C. Hamilton, 2006. Statistics with STATA (Updated for Version 9). Belmont, CA: Duxbury Press.

 

The required book is available for purchase at the HKUST bookshop. Your IA will hold a copy of the recommended book, and you are welcome to borrow it.

 

COMPUTING AND THE INTERNET LEARNING RESOURCES: 

You will be doing a substantial amount of data analysis with a software package called STATA 9.2. Among the popular statistical packages, STATA is a fast and efficient one that includes most of the procedures of interest to social scientists (http://www.stata.com).

 

You can either buy the software package under the “STATA Grad Plan” program, or gain access via the social science computing lab. We will rely on course handouts/examples, on-line help, and instruction in lab sessions to teach you how to use STATA step by step, enough to put you in a position to do the exercises and conduct basic statistical analysis. Because time is limited, we will not be able to cover every aspect of the software. Should you have further interest, you can learn on your own with the help of Hamilton’s book and other on-line resources.

 

Although some of you may prefer to use SPSS, I strongly recommend STATA (for reasons, see http://www.stata.com/info/whystata/). Homework using SPSS is NOT acceptable.

 

You can find the syllabus, homework assignments, data sets, and links to other learning resources on the course website (http://lmes2.ust.hk). Use your ITSC username and password to log in. The lecture notes will be distributed in class. All copyrights, however, are reserved.

 

Free learning resources for STATA can be found at:

http://www.ats.ucla.edu/stat/stata/webbooks/reg/default.htm

http://www.biostat.au.dk/teaching/software/

 

Specifically, all examples in Agresti & Finlay have been programmed in STATA so that you can replicate the results with the package at http://www.ats.ucla.edu/stat/examples/smss/

 

DATASETS:

What we deal with in statistics are not pure numbers, but numbers with social meanings and contexts. Over the course I will rely less on the data used in the textbook (which are either hypothetical or too “American”) and more on real data from contexts you are more familiar with (China or Hong Kong). I have prepared two datasets, which you will also be asked to analyze in homework assignments.

1. The Hong Kong By-Census Micro-Data (1996). The data include 309,879 individuals nested in 93,051 households, and may be further modified for you to analyze. A subset of the data is generated for your assignments and final paper.

2. The Chinese Life History and Social Change Survey (1996), a multistage stratified national probability sample of 6,090 adults aged 20 to 69 from all regions of China (Treiman and Walder, Principal Investigators). The survey gathered extensive information on respondents’ job activities. The sample includes 3,003 rural cases and 3,087 urban cases. Only a small number of variables are selected.

3. Other data collected in China will also be used occasionally for demonstration.

 

INSTRUCTION FORMAT:

The course consists of a three-hour lecture and a one-and-a-half-hour lab session each week.

    ● Lectures are designed to elaborate on the conceptual issues covered in your readings. Usually I spend 2 hours going over the concepts/theoretical ideas, and the remaining hour illustrating them with examples using STATA. I will try to make STATA programs available to you (without detailed annotations).

 

    ● Lab sessions, run by the IA, are mainly designed to give you hands-on experience using STATA for homework problems. Time may be spent on: (a) feedback on prior problems; (b) instruction in computing for upcoming assignments; (c) worked examples relevant to lecture materials and upcoming assignments. The first lab session will be held next week to introduce you to the STATA software.

 

    ● Office hours. Please feel welcome to make full use of our office hours for discussions concerning homework assignments, exams, or other matters about the course. Extra appointments are possible. I will be in my office most of the day, so please feel free to stop by. Email and the WebCT discussion board are also good ways to communicate.

 

 


TOPIC OUTLINE AND TENTATIVE SCHEDULE (SUBJECT TO REVISION)

 

Week 1.  Introduction (5 September)

Topics:

Social science research; variables and measurement; sampling methods

Readings:

            Agresti and Finlay, Chapters 1-2

            Yu Xie, 2005. “Methodological Contradictions of Contemporary Sociology”

             

Week 2. Descriptive Statistics (12 September)

Topics:

Tabular and graphical displays of data; measures of central tendency and dispersion; sample statistics and population parameters

Readings:

            Agresti and Finlay, Chapter 3

 

Assignment 1 due on 12 September.

 

Week 3 & 4. Probability Distributions (19 & 26 September)

Topics:

Discrete and continuous random variables; normal distribution; sampling distribution

Readings:

            Agresti and Finlay, Chapter 4

           

Assignment 2 due on 19 September.

 

Week 5. Statistical Inference I: Basics (3 October)

Topics:

Central limit theorem; point estimation; confidence interval; choice of sample size 

Readings:

            Agresti and Finlay, Chapter 5

 

Assignment 3 due on 3 October.

 

Week 6. Statistical Inference II: Hypothesis Testing (10 October)

Topics:

Hypothesis testing; t distribution; type I and type II errors

Readings:

            Agresti and Finlay, Chapter 6   

 

Assignment 4 due on 10 October.

 

Week 7. Statistical Inference III: Bivariate Relationships (17 October)

Topics:

            T-test, one-way ANOVA; cross-tabulation; chi-square test of independence  

Readings:

            Agresti and Finlay, Chapters 7:1-3, 8:1-3

Assignment 5 due on 17 October.

 

Week 8. Midterm (24 October)

 

Week 9.  Simple Linear Regression (31 October)

Topics:

            Simple linear regression; OLS estimates; correlation coefficient; goodness of fit   

Readings:

            Agresti and Finlay, Chapter 9

 

Week 10. Special Topics: How to Write a Research Paper  (7 November)

Readings: TBA

 

Week 11. Multiple Regression I (14 November)

Topics:

            Causality; statistical control; multiple regression model estimation  

Readings:

            Agresti and Finlay, Chapters 10; 11:1-2

 

Assignment 6 due on 7 November.

           

Week 12. Multiple Regression II (21 November)

Topics: Inference for regression; reporting regression results; variable transformation

Readings:

            Agresti and Finlay, Chapter 11:4

 

Week 13. Multiple Regression III (28 November)

Topics:

            Interaction effects with dummy variables; analysis of variance and covariance

Readings:

            Agresti and Finlay, Chapter 12

 

Assignment 7 due on 21 November.

 

Week 14. Multiple Regression IV (5 December)

Topics: 

Assumptions of linear models; non-linearity; multi-collinearity; regression diagnostics

Readings:

            Agresti and Finlay, Chapters 13-14

           

Assignment 8 due on 5 December.

 

The term paper is due no later than 12:00 p.m., 15 December.

 

MERRY CHRISTMAS AND HAPPY NEW YEAR!