The analyses in this report are based on data from the Programme for International Student Assessment (PISA). This project was carried out under the auspices of the Organisation for Economic Co-operation and Development (OECD) and was intended to measure students' skills in three domains: reading, mathematics and science. In 2000, over 30,000 Canadian 15-year-olds, drawn from 1,200 schools, participated in PISA, in addition to 15-year-olds in 32 other countries.
The outcome of interest is the reading proficiency of 15-year-olds, irrespective of their grade in school. Thus, the PISA reading test is not a curriculum-based test. The survey also measured proficiency in mathematical and scientific literacy; however, these were minor domains, and not all students completed an assessment in them. In PISA, reading literacy is defined as: understanding, using and reflecting on written texts, in order to achieve one's goals, to develop one's knowledge and potential, and to participate in society (OECD, 2001).
In this paper, following the international literature on the definition of immigrants, we subdivide the student population into three groups: native-born students, first-generation students and immigrant students.
Accordingly, the sample consists of 80.7% (276,823) native-born students, 10.2% (35,091) first-generation students and 9.0% (30,971) immigrant students.
The PISA sampling frame is hierarchical by design: schools are sampled first across Canada, and students are then randomly selected within schools. The data therefore allow students to be nested within schools. To partition the variance in reading scores into within-school and between-school components, we use hierarchical linear modelling (HLM).
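The idea of splitting score variance into a within-school and a between-school part can be illustrated with a minimal sketch. It uses a one-way ANOVA estimator on a balanced design, which is a simpler stand-in for the estimation an HLM package performs, and it runs on simulated scores only. The function names, the number of schools and the standard deviations are all hypothetical choices made for illustration, not values from the PISA data.

```python
import random
from statistics import mean


def variance_components(schools):
    """One-way ANOVA estimates for a balanced design.

    `schools` is a list of equally sized lists of student scores.
    Returns (within_school_variance, between_school_variance).
    """
    n = len(schools[0])                       # students per school
    if any(len(s) != n for s in schools):
        raise ValueError("this sketch assumes equal school sizes")
    j = len(schools)                          # number of schools
    school_means = [mean(s) for s in schools]
    grand_mean = mean(school_means)

    # Mean squares between and within schools.
    ms_between = n * sum((m - grand_mean) ** 2 for m in school_means) / (j - 1)
    ms_within = sum(
        (y - m) ** 2 for s, m in zip(schools, school_means) for y in s
    ) / (j * (n - 1))

    within = ms_within
    between = max((ms_between - ms_within) / n, 0.0)  # truncate at zero
    return within, between


def intraclass_correlation(schools):
    """Share of total variance that lies between schools."""
    within, between = variance_components(schools)
    return between / (within + between)


# Simulated data only: 200 hypothetical schools with 25 students each.
rng = random.Random(0)
simulated = []
for _ in range(200):
    school_effect = rng.gauss(0, 30)          # between-school SD (made up)
    simulated.append(
        [500 + school_effect + rng.gauss(0, 80) for _ in range(25)]
    )

print(f"share of variance between schools: {intraclass_correlation(simulated):.3f}")
```

A share close to zero would suggest that schools differ little in average reading scores, in which case a single-level regression loses little; a larger share is the usual motivation for a multilevel model.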
In addition, the PISA 2000 survey in Canada collected background information from three different sources: the student questionnaire, the parent questionnaire and the school questionnaire. The survey therefore provides a rich set of variables drawn from different levels, allowing individual, family and school level characteristics to be examined in the HLM analysis.
The first set of multivariate analyses, which examines the differences in reading skills between the three groups, uses ordinary least squares (OLS) regression, with appropriate statistical methods to account for the sample design and for measurement error in the data. This allows us to answer the first two questions outlined in the introduction.
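One common way to account for measurement error in PISA analyses is to run the same regression once per plausible value of the reading score and then pool the results, with the sampling variance of each run typically obtained from replicate weights. The sketch below shows only the pooling step, using the standard multiple-imputation combining rule. It is an assumption that this matches the procedure used here, and the function name and all input numbers are made up for illustration.

```python
from statistics import mean, variance


def combine_plausible_values(estimates, sampling_variances):
    """Pool one coefficient estimated separately on each plausible value.

    estimates          -- the coefficient from each plausible-value run
    sampling_variances -- the sampling variance of that coefficient in
                          each run (e.g. from replicate weights)
    Returns (pooled_estimate, pooled_standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(sampling_variances):
        raise ValueError("need matching inputs from at least two runs")

    pooled = mean(estimates)
    within = mean(sampling_variances)      # average sampling variance
    between = variance(estimates)          # measurement (imputation) variance
    total = within + (1 + 1 / m) * between
    return pooled, total ** 0.5


# Hypothetical score gap estimated on five plausible values.
gaps = [30.0, 32.0, 31.0, 29.0, 33.0]
sampling_vars = [4.0, 4.0, 4.0, 4.0, 4.0]

estimate, std_error = combine_plausible_values(gaps, sampling_vars)
print(f"pooled gap = {estimate:.1f}, standard error = {std_error:.2f}")
```

The pooled standard error is larger than the one from any single run because it adds the variation across plausible values to the sampling variance.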
To analyze variation at the individual and school levels, the HLM method is used. The following set of equations describes the basic two-level structure of the model:
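Level 1:

$$Y_{ij} = \beta_{0j} + \beta_{1j} X_{ij} + r_{ij} \qquad [1.1]$$

Level 2:

$$\beta_{0j} = \gamma_{00} + \gamma_{01} W_{j} + u_{0j} \qquad [1.2]$$

$$\beta_{1j} = \gamma_{10} + \gamma_{11} W_{j} + u_{1j} \qquad [1.3]$$

where

$\gamma_{00}$ = average school mean

$\tau_{00} = \mathrm{Var}(u_{0j})$ = population variance among the school means

$\tau_{01} = \mathrm{Cov}(u_{0j}, u_{1j})$ = population covariance between slopes and intercepts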
In this paper, level 1 is the student and level 2 is the school. Equation 1.1 shows that the student-level regression is a function of the student-specific regressors included in vector X. Equations 1.2 and 1.3 show that the intercept and slope coefficients estimated for students within schools are allowed to vary randomly across schools and can also be functions of school-level variables, as indicated by vector W. We estimate five different specifications of the HLM model, which are explained in detail in section VI.
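Substituting the two school-level equations into the student-level equation gives a single combined model, which makes the role of each term easier to see. The form below is a sketch that assumes conventional two-level notation: $Y_{ij}$ is the reading score of student $i$ in school $j$, the $\gamma$ terms are fixed effects, $u_{0j}$ and $u_{1j}$ are school-level random effects for the intercept and slope, and $r_{ij}$ is the student-level residual.

```latex
Y_{ij} = \gamma_{00} + \gamma_{01} W_{j} + \gamma_{10} X_{ij}
       + \gamma_{11} W_{j} X_{ij}
       + u_{0j} + u_{1j} X_{ij} + r_{ij}
```

The cross-level term $\gamma_{11} W_{j} X_{ij}$ captures how a school characteristic changes the effect of a student characteristic, while $u_{0j} + u_{1j} X_{ij} + r_{ij}$ is the composite error that a single-level OLS regression would treat as independent across students.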