RESEARCH QUESTION

Questions, possible answers, and our approach


Questions posed by the project:

  • How have the returns to education been changing?
  • Have these changes resulted in changes in the shape of the earning function?
  • How have any changes differed over age groups?
  • Should you believe OLS results?
  • What are the results telling us about education policy?
  • In modelling the returns to education what are the key econometric issues that matter?

 

ABOUT THE PROJECT

The question and (possible) answers

  • How have the returns to education been changing for skilled workers?
    In the US (and possibly in the UK) they have risen at least over some of the period since the 1970s.
  • Two broad answers to that question:
    Skill biased technical change
    Trade effects of increasing globalisation
  • The answer that probably most economists would back is the skill biased technical change explanation.
    Based on findings that measures of technology impinge on wages when we control for education and that the returns to education appear to have increased.

We are going to approach the problem from the perspective of low skill economies

  • There is a common argument that the returns to education are highest at lower levels, i.e. the earnings function is concave.
  • Does this show up across countries i.e. those with lower levels of education have higher returns?
  • Does this show up within poor countries?
  • Is any skill effect on earnings symmetrical, i.e. in low-skill, low-growth economies do we observe falls in the returns to education?

Why are Kenya and Tanzania interesting for these general questions?

  • Both are clearly low skill economies.
  • For both we have data from John Knight for the 1980s so we can look at long run effects.
  • For both we have repeated cross-sections for the 1990s for manufacturing workers, so we can assess whether there are any shifts over time when we know that there was no technical progress in these firms.
  • For both the sample contains a substantial proportion of low and high education workers.

Data and the Model

Kenya and Tanzania Data

                         Kenya                     Tanzania
                         Mean     Standard Dev.    Mean     Standard Dev.
Earnings(1)              74.6     117.4            54.7     71.2
Years of Education       9.1      2.9              8.8      3.5
Age                      33.9     9.1              35.5     10.0
Years of Tenure          7.9      7.2              8.1      7.3
Male Dummy               0.85                      0.80
Works in Capital City    0.64                      0.44
Old Age group(2)         0.57                      0.63
Observations             4039                      2738

(1) Earnings are in US$

Figure 1 Sample Proportion against Years of Education

Figure 2

Estimating the Earnings Function

  • In investigating the shape of the earnings function we follow Belzil and Hansen (2002), and several other authors, in assuming that individual differences in realised returns to schooling are due to the shape of the earnings function being nonlinear.
  • We do not wish to impose the precise form of the shape of function and therefore estimate the equation using a semi-parametric approach modelling the earnings-education profile as a piecewise linear spline function.
  • Variables included in the controls are years of tenure, age and age squared, a dummy variable for whether the individual is a male or not and a dummy variable for whether the individual lives in the capital city.
  • Our data begin in 1993 and span seven years for Kenya and eight years for Tanzania, and central to our concerns is whether the returns to education have changed over this period and whether there are differences across age groups. Therefore we estimate period-age group specific profiles.

Earnings, Endogeneity and Age

  • So far we have focused on the shape of the function implicitly abstracting from the concerns of endogeneity which have been extensively investigated (Card (2001) gives what has become a seminal review).
  • Later we will consider the possible role of instruments and biases in the OLS results.
  • Throughout the analysis we put in nodes of the earnings-education profile at 7, 10 and 12 years of education.
  • Using four segments of the earnings-education profile ensures that there is a reasonable number of observations in each category.
  • We divide the data into two age groups only, where an individual is considered ‘young’ if his/her age is less than 30 years and ‘old’ otherwise.
  • This way of dividing up the sample enables us to assess to some extent how changes in the returns to education have affected new entrants into the labour market.
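
The spline terms and the age split described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function names `spline_basis` and `age_group` are hypothetical, and the basis terms are assumed to follow the regressor labels used in Table 6 (Education, max(0,EDUC-7), max(0,EDUC-10), max(0,EDUC-12)).

```python
KNOTS = (7, 10, 12)  # knot positions stated in the text (years of education)

def spline_basis(educ, knots=KNOTS):
    """Piecewise linear spline terms: EDUC, then max(0, EDUC - k) for each knot.

    Hypothetical helper; mirrors the regressor labels shown in Table 6.
    """
    return [educ] + [max(0, educ - k) for k in knots]

def age_group(age):
    """'young' if age is below 30, 'old' otherwise, as defined in the text."""
    return "young" if age < 30 else "old"

print(spline_basis(11))              # [11, 4, 1, 0]
print(age_group(29), age_group(30))  # young old
```

With this basis, the coefficient on the first term is the slope below 7 years of education, and each further coefficient is the change in slope at its knot.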

RESULTS

OLS Results

  • Table 2a shows OLS estimates of the parameters of the earnings function, by year and age group, for Kenya. Table 2b shows results for Tanzania.
  • We focus on the role of education.
     
  • We now investigate whether the data pool over time and/or across age groups.
     
  • We start from a model where the explanatory variables are interacted with time and age group in such a way as to make the specification equivalent to separate earnings functions as in Tables 2a-b, and then test for the joint significance of the relevant interaction terms.
  • Results are reported in Table 3.
  • For both countries we can reject at the one per cent level the hypothesis that all the time and age group effects are jointly zero (row 1), hence the earnings equations in Tables 2a-b do not pool.
  • We accept the null hypothesis that the coefficients on the control variables (i.e. age and its square, tenure, male and capital city) do not vary across age groups and over time (row 2), and firmly reject the hypothesis that the earnings education profile is constant across age groups and over time (row 3).
  • Thus, the control variable effects appear stable over time and across age groups; the education effects do not.
  • We drop the interaction terms associated with the control variables, and the cross terms between time and age group, yielding specification 2.
  • The remaining time and age group effects are highly significant (row 1), and there is strong evidence that the shape of the earnings education profile varies across age groups for both countries (row 8).
  • The interaction terms {time x age group x education} are redundant (row 9), and so we drop these next.
  • Table 4 shows the resulting specifications, our preferred models, and Figures 3a-b show the predicted earnings education profiles.
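
The pooling tests above compare an unrestricted model (all interaction terms included) with a restricted model (the relevant interactions set to zero). One textbook way to carry out such a joint test, assuming homoskedastic errors, is an F-statistic built from the two residual sums of squares. The text does not say which test statistic the authors used, so the sketch below is illustrative only; the function name and the numbers in the usage lines are hypothetical and are not results from Table 3.

```python
def pooling_f_stat(rss_restricted, rss_unrestricted, n_restrictions, n_obs, n_params):
    """F-statistic for H0: the n_restrictions interaction coefficients are jointly zero.

    n_params is the number of coefficients in the unrestricted model.
    """
    numerator = (rss_restricted - rss_unrestricted) / n_restrictions
    denominator = rss_unrestricted / (n_obs - n_params)
    return numerator / denominator

# Hypothetical numbers, for illustration only.
f_stat = pooling_f_stat(rss_restricted=120.0, rss_unrestricted=100.0,
                        n_restrictions=4, n_obs=1010, n_params=10)
print(round(f_stat, 2))
```

A large value relative to the F(n_restrictions, n_obs - n_params) critical value leads to rejecting the hypothesis that the data pool.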

Figure 3a: Predicted Earnings Based on Preferred Specification: Kenya

Figure 3b: Predicted Earnings Based on Preferred Specification: Kenya

Figure 4a: Predicted Earnings Based on Preferred Specification: Tanzania

Figure 4b: Predicted Earnings Based on Preferred Specification: Tanzania
 

OLS Results of the Preferred Model

  • For both countries and for both age groups there is strong evidence that earnings are convex in education.
  • For both countries the results suggest that the earnings profiles differ across age groups: the profile for the young age group is virtually flat below twelve years of education, indicating small or no marginal returns to education before the tertiary level.
  • In Kenya there is a clear upward intercept shift in 1995, which was sustained in 2000 for the old age group but not for the young. There is little evidence of intercept shifts over time in Tanzania.
  • Looking specifically at the first and last wave of the data, we accept the hypothesis that the earnings profiles exhibit the same shape for Kenya, and reject it for Tanzania.
  • For Tanzania there is evidence of a gradual and systematic change in the shape of the earnings profile for more than twelve years of education, with increased convexity as a result.

Robustness

Two robustness checks

  • The first test refers to the sensitivity of the results to functional form.
  • Our spline function is flexible, but linearity within each segment is imposed. This may be too restrictive.
  • Our second robustness check refers to the sensitivity of the results to the inclusion of sector effects.


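Figure 5a: Predicted Earnings Based on Polynomial Specification: Kenya

Figure 5b: Predicted Earnings Based on Polynomial Specification: Kenya

Figure 6a: Predicted Earnings Based on Polynomial Specification: Tanzania

Figure 6b: Predicted Earnings Based on Polynomial Specification: Tanzania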

Comparison with non-African Data

  • Using data from Trostel et al (2002)
  • 28 countries
  • Multiple cross-sections of data
  • Only one developing country, the Philippines
  • No Africa or Asia
  • “There is tenuous evidence that the ROR declines with educational level”
  • They do not draw graphs.

Figure 7: Non-African data

Robustness checks for sector

  • Recall that our data contain individuals working in four manufacturing sub-sectors in the two countries: food, wood, textiles and metal.
  • Such earnings regressions (with sector controls) indicate how education is rewarded within sectors. The pooling test results are shown in Table A.1, panel B.
  • The overall picture is similar to what we have seen above. For both countries the sector effects are jointly significant at the five per cent level or lower (tests not reported).


The Control Function Estimator

  • To retain our flexible non-linear earnings-education profile whilst controlling for unobserved ability effects on earnings and on the returns to education, we adopt a two-stage control function approach.
  • Stage one: regress education on a set of instruments, and compute the residual.
  • Stage two: estimate the earnings function using the first-stage residual as a control variable for ability.
  • Our general empirical model is thus of the form: Equation 6
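
The two stages can be illustrated with a small simulation. This is a deliberately simplified sketch, not the authors' specification: the data are simulated, education enters linearly rather than through the spline, the first-stage residual enters linearly rather than through polynomials, and every name and parameter value (`ols`, `TRUE_RETURN`, the single instrument `z`) is an assumption made for the example.

```python
import random

def ols(X, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]

random.seed(0)
n = 5000
TRUE_RETURN = 0.10                            # assumed true return to education
z = [random.gauss(0, 1) for _ in range(n)]    # instrument (e.g. distance to school)
v = [random.gauss(0, 1) for _ in range(n)]    # unobserved ability
e = [random.gauss(0, 1) for _ in range(n)]    # pure noise in earnings
s = [8 + zi + vi for zi, vi in zip(z, v)]     # education depends on ability
y = [1 + TRUE_RETURN * si + 0.5 * vi + 0.3 * ei
     for si, vi, ei in zip(s, v, e)]          # log earnings

# Stage one: regress education on the instrument and keep the residual.
g = ols([[1.0, zi] for zi in z], s)
v_hat = [si - g[0] - g[1] * zi for si, zi in zip(s, z)]

# Stage two: earnings on education plus the first-stage residual.
b_cf = ols([[1.0, si, vh] for si, vh in zip(s, v_hat)], y)[1]

# Naive OLS for comparison (ignores the ability term).
b_ols = ols([[1.0, si] for si in s], y)[1]

print("OLS:", round(b_ols, 3), "control function:", round(b_cf, 3))
```

In this simulated design ability raises both education and earnings, so naive OLS overstates the return while the control function estimate lies close to the assumed value. The direction of the OLS bias in real data need not match this design.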

Why not 2SLS?

  • For linear models with constant slope coefficients, 2SLS and the control function estimator are equivalent.
  • But the control function approach is more robust than 2SLS when slope parameters co-vary with the unobserved factors of the model (Card, 2001).
  • And even if all slope parameters are constant, 2SLS is likely to result in relatively imprecise parameter estimates since the model is non-linear in the endogenous variable. (For 2SLS we would have to estimate four first stage regressions, modelling each component of the spline function separately, and then use the predictions instead of the actual values in the second stage. A much richer instrument set would thus be required for 2SLS than for the control function estimator.)
  • To implement the control function estimator, we approximate the two control function terms of the empirical model by third-order polynomials.


Instruments and exclusion restrictions

  • In the last wave of the data there is information on:
    • distance to primary school at the age of six,
    • distance to secondary school at the age of twelve,
    • parents' education,
    • parents' main occupations.
  • Distance to school is a supply side measure of education, so should be correlated with education and not with ability (Card, 2001).

 

  • Family background variables have been used as instruments for education in many previous studies, on the grounds that such variables should have no direct causal effects on earnings.

Selectivity Bias?

  • We have considered above a relatively conventional role of unobserved ability in potentially leading to bias. In Kenya and Tanzania, unlike more developed economies, having a job in the wage sector is atypical of outcomes in the labour market.
  • We do not have data on individuals outside the manufacturing sector, so we are limited in our ability to control for endogenous sample selection. In particular, we are unable to use a sample selection model along the lines proposed by Heckman (1976), since we cannot estimate a participation equation.
  • Can the control function address the sample selectivity problem? The answer is yes, provided the instruments are independent of the error term in the selected sample.

 

  • One example of a model in which this will apply is when the job selection model (6) determines an indicator variable equal to one if the individual has a manufacturing job and zero otherwise, and includes an unobserved factor which is potentially correlated with the non-ability component of the error term in the earnings equation.
  • It is shown in the paper that this form of sample selectivity can lead to a downward bias in the return to education, if the selectivity mechanism is sufficiently strong, even though education and ability are positively correlated in the population.
  • But a well specified control function estimator will correct for this problem and give consistent estimates.
  • More general cases of sample selection can be more problematic, however.

 

Summary of Results

Table 6: Control Function Estimates: Kenya 2000 and Tanzania 2001

                                            Kenya                   Tanzania
                                            [1] Young   [2] Old     [3] Young   [4] Old
Education                                   -0.060      0.113       0.106       0.022
                                            (0.169)     (0.052)*    (0.063)+    (0.038)
max(0,EDUC-7)                               0.202       0.109       0.050       0.115
                                            (0.182)     (0.060)+    (0.086)     (0.049)*
max(0,EDUC-10)                              0.154       -0.035      -0.280      -0.081
                                            (0.098)     (0.086)     (0.189)     (0.121)
max(0,EDUC-12)                              0.099       0.313       0.463       0.258
                                            (0.099)     (0.098)**   (0.196)*    (0.181)
Education earnings profile linear (p-val.)  0.00        0.00        0.06        0.00
EXCRES (p-value)(1)                         0.00        0.00        0.00        0.00
RETHET (p-value)(2)                         0.16        0.60        0.95        0.61
EXOGEN (p-value)(3)                         0.00        0.00        0.35        0.58
Observations                                371         579         227         432
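
To read the spline coefficients in Table 6 as marginal returns, the slope in each education segment is the running sum of the coefficients up to that knot. The short sketch below does this arithmetic on the point estimates only. It assumes the regressors enter additively as labelled in the table, it ignores the standard errors (several individual coefficients are imprecisely estimated), and the dictionary, segment labels and function name are hypothetical.

```python
from itertools import accumulate

# Point estimates from Table 6, in the order:
# Education, max(0,EDUC-7), max(0,EDUC-10), max(0,EDUC-12)
TABLE6 = {
    "Kenya young":    [-0.060, 0.202, 0.154, 0.099],
    "Kenya old":      [0.113, 0.109, -0.035, 0.313],
    "Tanzania young": [0.106, 0.050, -0.280, 0.463],
    "Tanzania old":   [0.022, 0.115, -0.081, 0.258],
}
SEGMENTS = ["0-7", "7-10", "10-12", "12+"]  # years of education

def segment_slopes(coefs):
    """Running sums of the spline coefficients: the slope within each segment."""
    return [round(total, 3) for total in accumulate(coefs)]

for name, coefs in TABLE6.items():
    print(name, dict(zip(SEGMENTS, segment_slopes(coefs))))
```

For the old age group in Kenya, for example, the implied slopes are 0.113, 0.222, 0.187 and 0.500. In all four columns the implied slope beyond twelve years exceeds the slope below seven years.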

OLS and control function estimates of the earnings education profile for Kenya:

 

Summary of Results

  • No evidence that the returns to education are correlated with unobserved ability.
  • Kenya: convexity gets more pronounced as a result of treating education as endogenous. Thus, OLS seems to underestimate the differences in marginal returns between those with little education and those with much.
  • Tanzania: effects of controlling for endogeneity of education are smaller. Control function estimates similar to OLS, and we can accept exogeneity.
  • Thus the main conclusion here: Our finding that returns to education are convex is not altered by treating education as an endogenous variable.

Returns rise when education is endogenous – why?

  • A common result in the empirical literature is that the estimated returns to education increase as a result of treating education as an endogenous variable. We obtain a similar result. This appears inconsistent with the idea that unobserved ability leads to upward bias in OLS. Why might this happen?
  • Education is measured with error => OLS downward biased.
  • We are identifying a local average treatment effect (LATE). If there is heterogeneity in the returns to education, and the instruments alter the behaviour mainly of those with high returns, then treating education as an endogenous variable may lead to higher estimates.
  • Methodological problems

Maybe the instruments are invalid. However compared to most other studies of returns to education in Africa we would argue that our data contain what would seem relatively good instruments.

Another possibility is that sample selectivity plays a role. In general it is hard to sign the nature of selectivity bias within our modelling framework. However, if the selectivity equation is of the form discussed above under Selectivity Bias, the control function will yield consistent estimates under certain assumptions that have already been discussed. OLS would be biased downward if the selectivity mechanism is relatively strong.

Given the information available, we cannot determine whether there is support for this and we leave it for future research to probe these issues further.

Conclusions

  • There is limited empirical evidence on changes in the returns to education in developing countries over long periods of time.
  • We have documented changes in the returns to education in Kenya and Tanzania during the 1990s and also compared with earlier work by Knight and Sabot (1990).
  • In the long run the pattern across the two countries has been very different. Kenya: large long-run falls in the return to education at the post-primary level; Tanzania: no such falls. Indeed we find for Tanzania an increase in the return to education in the 1990s.
  • The average return has risen in both countries – could this be due to “skill biased technical change”? No - the rate of technological progress in manufacturing has been very low in these countries over the sample period.
  • Knight and Sabot (1990) argue that the high returns in Kenya in the 1980s relative to Tanzania reflected a willingness to allow market processes to work in Kenya relative to Tanzania. Over the 1990s Tanzanian policies have become much more similar to those of Kenya. By the end of the 1990s the earnings profile was virtually identical for the two countries.
  • What are the key econometric issues?
  • Functional form: Is the earnings function linear, concave or convex? Does it matter?
  • Unobserved ability: Education is a choice variable and so may be correlated with the earnings residual. How do we test for this whilst allowing for a flexible functional form?
  • We find strong evidence that the earnings function is convex, and typically more so in the young age group. In Tanzania there is increasing convexity over the 1990s; in Kenya, stability.
  • The findings are robust to allowing for the possibility that education is endogenous. In fact, there is no evidence that unobserved ability biases the OLS results in the way that is typically presumed in theory (and rarely verified in practice).

Does convexity matter?

  • Convexity matters for policy – returns are very high at high levels, and low at low levels. The average return (from linear model) masks this completely. Aggregate return to expanding education depends on who gets the additional education. Heterogeneity in education feeds into higher income inequality, ceteris paribus, under convexity than under concavity.
  • Convexity should matter for theory – concavity (as in David Card's model) does not appear a good starting point if we want to write down a behavioural model. Clearly allowing the theoretical earnings function to be convex raises analytical issues as to why not everyone gets a PhD – marginal costs could be increasing, there could be binding credit constraints that stop students from proceeding to higher levels, etc.
  • Our results link to one of the micro-macro puzzles in the development literature: Why, at the macro level, has the expansion of education in Africa during the last two decades generated so little growth, while at the micro level the average returns to education appear high? With convexity, these results could be reconciled if the expansion of education has primarily occurred on relatively flat segments of the earnings function.

RESEARCHERS

Måns Söderbom

Francis Teal

Director GPR

Deputy Director CSAE, microeconomics

CSAE

Godius Kahyarara

Anthony Wambugu

DOCUMENTS AND LINKS

The dynamics of returns to education in Kenyan and Tanzanian manufacturing

G. Kahyarara, M. Söderbom, F. Teal and A. Wambugu

CSAE Working Paper WPS/2003-17, 2003