  • Original Research
  • Open access

What determines adult cognitive skills? Influences of pre-school, school, and post-school experiences in Guatemala


Most empirical investigations of the effects of cognitive skills assume that they are produced by schooling. Drawing on longitudinal data to estimate production functions for adult verbal and nonverbal cognitive skills, we find that: (1) School attainment has a significant and substantial effect on adult verbal cognitive skills but not on adult nonverbal cognitive skills; and (2) Pre-school and post-school experiences also have substantial positive significant effects on adult cognitive skills. Pre-school experiences captured by height for age at 6 years substantially and significantly increase adult nonverbal cognitive skills, even after controlling for school attainment. Post-school tenure in skilled jobs has significant positive effects on both types of cognitive skills. The findings (1) reinforce the importance of early life investments; (2) support the importance of childhood nutrition (“Flynn effect”) and work complexity in explaining increases in nonverbal cognitive skills; (3) call into question interpretations of studies reporting productivity impacts of cognitive skills that do not control for endogeneity; and (4) point to limitations in using adult school attainment alone to represent human capital.

1 Introduction

Increasing the stock of human capital is central to expanding individual options and to economic development. One dimension of human capital is cognitive skills: the capacity to assess and solve problems or, in Schultz’s (1975) memorable phrase, “the ability to deal with disequilibria.” Cognitive skills are widely believed to affect productivity in many activities, including work, raising children, and improving one’s own and others’ health and nutrition. A substantial empirical literature, mostly based on the relationship between school attainment and the outcomes of adult activities, is interpreted in support of these possible influences.Footnote 1

Consequently, it is critical to understand the processes by which dimensions of adult human capital, such as cognitive skills, are determined. This may seem to be well-trodden territory, as there are hundreds of studies of the determinants of schooling.Footnote 2 But school attainment is not necessarily a direct measure of adult cognitive skills. Instead, it can be considered one of the inputs, though likely an important one, into the production of those skills. Moreover, in developing countries, schooling typically is limited to particular periods of individuals’ lives, childhood and adolescence. Other experiences, both before and after one’s school years, also may affect cognitive skills. For example, much literature emphasizes the importance of nutrition from conception onward for neural and cognitive development.Footnote 3 There is also a substantial literature that emphasizes the importance of post-school experiences, particularly in the labor market, in determining adult cognitive skills or in determining productivity and wages, which usually are interpreted as reflecting such skills.Footnote 4 If pre-school or post-school experiences have significant impacts on the cognitive skills of adults and are correlated with schooling—as is likely if there is correlation among human capital investments across life-cycle stages—analyses that use school attainment alone to represent adult cognitive skills are problematic.

In this paper, we examine the importance of pre-school, school, and post-school experiences in the production of both verbal and nonverbal cognitive skills of adults.Footnote 5 We do this in Guatemala, a low-income context in which it is widely expected that greater cognitive skills will lead to increases in individuals’ options and welfare, and consequently advance economic development.

We utilize data that have been collected over 35 years (1969–2004) with sample members 25–42 years of age at the final survey and with substantial prospective and recall information on their development through the pre-school, school, and post-school years, as well as on a number of factors that conditioned these experiences. We investigate the effect of experiences during these three life-cycle stages on adult cognitive skills, incorporating the possibility that the experiences reflect behavioral choices in the presence of unobserved factors that might affect adult cognitive skills directly, as well as indirectly via their effects on the different life-cycle stage experiences. In addition to using plausibly exogenous instruments including an experimental intervention in childhood, identification is strengthened by the inclusion of the different life-cycle stages.

We find that (1) school attainment has a significant and substantial effect on adult verbal cognitive skills, but not on adult nonverbal cognitive skills; (2) pre-school and post-school experiences—represented, respectively, by the pre-school nutritional status of the individual and years of skilled post-school work experience—have substantial and significant effects on both of the adult cognitive skill measures; and (3) failure to control for the behavioral determinants of these experiences leads to substantially overestimating the importance of schooling while at the same time underestimating the importance of pre- and post-school experiences.

2 Conceptual framework

We estimate production functions for two types of adult cognitive skills, verbal (reading and vocabulary) and nonverbal. We posit that each skill is produced by the following: a vector of genetic and other observed and unobserved endowments related to learning capacities and motivations ($\mathbf{E}_0$); previous experiences ($\mathbf{E}_i$, $i = 1, 2, 3$ for the three life-cycle stages defined below); and a stochastic term ($U$) reflecting all other idiosyncratic, and assumed exogenous, experiences related to cognitive skill formation. Because of the nearly exclusive emphasis in the literature on schooling, and its plausibly important role in forming cognitive skills, we organize an individual’s experiences into three life-cycle stages:

  • Stage 1: pre-school (from conception through about age 6 years)

  • Stage 2: school (from age 7 through about age 15 years)Footnote 6

  • Stage 3: post-school (from about age 15 years old to age at the time of the final survey, i.e., 25–42 years).

The production functions for the two types of adult cognitive skills (K) areFootnote 7

$$ K_{3v} = K_{3v}^{\text{p}} \left( \mathbf{E}_{1}, \mathbf{E}_{2}, \mathbf{E}_{3}, \mathbf{E}_{0v}, U_{3v} \right), \tag{1v} $$

$$ K_{3n} = K_{3n}^{\text{p}} \left( \mathbf{E}_{1}, \mathbf{E}_{2}, \mathbf{E}_{3}, \mathbf{E}_{0n}, U_{3n} \right). \tag{1n} $$

The first subscript refers to the life-cycle stage, the second subscript refers to verbal cognitive skills (v) or nonverbal cognitive skills (n), and the superscript p indicates that the relation is a production function. While all variables except the left-side dependent variables and the disturbance terms are potentially vectors (indicated in bold), for tractability in the subsequent empirical analysis, we treat $\mathbf{E}_1$, $\mathbf{E}_2$, and $\mathbf{E}_3$ as scalars E1, E2, and E3.

Our main research questions relate to the first derivatives (i.e., $K^{\text{p}}_{E1}$, $K^{\text{p}}_{E2}$, $K^{\text{p}}_{E3}$) of the general adult cognitive skills production functions. If $K^{\text{p}}_{E1}$ is significantly positive and E1 and E2 are positively correlated, for example, a specification that excludes E1 is likely to overestimate the effect of experiences during the school years (E2). Identification of relations (1v) and (1n) is challenging, however, because the experiences for the three life-cycle stages on the right side all reflect previous behavioral choices.
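As an illustration of the overestimation point, consider a linear two-input version of the production function. The linear form and the coefficients $\beta_0$, $\beta_1$, and $\beta_2$ are simplifying assumptions for exposition, not the specification estimated in this paper; E3 is left out and $U$ is taken to be uncorrelated with E1 and E2:

$$ K = \beta_0 + \beta_1 E_1 + \beta_2 E_2 + U. $$

If E1 is omitted, the least-squares coefficient on E2 converges to

$$ \operatorname{plim} \hat{\beta}_2 = \beta_2 + \beta_1 \frac{\operatorname{Cov}(E_1, E_2)}{\operatorname{Var}(E_2)}, $$

so with $\beta_1 > 0$ and $\operatorname{Cov}(E_1, E_2) > 0$ the estimated effect of E2 exceeds $\beta_2$.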

To motivate our modeling of these life-cycle stage experiences and to elucidate the possible influences of the endowments on estimates that do not control for all three experiences, we describe a stylized model in which the “dynasty” (first the parents, then the children themselves as they age into youth and adulthood) makes decisions as if it were maximizing a welfare function (W) for each individual in adulthood that depends, inter alia, on the adult cognitive skills of that individual:

$$ W = W\left( K_{3v}, K_{3n}, \ldots, U_{3W} \right). \tag{2} $$

For instance, W might represent consumption that is financed by resources generated by labor earnings that depend in part on cognitive skills. Welfare is maximized subject to the constraints at each life-cycle stage related to relevant current and expected production functions, family resources allocated to this individual, community characteristics (including community services and markets that affect household decisions), and stochastic factors.

Life-Cycle Stage 1 (pre-school years): The parents allocate resources to obtain the optimal E1 given a production function and the current community-determined options (e.g., availability of nutritional and health programs), expected future community-determined options (e.g., schooling options in the second life-cycle stage and labor market options in the third), the expected relation between E1 and W, and child-specific endowments. The E1 production function is

$$ E_{1} = E_{1}^{\text{p}} \left( \mathbf{N}_{1}, \mathbf{C}_{1\text{p}}, \mathbf{E}_{0\text{p}}, U_{1\text{E}} \right), \tag{3} $$

where $\mathbf{N}_1$ is a vector of family-determined inputs into the production of E1 (e.g., family-provided nutrients), $\mathbf{C}_{1\text{p}}$ is a vector of community inputs into the production of E1 (e.g., community-provided nutrients), $\mathbf{E}_{0\text{p}}$ is the vector of child endowments that directly enter into the production of E1 (e.g., innate robustness), and $U_{1\text{E}}$ is a stochastic disturbance term that directly affects the production of E1 (e.g., fluctuations in the infectious disease environment). Parents choose the inputs $\mathbf{N}_1$ and, therefore, E1 to maximize the expected welfare W, given the following:

  • a vector of parental family resources such as parental schooling and assets ($\mathbf{F}_1$);

  • all relevant community characteristics for this life-cycle stage, $\mathbf{C}_1$ (which includes the community characteristics that directly affect the production of E1 through $\mathbf{C}_{1\text{p}}$, but also other community characteristics that may affect the household through other channels);

  • all of the child endowments, $\mathbf{E}_0$ (which includes $\mathbf{E}_{0\text{p}}$, but also $\mathbf{E}_{0v}$ and $\mathbf{E}_{0n}$, which affect the decision to invest in E1 because the impact of E1 on W in general may depend on these other endowments);

  • all the stochastic terms that affect outcomes in the first life-cycle stage of the child, $U_1$ (which includes $U_{1\text{E}}$ but also other stochastic factors that affect the family during the first life-cycle stage for this child; for example, stochastic factors affecting the health of other siblings may influence the inputs devoted to this child); and

  • the expected values of these variables in the next two life-cycle stages ($\mathbf{F}^{e}_{12}$, $\mathbf{F}^{e}_{13}$, $\mathbf{C}^{e}_{12}$, $\mathbf{C}^{e}_{13}$, $U^{e}_{12}$, $U^{e}_{13}$), where the superscript e indicates that the variable is an expected value, the first subscript refers to the life-cycle stage at which the expectations are held, and the second subscript refers to the stage for which the expectations are held. These enter because the optimal decision for investing in E1 to maximize W depends in part on expectations of these variables over the next two life-cycle stages.

The resulting relation is

$$ E_{1} = E_{1}^{d} \left( \mathbf{F}_{1}, \mathbf{C}_{1}, \mathbf{E}_{0}, U_{1}, \mathbf{F}_{12}^{\mathbf{e}}, \mathbf{F}_{13}^{\mathbf{e}}, \mathbf{C}_{12}^{\mathbf{e}}, \mathbf{C}_{13}^{\mathbf{e}}, U_{12}^{e}, U_{13}^{e} \right), \tag{4} $$

where the superscript d indicates that it is a reduced-form demand relation.

Life-Cycle Stage 2 (school years): The dynasty (initially the parents but with time increasingly the child) decides on the optimal E2 for the child/youth conditional on (a) the outcome of stage 1, E1, (b) life-cycle stage 2 family, community, and stochastic factors, and (c) the expected values of those factors for life-cycle stage 3. Using (4), this yields the reduced-form demand relation for E2:

$$ E_{2} = E_{2}^{d} \left( \mathbf{F}_{1}, \mathbf{C}_{1}, \mathbf{E}_{0}, \mathbf{F}_{12}^{\mathbf{e}}, \mathbf{F}_{13}^{\mathbf{e}}, \mathbf{C}_{12}^{\mathbf{e}}, \mathbf{C}_{13}^{\mathbf{e}}, \mathbf{F}_{2}, \mathbf{C}_{2}, \mathbf{F}_{23}^{\mathbf{e}}, \mathbf{C}_{23}^{\mathbf{e}}, U_{1}, U_{2}, U_{12}^{e}, U_{13}^{e}, U_{23}^{e} \right). \tag{5} $$

If the outcome of stage 1, E1, encompasses all the impacts of first life-cycle stage exogenous variables,Footnote 8 then the conditional (on E1) demand function for E2 is

$$ E_{2} = E_{2}^{c} \left( E_{1}, \mathbf{F}_{2}, \mathbf{C}_{2}, \mathbf{F}_{23}^{\mathbf{e}}, \mathbf{C}_{23}^{\mathbf{e}}, U_{2}, U_{23}^{e} \right), \tag{6} $$

where the superscript c refers to a conditional reduced-form demand relation. This relation allows for the possibility, for example, that the investment in E2 is dependent on E1.

Life-Cycle Stage 3 (post-school years): The dynasty (primarily the youth/young adult, but with possible input from parents) decides on the individual’s post-school experience E3 conditional on (a) the outcome of the first-stage E1, (b) the outcome of the second stage E2, and (c) the third stage family, community, and stochastic factors, yielding the reduced-form demand relation for E3:

$$ E_{3} = E_{3}^{d} \left( \mathbf{F}_{1}, \mathbf{C}_{1}, \mathbf{E}_{0}, \mathbf{F}_{12}^{\mathbf{e}}, \mathbf{F}_{13}^{\mathbf{e}}, \mathbf{C}_{12}^{\mathbf{e}}, \mathbf{C}_{13}^{\mathbf{e}}, \mathbf{F}_{2}, \mathbf{C}_{2}, \mathbf{F}_{23}^{\mathbf{e}}, \mathbf{C}_{23}^{\mathbf{e}}, \mathbf{F}_{3}, \mathbf{C}_{3}, U_{1}, U_{2}, U_{3}, U_{12}^{e}, U_{13}^{e}, U_{23}^{e} \right). \tag{7} $$

The conditional demand function for E3 under the assumption that the outcome of stage 2, E2, encompasses all the impacts of second life-cycle stage predetermined variables (including E1) is

$$ E_{3} = E_{3}^{c} \left( E_{2}, \mathbf{F}_{3}, \mathbf{C}_{3}, U_{3} \right). \tag{8} $$

This relation allows for the possibility, for example, that post-school experiences depend, inter alia, on previous school experiences and, through schooling, on pre-school experiences.

This conceptual framework yields a number of important implications for the estimation of adult cognitive skills production functions.

  1. Omitted variable bias: If the true relations (1v) and (1n) include all three life-cycle stages, excluding one or more of them may lead to substantial omitted variable biases. The endowments and the actual or expected values of the family, community, and stochastic factors for the three life-cycle stages are all on the right side of each of the reduced-form demand relations for each individual life-cycle stage (relations 4, 5, and 7) and, therefore, the three life-cycle experiences are likely correlated. The same conclusion can be drawn from the conditional demand functions (relations 6 and 8) in which school experience (E2) depends explicitly on pre-school experience (E1) and post-school experience (E3) depends explicitly on the school experience (E2), as well as indirectly on pre-school experience (E1). That the three life-cycle stage experiences are likely to be correlated is also intuitive. A priori, a child who lives in a household with better parental family background or who lives in a community with better health and educational services or employment opportunities is likely not only to have more schooling but also better pre- and post-school experiences.

  2. Endogeneity bias: Even when all three life-cycle stages are included, estimates of relations (1v) and (1n) that do not attempt to control for their behavioral determinants are also likely to be biased. Moreover, the signs of such biases are ambiguous. For instance, “ability bias” is consistent with positive correlation between E2 (e.g., if measured with school attainment) and the innate ability component of $\mathbf{E}_0$, suggesting a likely upward bias on E2 in ordinary least squares (OLS) estimates. However, if the summary measure of pre-school experience is some variable such as child nutritional status (see Sect. 3 below), and if ability and physical endowments that influence nutritional status are negatively correlated as suggested by Behrman and Rosenzweig (2002), estimates for child nutritional status (E1) in relations (1v) and (1n) may be negatively biased while at the same time estimates for school attainment (E2) are positively biased.Footnote 9

  3. The same potential instruments identify all three life-cycle stage experiences: The reduced-form demand relations for the three life-cycle stage experiences in relations (4), (5), and (7) suggest potential instruments that could be used to identify the different life-cycle stage experiences in the adult cognitive skills production functions in (1v) and (1n). Although some instruments may seem likely to have first-order effects on particular life-cycle stages (e.g., pre-school nutrition programs on E1, school characteristics on E2, labor market characteristics on E3), the three reduced-form demand relations make clear that the same actual or expected values of the family, community, and stochastic factors can all influence each of the three life-cycle stages. Consequently, our approach is oriented around two-stage least squares rather than an instrumental variable approach in which different instruments are excluded from the various first-stage predictions. By capturing all three life-cycle stages, other pathways through which the instruments might influence cognitive skills are directly included.

  4. Instruments must be uncorrelated with the disturbance term, which includes unobserved endowments: The reduced-form demand relations indicate the potential set of instruments, but not all of the right side variables in those relations are likely to be valid instruments in the sense of being uncorrelated with the disturbance terms in relations (1v) and (1n). For example, previous studies make clear that there can be intergenerational correlations of endowments through genetics and probably other means (Behrman and Rosenzweig 1999, 2002, 2005; Stein et al. 2003). To limit this possible correlation, we do not include variables such as parental school attainment or parental wealth as instruments in our preferred models.
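The implications listed above can be illustrated with a small simulation. The sketch below is not the estimator or the data used in this paper: all data are simulated, the coefficients, the `ability` endowment, and the three instruments in `Z` are hypothetical, and the helper names are invented for illustration. It assumes an unobserved endowment that lowers E1 but raises E2 and E3, and it uses the same instrument set in every first stage.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on X (X already contains a constant)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(Z, E, y):
    """2SLS in which every endogenous column of E is projected on the same instruments Z."""
    const = np.ones((len(y), 1))
    Zc = np.hstack([const, Z])
    E_hat = Zc @ np.linalg.lstsq(Zc, E, rcond=None)[0]  # first stage for E1, E2, E3 at once
    return ols(np.hstack([const, E_hat]), y)            # second stage on fitted values

rng = np.random.default_rng(0)
n = 100_000
ability = rng.normal(size=n)       # unobserved endowment (hypothetical)
Z = rng.normal(size=(n, 3))        # hypothetical exogenous shocks, independent of ability

# Every instrument affects every life-cycle stage (assumed first-stage coefficients).
Pi = np.array([[1.0, 0.3, 0.2],
               [0.3, 1.0, 0.3],
               [0.2, 0.3, 1.0]])
loadings = np.array([-0.5, 0.8, 0.3])  # ability lowers E1, raises E2 and E3 (assumed)
E = Z @ Pi + np.outer(ability, loadings) + rng.normal(size=(n, 3))

beta_true = np.array([0.5, 0.2, 0.4])  # assumed effects of E1, E2, E3 on K
K = E @ beta_true + ability + rng.normal(size=n)

beta_ols = ols(np.hstack([np.ones((n, 1)), E]), K)[1:]
beta_2sls = two_stage_least_squares(Z, E, K)[1:]

print("true:", beta_true)
print("OLS :", beta_ols.round(3))
print("2SLS:", beta_2sls.round(3))
```

Under these assumptions OLS overstates the coefficient on E2 and understates the coefficient on E1, while 2SLS recovers values close to the assumed ones. The direction of the OLS biases depends entirely on the assumed signs in `loadings`.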

3 Data

Data demands are considerable for estimating these adult cognitive skills production functions. We use longitudinal data from Guatemala collected over a 35-year period with anthropometric, socioeconomic, and adult cognitive skills measures, shocks from an experimental intervention and a major earthquake, and other market and policy changes.

3.1 The INCAP experimental nutritional intervention

In 1969, the Institute of Nutrition of Central America and Panama (INCAP), based in Guatemala, began a nutritional supplementation trial (Martorell et al. 1995a). After approximately 300 communities in eastern Guatemala were screened, two pairs of communities (one pair with about 500 residents each and the other with about 900 residents each) were selected purposively for the trial. Two of the communities, one from each pair matched on population size, were assigned randomly to receive a high protein-energy dietary supplement, atole. The other two communities received an alternative no-protein, low-energy supplement, fresco.

Collection of the data used in this paper began with the supplementation trial in these four communities. From February 1969 to February 1977, INCAP implemented the nutritional supplementation program, together with data collection on child growth and development. The survey associated with the intervention focused on all children under 7 years of age, including children born in the study area during those years. Detailed data collection for individual children ceased when they reached 7 years of age or when the study ended, whichever came first. The children included in the 1969–1977 longitudinal survey were thus born between 1962 and 1977, and consequently the length and timing of exposure to the interventions for specific children depended on their individual birth dates. For example, only children born after early 1969 and before February 1974 were exposed to an intervention for all of the time from birth to 36 months of age, posited in the nutritional literature (Sect. 3.2) to be a critical time period for child growth. Atole and fresco were distributed in each community in centrally located feeding centers and were available daily, on a voluntary basis, to all members of the community, regardless of their age or participation in the survey components of the research, during times that were convenient to mothers and children, but that did not interfere with usual meal times.

Because we use children’s differential exposure to the availability of the nutritional supplements as instruments in the first-stage relations (4), (5), and (7), and exclude these instruments from relations (1v) and (1n), it is important to establish that the two interventions resulted in differential consumption of calories, protein, and other nutrients. Approximately 70 % of children aged 0–36 months consumed at least some atole, with no difference between boys and girls. Similar overall participation rates were observed in fresco communities. Averaging over all children in the atole communities (regardless of their levels of voluntary participation), children 6–12 months consumed approximately 70 kcal of atole supplement per day; children 12–24 months, 90 kcal; and children 24–36 months, 120 kcal. Children in the fresco communities, however, consumed only 20 kcal of fresco supplement per day between the ages of 6–24 months, rising to approximately 30 kcal by the age of 36 months (Schroeder et al. 1992; Islam and Hoddinott 2009).

In 2002–2004, a team of investigators, including most of the authors of this paper, undertook a follow-up survey targeted to all subjects in the original survey.Footnote 10 At the time of the final survey, sample members ranged from 25 to 42 years of age. Of the 2,392 individuals in the original 1969–1977 sample, 1,855 (78 %) were alive and known to be living in Guatemala by the time of the 2002–2004 follow-up survey (HCS); 11 % had died (the majority due to infectious diseases in early childhood), 7 % had migrated abroad, and 4 % were not traceable. Of these 1,855, 60 % lived in the original communities, 8 % lived in nearby communities, 23 % lived in or near Guatemala City, and 9 % lived elsewhere in Guatemala. Of the 1,855 traceable sample members living in Guatemala, 1,571 (85 %) completed at least one interview during the 2002–2004 survey (Grajeda et al. 2005). This study includes the 1,448 respondents (46 % male) interviewed in 2002–2004 for whom the two measures of adult cognitive skills we use in the main analyses were available (Sect. 3.2). They comprise 78 % of the 1,855 individuals who were alive and known to be living in Guatemala and 61 % of the original sample. Measured from 1977 to 2002, the latter figure indicates an annual attrition rate of approximately 2 %, low when compared to shorter-term longitudinal surveys in developing countries (Alderman et al. 2001) or to long-term longitudinal surveys in developed countries (Fitzgerald et al. 1998).Footnote 11 Nevertheless, nearly 40 % represents substantial attrition and, therefore, we assess potential attrition biases (Sect. 4.2.4).
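The annual attrition figure can be reproduced from the counts given above. The short sketch below assumes a constant annual rate compounded over the 25 years from 1977 to 2002; that compounding assumption, and the variable names, are illustrative and not taken from the paper.

```python
original_sample = 2392   # individuals in the 1969-1977 survey
analysis_sample = 1448   # respondents with both cognitive skill measures
years = 2002 - 1977      # period over which attrition is measured

retained = analysis_sample / original_sample        # share of the original sample retained
annual_attrition = 1 - retained ** (1 / years)      # constant-rate (compounded) assumption
simple_average = (1 - retained) / years             # alternative: total attrition spread evenly

print(f"retained: {retained:.1%}")
print(f"annual attrition (compounded): {annual_attrition:.2%}")
print(f"annual attrition (simple average): {simple_average:.2%}")
```

Both conventions give a rate that rounds to about 2 % per year.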

3.2 Central variables for the analysis

Table 1 presents summary statistics for the 1,448 individuals used in the main analyses.

Table 1 Summary statistics: mean and standard deviations (N = 1,448)

3.2.1 Dependent variables: adult cognitive skills (K)

  1. Verbal cognitive skills: The vocabulary (approximately fourth-grade equivalent) and reading comprehension (approximately third-grade equivalent) modules of the Inter-American Reading and Comprehension Tests (Manuel 1967) were administered to the 1,197 individuals (83 % of the sample of 1,448) who reported completing six or more grades of school or who passed a pre-literacy screen.Footnote 12 The vocabulary portion had 45 questions and reading comprehension had 40 questions, yielding a maximum possible score of 85 points. The distribution of test scores (for those who took the test) appears to be symmetric and approximately normal. The 17 % of the sample who did not pass the pre-literacy screen were assigned a value of zero for the reading comprehension tests. Including those we scored at zero, the mean score is 36.1 with a standard deviation (SD) of 22.3, indicating substantial sample variation (Table 1). Women (mean 34.4, SD 21.9) score significantly lower than men (mean 37.9, SD 22.7). These tests had adequate test–retest reliability (correlation coefficients of 0.87 and 0.85 for vocabulary and reading) in previous studies in this population when the same subjects were adolescents and young adults (Pollitt et al. 1993). We use the combined vocabulary and reading comprehension score (hereafter, RCS) as our measure of verbal cognitive skills. Because of the nature of the distribution of the test scores, in the empirical analysis, we explore the robustness of our results to changes in the specification of these test scores, including directly incorporating the pre-literacy test score (Sect. 4.2.3).

  2. Nonverbal cognitive skills: All respondents were administered Raven’s Progressive Matrices (Raven et al. 1984), a widely used nonverbal measure of interpretative cognitive skills,Footnote 13 which consists of a set of shapes and patterns, with the respondent asked to supply the “missing piece” from a set of six choices. The Raven thus measures different aspects of cognition than the verbal skills described above, although the two measures are highly correlated (0.57) and may have some components of general cognitive skills in common. We used the first three of the five scales (with 12 questions each, for a maximum possible score of 36) because pilot data suggested and subsequent survey data confirmed that few respondents were able to progress beyond the third scale. As with the test of reading and vocabulary, there is considerable sample variation; similar to the distribution of verbal skills, the distribution of these test scores also appears to be symmetric and approximately normal, with a mean of 17.7 points (SD 6.1). Women (mean 16.3, SD 5.4) score significantly lower than men (mean 19.4, SD 6.5). The Raven’s test also exhibited adequate test–retest reliability (correlation of 0.87) in this population in the past (Pollitt et al. 1993). We use the Raven’s test scores as our measure of nonverbal cognitive skills.

3.2.2 Life-cycle stage experiences (E1, E2, E3)

We assume relation (1v, 1n) is linear and include one commonly used indicator each for the first two life-cycle stages.Footnote 14 For the third stage, we consider two different indicators of potentially relevant post-school experiences. Technically, then, we assume the indicators we use are sufficient statistics for each stage, though practically they are probably better interpreted as being among the most important experiences related to adult cognitive skill development.

Pre-school experience. There is substantial emphasis in the literature on the importance of nutrition—reflecting nutrient intakes and exposure to infections—as measured by height in early childhood development. We use pre-school child height-for-age Z scores (HAZ) at age six,Footnote 15 a widely accepted indicator of childhood long-run nutritional status, to calculate our indicator for pre-school experience. There is no difference between average female and male HAZ.

School experience. We use completed school attainment (highest grade completed) as our indicator of experiences related to cognitive skill formation during this life-cycle stage. The mean is 4.7 grades (SD 3.5). Women (mean 4.3, SD 3.3) average significantly fewer completed grades than do men (mean 5.2, SD 3.6). The distribution has its mode at six grades (29 % of the individuals), which corresponds to completion of primary school. There are also secondary modes at zero grades (14 %) and three grades (11 %). School attainment is significantly correlated with the indicators for the pre-school experience (correlation of 0.16) and post-school experiences described below (skilled job tenure, 0.13; whether living in Guatemala City, 0.28), suggesting that if relations (1v) and (1n) are the true relations but only school attainment is included in the estimation, the estimate of the effect of school attainment is likely to be biased.

Post-school experience. We consider two measures for this third life-cycle stage.

  • Tenure in skilled jobs: We use the duration in years of continuous work experience prior to the 2002–2004 survey in skilled jobsFootnote 16 as our first indicator of post-school experience that may contribute to adult cognitive skills. The survey does not permit the calculation of total experience in all jobs since individuals left school, but does capture tenure in all jobs held in 1998 and in 2002–2004.Footnote 17 Such experience is likely to strengthen cognitive skills via (1) learning-by-doing through problem solving; (2) furthering skills learned in school by using them in real world applications; and (3) exposing individuals to a wider environment, through interactions with coworkers and customers. The mean number of years of tenure in a skilled job is only 2.8, but with substantial variation (SD 4.8). Two-thirds of the individuals were not in skilled jobs in 1998 or 2002–2004 and, therefore, have reported tenure of zero years. Among the 530 sample members who had at least 1 year of tenure in a skilled job, the mean is 7.8 years (SD 5.0). There are substantial and significant differences for tenure in skilled jobs by gender, with 58 % of men having such experience (mean 8.1 years, SD 5.0), but only 18 % of women (mean 6.9 years, SD 5.0).

  • Migration to Guatemala City: Living in Guatemala City, rather than in one of the original sample communities or elsewhere in Guatemala, might alter experiences related to cognitive skill formation through work, but also via shopping, entertainment, transportation, or many other aspects of life in a major city. Seventeen percent of the sample lived in Guatemala City at the time of the 2002–2004 survey, 18 % of women and 15 % of men. The mean of the estimated number of years that these individuals had been away from their origin community is 3.9 years (SD 6.6), with no significant difference by gender, even though men have been in the city for, on average, one additional year (i.e., on average, women had spent an additional year in their original community).

The two indicators have low correlation (0.04), underscoring the possibility that they might capture different dimensions of post-school experiences.
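The skilled-job tenure figures reported above can be cross-checked against one another. The sketch below uses only numbers stated in the text; the variable names are illustrative, and small gaps are expected because the inputs are rounded and because tenure of less than 1 year is not counted in the group of 530.

```python
n_total = 1448               # analysis sample
n_skilled = 530              # at least 1 year of tenure in a skilled job
mean_tenure_skilled = 7.8    # years, among those 530
share_male = 0.46

# Overall mean implied by the subgroup mean, with zero tenure for everyone else.
implied_overall_mean = n_skilled * mean_tenure_skilled / n_total

# Share with no skilled-job tenure ("two-thirds" in the text).
share_zero = 1 - n_skilled / n_total

# Share with skilled experience implied by the gender-specific rates (58 % and 18 %).
implied_share_skilled = share_male * 0.58 + (1 - share_male) * 0.18

print(f"implied overall mean tenure: {implied_overall_mean:.2f} years (reported: 2.8)")
print(f"share with zero tenure: {share_zero:.1%}")
print(f"implied share with skilled experience: {implied_share_skilled:.1%} "
      f"(observed: {n_skilled / n_total:.1%})")
```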

3.2.3 Initial conditions ($\mathbf{E}_0$, $\mathbf{F}_0$)

Genetic and other endowments and other individual characteristics ($\mathbf{E}_0$). We do not have direct observations on genetic endowments beyond the gender of the individual, which we control for in our first-stage estimates. Other individual characteristics that we observe include age at the time of interview (which we include in both the first and second stages) and whether the individual was a twin, which may have longer-run implications through the human capital investments across the life-cycle (Behrman and Rosenzweig 2004). If the instruments that are only in the first stage are uncorrelated with individual-level endowments, however, failure to include direct measures of the endowments will not lead to biased estimation of relations (1v) and (1n), since the predicted life-cycle stage experience measures will be orthogonal to endowments in the error term.

Parental wealth and school attainment (F 0 ). In some specifications, although not our preferred ones because of their likely association with unmeasured endowments through intergenerational transmission, we also consider the role of parental characteristics in the production of cognitive skills.

3.2.4 Observed events and shocks (C)

Experimental nutritional shocks.

One observed shock is the nutritional intervention underlying the original study. The atole intervention has been shown to have improved child growth (Martorell et al. 1995b). We construct two intent-to-treat measures, based on the community and date of birth of each individual and the dates of operation of the interventions. For each individual, we calculate whether they were exposed to either intervention for the entire period from 0–36 months of age. A potential exposure to the atole intervention (relative to fresco) is then calculated by multiplying this cohort measure by a dummy indicator of whether or not the child lived in one of the two atole communities.
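The construction of the intent-to-treat measure described above can be sketched as follows. This is a minimal illustration, not the authors' code: the operating dates, community labels, and function names are hypothetical, and the month arithmetic clamps the day of the month to 28 as a simplification.

```python
from datetime import date

def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole months (day clamped to 28 to stay valid)."""
    total = d.month - 1 + months
    return date(d.year + total // 12, total % 12 + 1, min(d.day, 28))

def fully_exposed(birth: date, start: date, end: date, window_months: int = 36) -> bool:
    """True if the whole 0-36 month window falls inside the operating period."""
    return start <= birth and add_months(birth, window_months) <= end

def atole_itt(birth: date, community: str, atole_communities: set,
              start: date, end: date) -> int:
    """Cohort exposure indicator multiplied by the atole-community dummy."""
    cohort = int(fully_exposed(birth, start, end))
    in_atole = int(community in atole_communities)
    return cohort * in_atole

# Hypothetical operating dates and community labels, for illustration only.
START, END = date(1969, 1, 1), date(1977, 2, 28)
ATOLE = {"A1", "A2"}

print(atole_itt(date(1970, 5, 10), "A1", ATOLE, START, END))  # atole village, full window
print(atole_itt(date(1970, 5, 10), "F1", ATOLE, START, END))  # fresco village, full window
print(atole_itt(date(1975, 6, 1), "A1", ATOLE, START, END))   # atole village, partial window
```

Because the measure depends only on birth date and community, it reflects potential exposure rather than actual supplement intake, which is what makes it an intent-to-treat variable.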

Natural, market, or policy events.

Guatemala suffered a major earthquake in 1976 and we incorporate a dummy variable for whether an individual was born in the 2 years prior to it because the economic shock due to the earthquake may have been particularly important for infants and very young children. We also include community-level variables that relate as closely as possible to the timing of key schooling- and labor market-related decisions in each individual’s development. Using information reported in earlier work about infrastructure, markets, and services in the communities, complemented with a retrospective study in 2002, we construct the student–teacher ratio and an indicator of whether a lower secondary school (grades 7–9) was available, both measured in the community in which the individual lived when the individual was 7 years old. To capture changes in local market conditions, we construct a variable indicating the logarithm of the (national) salary in manufacturing when the individual was 18 years old and likely to be in their early working years. Thus, while reflecting community-level characteristics (except for the manufacturing salaries), these variables vary by single-year age cohorts within each community, as well as across communities. This is preferable to the more common approach of using measures of such factors at a given time for a population with different ages at that point in time, since our indicators more closely relate to periods in individuals’ lives when critical decisions (e.g., starting school) were being made. We also include an additional individual-specific shock: whether the individual’s father or mother had died (prematurely and possibly accidentally) by the time the individual was 18, which relates to a critical juncture for entering the labor force.Footnote 18
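The age-specific timing of the community variables can be illustrated with a small lookup. The sketch below assumes a hypothetical community history keyed by community and calendar year; the values and names are invented for illustration and are not taken from the study data.

```python
def value_at_age(birth_year: int, community: str, age: int, history: dict):
    """Look up a community characteristic in the year the person reached `age`."""
    return history.get((community, birth_year + age))

# Hypothetical student-teacher ratios keyed by (community, calendar year).
student_teacher_ratio = {
    ("A1", 1975): 42.0,
    ("A1", 1976): 38.0,
    ("F1", 1975): 50.0,
}

# Two people from the same community, born one year apart, receive different
# values because each is matched to the year in which they turned seven.
print(value_at_age(1968, "A1", 7, student_teacher_ratio))
print(value_at_age(1969, "A1", 7, student_teacher_ratio))
```

This is what allows the variables to vary by single-year birth cohort within a community as well as across communities.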

4 Results

In Sect. 4.1, we summarize our main estimates, using indicators of experiences during each of the three life-cycle stages and treating them as endogenous using a two-step instrumental variable feasible generalized method of moments (IV-GMM) estimator (Baum et al. 2007). In Sect. 4.2, we consider how robust the estimates are to gender differences, variations in the instrumental variables used, alternative representations of verbal cognitive skills, and further controls for attrition. We allow for clustering at the birth-year-community cohort level in the calculation of the standard errors to control for potential serial correlation among children born in the same community in the same year; this yields 64 clusters.Footnote 19

4.1 Basic specification of the adult cognitive skills production functions

Our identification strategy has three components. First, using the framework developed in Sect. 2, we select plausibly exogenous characteristics from the individuals’ backgrounds and communities to predict the three E i life-cycle stage outcomes. These include community-level market and policy shock variables derived from the detailed community histories and the intent-to-treat exposure variables derived from the original nutritional supplementation intervention, as well as an indicator for having been born just prior to the 1976 earthquake. Second, we include all three of these life-cycle stages in our main specification, which means that the possibility that the instruments have direct effects that do not operate through the included measured experiences in the second stage of the model is substantially mitigated, especially in comparison to a model which excluded one or more of those experiences. For example, in our main specifications, the presence of a lower secondary school at age 7 years would be an invalid instrument only if it had a direct effect above and beyond its (linear) effect operating through each of the three E i , pre-school nutrition (through expectations), school attainment, and post-school skilled job tenure. Third, we carry out a range of diagnostic tests to assess the strength and validity of the instruments, as well as consider alternative instrument sets and estimation procedures.
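The logic of instrumenting an endogenous experience can be illustrated with a deliberately simplified simulation: one endogenous regressor, one instrument, and the simple ratio-of-covariances IV estimator rather than the two-step IV-GMM estimator used in the paper. All variable names and parameter values below are assumptions chosen for illustration.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def simulate(n=5000, beta=0.1, seed=0):
    """Outcome y depends on x; an unobserved endowment u drives both."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)              # instrument, unrelated to u
        ui = rng.gauss(0, 1)              # unobserved endowment
        xi = zi + ui + rng.gauss(0, 1)    # endogenous experience
        yi = beta * xi + 2.0 * ui + rng.gauss(0, 1)
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y

def ols_slope(x, y):
    """OLS slope: picks up the endowment as well as the true effect."""
    return cov(x, y) / cov(x, x)

def iv_slope(z, x, y):
    """IV slope: uses only the variation in x induced by the instrument."""
    return cov(z, y) / cov(z, x)

TRUE_BETA = 0.1
z, x, y = simulate(beta=TRUE_BETA)
print("OLS:", round(ols_slope(x, y), 3), "IV:", round(iv_slope(z, x, y), 3))
```

In this setup the endowment raises both the experience and the outcome, so OLS overstates the effect, which is the direction of bias reported for school attainment. A negative correlation between the endowment and the experience would instead bias OLS downward, as the paper reports for HAZ and skilled job tenure.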

4.1.1 First-stage estimates

The first-stage estimates have a number of individually significant point estimates consistent with our hypotheses about how the instruments affect each of the life-cycle stages (Table 2). Being a twin or being born within the 2 years prior to the 1976 earthquake reduces the HAZ at age six, whereas exposure to the atole intervention in the first 3 years (relative to fresco) increases it. Being male, living in a community with a lower secondary school at age seven, and being exposed to higher salaries in manufacturing at age 18 all lead to higher school attainment. Notable periods of rising wages occurred in the late 1970s, with reconstruction after the 1976 earthquake, and in the late 1980s and early 1990s, associated with a building boom in Guatemala City; higher school attainment was one of the criteria used in hiring in manufacturing and other industries. For example, completed third grade was a requirement for even the lowest level jobs available in a local cement factory that opened in the 1980s. The loss of a parent before age 18 years and higher student–teacher ratios, on the other hand, are associated with lower school attainment. Men have significantly more tenure in skilled jobs. Exposure to atole when 0–36 months, lower secondary school available at age seven, and higher manufacturing wages at age 18 are linked to lower tenure in skilled jobs, consistent with individuals being more likely to go into low-skill manufacturing. The probability of migration to Guatemala City increases with lower secondary school being available at age 7 years and with higher manufacturing wages at age 18 years.

Table 2 First-stage estimates of life-cycle stage experiences (N = 1,448)

Table 3 presents four sets of estimates related to the linear version of the adult cognitive skills production function in relation (1v, 1n) for verbal cognitive skills in panel A and for nonverbal skills in panel B. Both are represented as Z scores (i.e., internally standardized in terms of their sample means and SDs). Three of the sets of estimates sequentially add experience in the chronological order of the three life-cycle stage experiences as follows: only pre-school experience (HAZ at age six) (Set 1A); pre-school experience and school-years experience (school attainment) (Set 2); and all three life-cycle stage experiences: HAZ at age six, school attainment, and skilled job tenure (Set 3). These alternative specifications show how the coefficient estimates of the earlier life-cycle stage experiences change as later experiences are incorporated. To permit comparisons with previous literature that typically does not include pre- or post-school experiences, we also present estimates that include only school attainment (Set 1B). For each set, we present both OLS and IV-GMM estimates (Baum et al. 2007; Hayashi 2000), using the instruments indicated in Table 2. For the final specification with all three life-cycle experiences, we also present IV limited information maximum likelihood (LIML) results using the same set of excluded instruments.
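Internal standardization of the test scores can be sketched as below. The raw scores are invented, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not say which variant was used.

```python
from statistics import mean, stdev

def z_scores(raw):
    """Standardize scores using the sample's own mean and SD."""
    m, s = mean(raw), stdev(raw)
    return [(v - m) / s for v in raw]

# Hypothetical raw test scores, for illustration only.
scores = [10, 20, 30]
print(z_scores(scores))
```

With outcomes on this scale, each coefficient reads as the change in skills, in sample SDs, per unit change in the experience measure.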

Table 3 Estimated impacts of pre-school, school, and post-school experiences on cognitive skills (N = 1,448)

4.1.2 IV diagnostics

For the full specifications with all three life-cycle experiences (Set 3), the first- and second-stage diagnostics are satisfactory for both types of skills; the instruments we utilize have reasonable power and we fail to reject the overidentification tests. The F tests on excluded instruments range from 7.3 to 26.3 (Table 2) (Bound et al. 1995) and the Kleibergen–Paap (KP) Wald F statistics for weak instruments are significant at a 5 % significance level (Table 3, bottom panel).Footnote 20 The Hansen J (HJ) statistics for overidentification indicate that the first-stage instruments are not correlated with the second-stage disturbance term.Footnote 21 Finally, the Hausman tests indicate that the IV-GMM estimates differ significantly from OLS for both verbal and nonverbal skills (Hayashi 2000; StataCorp. 2011). By contrast, in four of the six specifications in which one or two of the life-cycle stage experiences are excluded from the specification of relation (1v, 1n), the HJ tests indicate that there is a specification problem, consistent with concerns that the excluded instruments are correlated with the omitted life-cycle experience which belongs in the model.

4.1.3 Pre-school experience (E1)—HAZ at age six

The OLS associations between HAZ at age 6 years and both cognitive skills measures, conditional on a quadratic in age, are positive, significant, and fairly substantial, 0.20 SD for verbal skills and 0.19 SD for nonverbal skills (Table 3, first column in top two panels). If school attainment and skilled job tenure are included and, along with HAZ at age 6 years, are treated as behaviorally determined (IV-GMM estimates in the second to last column), the estimated effect of HAZ at age 6 years is similar, though now only marginally significant (p = 0.08) for verbal skills, but much larger and significant for nonverbal skills (0.33 SD). Therefore, our preferred estimates with all three life-cycle stages treated as endogenous indicate a substantial gain from higher HAZ at age 6 years in terms of both verbal and nonverbal skills more than 20 years later, controlling for both school- and post-school experiences and for the endogenous determination of all three of these life-cycle experiences. The alternative specifications for nonverbal skills indicate that the coefficient estimates for HAZ at age 6 years and for skilled job tenure increase markedly when moving from OLS to IV-GMM estimates in the full specification. This pattern is consistent with a negative correlation between the unobserved endowment components positively related to skill development and school attainment and the endowment components positively related to biological development (Behrman and Rosenzweig 2004) and skilled job tenure. Without controls for the endogenous determination of the life-cycle experiences, the effects of HAZ at age 6 years and of skilled job tenure on adult nonverbal skills appear to be substantially underestimated.Footnote 22

4.1.4 School-age experience (E2)—school attainment

If only school attainment is included using OLS, the estimated associations indicate increases of 0.22 SD of verbal skills and 0.15 SD of nonverbal skills for each additional grade attained. These are fairly substantial effects, implying, for example, increases of approximately 0.75 SD in verbal skills and of 0.50 SD in nonverbal skills for a 1 SD increase in school attainment. For verbal skills, however, this estimated effect is halved if IV estimates are used for school attainment, whether or not the other life-cycle experiences are included. Our preferred estimate for the school attainment coefficient from the IV-GMM estimates, with all three life-cycle experiences included, is 0.09 SD in verbal skills for every additional grade, or approximately 0.30 SD in verbal skills for a 1 SD increase in school attainment. While still an important effect, it is less than half of the estimated effect using OLS. For nonverbal skills, the combination of including post-school experience and treating all three life-cycle experiences as behaviorally determined results in a decline in the estimated coefficient on school attainment to 0.04 SD for every additional grade, and it is no longer statistically significant. This is consistent with the claim that the Raven’s tests are not affected by schooling per se, but instead may reflect attributes that themselves affect schooling (Schweizer et al. 2007).

4.1.5 Post-school-age experience (E3)—skilled job tenure

For verbal skills, the IV-GMM estimates show a small but significant impact of skilled job tenure of 0.03 SD for every year of such tenure. For nonverbal skills, we find a significant impact of skilled job tenure that implies an increase of 0.14 SD for every additional year of skilled job tenure. This is a fairly substantial effect, implying an increase of approximately 0.67 SD in nonverbal skills for a 1 SD increase in skilled job tenure.Footnote 23 In Sect. 3.2, we discussed the possibility that migration to Guatemala City, the capital, also might provide experiences that could improve cognitive skills. We examined this possibility by adding an indicator for migration to the capital, alongside the other three experience measures, to the specifications shown in the final columns of Table 3, treating migration to the city as endogenous. For both verbal and nonverbal skills, the results for the other life-cycle stages are unchanged, and the coefficient on migration, while large, is insignificant (results not shown). While the F statistic for the excluded instruments on the migration indicator was 7.8 (Table 2), the KP statistic (3.4) indicates that the instruments as a set are weak when we endogenize all four of these measures.
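The per-SD figure for skilled job tenure follows from multiplying the per-year coefficient by the SD of tenure reported in Sect. 3.2 (4.8 years). The short check below reproduces that arithmetic. It also backs out the SD of school attainment implied by the OLS figures in Sect. 4.1.4; that implied value is a derived quantity, not one stated in this section.

```python
def per_sd_effect(coef_per_unit: float, sd_of_input: float) -> float:
    """Convert a per-unit coefficient (in outcome SDs) to a per-SD-of-input effect."""
    return coef_per_unit * sd_of_input

# Nonverbal skills: 0.14 SD per year of skilled job tenure, tenure SD of 4.8 years.
tenure_effect = per_sd_effect(0.14, 4.8)

# Implied SD of school attainment, backed out from 0.22 SD per grade and
# roughly 0.75 SD per 1 SD of attainment (derived here, not reported).
implied_school_sd = 0.75 / 0.22

print(round(tenure_effect, 2), round(implied_school_sd, 1))
```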

The final column in Table 3 presents the model estimated by LIML, using the same instrumental variables as in the IV-GMM specifications. Results are similar, although the estimates of the effect of HAZ at age six are less precise and no longer marginally significant for verbal skills but still marginally significant (p = 0.08) for nonverbal skills.Footnote 24 Conditional on the other variables, both types of skills decline slightly with age.

4.2 Robustness considerations

4.2.1 Gender differences

Adult men in the sample score significantly higher on the tests of cognitive skills than women. This raises the question to what extent the patterns above reflect differences in the cognitive skills production functions or life-cycle stage experiences between men and women. One simple alternative is to incorporate gender directly into the production function, as we do with age.Footnote 25 Specifically, we include a male dummy variable in the second-stage estimated relations.Footnote 26 When we do so, the results with respect to the influence of the three life-cycle stages on verbal and nonverbal skills are similar to those in Table 3. The first-stage diagnostics, however, indicate that the instruments are weak, with a KP Wald F statistic of 3.2, so we do not report them. Consequently, while there is no evidence of substantial differences in the relationships for men versus women, such differences are hard to identify well in these data.

4.2.2 Alternative instrumental variables

A second concern is that the HJ statistic may have low power and that, despite the multiple life-cycle stages in the model, some of the instruments may be invalid, i.e., correlated with the error term. For either outcome, however, when we include parental characteristicsFootnote 27 in the set of excluded instruments, the HJ statistic rejects (p < 0.003) the null hypothesis that the overidentifying restrictions are valid. This suggests that the HJ test does have sufficient power to detect invalid instruments, giving us greater confidence in the main results that exclude these parental characteristics from the first-stage instruments.

4.2.3 Alternative formulations of verbal skills

Due to the literacy screen (Sect. 3), the vocabulary and reading comprehension scores (RCS) on which verbal skills were based had a mass point at zero. In Table 4, we present alternative estimates (using the same instrument set as our preferred estimates in Table 3) that address this concentration of scores: (1) instrumental variables Tobit estimates using the verbal scores that account for the lower bound and mass point at zero; (2) IV-GMM estimates using RCS + pre-literacy test scores; and (3) IV-GMM estimates of the quartile of the RCS score.Footnote 28 The significance of HAZ is not robust to all three alternative specifications, but the findings regarding school attainment and skilled job tenure hold, with the latter significant at the 5 % level in the quartile specification but only at the 10 % level in the other two specifications. We conclude that these alternative representations do not alter substantially our qualitative findings or change our interpretations in Sect. 4.1.

Table 4 Estimated impacts of pre-school, school, and post-school experiences on verbal cognitive skills: alternative variable specifications (N = 1,448)

4.2.4 Attrition

Despite the considerable effort and success in tracing and reinterviewing participants from the original sample, attrition in our sample is substantial and associated with a number of initial conditions (Grajeda et al. 2005). What is of ultimate concern in this analysis, however, is not the level of attrition, but whether the attrition invalidates the inferences we make using these data. For example, does excluding migrants who were not located and who may have different characteristics lead to systematic bias of the estimates presented here?

To explore these concerns, we implement the correction procedure for selective attrition on observed characteristics outlined in Fitzgerald et al. (1998). We first estimate an attrition probit conditioning on all the right side variables (including instruments) considered in the main models, as well as an additional set of (endogenous) variables potentially associated with attrition, for all original sample members (N = 2,392). The latter variables include factors that reflect family structure in previous years, as these are likely to be associated with migration status: whether an individual lived with both their parents in 1975 and in 1987. During the fieldwork, locating sample members was facilitated by having access to other family members from whom the field team could gather information. Therefore, we also include variables that capture this feature of the success of tracking migrants: whether the parents were alive in 2002, whether they lived in the original community, whether a sibling of the sample member had been interviewed in the 2002–2004 follow-up survey, and the number of siblings in the family in the original sample. While we do not have adjustments to correct for selection on unobservable characteristics, by including a number of endogenous observables indicated above, which are likely to be correlated with unobservables, we expect that we are reducing the scope for attrition bias due to components of unobservables that are correlated with the included observed variables, as well.

Nearly all of the factors described above are significantly and strongly associated with attrition (Appendix Table 6). Following Fitzgerald et al. (1998), we construct weights that give greater weight to observations in the sample re-interviewed in 2002–2004 that had lower predicted probabilities of having been re-interviewed. Table 5 shows that application of these weights affects the results only slightly and that the central patterns of the coefficient estimates remain similar to those in our main estimates in the final column of Table 3, with the exception that the previously marginally significant HAZ score is now insignificant (p = 0.16). As found in other contexts with high attrition, we do not find evidence that the results are biased by attrition based on tests that use characteristics measured at baseline and during final round field work (Alderman et al. 2001; Fitzgerald et al. 1998).
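The idea behind the attrition weights can be sketched with a simplified inverse-probability form. This is an assumption made for illustration: the exact weights in Fitzgerald et al. (1998) may be constructed differently, and the predicted probabilities below are invented rather than taken from the attrition probit.

```python
def attrition_weights(p_reinterview):
    """Inverse-probability weights, rescaled to average one across the sample."""
    raw = [1.0 / p for p in p_reinterview]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]

def weighted_mean(values, weights):
    """Weighted average of an outcome among re-interviewed respondents."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# Hypothetical predicted re-interview probabilities for three respondents.
p_hat = [0.9, 0.6, 0.3]
weights = attrition_weights(p_hat)
print([round(w, 2) for w in weights])
print(round(weighted_mean([1.0, 0.0, -1.0], weights), 3))
```

Respondents who resemble those lost to follow-up, and so have low predicted probabilities, count for more, which pulls the re-interviewed sample back toward the composition of the original one.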

Table 5 Estimated impacts of pre-school, school, and post-school experiences on cognitive skills: weighting for attrition (N = 1,448)

5 Conclusions

Most empirical investigations of the effects of cognitive skills assume that they are produced by schooling, and that schooling is exogenous. We argue that such approaches are likely to lead to incorrect inferences not only about the effect of school attainment, but also about the potential importance of pre-school and post-school experiences for adult cognitive skills. To explore this, we draw on a longitudinal data set to estimate production functions for adult verbal and nonverbal cognitive skills as dependent on behaviorally determined pre-school, school, and post-school experiences. We present a basic specification as well as a range of tests of how robust the basic specification is. While some of the robustness tests suggest qualifications, overall they support the following results:

  1. School attainment has a significant and substantial effect on adult verbal skills, but not on adult nonverbal cognitive skills.

  2. Pre-school and post-school experiences have substantial positive significant effects on adult cognitive skills.Footnote 29 Pre-school experiences, as captured by HAZ at age 6 years, substantially and significantly increase verbal and nonverbal cognitive skills, even after controlling for school attainment and post-school skilled job tenure. The effect of HAZ on verbal skills, however, is less robust to changes in the methodology. Post-school tenure in skilled jobs has a significant positive effect on both verbal and nonverbal cognitive skills.

  3. Estimates that do not account for the endogenous determination of these three life-cycle experiences are misleading. Estimates of the effect of schooling on adult cognitive skills that do not account for school attainment being behaviorally determined are biased upward substantially for adult verbal cognitive skills and make the impact on adult nonverbal cognitive skills appear highly positively significant rather than insignificant. Treating pre- and post-school experiences as statistically predetermined substantially underestimates their impacts. This contrasts with the upward bias for schooling, suggesting that the underlying physical and job-related components of genetic endowments are negatively correlated with those for cognitive skills.

While these results are of considerable interest in their own right, they also relate to four broader literatures.

First, in both developed and developing countries, there is growing interest in investing in disadvantaged children at an early stage in life. Drawing on a wide body of evidence from economics, psychology, and neuroscience, for example, Heckman (2006a, b) argues that returns to such investments are much higher than those made later in life. However, the empirical base for these arguments is not as deep as would be desirable. For example, there are few studies that follow disadvantaged individuals over long periods of time. Our study adds to this literature by demonstrating that having relatively good nutritional status as pre-schoolers results in greater cognitive skills decades later as adults.

Second, a growing body of evidence suggests that, across a wide range of countries, scores on certain measures of cognitive skills or ability, including the Raven’s Progressive Matrices used in our study, are increasing over time. This phenomenon is referred to as the Flynn effect (Flynn 1987). Dickens and Flynn (2001) posit several pathways by which changes in environmental or behavioral factors, rather than in genetic factors, could cause scores on cognitive tests to increase over time. One of these is improved childhood nutrition. A second is increased cognitive complexity in the workplace. Our results for the impact of early childhood nutrition and for years of experience in skilled jobs, both treated as endogenous, provide direct evidence of the importance of these factors in shaping dimensions of cognitive skills that are consistent with the environmental and behavioral factors hypothesized to underlie the Flynn effect.

Third, a relatively small literature attempts to use what are interpreted to be direct measures of innate ability to examine whether human capital is associated with greater productivity as opposed to being mainly a signaling device (Boissiere et al. 1985; Alderman et al. 1996). Implicit in such approaches is the assumption that causality runs from cognitive abilities to productivities. But if more productive, higher remunerated work is also more complex, and if undertaking complex work improves cognitive skills, causality (also) runs the other way. We find evidence of this latter relationship. Therefore, our findings imply that studies that regress productivity on contemporaneous measures of cognitive abilities are flawed if they fail to take the endogeneity of cognitive skills into account.

And, fourth, there is considerable debate over the impact of human capital on income levels and growth. In this literature, aggregate country level estimates are presented in which human capital is typically represented by converting schooling enrollment rates into estimates of the stock of schooling (Nehru et al. 1995) or by school attainment for those older than 25 (Barro and Lee 1993). In doing so, these approaches assume that individuals do not accumulate additional human capital after completing schooling or after a certain age, nor does their human capital depreciate. Our results are at odds with this common assumption—we find that adult cognitive skills increase with experience in higher-skilled jobs (treated endogenously) but decline with age for our sample of 25–42 year olds. At the cross-country level, this implies that widely used representations of knowledge are problematic—they likely overstate human capital in slow-growing or traditional/subsistence economies and understate it in faster growing, modernizing economies. These biases are likely to result in underestimates of the importance of human capital in economic growth processes.


  1. For example, there are hundreds of empirical studies that are interpreted as showing the effect of cognitive and other skills obtained through education on wages or incomes and the vast majority of them use school attainment to represent these skills (Psacharopoulos and Patrinos 2004). A smaller number of studies use direct measures of adult cognitive skills (including Boissiere et al. 1985; Murnane et al. 1995; Alderman et al. 1996; Glewwe 1996; LaFave and Thomas 2013). The many empirical studies of the effects of cognitive and other skills on outcomes such as health, nutrition, and fertility nearly all use school attainment to represent these skills (Strauss and Thomas 1998).

  2. See references in the surveys by Strauss and Thomas (1995) and Behrman (2009).

  3. See, for example Glewwe and Jacoby (1995), Engle et al. (2007), Heckman (2006a), Grantham-McGregor et al. (2007), and Victora et al. (2008).

  4. There is considerable emphasis on post-school learning both “on-the-job” and through formal training programs. Standard earnings functions, whether motivated by human capital investment models (e.g., Mincer 1974) or as hedonic price indices (e.g., Rosen 1974), generally include some measure of post-school work experience.

  5. We are unaware of studies that estimate adult cognitive skills production functions, though there is work in which what might be called cognitive skills production functions for students are estimated as functions of inputs such as teacher training, student–teacher ratios, availability of books and other attributes of schools (see Hanushek 1996, as well as more recent work such as Todd and Wolpin 2003, 2007 and Cunha and Heckman 2008, Cunha et al. 2010). These studies do not consider prime-age adults, the endogeneity of inputs, or the impacts of pre-school or post-school experiences.

  6. The vast majority of individuals in the sample started school by age seven (80 %) and completed formal schooling by age 15 (82 %).

  7. An alternative production function specification is a value-added form in which the change in cognitive skills across stages (periods) is posited to depend on the level of cognitive skills at the end of the previous stage (Todd and Wolpin 2003, 2007). We do not have sufficient data to explore this specification.

  8. That the pre-school experience encompasses all the effects of exogenous pre-school variables is a strong assumption that we relax in the empirical specifications. We make it here and in relation (8) to highlight the possible dependence of one life-cycle stage experience on a previous one.

  9. If the correctly specified regression model is $K_{3v} = E\beta_1 + E_{0v}\beta_2 + U_{3v}$, where $E$ is a vector with the three life-cycle stage experiences and $E_{0v}$ is a vector with unobserved endowments, then the standard omitted variable result is that $\mathrm{E}[b_1] = \beta_1 + P_{12}\beta_2$, where $P_{12}$ is the variance–covariance matrix between $E$ and $E_{0v}$. With certain structures of $P_{12}$, this can lead to the first component of $b_1$ (for E1) being negatively biased and the second component of $b_1$ (for E2) being positively biased in OLS estimates.

  10. This population also has been studied in the intervening years since the original survey, with particular emphasis on the impact of the nutritional intervention (Martorell et al. (2005) provide references to many of these other studies).

  11. Most measures of attrition refer to households or individuals who were past infancy and early childhood when the sample was taken, so they do not include the effects of infant and early childhood mortality that account for over a quarter of the attrition in the data used for this study.

  12. Respondents who reported having completed fewer than three grades of schooling, and those who reported three to five grades of schooling but could not read correctly the headline of a local newspaper article, were given a pre-literacy test that began with reading aloud single letters. They were considered literate if they passed the test with fewer than five errors out of 35 questions, the most difficult of which was reading aloud a five-word sentence.

  13. The Raven’s scores that we use often are interpreted as if they represent innate abilities, perhaps genetically determined, and thus often are referred to as measures of “ability.” The term “skills,” in contrast, tends to be used to refer to capabilities that can be, or have been, affected by various experiences, such as education. In this paper, we directly explore factors that may influence the Raven’s scores; consequently, we refer to them as nonverbal skills to reflect their possible malleability as opposed to their measuring innate abilities.

  14. If we were to approximate the function in relation (1v, 1n) with one indicator each for the three life-cycle stages in a second-order Taylor series expansion to allow diminishing marginal returns and interactions, we would need to estimate 9 parameters on endogenous variables. This more flexible specification exceeds the limits of what we are able to estimate with any precision.

  15. Appendix A provides details of the construction of this variable.

  16. We define skilled jobs to include white collar and administrative jobs, those with specialized skills (e.g., carpenters and mechanics), social service occupations (e.g., teachers, nurses) and own farm/own enterprise work that yields income in the top quintile for such activities in 2002–2004, under the assumption that such relatively large-scale enterprises required greater managerial and other skills to operate. Varying these definitions does not substantively change our results. Results are similar if we (1) treat as skilled labor only those with skilled wage employment; (2) use (1) along with a redefinition of skilled labor for agricultural work (based on planting a cash crop) and own business (based on the value of assets in the business); or (3) use the skilled years measure described in the text, but truncate it to 10 years to avoid the potential influence of outliers.

  17. While we are unable to test whether total experience has a greater effect than recent experience, if unused knowledge depreciates, recent experience is likely to be more important than earlier experience.

  18. Some of these shocks relate to the individuals’ parental families, particularly whether individuals’ parents were alive when the individuals were 18 years old. However, in contrast to the other parental characteristics that we include later (Sect. 4.3.2), these indicators can differ across children within a household. Including the parental death shocks in the instrument set does not result in a rejection of the Hansen J overidentification test and leads to similar overall findings.

  19. Standard corrections for clustering are valid only when the number of groups or clusters is large (Wooldridge 2003). Therefore, following Bertrand et al. (2004), we also estimated the models using block bootstrapped standard errors, using the same 64 clusters and resampling 10,000 times. Standard errors calculated from this approach were typically slightly larger than those reported in the paper, but significance patterns were the same as in our preferred IV-GMM specification (Table 3).

  20. Using the 5 % critical value of 5.78 presented by Stock and Yogo (2005; Table 5), with a Kleibergen-Paap (KP) Wald F statistic (Kleibergen and Paap 2006), we reject the hypothesis that the instruments are weak, where weak in this case means having bias in the IV-GMM results that is larger than 20 % of the bias in the OLS results. To the extent that our estimates are biased, however, conditional on the validity of the excluded instruments, they are biased toward the OLS estimate, suggesting that the results we report are conservative and understate the differences between OLS and IV-GMM.

  21. The Hansen J (HJ) statistic for overidentification does not reject the null hypothesis that the overidentifying restrictions are valid (i.e., that the model is well-specified and the instruments do not belong in the second-stage equation) at usual significance levels. Failure to reject the null hypothesis for the Hansen test is evidence that if any one of the instruments is valid, so are the others. Since the instrument set includes the randomly allocated exposure to the intervention and the earthquake indicator, both of which are likely to be valid, we interpret this as evidence in support of the validity of all the instruments. Further support comes from the finding that in models in which we include additional parental characteristics (Sect. 4.2.2), the overidentification test rejects the null, indicating that the HJ test has sufficient power to reject with these data.

  22. Increases in the estimated effects of HAZ at age six and skilled job tenure after we instrument likely also reflect some random measurement error bias, although the approximate tripling of the estimated coefficient is not likely to be due solely to measurement error.

  23. As for pre-school experience described above, the increase in this coefficient estimate after instrumenting is likely in part, but not entirely, due to random measurement error.

  24. We also considered three-stage least squares estimates (with bootstrapped standard errors allowing for clustering at the birth cohort level) and they yield similar results.

  25. We also considered estimating relations for women and men separately, but results are imprecise as we halve the sample size for each analysis.

  26. For these estimates, we augment the instrumental variable set with two additional variables: interactions between a male indicator and each of the variables for exposure to the nutritional intervention, since there is evidence that the intervention affected males and females differently (Maluccio et al. 2009).

  27. These include mother’s and father’s school attainment and a household wealth index. The wealth index is constructed using data on a set of household durables and housing characteristics observed in the early 1970s. Using principal components, these assets and characteristics were combined into an index that we call a “wealth” index (Maluccio et al. 2005).

  28. For RCS + pre-literacy score, we sum the raw score from the pre-literacy screen and the reading and comprehension tests under the assumption that those who were exempt from the pre-literacy test would have earned a perfect score had they taken it. Since those who failed the pre-literacy test have scores in the 0–35 range, this leads to variation in scores among those to whom we assigned zero for the RCS alone. For the quartile measure, each individual’s reading and comprehension test score is recorded in the quartile of the distribution in which it falls. All those who failed the pre-literacy test (17 %) are in the first (lowest) quartile.

  29. Stein et al. (2008) present a related finding that the nutritional intervention had significant impacts on adult cognitive skills even with controls for school attainment (although they do not attempt to control for the determinants of school attainment). Maluccio et al. (2009) also present a related finding that reduced-form estimates of adult cognitive skills indicate significant and substantial effects of the nutritional intervention.

  30. In many studies there is particular focus on the nutritional status at 36 months as being critical, particularly for linear growth (e.g., Maluccio et al. (2009) and the references therein). We note that the correlation between the measured height-for-age Z-score at 36 months and our indicator of height-for-age Z-score at 72 months is 0.97, so the use of 36 rather than 72 months does not change our basic results. We prefer to use the indicator at 72 months rather than at 36 months because we want to represent the entire pre-schooling period.

  31. The age categories are those used in the 1969–1977 survey, with finer divisions for earlier ages to capture the more rapid growth during those ages: 15 days; and 3, 6, 9, 12, 15, 18, 21, 24, 30, 36, 42, 48, 54, 60, 72, and 84 months (with a small range around each targeted age). We also explored using single month intervals and obtained similar results; we prefer the age category estimates because they smooth the estimates over months for which there are fewer observations.

  32. The resulting estimates for the height-for-age Z-scores at age 72 months are based on actual observations for 41 % of the cases and age categories for 48 months and above (and therefore on an individual child curve parallel to the asymptote described in the text) for 68 % of the remaining cases. The estimates for the other 32 % of the imputed cases are based on the younger age categories, with the 28.5–31.5 month interval accounting for 5 % of the total, and all other categories <5 %.
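The parameter count in note 14 can be checked by writing the expansion out. The sketch below uses hypothetical symbols that do not appear in the original: $K$ for an adult cognitive skill, and $P$, $S$, and $E$ for the single pre-school, school, and post-school indicators.

```latex
K \approx \alpha_0
  + \underbrace{\alpha_1 P + \alpha_2 S + \alpha_3 E}_{\text{3 linear terms}}
  + \underbrace{\alpha_{11} P^2 + \alpha_{22} S^2 + \alpha_{33} E^2}_{\text{3 squared terms}}
  + \underbrace{\alpha_{12} P S + \alpha_{13} P E + \alpha_{23} S E}_{\text{3 interaction terms}}
```

Under this assumed notation, the three linear, three squared, and three interaction terms give $3 + 3 + 3 = 9$ coefficients on endogenous variables, which matches the count stated in the note.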


  • Alderman H, Behrman JR, Ross D, Sabot R (1996) The returns to endogenous human capital in Pakistan’s rural wage labor market. Oxf B Econ Stat 58:29–55

  • Alderman H, Behrman JR, Kohler H-P, Maluccio JA, Cotts Watkins S (2001) Attrition in longitudinal household survey data: some tests for three developing country samples. Demogr Res [Online] 5:79–124

  • Barro R, Lee J-W (1993) International comparisons of educational attainment. J Monet Econ 32:363–394

  • Behrman JR (2009) Investment in education—inputs and incentives. In: Rodrik D, Rosenzweig MR (eds) Handbook of development economics: the economics of development policy, vol 5. North-Holland Publishing Company, Amsterdam, pp 4883–4975

  • Behrman JR, Rosenzweig MR (1999) ‘Ability’ biases in schooling returns and twins: a test and new estimates. Econ Educ Rev 18:159–167

  • Behrman JR, Rosenzweig MR (2002) Does increasing women’s schooling raise the schooling of the next generation? Am Econ Rev 92:323–334

  • Behrman JR, Rosenzweig MR (2004) Returns to birthweight. Rev Econ Stat 86:586–601

  • Behrman JR, Rosenzweig MR (2005) Does increasing women’s schooling raise the schooling of the next generation?—Reply. Am Econ Rev 95:1745–1751

  • Bertrand M, Duflo E, Mullainathan S (2004) How much should we trust differences-in-differences estimates? Q J Econ 119:249–275

  • Boissiere M, Knight JB, Sabot RH (1985) Earnings, schooling, ability, and cognitive skills. Am Econ Rev 75:1016–1030

  • Bound J, Jaeger DA, Baker RM (1995) Problems with instrumental variables estimation when the correlation between the instruments and the endogenous explanatory variable is weak. J Am Stat Assoc 90:443–450

  • Baum CF, Schaffer ME, Stillman S (2007) ivreg2: Stata module for extended instrumental variables/2SLS, GMM and AC/HAC, LIML, and k-class regression. Statistical Software Components S425401. Boston College, Department of Economics, Boston

  • Cunha F, Heckman JJ (2008) Formulating, identifying and estimating the technology of cognitive and non-cognitive skill formation. J Hum Resour 43:738–782

  • Cunha F, Heckman JJ, Schennach SM (2010) Estimating technology of cognitive and noncognitive skill formation. Econometrica 78:883–931

  • Dickens W, Flynn J (2001) Heritability estimates versus large environmental effects: the IQ Paradox resolved. Psychol Rev 108:346–369

  • Engle PL, Black MM, Behrman JR, Cabral de Mello M, Gertler PJ, Kapiriri L, Martorell R, Young ME, and the International Child Development Steering Committee (2007) Strategies to avoid the loss of potential among 240 million children in the developing world. Lancet 369:229–242

  • Fitzgerald J, Gottschalk P, Moffitt R (1998) An analysis of sample attrition in panel data. J Hum Resour 33:251–299

  • Flynn J (1987) Massive IQ gains in 14 nations: what IQ tests really measure. Psychol Bull 101:171–191

  • Glewwe P (1996) The relevance of standard estimates of rates of return to schooling for education policy: a critical assessment. J Dev Econ 51:267–290

  • Glewwe P, Jacoby H (1995) An economic analysis of delayed primary school enrollment and childhood malnutrition in a low income country. Rev Econ Stat 77:156–169

  • Grajeda R, Behrman JR, Flores R, Maluccio JA, Martorell R, Stein AD (2005) The human capital study 2002–04: tracking, data collection, coverage, and attrition. Food Nutr Bull 26:S15–S24

  • Grantham-McGregor S, Cheung YB, Cueto S, Glewwe P, Richter L, Strupp B (2007) Developmental potential in the first 5 years for children in developing countries. Lancet 369:60–70

  • Hanushek EA (1996) Measuring investments in education. J Econ Perspect 10:9–30

  • Hayashi F (2000) Econometrics. Princeton University Press, Princeton

  • Heckman JJ (2006a) Skill formation and the economics of investing in disadvantaged children. Science 312:1900–1902

  • Heckman JJ (2006b) The economics, technology, and neuroscience of human capability formation. Proc Natl Acad Sci 104(33):13250–13255

  • Islam M, Hoddinott J (2009) Evidence of intra-household flypaper effects from a nutrition intervention in rural Guatemala. Econ Dev Cult Change 57:215–238

  • Kleibergen F, Paap R (2006) Generalized reduced rank tests using the singular value decomposition. J Econom 133:97–126

  • LaFave D, Thomas D (2013) Height and cognition at work: labor market performance in a low income setting. Department of Economics, Colby College, USA

  • Maluccio JA, Murphy A, Yount KM (2005) Research note: a socioeconomic index for the INCAP Longitudinal Study 1969–77. Food Nutr Bull 26:S120–S124

  • Maluccio JA, Hoddinott J, Behrman JR, Quisumbing A, Martorell R, Stein AD (2009) The impact of improving nutrition during early childhood on education among Guatemalan adults. Econ J 119:734–763

  • Manuel HT (1967) Technical Reports, Tests of General Ability and Tests of Reading, Interamerican Series, Guidance Testing Associates, San Antonio

  • Martorell R (1997) Undernutrition during pregnancy and early childhood and its consequences for cognitive and behavioral development. In: Young ME (ed) Early child development: investing in our children’s future. Elsevier, Amsterdam, pp 39–83

  • Martorell R, Habicht J-P, Rivera JA (1995a) History and design of the INCAP Longitudinal Study (1969–77) and its follow up (1988–89). J Nutr 125:1027S–1041S

  • Martorell R, Schroeder DG, Rivera JA, Kaplowitz HJ (1995b) Patterns of linear growth in rural Guatemalan adolescents and children. J Nutr 125:1060S–1067S

  • Martorell R, Behrman JR, Flores R, Stein AD (2005) Rationale for a follow-up focusing on economic productivity. Food Nutr Bull 26:S5–S14

  • Mincer JB (1974) Schooling, experience, and earnings. National Bureau of Economic Research, New York

  • Murnane RJ, Willett JB, Levy F (1995) The growing importance of cognitive skills in wage determination. Rev Econ Stat 77:251–266

  • Nehru V, Swanson E, Dubey A (1995) New database on human capital stock in developing and industrial countries: sources, methodology and results. J Dev Econ 46:379–401

  • Pollitt E, Gorman KS, Engle PL, Martorell R, Rivera JA (1993) Early supplementary feeding and cognition: effects over two decades. Monogr Soc Res Child 58:122

  • Psacharopoulos G, Patrinos H (2004) Returns to investment in education: a further update. Edu Econ 12(2):111–134

  • Raven JC, Court JH, Raven J (1984) Manual for Raven’s progressive matrices and vocabulary scales, Section 2: coloured progressive matrices. H.K. Lewis, London

  • Rosen S (1974) Hedonic functions and implicit markets. J Polit Econ 82:34–55

  • Schroeder DG, Kaplowitz HJ, Martorell R (1992) Patterns and predictors of participation and consumption of supplement in an intervention study in rural Guatemala. Food Nutr Bull 14:191–200

  • Schultz TW (1975) The value of the ability to deal with disequilibria. J Econ Lit 13:827–846

  • Schweizer K, Goldhammer F, Rauch W, Moosbrugger H (2007) On the validity of Raven’s Matrices Test: does spatial ability contribute to performance? Pers Indiv Differ 43:1998–2010

  • StataCorp. (2011) Stata Statistical Software: Release 12.0, Stata Corporation, College Station, Texas

  • Stein AD, Barnhart HX, Hickey M, Ramakrishnan U, Schroeder DG, Martorell R (2003) Prospective study of protein-energy supplementation early in life and of growth in the subsequent generation in Guatemala. Am J Clin Nutr 78:162–167

  • Stein AD, Wang M, DiGirolamo A, Grajeda R, Ramakrishnan U, Ramirez-Zea M, Yount K, Martorell R (2008) Nutritional supplementation in early childhood, schooling and intellectual functioning in adulthood: a prospective study in Guatemala. Arch Pediatr Adolesc Med 162:612–618

  • Stock JH, Yogo M (2005) Testing for weak instruments in linear IV regression. In: Andrews DWK, Stock JH (eds) Identification and inference for econometric models: essays in honor of Thomas Rothenberg. Cambridge University Press, Cambridge

  • Strauss J, Thomas D (1995) Human resources: empirical modeling of household and family decisions. In: Behrman JR, Srinivasan TN (eds) Handbook of development economics, vol 3A. North-Holland Publishing Company, Amsterdam, pp 1883–2024

  • Strauss J, Thomas D (1998) Health, nutrition, and economic development. J Econ Lit 36:766–817

  • Todd PE, Wolpin KI (2003) On the specification and estimation of the production function for cognitive achievement. Econ J 113:F3–F33

  • Todd PE, Wolpin KI (2007) The production of cognitive achievement in children: home, school and racial test score gaps. J Hum Cap 1:91–136

  • Victora CG, Adair L, Fall C, Hallal PC, Martorell R, Richter L, Sachdev HS, on behalf of the Maternal and Child Undernutrition Study Group (2008) Undernutrition 2: maternal and child undernutrition: consequences for adult health and human capital. Lancet 371:340–357

  • WHO (World Health Organization) (2006) WHO child growth standards: length/height-for-age, weight-for-age, weight-for-length, weight-for-height and body mass index-for-age: methods and development. World Health Organization, Geneva

  • Wooldridge JM (2003) Cluster-sample methods in applied econometrics. Am Econ Rev Papers Proc 93:133–138



This study is based on work supported by Grand Challenges Canada Grant Number 0072-03 “Team 1000 + Saving Brains: Economic Impacts of Poverty-Related Risk Factors During the First 1,000 days for Cognitive Development and Human Capital,” NIH/Fogarty grant TW-05598 “Early Nutrition, Human Capital and Economic Productivity,” NSF/Economics grants SES 0136616 and SES 0211404 “Collaborative Research: Nutritional Investments in Children, Adult Human Capital and Adult Productivities,” NIH grant HD046125 “Education and Health over the Life Course in Guatemala,” NIH R01 HD045627-01 grant “Resource Flows Among Three Generations in Guatemala,” and NIH/NIA grant P30 AG12836 to PARC at the University of Pennsylvania. The authors thank their colleagues on the larger project of which this study is a part, including Alexis Murphy and Meng Wang for excellent research assistance in the preparation of the data. For valuable comments, the authors also thank Harold Alderman, Orazio Attanasio, Richard Blundell, Alan de Brauw, Andrew Chesher, Janet Currie, Scott McNiven, Costas Meghir, Austin Nichols, Alessandro Tarozzi, the editor, and an anonymous reviewer for the journal, as well as participants in seminars at the University of Arizona, University of Chicago, Columbia University, the Center for Global Development, the Minnesota International Development Conference, the Stanford Institute for Theoretical Economics (SITE) Summer Workshop on Health and Economic Development, Syracuse University, and University College London.

Author information


Corresponding author

Correspondence to Jere R. Behrman.


Appendix A: Data appendix for pre-school HAZ score indicator

The data include from one to 15 measurements of height-for-age Z scores on each of 1,954 children from the original sample between 1969 and 1977 (WHO 2006). Not all of these individuals, however, were measured at the same ages or at any particular age. For example, the largest number of individuals, 951 children or 49 % of those ever measured as infants and children, was measured at nine months (8.5–9.5 months). Because Z scores tend to follow similar age patterns in a poorly nourished population such as this one (Martorell 1997), we use this information to estimate the height-for-age Z score for individuals in the sample at a common pre-schooling age. For children in this sample, height-for-age Z scores tend to drop sharply in the early months of life, level off and reach a minimum at about 30 months of age, and then increase slightly, approaching an asymptote just below −2.0 (the common cutoff for stunting) through the remainder of the pre-schooling period.

Because our objective is to summarize the entire pre-schooling experience, we use the height-for-age Z score at age 72 months (6 years) as our indicator of pre-schooling experience; this age is both close to the age of starting school and one at which Z scores are relatively stable (Footnote 30). Since this measure is not available for the entire sample, when it is missing we estimate it using measurements of the child’s Z score at ages other than 72 months. We first estimate the Z score-age relation with dummy variables for age categories (Footnote 31) other than 72 months, controlling for child-level fixed effects. We then use the estimated age-category coefficients to adjust the nearest observed measurement of each child for whom we do not have an observation at age 72 months by the average difference between the measurement at the observed age and at 72 months (Footnote 32).
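The adjustment described above can be illustrated with a minimal Python sketch. It is not the authors’ code: the function names, the toy records, and the use of a least-squares dummy-variable regression via NumPy are all assumptions made for illustration, and the real age categories are finer than the three used here.

```python
import numpy as np

def estimate_age_effects(child_ids, ages, z_scores, ref_age=72):
    """Age-category effects relative to ref_age, with child fixed effects.

    Hypothetical helper: regresses Z scores on one dummy per child and one
    dummy per age category other than ref_age.
    """
    children = sorted(set(child_ids))
    other_ages = sorted(set(ages) - {ref_age})
    X = np.zeros((len(z_scores), len(children) + len(other_ages)))
    for row, (c, a) in enumerate(zip(child_ids, ages)):
        X[row, children.index(c)] = 1.0                      # child fixed effect
        if a != ref_age:
            X[row, len(children) + other_ages.index(a)] = 1.0  # age-category dummy
    coef, *_ = np.linalg.lstsq(X, np.asarray(z_scores, dtype=float), rcond=None)
    effects = {a: float(coef[len(children) + j]) for j, a in enumerate(other_ages)}
    effects[ref_age] = 0.0
    return effects

def impute_at_72(obs, effects, ref_age=72):
    """obs maps age in months to the observed Z score for one child."""
    if ref_age in obs:
        return obs[ref_age]                                  # actual observation
    nearest = min(obs, key=lambda a: abs(a - ref_age))       # nearest observed age
    # Remove the average gap between the observed age and 72 months.
    return obs[nearest] - effects[nearest]

# Toy data (made up): children A and B are observed at 72 months, child C is not.
records = [("A", 36, -2.5), ("A", 72, -2.2),
           ("B", 36, -2.1), ("B", 72, -1.8),
           ("C", 12, -2.0), ("C", 36, -3.0)]
child_ids, ages, z_scores = zip(*records)
effects = estimate_age_effects(child_ids, ages, z_scores)

by_child = {}
for c, a, z in records:
    by_child.setdefault(c, {})[a] = z
haz72 = {c: impute_at_72(obs, effects) for c, obs in by_child.items()}
print(haz72)
```

In this toy sample the 36-month category sits 0.3 below the 72-month level for both fully observed children, so child C’s nearest measurement (at 36 months) is shifted up by that amount.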

The data permit estimation of height-for-age Z scores at age 72 months for 1,954 individuals in the sample, whereas both adult cognitive skills test scores are available for 1,448 individuals. For 180 (12 %) of the individuals with test scores, however, we do not have the information needed to estimate the height-for-age Z score at age 72 months. For the 1,268 individuals for whom we have an actual or predicted height-for-age Z score at age 72 months, the mean value is −2.24 (median −2.22), close to the cutoff for the definition of stunting, with an SD of 0.96. The means do not differ significantly for males versus females. To retain the 180 observations on individuals without pre-schooling height-for-age when estimating the impact of experiences during the schooling and post-schooling periods, we replace the missing height-for-age Z score with the sample median of −2.22. Dropping these observations instead yields results that are similar to, and more highly significant than, those in the paper.
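As a rough illustration of the median replacement and of the sample counts reported above, the following Python sketch uses made-up Z scores; the variable names are hypothetical, and only the counts 1,448 and 180 come from the text.

```python
import numpy as np

# Made-up HAZ values at 72 months; np.nan marks individuals with no usable measurement.
haz72 = np.array([-2.9, np.nan, -2.2, -1.5, np.nan, -2.4])

# Median of the observed (actual or predicted) values only.
observed_median = np.nanmedian(haz72)

# Replace missing values with that median so these individuals stay in the sample.
haz72_filled = np.where(np.isnan(haz72), observed_median, haz72)

# Counts reported in the appendix.
n_with_tests = 1448       # individuals with both adult test scores
n_missing_haz = 180       # of those, individuals with no HAZ estimate
n_with_haz = n_with_tests - n_missing_haz
share_missing = n_missing_haz / n_with_tests
print(n_with_haz, round(100 * share_missing))
```

A single-value fill such as this keeps the sample size but adds no information about pre-school nutrition for the filled cases, which is consistent with the appendix also reporting results with those observations removed.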

Appendix B

See Table 6

Table 6 Attrition probits to construct weights used in Table 5, reweighting for attrition bias (N = 2,392)


Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

About this article

Cite this article

Behrman, J.R., Hoddinott, J., Maluccio, J.A. et al. What determines adult cognitive skills? Influences of pre-school, school, and post-school experiences in Guatemala. Lat Am Econ Rev 23, 4 (2014).


JEL Classification