The Long-Term Earnings Consequences of General vs. Specific Training of the Unemployed

Training programs for the unemployed typically involve teaching specific skills in demand amongst employers. In 1997, Swedish unemployed could also choose general training at the upper secondary school level. Despite the dominance of programs offering specific training, long-term relative earnings effects of general vs. specific training are theoretically ambiguous. Analyzing detailed administrative data 1990-2010, we find specific training associated with higher earnings in the short run, but that earnings converge over time. Results also indicate that individuals act on their comparative advantages. Long-run earnings advantages of general training are found for females with limited prior education and among metropolitan residents.


Introduction
Governments in most OECD countries offer training programs for the unemployed, typically oriented toward vocational/specific skills. The consensus view seems to be that vocational/specific training is a more efficient measure for unemployed individuals than are courses providing general/theoretical skills. In the short run, learning a branch-specific skill is presumed to better enhance re-entry into employment. General training, without an obvious connection to a labor market branch, may have less of an impact. However, in the long run, if general skills increase the ability to learn new tasks, this could make workers less sensitive to changes in the demand for skills. Earlier studies of adults in general education have reported average earnings returns which still increase eight to ten years after enrolment (Jacobson et al. 2003, Stenberg 2011; see Figures 1a and 1b). As program effects vary between individuals and over time, these estimates are not directly comparable with evaluations of vocational training programs, but they raise the question of whether the long-term effects of general training would catch up with or exceed the earnings effects of specific training. 1 Some economists have suggested that governments should stimulate adults to enroll in formal schooling during economic downturns (e.g., Urzua 2008, Pissarides 2011), but there is an almost complete lack of empirical research on this topic. It is therefore unclear whether skill adjustments among the unemployed should involve a larger element of general training. 2
1 The results from evaluations of specific training for the unemployed in Sweden have differed across decades, with positive effects in the 1980s, zero or negative effects for participants at the start of the 1990s, and positive effects again in the late 1990s and early 2000s (e.g., Andrén and Gustafsson 2005, Calmfors et al. 2001, Axelsson and Westerlund 2005, Stenberg and Westerlund 2004, de Luna et al. 2008). The restrained results at the start of the 1990s have usually been ascribed to the economic recession's effect on employment prospects and/or the large scale of labor market training programs at the time.

The purpose of this article is to evaluate the relative earnings association of general versus specific training for the unemployed. In the spring of 1997, the Swedish government announced the Adult Education Initiative (AEI henceforth) which targeted the same groups of the unemployed as did the traditional vocational/specific training program. The AEI enabled unemployed adults aged 25-55 to attend a year of full-time schooling at the upper secondary level, with financial support equal to a maintenance of unemployment benefits. AEI started in August 1997 and attracted large numbers. We study a sample comprising the unemployed individuals who enrolled in 1997 in either the AEI or the largest vocational training program in Sweden (Arbetsmarknadsutbildning), which we will refer to as "Labor Market Training" (LMT).
We explore exceptionally rich population register data which includes annual earnings from 1990 until 2010, providing a follow-up period of 13 years. Our descriptive average earnings trajectories already represent an interesting contribution, as we are not aware of any analysis of this length of time for general vs. specific labor market programs. To move closer to a causal interpretation, the empirical strategy is based on difference-in-differences propensity score matching, which explicitly takes into account heterogeneous treatment effects and individual time invariant (fixed) unobserved characteristics. The evaluated samples are balanced on more than 100 covariates and our findings are overall robust, e.g., when we check for potential bias by including measures of cognitive and non-cognitive skills (males born 1953 or later) and for "parallel trends" by controlling for dynamic factors (changes) prior to program enrolment. The results obtained are, as expected, more sensitive to the length of the follow-up period. In addition, the expansion of the menu of programs may enhance efficiency to the extent that individuals act on their comparative advantages in practical/theoretical skills. This is possible to examine as propensity score matching accounts for individuals' heterogeneity, and we find that results are also sensitive to the assumed counterfactual state, LMT versus the AEI. This point is discussed in Section 5.3 and the presented results include both cases.
Research comparing general and specific training for the unemployed is scant. Stenberg (2007) is a study similar to the present one, but it analyzes only the short-run annual earnings effects of the AEI and LMT (six years post-enrolment). The results were obtained with individual fixed effects estimates, i.e., basically relying on earnings and age as control variables. They corroborate the consensus view regarding short-term outcomes as the LMT individuals' earnings exceeded those of participants in the AEI by approximately €3,500 for males and by €1,500 for females. The descriptive statistics in Figures 2(a) and 2(b) demonstrate the earnings trajectories from raw data for 1991-2003.
The main contribution of this study is the estimation of the long-term relative earnings impact of general versus specific training of the unemployed 13 years post-enrolment. The length of the observation window makes it possible to examine whether the earlier reported short-term earnings advantage of LMT remains over time, whether trends converge or whether the long-term earnings are more in favor of general training. Because general training is rarely provided for the unemployed, a long-term relative earnings advantage of the AEI would potentially support an expansion of active labor market programs, by allowing individuals to choose the program type in accordance with their comparative advantages. A second contribution of this study is that we allow estimates to vary according to individuals' comparative advantages. This is achieved by considering heterogeneous program effects and by interchangeably modeling the counterfactual state as LMT or the AEI. The results indicate that specific training outperforms general training in the short run (5-7 years). In the longer perspective, 7-13 years after program enrolment, the estimates tend to converge toward zero.
The analyses indicate evidence consistent with individuals acting on their comparative advantages. Results pertaining to subgroups also reveal substantial heterogeneity and imply scope for efficiency gains by expanding labor market programs to include general training of the unemployed. This is particularly true for females with limited education and may also apply to residents in a metropolitan labor market region (Stockholm). In separate analyses, there are indications that vocational training may be a way to compensate for low levels of non-cognitive skills or, conversely, that non-cognitive skills are an important complement to skills obtained in general training.

Earnings returns to specific and general human capital
The distinction between specific and general skills made by Becker (1964) has often been used to formulate hypotheses on differences in expected short-term and long-term labor market outcomes (e.g., Brunello 2003, Hanushek et al. 2011, Krueger and Kumar 2004a, 2004b, Shavit and Müller 1998). In the short run, specific skills are assumed to be instantly in demand in the labor market, and to yield short-term average earnings returns which exceed those of general skills. General skills instead enhance the ability to learn, at the expense of a more sluggish transition from training into employment. While these are stylized characterizations, they fit with the trajectories presented in Stenberg (2007) and reproduced here as Figure 2.
In a longer perspective, business cycle fluctuations and technological changes may influence the relative payoff of the different types of human capital. First, by definition, the degree of transferability between employers is lower for specific skills. If the business cycle generates structural changes which force individuals to switch careers, there is a risk attached to investments in specific skills. Relatedly, technological changes could create an advantage for general skills if they enhance the ability to learn new skills. Employers could be more likely to offer further training to these individuals, who then become even less sensitive to changes. In sum, the long-run relative earnings implications are ambiguous, and the time frame emerges as an important aspect to appropriately analyze the impact of general vs. specific skills.
We expect individuals' comparative advantages to affect the choice of investment in specific or general human capital. From this follows two crucial implications. On the one hand, labor market efficiency and societal benefit may be enhanced when program options are increased. On the other hand, it also implies that program types may attract individuals with different characteristics. The latter potentially (but not necessarily) constitutes a source of endogeneity bias in our estimates. The empirical approach to take this into account is explained in Section 5.3.

Institutional setting
In Sweden, compulsory (comprehensive) school is nine years, with very limited tracking. This is followed by two- or three-year programs at the upper secondary school. The two-year programs are mainly vocational, but also encompass business, social science and technology. The three-year programs are all theoretical and are intended to provide eligibility for higher studies.
A notable characteristic of the Swedish educational system is the prevalence of adults in formal education. Since 1969, Swedish municipalities have been obliged by law to offer schooling to adults who wish to re-enroll at the lower (compulsory) or upper secondary level. Importantly, prior to the early 1990s, Komvux enrolment was rarely offered to unemployed individuals. This is partly explained by the fact that UI benefits are more generous than are study allowances (and do not require repayment) and that this would have generated incentives for individuals to register as unemployed before enrolling in Komvux. Figure 3 shows historical data of the numbers unemployed who were registered in Komvux and LMT. At the start of the 1990s, following an extreme recession which saw unemployment increase from 2 percent to 11 percent, the unemployed were assigned to LMT, which then grew to its largest size to date. From 1993, as the levels of open unemployment did not decrease in any significant way, the government offered municipalities funding of slots in Komvux, reserved for the unemployed. These funds gradually increased, and the proportion of year for the LMT and SEK 34,000 per year for the AEI. This would correspond to similar costs per participant. To simplify the analysis, we will disregard the direct program costs when assessing the relative payoff of the programs. 3

Data
This study is based on annual population register data for 1990-2010, which encompasses all individuals residing in Sweden. To define our samples, the unemployment registers provide information on the day of enrolment in the LMT and the end date of this registration. We define the LMT participants as those enrolled in May or later in 1997, to make the timing of the programs reasonably similar. The courses at Komvux are usually ongoing from the end of August until December (autumn semester) and/or from January until the beginning of June (spring semester). For those enrolling in the AEI, we set the twofold condition that individuals were registered in Komvux in the autumn semester of 1997 and that they received the special grant for education and training (Särskilt utbildningsbidrag, UBS) that was introduced in 1997 specifically for the AEI. This helps us distinguish between participants in the AEI and participants in the regular Komvux program, who attended the same courses (and in the same classrooms). Excluding the individuals registered in both LMT and AEI in 1997, and those attending vocational courses within the AEI, the numbers registered in programs were 40,835 (LMT) and 46,227 (AEI). For our analyses, we exclude individuals who were registered in either of the two programs in 1996. We also set the condition that the individuals were aged 25-55 in 1997, received UI benefits and were registered as unemployed for at least one day between the 1st of January and the 30th of June. With these restrictions, the sample size is 15,129 (LMT) and 16,099 (AEI). This is our benchmark sample used in the analyses presented. Figure 4 displays the trajectories of the AEI and LMT participants' annual earnings for 1990-2010. There is remarkable similarity in earnings between the two groups for 1990-1996, which is mainly an effect of conditioning on the incidence of the UI benefits in 1997.
At face value, the earnings of males after enrolment indicate an advantage of the LMT, but the general training appears to be more beneficial for females. To the best of our knowledge, this kind of descriptive evidence has not been presented earlier. In the results section, we perform robustness checks based on "limited samples", restricted to those never registered in either program in 1991-1996 (our earliest record of LMT is 1991). This increases the comparability and decreases the risk that estimated program effects are diluted, but at the cost of external validity. The remaining number of observations is then 7,153 (LMT) and 8,324 (AEI). Table A.1 in the Appendix gives the descriptive statistics.
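The benchmark sample restrictions described above can be sketched as a simple filter. This is only an illustration: the record layout and all field names are hypothetical and do not correspond to the actual register variables, and the example rows are invented.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # All field names are hypothetical stand-ins for register variables.
    age_1997: int
    program_1997: str                   # "AEI" or "LMT"
    in_program_1996: bool               # registered in either program in 1996
    received_ui_1997: bool              # received UI benefits
    unemployed_days_jan_jun_1997: int   # days registered as unemployed, 1 Jan to 30 Jun


def in_benchmark_sample(p: Person) -> bool:
    """Apply the restrictions stated in the text to one individual."""
    return (
        25 <= p.age_1997 <= 55
        and not p.in_program_1996
        and p.received_ui_1997
        and p.unemployed_days_jan_jun_1997 >= 1
    )


# Invented example records, for illustration only.
people = [
    Person(34, "AEI", False, True, 40),
    Person(58, "LMT", False, True, 90),    # outside the age range
    Person(41, "LMT", True, True, 120),    # registered in a program in 1996
    Person(29, "LMT", False, True, 1),
]
benchmark = [p for p in people if in_benchmark_sample(p)]
counts = {prog: sum(p.program_1997 == prog for p in benchmark) for prog in ("AEI", "LMT")}
print(counts)
```

The "limited samples" used in the robustness checks would add one further condition of the same kind (never registered in either program in 1991-1996).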

Empirical strategy
To assess the relative earnings impact of the AEI and LMT, we use difference-in-differences propensity score matching (PSM) to compare comparable individuals and take into account that treatment effects are heterogeneous. Below, we describe our relative average treatment effect on the treated (ATT) of the AEI and LMT, taking a conventional ATT estimator as a point of departure. The interpretation of the relative ATT estimates is discussed in Section 5.3.

Difference-in-differences propensity score matching
In our empirical implementation, year t is 1997 and t+ is (1998, 1999, …, 2010). If a program occurs at time t, the change in annual earnings $\Delta Y = Y_{t+} - Y_{t-}$ is calculated for each individual, where $t-$ refers to the pre-program period.
In a potential outcomes framework, we wish to compare $(\Delta Y_1 - \Delta Y_0)$, where subscripts denote 1 if treated and 0 if untreated (for now). One of these is always missing. We therefore make the assumption that conditional on individuals' pre-program observable characteristics X, and denoting D = 1 for actual treatment and zero otherwise:
$$E(\Delta Y_0 \mid X, D = 1) = E(\Delta Y_0 \mid X, D = 0)$$
If this assumption holds, it also holds for some function of X, such that the matching is reduced to conditioning on a scalar (Rosenbaum and Rubin, 1983):
$$E(\Delta Y_0 \mid P(X), D = 1) = E(\Delta Y_0 \mid P(X), D = 0)$$
The function P(X) is the propensity score, in our case a probit estimate of the probability of enrolment in a program. Each treated is matched with an untreated who is the nearest neighbor in terms of the probit estimate. Because $\Delta Y_0$ cannot be observed for treated individuals (D = 1), it is estimated by the observed outcomes of the matched comparisons.
Under assumptions i)–iii) given below, the ATT is then the average of $(\Delta Y_1 - \Delta Y_0)$ for samples which have been balanced on the covariates. Formally:
$$\mathrm{ATT} = E(\Delta Y_1 - \Delta Y_0 \mid D = 1) = E_{P(X) \mid D = 1}\left[ E(\Delta Y_1 \mid P(X), D = 1) - E(\Delta Y_0 \mid P(X), D = 0) \right]$$
Program effects are likely to be heterogeneous. It means that separate estimates of ATT for two programs are not necessarily comparable (i.e., ATT may be different from the average treatment effect, ATE). To directly compare AEI and LMT, one may estimate a relative ATT by applying the same reasoning as in the case of the ATT discussed above, but consider D = 1 the treatment and D = 0 the alternative treatment (instead of "no treatment"). We thereby obtain an estimate of relative program effects for comparable program participants. To give a hypothetical example, if the program effects are correlated with, say, age, separate estimates of ATT for the AEI and the LMT may differ only because of participants' different age structure.
The relative ATT would correct this potential flaw by comparing ΔY of program participants of the same age, where the age variable has been balanced between the two groups. Table 3 provides an account of the probit model estimates of P(X), here the probability of AEI as treatment and LMT as the alternative treatment. 6 To give estimates of the (relative) ATT a causal interpretation, one needs to assume: i) that 0 < P(X) < 1; ii) that program participation does not affect the earnings of other individuals; and iii) conditional on the covariates, that the mechanisms behind enrolment decisions are independent of future earnings. The crucial assumption is iii. Even with a rich set of covariates, where our differenced outcome accounts for unobserved individual fixed effects affecting earnings, it is not possible to rule out that remaining unobserved factor(s) may correlate with both participation and future earnings. This will be discussed in the remainder of this section. 7
6 Unless essential for the balancing of the samples, covariates are discarded from the probit estimates if p-values exceed .2. This is because irrelevant covariates may increase bias and/or variance of matching estimators (e.g., Caliendo and Kopeinig 2008, de Luna et al. 2011).
7 In the case under study, assumption ii can also be questioned because both training programs are large. However, Dahlberg and Forslund (2005) find no displacement effects of Swedish training programs in 1987-1996. One may note that they report substantial displacement effects of subsidized employment, as do Crépon et al. (2013) of job search assistance programs. Regarding positive externalities, Albrecht et al. (2009) argue that the returns to society of the AEI were higher than the individual earnings return by a factor of 1.5.
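As a minimal sketch of the matching step described in this section, the following code takes already estimated propensity scores and earnings changes ΔY as given, matches each treated individual to the nearest comparison with replacement, and averages the differences. The probit estimation is not shown, the function names are hypothetical, and the numbers are invented for illustration; this is not the estimation code used in the study.

```python
from bisect import bisect_left


def nearest_neighbor(score, sorted_scores):
    """Index of the comparison whose propensity score is closest to `score`."""
    i = bisect_left(sorted_scores, score)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(sorted_scores)]
    return min(candidates, key=lambda j: abs(sorted_scores[j] - score))


def relative_att(treated, comparisons):
    """Mean of (dY of treated - dY of matched comparison), matching with replacement.

    Each element of `treated` and `comparisons` is a (propensity_score, delta_y) pair.
    """
    comps = sorted(comparisons)              # sort comparisons by propensity score
    comp_scores = [s for s, _ in comps]
    diffs = []
    for score, dy in treated:
        j = nearest_neighbor(score, comp_scores)
        diffs.append(dy - comps[j][1])       # the comparison stays in the pool
    return sum(diffs) / len(diffs)


# Invented data: (Pr[AEI], change in annual earnings), AEI as treatment.
aei = [(0.70, 30.0), (0.55, 10.0), (0.80, 25.0)]
lmt = [(0.30, 40.0), (0.60, 20.0), (0.75, 15.0)]
print(relative_att(aei, lmt))
```

Swapping the arguments, with the score redefined as Pr[LMT], would give the estimate for the alternative set-up in which LMT is the treatment and the AEI the counterfactual state.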

Application
In the Appendix, Tables A.2  This holds for all of the estimates discussed in the empirical section. The balancing tests encompass a rich set of covariates that include age, regional employment levels, dummies for region of residence (23 categories), employment sector (7 categories), prior education level (6 categories) and educational track (6 categories), number of children at home (6 categories), age of children (6 categories), indicators of marital status or divorce, pre-treatment annual earnings trajectories for 1990-1995 (1996 with our extended model, see below), and four different types of social insurance benefits in 1990-1995 (1996) related to unemployment insurance, parental leave, sick-leave and social welfare, applying both dummy variables (zero earnings, incidence of the various benefits) and continuous measures of amounts. We further balance on days registered as unemployed each year in 1992-1995 (1996) and on indicator variables if either zero days or the maximum number of days (365/366). In total, our balancing tests encompass at least 132 variables.
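One common way to summarize covariate balance after matching is the standardized difference in means. The sketch below assumes that measure purely for illustration; the text does not state which balancing statistic is reported in the Appendix tables, and the ages used are invented.

```python
from math import sqrt
from statistics import mean, variance


def standardized_difference(treated, comparisons):
    """Difference in means as a percentage of the pooled standard deviation."""
    pooled_sd = sqrt((variance(treated) + variance(comparisons)) / 2)
    return 100 * (mean(treated) - mean(comparisons)) / pooled_sd


# Invented ages for treated individuals and their matched comparisons.
age_aei = [30, 35, 40, 45]
age_lmt_matched = [31, 35, 41, 45]
d = standardized_difference(age_aei, age_lmt_matched)
print(round(d, 2))
```

In practice such a statistic would be computed for every covariate in the list above, with values close to zero indicating that the matched samples are balanced on that covariate.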
Our main concerns regarding sources of potential bias are differences in unobserved ability and in time-varying unobserved factors (see Biewen et al. 2014 for an extensive discussion on specification issues). As a check for ability bias in our estimates, for males born 1953 or later, we compare the results when including and excluding test scores relating to cognitive and non-cognitive skills. The estimation results then only display marginal changes, which on average correspond to 0.2 percentage points of the annual earnings (approx. SEK 400).
Regarding time-varying unobserved factors, changes in motivation or health may not be captured by our covariates. 9 A common critique of difference-in-differences estimators is that a temporary earnings drop in the year prior to program enrolment among the treated generates an upward bias because the earnings level does not reflect the individual's true productivity (Ashenfelter 1978). The baseline model we use in the results section, unless otherwise stated, does not consider covariates recorded in 1996, with pre-program earnings defined as the average of the annual earnings in 1993-1995. A contrasting approach is to assume that changes post-1995 imply changes with permanent effects which must be controlled for (e.g., . We applied extended versions of our estimation models to consider changes in transfers and earnings 1995-1996. If our estimates are affected by violations of the parallel trends assumption, or by time-varying unobserved characteristics, one would expect results to change systematically across model specifications. Overall, the different specifications yield negligible differences in estimates. This is perhaps expected, as we compare participants in two programs rather than comparing with "non-participants". In Sections 6 and 7, the extended model results are reported when relevant. 10 Overall, the stability of our findings with respect to the extended model specification and the check for potential ability bias indicate support for our empirical strategy. 11
9 For some of the unemployed, program participation seems to be motivated primarily by avoidance of an active job search and/or to qualify for another period of UI benefits (Stenberg and Westerlund 2008, p63).
10 For our extended model, the balancing concerns an additional 26 variables. We follow  to control for nine different transitions in labor force status 1995-1996 between outside the labor force, employment and unemployment. Also included are levels 1996 and changes in the amounts of earnings and social insurance benefits in 1995-1996 and regarding sick-leave or social welfare also for 1996-1997 (we then assume that program choice does not cause transfers to change).
11 This is consistent with findings from studies assessing non-experimental estimates based on data of high quality. Card et al. (2010) conclude that "The absence of an 'experimental' effect suggests that the research designs used in recent non-experimental evaluations are not significantly biased relative to the benchmark of an experimental design" (F475, their quotation marks). Of course, this is not to say that adequate experimental data is not preferred. Nevertheless, when good non-experimental data is available, it is unreasonable to abstain from studying important research questions while waiting for the uncertain event of future access to relevant experimental data.
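The baseline definition of the outcome, with pre-program earnings taken as the 1993-1995 average so that a possible earnings dip in 1996 does not enter the baseline, can be sketched as follows. The function name and the earnings series are hypothetical and serve only as an illustration.

```python
def earnings_changes(earnings, baseline_years=(1993, 1994, 1995), follow_up=range(1998, 2011)):
    """Return {year: earnings[year] - baseline} for each follow-up year."""
    baseline = sum(earnings[y] for y in baseline_years) / len(baseline_years)
    return {year: earnings[year] - baseline for year in follow_up}


# Invented annual earnings (SEK thousands) for one individual, 1990-2010.
earnings = {year: 100.0 for year in range(1990, 1996)}
earnings[1996] = 40.0   # pre-program dip, deliberately left out of the baseline
earnings[1997] = 20.0   # enrolment year
earnings.update({year: 100.0 + 10.0 * (year - 1997) for year in range(1998, 2011)})

delta_y = earnings_changes(earnings)
print(delta_y[1998], delta_y[2010])
```

Had the 1996 value been used as the baseline instead, every ΔY in this example would be 60 units larger, which is the kind of upward distortion the Ashenfelter critique refers to.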

Comparative advantages and relative program effects
A basic motivation for policy makers to expand the program types available is that it allows individuals to act on their personal abilities, which may generate comparative advantages.
However, if these abilities affect labor market outcomes independently of program participation, this may yield bias in our estimates of the relative ATT. will not affect our estimates. However, the distributions in Figure 5 are clearly tilted toward the probability of the program defined as "treatment", and away from the program defined as alternative treatment ("comparison"). 12 The asymmetry arises because of matching and is exacerbated by the fact that matching is performed "with replacement" (to minimize bias). Thus, a matched comparison is always reinserted ("replaced") into the pool of potential comparisons. Consider the case where AEI is the treatment. The comparisons are LMT participants who, partly due to the replacement algorithm, are drawn to a greater extent from the side of the probability distribution where AEI participation is more likely. If individuals exploit their comparative advantages, one may then expect estimates of the relative ATT to be more favorable for the AEI program, without necessarily indicating bias.
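The tilt in the distribution of matched comparisons can be illustrated with a small sketch. It counts how often each LMT participant is reused as a nearest neighbor under matching with replacement and compares the mean Pr[AEI] among the matched comparisons with the mean among all LMT participants. The scores are invented and the function name is hypothetical.

```python
from collections import Counter


def match_counts(treated_scores, comparison_scores):
    """Number of times each comparison is picked as nearest neighbor (with replacement)."""
    counts = Counter()
    for s in treated_scores:
        j = min(range(len(comparison_scores)), key=lambda k: abs(comparison_scores[k] - s))
        counts[j] += 1
    return counts


# Invented propensity scores Pr[AEI].
aei_scores = [0.62, 0.71, 0.78, 0.85, 0.90]   # AEI participants (treatment)
lmt_scores = [0.15, 0.25, 0.40, 0.65, 0.80]   # LMT participants (comparisons)

counts = match_counts(aei_scores, lmt_scores)
matched_mean = sum(lmt_scores[j] * n for j, n in counts.items()) / sum(counts.values())
overall_mean = sum(lmt_scores) / len(lmt_scores)
print(dict(counts), matched_mean, overall_mean)
```

In this example only the two LMT participants with the highest Pr[AEI] are ever used, so the matched comparison group sits well to the right of the full LMT distribution.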
Assuming that all individuals in our sample have decided to enroll in a program, and that they choose freely between only two existing programs, the Pr[AEI] set-up tests whether the AEI is associated with higher earnings compared with the LMT for those choosing the AEI.
However, estimates could hypothetically reflect that the comparative advantages affect earnings independently of the AEI. The results presented below will therefore concern both alternatives, Pr[AEI] and Pr[LMT].
Some rudimentary guidance to the question "what works and for whom?" may be conveyed by comparing the balancing tests of the alternative matching set-ups (Tables A.2  Section 7, we analyze heterogeneity in the relative estimates across subsamples.

6
Main results

Heterogeneous effects
We now turn to analyses of subgroups. 16  to local labor market characteristics, e.g. size, density, diversity and/or employment structure.
The foremost difference in observable employment structures is that Stockholm has a lower share employed in the public sector and in manufacturing.
In Figure 9, Finally, we use the information contained in the test scores relating to cognitive and non-cognitive skills, which are available for males born 1953 or later. We separate this sample based on whether the respective test scores are above or below the median values, resulting in four groups in total (Figure 10). The findings are now less precise but still display two clear patterns. First, dividing the sample based on cognitive skills, above or below the median, has little impact on estimates. Perhaps surprisingly, cognitive skills do not seem to be important for the relative earnings impact of general vs. specific training. Second, the individuals with non-cognitive test scores below median appear to benefit more from specific training. For this group, the point estimates are statistically significant (negative) throughout. In contrast, those with above-median non-cognitive skills are associated with relatively stronger earnings effects of general training. The magnitude of the positive estimates is overall modest (also with the limited sample or the extended model specification), but it is interesting that the pattern of results between the groups above and below median is relatively clear. A possible interpretation is that learning a specific skill is a way to compensate for a lower level of non-cognitive skills. Conversely, non-cognitive skills may be an important complement for benefiting from general training.
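The median split behind the four groups can be sketched as below. The sketch assumes two separate splits (one per test score) and assigns scores equal to the median to the lower group; both choices are assumptions, as the text does not specify them, and the scores are invented.

```python
from statistics import median


def median_split(scores):
    """Return (indices at or below the median, indices above the median)."""
    cut = median(scores)
    below = [i for i, s in enumerate(scores) if s <= cut]
    above = [i for i, s in enumerate(scores) if s > cut]
    return below, above


# Invented test scores for six individuals.
cognitive = [3, 7, 5, 9, 4, 6]
non_cognitive = [8, 2, 6, 4, 5, 7]

cog_low, cog_high = median_split(cognitive)
non_low, non_high = median_split(non_cognitive)
print(cog_low, cog_high, non_low, non_high)
```

The relative ATT would then be estimated separately within each of the four index sets.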

Summary
A principal contribution of this study is to provide empirical evidence on long-term earnings associated with general training as an alternative to vocational/specific training. Heterogeneity among the unemployed, and in labor market demand for skills, implies that variety in the supply of training may allow individuals to capitalize on comparative advantages and improve the benefits of investments. With data on earnings 13 years post-enrolment showing differences between long-term and short-term outcomes, our analyses underscore the need for long follow-up periods to appropriately assess such programs. We also find strong indications that individuals tend to act on their comparative advantages. Characteristics predicting enrolment in general or specific training tend to be associated with estimated relative treatment effects that favor the chosen type of training. Methodologically, robustness checks for ability bias and time-varying characteristics prior to the program confirm our main findings.
For females with limited prior schooling and for participants in the metropolitan labor market of Stockholm, we find that general training is associated with earnings that exceed those of specific training. These findings are in line with the hypothesis that general training better enhances labor market prospects in the long run, by providing skills which make individuals less sensitive to labor market-related changes. Nevertheless, most of our estimates imply that vocational/specific training is associated with more favorable earnings trajectories.
Therefore, arguments in favor of theoretical/general training programs must be based on the heterogeneity of the unemployed. As has been suggested earlier, theoretical programs may be especially appropriate in periods of high unemployment when opportunity costs are low and high numbers in specific training programs may inflict lower marginal returns.
Our study makes a distinct contribution compared with previous research, but there are some important caveats and we would like to point out four of these. First, the program costs are based on rough approximations and are assessed as equal on average. Second, the comparison between the two programs disregards outside alternatives, e.g., other programs.
Third, other goals for policy (equity, democracy, etc.) are not considered. Fourth, general equilibrium effects are not considered. One might think here of costs associated with general training because, in the presence of labor market frictions, firms have incentives to offer not only specific training but also general education (Acemoglu and Pischke 1999). As in the case of specific training, increased public supply of general training may be associated with a deadweight loss due to crowding out of firms' investments in general skills.
include age-dummies (males) and 13 additional regional dummies. Estimates are also based on interaction variables which for males only include (Social welf.>0 1990*UI 1995. For females, the indicator variable of 9 years of schooling is interacted with "no unemployment 1995"; five interaction variables involve "no upper secondary school" (age at immigration, sick leave 1992, social welfare 1990 and 1995 and earnings 1995); two interaction variables involve two year upper secondary school (no unemployment 1995, and age at immigration); Stockholm is interacted with sick leave benefits 1991; and finally earnings 1995 squared is also included.
Note: Regional employment levels are gender specific. In 1990, sick leave benefits were paid from the first day of absence. This rule was changed in 1993 and only paid from the second day of sick leave absence. Variables recorded in 1996 are balanced when an extended model is applied. See text for further details.