Big BLS employment data, disability, and worklife expectancy

Big Data. Bureau of Labor Statistics. Survey data. Employment Big Data. Calculating worklife expectancy for U.S. workers requires all of these. Worklife expectancy is similar to life expectancy and indicates how long a person can be expected to be active in the workforce over their working life. The worklife expectancy figure takes into account the anticipated time out of the labor market due to unemployment, voluntary leaves, attrition, etc.

Overall, the goal of our recent work is to update the Millimet et al. (2002) worklife expectancy paper and account for more recent CPS data. We also wanted to expand on a few additional topics: different definitions of educational attainment, reported disability, and occupational effects on worklife expectancy.

Finding: We also looked at the worklife expectancy for individuals with and without a reported disability. Disability was not covered in the Millimet et al. (2002) paper. As has been well reported, the disability measure in the BLS data is very general in nature. Accordingly, the applicability of the BLS disability measure to litigation is somewhat limited. However, it is interesting to note that individuals who reported having a disability exhibit a substantial reduction in worklife expectancy. On average the difference is about 10 years of work life. This is consistent with other studies on disability that relied on the BLS data. Other factors such as occupation and geographical region do not appear to have much impact on WLE estimates.

Younger workers today have slightly less attachment to the workforce than younger workers in the past


The goal of our recent work is to update the Millimet et al. (2002) worklife expectancy paper and account for more recent CPS data. Their paper uses data from the 1992 to 2000 time period. Our goal is to update that paper using data from 2000 to 2013 and see whether estimating the Millimet et al. (2002) econometric worklife models with more recent data changes the results of the 2002 paper in any substantive way.

Finding: Overall, the worklife expectancy estimated using more recent data from 2000-2013 is shorter than in the earlier time period (1992-2000) data set. This is true for younger workers (18 to early 40s); younger workers from the more recent cohorts have a shorter expected work life than younger workers in the earlier cohorts. Conversely, older workers in their 40s and 50s have a slightly longer worklife expectancy in the later time period data set. We are in the process of determining the statistical significance of these differences.

Table 4. Comparison of Worklife Expectancy for 1992-2000 and 2001-2013 Time Periods
Columns: Age; 1992-2000 Less than High School; 1992-2000 High School; 2001-2013 Less than High School; 2001-2013 High School
18 31.469 38.410 30.569 37.314
19 30.926 37.846 30.128 36.833
20 30.306 37.180 29.603 36.237
21 29.670 36.493 29.021 35.590
22 29.027 35.787 28.419 34.917
23 28.365 35.054 27.809 34.231
24 27.685 34.293 27.205 33.539
25 27.007 33.518 26.588 32.830
26 26.319 32.728 25.964 32.108
27 25.643 31.939 25.357 31.387
28 24.958 31.123 24.736 30.646
29 24.271 30.304 24.110 29.892
30 23.590 29.481 23.491 29.136
31 22.892 28.640 22.866 28.371
32 22.191 27.796 22.237 27.599
33 21.487 26.944 21.606 26.819
34 20.783 26.097 20.970 26.034
35 20.095 25.254 20.327 25.239
36 19.400 24.408 19.685 24.446
37 18.707 23.560 19.039 23.648
38 18.018 22.714 18.392 22.850
39 17.324 21.864 17.737 22.044
40 16.627 21.014 17.085 21.242
41 15.944 20.169 16.421 20.432
42 15.264 19.328 15.764 19.627
43 14.595 18.494 15.110 18.825
44 13.931 17.664 14.456 18.024
45 13.272 16.840 13.798 17.220
46 12.616 16.018 13.154 16.429
47 11.972 15.204 12.520 15.641
48 11.328 14.398 11.886 14.859
49 10.682 13.593 11.259 14.081
50 10.053 12.803 10.642 13.311
51 9.432 12.020 10.030 12.550
52 8.802 11.239 9.429 11.798
53 8.199 10.477 8.843 11.057
54 7.593 9.723 8.270 10.333
55 6.996 8.980 7.709 9.618
56 6.422 8.263 7.152 8.912
57 5.872 7.564 6.618 8.230
58 5.339 6.883 6.095 7.560
59 4.812 6.216 5.587 6.908
60 4.307 5.578 5.097 6.280
61 3.840 4.979 4.624 5.677
62 3.400 4.415 4.181 5.112
63 3.024 3.918 3.782 4.593
64 2.708 3.485 3.428 4.128
65 2.422 3.093 3.109 3.700
66 2.180 2.756 2.819 3.312
67 1.970 2.461 2.556 2.960
68 1.787 2.200 2.323 2.646
69 1.624 1.967 2.102 2.359
70 1.471 1.756 1.905 2.101
71 1.348 1.584 1.728 1.869
72 1.238 1.430 1.577 1.670
73 1.134 1.289 1.427 1.484
74 1.042 1.167 1.296 1.322
75 0.965 1.065 1.184 1.181
76 0.904 0.983 1.077 1.054
77 0.834 0.899 0.980 0.942
78 0.784 0.836 0.894 0.843
79 0.735 0.778 0.807 0.750
80 0.694 0.735 0.675 0.636

Notes:

The econometric model described by Millimet et al. (2002) and logistic regression equations by gender and education are used to calculate the worklife expectancy estimates. The worklife model in the left panel of the table is estimated using matched CPS cohorts from the 1992-2000 time period as described in the Millimet et al. (2002) paper. The model in the right panel is estimated using data from 2001-2013.

The logistic equation includes independent variables for age, age squared, race, race by age interaction, race by age interaction squared, marital status, marital status by age, occupation dummies, year and year dummies.

The model is first estimated separately for each gender and education level combination for active persons. The model is then estimated again for inactive persons. The educational attainment variables used to estimate our model differ from those of Millimet et al. (2002). In our model, only individuals whose highest level of attainment is high school are included in the high school category. Millimet et al. (2002) include individuals with some college in the high school category.
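The period-to-period differences in Table 4 can be checked with a few lines of arithmetic. The sketch below copies a handful of rows from the table (the selection of ages is ours, purely for illustration) and computes the later-period value minus the earlier-period value for each education column.

```python
# Selected rows copied from Table 4:
# age -> (LHS 1992-2000, HS 1992-2000, LHS 2001-2013, HS 2001-2013)
# LHS = less than high school, HS = high school. Ages chosen for illustration only.
TABLE_4 = {
    18: (31.469, 38.410, 30.569, 37.314),
    31: (22.892, 28.640, 22.866, 28.371),
    32: (22.191, 27.796, 22.237, 27.599),
    35: (20.095, 25.254, 20.327, 25.239),
    36: (19.400, 24.408, 19.685, 24.446),
    50: (10.053, 12.803, 10.642, 13.311),
}


def period_change(age):
    """Return (LHS change, HS change) in years: 2001-2013 minus 1992-2000."""
    lhs_early, hs_early, lhs_late, hs_late = TABLE_4[age]
    return round(lhs_late - lhs_early, 3), round(hs_late - hs_early, 3)


for age in sorted(TABLE_4):
    lhs_change, hs_change = period_change(age)
    print(f"age {age}: less than HS {lhs_change:+.3f}, HS {hs_change:+.3f}")
```

A negative value means the 2001-2013 estimate is shorter than the 1992-2000 estimate at that age; a positive value means it is longer.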

Replication of the Millimet et al. (2002) work was sufficient and yielded similar results


Overall, the goal of our recent work is to update the Millimet et al. (2002) worklife expectancy paper and account for more recent CPS data. Their paper uses data from the 1992 to 2000 time period. Our goal is to update that paper using data from 2000 to 2013. The main goal of the paper is to see whether estimating the Millimet et al. (2002) econometric worklife models with more recent data changes the results of the 2002 paper in any substantive way.

As for the results, overall there are several findings. First, we were able to create a matched CPS data set of 201,797 individuals, whereas Millimet et al. (2002) found 200,916 matched individuals.

Overall, we match their results very closely as well. For example, Millimet et al. (2002) found that a 26-year-old male with less than a high school education had 27.27 years of WLE remaining, while we found that person had 26.319 years remaining based on our replication of their work. They found that a person of the same age with a high school education had 32.89 years remaining, while we found 32.728 years remaining. The replication was particularly good for both the less than high school and high school levels of educational attainment.

The WLE numbers are close but not quite as close for college and some college. This is primarily because we use different definitions of some college and college than Millimet et al. (2002) did in their paper.
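To put the replication gap in perspective, the two age-26 comparisons quoted above can be expressed as absolute and relative differences. This is only a small arithmetic sketch using the figures reported in Table 3; the function name is ours.

```python
# (published by Millimet et al. 2002, our replication) for age 26, from Table 3
PUBLISHED_VS_REPLICATED = {
    "less than high school": (27.270, 26.319),
    "high school": (32.890, 32.728),
}


def replication_gap(published, replicated):
    """Return the gap in years and as a percentage of the published value."""
    gap = published - replicated
    return round(gap, 3), round(100.0 * gap / published, 2)


for level, (published, replicated) in PUBLISHED_VS_REPLICATED.items():
    years, percent = replication_gap(published, replicated)
    print(f"{level}: gap of {years} years ({percent}% of the published value)")
```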

Table 3. Comparison of Millimet et al. (2002) and Steward and Gaylor (2015) Active to Active Worklife Expectancy Probabilities
Columns: Age; Millimet et al. (2002) Less than High School; Millimet et al. (2002) High School; Steward and Gaylor (2015) Replication Less than High School; Steward and Gaylor (2015) Replication High School
18 32.331 38.944 31.469 38.410
19 31.801 38.239 30.926 37.846
20 31.247 37.522 30.306 37.180
21 30.684 36.794 29.670 36.493
22 30.080 36.058 29.027 35.787
23 29.450 35.294 28.365 35.054
24 28.766 34.513 27.685 34.293
25 28.035 33.711 27.007 33.518
26 27.270 32.890 26.319 32.728
27 26.495 32.052 25.643 31.939
28 25.710 31.201 24.958 31.123
29 24.923 30.341 24.271 30.304
30 24.131 29.477 23.590 29.481
31 23.345 28.606 22.892 28.640
32 22.556 27.735 22.191 27.796
33 21.775 26.862 21.487 26.944
34 21.006 25.989 20.783 26.097
35 20.233 25.112 20.095 25.254
36 19.452 24.240 19.400 24.408
37 18.681 23.370 18.707 23.560
38 17.921 22.504 18.018 22.714
39 17.178 21.641 17.324 21.864
40 16.459 20.782 16.627 21.014
41 15.734 19.928 15.944 20.169
42 15.031 19.081 15.264 19.328
43 14.333 18.242 14.595 18.494
44 13.669 17.410 13.931 17.664
45 13.020 16.588 13.272 16.840
46 12.381 15.775 12.616 16.018
47 11.758 14.974 11.972 15.204
48 11.144 14.185 11.328 14.398
49 10.538 13.409 10.682 13.593
50 9.952 12.646 10.053 12.803
51 9.379 11.898 9.432 12.020
52 8.836 11.167 8.802 11.239
53 8.299 10.459 8.199 10.477
54 7.775 9.772 7.593 9.723
55 7.265 9.107 6.996 8.980
56 6.767 8.456 6.422 8.263
57 6.261 7.829 5.872 7.564
58 5.800 7.236 5.339 6.883
59 5.397 6.678 4.812 6.216
60 5.016 6.153 4.307 5.578
61 4.678 5.672 3.840 4.979
62 4.350 5.225 3.400 4.415
63 4.060 4.815 3.024 3.918
64 3.797 4.420 2.708 3.485
65 3.574 4.061 2.422 3.093
66 3.395 3.741 2.180 2.756
67 3.224 3.445 1.970 2.461
68 3.047 3.162 1.787 2.200
69 2.873 2.886 1.624 1.967
70 2.691 2.621 1.471 1.756
71 2.528 2.401 1.348 1.584
72 2.362 2.196 1.238 1.430
73 2.170 1.999 1.134 1.289
74 2.002 1.829 1.042 1.167
75 1.898 1.672 0.965 1.065
76 1.743 1.533 0.904 0.983
77 1.592 1.449 0.834 0.899
78 1.514 1.339 0.784 0.836
79 1.461 1.274 0.735 0.778
80 1.374 1.172 0.694 0.735
81 1.273 1.046 0.661 0.687
82 1.222 0.993 0.631 0.656
83 1.121 0.912 0.604 0.623
84 0.874 0.755 0.569 0.585
85 0.433 0.355 0.522 0.532

Notes:

The econometric model described by Millimet et al. (2002) and logistic regression equations by gender and education are used to calculate the worklife expectancy estimates. The model is estimated using matched CPS cohorts from the 1992-2000 time period as described in the Millimet et al. (2002) paper. The logistic equation includes independent variables for age, age squared, race, race by age interaction, race by age interaction squared, marital status, marital status by age, occupation dummies, year and year dummies. The model is first estimated separately for each gender and education level combination for active persons. The model is then estimated again for inactive persons.

 

Steward and Gaylor (2015) Matched CPS Sample Sizes for 1993-2013 time period


Overall, the goal of our recent work is to update the Millimet et al. (2002) worklife expectancy paper and account for more recent CPS data.

The data for all years is shown below.  Ultimately there were over 590,000 data points used in the analysis.

Table 2. Matched CPS Sample Sizes 1993-2013
Columns: Year; Female Less than High School; Female High School; Female Some College; Female College; Male Less than High School; Male High School; Male Some College; Male College; Total
1993 3,766 7,326 4,898 3,452 3,376 5,619 4,280 3,935 36,652
1994 3,539 7,019 5,357 3,619 3,097 5,477 4,411 4,013 36,532
1995 3,082 6,161 5,086 3,545 2,664 4,815 4,086 3,938 33,377
1997 3,079 6,172 4,771 3,488 2,723 4,857 3,926 3,723 32,739
1998 2,839 6,113 4,873 3,672 2,694 4,952 3,995 3,834 32,972
1999 2,709 6,027 4,987 3,770 2,513 4,830 4,134 3,923 32,893
2000 2,692 5,930 5,009 3,915 2,463 4,899 4,052 4,204 33,164
2001 2,545 5,806 4,971 3,901 2,458 4,919 4,232 4,016 32,848
2003 1,096 3,218 2,579 2,411 1,019 2,701 2,122 2,470 17,616
2004 2,579 6,372 5,803 5,009 2,394 5,307 4,745 4,819 37,028
2005 2,039 5,378 5,146 4,673 1,867 4,632 4,270 4,285 32,290
2006 2,297 5,500 5,608 4,657 2,131 4,953 4,263 4,389 33,798
2007 2,147 5,730 5,466 5,060 2,076 5,133 4,344 4,592 34,548
2008 2,159 5,659 5,787 5,281 2,040 5,212 4,593 4,826 35,557
2009 2,027 5,637 5,780 5,556 2,023 5,062 4,776 4,976 35,837
2011 1,845 4,844 5,106 5,136 1,786 4,603 4,176 4,432 31,928
2012 1,733 4,849 4,930 4,956 1,779 4,693 4,151 4,616 31,707
2013 1,658 4,542 5,061 5,109 1,668 4,579 4,271 4,650 31,538
Total 43,831 102,283 91,218 77,210 40,771 87,243 74,827 75,641 593,024

Notes:

The CPS data was matched using an algorithm similar to that of Millimet et al. (2002) and Peracchi and Welch (1995). Households in rotations 1-4 were matched by household identifier number to the same household in rotations 5-8 of the following year. Individuals had to have the same sex and race and be a year older in rotations 5-8 to be counted as a match.
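The matching rule in this note can be sketched roughly as follows. This is a simplified illustration, not our production routine: the record layout and field names (household_id, year, rotation, sex, race, age) are hypothetical, and a real CPS match would need additional keys and checks that are not described here.

```python
from collections import namedtuple

# Hypothetical record layout; actual CPS variable names differ.
Person = namedtuple("Person", "household_id year rotation sex race age")


def match_cohorts(records):
    """Pair rotation 1-4 records with rotation 5-8 records one year later.

    A pair counts as a match when the household identifier, sex and race
    agree and the person is exactly one year older in the later interview.
    """
    # Index the later interviews by the fields that must line up.
    later = {}
    for p in records:
        if 5 <= p.rotation <= 8:
            key = (p.household_id, p.year, p.sex, p.race, p.age)
            later.setdefault(key, []).append(p)

    matches = []
    for p in records:
        if 1 <= p.rotation <= 4:
            key = (p.household_id, p.year + 1, p.sex, p.race, p.age + 1)
            candidates = later.get(key, [])
            if candidates:
                # Use each later record at most once.
                matches.append((p, candidates.pop(0)))
    return matches


# Made-up records: the first household matches, the second fails the age check.
sample = [
    Person("H1", 1999, 2, "F", "W", 40),
    Person("H1", 2000, 6, "F", "W", 41),
    Person("H2", 1999, 1, "M", "B", 30),
    Person("H2", 2000, 5, "M", "B", 33),
]
print(len(match_cohorts(sample)), "matched pair(s)")
```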

 

Comparison of CPS matched data sets – Millimet et al. (2002) to Steward and Gaylor (2015)


Overall, the goal of our recent work is to update the Millimet et al. (2002) worklife expectancy paper and account for more recent CPS data. Their paper uses data from the 1992 to 2000 time period. Our goal is to update that paper using data from 2000 to 2013. The main goal of the paper is to see whether estimating the Millimet et al. (2002) econometric worklife models with more recent data changes the results of the 2002 paper in any substantive way.

 

Our approach is twofold. First, we matched the BLS data cohorts based on the Millimet et al. (2002) and Peracchi and Welch (1995) papers. In a nutshell, the CPS matching routine involves matching incoming and outgoing cohorts across a given year. Once the data is matched, we then look at the work status of the individuals to determine whether they were active or inactive across the year in which they were interviewed by the BLS. We were able to create a matched CPS data set of 201,797 individuals, whereas Millimet et al. (2002) found 200,916 matched individuals.

Table 1. Comparison of CPS cohort matched data sets
Columns: Year; Millimet et al. (2002); Steward and Gaylor (2015)
1992/93 37,709 36,652
1994/95 34,418 33,377
1996/97 31,691 32,739
1997/98 32,276 32,972
1998/99 32,083 32,893
1999/2000 32,739 33,164
Total 200,916 201,797

Notes:

The CPS data was matched using an algorithm similar to that of Millimet et al. (2002) and Peracchi and Welch (1995). Households in rotations 1-4 were matched by household identifier number to the same household in rotations 5-8 of the following year. Individuals had to have the same sex and race and be a year older in rotations 5-8 to be counted as a match.
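As a quick consistency check on Table 1, the yearly counts can be summed and the two totals compared. The figures are copied from the table; the variable names are ours.

```python
# Year -> (Millimet et al. 2002, Steward and Gaylor 2015), copied from Table 1
TABLE_1 = {
    "1992/93": (37709, 36652),
    "1994/95": (34418, 33377),
    "1996/97": (31691, 32739),
    "1997/98": (32276, 32972),
    "1998/99": (32083, 32893),
    "1999/2000": (32739, 33164),
}

millimet_total = sum(m for m, _ in TABLE_1.values())
replication_total = sum(r for _, r in TABLE_1.values())
difference = replication_total - millimet_total

print("Millimet et al. total:", millimet_total)
print("Replication total:", replication_total)
print("Difference:", difference,
      f"({100.0 * difference / millimet_total:.2f}% of the Millimet et al. total)")
```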

 

Worklife expectancy for U.S. workers – updating the Millimet et al. (2002) econometric model of worklife

Overall, the goal of our recent work is to update the Millimet et al. (2002) worklife expectancy paper and account for more recent CPS data. Their paper uses data from the 1992 to 2000 time period. Our goal is to update that paper using data from 2000 to 2013. The main goal of the paper is to see whether estimating the Millimet et al. (2002) econometric worklife models with more recent data changes the results of the 2002 paper in any substantive way.
In addition, we wanted to expand on a few additional topics: different definitions of educational attainment, reported disability, and occupational effects on worklife expectancy.
Our approach is twofold. First, we matched the BLS data cohorts based on the Millimet et al. (2002) and Peracchi and Welch (1995) papers. In a nutshell, the CPS matching routine involves matching incoming and outgoing cohorts across a given year. Once the data is matched, we then look at the work status of the individuals to determine whether they were active or inactive across the year in which they were interviewed by the BLS.
Using this matched data, we next replicated the work of Millimet et al. (2002) using the 1992 to 2000 CPS data, as they did in their paper. In general, the Millimet et al. (2002) econometric model uses a standard logistic regression framework to estimate transition probabilities in a two-state labor market framework where a person is either active or inactive in the workforce.
The methodology begins by estimating a logistic regression using individuals who were active when first interviewed. Independent variables such as occupation, gender, marital status and number of children are included in the logistic regression. A separate regression is estimated for individuals who were inactive at the time of the first BLS interview. Separate active and inactive regressions are also estimated for certain factors of interest, such as educational attainment level and reported disability status.
The logistic regression equations provide probabilities that are conditional on the labor force attachment of the individual at the time of the interview. These conditional probabilities yield the transition probabilities for initially active or inactive individuals. For example, a person who is active at the start of a period could be either active or inactive in the next period; the transition probabilities obtained from the logistic regression give the probability of each outcome.
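As a rough illustration of how two fitted logistic equations translate into the four transition probabilities, consider the sketch below. It assumes the dependent variable in each regression is "active in the next period"; the coefficient values and the two-element covariate vector (intercept, age) are made up for illustration and are not our estimates.

```python
import math


def logistic(z):
    """Standard logistic function: maps a linear index to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def transition_probabilities(active_coefs, inactive_coefs, x):
    """Return the two-state transition probabilities for covariates x.

    active_coefs:   coefficients from the regression on initially active people
    inactive_coefs: coefficients from the regression on initially inactive people
    """
    z_active = sum(b * v for b, v in zip(active_coefs, x))
    z_inactive = sum(b * v for b, v in zip(inactive_coefs, x))
    p_aa = logistic(z_active)    # active now   -> active next period
    p_ia = logistic(z_inactive)  # inactive now -> active next period
    return {"aa": p_aa, "ai": 1.0 - p_aa, "ia": p_ia, "ii": 1.0 - p_ia}


# Made-up coefficients for (intercept, age); illustration only.
probs = transition_probabilities([4.0, -0.03], [1.0, -0.04], [1.0, 40.0])
print(probs)
```

Each starting state has its own pair of probabilities that sum to one, which is what allows the two regressions to be combined into a two-state transition matrix.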
As described in the Millimet et al. (2002) paper, the expected work life for each age is obtained recursively by working backwards from an assumed terminal year (T + 1). The terminal year is the year after which no one is assumed to be active. In the analysis a terminal age of 80 or 85 is used.
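The backward recursion can be sketched as below. This is a simplified two-state version under assumptions of our own choosing for illustration: mortality is ignored, a person who stays active is credited a full year, and a person who changes state is credited half a year. The exact specification in Millimet et al. (2002) may differ, and the probability functions passed in are hypothetical placeholders.

```python
def worklife_expectancy(p_aa, p_ia, start_age, terminal_age):
    """Backward recursion for expected remaining years of activity.

    p_aa(age): probability an active person is active next year
    p_ia(age): probability an inactive person is active next year
    Returns {age: (WLE if active at that age, WLE if inactive at that age)}.
    """
    # At terminal_age + 1 nobody is assumed to be active.
    wle_active, wle_inactive = 0.0, 0.0
    results = {}
    for age in range(terminal_age, start_age - 1, -1):
        stay, enter = p_aa(age), p_ia(age)
        # Full-year credit for staying active, half-year credit for a transition.
        new_active = stay * (1.0 + wle_active) + (1.0 - stay) * (0.5 + wle_inactive)
        new_inactive = enter * (0.5 + wle_active) + (1.0 - enter) * wle_inactive
        wle_active, wle_inactive = new_active, new_inactive
        results[age] = (wle_active, wle_inactive)
    return results


# Made-up constant probabilities, for illustration only.
table = worklife_expectancy(lambda age: 0.95, lambda age: 0.30, 18, 80)
print("WLE at 18 (active, inactive):", table[18])
```

In practice the two probability functions would come from the fitted active and inactive logistic regressions, evaluated at each age for a given set of characteristics.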
Using the model, we began by replicating the Millimet et al. (2002) econometric model. After we replicated the model, we performed some additional work and expanded the logistic regression worklife equations. The results of our estimation are shown in the attached tables.
As for the results, overall there are several findings. First, we were able to create a matched CPS data set of 201,797 individuals, whereas Millimet et al. (2002) found 200,916 matched individuals.
Overall, we match their results very closely as well. For example, Millimet et al. (2002) found that a 26-year-old male with less than a high school education had 27.27 years of WLE remaining, while we found that person had 26.319 years remaining in our replication. They found that a person of the same age with a high school education had 32.89 years remaining, while we found 32.728 years remaining. The replication was particularly good for both the less than high school and high school levels of educational attainment.
The WLE numbers are close but not quite as close for college and some college. This is primarily because we use different definitions of some college and college than Millimet et al. (2002) did in their paper.
Overall, the worklife expectancy estimated using more recent data from 2000-2013 is shorter than in the earlier time period (1992-2000) data set. This is true for younger workers (18 to early 40s); younger workers from the more recent cohorts have a shorter expected work life than younger workers in the earlier cohorts. Conversely, older workers in their 40s and 50s have a slightly longer worklife expectancy in the later time period data set. We are in the process of determining the statistical significance of these differences.
We also looked at the worklife expectancy for individuals with and without a reported disability. Disability was not covered in the Millimet et al. (2002) paper. As has been well reported, the disability measure in the BLS data is very general in nature. Accordingly, the applicability of the BLS disability measure to litigation is somewhat limited. However, it is interesting to note that individuals who reported having a disability exhibit a substantial reduction in worklife expectancy. On average the difference is about 10 years of work life. This is consistent with other studies on disability that relied on the BLS data. Other factors such as occupation and geographical region do not appear to have much impact on WLE estimates.

Couch surfing: what do U.S. BLS surveys have to say about it?

According to dictionary.com:

[kouch-surf] couch surfing: sleeping on the couch or extra bed of an acquaintance when traveling or between permanent lodging places, esp. to save money.

 

Couch surfing is an alternative way of living and traveling, especially among the young. There are even websites, like https://www.couchsurfing.org/, dedicated to making couch surfing matches.

 

The prevalence of couch surfing can be measured to a good degree with U.S. BLS Consumer Expenditure Survey data. The table below shows the breakdown of who owns outright (1), owns with a mortgage (2), rents (3), stays without paying rent (4), and stays in a dorm (5).

[Table image: couchsurf]