Longevities

How many years did businesses in a given category (broad) or type (narrow) survive from 1996 to 2022? Using my data to answer this question is not straightforward. The set of businesses whose first and last appearances on the strip both occurred during 1996–2022 is biased towards businesses with short longevities. For any number n ≤ 27, a business with longevity n is included only if it arrived between 1996 and 2023 – n, so that its nth and last year on the strip was no later than 2022: businesses destined to have longevity n years that arrived more recently are still operating. And businesses with longevity more than 27 are not included at all.
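The size of this bias can be illustrated with a short Python sketch. It assumes that a business with longevity n that first appears in year a is seen in the n consecutive years a through a + n – 1; the function name and constants are illustrative and not part of the data set.

```python
# Assumption: a business with longevity n that first appears in year a
# is seen in the n consecutive years a, a + 1, ..., a + n - 1.
FIRST_ARRIVAL = 1996   # earliest arrival year that can be observed
LAST_DEPARTURE = 2022  # latest last-appearance year that can be observed


def observable_arrival_years(longevity):
    """Arrival years for which both arrival and departure are observed."""
    return [
        year
        for year in range(FIRST_ARRIVAL, LAST_DEPARTURE + 1)
        if year + longevity - 1 <= LAST_DEPARTURE
    ]


if __name__ == "__main__":
    for n in (1, 10, 27, 28):
        print(n, len(observable_arrival_years(n)))
```

Under that assumption a business with longevity 1 can be counted from 27 different arrival years, one with longevity 27 from a single arrival year (1996), and one with longevity 28 from none.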

To attempt to correct for the first bias, this page, unlike all the other pages, presents estimates rather than raw data. Assume that (1) each year the set of businesses that start operation has the same size and the same distribution of longevities. Denote by m the maximum longevity in this distribution and, for each number k ≤ m, denote by n(k) the number of businesses in the distribution destined to have longevity k years. Assume also that (2) m is at most two less than the number of cohorts in the data.

For each number k ≤ m, denote by r(k) the number of businesses whose arrival and departure have both been observed and whose longevity was k years. To use these data to calculate n(k) for each value of k, number the cohorts so that cohort 2 consists of the businesses that entered in 1996, which makes the total number of cohorts 2023 – 1995 + 1 = 29. The longevity of a business is observed only if both its arrival and its departure occurred between 1996 and 2022, so for its longevity to be k years it must be a member of one of the cohorts from cohort 2 through cohort 29 – k (businesses that entered in 2023 – k). Thus out of the 29 cohorts, only (29 – k) – 2 + 1 = 28 – k cohorts of businesses with longevity k are in the data. That is, out of a total of 29n(k) businesses with longevity k that started operating during the survey period, the number in the data is r(k) = (28 – k)n(k). Thus n(k) = r(k)/(28 – k).
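As a minimal sketch of this adjustment, the following Python function divides each raw count r(k) by the number of cohorts in which longevity k can be observed. The function name and the raw counts are made up for illustration; they are not figures from my data.

```python
N_COHORTS = 29  # 2023 - 1995 + 1, as in the text


def adjusted_counts(raw_counts, n_cohorts=N_COHORTS):
    """Return n(k) = r(k) / (n_cohorts - 1 - k) for each longevity k."""
    adjusted = {}
    for k, r_k in raw_counts.items():
        observable = n_cohorts - 1 - k  # 28 - k cohorts when n_cohorts is 29
        if k < 1 or observable < 1:
            raise ValueError(f"longevity {k} cannot be fully observed")
        adjusted[k] = r_k / observable
    return adjusted


# Made-up raw counts r(k), for illustration only.
example_r = {1: 54, 5: 46, 20: 16, 27: 2}
example_n = adjusted_counts(example_r)
```

With these made-up counts every adjusted value n(k) equals 2.0: raw counts that fall steeply with longevity are consistent with a flat underlying distribution.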

The charts on this page show the numbers n(k) calculated in this way, rather than the raw numbers r(k). Assumption (1) is probably not correct for my data and assumption (2) is definitely not correct, but it seems likely that the adjusted numbers reflect the actual distribution of longevities better than the raw numbers do. If assumption (1), though incorrect, is not too far off the mark, then the fact that assumption (2) is false (some businesses that existed in 1995 are still operating) means that the average longevities I report are underestimates.
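To see how much the adjustment can move an average, here is a small Python sketch that compares the mean longevity computed from raw counts with the mean computed from adjusted counts. It assumes the average is a count-weighted mean of k, which is one natural reading rather than a statement of exactly how the reported averages are computed, and the counts are made up.

```python
def mean_longevity(counts):
    """Count-weighted mean of longevity k, given a {k: count} mapping."""
    total = sum(counts.values())
    return sum(k * c for k, c in counts.items()) / total


# Made-up raw counts r(k), for illustration only.
raw = {1: 54, 5: 46, 20: 16, 27: 2}

# Adjusted counts n(k) = r(k) / (28 - k), as derived in the text.
adjusted = {k: r / (28 - k) for k, r in raw.items()}

raw_mean = mean_longevity(raw)
adjusted_mean = mean_longevity(adjusted)
```

For these counts the raw mean is about 5.6 years and the adjusted mean is 13.25 years, which shows how strongly the raw data can understate longevity. Neither figure accounts for businesses with longevity above 27.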

Note that a business may have been operating at a location not on the strip before its first appearance and may have continued to operate at a location not on the strip after its last appearance; "longevity" refers to the number of years the business operated on the strip, not necessarily the total number of years it was in operation.





