r - Extrapolate missing data for each group by average percentage of change -
i have data frame containing average income zip code, years 2010-2014. want data years 2015-2017, i'm looking way extrapolate based on yearly average change of each zip code group years available.
for example:
year zip income 2010 1111 5000 2011 1111 5500 2012 1111 6000 2013 1111 6500 2014 1111 7000 2010 2222 5000 2011 2222 6000 2012 2222 7000 2013 2222 8000 2014 2222 9000
should (roughly) have:
year zip income 2010 1111 5000 2011 1111 5500 2012 1111 6000 2013 1111 6500 2014 1111 7000 2015 1111 7614 2016 1111 8282 2017 1111 9009 2010 2222 5000 2011 2222 6000 2012 2222 7000 2013 2222 8000 2014 2222 9000 2015 2222 10424 2016 2222 12074 2017 2222 13986
based on average growth of 8.78% zip code 1111 , 15.83% zip code 2222.
here's quick messy data.table idea
library(data.table) #create data last_year <- 2014 dt <- data.table(year=rep(2010:last_year,2), zip=c(rep(1111,5),rep(2222,5)), income=c(seq(5000,7000,500),seq(5000,9000,1000))) #future data dt_fut <- data.table(year=rep((last_year+1):2017,2), zip=c(rep(1111,3),rep(2222,3)), income=rep(na_integer_,6)) #calculate mean percentage change per year dt[,avg_growth:=mean(diff(log(income))),by=zip] #bind old future data dt <- rbindlist(list(dt,dt_fut),fill=t);setorder(dt,zip,year) #carry last value forward replace na dt[,avg_growth:=na.locf(avg_growth),by=zip][,income:=na.locf(income),by=zip] #calculate # after 2014+1 (2015) replace income # income*cumulative product of average growth (1+r)-1 dt[year>=last_year+1,income:=income*cumprod(1+avg_growth)-1,by=zip][]
Comments
Post a Comment