r - Extrapolate missing data for each group by average percentage of change -


i have data frame containing average income zip code, years 2010-2014. want data years 2015-2017, i'm looking way extrapolate based on yearly average change of each zip code group years available.

for example:

year  zip   income 2010  1111   5000 2011  1111   5500 2012  1111   6000 2013  1111   6500 2014  1111   7000 2010  2222   5000 2011  2222   6000 2012  2222   7000 2013  2222   8000 2014  2222   9000 

should (roughly) have:

year  zip   income 2010  1111   5000 2011  1111   5500 2012  1111   6000 2013  1111   6500 2014  1111   7000 2015  1111   7614 2016  1111   8282 2017  1111   9009 2010  2222   5000 2011  2222   6000 2012  2222   7000 2013  2222   8000 2014  2222   9000 2015  2222   10424 2016  2222   12074 2017  2222   13986 

based on average growth of 8.78% zip code 1111 , 15.83% zip code 2222.

here's quick messy data.table idea

library(data.table)  #create data last_year <- 2014  dt <- data.table(year=rep(2010:last_year,2),              zip=c(rep(1111,5),rep(2222,5)),              income=c(seq(5000,7000,500),seq(5000,9000,1000)))  #future data dt_fut <- data.table(year=rep((last_year+1):2017,2),            zip=c(rep(1111,3),rep(2222,3)),            income=rep(na_integer_,6))  #calculate mean percentage change per year dt[,avg_growth:=mean(diff(log(income))),by=zip] #bind old future data dt <- rbindlist(list(dt,dt_fut),fill=t);setorder(dt,zip,year)  #carry last value forward replace na  dt[,avg_growth:=na.locf(avg_growth),by=zip][,income:=na.locf(income),by=zip]  #calculate # after 2014+1 (2015) replace income  # income*cumulative product of average growth (1+r)-1 dt[year>=last_year+1,income:=income*cumprod(1+avg_growth)-1,by=zip][] 

Comments

Popular posts from this blog

javascript - Clear button on addentry page doesn't work -

c# - Selenium Authentication Popup preventing driver close or quit -

tensorflow when input_data MNIST_data , zlib.error: Error -3 while decompressing: invalid block type -