r - Number of firms per year using dplyr or datatable -
lets have data frame:
df <- data.frame(city = c("ny", "ny", "ny", "ny", "ny", "la", "la", "la", "la"), yearfrom = c("2001", "2003", "2002", "2006", "2008", "2004", "2005", "2005", "2002"), yearto = c(na, "2005", na, na, "2009", na, "2008", na, na))
where yearfrom year when e.g. firm established , yearto year when canceled. if yearto na still working.
i calculate number of firms every year.
the table should this
city |"year" |"count" "ny" |2001 1 "ny" |2002 2 "ny" |2003 3 "ny" |2004 3 "ny" |2005 2 "ny" |2006 3 "ny" |2007 3 "ny" |2008 4 "ny" |2009 3 "la" |2001 0 "la" |2002 1 "la" |2003 1 "la" |2004 2 "la" |2005 4 "la" |2006 4 "la" |2007 4 "la" |2008 2 "la" |2009 2
i solve dplyr or datatable package can't figure out how?
a shorter tidyverse
solution.
# firsts data prep df <- mutate(df, yearfrom = as.numeric(as.character(yearfrom)), #fix year coding yearto = as.numeric(as.character(yearto)), yearto = coalesce(yearto, max(c(yearfrom, yearto), na.rm = true))) #replace na max df %>% mutate(years = map2(yearfrom, yearto - 1, `:`)) %>% #find years unnest() %>% #spread on rows count(years, city) %>% #count them complete(city, years, fill = list(n = 0)) #add in zeros, if needed
Comments
Post a Comment