dataset - Separate long and complex names in R
Say I have the following list of full scientific names of plant species inside my dataset:
fullspeciesnames <- c("aronia melanocarpa (michx.) elliott", "cotoneaster divaricatus rehder & e. h. wilson", "rosa canina l.", "ranunculus montanus willd.")
I want to obtain a list of simplified names, i.e. the first two elements of each name, namely:
simplespeciesnames <- c("aronia melanocarpa", "cotoneaster divaricatus", "rosa canina", "ranunculus montanus")
How can this be done in R?
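We can use sub to match a word (\\w+) followed by one or more whitespace characters (\\s+), with that pair repeated twice ({2}) inside a capture group, followed by the rest of the characters (.*). In the replacement, we use the backreference to the captured group (\\1). Because the captured group ends with the whitespace after the second word, trimws is used to strip it.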
trimws(sub("^((\\w+\\s+){2}).*", "\\1", fullspeciesnames))
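As a possible variation on the answer above, here is a minimal sketch that captures the two words without the trailing whitespace, so trimws is not needed. It assumes every name has at least two whitespace-separated tokens and uses \\S+ (any run of non-whitespace characters) in place of \\w+. That substitution is my own choice, not part of the original answer; it would also accept tokens containing a hyphen, which \\w+ does not match.

```r
# Same input vector as in the question.
fullspeciesnames <- c("aronia melanocarpa (michx.) elliott",
                      "cotoneaster divaricatus rehder & e. h. wilson",
                      "rosa canina l.",
                      "ranunculus montanus willd.")

# Capture "token, whitespace, token" at the start of the string and
# drop everything after it. The group ends on the second token, so no
# trailing space is left behind.
simplespeciesnames <- sub("^(\\S+\\s+\\S+).*", "\\1", fullspeciesnames)

# Alternative without a backreference (also an assumption, for comparison):
# split on whitespace and paste the first two pieces back together.
parts <- strsplit(fullspeciesnames, "\\s+")
viasplit <- vapply(parts,
                   function(p) paste(head(p, 2), collapse = " "),
                   character(1))

print(simplespeciesnames)
# [1] "aronia melanocarpa"      "cotoneaster divaricatus"
# [3] "rosa canina"             "ranunculus montanus"
```

Both approaches give the same result for these four names. The sub version leaves a string unchanged when it has fewer than two tokens, whereas the strsplit version returns whatever tokens are present.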