dataset - Separate long and complex names in R
Say I have the following list of full scientific names of plant species inside my dataset:
fullspeciesnames <- c("aronia melanocarpa (michx.) elliott", "cotoneaster divaricatus rehder & e. h. wilson", "rosa canina l.", "ranunculus montanus willd.")
I want to obtain a list of simplified names, i.e. the first two elements of each given name, namely:
simplespeciesnames <- c("aronia melanocarpa", "cotoneaster divaricatus", "rosa canina", "ranunculus montanus")
How can this be done in R?
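We can use sub. The pattern matches a word (\\w+) followed by one or more whitespace characters (\\s+), then another word and whitespace, all captured as a group, followed by the rest of the characters (.*). In the replacement, we use the backreference to the captured group (\\1), and trimws removes the trailing whitespace that the group leaves behind: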
trimws(sub("^((\\w+\\s+){2}).*", "\\1", fullspeciesnames))
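For comparison, here is a small self-contained sketch that runs the sub call on the vector from the question and then tries a split-based variant. The second approach and the names regex_names and split_names are illustrative assumptions, not part of the original answer.

```r
# Input vector from the question
fullspeciesnames <- c("aronia melanocarpa (michx.) elliott",
                      "cotoneaster divaricatus rehder & e. h. wilson",
                      "rosa canina l.",
                      "ranunculus montanus willd.")

# Approach from the answer: capture the first two "word + whitespace" runs,
# drop everything after them, then trim the trailing whitespace
regex_names <- trimws(sub("^((\\w+\\s+){2}).*", "\\1", fullspeciesnames))

# Hypothetical alternative: split on whitespace and keep the first two pieces
split_names <- vapply(
  strsplit(fullspeciesnames, "\\s+"),
  function(parts) paste(head(parts, 2), collapse = " "),
  character(1)
)

print(regex_names)
print(split_names)
```

Both approaches give the same result on this data. One difference to be aware of: \\w does not match a hyphen, so if one of the first two words were hyphenated, the pattern would not match and sub would return that element unchanged, whereas the split-based variant only relies on whitespace.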