R - Trying to avoid a loop here -


first post ever here, i've been reading lot thanks!

i have huge dataframes many columns, 4 matter here:

dates/classes/names/grades. 

for each date, have several classes (with students), each several people (names - same people in respective classes), each 1 having 1 grade per date.

on first date, retrieve best student per class considering grade, using max[].
however, next dates, want following:

  • if previous best student still in top 3 of class, consider him still best one.
  • else, consider new 1st student best one.

hence, every date depends on previous one.

is possible without loop?
can't find out how, every iteration depends on previous one.

this code below. apologies if it's not optimized!

thanks lot :)

for (i in (1:(length(horizon)-1))) #horizon vector of dates {     uni3 <- dataaf[dataaf[,1] == as.numeric(horizon[i]),]     #dataaf contains data, keep date considered date      if (i == 1)                             #we take best student per class     {         selecdate <- data.frame()                             #selecdate dataframe containing best people date          (z in (1:15)    #15 classes         {             selecsec <- na.omit(uni3[uni3[,14] == z,])                 #classes column 14             ligneselec <- max(selecsec[,13])                          #grades column 13             selecsec <- data.frame(uni3[match(ligneselec,uni3[,13]),])             selecdate <- rbind(selecdate,selecsec)         }     }      else {              #we keep student if in previous top 3, else take best 1         selecdate <- data.frame()          (z in (1:15))         {             lastsec <- na.omit(lastdate[lastdate[,14] == z,])         #last results              #retrieving top 3 people date             selecsec <- na.omit(uni3[uni3[,14] == z,])             newligneselec <- tail(sort(selecsec[,13]),3)             selecsec <- data.frame(selecsec[rev(match(newligneselec,selecsec[,13])),])              if((length(match(selecsec[,3],lastsec[,3])[!is.na(match(selecsec[,3],lastsec[,3]))]) == 0))              {                 ligneselec <- max(selecsec[,13])                 selecsec <- data.frame(uni3[match(ligneselec,uni3[,13]),])             }              else              {                 selecsec <- lastsec             }               selecdate <- rbind(selecdate,selecsec)         }     }      lastdate <- selecdate #recording last results } 

edit : here example.

  • in date 1, john , audrey both selected in class 1 , 2.
  • on date 2, john still among best 3, remains selected, while audrey 4th jim (ranked 1st date 2) replaces her.
  • on date 3, john still among best 3, remains selected (no ties issues in data work on). jim 4th, sandra takes place.

    structure(list(dates = structure(c(1l, 1l, 1l, 1l, 1l, 1l, 1l, 1l, 1l, 1l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 2l, 3l, 3l, 3l, 3l, 3l, 3l, 3l, 3l, 3l, 3l), .label = c("12/02", "13/02", "14/02" ), class = "factor"), classes = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2 ), names = structure(c(6l, 3l, 9l, 7l, 1l, 8l, 4l, 10l, 5l, 2l, 6l, 3l, 9l, 7l, 1l, 8l, 4l, 10l, 5l, 2l, 6l, 3l, 9l, 7l, 1l, 8l, 4l, 10l, 5l, 2l), .label = c("ashley", "audrey", "bob", "denis", "jim", "john", "kim", "sandra", "terry", "tim"), class = "factor"), grades = c(10, 5, 3, 2, 1, 3, 4, 5, 6, 7, 8, 2, 10, 9, 1, 7, 5, 1, 8, 2, 5, 1, 4, 8, 8, 7, 6, 5, 4, 3)), .names = c("dates", "classes", "names", "grades"), row.names = c(na, -30l), class = "data.frame")

edited reflect clarified request in comments.

###---------- creating data (may different had in mind) # classes , students classes <- c("u.s. history", "english", "nonlinear optimization") students <- c("james", "jamie", "john", "jim", "jane", "jordan", "jose") df.1 <- expand.grid(classes = classes, students = students, stringsasfactors = t) # generate dates dates.seq <- seq(as.date("2017/2/10"), as.date("2017/3/27"), "days") df.2 <- merge(dates.seq, df.1) # generate grades grading <- c(4.0, 3.7, 3.3, 3.0, 2.7, 2.3, 2.0, 1.7) grades <- sample(grading, size = dim(df.2)[1], replace = t, prob = grading/sum(grading)) # smart students df <- data.frame(df.2, grades) colnames(df) <- c("dates","classes","students","grades")  # works assuming df has following labeled , formatted columns str(df) #'data.frame':  966 obs. of  4 variables: #  $ dates   : date, format: "2017-02-10" "2017-02-11" "2017-02-12" ... #  $ classes : factor w/ 3 levels "u.s. history",..: 1 1 1 1 1 1 1 1 1 1 ... #  $ students: factor w/ 7 levels "james","jamie",..: 1 1 1 1 1 1 1 1 1 1 ... #  $ grades  : num  2.3 3.3 2.3 3.3 2.7 4 4 1.7 2.3 4 ...  # no aggregateion, splitting classes df.split1 <- split(df, df[,"classes"]) # splitting each of lists dates df.split2 <- lapply(df.split1, function(x) split(x, x[,"dates"])) # double lapply becuase have lists within lists top1 <- lapply(df.split2, function(i) lapply(i, function(j) j[order(-j[,"grades"])[1], "students"])) top3 <- lapply(df.split2, function(i) lapply(i, function(j) j[order(-j[,"grades"])[1:3], "students"]))  # easier read allclasses <- levels(df[,"classes"]) alldates <- unique(df[,"dates"])  # initialize matrix keep track of changes in top1 , top3 superstar <- matrix(na, nrow = length(alldates), ncol = length(allclasses),                      dimnames = list(as.character(alldates), allclasses))  # looping for(date in 1:length(alldates)){   for(class in allclasses){     if(date == 1){        # first newtop1 = first top1        superstar[date, class] <- unlist(top1[[class]][date])     } else {       # if superstar in date-1 in top3 of date now,       if(superstar[date-1, class] %in% as.numeric(unlist(top3[[class]][date]))){         # still superstar         superstar[date,class] <- superstar[date-1, class]       } else{         # new superstar highest scorer of date         superstar[date,class] <- unlist(top1[[class]][date])       }     }   } } # painful me trying figure out how convert superstar numbers names worked superstar.char <- as.data.frame(matrix(levels(df[,"students"])[superstar], ncol = length(allclasses))) dimnames(superstar.char) <- dimnames(superstar) superstar.char # superstar students characters  

let me know if have difficulties!


Comments

Popular posts from this blog

javascript - Clear button on addentry page doesn't work -

c# - Selenium Authentication Popup preventing driver close or quit -

tensorflow when input_data MNIST_data , zlib.error: Error -3 while decompressing: invalid block type -