Grouping data in R based on specific column values


I have a set of data in a CSV file that I need to group based on the transitions of one column. I'm new to R, and I'm having trouble finding the right way to accomplish this.

Simplified version of the data:

    time    phase    pressure    speed
    1        0        0.015      0
    2       25        0.015      0
    3       25        0.234      0
    4       25        0.111      0
    5        0        0.567      0
    6        0        0.876      0
    7       75        0.234      0
    8       75        0.542      0
    9       75        0.543      0

In the real data, phase stays in each state for longer than shown above; I shortened it to make it readable, and the pattern continues on and on. I'm trying to calculate the mean of pressure and speed for each instance where phase is non-zero. For example, the output for the sample above would have two lines: one with the average of the three lines where phase is 25, and one with the average of the three lines where phase is 75. The same numeric value of phase can show up more than once, and I need to treat each occurrence separately. That is, if phase is 0, 0, 25, 25, 25, 0, 0, 0, 25, 25, 0, I need to record the first group and the second group of 25s as separate events, along with any other non-zero groups.

What I've tried:

    csv <- read.csv("c:\\test.csv")
    ins <- subset(csv,csv$phase == 25)
    exs <- subset(csv,csv$phase == 75)
    mean(ins$pressure)
    mean(exs$pressure)

This returns the average over the entire file for the rows where phase is 25 and where it is 75, but I need to somehow split those rows into groups using the trailing and leading 0s. Any help is appreciated.

Edited: based on feedback from the asker, they are seeking aggregations across runs of numbers (i.e. the first group of continuous 25s, the second group of continuous 25s, and so on). Because of that, I suggest using rle, the run-length encoding function, to get a group number that you can use in the aggregate command.

I've modified the original data so that it contains two runs of 25, for illustrative purposes, but the approach should work regardless. Using rle, we get the encoded runs of the data and then create a group number for each row. We do that by numbering the observed runs from 1 up to the total number of runs, and using the rep function to repeat each number for the length of its run.
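To make the grouping step concrete, here is a small sketch that applies rle to the phase sequence mentioned in the question (0, 0, 25, 25, 25, 0, 0, 0, 25, 25, 0). The variable names phase, runs and group_no are only illustrative.

```r
# Phase sequence taken from the question text
phase <- c(0, 0, 25, 25, 25, 0, 0, 0, 25, 25, 0)

# rle() collapses consecutive repeats into run lengths and run values
runs <- rle(phase)
runs$lengths  # 2 3 3 2 1
runs$values   # 0 25 0 25 0

# Number the runs 1..k, then repeat each number once per row in its run
group_no <- rep(seq_along(runs$lengths), times = runs$lengths)
group_no      # 1 1 2 2 2 3 3 3 4 4 5
```

The two runs of 25 end up with different group numbers (2 and 4), which is what lets them be aggregated as separate events.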

Once that is done, we can use the same basic aggregation command again.

    # Example data with two separate runs of phase 25
    df_example <- data.frame(time = 1:9,
                             phase = c(0,25,25,25,0,0,25,25,0),
                             pressure = c(0.015,0.015,0.234,0.111,0.567,0.876,0.234,0.542,0.543),
                             speed = rep(x = 0,times = 9))

    # Run-length encode phase, then give every row the number of its run
    encoded_runs <- rle(x = df_example$phase)
    df_example$group_no <- rep(x = 1:length(x = encoded_runs$lengths),
                               times = encoded_runs$lengths)

    # Average pressure and speed per run, keeping only non-zero phases
    aggregate(x = df_example[df_example$phase != 0,c("pressure","speed")],
              by = list(group_no = df_example[df_example$phase != 0,"group_no"],
                        phase = df_example[df_example$phase != 0,"phase"]),
              FUN = mean)

      group_no phase pressure speed
    1        2    25    0.120     0
    2        4    25    0.388     0
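The same approach can be applied to the sample data from the question, which has one run of 25 and one run of 75. This is only a sketch, assuming the data frame has the time, phase, pressure and speed columns shown there; the names df_question, nonzero and result are made up for illustration, and the formula form of aggregate is used here in place of the by = form above.

```r
# Sample data as given in the question (one run of 25, one run of 75)
df_question <- data.frame(
  time     = 1:9,
  phase    = c(0, 25, 25, 25, 0, 0, 75, 75, 75),
  pressure = c(0.015, 0.015, 0.234, 0.111, 0.567, 0.876, 0.234, 0.542, 0.543),
  speed    = rep(0, 9)
)

# Label each row with the number of the run it belongs to
runs <- rle(df_question$phase)
df_question$group_no <- rep(seq_along(runs$lengths), times = runs$lengths)

# Drop the zero-phase rows, then average pressure and speed per run
nonzero <- df_question[df_question$phase != 0, ]
result <- aggregate(cbind(pressure, speed) ~ group_no + phase,
                    data = nonzero, FUN = mean)
result
```

The result has one row per non-zero run, so a CSV read with read.csv could be passed through the same steps in place of df_question.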
