R - Anomaly Detection - Correlated Variables


I am working on an anomaly detection assignment in R. My dataset has around 30,000 records, of which around 200 are anomalous. It has around 30 columns, all of them quantitative. Some of the variables are highly correlated (~0.9). By anomaly I mean that some of the records have unusual (high/low) values in some column(s), while others have correlated variables that do not behave as expected. The example below should give some idea.

Suppose vehicle speed and heart rate are highly positively correlated. Usually vehicle speed varies between 40 and 60, while heart rate varies between 55 and 70.

time_s  steering      vehicle.speed  running.distance  heart_rate
0       -0.011734953  40             0.251867414       58
0.01    -0.011734953  50             0.251936555       61
0.02    -0.011734953  60             0.252005577       62
0.03    -0.011734953  60             0.252074778       90
0.04    -0.011734953  40             0.252074778       65

Here we have 2 types of anomalies. The 4th record has an exceptionally high value for heart_rate, while the 5th record seems okay if we look at the individual columns. But since heart_rate increases with speed, we would expect a lower heart rate for the 5th record, whereas it has a higher value.

I can identify the column-level anomalies using box plots etc., but I find it hard to identify the second type. I read somewhere about PCA-based anomaly detection, but I couldn't find an implementation of it in R.
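For reference, the column-level check mentioned above can be sketched with the box plot rule from base R. This is a minimal sketch on the two relevant columns of the example table; the helper name `flag_boxplot_outliers` is made up for illustration. It catches the 4th record but, as described, not the 5th.

```r
# Toy data: two columns copied from the example table in the question.
df <- data.frame(
  vehicle.speed = c(40, 50, 60, 60, 40),
  heart_rate    = c(58, 61, 62, 90, 65)
)

# Hypothetical helper: TRUE where a value lies beyond the box plot whiskers
# (the default 1.5 * IQR rule used by boxplot.stats()).
flag_boxplot_outliers <- function(x) {
  x %in% boxplot.stats(x)$out
}

# Logical matrix with one column of flags per variable.
flags <- sapply(df, flag_boxplot_outliers)

# Rows with at least one flagged column.
which(rowSums(flags) > 0)
```

Each column is checked in isolation, so a record whose values are individually in range but jointly inconsistent goes unnoticed.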

Could you please help me with PCA-based anomaly detection in R for this scenario? My Google searches mostly turn up time-series-related material, which is not what I am looking for.

Note: There is a similar implementation in Microsoft Azure Machine Learning, 'PCA Based Anomaly Detection for Credit Risk', which does the job, but I want to know the logic behind it and replicate the same in R.
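One common way to do PCA-based anomaly detection is to score each record by its reconstruction error: fit PCA on the standardized data, keep the leading components, rebuild each record from those components only, and treat a large residual as a sign that the record breaks the usual correlation structure. The sketch below uses base R's `prcomp()` on simulated data. The data, the 90% variance cut-off, and the choice to inspect the top 5 scores are all assumptions made for illustration, and this is not claimed to be the logic the Azure module uses.

```r
set.seed(42)  # simulated toy data, not the dataset from the question

n <- 1000
speed <- runif(n, 40, 60)
# Heart rate rises with speed plus noise, giving a correlation near 0.9.
heart_rate <- 55 + 0.75 * (speed - 40) + rnorm(n, sd = 2)
steering <- rnorm(n, sd = 0.01)
X <- cbind(vehicle.speed = speed, heart_rate = heart_rate, steering = steering)

# Inject the two anomaly types described in the question.
X[1, c("vehicle.speed", "heart_rate")] <- c(60, 90)  # extreme column value
X[2, c("vehicle.speed", "heart_rate")] <- c(40, 65)  # in range, but breaks the correlation

# PCA on centered and scaled variables.
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Assumption: keep the fewest components that explain at least 90% of the variance.
var_explained <- cumsum(pca$sdev^2) / sum(pca$sdev^2)
k <- which(var_explained >= 0.90)[1]

# Rebuild each record from the first k components, in the scaled space.
Z <- scale(X, center = pca$center, scale = pca$scale)
Z_hat <- pca$x[, 1:k, drop = FALSE] %*% t(pca$rotation[, 1:k, drop = FALSE])

# Anomaly score: squared reconstruction error per record.
recon_error <- rowSums((Z - Z_hat)^2)

# Records with the largest scores are the candidates to inspect.
top <- order(recon_error, decreasing = TRUE)[1:5]
top
```

The discarded low-variance components capture directions in which the correlated variables normally do not move, so a record like the second injected one stands out there even though each of its values is within the usual range. On real data the number of retained components and the score threshold would need tuning, for example against the roughly 200 known anomalies.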

