R - Anomaly Detection - Correlated Variables
I am working on an anomaly detection assignment in R. My dataset has around 30,000 records, of which around 200 are anomalous. It has around 30 columns, all quantitative. Some of the variables are highly correlated (~0.9). By anomaly I mean that some of the records have unusual (high/low) values in some column(s), while others have correlated variables that do not behave as expected. The example below gives the idea.
Suppose vehicle speed and heart rate are highly positively correlated. Vehicle speed usually varies between 40 and 60, while heart rate varies between 55 and 70.
time_s  steering      vehicle.speed  running.distance  heart_rate
0       -0.011734953  40             0.251867414       58
0.01    -0.011734953  50             0.251936555       61
0.02    -0.011734953  60             0.252005577       62
0.03    -0.011734953  60             0.252074778       90
0.04    -0.011734953  40             0.252074778       65
Here we have two types of anomalies. The 4th record has an exceptionally high value for heart_rate, while the 5th record seems okay if we look at the individual columns. But since heart_rate increases with speed, we expected a lower heart rate for the 5th record (its speed is at the low end, 40), while it has a higher value.
I can identify column-level anomalies using box plots etc., but I find it hard to identify the second type. Somewhere I read about PCA-based anomaly detection, but I couldn't find an implementation of it in R.
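For reference, here is a minimal sketch of the column-level check mentioned above, run on the five example rows. It uses base R's `boxplot.stats` (the standard box-plot whisker rule); the helper name `boxplot_outlier_rows` is made up for illustration. It picks up the first type of anomaly but, as expected, says nothing about the 5th record.

```r
# The five example rows from the question.
df <- data.frame(
  time_s           = c(0, 0.01, 0.02, 0.03, 0.04),
  steering         = rep(-0.011734953, 5),
  vehicle.speed    = c(40, 50, 60, 60, 40),
  running.distance = c(0.251867414, 0.251936555, 0.252005577,
                       0.252074778, 0.252074778),
  heart_rate       = c(58, 61, 62, 90, 65)
)

# Hypothetical helper: row indices whose value lies beyond the box-plot
# whiskers (coef times the spread between the hinges) for one column.
boxplot_outlier_rows <- function(x, coef = 1.5) {
  out <- boxplot.stats(x, coef = coef)$out
  which(x %in% out)
}

# Apply the check to every column independently.
flagged <- lapply(df, boxplot_outlier_rows)
print(flagged$heart_rate)      # row 4 (heart_rate = 90) is flagged
print(flagged$vehicle.speed)   # nothing flagged
```

Row 5 (speed 40, heart rate 65) passes every per-column check, because each value is ordinary on its own; only the combination is unusual.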
Could you please help me with PCA-based anomaly detection in R for this scenario? My Google searches keep turning up time-series-related material, which is not what I am looking for.
Note: There is a similar implementation in Microsoft Azure Machine Learning, 'PCA Based Anomaly Detection for Credit Risk', which does the job, but I want to know the logic behind it and replicate the same in R.
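One common way to frame PCA-based anomaly detection is reconstruction error: standardize the columns, keep the first few principal components, project each record onto them and back, and score each record by how much is lost. Records that break the usual correlation structure reconstruct poorly. The sketch below uses only base R (`prcomp`). It is an illustration under assumptions, not a confirmed replica of the Azure module: the data is simulated to mimic the speed/heart-rate example, and the function name `pca_anomaly_score` and the choice `k = 1` are hypothetical.

```r
set.seed(42)

# Simulated stand-in data (an assumption, not the real dataset):
# speed in 40..60, heart rate roughly 55..70 and strongly tied to speed.
n <- 1000
speed <- runif(n, 40, 60)
heart_rate <- 55 + 0.75 * (speed - 40) + rnorm(n, sd = 1)
sim <- data.frame(vehicle.speed = speed, heart_rate = heart_rate)

# Inject the two anomaly types from the question as rows 1001 and 1002:
# an extreme heart rate, and an in-range pair that breaks the correlation.
sim <- rbind(sim, data.frame(vehicle.speed = c(60, 40),
                             heart_rate    = c(90, 65)))

# Hypothetical helper: squared reconstruction error after keeping the
# first k principal components of the standardized data.
pca_anomaly_score <- function(X, k) {
  Z <- scale(X)                                   # center and scale columns
  pca <- prcomp(Z, center = FALSE, scale. = FALSE)
  V <- pca$rotation[, seq_len(k), drop = FALSE]   # loadings of retained PCs
  Z_hat <- Z %*% V %*% t(V)                       # project onto PCs and back
  rowSums((Z - Z_hat)^2)                          # per-row error
}

score <- pca_anomaly_score(sim, k = 1)

# The records with the largest error are the anomaly candidates.
top2 <- order(score, decreasing = TRUE)[1:2]
print(top2)
print(sim[top2, ])
```

On real data, constant columns (such as steering in the example rows) would have to be dropped before scaling, since their standard deviation is zero, and both `k` and the cut-off on the score would need tuning, for instance against the roughly 200 known anomalous records.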