DBSCAN anomaly detection slowness issue

Hi All,

I am trying to build anomaly detection code, but it takes a very long time to run. Please let me know how I can speed up the process.

Thanks in advance for your help.

### Please find the code below

## Model creation
dataFile <- read.csv("filepth/xx.txt", header = FALSE, sep = ";")

dataFile$TimeTag <- as.POSIXct(paste(dataFile$V1, dataFile$V2))

# for creating the model, train on the first 25000 rows

dataFile <- dataFile[1:25000, ]

# 80% of the sample size

smp_size <- floor(0.80 * nrow(dataFile))

# set the seed to make your partition reproducible

set.seed(123)
train_ind <- sample(seq_len(nrow(dataFile)), size = smp_size)

#df <- df[ -c(1, 3:6, 12) ]

train <- dataFile[train_ind, ]
test <- dataFile[-train_ind, ]

## training data consists of all 14 columns
train1 <- train[-c(1, 2, 15:16)]
## model creation; this model is loaded again at prediction time
res.fpc1 <- fpc::dbscan(train1, eps = 0.1, MinPts = 5, scale = FALSE, method = "hybrid")
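
The comment above says the model is loaded at prediction time, but the training script never writes it to disk. A minimal sketch of that round trip, assuming the `fpc` package is installed; `toy_train`, `toy_model`, and `model_path` are hypothetical names, and the data is synthetic rather than the real file:

```r
# Hypothetical stand-in for train1: two numeric columns of synthetic data
set.seed(123)
toy_train <- as.matrix(data.frame(a = rnorm(500, sd = 0.05),
                                  b = rnorm(500, sd = 0.05)))

# Same call shape as in the post
toy_model <- fpc::dbscan(toy_train, eps = 0.1, MinPts = 5,
                         scale = FALSE, method = "hybrid")

# Persist the fitted object; tempfile() is a placeholder for the real model path
model_path <- tempfile(fileext = ".rds")
saveRDS(toy_model, model_path)

# Later, in the prediction script: read the same object back
restored <- readRDS(model_path)
same_clusters <- identical(restored$cluster, toy_model$cluster)
```

The prediction step also needs the training data that the model was fitted on, so that data has to be available to the prediction script as well.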

## Prediction

library(shiny)
library(shinydashboard)
library(DT)
library(data.table)
library(shinyalert)
library(shinycssloaders)
library(stringr)
library(dygraphs)
library(datasets)
library(xts)
#library(mlbench)
library(caret)
library(dbscan)
library(fpc)
library(lubridate)

t1<-Sys.time()

dataFile <- read.csv("filepathxxxx",header = FALSE, sep = ";")

##training data
dataFile_train <- dataFile[1:25000, ]

dataFile$TimeTag <-as.POSIXct(paste(dataFile$V1,dataFile$V2))

newdata <- dataFile

dataFile1 <- dataFile[-c(1,2,15)]

# 80% of the sample size

smp_size <- floor(0.80 * nrow(dataFile))

# set the seed to make your partition reproducible

set.seed(123)
train_ind <- sample(seq_len(nrow(dataFile)), size = smp_size)

#df <- df[ -c(1, 3:6, 12) ]

## predicting for the 5th and 7th columns
dataFile_train <- dataFile_train[-c(1, 2, 3, 4, 6, 8:16)]
newdata1 <- newdata[-c(1, 2, 3, 4, 6, 8:16)]

## predicting for the 5th column only
#newdata1 <- newdata[-c(1:4, 6:16)]
#dataFile_train <- dataFile_train[-c(1:4, 6:16)]

# loading model
model2 <- readRDS("/filepath/xxx.rda")

## prediction
# use the model loaded above (res.fpc1 only exists in the training script)

y_pred <- predict(model2, dataFile_train, newdata = newdata1)

newdata1["Predictions"] = y_pred
newdata1["time"] = newdata[,"TimeTag"]

anomalyDataframe <- subset(newdata1, Predictions == 0)
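
One thing that might be worth timing against the `fpc` version is the `dbscan` package, which the script already loads and which documents a kd-tree based implementation. A minimal sketch, assuming the `dbscan` package is installed; `train_xy` and `new_xy` are hypothetical synthetic matrices standing in for the two selected columns, not the real file:

```r
library(dbscan)

set.seed(123)
# Synthetic stand-in for the training columns (hypothetical data)
train_xy <- matrix(rnorm(2000, sd = 0.05), ncol = 2)
# New points to score; the last row is deliberately far from everything else
new_xy <- rbind(matrix(rnorm(200, sd = 0.05), ncol = 2), c(5, 5))

t0 <- Sys.time()
# Note the argument is minPts here, not MinPts as in fpc::dbscan
fit <- dbscan::dbscan(train_xy, eps = 0.1, minPts = 5)
# predict() needs the original training data as well as the new points
pred <- predict(fit, newdata = new_xy, data = train_xy)
elapsed <- Sys.time() - t0

# As in the post, cluster id 0 marks noise, i.e. candidate anomalies
anomaly_rows <- which(pred == 0)
```

Because `fpc` is attached after `dbscan` in the script, a bare `dbscan()` call resolves to the `fpc` version, so the sketch uses the `dbscan::` prefix explicitly. Whether this is actually faster on the real data would need to be measured.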
