I have a dataset for an intrusion detection system stored in a CSV file. When I run my script, I get an error.

library(kernlab)

library(caret)
anomaly<-read.csv("D:/datasets/data/Dataset_Anomaly.csv", na.strings=c(".", "NA", "", "?"), strip.white=TRUE, encoding="UTF-8")
aRow<-nrow(anomaly)
aCol<-ncol(anomaly)

sub<-sample(1:aRow,floor(0.66*aRow))
anomalyTrainingSet<- anomaly[sub,]
anomalyTestSet<- anomaly[-sub,]
anomalyClassifier<- ksvm(AttackType~.,data=anomalyTrainingSet,type = 'C-svc', kernel = 'rbfdot')
Error in if ((type(ret) == "C-svc" || type(ret) == "nu-svc" || type(ret) == :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In .local(x, ...) : Variable(s) `' constant. Cannot scale data.
2: In .local(x, ...) : NAs introduced by coercion
anomalyPrediction<-predict(anomalyClassifier, anomalyTestSet[,-aCol])
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'object' in selecting a method for function 'predict': object 'anomalyClassifier' not found
confusionMatrix(anomalyPrediction,anomalyTestSet[,aCol] )
Error: object 'anomalyPrediction' not found

Your (first) error occurs in the call to ksvm(); the condition shown in the message appears to be code from inside that function. You have not posted enough detail for anyone to do real troubleshooting. (FYI, you should try to post a reprex, a "reproducible example", e.g. with some short example data, so others can run your code and reproduce the error.)
What the error message is telling you is that R is trying to evaluate the conditions (e.g. type(ret) == "C-svc") but somewhere there are missing values (NAs) instead of actual values (numbers or words). R cannot produce TRUE or FALSE from that: if type(ret) is NA, it cannot check whether it equals "C-svc". You need to check your data, or your arguments to ksvm().
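To illustrate the mechanism described above, here is a minimal base R sketch. The data frame `toy` and its columns are made up for illustration and are not taken from the poster's dataset. It shows that a comparison involving NA yields NA, that if() refuses an NA condition, and two quick checks that often reveal the cause.

```r
# Comparing NA with anything yields NA, not TRUE or FALSE
cmp <- NA == "C-svc"
print(cmp)

# if() cannot branch on NA; capture the error message instead of stopping
msg <- tryCatch(
  if (cmp) "match" else "no match",
  error = function(e) conditionMessage(e)
)
print(msg)

# Hypothetical toy data: one missing value and a character response column
toy <- data.frame(
  x = c(0.1, NA, 0.3),
  AttackType = c("Attack", "Normal", "Normal"),
  stringsAsFactors = FALSE
)

# Quick checks before modelling: missing values per column, and column classes
na_counts <- colSums(is.na(toy))
col_classes <- vapply(toy, function(col) class(col)[1], character(1))
print(na_counts)
print(col_classes)
```

Running the same two checks on anomalyTrainingSet would be a reasonable first step, since both missing values and an unexpected column class can surface as this kind of error.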

Dear,
Thank you for your notes.
Can I send the dataset to you by email?

Note:
This script worked correctly with the same dataset before I changed my PC and R version.

I also note that the error message contains

1: In .local(x, ...) : Variable(s) `' constant. Cannot scale data.
2: In .local(x, ...) : NAs introduced by coercion

If your data frame contains a column with constant values, try removing that column before processing the data.

Dear,
Is there a function in R to remove constant columns while reading a CSV file?

srv_count serror_rate srv_error_rate rerror_rate srv_rerror_rate same_srv_rate diff_srv_rate srv_diff_host_rate dst_host_count dst_host_srv_count dst_host_same_srv_rate dst_host_diff_srv_rate dst_host_same_src_port_rate dst_host_srv_diff_host_rate dst_host_serror_rate dst_host_srv_serror_rate dst_host_rerror_rate dst_host_srv_rerror_rate AttackType
0.001 0.05 0.1 0 0 0.05 0.1 0 0.218 0.001 0 0.095 0.096 0 0.096 0.1 0 0 Attack
0.001 0.071 0.1 0.029 0 0.014 0.057 0 0.255 0.001 0 0.002 0.003 0 0.002 0.1 0.001 0 Attack
0.003 0 0 0 0 0.1 0 0 0.031 0.255 0.1 0 0.003 0.004 0 0 0 0 Normal
0.013 0 0 0 0 0.1 0 0 0.235 0.255 0.1 0 0 0.001 0 0 0 0 Normal
0.008 0 0 0 0 0.1 0 0 0.021 0.156 0.1 0 0.005 0.004 0 0 0 0 Normal
0.023 0 0 0 0 0.1 0 0.009 0.166 0.255 0.1 0 0.001 0.002 0 0 0 0 Normal
0.003 0 0 0 0 0.1 0 0 0.072 0.072 0.1 0 0.001 0 0 0 0 0 Normal
0.001 0 0 0 0 0.1 0 0 0.098 0.018 0.01 0.005 0.001 0.011 0 0 0 0 Normal
0.001 0 0 0 0 0.1 0 0 0.039 0.255 0.1 0 0.003 0.004 0 0 0 0 Normal
0.013 0 0 0 0 0.1 0 0 0.025 0.255 0.1 0 0.004 0.004 0 0.001 0 0 Normal
0.011 0 0 0 0 0.1 0 0 0.255 0.255 0.1 0 0 0 0 0 0 0 Normal
0.008 0 0 0 0 0.1 0 0 0.049 0.255 0.1 0 0.004 0.002 0 0 0 0 Normal
0.001 0.1 0.1 0 0 0.1 0 0 0.214 0.001 0 0.095 0.096 0 0.096 0.1 0 0 Attack
0.001 0 0 0 0 0.1 0 0 0.033 0.017 0.009 0.012 0.003 0.012 0 0 0 0 Normal
0.005 0 0 0 0 0.1 0 0 0.043 0.255 0.1 0 0.002 0.004 0 0 0 0 Normal
0.315 0 0 0 0 0.1 0 0 0.147 0.002 0.001 0.002 0.001 0 0 0 0 0 Attack
0.001 0.004 0 0.089 0.1 0.001 0.099 0 0.255 0.001 0 0.067 0 0 0.002 0 0.06 0.1 Attack
0.001 0 0 0 0 0.033 0.067 0 0.035 0.012 0.011 0.011 0.003 0.017 0 0 0 0 Normal
0.001 0 0 0 0 0.1 0 0 0.073 0.045 0.062 0.007 0.001 0 0 0 0 0 Normal
0.014 0 0 0 0 0.1 0 0 0.049 0.255 0.1 0 0.002 0.004 0 0 0 0 Normal
0.014 0 0 0 0 0.1 0 0 0.041 0.132 0.1 0 0.002 0.004 0 0 0 0 Normal
0.005 0 0 0 0 0.1 0 0 0.044 0.255 0.1 0 0.002 0.005 0 0 0 0 Normal
0.017 0 0 0 0 0.1 0 0 0.255 0.255 0.1 0 0 0 0 0 0 0 Normal

It looks like none of your columns are constant but some have very few non-zero values. I suspect that when you sample the data to make a training set, you get a subset that does have at least one constant column. You can manually look for constant columns with the summary() function. It will show that the Min and Max are the same in the constant column. You can then remove the column using its column number. Here is an example where I remove the third column.

DF <- data.frame(A = rnorm(5), B = rnorm(5), C = 0, D = rnorm(5))
DF
#>            A          B C          D
#> 1 -1.4695138 -0.2189530 0 -0.9091177
#> 2 -1.0583674 -0.6116602 0 -0.2796064
#> 3 -0.2176584  0.7033546 0 -0.1457569
#> 4  0.7765660  1.5513289 0 -0.2933012
#> 5 -0.7435114  1.3171561 0  0.1938422
summary(DF)
#>        A                 B                 C           D          
#>  Min.   :-1.4695   Min.   :-0.6117   Min.   :0   Min.   :-0.9091  
#>  1st Qu.:-1.0584   1st Qu.:-0.2190   1st Qu.:0   1st Qu.:-0.2933  
#>  Median :-0.7435   Median : 0.7034   Median :0   Median :-0.2796  
#>  Mean   :-0.5425   Mean   : 0.5482   Mean   :0   Mean   :-0.2868  
#>  3rd Qu.:-0.2177   3rd Qu.: 1.3172   3rd Qu.:0   3rd Qu.:-0.1458  
#>  Max.   : 0.7766   Max.   : 1.5513   Max.   :0   Max.   : 0.1938
DF_new <- DF[, -3]
summary(DF_new)
#>        A                 B                 D          
#>  Min.   :-1.4695   Min.   :-0.6117   Min.   :-0.9091  
#>  1st Qu.:-1.0584   1st Qu.:-0.2190   1st Qu.:-0.2933  
#>  Median :-0.7435   Median : 0.7034   Median :-0.2796  
#>  Mean   :-0.5425   Mean   : 0.5482   Mean   :-0.2868  
#>  3rd Qu.:-0.2177   3rd Qu.: 1.3172   3rd Qu.:-0.1458  
#>  Max.   : 0.7766   Max.   : 1.5513   Max.   : 0.1938

Created on 2023-08-15 with reprex v2.0.2
You would need to do something similar with your data frame anomalyTrainingSet. There are many other ways to remove a column but that should be sufficient for this case.
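The manual steps above can also be sketched programmatically. The example below uses made-up data, and the names `toy`, `train`, `test`, and `is_constant` are illustrative rather than taken from the original script. It shows a column that varies in the full data but is constant in the training subset, and drops it from both the training and test sets so they keep the same predictors.

```r
# Hypothetical data: column C is non-zero in only one row
toy <- data.frame(
  A = c(0.2, 0.5, 0.1, 0.9, 0.4, 0.7),
  B = c(1.5, 0.3, 2.2, 0.8, 1.1, 0.6),
  C = c(0, 0, 0, 0, 0, 0.009),
  AttackType = factor(c("Attack", "Normal", "Normal", "Attack", "Normal", "Normal"))
)

# Fixed split for illustration; row 6 (the only non-zero C) lands in the test set
train_rows <- 1:4
train <- toy[train_rows, ]
test <- toy[-train_rows, ]

# TRUE when a column holds at most one distinct non-missing value
is_constant <- function(col) length(unique(col[!is.na(col)])) <= 1

# Check predictors only, so the response column is never dropped
predictors <- setdiff(names(train), "AttackType")
constant_cols <- predictors[vapply(train[predictors], is_constant, logical(1))]
print(constant_cols)
#> [1] "C"

# Drop the same columns from both sets so they keep identical predictors
keep <- setdiff(names(train), constant_cols)
train_clean <- train[, keep, drop = FALSE]
test_clean <- test[, keep, drop = FALSE]
print(names(train_clean))
#> [1] "A"          "B"          "AttackType"
```

Because the check runs on the training subset rather than on the full data, it would need to be repeated after every new random split.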

Dear,
The code and the dataset are at this link.

Could you help me run it?

I have been busy and will be busy for the next few hours. Please try what I suggested and report any specific problems so others can help you.

I had a look at the code and the dataset. The (first) problem is that AttackType is not categorical: it is not a factor. Add anomaly$AttackType <- factor(anomaly$AttackType) before sampling and creating the training and test sets. FYI: always read the documentation for a function; it's fairly clear that for type = "C-svc" the response variable (y) needs to be a factor.
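One possible explanation for why the script worked before the R upgrade: from R 4.0.0 onward, read.csv() defaults to stringsAsFactors = FALSE, so text columns arrive as character rather than factor. The thread does not state which R versions were involved, so treat this as an assumption. The sketch below uses a small inline CSV (`csv_text`, made up for illustration) in place of the real file and shows two ways to end up with a factor response.

```r
# Hypothetical CSV content standing in for the real file
csv_text <- "srv_count,serror_rate,AttackType
0.001,0.05,Attack
0.003,0,Normal
0.013,0,Normal"

# stringsAsFactors = FALSE is the default from R 4.0.0 onward; it is set
# explicitly here so the example behaves the same on older versions
anomaly <- read.csv(text = csv_text, stringsAsFactors = FALSE)
print(class(anomaly$AttackType))

# Option 1: convert the response explicitly (works in any R version)
anomaly$AttackType <- factor(anomaly$AttackType)
print(levels(anomaly$AttackType))

# Option 2: ask read.csv() to create factors while reading
anomaly2 <- read.csv(text = csv_text, stringsAsFactors = TRUE)
print(is.factor(anomaly2$AttackType))
```

Option 1 touches only the response column, while Option 2 converts every text column in the file, which may or may not be what you want.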

While this makes the code run, note that you still get the warning:

Warning message: In .local(x, ...) : Variable(s) `' constant. Cannot scale data.

Using the summary() approach by @FJCC will show you which columns have the same values throughout (meaning they are useless in terms of predicting or explaining anything). Always keep in mind that a warning is just that, a warning, and your code will still run. It is your responsibility to check what the warning is about and whether or not it's something you can ignore, or if you need to fix something.
