How to rank sample IDs in Random Forest?

Hi,
I am trying to rank sample IDs, instead of ranking variables, in a random forest. Can anyone here help me rank my sample IDs?
Thanks

What principle do you want to rank by ?

I have a file with, e.g., sample numbers 1, 2, 3, ... and variables A, B, C, ...
If we run a random forest with the feature importance parameter set to TRUE, it ranks the variables A, B, C according to their importance values. What I want to do instead is rank the sample numbers (1, 2, 3, ...) according to the values of the variables. My input file requires the sample numbers to be ranked on the basis of the variable values, and I want to work with random forest and neural network algorithms.

I have a file with sample IDs 1, 2, 3 to 150, the variables Sepal.Length, Sepal.Width, Petal.Length, Petal.Width, and the target variable Species (the iris dataset).

library(randomForest)
library(tidyverse)
(my_iris <- mutate(iris %>% as_tibble(),
                  sample_id = row_number()))
# # A tibble: 150 x 6
#    Sepal.Length Sepal.Width Petal.Length Petal.Width Species sample_id
#           <dbl>       <dbl>        <dbl>       <dbl> <fct>       <int>
#  1          5.1         3.5          1.4         0.2 setosa          1
#  2          4.9         3            1.4         0.2 setosa          2
#  3          4.7         3.2          1.3         0.2 setosa          3
#  4          4.6         3.1          1.5         0.2 setosa          4
#  5          5           3.6          1.4         0.2 setosa          5
#  6          5.4         3.9          1.7         0.4 setosa          6
#  7          4.6         3.4          1.4         0.3 setosa          7
#  8          5           3.4          1.5         0.2 setosa          8
#  9          4.4         2.9          1.4         0.2 setosa          9
# 10          4.9         3.1          1.5         0.1 setosa         10
# # ... with 140 more rows

(my_forest <- randomForest::randomForest(formula= Species ~ Sepal.Length + 
                                                             Sepal.Width + 
                                                             Petal.Length + 
                                                             Petal.Width ,
                                         data=my_iris,
                                        importance=TRUE))
# Call:
#   randomForest(formula = Species ~ Sepal.Length + Sepal.Width +      Petal.Length + Petal.Width, data = my_iris, importance = TRUE) 
# Type of random forest: classification
# Number of trees: 500
# No. of variables tried at each split: 2
# 
# OOB estimate of  error rate: 4.67%
# Confusion matrix:
#            setosa versicolor virginica class.error
# setosa         50          0         0        0.00
# versicolor      0         47         3        0.06
# virginica       0          4        46        0.08

my_forest$importance
#                   setosa    versicolor   virginica MeanDecreaseAccuracy MeanDecreaseGini
# Sepal.Length 0.029520409  0.0200755630 0.031317932          0.027222880         7.924639
# Sepal.Width  0.007755149 -0.0004422338 0.007087913          0.004987078         2.370104
# Petal.Length 0.309264772  0.2994399041 0.289524287          0.295677873        42.211894
# Petal.Width  0.352249068  0.3200053003 0.275227825          0.312119859        46.740186
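For reference, the variable ranking that importance=TRUE makes available can be read off by sorting one of the importance columns. This is only a minimal sketch: it refits a forest on the built-in iris data with an arbitrary seed (an assumption, so the numbers will not match the output above exactly), and it uses importance() with type = 1, which returns the mean decrease in accuracy scaled by its standard error by default, unlike the raw values stored in my_forest$importance.

```r
library(randomForest)

set.seed(42)  # assumption: arbitrary seed, used only to make the sketch reproducible
rf <- randomForest(Species ~ ., data = iris, importance = TRUE)

# type = 1 selects the permutation-based mean decrease in accuracy
imp <- importance(rf, type = 1)

# Sort the single importance column, most important variable first
var_rank <- sort(imp[, 1], decreasing = TRUE)
var_rank
```

This ranks the columns (variables), not the rows (samples), which is why it does not answer the question about sample IDs on its own.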

What do you want to do with this?

What I want to do here is rank the sample IDs. The RF will also give a classification model based on the differences between species, but in my file I want to rank the sample IDs. I have complex IDs in the Name column, and I want to rank them according to the variables present in the file (e.g. MACCS1, 2, 3, ...).
Here is a glimpse of my file, attached below.

[Image: Screenshot from 2020-06-24 00-46-19]

How should they be ranked?
You haven't really said anything definite about the 'algorithm' to apply.
Ranked by the first variable ascending, with ties broken by the second, and so on? Isn't that... arbitrary?
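The ordering described above (first variable ascending, ties broken by the second) needs no model at all. A minimal sketch on the iris data, where sample_id and rank are illustrative column names and the choice of Sepal.Length and Sepal.Width as the sort keys is exactly the kind of arbitrary decision in question:

```r
library(dplyr)

ranked <- iris %>%
  as_tibble() %>%
  mutate(sample_id = row_number()) %>%      # keep the original row position as the ID
  arrange(Sepal.Length, Sepal.Width) %>%    # first key ascending, ties broken by the second
  mutate(rank = row_number())               # rank 1 = smallest Sepal.Length

ranked %>% select(sample_id, rank, Sepal.Length, Sepal.Width)
```

Swapping the order of the keys, or wrapping one in desc(), gives a different but equally valid ranking.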

That is my question: how to rank them, as this is the prime requirement of my query. I am asking how to do it with a random forest.

You can rank them in a practically infinite number of ways.
Good luck. I'm going to depart the discussion.
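Of those many possible rankings, two can be derived from a fitted random forest itself. The sketch below is one possible reading of the request, not a method anyone in the thread proposed: it scores each sample by the fraction of out-of-bag votes it received for its true class (taken from the votes component of the fit) and by the proximity-based outlier measure returned by randomForest::outlier(), then sorts the samples so that the least confidently classified come first. The seed, the data frame sample_rank, and its column names are assumptions made for illustration.

```r
library(randomForest)

set.seed(42)  # assumption: arbitrary seed, used only to make the sketch reproducible
rf <- randomForest(Species ~ ., data = iris, proximity = TRUE)

# Out-of-bag vote fractions: one row per sample, one column per class
votes <- unclass(rf$votes)

# For each sample, pick the vote fraction of its true class
true_col  <- match(as.character(iris$Species), colnames(votes))
vote_true <- votes[cbind(seq_len(nrow(iris)), true_col)]

# Proximity-based outlier score (needs proximity = TRUE in the fit)
out_score <- outlier(rf)

sample_rank <- data.frame(
  sample_id = seq_len(nrow(iris)),
  vote_true = vote_true,
  outlier   = out_score
)

# Least confidently classified samples first; ties broken by larger outlier score
sample_rank <- sample_rank[order(sample_rank$vote_true, -sample_rank$outlier), ]
head(sample_rank)
```

Whether either score is the "right" ranking depends entirely on what the ranking is meant to express, which the thread never pins down.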

Thanks for your help!
