How can I divide numeric data into groups of 500 each, without sorting the data?

I have another question, as I am not able to get the desired output.

I have some numerical data. So far I have used the following code to get the data out of a text file. Now I want to divide this data into groups of 500 values each, without sorting the data.

These are some functions used to gather the data.

> library(stringr)
> library(readr)
> myFile = readLines(file.choose())
> myResult = list() 
> vars = c(which(str_detect(myFile, "^\\[.*\\]\\s*$") == T), length(myFile))
> for(i in 1:(length(vars)-1)){
+   myData = myFile[vars[i]:(vars[i+1]-1)]
+ #remove lines that are comments or blank
+ myData = myData[!str_detect(myData, "^\\s*#|^\\s*$")]
+ #if content is a list of variables, create them as a list
+ if(str_detect(myData[2],"=")){
+   content = str_split(myData[-1],"=")
+   result = lapply(lapply(content,"[",2), parse_guess)
+   names(result) = sapply(content,"[",1)
+ } else{
+   #if the content just a vector of data, extract it
+   result = parse_guess(myData[-1])
+ }
+ #create the variable as a list item and assign the content 
+ myResult[[str_remove_all(myData[1], "\\[\\]")]]=result
+ }
> myFile = myResult$`[specdata0]`
> myFile=myFile[1:(myResult$`[specchannel0]`$fRPMmean*4/myResult$`[specchannel0]`$dF)]
> View(myFile)



You are not doing this as part of a homework assignment, right? If you are, we need to know, as we have rules regarding that.

The question you asked has been answered in this post:

No, this is not homework. I am learning machine learning using R (self-study).

I had asked the same question before.

I got the following error.

Can't you simply divide by 500 and then use the quotient as your grouping value?
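A minimal sketch of that idea, assuming the data is an ordinary numeric vector. The vector `x`, the names `groupSize` and `grp`, and the length 1234 are made up for illustration and stand in for the real `myFile` data. Each value's position is integer-divided by the group size, so the values keep their original order.

```r
# Illustrative data: 1234 unsorted values (a stand-in for the real vector)
set.seed(1)
x <- runif(1234)

groupSize <- 500

# Integer-divide the zero-based position by the group size,
# then add 1 so that the group labels start at 1
grp <- (seq_along(x) - 1) %/% groupSize + 1

# Number of values per group: two full groups and one partial group
table(grp)
```

The last group simply holds whatever is left over (234 values here), and nothing is reordered because the label depends only on the position.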

Your error seems to show that your code is defining a function called 'creategroups', but numbers can't be the names of arguments of a function. It looks like you're trying to use 'function' to evaluate something using 'myfile' and '4', but that is not what 'function' does. What are you trying to accomplish with your code?
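To illustrate the difference between defining and calling a function, here is a small sketch. The code that produced the error is not shown in the thread, so the failing line below is only a guess at its shape, and `countGroups` is a hypothetical helper invented for this example.

```r
# Defining: argument names must be valid names, so a number is not allowed.
# parse() is used only to show that R rejects this line before running it.
bad <- try(parse(text = "createGroups = function(myFile, 4) {}"), silent = TRUE)
inherits(bad, "try-error")   # TRUE: the definition itself is a syntax error

# Define once with named arguments, then pass the actual values when calling.
countGroups <- function(x, nGroups) {   # hypothetical helper, illustration only
  ceiling(length(x) / nGroups)          # values per group if x is cut into nGroups pieces
}

countGroups(1:10, 4)   # the values 1:10 and 4 are supplied at call time
```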


I expanded my function:

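createGroups = function(nData, n, nPerGroup = F){
  if(nPerGroup){
    # n is the group size: fill complete groups, then one partial group
    nGroups = floor(nData / n)
    groups = c(rep(seq_len(nGroups), each = n), rep(nGroups + 1, nData %% n))
  } else {
    # n is the number of groups: give the leftover values to the first groups
    groups = c(rep(1:n, each = floor(nData / n)), seq_len(nData %% n))
  }
  sort(groups)
}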

You can now choose whether n is the number of groups or the number of values per group by setting the parameter nPerGroup to FALSE or TRUE, respectively.


#Split into three groups
createGroups(nData = 14, n = 3, nPerGroup = F)
[1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3

#Split into groups of three
createGroups(nData = 14, n = 3, nPerGroup = T)
[1] 1 1 1 2 2 2 3 3 3 4 4 4 5 5

In both cases, the function takes care of the leftover values when nData is not an exact multiple of n.
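Once a vector of group labels like the ones printed above exists, it can be applied to the data itself. The sketch below is one way to do that, assuming the data is a plain numeric vector. The values in `x` are invented for illustration, and the labels are built directly with `rep()` so that the example runs on its own.

```r
# Illustrative, unsorted data (14 values, as in the example calls above)
x <- c(9, 2, 7, 4, 8, 1, 6, 3, 5, 10, 14, 12, 11, 13)

# Labels for groups of three: 1 1 1 2 2 2 ... with a shorter last group
groups <- rep(seq_len(ceiling(length(x) / 3)), each = 3, length.out = length(x))

# Cut the data into a list of groups; values stay in their original order
parts <- split(x, groups)
parts[["1"]]    # first three values, unsorted: 9 2 7

# Per-group summary without reordering anything
groupMeans <- tapply(x, groups, mean)

# unsplit() puts the pieces back together in the original order
identical(unsplit(parts, groups), x)
```

Because the labels depend only on position, `split()` never sorts the values inside a group.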

But I tried some very simple code for dividing the data:

myFile = myResult$`[specdata0]`

#number of data points to be considered from complete set

myFile=myFile[1:n] #required number of data for analysis
myFile = as.numeric(myFile)

l=round(n/4) # number of data per group

#defining each group

