How can I divide numeric data into groups of 500 each, without sorting the data?

Hey People,

I have another question, as I am not able to get the desired output.

I have numerical data. So far I have used the following code to read the data from a text file. Now I want to divide this data into groups of 500 each, without sorting it.

Could someone help me?

Thank You,

These are some functions used to gather the data.

> library(stringr)
> library(readr)
> myFile = readLines(file.choose())
> myResult = list() 
> vars = c(which(str_detect(myFile, "^\\[.*\\]\\s*$") == T), length(myFile))
> for(i in 1:(length(vars)-1)){
+   myData = myFile[vars[i]:(vars[i+1]-1)]
+ #remove lines that are comments or blank
+ myData = myData[!str_detect(myData, "^\\s*#|^\\s*$")]
+ #if content is a list of variables, create them as a list
+ if(str_detect(myData[2],"=")){
+   content = str_split(myData[-1],"=")
+   result = lapply(lapply(content,"[",2), parse_guess)
+   names(result) = sapply(content,"[",1)
+ } else{
+   #if the content just a vector of data, extract it
+   result = parse_guess(myData[-1])
+ }
+ #create the variable as a list item and assign the content 
+ myResult[[str_remove_all(myData[1], "\\[\\]")]]=result
+ }
> myFile = myResult$`[specdata0]`
> myFile=myFile[1:(myResult$`[specchannel0]`$fRPMmean*4/myResult$`[specchannel0]`$dF)]
> View(myFile)



It seems a lot of people are asking the same question recently :)
You are not doing this as part of a homework assignment, right? If you are, we need to know, as we have rules regarding that.

The question you asked has been answered in this post:

Remember, it's always best to first search the forum for similar questions, as this will save you (and the people who answer the posts) time.

Hope this helps,

Hey Pj,

No, this is not homework. I am learning machine learning using R (self-study).

I had asked the same question before.

I got the following error.

Can't you simply divide by 500 and then use the quotient as your grouping value?
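As a minimal sketch of that idea, assuming the values sit in a plain numeric vector (the made-up vector `x` below stands in for `myFile`): divide each position, not each value, by the group size and round up. The data itself is never reordered.

```r
# Hypothetical stand-in for the real data: 1,234 unsorted values
set.seed(1)
x = runif(1234)

groupSize = 500

# Positions 1..500 -> group 1, 501..1000 -> group 2, and so on
groupId = ceiling(seq_along(x) / groupSize)

# Split the values by group; the original order is kept inside each group
groups = split(x, groupId)

# Number of values in each group
lengths(groups)
```

The last group simply holds whatever is left over when the length is not a multiple of 500.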

Your error seems to show that your code is defining a function called 'creategroups', but numbers can't be the names of arguments of a function. It looks like you're trying to use 'function' to evaluate something using 'myfile' and '4', but that is not what 'function' does. What are you trying to accomplish with your code?


I expanded my function:

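createGroups = function(nData, n, nPerGroup = F){
  if(nPerGroup){
    # n is the group size: fill groups of n values, the last one may be smaller
    groups = ceiling(seq_len(nData) / n)
  } else {
    # n is the number of groups: give each group floor(nData / n) values
    groups = rep(1:n, each = floor(nData / n))
    # spread any leftover values over the first groups, keeping the order
    if(nData %% n != 0){
      groups = sort(c(groups, 1:(nData %% n)))
    }
  }
  groups
}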

You can now choose whether n is the number of groups or the number of values per group by setting the parameter nPerGroup to FALSE or TRUE, respectively.


#Split into three groups
createGroups(nData = 14, n = 3, nPerGroup = F)
[1] 1 1 1 1 1 2 2 2 2 2 3 3 3 3

#Split into groups of three
createGroups(nData = 14, n = 3, nPerGroup = T)
[1] 1 1 1 2 2 2 3 3 3 4 4 4 5 5

In both cases, the function takes care of the grouping when exact matching of groups is not possible.
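To put such a vector of group labels to use, it can be passed to `table()` or `tapply()`. A small sketch with invented values `x`, and with the labels written out by hand to match the "groups of three" output above:

```r
# Invented data: 14 values in their original (unsorted) order
x = c(8, 3, 5, 9, 1, 7, 2, 6, 4, 10, 12, 11, 14, 13)

# Labels for groups of three, copied from the example output
labels = c(1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5)

# How many values ended up in each group
groupSizes = table(labels)

# One summary value per group, here the mean
groupMeans = tapply(x, labels, mean)

groupSizes
groupMeans
```

Because the labels only depend on position, the values in `x` stay in the order they were read.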

Hope this helps,

Thanks Pj,

But I tried some very simple code for dividing the data. Here it is:

myFile = myResult$`[specdata0]`

#number of data points to be considered from complete set

myFile=myFile[1:n] #required number of data for analysis
myFile = as.numeric(myFile)

l=round(n/4) # number of data per group

#defining each group
