Using a for loop to create new columns to a dataframe

Good day all. I have a table called [SPC] with just one column with column name as (Table name) and 20 rows, each row with its table. I am trying to create a for loop to add two new columns: count of rows for each table, count of column for each table.

I get count of rows for each table using:
nrow(sqlQuery(con,F[1]))
[1] 2500011

nrow(sqlQuery(con,F[2]))
[1] 384098
nrow(sqlQuery(con,F[3]))
[1] 1334394,

and get count of columns for each table as:

ncol(sqlQuery(con,F[1]))
[1] 15

where F[i] is the script that spools table for the ith element.

I have used this 'for loop', but still didnt work, kindly assist me.

for (i in 1:nrow(SPC)){
SPC_TABLE$Count_of_row= nrow(sqlQuery(con,F[i]));
SPC_TABLE$Count_of_column= ncol(sqlQuery(con,F[i]))}

view(SPC)

It is not clear to me what the object SPC_TABLE is. Perhaps this is what you want:

SPC$Count_of_row <- 0
SPC$Count_of_column <- 0

for (i in 1:nrow(SPC)){
  SPC[i, "Count_of_row"] = nrow(sqlQuery(con,F[i]));
  SPC[i, "Count_of_column"] = ncol(sqlQuery(con,F[i]))
}
1 Like

Here's an explanation of "why" FJCC's code should work.

With this line of code you provided:

SPC_TABLE$Count_of_row= nrow(sqlQuery(con,F[i]))

You're assigning the number of rows in the data.frame returned by sqlQuery to the entire Count_of_row column. So, each iteration of the loop is replacing the result from before. I'll provide some example data.

for (i in 1:nrow(SPC_TABLE)) {
  SPC_TABLE$Count_of_row <- nrow(tables[[i]])
  SPC_TABLE$Count_of_column <- ncol(tables[[i]])
  message("SPC_TABLE for iteration ", i)
  print(SPC_TABLE)
}
# SPC_TABLE for iteration 1
#   name Count_of_row Count_of_column
# 1    A            3               2
# 2    B            3               2
# 3    C            3               2
# SPC_TABLE for iteration 2
#   name Count_of_row Count_of_column
# 1    A            5               3
# 2    B            5               3
# 3    C            5               3
# SPC_TABLE for iteration 3
#   name Count_of_row Count_of_column
# 1    A            7               1
# 2    B            7               1
# 3    C            7               1

Like I said, this is what should work. To be sure, you'll need to provide a reprex:

sorry , itss SPC, and your solution worked for me. thank you

Thank you, this worked. I also want to add percentage of null to SPC using same format but it gives me a constant percentage for eachrow. please what could be the issue.

SPC_TABLE[i,"Percentage_of_null"]=100*length(which(is.na(sqlQuery(con,F[i]))))/nrow(sqlQuery(con,F[i]))*ncol(sqlQuery(con,F[i]));

The only problem I see with your code is that you probably want to divide the result of length() by the product of nrow() and ncol(). As written, you are dividing by nrow() and multiplying by ncol(). Try

SPC_TABLE[i,"Percentage_of_null"]=100*length(which(is.na(sqlQuery(con,F[i]))))/(nrow(sqlQuery(con,F[i]))*ncol(sqlQuery(con,F[i])));

That change would not fix your problem.
I also suggst that you store the result of sqlQuery(con,F[i]) so that you do not have to repeat that call.

tmp <- sqlQuery(con,F[i]) 
SPC_TABLE[i,"Percentage_of_null"]=100*length(which(is.na(tmp)))/(nrow(tmp)*ncol(tmp));

thank you very much. this was quite helpful. finally, I want to fetch SPC tables for all DB in the server. lets say they are about 10 dbs,
so I see it to work in this way.

firstly: I select a db(table_list) from the list of dbs, where p= select * from each db.
for(j in 1:nrow(DB_LIST)){TABLE_LIST[j]=sqlQuery(con,p[j])}

secondly:
run SPC_TABLE for each table in the db selected which you helped me out on yesterday.

finally:
Return SPC_ANALYTICS_DB= list(SPC_TABLE(j))

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.