Multiply column with other columns (data type problem)

mesisa · October 8, 2021, 10:18am

Hi!

I have a large matrix (stored as a data frame). I want to multiply each value of the variables (here: V1 to V10) row-wise by a weight (here: key).

Here are some sample data:

library(tidyverse)

mx = matrix(data = c("abc", 0,1,2,3,4,5,6,7,8,9), nrow = 11, ncol = 11) # create some matrix
key = c(NA, 1,1,1,0.5,1,0.3,0.7,1,1,0.5) # define the weights
test = as.data.frame(cbind(mx, key)) # put the matrix and the weights together

In my real data, there is some text above and to the left of the numbers (there are several rows and columns of names; I didn't create the data).

The main problem is that I have mixed data types and everything is formatted as character. That's why my code to calculate the new values doesn't work:

test = test$key * test[,1:10]

# or just
test = test$key * test

Is there a trick for dealing with different data types? I tried to select only the parts of the columns that should be numeric, but they remain formatted as character:

test[2:11,] = sapply(test[2:11,], as.numeric)

# or with:
test[2:11,]  = numeric()

I hope I described my problem clearly enough.
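For context, here is a minimal sketch of why the partial conversion above has no visible effect. It uses a made-up one-column data frame `df` (not the thread's data) and assumes base R only: a data frame column holds a single type, so numbers written into some rows of a character column are coerced straight back to character.

```r
# Hypothetical toy data: one header cell followed by numbers stored as text.
df <- data.frame(V1 = c("abc", "0", "1"), stringsAsFactors = FALSE)

# Writing numerics into rows 2:3 of a character column coerces them back.
df[2:3, "V1"] <- as.numeric(df[2:3, "V1"])
class(df$V1)

# Converting a copy without the header cell gives a genuine numeric vector.
vals <- as.numeric(df$V1[-1])
class(vals)
```

As long as the text cell shares a column with the numbers, the column as a whole stays character, which is why the numeric part has to be handled separately.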

nirgrahamuk · October 8, 2021, 10:57am

I think you need to do your work in a fundamentally different way.
Numeric data should be numeric, and character information should be character.
Mashing everything together as character and expecting to compute on it numerically with ease is naive. Separate your data by type at the earliest opportunity.
I don't understand whether the data reached you in this integrated form or whether you decided to integrate it. If the latter, just rethink that choice.
If the former, you would untangle the data something like this:

library(tidyverse)

(test <- tibble::tribble(
  ~V1,   ~V2,   ~V3,   ~V4,   ~V5,   ~V6,   ~V7,   ~V8,   ~V9,  ~V10,  ~V11,  ~key,
  "abc", "abc", "abc", "abc", "abc", "abc", "abc", "abc", "abc", "abc", "abc",    NA,
  "0",   "0",   "0",   "0",   "0",   "0",   "0",   "0",   "0",   "0",   "0",   "1",
  "1",   "1",   "1",   "1",   "1",   "1",   "1",   "1",   "1",   "1",   "1",   "1",
  "2",   "2",   "2",   "2",   "2",   "2",   "2",   "2",   "2",   "2",   "2",   "1",
  "3",   "3",   "3",   "3",   "3",   "3",   "3",   "3",   "3",   "3",   "3", "0.5",
  "4",   "4",   "4",   "4",   "4",   "4",   "4",   "4",   "4",   "4",   "4",   "1",
  "5",   "5",   "5",   "5",   "5",   "5",   "5",   "5",   "5",   "5",   "5", "0.3",
  "6",   "6",   "6",   "6",   "6",   "6",   "6",   "6",   "6",   "6",   "6", "0.7",
  "7",   "7",   "7",   "7",   "7",   "7",   "7",   "7",   "7",   "7",   "7",   "1",
  "8",   "8",   "8",   "8",   "8",   "8",   "8",   "8",   "8",   "8",   "8",   "1",
  "9",   "9",   "9",   "9",   "9",   "9",   "9",   "9",   "9",   "9",   "9", "0.5"
))

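# Record each row's original position.
test_2 <- mutate(test,
                 rn=row_number())
# Text rows have no weight (key is NA); keep them as they are.
(chartop <- filter(test_2,is.na(key)))
# Rows with a weight hold numbers stored as text; convert them to numeric.
(numbottom <- filter(test_2,!is.na(key)) %>% 
    mutate(across(where(is.character),as.numeric)))

# Multiply every V column by the weight in the same row.
(comput_bottom <- mutate(numbottom,
                         across(starts_with("V"),~.*key)))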
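The `across()` call multiplies each V column element-wise by `key`, and that is what makes the weighting row-wise. The same idea can be sketched in base R on a small made-up matrix (`vals` and `weights` are illustrative names and numbers, not the thread's data), assuming there is exactly one weight per row:

```r
vals    <- matrix(1:6, nrow = 3)   # 3 rows, 2 columns
weights <- c(1, 0.5, 2)            # one weight per row

# R stores matrices column by column, so a vector with one entry per row
# is recycled down each column: row i is multiplied by weights[i].
by_recycling <- vals * weights

# sweep() states the same operation explicitly (margin 1 = rows).
by_sweep <- sweep(vals, 1, weights, `*`)
```

Both results should agree. The recycling form is shorter, but it silently gives wrong answers if the weight vector does not match the number of rows, so `sweep()` may be the safer choice on real data.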

mesisa · October 8, 2021, 11:20am

Thank you for your response. Unfortunately, I received the data in this form, and in the end it has to look the same again (back in Excel).

I was also thinking about dividing the data frame into two parts.
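A rough base R sketch of that split-and-recombine idea, using the sample data from the first post. It assumes R 4.0 or later, that the text rows are exactly those where `key` is NA, and that they sit at the top of the table; `is_header`, `vcols`, `header`, `body` and `result` are names made up for this illustration. It weights every V column, as in the reply above.

```r
mx   <- matrix(c("abc", 0:9), nrow = 11, ncol = 11)
key  <- c(NA, 1, 1, 1, 0.5, 1, 0.3, 0.7, 1, 1, 0.5)
test <- as.data.frame(cbind(mx, key), stringsAsFactors = FALSE)

is_header <- is.na(test$key)                      # rows without a weight
vcols     <- grep("^V", names(test), value = TRUE)

header <- test[is_header, ]    # text part, left untouched
body   <- test[!is_header, ]   # numeric part, still stored as character

# Weight the numeric part, then store it as text again so that it can
# share columns with the header rows.
weights <- as.numeric(body$key)
body[vcols] <- lapply(body[vcols],
                      function(col) as.character(as.numeric(col) * weights))

# Stack the parts back together (header rows first, as assumed above).
result <- rbind(header, body)
```

`result` has the same dimensions and column names as `test`, so it can be exported the same way the original table was. If the text rows are not all at the top, a row index would be needed to restore the order after `rbind()`.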
