Combining multiple columns in Excel

Hi,
I'm trying to combine some columns together in an excel sheet, but I'm not sure what package to use or even where to begin. Pretty new to R.

Here's an example:

Sheet 1
Col1 Col2 Col3
John Smith
Betty Tess Wiggins
Rick Led

Desired Output:
Col1
John Smith
Betty Tess Wiggins
Rick Led

Sheet 2- 7 would all be in a similar scenario.

Ideally it would be three columns, but some people have long names that can go up to five columns. I want to combine everything that falls within a set parameter and delete the old columns if possible. Any guidance or hints would be appreciated.

I think we need a FAQ: How to do a minimal reproducible example ( reprex ) for beginners

A handy way to supply some sample data is the dput() function. In the case of a large dataset something like dput(head(mydata, 100)) should supply the data we need.

This is an actual output of the data from excel. I'm not running any function to combine them, just reading the data. Let me know if this helps or if this is just more confusing.


#[[2]] # This is sheet two, but the other outputs go up to seven 
#[[2]][[1]]
# A tibble: 89 × 20
#   Contact     `Contact Compa…` Office `Contact Title` `Primary Pract…` `Product(s)...6`
#   <chr>       <chr>            <chr>  <chr>           <chr>            <chr>            <chr>                  
# 1 Tucker Strange… M&M    Dallas… Managing Direc… Brokerage        Office                        
# 2 Alex PlayMusic… M&M    Dallas Senior Vice Pr… Brokerage               Office                      
# 3 Travis Carlz… M&M    San Anton… Senior Vice Pr… Brokerage            Office 
# 4 Alex Whov… M&M    Dallas… Associate Gene… Legal                        
# 5 Alison Mifflin… M&M    Dallas Client Operati… Client Services               
# 6 Andrea Skirt… M&M    Dallas… Client Relatio… Client Services                
# 7 Andrew Buzz… M&M    Chicage … Vice President  Brokerage        Office           
# 8 Andy Havard… M&M    Dallas Programs Manag… Operations                                
# … with 79 more rows, and 11 more variables: `Service Line(s)...10` <chr>, Email <chr>, `Team 2` <lgl>, 

Thank you for the advice also :slight_smile: not used to posting too much

Okay we have the data in R. I was not sure.

I think I more or less understand what you want to do but I think we need to have some usable data.

The easiest way to supply that tibble in a format we on the forum can use is to use the dput() function. If your tibble is called mydata simply issue the command

dput(mydata)

then copy and paste the output into a reply on this thread. In the case of a large dataset something like dput(head(mydata, 100)) should supply the data we need.

Was having issues with dput(head(filename,100)). But I was able to cut down on the info, also had to change some field for anonymity reasons, but was this more of what you were looking for. And going forward should I always just use dput when asking a question?

dput(all_file_contents))

    Company = c("M&M", "M&M", "M&M", 
    "M&M", "M&M", "M&M",
    "M&M", "M&M", "M&M","), Office = c("Lake Els", 
    "Lake Els", "Lake Els", "Lake Els","Lake Els","Lake Els",
    "Lake Els","Lake Els","Lake Els","Lake Els","Lake Els"), 
     Contact Title = c("Senior Managing Director Capital Markets", 
    "Executive Managing Director Investments", "Senior Managing Director Capital Markets", 
    "Senior Managing Director Investments", "Senior Managing Director Investments", 
    "Senior Managing Director Investments", "Senior Managing Director Investments"), 
     Product(s)...7` = c("Multifamily Office Industrial Retail", 
    "Multifamily Retail", "Multifamily Office Industrial Retail Senior Housing Hospitality", 
    "Multifamily Retail Office Net Lease Industrial", "Retail"), 
    Service Line(s)...8` = c("Capital Markets", 
    "Landlord Representation", "Capital Markets", "Landlord Representation", 
    NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA), 
    Product(s)...10` = c(NA, NA, NA, NA, NA, 
    NA, NA, "Net Lease", NA, "Industrial", NA, NA, NA, NA, NA), 
    Service Line(s)...11` = c("Multifamily Office Industrial Retail", 
    "Multifamily Retail", "Multifamily Office Industrial Retail Senior Housing Hospitality", 
    "Multifamily Retail Office Net Lease Industrial", "Retail", 
    "Industrial", "Multifamily", "Retail", "Multifamily", "Industrial", 
    "Multifamily", "Multifamily", "Multifamily", "Industrial", 
    "Multifamily/Land", "Multifamily", "Multifamily", "Multifamily", 
    Email = c("Danny.Abergel@marcusmillichap.com", 
    "Jeff.Lotus@mandm.com", "Shar.Panther@mandm.com", 
    "Earle.Miner@mandm.com", "Brad.Stone@mandm.com"), 
     Work Phone = c("(666) 666-6666", 
    "(666) 666-6666", "(666) 666-6666","(666) 666-6666",
     "(666) 666-6666","(666) 666-6666","(666) 666-6666"))), row.names = c(NA, 
    -312L), class = c("tbl_df", "tbl", "data.frame"))))

It almost is but it looks like you missed a bit of the output at the top when you copied. You are missing the structure(list etc.

Here is a tiny example that will show you the layout.

dput(dat1)
structure(list(aa = c(1L, 4L, 7L, 10L, 13L), bb = c(2L, 5L, NA, 
11L, 14L), cc = c(3L, 6L, 9L, 12L, NA)), row.names = c(NA, 5L
), class = "data.frame")

To reconstitute this as a working data.frame we simply do this


dat2  <- structure(list(aa = c(1L, 4L, 7L, 10L, 13L), bb = c(2L, 5L, NA, 
11L, 14L), cc = c(3L, 6L, 9L, 12L, NA)), row.names = c(NA, 5L
), class = "data.frame")

and run it

Thanks! I think I got a better idea of how to present sample data for questions in the future.

Also I was able to figure out how to merge multiple columns of numerous data frames. For anyone wondering here's the code:

# Make list of your data frames
personnel_list <- list(cr,hr,jlr,kiddr,newr,str)
# List columns to merge by
merged_col <- c("Company","Contact Name","Title","Office",
                "Primary Practice","Service Line",
                "Email","Work Phone","Product(s)")
# Merge by full join
mysheet <- personnel_list %>% reduce(full_join, by = merged_col)

This topic was automatically closed 21 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.