I am brand new to this forum (and to R), so please let me know if you need any more information.
I have a dataset of about 45K observations of phytoplankton species in several lakes, covering 245 different genera. Each sample has the genus and the common name. What I am trying to do is create a list of the common names for each of the 245 unique genera.
I have tried extracting the unique genus values and then matching them against the common names, but every value I get back is NA.
gen.com <- phyto.new[c("Genera", "Common_Name")]
gen <- unique(gen.com$Genera)
match(gen,gen.com$Common_Name)
[1] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[66] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[131] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
[196] NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA NA
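One likely reason for the all-NA result, sketched here under the assumption that no genus name also appears as a common name: `match(x, table)` returns the position of each element of `x` within `table`, so looking genus names up in the `Common_Name` column finds nothing. The example below uses a small made-up stand-in for `phyto.new` (the values are invented for illustration) and shows two possible ways to build the genus-to-common-name list.

```r
# Toy stand-in for phyto.new; the values are made up for illustration.
phyto.new <- data.frame(
  Genera      = c("Anabaena", "Asterionella", "Anabaena", "Chlamydomonas"),
  Common_Name = c("Blue-green algae", "Diatom", "Blue-green algae", "Green algae"),
  stringsAsFactors = FALSE
)

gen.com <- phyto.new[c("Genera", "Common_Name")]
gen <- unique(gen.com$Genera)

# Look each genus up in the Genera column (not Common_Name) to get the row
# of its first occurrence, then use those rows to pull the common names.
idx <- match(gen, gen.com$Genera)
lookup <- data.frame(Genera = gen, Common_Name = gen.com$Common_Name[idx],
                     stringsAsFactors = FALSE)

# Alternative: keep one row per distinct Genera/Common_Name pair.
lookup2 <- unique(gen.com)
```

If a genus is recorded with more than one common name, `unique(gen.com)` keeps a row for each pairing, while the `match()` version keeps only the first one it finds.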
It would be better if you gave us some sample data we can work with using dput() to make it easily readable, and then show us what you have tried, the output that is wrong and say what output you want.
Thanks so much for your quick reply. I have edited my original post - not sure how to use the dput() function, but hopefully my end goal is clearer. I put what I have tried and what I got, and what I ultimately want.
You've just shared a lot of repeated NA values... there's no information content.
If you have a data frame, let's say it's called 'mydataframe',
and let's say you don't want to share all observations, just the top 100 (at the head),
then you use dput like this:
A very simple and very effective way to supply some data is to use the dput() command.
dput(mydata)
and then simply copy the output and paste it here. If you have a very large data set then a sample should be fine. To supply us with 100 rows of your data set do
dput(head(mydata, 100))
where mydata is the name of your data.frame or tibble.
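As a rough illustration of what this looks like in practice, here is a minimal sketch using a small made-up data frame (the name `toy` and its values are invented for the example and are not the original poster's data). `dput()` prints R code that rebuilds the object, which is why pasted `dput()` output can be run directly by anyone reading the thread.

```r
# Made-up stand-in data; replace `toy` with your own data frame.
toy <- data.frame(
  Genera      = c("Anabaena", "Asterionella", "Anabaena"),
  Common_Name = c("Blue-green algae", "Diatom", "Blue-green algae"),
  stringsAsFactors = FALSE
)

# Print a text representation of the first two rows to the console.
# This printed text is what you would copy and paste into a post.
dput(head(toy, 2))

# A reader can rebuild the object from that text. Here the round trip is
# simulated by capturing the printed text and evaluating it again.
txt <- paste(capture.output(dput(head(toy, 2))), collapse = "\n")
rebuilt <- eval(parse(text = txt))
```

In a real post, only the text printed by `dput()` needs to be pasted; the capture-and-evaluate step is there just to show that the text is enough to recreate the data.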