Hello, I am trying to align two sequences using the command
pairwiseAlignment(pattern = Seq2, subject = Seq1)
but I get an error message:
Error in .Call2("XStringSet_align_pairwiseAlignment", pattern, subject, :
key 0 not in lookup table
Hi @Nabi1! Welcome!
I'm afraid you'll need to supply some more info in order for helpers to be able to understand your problem. Most error messages are difficult to understand on their own without seeing all of the code that they depend on (not just the one line that threw the error), and it's not clear from the above code what package(s) you're using.
Can you try making a self-contained reproducible example that demonstrates your problem? This should be complete, runnable code with all the necessary library() calls and some sample data that demonstrates your problem. For more specific info on how to do that, see the link! You might also want to take a look at this: FAQ: Tips for writing R-related questions
I am trying to align two sequences, which I call Seq1 and Seq2.
I used the libraries:
#Read prok.fasta into a new variable
prokaryotes <- read.fasta(file = "prok.fasta", seqtype = "DNA")
#Read first sequence
Seq1 <- as.character(prokaryotes[])
Seq1 = paste(Seq1, collapse="")
#Read second sequence
Seq2 <- as.character(prokaryotes[])
Seq2 = paste(Seq2, collapse="")
#Align Seq2 and Seq1 using default libraries
pairwiseAlignment(pattern=Seq2, subject=Seq1)
#Error message
Error in .Call2("XStringSet_align_pairwiseAlignment", pattern, subject, :
key 0 not in lookup table
Below are my sequences as saved in my working directory, but converted to string form as seen above
ATGATAACCCTGACCTACCGCATCGAAACGCCCGGCAGCGTCGAGACGATGGCGGACAAGATCGCCAGCG ACCAGTCGACCGGAACCTTCGTGCCGGTTCCCGGCGAGACGGAAGAGCTGAAATCGCGCGTCGCCGCCCG GGTTTTGGCGATCCGCCCGCTCGAGAATGCGCGCCATCCGACCTGGCCCGAGTCCGCGCCCGACACGCTG CTCCACCGCGCCGACGTCGACATTGCCTTCCCTCTGGAGGCGATCGGCACAGATCTCTCGGCGCTGATGA CCATCGCGATCGGCGGCGTCTATTCGATCAAGGGCATGACCGGCATCCGCATCGTCGACATGAAGCTGCC CGAAGCTTTCCGGAGCGCCCATCCCGGGCCGCAATTCGGCATAGCGGGCAGCCGCCGCCTCACCGGCGTC GAGGGCCGCCCGATCATCGGCACGATCGTCAAGCCGGCACTGGGGCTGAGGCCGCACGAGACGGCGGAAC TCGTCGGCGAATTGATTGGGTCGGGCGTCGACTTCATCAAGGACGATGAGAAGCTGATGAGCCCGGCCTA TTCGCCGCTCAAGGAGCGCGTCGCCGCGATCATGCCGCGCATTCTCGATCACGAGCAGAAGACCGGCAAG AAGGTCATGTATGCCTTCGGCATCTCGCATGCCGATCCCGACGAGATGATGCGCAACCACGATATCGTCG CTGCGGCCGGCGGCAATTGCGCCGTCGTCAATATCAATTCGATCGGCTTCGGCGGCATGAGCTTCCTGCG CAAGCGCTCCAGCCTGGTGCTGCATGCGCATCGCAACGGCTGGGATGTGCTGACGCGCGATCCGGGCGCC GGCATGGATTTCAAGGTCTATCAGCAGTTCTGGCGGCTGCTCGGCGTCGACCAGTTCCAGATCAACGGCA TCAGAATCAAATATTGGGAGCCGGACGAGAGCTTCGTCTCTTCCTTCAAGGCCGTCAGCACGCCGCTCTT CGATGCCGCCGATTGCCCGCTTCCGGTCGCGGGCTCCGGCCAGTGGGGCGGGCAGGCGCCGGAGACCTAC GAGCGCACCGGCCGCACCATCGATCTTCTCTATCTCTGCGGCGGCGGCATCGTCAGCCATCCCGGCGGTC CTGCTGCCGGCGTGCGCGCCGTGCAGCAGGCCTGGCAGGCGGCGGTCGCCGGCATTCCGCTGGAGGTCTA TGCCAAGGATCATCCGGAGCTTGCCGCCTCGATTGCCAAATTCAGCGACGGCAAGGGCGCGTGA
ATGCCCAAGACGCAATCTGCCGCAGGCTATAAGGCCGGGGTGAAGGACTACAAACTCACCTATTACACCC CCGATTACACCCCCAAAGACACTGACCTGCTGGCGGCTTTCCGCTTCAGCCCTCAGCCGGGTGTCCCTGC TGACGAAGCTGGTGCGGCGATCGCGGCTGAATCTTCGACCGGTACCTGGACCACCGTGTGGACCGACTTG CTGACCGACATGGATCGGTACAAAGGCAAGTGCTACCACATCGAGCCGGTGCAAGGCGAAGAGAACTCCT ACTTTGCGTTCATCGCTTACCCGCTCGACCTGTTTGAAGAAGGGTCGGTCACCAACATCCTGACCTCGAT CGTCGGTAACGTGTTTGGCTTCAAAGCTATCCGTTCGCTGCGTCTGGAAGACATCCGCTTCCCCGTCGCC TTGGTCAAAACCTTCCAAGGTCCTCCCCACGGTATCCAAGTCGAGCGCGACCTGCTGAACAAGTACGGCC GTCCGATGCTGGGTTGCACGATCAAACCAAAACTCGGTCTGTCGGCGAAAAACTACGGTCGTGCCGTCTA CGAATGTCTGCGCGGCGGTCTGGACTTCACCAAAGACGACGAAAACATCAACTCGCAGCCGTTCCAACGC TGGCGCGATCGCTTCCTGTTTGTGGCTGATGCAATCCACAAATCGCAAGCAGAAACCGGTGAAATCAAAG GTCACTACCTGAACGTGACCGCGCCGACCTGCGAAGAAATGATGAAACGGGCTGAGTTCGCTAAAGAACT CGGCATGCCGATCATCATGCATGACTTCTTGACGGCTGGTTTCACCGCCAACACCACCTTGGCAAAATGG TGCCGCGACAACGGCGTCCTGCTGCACATCCACCGTGCAATGCACGCGGTGATCGACCGTCAGCGTAACC ACGGGATTCACTTCCGTGTCTTGGCCAAGTGTTTGCGTCTGTCCGGTGGTGACCACCTCCACTCCGGCAC CGTCGTCGGCAAACTGGAAGGCGACAAAGCTTCGACCTTGGGCTTTGTTGACTTGATGCGCGAAGACCAC ATCGAAGCTGACCGCAGCCGTGGGGTCTTCTTCACCCAAGATTGGGCGTCGATGCCGGGCGTGCTGCCGG TTGCTTCCGGTGGTATCCACGTGTGGCACATGCCCGCACTGGTGGAAATCTTCGGTGATGACTCCGTTCT CCAGTTCGGTGGCGGCACCTTGGGTCACCCCTGGGGTAATGCTCCTGGTGCAACCGCGAACCGTGTTGCC TTGGAAGCTTGCGTCCAAGCTCGGAACGAAGGTCGCGACCTCTACCGTGAAGGCGGCGACATCCTTCGTG AAGCTGGCAAGTGGTCGCCTGAACTGGCTGCTGCCCTCGACCTCTGGAAAGAGATCAAGTTCGAATTCGA AACGATGGACAAGCTCTAA
How do I resolve this?
From the documentation for pairwiseAlignment, subject needs to be an
XString object (http://web.mit.edu/~r/current/arch/i386_linux26/lib/R/library/Biostrings/html/XString-class.html). It's hard to know for sure whether this will work without seeing your input data, but from looking at your sequences, I assume they're DNA, so you might be able to convert your sequences into
XString objects using:
library(Biostrings)
Seq1 <- DNAString(prokaryotes[])
Seq2 <- DNAString(prokaryotes[])
pairwiseAlignment(pattern = Seq2, subject = Seq1)
From the documentation, it appears there are a lot of optional arguments you can pass to
pairwiseAlignment that affect analysis of your sequences, which I won't pretend to understand, but might be pertinent so be sure to have a read.
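For illustration, here is a minimal sketch of that conversion using two short made-up sequences instead of the poster's data. It assumes the Biostrings package is installed from Bioconductor. In recent Bioconductor releases pairwiseAlignment() lives in the separate pwalign package, so the sketch picks whichever is available; the helper name align is hypothetical.

```r
library(Biostrings)

# pairwiseAlignment() moved to the pwalign package in recent Bioconductor
# releases; fall back to Biostrings on older installations.
# 'align' is a hypothetical name used only in this sketch.
align <- if (requireNamespace("pwalign", quietly = TRUE)) {
  pwalign::pairwiseAlignment
} else {
  Biostrings::pairwiseAlignment
}

# Two short made-up DNA sequences (not the sequences from the thread)
Seq1 <- DNAString("ATGATAACCCTGACCTACCGC")
Seq2 <- DNAString("ATGCCCAAGACGCAATCTGCC")

# Global alignment with the default scoring settings
aln <- align(pattern = Seq2, subject = Seq1)
aln

# Numeric alignment score
aln_score <- score(aln)
aln_score
```

If this runs but the same call fails on your own data, the difference is most likely in how the sequences were loaded rather than in the alignment call itself.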
@mrblobby. Thank you so much mrblobby.
I used the suggestion you gave but it still did not work out. I tried to convert Seq1 and Seq2 to DNAString by doing this:
library(Biostrings)
Seq1 <;- DNAString(prokaryotes[])
Seq2 <- DNAString(prokaryotes[])
It gave another error message this time
Error in (function (classes, fdef, mtable) : unable to find an inherited method for function ‘XString’ for signature ‘"SeqFastadna"’
Well firstly there's a typo when you define
Seq1 (which should throw up an error?) but I'm guessing the error you see here is probably because
DNAString does not know how to deal with the data you load via
read.fasta. Is there a reason you use
read.fasta rather than something like
read_csv? It would really help if you could provide some sample data otherwise I'm just guessing.
Maybe try the following after loading the data with read.fasta:
Seq1 = paste(prokaryotes[], collapse="")
Seq1 <- DNAString(Seq1)
edit: from having a Google, it looks like you can load FASTA format files using the Biostrings package: http://web.mit.edu/~r/current/arch/i386_linux26/lib/R/library/Biostrings/html/XStringSet-io.html. This might be your best bet.
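A rough sketch of that approach, assuming Biostrings is installed. To keep it self-contained it writes a tiny made-up FASTA file to a temporary location first; with real data you would pass the path to your own file (for example "prok.fasta") to readDNAStringSet() instead. The variable names fasta_path and seqs are hypothetical.

```r
library(Biostrings)

# Write a tiny made-up FASTA file so the example can run on its own
fasta_path <- tempfile(fileext = ".fasta")
writeLines(
  c(">seq1", "ATGATAACCCTGACC",
    ">seq2", "ATGCCCAAGACGCAA"),
  fasta_path
)

# Read every record in the file into a DNAStringSet
seqs <- readDNAStringSet(fasta_path)

# [[ ]] extracts a single record as a DNAString
Seq1 <- seqs[[1]]
Seq2 <- seqs[[2]]

Seq1
Seq2
```

Because the sequences arrive as DNAString objects already, there is no need to paste characters together or convert from another package's class before aligning.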
Thank you mrblobby. I got the first error message again.
The file is a FASTA file and I am not allowed to upload it here. Maybe I will just give the link to the web page.
I want to align the first two sequences using the pairwiseAlignment command only. Later I will modify it to see how other factors affect the alignment.
It is actually a course in which I am learning to use R. So far everything was okay but at this point I got stuck.
These are the steps I followed
#Download the prok.fasta file into my working directory
prokaryotes <- read.fasta(file = "prok.fasta", seqtype = "DNA")
#Split first two sequences into individual sequences
#aligned the two sequences using the default libraries
I got the two sequences as strings, but pairwiseAlignment gave an error.
Hi @Nabi1, did you see my previous response? What happened when you tried to incorporate my suggestions? I'm feeling disinclined to just tell you the answer if the purpose of this is to learn your way around R as well.
If you follow the advice in my previous reply and copy it for
Seq2 you should find a solution that works;
I've just tried it with your data and
pairwiseAlignment returns a score. If you're still having issues after trying the above, paste the code that you've used in a reply here (please read the FAQ that jcblum posted if you're unsure how to do that) and I'll have a go at trying to figure it out with you
I tried your suggestion but it did not work out. I do not know whether the problem is the version of RStudio that I am using (version 1.1.463) or my operating system, which is Windows.
Hi @Nabi1, without a reproducible example of your code I cannot help you. If you think it might be because of R or RStudio, posting your code will still help because then we can rule out that the problem is caused by something incorrect there. Again, see jcblum's post above on how to produce a good reproducible example; specifically this bit:
@mrblobby. Thank you so much for your support. It was a version problem. The 3.1 version did the trick.
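The thread does not say which component's version was at fault, so for anyone hitting the same error, a quick sketch of how to collect the version details worth including in a question. It uses only base R functions; the variable names are hypothetical, and the Biostrings lookup is wrapped so the snippet still runs if the package is not installed.

```r
# Version of R itself, as a single string
r_version <- R.version.string
r_version

# Installed version of Biostrings, or NA if the package is missing
biostrings_version <- tryCatch(
  as.character(packageVersion("Biostrings")),
  error = function(e) NA_character_
)
biostrings_version

# Full summary: R version, operating system and loaded packages
info <- sessionInfo()
info
```

Pasting the output of sessionInfo() alongside the code makes version mismatches much easier for helpers to spot.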
If your question's been answered (even if by you), would you mind choosing a solution? (See FAQ below for how).
Having questions checked as resolved makes it a bit easier to navigate the site visually and see which threads still need help.
@Mara, thank you for your advice.