ROBV
April 23, 2020, 4:51pm
1
I'm a beginner learning R.
My little program should find the names that are almost equal
in the second column of a data frame called names.
If I omit the if line it works, but then I get all names with their distances.
nr <- nrow(names)
for(i in 1:nr-1)
{
eq <- adist(names[i,2],names[i+1,2], counts=TRUE)
if(eq<3L & eq>0L)
{
cat(sprintf(" %d %s %s \n",eq,names[i,2],names[i+1,2]))
}
}
Can someone help?
FJCC
April 23, 2020, 5:27pm
2
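This works for me. Notice that in the for loop I put parentheses around nr-1. Without them, 1:nr-1 is evaluated as (1:nr)-1, that is 0:(nr-1), because the : operator binds more tightly than -, so the loop starts at i = 0.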
Names <- data.frame(A = LETTERS[1:6], B = c("Stan", "Mary", "Mari", "Alex", "Alexa", "Bob"))
nr <- nrow(Names)
for(i in 1:(nr-1)) {
eq <- adist(Names[i,2],Names[i+1,2], counts=TRUE)
dim(eq)
if(eq < 3L & eq > 0L) {
cat(sprintf(" %d %s %s \n",eq,Names[i,2],Names[i+1,2]))
}
}
#> 1 Mary Mari
#> 1 Alex Alexa
Created on 2020-04-23 by the reprex package (v0.3.0)
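To see what the unparenthesized range actually produces, here is a small check. It is only an illustration: nr is set by hand and x is a made-up vector, neither comes from the original data.

```r
nr <- 6L                 # same number of rows as the example data frame

bad  <- 1:nr-1           # evaluated as (1:nr) - 1
good <- 1:(nr-1)         # the intended range

print(bad)               # [1] 0 1 2 3 4 5
print(good)              # [1] 1 2 3 4 5

# Index 0 selects nothing, so the first iteration would compare an empty vector
x <- c("Stan", "Mary", "Mari")
print(length(x[0]))      # [1] 0

# seq_len() is a common alternative; it gives an empty range when nr is 1
print(seq_len(nr - 1))   # [1] 1 2 3 4 5
```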
ROBV
April 23, 2020, 5:33pm
3
Indeed, these brackets make it work. Good to know, thanks.
ROBV
April 24, 2020, 1:45pm
4
The exact copy indeed works, but as soon as I add something to the script it fails again. See this example:
j <- 1L
> nr <- nrow(names)
>
> for(i in 1:(nr-1))
+ {
+ eq <- adist(names[i,2],names[i+1,2], counts=TRUE)
+ dim(eq)
+ if(eq<3L & eq>0L)
+ {
+ cat(sprintf(" %3d %20s %20s/n",j,names[i,2], names[i+1,2]))
+ j <- j+1
+ }
+ }
Error in if (eq < 3L & eq > 0L) { : missing value where TRUE/FALSE needed
FJCC
April 24, 2020, 2:38pm
5
I copied your code and ran it without a problem (after I removed the + and > symbols inserted by the console, of course).
names <- data.frame(A = LETTERS[1:6], B = c("Stan", "Mary", "Mari", "Alex", "Alexa", "Bob"))
j <- 1L
nr <- nrow(names)
for(i in 1:(nr-1))
{
eq <- adist(names[i,2],names[i+1,2], counts=TRUE)
dim(eq)
if(eq<3L & eq>0L)
{
cat(sprintf(" %3d %20s %20s/n",j,names[i,2], names[i+1,2]))
j <- j+1
}
}
#> 1 Mary Mari/n 2 Alex Alexa/n
Created on 2020-04-24 by the reprex package (v0.3.0)
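Since the same code runs cleanly on the toy data, one possible explanation (not confirmed anywhere in this thread) is that the real names column contains a missing value. adist() returns NA when one of its inputs is NA, and if() cannot branch on NA. The sketch below uses a hypothetical data frame names_df with an NA placed in it on purpose, and guards the test with isTRUE(). It also writes "\n" rather than "/n", which is why the output above prints a literal /n instead of starting a new line.

```r
# Hypothetical data: the NA in column B is an assumption, not taken from the thread
names_df <- data.frame(
  A = LETTERS[1:4],
  B = c("Mary", "Mari", NA, "Alex"),
  stringsAsFactors = FALSE
)

# A missing input gives a missing distance
na_dist <- adist(names_df[2, 2], names_df[3, 2])[1, 1]
print(is.na(na_dist))

j <- 1L
for (i in seq_len(nrow(names_df) - 1)) {
  # [1, 1] extracts the single distance from the 1 x 1 matrix adist() returns
  eq <- adist(names_df[i, 2], names_df[i + 1, 2])[1, 1]
  # isTRUE() treats NA as FALSE, so pairs involving NA are skipped instead of failing
  if (isTRUE(eq > 0 && eq < 3)) {
    cat(sprintf(" %3d %20s %20s\n", j, names_df[i, 2], names_df[i + 1, 2]))
    j <- j + 1L
  }
}
```

Running sum(is.na(names[, 2])) on the real data would show whether this is what is happening.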
Here is a way to do the same analysis that doesn't require loop management:
library(tidyverse)
names <- data.frame(
A = LETTERS[1:6],
B = c("Stan", "Mary", "Mari", "Alex", "Alexa", "Bob"), stringsAsFactors = FALSE
)
n1 <- mutate(names,
C = lead(B)
)
n2 <- rowwise(n1) %>%
mutate(adist = adist(B, C, counts = TRUE)) %>%
filter(adist > 0, adist < 3) # same condition as the loop; between(adist, 0, 3) would also keep distances of exactly 0 and 3
# > n2
# Source: local data frame [2 x 4]
# Groups: <by row>
#
# # A tibble: 2 x 4
# A B C adist
# <chr> <chr> <chr> <dbl>
# B Mary Mari 1
# D Alex Alexa 1
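For comparison, the same pairing of each name with the next one can be sketched in base R without a hand-written loop. This is one possible variant under stated assumptions, not the method from the posts above: names_df, first, second, d and near are illustrative names, and the bounds follow the original loop (greater than 0 and less than 3).

```r
names_df <- data.frame(
  A = LETTERS[1:6],
  B = c("Stan", "Mary", "Mari", "Alex", "Alexa", "Bob"),
  stringsAsFactors = FALSE
)

first  <- head(names_df$B, -1)   # every name except the last
second <- tail(names_df$B, -1)   # every name except the first

# Distance between each name and its successor; [1, 1] unwraps the 1 x 1 matrix
d <- mapply(function(a, b) adist(a, b)[1, 1], first, second, USE.NAMES = FALSE)

# Keep near matches only; !is.na(d) drops pairs that involve a missing name
keep <- !is.na(d) & d > 0 & d < 3
near <- data.frame(first, second, d, stringsAsFactors = FALSE)[keep, ]
print(near)
```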