Display a table with hidden records

I want to show a table that displays the top n records and the bottom n records when the table is very long.

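library(expss)  # provides var_lab() and cro_cpct()

df <- nycflights13::flights

funct <- function(data, var){
  # label the variable so the table shows "Table 1" as its row label
  var_lab(data[[var]]) <- "Table 1"
  # column-percentage table of the chosen variable
  expss::cro_cpct(data[[var]])
}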

funct(data=df,var="distance")

I just want to add a parameter that trims the table as shown below. For example, if I pass a new parameter n = 10, it should show the first 10 records and the last 10 records and hide the rest, without changing the original percentage values.

Row label   dist    #total
Table 1     85      6.9%
            94      7.6%
            66      5.3%
            57      4.6%
            88      7.1%
            35      2.8%
            55      4.4%
            30      2.4%
            98      7.9%
            …….     ……….
            58      4.7%
            47      3.8%
            68      5.5%
            37      3.0%
            65      5.2%
            38      3.1%
            79      6.4%
            93      7.5%
            87      7.0%
            59      4.8%
Total       1239
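The key point in the request above is that the percentages must not change when rows are hidden. A minimal base R sketch of that idea, assuming a plain frequency table rather than an expss one: compute the percentages on the full data first and only then drop the middle rows. The name `trim_freq` and the toy vector `x` are made up for illustration.

```r
# Sketch only: `trim_freq` is a hypothetical helper, not part of expss.
# Percentages are computed on ALL rows, then the middle rows are dropped,
# so the values that remain are identical to those in the full table.
trim_freq <- function(x, n = 10) {
  counts <- table(x)  # frequency of each distinct value, sorted by value
  freq <- data.frame(
    value = names(counts),
    pct   = round(100 * as.vector(counts) / sum(counts), 1),
    stringsAsFactors = FALSE
  )
  if (nrow(freq) <= 2 * n) return(freq)  # short table: nothing to trim
  keep <- c(seq_len(n), seq(nrow(freq) - n + 1, nrow(freq)))
  freq[keep, ]
}

# Toy data: the value i occurs i times, giving 30 distinct values
x <- rep(1:30, times = 1:30)

full    <- trim_freq(x, n = 30)  # 2 * n exceeds the row count, so untrimmed
trimmed <- trim_freq(x, n = 3)   # first 3 and last 3 rows only
print(trimmed)
```

Because the trimming happens after the percentages are calculated, the rows that survive carry the same values as in the untrimmed table.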

Hello,

This is easiest to do with the slice function from dplyr, I think. I used the iris dataset, and I named the number of rows to keep x so that it is not confused with dplyr's n() function, which counts the rows. I'm also showing the original rowId just to verify that the cuts are correct; this is not needed.

library(dplyr)

x = 10

iris %>% mutate(rowId = 1:n()) %>% 
  slice(1:x, (n()- x + 1):n())
#>    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species rowId
#> 1           5.1         3.5          1.4         0.2    setosa     1
#> 2           4.9         3.0          1.4         0.2    setosa     2
#> 3           4.7         3.2          1.3         0.2    setosa     3
#> 4           4.6         3.1          1.5         0.2    setosa     4
#> 5           5.0         3.6          1.4         0.2    setosa     5
#> 6           5.4         3.9          1.7         0.4    setosa     6
#> 7           4.6         3.4          1.4         0.3    setosa     7
#> 8           5.0         3.4          1.5         0.2    setosa     8
#> 9           4.4         2.9          1.4         0.2    setosa     9
#> 10          4.9         3.1          1.5         0.1    setosa    10
#> 11          6.7         3.1          5.6         2.4 virginica   141
#> 12          6.9         3.1          5.1         2.3 virginica   142
#> 13          5.8         2.7          5.1         1.9 virginica   143
#> 14          6.8         3.2          5.9         2.3 virginica   144
#> 15          6.7         3.3          5.7         2.5 virginica   145
#> 16          6.7         3.0          5.2         2.3 virginica   146
#> 17          6.3         2.5          5.0         1.9 virginica   147
#> 18          6.5         3.0          5.2         2.0 virginica   148
#> 19          6.2         3.4          5.4         2.3 virginica   149
#> 20          5.9         3.0          5.1         1.8 virginica   150

If you'd like to use it as a function, you can write it like this:

library(dplyr)

sliceEnds = function(myData, x){
  if(2 * x < nrow(myData)){
    myData %>% 
      slice(1:x, (n()- x + 1):n())
  } else {
    myData
  }
  
}

sliceEnds(iris, 10)

Note that here, if the two slices together would cover the whole data frame or more (2 * x >= number of rows), the whole data frame is returned.

Created on 2022-04-23 by the reprex package (v2.0.1)

Hope this helps,
PJ

I am trying the code below, but it shrinks my data. I want the output to look like an SPSS-style table.

library(tidyverse)
library(expss)  # provides var_lab() and cro_cpct()

df <- nycflights13::flights

var_lab(df[["distance"]])<-"Table 1" 

t1<- expss::cro_cpct(df[["distance"]]) %>% filter(row_number() <= 10 | row_number() >= (n() - 10)) %>%
    add_row(.after = 10) 
t2 <- t1 %>%   mutate(across(everything(), as.character))
t3 <- t2 %>%   mutate(across(everything(), ~replace_na(.x, "...")))


Hi,

What is the purpose of this? If you just want to display the top and bottom rows to look at them, the function I provided (and yours as well, it seems) will work as long as you don't save the output and thus overwrite the original data frame.

For example, if a particular variable has many distinct outlier values (maybe 100-500 or more), the requirement is to shrink the table to the top 10 and bottom 10 rows, with "...." between them to indicate that there is more data in between. This is already implemented in a VBA output file; now I want to implement it in R.

Note: there is one more thing. The table is displayed on the basis of a banner, which I didn't include in my reproducible example. I rely on the expss package because the requirement is to display tables in expss-like output.

t1<- expss::cro_cpct(data[[var]], banner)
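The requirement described above (top rows, a "..." row, then bottom rows, possibly with several banner columns) can be sketched in base R on an ordinary data frame. This is only an illustration under assumptions: `trim_with_gap` and the toy table `tab` are made-up names, the result is a plain character data frame rather than an expss table, and whether it prints like expss output is not covered here.

```r
# Hypothetical helper: keep the first and last n rows of a data frame and
# put a separator row between them. Every column is converted to character
# so the separator can also sit in the numeric (percentage) columns.
trim_with_gap <- function(tbl, n = 10, gap = "...") {
  if (nrow(tbl) <= 2 * n) return(tbl)  # short table: return unchanged
  out <- as.data.frame(lapply(tbl, as.character), stringsAsFactors = FALSE)
  names(out) <- names(tbl)             # restore names such as "#Total"
  gap_row <- out[1, , drop = FALSE]
  gap_row[1, ] <- gap                  # one row filled with the separator
  res <- rbind(head(out, n), gap_row, tail(out, n))
  rownames(res) <- NULL
  res
}

# Toy stand-in for a crosstab with a total column and one banner column
tab <- data.frame(
  value     = 1:50,
  total_pct = round(100 * (1:50) / sum(1:50), 1),
  group_pct = round(100 * (50:1) / sum(1:50), 1)
)

trimmed <- trim_with_gap(tab, n = 5)
print(trimmed, row.names = FALSE)
```

All columns are trimmed together, so each remaining row keeps the percentages it had in the full table.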

Hi there,

It seems that what you want is not possible with the expss functions, but I did manage to find a way to manipulate the output by capturing the output to the console and changing it. It works nicely, but of course you'll need to tweak some values if the format of the table changes, because this only works for the given example for now.

library(tidyverse)
library(expss)

endRows = 5 #Number of rows to keep start / end

#Data
df <- nycflights13::flights
var_lab(df[["distance"]])<-"Table 1" 

#Build table
t1<- cro_cpct(df[["distance"]]) 

#Capture the table output on the console as strings
newOutput = capture.output(t1)

#Cut out the lines not needed and add "..."
newOutput = c(
  newOutput[1:(endRows+3)], #Header + first rows
  "", #optional blank line
  str_pad("...", nchar(newOutput[1]), "both"), #Add the ...
  "", #optional blank line
  newOutput[(length(newOutput)-endRows):length(newOutput)] #last rows + footer
)
 
#Paste all back together and print
newOutput %>% paste(collapse = "\n") %>% cat()
#>                                       
#>  |         |              |   #Total |
#>  | ------- | ------------ | -------- |
#>  | Table 1 |           17 |      0.0 |
#>  |         |           80 |      0.0 |
#>  |         |           94 |      0.3 |
#>  |         |           96 |      0.2 |
#>  |         |          116 |      0.1 |
#> 
#>                  ...                  
#> 
#>  |         |         2576 |      0.1 |
#>  |         |         2586 |      2.4 |
#>  |         |         3370 |      0.0 |
#>  |         |         4963 |      0.1 |
#>  |         |         4983 |      0.1 |
#>  |         | #Total cases | 336776.0 |

Created on 2022-04-26 by the reprex package (v2.0.1)

Hope this helps,
PJ
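The capture-and-cut idea above can be wrapped in a small reusable function. The following is a sketch in base R only, with made-up names (`print_ends`, `header`, `footer`); the number of header and footer lines depends on how the object prints and has to be adjusted by hand. For the expss table printed above, the indices in the code correspond to header = 3 and footer = 1, but here it is only demonstrated on a plain data frame.

```r
# Hypothetical wrapper around the capture.output() approach.
# header / footer: number of printed lines that belong to the table's
# header and footer (assumptions that depend on the print format).
print_ends <- function(obj, n = 5, header = 1, footer = 0, gap = "...") {
  lines <- capture.output(print(obj))
  body_len <- length(lines) - header - footer
  if (body_len <= 2 * n) {             # too short to trim: print as is
    cat(lines, sep = "\n")
    return(invisible(lines))
  }
  width <- max(nchar(lines))
  pad <- strrep(" ", max(0, (width - nchar(gap)) %/% 2))  # roughly centre the gap
  out <- c(
    head(lines, header + n),           # header + first n rows
    paste0(pad, gap),                  # the "..." line
    tail(lines, n + footer)            # last n rows + footer
  )
  cat(out, sep = "\n")
  invisible(out)
}

# A data frame prints with one header line and no footer
res <- print_ends(data.frame(id = 1:40, sq = (1:40)^2), n = 3)
```

Like the original snippet, this only changes what is printed; the underlying table object is untouched.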
