FAQ: How to do a minimal reproducible example (reprex) for beginners


A minimal reproducible example consists of the following items:

  • A minimal dataset, necessary to reproduce the error
  • The minimal runnable code necessary to reproduce the error, which can be run
    on the given dataset, along with the necessary information about the packages used.

Let's quickly go over each of these with an example:

Minimal Dataset (Sample Data)

You need to provide a data frame that is small enough to be (reasonably) pasted into a post, but big enough to reproduce your issue.

Let's say, as an example, that you are working with the iris data frame:

head(iris)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa
#> 4          4.6         3.1          1.5         0.2  setosa
#> 5          5.0         3.6          1.4         0.2  setosa
#> 6          5.4         3.9          1.7         0.4  setosa

Note: In this example we are using the built-in dataset iris as a stand-in for your actual data. You should use your own dataset instead of iris. If your problem can be reproduced with any dataset, you can use iris directly (or any other built-in dataset, e.g. mtcars, ToothGrowth, PlantGrowth, USArrests, etc.) and skip this step.

Suppose you are having issues while trying to make a scatter plot between Sepal.Length and Sepal.Width. A good minimal sample dataset for this would be just the first 5 rows of those two variables:

head(iris, 5)[, c('Sepal.Length', 'Sepal.Width')]
#>   Sepal.Length Sepal.Width
#> 1          5.1         3.5
#> 2          4.9         3.0
#> 3          4.7         3.2
#> 4          4.6         3.1
#> 5          5.0         3.6

Now you just need to put this into a copy/paste-friendly format so it can be posted in the forum. You can easily do this with the datapasta package.

# If you haven't done so already, install datapasta first with
# install.packages("datapasta")
datapasta::df_paste(head(iris, 5)[, c('Sepal.Length', 'Sepal.Width')])
# This is the sample data that you have to use in your reprex.
data.frame(
      Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5),
       Sepal.Width = c(3.5, 3, 3.2, 3.1, 3.6)
   )
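
Before posting, it can be worth checking that the pasted data really matches the subset you started from. The following is a minimal sketch using only base R; the names sample_df, original and same_data are made up for this illustration.

```r
# Sample data as pasted by datapasta (copied from the output above)
sample_df <- data.frame(
  Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5),
  Sepal.Width = c(3.5, 3, 3.2, 3.1, 3.6)
)

# The subset of the original data that the sample is meant to represent
original <- head(iris, 5)[, c("Sepal.Length", "Sepal.Width")]

# all.equal() returns TRUE when both data frames hold the same values
same_data <- isTRUE(all.equal(sample_df, original))
same_data
```

If the check does not come back as TRUE, the sample was probably altered while copying, and it is better to paste it again than to post data that behaves differently from yours.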

A detailed guide about datapasta can be found here:

You can also use dput(), which is included in base R and is as simple as this:

dput(head(iris, 5)[c("Sepal.Length", "Sepal.Width")])
#> structure(list(Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5), Sepal.Width = c(3.5, 
#> 3, 3.2, 3.1, 3.6)), row.names = c(NA, 5L), class = "data.frame")

This output may seem awkward compared to the output of datapasta, but it's much more general in the sense that it supports many types of R objects.
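
To illustrate that generality, here is a small sketch that uses dput() on a plain list instead of a data frame and then rebuilds the object from the printed text. The object settings is hypothetical and exists only for this example.

```r
# A hypothetical non-data-frame object: a list mixing several types
settings <- list(
  species = factor(c("setosa", "virginica")),
  limits = c(min = 4.3, max = 7.9),
  log_scale = FALSE
)

# dput() prints R code that recreates the object; capture it as text
dput_text <- capture.output(dput(settings))

# Anyone can paste that text into their session to rebuild the object.
# Here we simulate the paste by parsing and evaluating the captured text.
rebuilt <- eval(parse(text = dput_text))

# Check that the rebuilt object matches the original
round_trip_ok <- isTRUE(all.equal(settings, rebuilt))
round_trip_ok
```

In a forum post you would only share the text printed by dput(); the parse/eval step is just a way to confirm that the text is enough to recreate the object.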

Minimal Runnable Code

The next step is to put together an example of the code that is causing you trouble, along with the libraries that you are using for that code.

library(ggplot2) # Make sure to include the calls for all the libraries that you are using in your example

# Remember to include the sample data that you have generated in the previous step.
df <- data.frame(stringsAsFactors = FALSE,
                 Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5),
                 Sepal.Width = c(3.5, 3, 3.2, 3.1, 3.6)
)
# Narrow down your code to just the problematic part.
ggplot(data = df, x = Sepal.Length, y = Sepal.Width) +
    geom_point()
#> Error: geom_point requires the following missing aesthetics: x, y
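
Since a reprex should include information about the packages used, it may also help to state which versions you are running. A minimal sketch with base R follows; the vector pkgs is a placeholder and should list the packages your own example loads (for the example above that would be "ggplot2").

```r
# Placeholder: list the packages your example loads, e.g. "ggplot2"
pkgs <- c("stats", "utils")

# Look up the installed version of each package as text
versions <- vapply(
  pkgs,
  function(p) as.character(utils::packageVersion(p)),
  character(1)
)

# Print the R version followed by the package versions
R.version.string
versions
```

For a fuller report, sessionInfo() prints the R version, the platform and all attached packages in one go.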

Your Final reprex

Now that you have a minimal reproducible example that shows your error, it's time to put it into a proper format to be posted in the community forum. This is very easy to do with the reprex package: just copy your code with Ctrl + C and run the reprex() function in your console pane.

# If you haven't done so already, install reprex first with
# install.packages("reprex")
reprex::reprex()

Now you can just press Ctrl + V in your forum post and voilà! You have a properly formatted reprex like this:

```r
library(ggplot2)

df <- data.frame(stringsAsFactors = FALSE,
                 Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5),
                 Sepal.Width = c(3.5, 3, 3.2, 3.1, 3.6)
)
ggplot(data = df, x = Sepal.Length, y = Sepal.Width) +
    geom_point()
#> Error: geom_point requires the following missing aesthetics: x, y
```

![](https://i.imgur.com/rAYQlnn.png)

Note: The previous approach works if you are using a desktop version of RStudio, but if you are using a server version (and you don't have access to your clipboard), you will have to paste your code inside the reprex() function like this:

reprex::reprex({
library(ggplot2)

df <- data.frame(stringsAsFactors = FALSE,
                 Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5),
                 Sepal.Width = c(3.5, 3, 3.2, 3.1, 3.6)
                 )

ggplot(data = df, x = Sepal.Length, y = Sepal.Width) +
    geom_point()
})

The Answer

If you follow all these steps, most likely someone is going to copy your code into their own RStudio session, figure out that you forgot to put your variables inside the aes() function, and reply with a working solution like this:

library(ggplot2)

df <- data.frame(stringsAsFactors = FALSE,
                 Sepal.Length = c(5.1, 4.9, 4.7, 4.6, 5),
                 Sepal.Width = c(3.5, 3, 3.2, 3.1, 3.6)
                 )

ggplot(data = df, aes(x = Sepal.Length, y = Sepal.Width)) +
    geom_point()


Please feel free to improve this FAQ; just keep in mind the general goal of making it friendly for R beginners (and, if possible, for non-native English speakers as well).

This topic has been closed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.