September 2021: "Top 40" New CRAN Packages

This is a companion discussion topic for the original entry at https://rviews.rstudio.com/2021/10/28/september-2021-top-40-new-cran-packages


Two hundred twenty new packages stuck to CRAN in September. Here are my “Top 40” picks in fourteen categories: Art, Computational Methods, Data, Econometrics, Finance and Insurance, Genomics, Machine Learning, Medicine, Networks and Graphs, Science, Statistics, Time Series, Utilities, and Visualization.

Art

rfishdraw v0.1.0: Automatically generates fish drawings using the fishdraw JavaScript library. See the vignette.

Line drawing of a fish

Computational Methods

abmR v1.0.4: Supplies tools for running agent-based models (ABM) as discussed in Gochanour et al. (2021) including two movement functions based on the Ornstein-Uhlenbeck model (1930). See the vignette to get started.

Simulated movement on a map of Europe
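
To make the Ornstein-Uhlenbeck idea concrete, here is a minimal Python sketch of a one-dimensional mean-reverting random walk using an Euler-Maruyama step. It is not abmR's API: the function name, parameters, and the one-dimensional simplification are all assumptions made for illustration.

```python
import math
import random


def simulate_ou(x0, mu, theta, sigma, dt, n_steps, seed=None):
    """Euler-Maruyama simulation of dX = theta * (mu - X) dt + sigma dW.

    Hypothetical helper for illustration only, not part of abmR.
    """
    rng = random.Random(seed)
    path = [x0]
    x = x0
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        # Pull toward mu (mean reversion) plus a scaled random shock.
        x = x + theta * (mu - x) * dt + sigma * math.sqrt(dt) * noise
        path.append(x)
    return path


# With sigma = 0 the path decays deterministically toward mu.
calm = simulate_ou(x0=10.0, mu=0.0, theta=0.5, sigma=0.0, dt=0.1, n_steps=200)
# With sigma > 0 the path wanders but keeps being pulled back toward mu.
noisy = simulate_ou(x0=10.0, mu=0.0, theta=0.5, sigma=1.0, dt=0.1, n_steps=200, seed=1)
```

A two-dimensional movement model could apply the same update independently to each coordinate, though how abmR does this is not described here.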

SemiEstimate v1.1.3: Implements an improved method of two-step Newton-Raphson iteration based on implicit profiling. See the vignette for examples.

SimEngine v1.0.0: Implements functions for structuring, maintaining, running, and debugging statistical simulations on both local and cluster-based computing environments. Emphasis is placed on documentation and scalability. There is an Introduction, an example Power Calculation, and a vignette Comparing SE Estimators.

Line plot comparing estimators

Data

asylum v1.0.1: Provides data on asylum and resettlement from the UK Home Office. See README to get started.

hockeyR v0.1.1: Provides functions to scrape hockey play-by-play data from NHL.com and Hockey-Reference.com including standings, player stats, and jersey number history. There is a Getting Started Guide and a vignette on Scraping.

yowie v0.1.0: Provides longitudinal wages data sets and several demographic variables from the National Longitudinal Survey of Youth from 1979 to 2018 including: the wages data from the cohort whose highest grade completed is up to high school; the wages data of the high school dropouts; and the demographic data of the cohort in the survey year 1979. See the vignette for details.

Plots of wages data by year

Econometrics

COINr v0.5.5: Implements a development environment for composite indicators and scoreboards including utilities for construction (indicator selection, denomination, imputation, data treatment, normalization, weighting and aggregation) and analysis (multivariate analysis, correlation plotting). Look here for an online book, and see the vignette for an extended overview.

Plot of framework for an index

spflow v0.1.0: Provides functions to estimate spatial econometric models of origin-destination flows which may exhibit spatial autocorrelation in both the dependent variable and the explanatory variables. See LeSage and Pace (2008) for information on the model, Dargel (2021) for information on the estimation procedures, and the vignette for examples.

Spatial maps of population and other variables

Finance and Insurance

DOSPortfolio v0.1.0: Implements dynamic optimal shrinkage estimators for the weights of the global minimum variance portfolio reconstructed at given reallocation points as derived in Bodnar et al. (2021). See the Introduction.

SPLICE v1.0.0: Extends the individual claim simulator SynthETIC to simulate the evolution of case estimates of incurred losses through the lifetime of an insurance claim. See Taylor & Wang (2021) for background and the vignette to get started.

Genomics

MAnorm2 v1.2.0: Implements a method for normalizing ChIP-seq signals across individual samples or groups of samples and a system of statistical models for calling differential ChIP-seq signals between two or more biological conditions. Refer to Tu et al. (2021) and Chen et al. (2021) for the statistical details. The vignette provides several examples.

Before and after normalization MA plots

RevGadgets v1.0.0: Provides functions to process and visualize the output of complex phylogenetic analyses from the RevBayes phylogenetic graphical modeling software. Look here for a tutorial.

Circular plot of ancestral-state estimates

Machine Learning

morphemepiece v1.0.1: Provides functions to tokenize text into morphemes either by table lookup or a modified wordpiece tokenization algorithm. There is a vignette on testing the fall-through algorithm and another on Generating a Vocabulary.

NIMAA v0.1.0: Implements a pipeline for nominal data mining, which can effectively find special relationships between data. See Jafari et al. (2021) for a description of the method and the vignette for an introduction.

Bipartite plot

Medicine

gapclosing v1.0.2: Provides functions to estimate the disparities across categories (e.g. Black and white) that persist if a treatment variable (e.g. college) is equalized. See Lundberg (2021) for the methodology and the vignette for an overview with examples.

Disparity plot of mean outcome by category

iDOVE v1.3: Implements a nonparametric maximum likelihood method for assessing potentially time-varying vaccine efficacy against SARS-CoV-2 infection under staggered enrollment and time-varying community transmission, allowing crossover of placebo volunteers to the vaccine arm. See Lin et al. (2021) for background and the vignette for the details.

powerly v1.5.2: Implements the sample size computation method for network models proposed by Constantin et al. (2021) which takes the form of a three-step recursive algorithm to find an optimal sample size given a model specification and a performance measure of interest. See README for an example.

Poster showing methodology

ppseq v0.1.1: Provides functions to design clinical trials using sequential predictive probability monitoring. See the vignette for details.

Networks and Graphs

multigrapher v0.1.0: Implements methods and models for analyzing multigraphs as introduced by Shafie (2015) including methods to study local and global properties and goodness of fit tests. See Shafie (2016). The vignette provides an introduction.

Examples of multigraphs

nevada v0.1.0: Implements a statistical framework for network-valued data analysis which leverages the complexity of the space of distributions on graphs. See Lovato et al. (2020) and Lovato et al. (2021) for the statistical background, and the vignette for an example.

robber v0.2.2: Implements a variety of methods to compute the robustness of ecological interaction networks with binary interactions as described in Chabert-Liddell et al. (2021). There is a vignette on Topological Analysis.

Plots comparing the influence of structure on robustness

Science

argoFloats v1.0.3: Supports the analysis of oceanographic data recorded by Argo autonomous drifting profiling floats. See Kelley et al. (2021) for more on the scientific context and applications. There is an Introduction and a vignette on Quality Control and Adjusted Data.

Spatial plot of Argo floats off the Bahamas

biosurvey v0.1.1: Provides tools that allow users to plan systems of sampling sites with the goals of increasing the efficiency of biodiversity monitoring. See Arita et al. (2011) and Soberón & Cavner (2015) for background. There are vignettes on Preparing Data, Selecting Sample Sites, Using Preselected Points, and Testing Efficacy of Selected Sites.

Plots contrasting environmental space vs. geographic space

HostSwitch v0.1.0: Provides functions to investigate the dynamics of the host switch in the population of an organism that interacts with current and potential hosts over generations. The underlying model is based on Araujo et al. (2015). See the vignette for an example.

Plot of distributions of phenotypes by number of generations

Statistics

cvam v0.9.2: Extends R’s implementation of categorical variables (factors) to handle coarsened observations and implements log-linear models for coarsened categorical data, including latent-class models. There is a vignette describing Coarsened Factors and another on Fitting Log-linear Models.

JustifyAlpha v0.1.1: Provides functions to justify alpha levels for statistical hypothesis tests by avoiding Lindley’s paradox, or by minimizing or balancing error rates. See Maier & Lakens (2021) for the theory and the vignette for an introduction.

Plots of error rate vs. alpha level
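
As a rough illustration of what "minimizing error rates" can mean, the Python sketch below grid-searches the alpha level that minimizes the sum of Type I and Type II error rates. It assumes a one-sided, one-sample z-test with equal weight on both errors; the function names and that setup are assumptions for illustration and do not reproduce JustifyAlpha's interface or the full method of Maier & Lakens.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def error_rates(alpha, effect_size, n):
    """Type I and Type II error rates of a one-sided, one-sample z-test."""
    critical = STD_NORMAL.inv_cdf(1 - alpha)
    # Under the alternative, the test statistic is shifted by effect_size * sqrt(n).
    beta = STD_NORMAL.cdf(critical - effect_size * n ** 0.5)
    return alpha, beta


def minimize_combined_error(effect_size, n, step=0.0001):
    """Grid-search the alpha in (0, 0.5] that minimizes alpha + beta."""
    candidates = [k * step for k in range(1, int(round(0.5 / step)) + 1)]
    return min(candidates, key=lambda a: sum(error_rates(a, effect_size, n)))


# Hypothetical inputs: standardized effect size 0.5, sample size 64.
best_alpha = minimize_combined_error(effect_size=0.5, n=64)
best_beta = error_rates(best_alpha, effect_size=0.5, n=64)[1]
```

In this symmetric setup the minimizing alpha also roughly balances the two error rates; with unequal costs or prior odds the two criteria would generally differ.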

sphunif v1.0.1: Implements functions to test for uniformity on circles and hyperspheres. See García-Portugués et al. (2020) for background and the vignette for examples.
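
For readers new to uniformity testing on the circle, here is a minimal Python sketch of one classical test, the Rayleigh test, with its first-order large-sample p-value approximation. It is offered only as an example of the kind of test involved; the function name is hypothetical and nothing here reflects sphunif's actual interface.

```python
import math


def rayleigh_test(angles):
    """Rayleigh test of uniformity for angles given in radians.

    Returns (mean resultant length, approximate p-value exp(-n * R^2)).
    The p-value is a first-order large-sample approximation.
    """
    n = len(angles)
    if n == 0:
        raise ValueError("need at least one angle")
    c = sum(math.cos(a) for a in angles) / n
    s = sum(math.sin(a) for a in angles) / n
    r_bar = math.hypot(c, s)  # near 0 for evenly spread angles, 1 if all coincide
    p_value = math.exp(-n * r_bar ** 2)
    return r_bar, p_value


# Evenly spaced angles give no evidence against uniformity.
spread_r, spread_p = rayleigh_test([k * 2 * math.pi / 8 for k in range(8)])
# Tightly clustered angles give strong evidence against uniformity.
cluster_r, cluster_p = rayleigh_test([0.1] * 20)
```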

spatialRF v1.1.3: Implements methods to automatically generate and select spatial predictors for spatial regression with Random Forest. See Dray et al. (2006) and, for the RFsp approach, Hengl et al. (2017). Look here for documentation and examples.

Grid of prediction graphs

vdra v1.0.0: Implements three protocols for performing secure linear, logistic, and Cox regression on vertically partitioned data across several data partners in such a way that data is not shared among data partners. See Slavkovic et al. (2007) for background. There is an Introduction and vignettes on Using PopMedNet, Communications and Files, and Workflow.

Diagram of information flow among data partners

Time Series

GlarmaVarSel v1.0: Implements functions for variable selection in high-dimensional sparse GLARMA models. See Gomtsyan et al. (2020) for the theory and the vignette for an example.

mrf v0.5.1: Implements a method to forecast univariate time series using a feature extraction algorithm based on the Haar wavelet transform. See the vignette.

Plot of electricity demand
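
The Haar wavelet transform mentioned above can be sketched in a few lines of Python. This version uses the unnormalized form (pairwise means and half-differences) and assumes the series length is a power of two; the function names are hypothetical, and the sketch says nothing about how mrf turns the resulting coefficients into forecasts.

```python
def haar_step(values):
    """One level of the unnormalized Haar transform: pairwise means and half-differences."""
    if len(values) % 2 != 0:
        raise ValueError("length must be even")
    pairs = list(zip(values[0::2], values[1::2]))
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details


def haar_decompose(values):
    """Apply haar_step repeatedly to the averages until one value remains.

    Assumes len(values) is a power of two.
    Returns (overall mean, list of detail coefficients per level, finest first).
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    levels = []
    current = list(values)
    while len(current) > 1:
        current, details = haar_step(current)
        levels.append(details)
    return current[0], levels


# Hypothetical toy series of four observations.
overall_mean, detail_levels = haar_decompose([4, 6, 10, 12])
```

The coarse averages capture the slow-moving level of the series, while the detail coefficients capture changes at progressively finer time scales.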

Utilities

colorblindcheck v1.0.0: Provides functions to compare color palettes with simulations of color vision deficiencies: deuteranopia, protanopia, and tritanopia. It includes a calculation of distances between colors and creates summaries of differences between color palettes and simulations of color vision deficiencies. See the post by G. Aisch for background and the vignette for details.

Chart showing how colors are perceived by people with various forms of colorblindness

filearray v0.1.1: Implements file-backed arrays for out-of-memory computation using gigabyte-level multi-threaded read/write via OpenMP. The vignette compares performance with native R.

httr2 v0.1.1: Implements tools for creating and modifying HTTP requests, executing them, and then processing the results. httr2 is a modern re-imagining of httr that uses a pipe-based interface and solves more of the problems that API wrapping packages face. See the vignettes httr2 and Wrapping APIs.

quadtree v0.1.2: Provides region quadtrees for working with spatial data, which allow for variable-sized cells. There are vignettes on code, creation, lcp, and usage.

Illustration of Quadtree Structure
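
To show why a region quadtree ends up with variable-sized cells, here is a minimal Python sketch that recursively splits a square grid into quadrants until the values inside each cell are homogeneous. The splitting rule (range of values within a cell), the dictionary layout, and the function names are assumptions for illustration, not the quadtree package's interface.

```python
def build_quadtree(grid, row=0, col=0, size=None, max_range=0.0):
    """Recursively split a square grid into quadrants until each cell is homogeneous.

    grid: list of equal-length rows; the side length is assumed to be a power of two.
    A cell becomes a leaf when max - min of its values is <= max_range.
    """
    if size is None:
        size = len(grid)
    values = [grid[r][c] for r in range(row, row + size) for c in range(col, col + size)]
    if size == 1 or max(values) - min(values) <= max_range:
        return {"row": row, "col": col, "size": size,
                "value": sum(values) / len(values), "children": None}
    half = size // 2
    children = [
        build_quadtree(grid, row + dr, col + dc, half, max_range)
        for dr in (0, half) for dc in (0, half)
    ]
    return {"row": row, "col": col, "size": size, "value": None, "children": children}


def leaves(node):
    """Collect all leaf cells of the tree."""
    if node["children"] is None:
        return [node]
    return [leaf for child in node["children"] for leaf in leaves(child)]


# Hypothetical 4x4 raster: three uniform quadrants and one mixed quadrant.
grid = [
    [1, 1, 2, 2],
    [1, 1, 2, 2],
    [3, 3, 4, 5],
    [3, 3, 6, 7],
]
tree = build_quadtree(grid)
cells = leaves(tree)
```

Uniform regions stay as single large cells, while the mixed quadrant is subdivided down to individual grid squares.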

tabxplor v1.0.2: Implements tools to create, manipulate, and color highlight cross-tables, and export them to Excel or HTML with formats and colors. See the vignette.

Table with color highlighting

Visualization

bittermelon v0.1.3: Provides functions for creating and modifying bitmaps with special emphasis on bitmap fonts and their glyphs including native read/write support for the hex and yaff bitmap font formats. Look here for documentation.

Bitmap rendering of a Go Board

d3po v0.3.2: Implements an Apache licensed alternative to Highcharter which provides functions for interactive visualization for Markdown and Shiny. See the vignette for examples.

Example of bubblechart

ggHoriPlot v1.0.0: Implements geom_horizon() and other functions for building horizon plots with ggplot2. There is an Introduction and a vignette with examples.

Horizon plot showing content along human genome

plotbb v0.0.6: Provides a proof of concept for implementing a grammar for base R plots. See the vignette.

Enhanced base R scatter plot

semptools v0.2.9.3: Implements functions for customizing structural equation plots that can be chained using a pipe operator. There is a Quick Start Guide, and vignettes on Nodes, Matrix Layout, CFA, and SEM.

Plot depicting inter-factor covariances
