A complete and consistent toolkit to process the Munich ChronoType Questionnaire (MCTQ) for its three versions (standard, micro, and shift). MCTQ is a quantitative and validated tool to assess chronotypes using people's sleep behavior, originally presented by Till Roenneberg, Anna Wirz-Justice, and Martha Merrow (2003, doi:10.1177/0748730402239679).
Data that are collected through online sources such as Mechanical Turk may require excluding rows because of IP address duplication, geolocation, or completion duration. This package facilitates exclusion of these data for Qualtrics datasets.
Recodes Sex/Gender Descriptions into a Standard Set
Provides functions and dictionaries for recoding free-text gender responses into more consistent categories.
treeio is an R package that makes it easier to import and store phylogenetic trees with associated data, and to link external data from different sources to a phylogeny. It also supports exporting a phylogenetic tree with heterogeneous associated data to a single tree file, and can serve as a platform for merging trees with associated data and converting file formats.
Client for jq, a JSON Processor
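As an illustration of the kind of call such a client supports, here is a minimal sketch using jqr's `jq()` function; the JSON string and filter are invented for the example.

```r
# Sketch: apply a jq filter to a JSON string with jqr::jq().
# The JSON input here is illustrative, not from the package docs.
library(jqr)

json <- '{"species": "Parus major", "count": 12}'

# Extract a single field using jq filter syntax
jq(json, ".species")
```

The same filter language jq uses on the command line applies here, so existing jq programs can be reused from R.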
Parse various reflectance/transmittance/absorbance spectra file formats to extract spectral data and metadata, as described in Gruson, White & Maia (2019) doi:10.21105/joss.01857. Among other formats, it can import files from Avantes https://www.avantes.com/, CRAIC https://www.microspectra.com/, and OceanInsight (formerly OceanOptics) https://www.oceaninsight.com/ brands.
Reads GenBank files.
Tree Biomass Estimation at Extra-Tropical Forest Plots
Standardize and simplify the tree biomass estimation process across globally distributed extratropical forests.
Automated flagging of common spatial and temporal errors in biological and paleontological collection data, for use in conservation, ecology, and paleontology. Includes automated tests to easily flag (and exclude) records assigned to country or province centroids, the open ocean, the headquarters of the Global Biodiversity Information Facility, urban areas, or the locations of biodiversity institutions (museums, zoos, botanical gardens, universities). It also identifies per-species outlier coordinates, zero coordinates, identical latitudes/longitudes, and invalid coordinates, and implements an algorithm to identify data sets with a significant proportion of rounded coordinates. Especially suited for large data sets. The reference for the methodology is Zizka et al. (2019) doi:10.1111/2041-210X.13152.
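A minimal sketch of the flagging workflow, assuming the `clean_coordinates()` interface; the occurrence data and column names here are illustrative.

```r
# Sketch: flag suspicious occurrence records with CoordinateCleaner.
# The data frame is invented; real data would come from e.g. GBIF.
library(CoordinateCleaner)

occs <- data.frame(
  species          = "Puma concolor",
  decimallongitude = c(-65.4, 0),
  decimallatitude  = c(-10.3, 0)
)

# Run a subset of the automated tests (centroids, seas, zero coordinates)
flags <- clean_coordinates(
  x       = occs,
  lon     = "decimallongitude",
  lat     = "decimallatitude",
  species = "species",
  tests   = c("centroids", "seas", "zeros")
)

summary(flags)
```

Records failing any test are flagged rather than silently dropped, so the user decides what to exclude.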
Lightweight Qualitative Coding
A free, lightweight, open source option for analyzing text-based qualitative data. Enables analysis of interview transcripts, observation notes, memos, and other sources. Supports the work of social scientists, historians, humanists, and other researchers who use qualitative methods. Addresses the unique challenges faced in qualitative data analysis. Provides opportunities for researchers who otherwise might not develop software to build software development skills.
An implementation that combines trait data and a phylogenetic tree (or trees) into a single object of class treedata.table. The resulting object can be easily manipulated to simultaneously change the trait- and tree-level sampling. Currently implemented functions allow users to use a data.table syntax when performing operations on the trait dataset within the treedata.table object.
There are a lot of different typical tasks that have to be solved during phonetic research and experiments. This includes creating a presentation that will contain all stimuli, renaming and concatenating multiple sound files recorded during a session, automatic annotation in Praat TextGrids (this is one of the sound annotation standards provided by Praat software, see Boersma & Weenink 2020 https://www.fon.hum.uva.nl/praat/), creating an html table with annotations and spectrograms, and converting multiple formats (Praat TextGrid, ELAN, EXMARaLDA, Audacity, subtitles .srt, and FLEx flextext). All of these tasks can be solved by a mixture of different tools (any programming language has programs for automatic renaming, and Praat contains scripts for concatenating and renaming files, etc.). phonfieldwork provides functionality that makes it easier to solve those tasks independently of any additional tools. You can also compare the functionality with other packages: rPraat https://CRAN.R-project.org/package=rPraat, textgRid https://CRAN.R-project.org/package=textgRid.
Spell checking common document formats including latex, markdown, manual pages, and description files. Includes utilities to automate checking of documentation and vignettes as a unit test during R CMD check. Both British and American English are supported out of the box and other languages can be added. In addition, packages may define a wordlist to allow custom terminology without having to abuse punctuation.
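A short sketch of how the spelling package is typically invoked; the paths are placeholders.

```r
# Sketch: spell-check an R package source tree and a standalone file.
# "path/to/pkg" and "README.md" are placeholder paths.
library(spelling)

# Checks Rd files, DESCRIPTION, and vignettes of a package
spell_check_package("path/to/pkg")

# Check an individual document in American English
spell_check_files("README.md", lang = "en_US")
```

Running the check as part of R CMD check, as the description mentions, turns spelling into a repeatable unit test rather than a one-off manual pass.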
Tools to import, clean, and visualize movement data, particularly from motion capture systems such as Optitrack's Motive, the Straw Lab's Flydra, or from other sources. We provide functions to remove artifacts, standardize tunnel position and tunnel axes, select a region of interest, isolate specific trajectories, fill gaps in trajectory data, and calculate 3D and per-axis velocity. For experiments of visual guidance, we also provide functions that use subject position to estimate perception of visual stimuli.
Functions for the import, transformation, and analysis of data from muscle physiology experiments. The work loop technique is used to evaluate the mechanical work and power output of muscle. Josephson (1985) doi:10.1242/jeb.114.1.493 modernized the technique for application in comparative biomechanics. Although our initial motivation was to provide functions to analyze work loop experiment data, as we developed the package we incorporated the ability to analyze data from experiments that are often complementary to work loops. There are currently three supported experiment types: work loops, simple twitches, and tetanus trials. Data can be imported directly from .ddf files or via an object constructor function. Through either method, data can then be cleaned or transformed via methods typically used in studies of muscle physiology. Data can then be analyzed to determine the timing and magnitude of force development and relaxation (for isometric trials) or the magnitude of work, net power, and instantaneous power among other things (for work loops). Although we do not provide plotting functions, all resultant objects are designed to be friendly to visualization via either base-R plotting or tidyverse functions. This package has been peer-reviewed by rOpenSci (v. 1.1.0).
Zero-dependency data frame to xlsx exporter based on libxlsxwriter. Fast and no Java or Excel required.
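The entire interface is essentially one function; a minimal sketch (output file names are placeholders):

```r
# Sketch: export data frames to xlsx with writexl::write_xlsx().
# No Java or Excel installation is needed.
library(writexl)

# Single sheet
write_xlsx(iris, "iris.xlsx")

# Multiple sheets via a named list of data frames
write_xlsx(list(iris = iris, cars = mtcars), "data.xlsx")
```

Because the package wraps the C library libxlsxwriter directly, it avoids the heavyweight dependencies of older xlsx writers.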
Clean Biological Occurrence Records
Clean biological occurrence records. Includes functionality for cleaning based on various aspects of spatial coordinates, unlikely values due to political centroids, coordinates based on where collections of specimens are held, and more.
An extension for the xml2 package to transform XML documents by applying an XSLT stylesheet.
Interface to Phylocom
Interface to Phylocom (http://phylodiversity.net/phylocom/), a library for analysis of phylogenetic community structure and character evolution. Includes low level methods for interacting with the three executables, as well as higher level interfaces for methods like aot, ecovolve, bladj, phylomatic, and more.
Deterministic Categorization of Items Based on External Code Data
Fast categorization of items based on external code data identified by regular expressions. A typical use case considers patients with medically coded data, such as codes from the International Classification of Diseases (ICD) or the Anatomic Therapeutic Chemical (ATC) classification system. Functions of the package rely on a triad of objects: (1) case data with unit IDs and possible dates of interest; (2) external code data for corresponding units in (1), with optional dates of interest; and (3) a classification scheme (classcodes object) with regular expressions to identify and categorize relevant codes from (2). It is easy to introduce new classification schemes (classcodes objects) or to use default schemes included in the package. Use cases include patient categorization based on comorbidity indices such as Charlson, Elixhauser, RxRisk V, or the comorbidity-polypharmacy score (CPS), as well as adverse events after hip and knee replacement surgery.
The git2rdata package is an R package for writing and reading dataframes as plain text files. A metadata file stores important information. 1) Storing metadata makes it possible to maintain the classes of variables. By default, git2rdata optimizes the data for file storage. The optimization is most effective on data containing factors. The optimization makes the data less human readable. The user can turn this off when they prefer a human readable format over smaller files. Details on the implementation are available in vignette("plain_text", package = "git2rdata"). 2) Storing metadata also allows smaller row-based diffs between two consecutive commits. This is a useful feature when storing data as plain text files under version control. Details on this part of the implementation are available in vignette("version_control", package = "git2rdata"). Although we envisioned git2rdata with a git workflow in mind, you can use it in combination with other version control systems like subversion or mercurial. 3) git2rdata is a useful tool in a reproducible and traceable workflow. vignette("workflow", package = "git2rdata") gives a toy example. 4) vignette("efficiency", package = "git2rdata") provides some insight into the efficiency of file storage, git repository size, and speed for writing and reading. Please cite using doi:10.5281/zenodo.1485309.
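A minimal sketch of the write/read round trip using `write_vc()` and `read_vc()`; in real use, `root` would point at a git repository rather than a temporary directory.

```r
# Sketch: store a data frame as plain text plus metadata, then read
# it back with classes intact. The root path here is a placeholder.
library(git2rdata)

root <- tempdir()

# Stable sorting makes diffs between commits small and predictable
write_vc(iris, file = "iris", root = root, sorting = "Sepal.Length")

# Metadata restores factor levels and column classes on read
stored <- read_vc("iris", root = root)
```

The metadata file is what lets a later `read_vc()` reconstruct factors and column classes that a bare CSV would lose.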
Create and Query a Local Copy of GenBank in R
Download large sections of GenBank https://www.ncbi.nlm.nih.gov/genbank/ and generate a local SQL-based database. A user can then query this database using restez functions or through rentrez https://CRAN.R-project.org/package=rentrez wrappers.
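A rough sketch of that workflow, with function names assumed from the restez interface; the accession number is illustrative and `db_download()` prompts interactively for which GenBank section to fetch.

```r
# Sketch (names assumed from the restez interface): build a local
# GenBank database, then query it by accession.
library(restez)

# Point restez at a storage location (placeholder path)
restez_path_set(tempdir())

# Download a GenBank section and build the local SQL database
db_download()   # interactive choice of GenBank domain
db_create()

# Retrieve a sequence locally instead of via NCBI each time
seq <- gb_sequence_get("AY952423")  # accession is illustrative
```

Once the database is built, repeated queries run locally, avoiding rate limits and network latency of live NCBI requests.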
Parse Darwin Core Files
Parse and create Darwin Core (http://rs.tdwg.org/dwc/) Simple files and Archives. Functionality includes reading and parsing all the files in a Darwin Core Archive, including the datasets and metadata; reading and parsing simple Darwin Core files; and validating Darwin Core Archives.
This is a utility for transforming Ecological Metadata Language (EML) files into JSON-LD and back into EML. Doing so creates a list-based representation of EML in R, so that EML data can easily be manipulated using standard R tools. This makes this package an effective backend for other R-based tools working with EML. By abstracting away the complexity of XML Schema, developers can build around native R list objects and not have to worry about satisfying many of the additional constraints set by the schema (such as element ordering, which is handled automatically). Additionally, the JSON-LD representation enables the use of developer-friendly JSON parsing and serialization that may facilitate the use of EML in contexts outside of R, as well as informatics-friendly serializations such as RDF and SPARQL queries.
Polyhedra Database
A polyhedra database scraped from various sources, stored as R6 objects, with rgl visualization capabilities.
Parse a BibTeX file to a data.frame to make it accessible for further analysis and visualization.
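A minimal sketch of that parse step using bib2df; the file path is a placeholder, and the column names reflect the package's convention of upper-case BibTeX field names.

```r
# Sketch: parse a .bib file into a data.frame for analysis.
# "references.bib" is a placeholder path.
library(bib2df)

refs <- bib2df("references.bib")

# One row per entry; e.g. tally entry types (ARTICLE, BOOK, ...)
table(refs$CATEGORY)
```

With the bibliography in a data frame, standard tools (dplyr, ggplot2) can be used to summarize authors, years, or venues.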
View DocumentationTo facilitate the analysis of positron emission tomography (PET) time activity curve (TAC) data, and to encourage open science and replicability, this package supports data loading and analysis of multiple TAC file formats. Functions are available to analyze loaded TAC data for individual participants or in batches. Major functionality includes weighted TAC merging by region of interest (ROI), calculating models including standardized uptake value ratio (SUVR) and distribution volume ratio (DVR, Logan et al. 1996 doi:10.1097/00004647-199609000-00008), basic plotting functions and calculation of cut-off values (Aizenstein et al. 2008 doi:10.1001/archneur.65.11.1509). Please see the walkthrough vignette for a detailed overview of tacmagic functions.
The Resource Description Framework (RDF) is a widely used data representation model that forms the cornerstone of the Semantic Web. RDF represents data as a graph rather than the familiar data table or rectangle of relational databases. The rdflib package provides a friendly and concise user interface for performing common tasks on RDF data, such as reading, writing and converting between the various serializations of RDF data, including rdfxml, turtle, nquads, ntriples, and json-ld; creating new RDF graphs, and performing graph queries using SPARQL. This package wraps the low level redland R package which provides direct bindings to the redland C library. Additionally, the package supports the newer and more developer friendly JSON-LD format through the jsonld package. The package interface takes inspiration from the Python rdflib library.
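A small sketch of the graph-building and SPARQL workflow described above; the URIs are invented example identifiers.

```r
# Sketch: build a tiny RDF graph and query it with SPARQL.
# The subject/predicate URIs are illustrative (example.org, FOAF).
library(rdflib)

g <- rdf()

# Add one triple: <alice> foaf:name "Alice"
rdf_add(g,
  subject   = "http://example.org/alice",
  predicate = "http://xmlns.com/foaf/0.1/name",
  object    = "Alice")

# Query the graph; results come back as a data frame
rdf_query(g,
  "SELECT ?s ?name WHERE { ?s <http://xmlns.com/foaf/0.1/name> ?name }")
```

Returning SPARQL results as a data frame is what makes the graph model interoperable with ordinary tabular R workflows.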
Automate the co-localization analysis of fluorescence microscopy images. Select regions of interest, extract pixel intensities from the image channels, and calculate different co-localization statistics. The methods implemented in this package are based on Dunn et al. (2011) doi:10.1152/ajpcell.00462.2010.
Contains functions for developing phylogenetic trees as deeply-nested lists ("dendrogram" objects). Enables bi-directional conversion between dendrogram and "phylo" objects (see Paradis et al. (2004) doi:10.1093/bioinformatics/btg412), and features several tools for command-line tree manipulation and import/export via Newick parenthetic text.
The Critical Care Clinical Data Processing Tools
An electronic health care record (EHR) data cleaning and processing platform. It focuses on heterogeneous, high-resolution longitudinal data and works with the Critical Care Health Informatics Collaborative (CCHIC) dataset. It was created to address various data reliability and accessibility problems of EHRs.