rOpenSci | Blog

All posts (Page 39 of 86)

Mongolite 2.0: GridFS, connection pooling, and more

This week, version 2.0 of the mongolite package was released to CRAN. Major new features in this release include support for MongoDB 4.0, GridFS, running database commands, and connection pooling. Mongolite is primarily an easy-to-use client for getting data in and out of MongoDB. However, it supports a growing number of advanced features such as aggregation, indexing, map-reduce, streaming, encryption, and enterprise authentication. The mongolite user manual provides a great introduction with details and worked examples....

Where to go observe birds in Radolfzell? An answer with R and open data

This post is the first in a series showcasing various rOpenSci packages as if Maëlle were a birder trying to make the most of R in general and rOpenSci in particular. Although the series' use cases will mostly feature birds, it will be an occasion to highlight rOpenSci packages that are more widely applicable, so read on no matter what your field is! Moreover, each post should stand on its own....

phylotaR: Retrieve Orthologous Sequences from GenBank

In this technote I will outline what phylotaR was developed for, how to install it, and how to run it with some simple examples. What is phylotaR? In any phylogenetic analysis it is important to identify sequences that share the same orthology – homologous sequences separated by speciation events. This is often performed by simply searching an online sequence repository using sequence labels. Relying solely on sequence labels, however, can miss sequences that have not been labelled, have unanticipated names, or have been mislabelled....

Extracting and Processing eBird Data

eBird is an online tool for recording bird observations. The eBird database currently contains over 500 million records of bird sightings, spanning every country and nearly every bird species, making it an extremely valuable resource for bird research and conservation. These data can be used to map the distribution and abundance of species, and assess how species’ ranges are changing over time. This dataset is available for download as a text file; however, this file is huge (over 180 GB!...

A package for dimensionality reduction of large data

Motivation. Note: Recently, two new UMAP R packages have appeared. These new packages provide more features than umapr does and they are more actively developed. The first is umap, which provides the same Python wrapping function as umapr and also an R implementation, removing the need for the Python version to be installed. It is available on CRAN. The second is uwot, which also provides an R implementation, removing the need for the Python version to be installed....

Working together to push science forward

Happy rOpenSci users can be found at