I don't see how R specifically addresses the reproducibility problem. It's been around for almost 30 years, and before its recent rise in popularity lots of science was done in C, Perl, Fortran etc. Not to mention that actual dependency versioning is pretty poor. I struggle to run other people's R code after about 6 months (especially if they used the tidyverse, as it pulls in hundreds of unstable dependencies), nobody records what package versions were used, and functions are seemingly deprecated every week.
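For what it's worth, recording versions doesn't need any extra packages. A minimal base-R sketch; the function name `record_versions` and the default filename are made up for illustration:

```r
# Hypothetical helper: save the name and version of every package
# namespace loaded in the current session to a CSV file.
record_versions <- function(path = "package-versions.csv") {
  pkgs <- sort(loadedNamespaces())
  versions <- data.frame(
    package = pkgs,
    version = vapply(
      pkgs,
      function(p) as.character(utils::packageVersion(p)),
      character(1),
      USE.NAMES = FALSE
    ),
    stringsAsFactors = FALSE
  )
  utils::write.csv(versions, path, row.names = FALSE)
  invisible(versions)
}
```

Calling `record_versions()` at the end of an analysis script at least leaves a record next to the results; `sessionInfo()` prints similar information to the console.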
1. Before R, commercial statistical packages were mainly used. You can, in principle, just use assembler and develop everything yourself, but it isn't practical. Regarding C/C++ and Fortran, many R packages are, in fact, wrappers around code in those or other languages, making that code easier to access. From that point of view R can be regarded as a glue language.
2. Regarding keeping versions straight, all past versions of packages in the CRAN repository are kept on CRAN. Microsoft's MRAN repository also maintains histories of packages, which can be accessed via the checkpoint package; it will install packages as they existed on a given date. Furthermore, install_version in the remotes and devtools packages can install specific versions.
3. Regarding tidyverse dependencies, you can reduce the number of packages you load by not using library(tidyverse) and instead loading only the specific packages you need.
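A rough sketch of points 2 and 3, wrapped in functions so nothing is installed until you call them. The function names, the example packages and version numbers, and the snapshot date are placeholders picked for illustration; `remotes::install_version()` and `checkpoint::checkpoint()` are the real entry points, but check their documentation for the current arguments.

```r
# Placeholder pins: package name -> version (illustrative values only).
pinned <- c(dplyr = "1.0.0", ggplot2 = "3.3.0")

# Install each pinned version from the CRAN archive.
# upgrade = "never" asks remotes not to upgrade dependencies that are
# already installed; as far as I know, missing dependencies are still
# installed at whatever version the repository currently serves.
install_pinned <- function(pins) {
  for (pkg in names(pins)) {
    remotes::install_version(pkg, version = pins[[pkg]], upgrade = "never")
  }
}

# Use packages as they existed on a given date.
# This relies on a dated snapshot repository (MRAN) being reachable,
# which is worth checking first.
use_snapshot <- function(date = "2020-01-01") {
  checkpoint::checkpoint(date)
}

# Point 3: attach only what the script needs instead of library(tidyverse), e.g.
#   library(dplyr)
#   library(ggplot2)
```

Usage would be `install_pinned(pinned)` or `use_snapshot("2020-01-01")` at the top of a script, assuming remotes or checkpoint is installed.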
> Before R commercial statistical packages were mainly used.
Maybe in your field. I work in bioinformatics; before R, Perl was widely used as a high-level language.
> Regarding keeping versions straight, all past versions of packages in the CRAN repository are kept on CRAN...
This is woefully inadequate if you need to replicate somebody else's environment. Nobody should think that manually guessing and typing in each package version, and hoping they're compatible, is a viable option. Not to mention that even if you specify an older version of a package, it doesn't pull in compatible dependencies; it just pulls in the latest versions. There's renv, but it hasn't reached widespread use.
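For context, the renv workflow mentioned above is roughly the following. The wrapper function names are mine; `renv::init()`, `renv::snapshot()` and `renv::restore()` are the package's documented entry points, and this assumes renv itself is already installed.

```r
# Author side, once per project: set up a project-local library.
start_project <- function() {
  renv::init()
}

# Author side, after installing or updating packages:
# record the exact package versions in renv.lock.
lock_project <- function() {
  renv::snapshot()
}

# Reader side: reinstall the versions listed in renv.lock.
restore_project <- function() {
  renv::restore()
}
```

This only helps if the original author committed `renv.lock` alongside the code, which is exactly the adoption problem.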
> Regarding tidyverse dependencies you can reduce the number of packages you load by not using library(tidyverse) and instead load the specific packages you need. This will result in fewer packages being loaded
We're talking about replicating other people's work. We don't have any control over their code, and R users are largely ignorant of software best practices.
Totally agree. I find it frustrating trying to reproduce other people's work in R. How has this situation been allowed to continue for so long? It's unacceptable, especially when used for science. It's impossible to replicate anything unless you are lucky enough to find which package version introduced the breaking change, and even then you have to repeat this for every breakage. Even _renv_ is a library you have to install within your R environment, which is pointless. Where is a dependency solver like conda for R? Not that conda is perfect, but I've been happy recently with its drop-in replacement, mamba.
The packages that were used in statistics were SAS, SPSS and Stata. Perl is not a statistical package and has nowhere near the depth of statistical capabilities of R.
Don't forget that I also mentioned the checkpoint package in my post. You only need to know the date for that, not the version of each of the packages.
In your last paragraph I think you are referring more to software development practices than to what is available through R. Simply using R, or any language, doesn't guarantee good practice.
That's a very roundabout way to solve an actual problem. In many cases you don't want every package pinned to whatever was _latest_ on a given date; you need a more fine-grained way to pin individual package versions. I don't think a date snapshot solves that, and I don't know whether you can do it with checkpoint.