Curated CRAN Sources
Enhanced Advanced
This section provides an overview of what curated-cran sources are, why they are useful, and how to use them. If you are already familiar with curated-cran sources, refer to the Quick Start section on creating your first curated subset of CRAN.
Overview
Curated CRAN sources are based on a current mirror of CRAN. It can be useful to only include certain CRAN packages and versions within a source. This is especially helpful in secure, regulated environments, where only verified sets of packages are allowed.
Creating a Curated CRAN Source
$ rspm create source --name=subset --type=curated-cran
<< Source 'subset':
<< Type: Curated CRAN - 2024-04-17
Curated CRAN sources don't need to be pinned to a specific snapshot date at the time of creation; any date can be picked when adding packages with rspm update (described below). Once the source has been created, be sure to subscribe a repository to the source to make the packages available to users:
# Create a repository:
$ rspm create repo --name=cran --type=r --description='Access Curated CRAN packages'
<< Repository: cran - Access Curated CRAN packages - R
# Subscribe a repository to the curated-cran source:
$ rspm subscribe --repo=cran --source=subset
<< Repository: cran
<< Sources:
<< --subset (Curated CRAN - 2024-04-17)
Including Packages in a Curated CRAN Source
Packages are included in a curated-cran source by uploading a requirements.txt definition with rspm update. This section explains how a requirements file is defined and how to use it to include packages in a curated-cran source.
Requirements Files
The requirements.txt format that Package Manager accepts is defined as:
[package name] [optionally: version constraints] [[optionally: extras]]
The extras block allows you to specify which types of related packages should be included along with the package itself.
The extras block is additive to the --include flag that the source was created with. See the section on source-level included packages to learn more.
The following values are valid for the extras block:
- depends
- imports
- linking-to
- suggests
- all
Suggested dependency inclusions
Suggested dependencies, when included, are included only at a single level, not recursively; all other types are included recursively.
As an example, a requirements.txt file could request the following:
- A3 with versions greater than or equal to 0.9.2, along with all related packages
- Only ggplot2 version 3.4.4
- All available versions of plumber, along with suggested packages
- All packages from requirements2.txt
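A file matching that description might be written as follows. This is a sketch: it assumes extras are given in square brackets after the constraints, as in the format line earlier in this section, and that another file is referenced with pip's -r syntax. The file name and working directory are placeholders.

```shell
# Write an illustrative requirements.txt. The exact contents are an assumption
# based on the described format: name, optional constraints, optional [extras].
cat > requirements.txt <<'EOF'
A3 >= 0.9.2 [all]
ggplot2 == 3.4.4
plumber [suggests]
-r requirements2.txt
EOF

# Print the file for review before passing it to rspm update.
cat requirements.txt
```

Running a dry run of rspm update against a file like this is a reasonable way to confirm which versions and related packages are actually resolved.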
Filtering by version is not supported for:
- Curated CRAN sources created prior to Package Manager version 2024.04
- Sources created using the --strict flag
By default, the depends, imports, and linking-to dependencies for each package are included. The default types of related packages to include can be configured per source. See the section on source-level included packages to learn more.
As shown in the example above, a package doesn't need to have any version constraints defined. It can also have as many version constraints as needed. The versions made available to Package Manager depend on what is available at the snapshot date specified when updating the source.
All version parsing and matching criteria are based on PEP-440. Refer to the PEP-440 documentation for information on version formatting and constraints. For more information on the Requirements File Format, refer to pip's documentation.
requirements.txt limitations
Not everything defined in the Requirements File Format specification is supported in Package Manager.
The curated-cran source only parses package names, version ranges, and recursive file references. Any other definitions (e.g., option flags, environment markers) within an uploaded requirements.txt file are ignored.
The requirements.txt file also supports declaring multiple references to the same package with different version constraints, for example one line with tidyr == 1.3.0 and another line with tidyr == 1.3.1.
Multiple references are combined with an OR operator, leading the curated-cran source to evaluate the defined version constraints as "tidyr == 1.3.0 or tidyr == 1.3.1".
In this example, Package Manager pulls in version 1.3.0 and version 1.3.1.
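The two-reference case could be written like this. It is a minimal sketch, and the file name is a placeholder.

```shell
# Two separate references to the same package; each line is evaluated on its
# own and the results are combined with OR.
cat > requirements.txt <<'EOF'
tidyr == 1.3.0
tidyr == 1.3.1
EOF

# Count how many references to tidyr the file contains.
grep -c '^tidyr' requirements.txt
```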
Excluding a package using multiple references
Use caution when referencing a package multiple times when using a != constraint. As an example, consider one line with tidyr >= 1.1.0, < 1.3.0 and a second line with tidyr != 1.2.1.
This still includes version 1.2.1 because the two lines are evaluated as "tidyr >= 1.1.0, < 1.3.0 or tidyr != 1.2.1", and 1.2.1 satisfies the first condition.
To guarantee that version 1.2.1 is excluded, include all version constraints on a single line (tidyr >= 1.1.0, < 1.3.0, != 1.2.1) so Package Manager evaluates all constraints together.
Filtering out package versions can break the R package graph, so it must be done with care.
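As a sketch of the difference, the first file below spreads the constraints over two lines, so 1.2.1 is still matched by the first line. The second file keeps them on one line so they are evaluated together. The file names and the duplicates helper are made up for this example, and the helper assumes whitespace follows each package name.

```shell
# Pitfall: two references are OR-ed, so 1.2.1 still satisfies the first line.
cat > requirements-or.txt <<'EOF'
tidyr >= 1.1.0, < 1.3.0
tidyr != 1.2.1
EOF

# Safer: all constraints on one line are evaluated together.
cat > requirements-and.txt <<'EOF'
tidyr >= 1.1.0, < 1.3.0, != 1.2.1
EOF

# Hypothetical helper: report packages that are referenced more than once.
duplicates() {
  awk '{ count[$1]++ }
       END { for (p in count) if (count[p] > 1) print p " is referenced " count[p] " times" }' "$1"
}

duplicates requirements-or.txt    # prints: tidyr is referenced 2 times
duplicates requirements-and.txt   # prints nothing
```

A check like this only flags repeated references; whether a repeated reference is a problem depends on whether a != constraint is involved.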
Updating a Curated CRAN Source
To make packages available in a Curated CRAN source, run rspm update with a requirements file for a specific CRAN snapshot date. Package Manager supports a dry run before committing the changes to the source:
# Do a dry-run to visualize the changes to the source before doing them
$ rspm update --source=subset --file-in=/path/to/requirements.txt --snapshot=2024-04-17
Updating to the latest snapshot
To use the most recent snapshot available, omit the --snapshot flag from the dry-run command.
A preview of the changes is presented:
rspm update --source=subset --file-in=requirements.txt --snapshot=2024-04-17
Packages from 'requirements.txt' to update source 'subset' at CRAN snapshot date '2024-04-17':
Name Version Action
A3 1.0.0 add
arrow 15.0.1 add
assertthat 0.2.1 add
base64enc 0.1-3 add
<truncated>
If the output above looks correct, execute this command again with the --commit and --snapshot=2024-04-17 flags to update the source with the new set of packages.
To commit the changes, repeat the command, adding the --commit flag:
# Now commit the changes to the source:
$ rspm update --source=subset --file-in=/path/to/requirements.txt --snapshot=2024-04-17 --commit
The finalized contents of the source are then printed:
rspm update --source=subset --file-in=requirements.txt --snapshot=2024-04-17 --commit
Successfully updated source 'subset' at CRAN snapshot date '2024-04-17' with the following packages from 'requirements.txt':
Name Version Action
A3 1.0.0 add
arrow 15.0.1 add
assertthat 0.2.1 add
base64enc 0.1-3 add
<truncated>
rspm update overwrite behavior
Running rspm update on a Curated CRAN source overwrites the source with only the packages defined in your requirements.txt file. However, previous snapshots of the source are still available with a pinned repo URL.
To update the source to a different snapshot date, use the update command again:
# Update packages in a curated-cran source:
$ rspm update --source=subset --file-in=/path/to/requirements.txt --snapshot=2024-04-18 --commit
Curated CRAN sources can be pinned to any date for which Posit has a CRAN snapshot (typically, once per weekday). Curated CRAN sources also support using any date, regardless of the previously used snapshot dates. If the source was initially set to 2021-02-03, it can then be set to a later date with --snapshot=2022-06-01. If you later want to pin it back to the original date, run rspm update again with --snapshot=2021-02-03.
Curated CRAN snapshots
This allows you to set the Curated CRAN source to any date where a CRAN snapshot has been taken on our servers. To pin to a version of a package that doesn't exist on CRAN anymore, pin to a date when the version of the package existed.
The snapshot date for non-strict sources can be moved both forwards and backwards in time.
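The forward-and-back sequence could be scripted as below. This is a sketch only: it reuses the source name and file path from the earlier examples, and the RUN variable is a hypothetical switch introduced here so the commands are printed by default and executed only when RUN=1 is set.

```shell
SOURCE=subset
REQ_FILE=/path/to/requirements.txt   # placeholder path

# Move forward to a later snapshot, then back to the original date.
for date in 2021-02-03 2022-06-01 2021-02-03; do
  cmd="rspm update --source=$SOURCE --file-in=$REQ_FILE --snapshot=$date --commit"
  echo "$cmd"
  # Only execute against a real server when explicitly requested.
  if [ "${RUN:-0}" = "1" ]; then
    $cmd
  fi
done
```

Printing first mirrors the dry-run habit recommended above: review the sequence, then rerun with RUN=1.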
Source-Level Included Packages
A Curated CRAN source automatically includes the related depends, imports, and linking-to packages for each requested package.
You can change the types of related packages that are included by passing the --include flag when creating a source. For example:
# Equivalent to the default setting
$ rspm create source --name=subset --type=curated-cran --include=depends,imports,linking-to
# Include the defaults plus `suggests`
$ rspm create source --name=subset --type=curated-cran --include=depends,imports,linking-to,suggests
# Don't include any related packages
$ rspm create source --name=subset --type=curated-cran --include=none
# Include all related packages:
$ rspm create source --name=subset --type=curated-cran --include=all
Editing --include
The value of the --include flag cannot be changed after source creation.
The following values are valid for the --include flag:
- depends
- imports
- linking-to
- suggests
- none
- all
Suggested dependency inclusions
Suggested dependencies, when included, are included only at a single level, not recursively; all other types are included recursively.
Included related packages can also be configured per-package by adding an extras block to your packages in the requirements.txt file. The extras block is additive to the --include flag that the source was created with. For example, on a source created with the default inclusions, a requirements file can request:
- A3 with versions greater than or equal to 0.9.2 plus all related packages
- ggplot2 version 3.4.4 using the default inclusions
- All versions of plumber plus depends, imports, linking-to, and suggests packages
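To illustrate what "additive" means, the effective inclusion types for a package can be thought of as the union of the source's --include value and the package's extras. The snippet below is only a model of that rule in shell, not something Package Manager runs, and the variable names are made up for the example.

```shell
source_include="depends imports linking-to"   # default --include for the source
extras="suggests"                             # e.g. an extras block of [suggests]

# Union of both lists, de-duplicated and joined with spaces.
effective=$(printf '%s\n' $source_include $extras | sort -u | tr '\n' ' ')
effective=${effective% }   # drop the trailing space

echo "$effective"          # prints: depends imports linking-to suggests
```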
Strict Curated CRAN
Older Curated CRAN sources (those created before Package Manager version 2024.04) use a less permissive package version snapshot behavior. You can still create Curated CRAN sources with the previous behavior by passing the --strict option when creating the source:
$ rspm create source --name=subset --type=curated-cran --strict
<< Source 'subset':
<< Type: Curated CRAN - 2024-04-18 - Strict
Editing --strict
The value of the --strict flag cannot be changed after source creation.
Strict vs Default Sources
When a source uses the new default archive policy, updates to the source incorporate all versions of the affected package to match the associated CRAN snapshot. The package versions associated with a curated snapshot will match the versions available for the associated CRAN snapshot, including current and archived packages.
When a source uses the --strict archive policy, any updates to the source incorporate only the current versions of the affected packages. Any previous package versions associated with the source at the previous transaction are recorded as archived packages.
Strict sources are useful for those who need more control over available versions without the ability to add version constraints. Non-strict sources are significantly more flexible.
rspm add vs rspm update
The rspm add command is still supported for adding packages to Curated CRAN sources; however, it is recommended that rspm update is used instead. rspm add may be deprecated in a future release of Package Manager.
rspm add can be used to add individual packages to a Curated CRAN source:
# Specify the top-level packages you want to add:
$ rspm add --packages=ggplot2,shiny --source=subset
The output lists all the packages that will be added. The proposal can be saved to a CSV file using the --csv-out flag. The required dependencies for the named packages are automatically discovered and included. Optionally, use the --include-suggests flag to also discover and add suggested packages.
Packages to update source 'subset' at CRAN snapshot date '2024-04-22':
Name Version Action
base64enc 0.1-3 add
bslib 0.7.0 add
cachem 1.0.8 add
cli 3.6.2 add
<truncated>
If the output above looks correct, execute this command again with the --commit and --snapshot=2024-04-22 flags to update the source with the new set of packages.
To commit the changes, repeat the command, adding the --commit flag.
# Commit the top-level packages you want to add:
$ rspm add --packages=ggplot2,shiny --source=subset --snapshot=2024-04-22 --commit
rspm add also supports adding a large number of packages that are specified in a file. To do this, create a file containing one package name per line, for example /tmp/packages.csv. Then use the add command, this time using the --file-in flag.
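A sketch of both steps follows. The package names are reused from the earlier rspm add example, the file location falls back to /tmp, and the RUN variable is a hypothetical switch introduced here: the add command is printed by default and executed only when RUN=1 is set, since it needs a configured Package Manager server.

```shell
PKG_FILE="${TMPDIR:-/tmp}/packages.csv"

# One package name per line.
cat > "$PKG_FILE" <<'EOF'
ggplot2
shiny
EOF

# Build the add command; without --commit it only shows the proposed changes.
cmd="rspm add --file-in=$PKG_FILE --source=subset"
echo "$cmd"
if [ "${RUN:-0}" = "1" ]; then
  $cmd
fi
```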