6  - Package states

Now that we have created our package structure, it is worth taking a moment to understand the different states a package can be in, and to connect them with things you already do, such as calling install.packages(), perhaps without knowing what happens behind the scenes. There are five package states, explained below.

6.1 Source package

When we created our package in section 5, we created a source package. This is probably not the form you are most familiar with, but in short it is just a directory of files and subdirectories arranged in a specific structure. If you have already interacted with packages developed in a Git repository (most packages have a dedicated one), the structure you saw there is a source package.

In general, the source package is the most up-to-date form of a package, but be aware that this does not necessarily mean the most stable one. To conclude, keep in mind that this state is not tied to any OS; it is just a set of files with a particular organisation.

6.2 Bundled package

A bundled package is a package compressed into a single file. By convention, the extension of an R package bundle is .tar.gz. The first part (.tar) means the files have been gathered into a single file, and the second (.gz) means that file has been compressed with gzip. Like the previous state, it is not tied to any OS; it is more a means of transport for your package. If you are familiar with the packages available on CRAN, a package's description page on the CRAN website lists an item Package source with an associated .tar.gz file, which is the bundled version of the package.

If you decompress a bundle, you will find three differences from a source package:

  • If you have vignettes (discussed in the section on documentation), they have been rendered and appear under inst/doc/, with an index in the build/ directory.
  • During development, you may keep temporary files, for example to save compilation time. These files are never found in a bundle.
  • Finally, the files listed in the .Rbuildignore of a source package are not included in the bundle, because by definition they are excluded from the package build.
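The last point can be checked with a short experiment. The sketch below is only an illustration under stated assumptions: demopkg, the content of its DESCRIPTION file and notes.txt are hypothetical names invented for the example, and the bundle is produced by calling R CMD build through system2(), which is one way among others to build a bundle.

```r
# Create a throwaway source package (all names here are hypothetical).
root <- tempfile("states_")
pkg  <- file.path(root, "demopkg")
dir.create(file.path(pkg, "R"), recursive = TRUE)

writeLines(c(
  "Package: demopkg",
  "Title: Demonstration Package",
  "Version: 0.0.1",
  "Author: Jane Doe",
  "Maintainer: Jane Doe <jane@example.com>",
  "Description: A hypothetical package used only to illustrate package states.",
  "License: GPL-3"
), file.path(pkg, "DESCRIPTION"))
writeLines("export(hello)", file.path(pkg, "NAMESPACE"))
writeLines('hello <- function() "hello"', file.path(pkg, "R", "hello.R"))

# A development-only file, excluded from the bundle through .Rbuildignore.
writeLines("scratch notes", file.path(pkg, "notes.txt"))
writeLines("^notes\\.txt$", file.path(pkg, ".Rbuildignore"))

# Build the bundle with R CMD build, run from the parent directory.
old_wd <- setwd(root)
build_log <- system2(file.path(R.home("bin"), "R"),
                     c("CMD", "build", "demopkg"),
                     stdout = TRUE, stderr = TRUE)
setwd(old_wd)

# The bundle name follows the <Package>_<Version>.tar.gz pattern.
bundle <- file.path(root, "demopkg_0.0.1.tar.gz")
bundle_files <- untar(bundle, list = TRUE)

# Compare what is in the source directory with what is in the bundle.
source_files <- list.files(pkg, recursive = TRUE, all.files = TRUE)
print(source_files)
print(bundle_files)
```

If the build succeeds, notes.txt should be listed among the source files but not among the files of the bundle.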

6.3 Binary package

This state is the first one that depends on the OS. If you want to share your package with users who do not have package development tools installed, you need to provide a binary package. Like a bundled package, a binary package is a single file, but as a user you have to choose the correct one for your OS. You will normally find two kinds of binary package extensions: .tgz for macOS and .zip for Windows. On Linux, you need the tools necessary to install from .tar.gz files, although some recent resources such as Posit Public Package Manager give Linux users access to binary packages, just like on the other OSes.

We will not go deeper into the internal structure of a binary package, but be aware that its structure is very different from that of source and bundled packages. If you want to go further, you can find a very nice comparison of the three here.
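To see which kind of package your own machine expects, you can look at .Platform$pkgType. The sketch below is a small illustration; expected_extension() is a hypothetical helper written for this example, not a function provided by R, and it only encodes the extensions mentioned above.

```r
# Package type that this build of R prefers on the current platform.
pkg_type <- .Platform$pkgType

# Hypothetical helper: map a package type to the file extension
# described in the text above.
expected_extension <- function(type) {
  if (type == "win.binary") {
    ".zip"
  } else if (startsWith(type, "mac.binary")) {
    ".tgz"
  } else {
    ".tar.gz"  # "source": installation from a bundle
  }
}

cat("pkgType on this machine:", pkg_type, "\n")
cat("File extension to expect:", expected_extension(pkg_type), "\n")
```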

6.4 Installed package

An installed package is a binary package that has been decompressed into a package library (see section below). This is what happens when you call install.packages(). You have almost certainly run into trouble at this step before: it can be complicated to reinstall a package that is already in use in the current R session. Sometimes you get prompts about packages that “need compilation”, a request to restart in a fresh session, or a question asking whether to install a package from source because that version is more recent than the one you selected. There is no perfect solution, but when troubleshooting, Windows users should strive to install packages in a clean R session, with as few packages loaded as possible. As for the prompt about installing from source because the version is higher, I suggest doing so only if you know what you are doing, for example if you need a specific feature or function that is so far available only in the source version.

If you have a lot of trouble with package installation and updates, remember the pak [1] package. We used it earlier to check our package name, but it also provides a good and powerful alternative to install.packages(). Use it with caution because it is a relative newcomer, but we are using it more and more in our own workflows (for example in the GitHub Actions that we will see later).
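You can look at an installed package directly on disk. The short sketch below uses stats, chosen only because it ships with every R installation; the paths printed will differ from one machine to another.

```r
# Locate the installed copy of a package that ships with R.
pkg_path <- find.package("stats")
print(pkg_path)

# Top-level content of the installed package.
print(list.files(pkg_path))

# Content of the R/ subdirectory of the installed package.
print(list.files(file.path(pkg_path, "R")))

# Version recorded for the installed package.
print(packageVersion("stats"))
```

In an installed package, the R/ directory typically holds a lazy-load database rather than individual .R scripts, which is one of the visible differences from a source package.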

6.5 In-memory package

Everyone who has used R has used this state all the time, through the library() command. When you call it, you load the package into memory (for the current R session) and, in addition, attach it to the search path. We will come back to the difference between loading and attaching a package when we talk about the “hell” of dependencies (it is a very important point when you write a package).

To conclude this part, do not confuse the words library and package. People sometimes refer to a package as a library, for example “I use the library DBI to deal with my database”. As the name suggests, a library is a place that contains installed packages, just like a library in the real world contains books. Most of the time this is not a problem because people will understand anyway, but the distinction is important and useful as you develop packages.
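As a first glimpse of that difference, the sketch below loads a package without attaching it, then attaches it. It uses splines only because it ships with R and is usually not attached by default; that it is not already attached in your session is an assumption.

```r
pkg <- "splines"
search_name <- paste0("package:", pkg)

# Loading: the namespace is in memory, but not on the search path.
loadNamespace(pkg)
is_loaded <- pkg %in% loadedNamespaces()
attached_after_load <- search_name %in% search()

# A loaded package can already be used through the :: operator.
basis <- splines::bs(1:10, df = 3)

# Attaching: library() loads the package if needed, then adds it
# to the search path.
library(pkg, character.only = TRUE)
attached_after_library <- search_name %in% search()

print(c(loaded = is_loaded,
        attached_after_load = attached_after_load,
        attached_after_library = attached_after_library))
```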
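The distinction is easy to see from the R console. In the sketch below, each entry returned by .libPaths() is a library, that is, a directory, and the packages are what is installed inside it. The numbers printed depend entirely on your own installation.

```r
# Libraries known to the current session: each one is a directory.
libs <- .libPaths()
print(libs)

# Number of installed packages found in each library.
n_packages <- vapply(
  libs,
  function(lib) nrow(installed.packages(lib.loc = lib)),
  integer(1)
)
print(n_packages)

# The library (directory) that holds the installed "stats" package.
print(dirname(find.package("stats")))
```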