In R, the fundamental unit of shareable code is the package. A package bundles together code, data, documentation, and tests, and is easy to share with others. As of June 2019, there were over 14,000 packages available on the Comprehensive R Archive Network, or CRAN, the public clearing house for R packages. This huge variety of packages is one of the reasons that R is so successful: the chances are that someone has already solved a problem that you’re working on, and you can benefit from their work by downloading their package.
If you’re reading this book, you already know how to use packages:
- You install them from CRAN with install.packages('x').
- You use them in R with library('x').
- You get help on them with package?x and help(package = 'x').
The goal of this book is to teach you how to develop packages so that you can write your own, not just use other people’s. Why write a package? One compelling reason is that you have code that you want to share with others. Bundling your code into a package makes it easy for other people to use it, because like you, they already know how to use packages. If your code is in a package, any R user can easily download it, install it and learn how to use it.
But packages are useful even if you never share your code. As Hilary Parker says in her introduction to packages: “Seriously, it doesn’t have to be about sharing your code (although that is an added benefit!). It is about saving yourself time.” Organising code in a package makes your life easier because packages come with conventions. For example, you put R code in R/, you put tests in tests/ and you put data in data/. These conventions are helpful because:
- They save you time: you don’t need to think about the best way to organise a project, you can just follow a template.
- Standardised conventions lead to standardised tools: if you buy into R’s package conventions, you get many tools for free.
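As a rough illustration of these conventions, the following base R sketch creates the three directories mentioned above inside a temporary folder. The package name mypkg and the hello() function are hypothetical, and the result is not yet an installable package (it has no DESCRIPTION file, for instance); it only shows where things go.

```r
# Create the conventional directories of a package inside a temporary folder.
# "mypkg" is a hypothetical package name used only for illustration.
pkg_dir <- file.path(tempdir(), "mypkg")
for (d in c("R", "tests", "data")) {
  dir.create(file.path(pkg_dir, d), recursive = TRUE, showWarnings = FALSE)
}

# Put one (hypothetical) function in R/, where package code lives.
writeLines(
  "hello <- function() 'Hello, world!'",
  file.path(pkg_dir, "R", "hello.R")
)

# List what was created.
list.files(pkg_dir, recursive = TRUE, include.dirs = TRUE)
```

In practice you would rarely create these directories by hand; helper functions such as usethis::create_package() set up the skeleton for you.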
It’s even possible to use packages to structure your data analyses, as described by Marwick, Boettiger, and Mullen (2018a, 2018b).
1.1 Philosophy
This book espouses our philosophy of package development: anything that can be automated, should be automated. Do as little as possible by hand. Do as much as possible with functions. The goal is to spend your time thinking about what you want your package to do rather than thinking about the minutiae of package structure.
This philosophy is realised primarily through the devtools package, which is the public face for a suite of R functions that automate common development tasks. The release of version 2.0.0 in October 2018 marked its internal restructuring into a set of more focused packages, with devtools becoming more of a meta-package. The usethis package is the sub-package you are most likely to interact with directly; we explain the devtools-usethis relationship in section 3.2.
As always, the goal of devtools is to make package development as painless as possible. It encapsulates the best practices developed by first author Hadley Wickham, initially from years as a prolific solo developer. More recently, he has assembled a team of ~10 developers at RStudio, who collectively look after ~150 open source R packages, including those known as the tidyverse. The reach of this team allows us to explore the space of all possible mistakes at an extraordinary scale. Fortunately, it also affords us the opportunity to reflect on both the successes and failures, in the company of expert and sympathetic colleagues. We try to develop practices that make life more enjoyable for both the maintainer and users of a package. The devtools meta-package is where these lessons are made concrete.
Throughout the book, we highlight specific ways that RStudio can expedite your package development workflow, in specially formatted sections like this.
devtools works hand-in-hand with RStudio, which we believe is the best development environment for most R users. The main alternative is Emacs Speaks Statistics (ESS), which is a rewarding environment if you’re willing to put in the time to learn Emacs and customise it to your needs. The history of ESS stretches back over 20 years (predating R!), but it’s still actively developed and many of the workflows described in this book are also available there. For those loyal to vim, we recommend the Nvim-R plugin.
Together, devtools and RStudio insulate you from the low-level details of how packages are built. As you start to develop more packages, we highly recommend that you learn more about those details. The best resource for the official details of package development is always the official Writing R Extensions manual. However, this manual can be hard to understand if you’re not already familiar with the basics of packages. It’s also exhaustive, covering every possible package component, rather than focussing on the most common and useful components, as this book does. Writing R Extensions is a useful resource once you’ve mastered the basics and want to learn what’s going on under the hood.
1.2 In this book
Chapter 2 runs through the development of a small toy package. It’s meant to paint the Big Picture and suggest a workflow, before we descend into the detailed treatment of the key components of an R package.
Chapter 3 describes how to prepare your system for package development, which has more requirements than simply running R scripts. This includes recommendations on some optional setup that can make your workflow more pleasant, which tends to lead to a higher-quality product.
The basic structure of a package and how that varies across different states is explained in chapter 4.
Chapter 5 goes over core workflows that come up repeatedly for package developers. This chapter also covers connections between our favored tools, such as devtools/usethis and RStudio, and the philosophies that drive the design of these tools.
Subsequent chapters of the book go into more details about each package component. They’re roughly organised in order of importance:
- R code, chapter 7: the most important directory is R/, where your R code lives. A package with just this directory is still a useful package. (And indeed, if you stop reading the book after this chapter, you’ll have still learned some useful new skills.)
- Package metadata, chapter 8: the DESCRIPTION lets you describe what your package needs to work. If you’re sharing your package, you’ll also use the DESCRIPTION to describe what it does, who can use it (the license), and who to contact if things go wrong.
- Documentation, chapter 10: if you want other people (including future-you!) to understand how to use the functions in your package, you’ll need to document them. We’ll show you how to use roxygen2 to document your functions. We recommend roxygen2 because it lets you write code and documentation together while continuing to produce R’s standard documentation format.
- Vignettes, chapter 11: function documentation describes the nit-picky details of every function in your package. Vignettes give the big picture. They’re long-form documents that show how to combine multiple parts of your package to solve real problems. We’ll show you how to use R Markdown and knitr to create vignettes with a minimum of fuss.
- Tests, chapter 12: to ensure your package works as designed (and continues to work as you make changes), it’s essential to write unit tests which define correct behaviour, and alert you when functions break. In this chapter, we’ll teach you how to use the testthat package to convert the informal interactive tests that you’re already doing to formal, automated tests.
- Namespace, chapter 13: to play nicely with others, your package needs to define what functions it makes available to other packages and what functions it requires from other packages. This is the job of the NAMESPACE file and we’ll show you how to use roxygen2 to generate it for you. The NAMESPACE is one of the more challenging parts of developing an R package but it’s critical to master if you want your package to work reliably.
- External data, chapter 14: the data/ directory allows you to include data with your package. You might do this to bundle data in a way that’s easy for R users to access, or just to provide compelling examples in your documentation.
- Compiled code, chapter 15: R code is designed for human efficiency, not computer efficiency, so it’s useful to have a tool in your back pocket that allows you to write fast code. The src/ directory allows you to include speedy compiled C and C++ code to solve performance bottlenecks in your package.
- Other components, chapter 17: this chapter documents the handful of other components that are rarely needed: demo/, exec/, po/ and tools/.
The final chapters describe general best practices not specifically tied to one directory:
- Git and GitHub, chapter 18: mastering a version control system is vital to easily collaborate with others, and is useful even for solo work because it allows you to easily undo mistakes. In this chapter, you’ll learn how to use the popular Git and GitHub combo with RStudio.
- Automated checking, chapter 19: R provides very useful automated quality checks in the form of R CMD check. Running them regularly is a great way to avoid many common mistakes. The results can sometimes be a bit cryptic, so we provide a comprehensive cheatsheet to help you convert warnings to actionable insight.
- Release, chapter 20: the life-cycle of a package culminates with release to the public. This chapter compares the two main options (CRAN and GitHub) and offers general advice on managing the process.
This is a lot to learn, but don’t feel overwhelmed. Start with a minimal subset of useful features (e.g. just an R/ directory!) and build up over time. To paraphrase the Zen monk Shunryu Suzuki: “Each package is perfect the way it is, and it can use a little improvement”.
1.3 Acknowledgments
Since the first edition of R Packages was published, the packages supporting the workflows described here have undergone extensive development. The original trio of devtools, roxygen2, and testthat has expanded to include the packages created by the “conscious uncoupling” of devtools. Most of these packages originate with Hadley Wickham (HW), because of their devtools roots. There are many other significant contributors, many of whom now serve as maintainers:
- devtools: HW, Winston Chang, Jim Hester (maintainer, >= v1.13.5)
- usethis: HW, Jennifer Bryan (maintainer, >= v1.5.0)
- roxygen2: HW (maintainer), Peter Danenburg, Manuel Eugster
- testthat: HW (maintainer)
- desc: Gábor Csárdi (maintainer), Kirill Müller, Jim Hester
- pkgbuild: HW, Jim Hester (maintainer)
- pkgload: HW, Jim Hester (maintainer), Winston Chang
- rcmdcheck: Gábor Csárdi (maintainer)
- remotes: HW, Jim Hester (maintainer), Gábor Csárdi, Winston Chang, Martin Morgan, Dan Tenenbaum
- revdepcheck: HW, Gábor Csárdi (maintainer)
- sessioninfo: HW, Gábor Csárdi (maintainer), Winston Chang, Robert Flight, Kirill Müller, Jim Hester
This book and the R package development community benefit tremendously from experts who smooth over specific pain points:
- Kevin Ushey, JJ Allaire, and Dirk Eddelbuettel tirelessly answered all sorts of C, C++, and Rcpp questions.
- Craig Citro wrote much of the initial code to facilitate using Travis-CI with R packages.
- Jeroen Ooms also helps to maintain R community infrastructure, such as the current R support for Travis-CI (along with Jim Hester), and the Windows toolchain.
TODO: revisit rest of this section when 2nd edition nears completion. Currently applies to and worded for 1st edition.
Often the only way I learn how to do it the right way is by doing it the wrong way first. For suffering through many package development errors, I’d like to thank all the CRAN maintainers, especially Brian Ripley, Uwe Ligges and Kurt Hornik.
This book was written and revised in the open and it is truly a community effort: many people read drafts, fix typos, suggest improvements, and contribute content. Without those contributors, the book wouldn’t be nearly as good as it is, and we are deeply grateful for their help.
A special thanks goes to Peter Li, who read the book from cover-to-cover and provided many fixes. I also deeply appreciate the time the reviewers (Duncan Murdoch, Karthik Ram, Vitalie Spinu and Ramnath Vaidyanathan) spent reading the book and giving me thorough feedback.
Thanks go to all contributors who submitted improvements via GitHub (in alphabetical order): @aaronwolen, @adessy, Adrien Todeschini, Andrea Cantieni, Andy Visser, @apomatix, Ben Bond-Lamberty, Ben Marwick, Brett K, Brett Klamer, @contravariant, Craig Citro, David Robinson, David Smith, @davidkane9, Dean Attali, Eduardo Ariño de la Rubia, Federico Marini, Gerhard Nachtmann, Gerrit-Jan Schutten, Hadley Wickham, Henrik Bengtsson, @heogden, Ian Gow, @jacobbien, Jennifer (Jenny) Bryan, Jim Hester, @jmarshallnz, Jo-Anne Tan, Joanna Zhao, Joe Cainey, John Blischak, @jowalski, Justin Alford, Karl Broman, Karthik Ram, Kevin Ushey, Kun Ren, @kwenzig, @kylelundstedt, @lancelote, Lech Madeyski, @lindbrook, @maiermarco, Manuel Reif, Michael Buckley, @MikeLeonard, Nick Carchedi, Oliver Keyes, Patrick Kimes, Paul Blischak, Peter Meissner, @PeterDee, Po Su, R. Mark Sharp, Richard M. Smith, @rmar073, @rmsharp, Robert Krzyzanowski, @ryanatanner, Sascha Holzhauer, @scharne, Sean Wilkinson, @SimonPBiggs, Stefan Widgren, Stephen Frank, Stephen Rushe, Tony Breyal, Tony Fischetti, @urmils, Vlad Petyuk, Winston Chang, @winterschlaefer, @wrathematics, @zhaoy.
The light bulb image used for workflow tips comes from www.vecteezy.com.
1.4 Conventions
Throughout this book, we write foo() to refer to functions, bar to refer to variables and function parameters, and baz/ for paths.
Larger code blocks intermingle input and output. Output is commented so that if you have an electronic version of the book, e.g., https://r-pkgs.org, you can easily copy and paste examples into R. Output comments look like #> to distinguish them from regular comments.
1.5 Colophon
This book was authored using R Markdown, using bookdown, inside RStudio. The website is hosted with Netlify, and automatically updated after every commit by Travis-CI. The complete source is available from GitHub.
This version of the book was built with:
Social media analysis has been gaining popularity for some time now. Every company spends large sums on publicity, advertising, campaigns, etc. But there is no definite way of measuring the ROI of this investment, except for the growth in the number of satisfied customers. Even then, it is not certain that social advertising and campaigning alone caused that growth. The answer is social media analysis.
For instance, we can use customers’ tweets to analyze their sentiment and thus their experience. We could then adjust advertisements and campaign policies to account for factors such as geographical location, time of year, etc. This seems like a task for a genius hacker. But with the right tools, we can do it with minimal code.
The objective here is to show how easy it can be to build your own social media tool. I am going to use RStudio with some packages: shiny, twitteR, httr, tm and wordcloud.
shiny: A package created by RStudio (http://shiny.rstudio.org/) to build web applications very easily. In this case, you will see the code to run the app locally. Here you may be able to find more info on how to run it on your own or a hosted server. A shiny application consists of two files, ui.R and server.R. The former contains the code that determines what your application looks like, and the latter contains the code for the logic of your application.
twitteR: A very powerful package for Twitter Monitoring. Simple, easy and very effective.
tm: tm stands for “Text Mining”. Apart from text mining tools, it also provides very useful functions to pre-process texts.
wordcloud: Package used to do Wordcloud plots.
Let’s get started.
Authentication Process with Twitter
To fetch Twitter data, we have to use the Twitter API and authenticate the connection every time we run our shiny app. This means creating a Twitter app and doing the OAuth handshake, which you have to do every time you want to get data from Twitter with R. Since Twitter released version 1.1 of their API, an OAuth handshake is necessary for every request. So we have to verify our app.
First we need to create an app at Twitter.
Go to https://apps.twitter.com/ and log in with your Twitter account.
Click on it and then on “Create new application”.
You can give your application any name and description you like. Twitter requires a valid URL for the website; you can just type in http://test.de/, as you won’t need it again.
And just leave the Callback URL blank.
Click on Create and you’ll be redirected to a screen with all the OAuth settings of your new app. Just leave this window in the background; we’ll need it later.
Before we go on, make sure you have installed the newest version of the twitteR package from GitHub.
To do so, run the following code in the console after opening RStudio and choosing New Project > New [Existing] Directory > New Shiny Web Application, which creates two files, ui.R and server.R, in the working directory.
install.packages(c('shiny', 'twitteR', 'devtools', 'rjson', 'bit64', 'httr', 'wordcloud', 'tm'))
# Restart the R session before continuing!
library(devtools)
# Install the development version of twitteR from GitHub ('user/repo' form)
install_github('geoffjentry/twitteR')
library(twitteR)
Note: The latest version of httr is installed by default, but it is not compatible with twitteR version 1.1.8; install httr version 0.6.0 instead.
Here is how you can do it:
packageurl = 'http://cran.us.r-project.org/src/contrib/Archive/httr/httr_0.6.0.tar.gz'
install.packages(packageurl, repos=NULL, type='source')
Alternatively, see http://www.r-bloggers.com/installoldpackages-a-repmis-command-for-installing-old-r-package-versions/#
Now the twitteR package is up-to-date and we can use the new and very easy setup_twitter_oauth() function which uses the httr package. First you have to get your api_key and your api_secret as well as your access_token and access_token_secret from your app settings on Twitter. Just click on the “API key” tab to see them.
api_key = 'YOUR API KEY'
api_secret = 'YOUR API SECRET'
access_token = 'YOUR ACCESS TOKEN'
access_token_secret = 'YOUR ACCESS TOKEN SECRET'
setup_twitter_oauth(api_key,api_secret,access_token,access_token_secret)
And that’s it.
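Hard-coding secrets in a script makes them easy to leak when the code is shared. One possible alternative, sketched here under the assumption that the four values are stored in environment variables, is a small helper that reads a variable and fails early if it is missing. The helper get_credential() and the variable names are hypothetical, not part of twitteR.

```r
# Read one credential from an environment variable, failing early if unset.
# get_credential() and the variable names below are hypothetical.
get_credential <- function(name) {
  value <- Sys.getenv(name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable ", name, " is not set", call. = FALSE)
  }
  value
}

# Example call, commented out because it needs twitteR and real credentials:
# twitteR::setup_twitter_oauth(
#   get_credential("TWITTER_API_KEY"),
#   get_credential("TWITTER_API_SECRET"),
#   get_credential("TWITTER_ACCESS_TOKEN"),
#   get_credential("TWITTER_ACCESS_TOKEN_SECRET")
# )
```

The variables could be set in a file such as .Renviron that is kept out of version control.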
Shiny Code
UI.R
server.R
Please use the re-indent option in RStudio after copying this code into the ui.R and server.R files. Before we analyze the code in more detail, this is what the application looks like:
Let’s look at the code now. The UI logic is extremely easy to understand. Each widget used to enter parameters takes its id as the first argument. To refer to its value later, you write input$ followed by that id (for example, input$term). It works like any other variable. In selectInput, each choice is given as the label shown to the user followed by the value that the input receives if that option is chosen (if you choose English, it receives “en”).
The only thing that deserves special mention is the submitButton() function. Unless you include it, the script will process and output the results every time you change the parameters (reactive). The submitButton() function is particularly useful if the process that has to take place is expensive, or if you need the user to enter more than one parameter for the whole script to run correctly.
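The UI described above might look roughly like the following sketch. It is not the original ui.R: the input ids term, count and lang match the ones mentioned in the text, while the output ids, labels, default values and the second language choice are assumptions. Functions are prefixed with shiny:: so that the definition can be sourced without attaching the package.

```r
# Sketch of a UI definition. Labels, defaults and output ids are illustrative.
# Call build_ui() as the last expression of ui.R to use it.
build_ui <- function() {
  shiny::fluidPage(
    shiny::titlePanel("Twitter word cloud"),
    shiny::sidebarLayout(
      shiny::sidebarPanel(
        # The first argument of each widget is its id, read later as input$<id>.
        shiny::textInput("term", "Search term", value = "rstats"),
        shiny::numericInput("count", "Number of tweets", value = 100, min = 1),
        # Named choices: the name is shown, the value ("en") is what
        # input$lang receives.
        shiny::selectInput(
          "lang", "Language",
          choices = c("English" = "en", "Spanish" = "es")
        ),
        # Delay all updates until the user presses the button.
        shiny::submitButton("Search")
      ),
      shiny::mainPanel(
        shiny::plotOutput("wordcl"),
        shiny::tableOutput("table")
      )
    )
  )
}
```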
Next, we have server.R, which is somewhat more complex. First, all the necessary libraries (apart from shiny) have to be loaded in server.R. It is always advisable, when possible, to load them all together at the top. Also, the server logic itself must be wrapped in a call that begins with shinyServer(function(input, output).
reactive() indicates that whatever is processed inside it will be re-run whenever the parameters change (if you included submitButton() in the UI, whenever you press it; otherwise, whenever you change the parameters). It is particularly useful for building the raw data that you will process afterwards.
Shiny will execute all of these commands if you place them in your server.R script. However, where you place them in server.R will determine how many times they are run (or re-run), which will in turn affect the performance of your app.
Shiny will run some sections of server.R more often than others.
Shiny will run the whole script the first time you call runApp. This causes Shiny to execute shinyServer. shinyServer then gives Shiny the unnamed function in its first argument.
As users change widgets, Shiny will re-run the R expressions assigned to each reactive object. If your user is very active, these expressions may be re-run many, many times a second.
In this case, searchTwitter() is called whenever one or more of input$term, input$count or input$lang have changed and the submit button is pressed. It uses the information entered by the user (the first argument is the search term, the second the number of tweets and the third the language) and returns the matching tweets.
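A minimal sketch of how this reactive search could be wired up is shown below. It is an illustration rather than the original server.R: the output id table and the choice of columns are assumptions, and the code only works with an authenticated twitteR session. Package functions are prefixed with :: so the definition can be sourced without attaching anything.

```r
# Sketch of a server function for the search step.
server <- function(input, output) {
  # Re-run the search whenever the submitted inputs change.
  raw_tweets <- shiny::reactive({
    twitteR::searchTwitter(input$term, n = input$count, lang = input$lang)
  })

  # Flatten the list of status objects into a data frame and show a few rows.
  # The output id "table" is a hypothetical name.
  output$table <- shiny::renderTable({
    df <- twitteR::twListToDF(raw_tweets())
    head(df[, c("screenName", "text")], 10)
  })
}
```

Because raw_tweets is a reactive expression, it is called like a function (raw_tweets()) wherever its value is needed, and Shiny caches the result until the inputs change again.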
As the object returned by searchTwitter() is a bit difficult to handle, it is advisable to turn it into a data frame (twListToDF()) if you want to work, for example, with the tweet texts (to make a wordcloud). Or you could also try the following, which uses laply() from the plyr package:
tweets = plyr::laply(tweets, function(t) t$getText())
The renderPlot step is a bit more sophisticated. First, it takes the tweets and converts them to the native encoding (enc2native()). Then, it converts everything to lower case. After that, the function removeWords() from the package “tm” is used to delete common words. As you can see in the example, you can pass whatever word, list of words, regex, etc. you would like to be removed. In this case, the stopwords for the user-entered language (input$lang) are removed, plus the term “rt” as in retweet. This is particularly useful for a wordcloud (our final objective), as we never want “common words” in it. After that, punctuation is also removed.
Finally, all the tweets are split into a list of words, which is turned into a frequency table and sorted in descending order; the first 50 words are chosen for the plot.
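The cleaning and counting steps can be sketched in base R alone, which makes the logic easy to try without a Twitter connection. This is a stand-in for the tm functions named in the text, not the original code: the sample tweets and the tiny stop word list are made up, and punctuation is stripped before stop words are filtered so that the words can be split cleanly.

```r
# Hypothetical tweet texts standing in for the output of searchTwitter().
tweets <- c(
  "RT @user: Shiny makes web apps easy!",
  "Shiny and R: easy web apps, easy plots.",
  "I love R."
)
# A tiny hand-written stop word list (assumption); tm::stopwords() is longer.
stop_words <- c("rt", "and", "i", "the")

clean <- tolower(enc2native(tweets))          # native encoding, lower case
clean <- gsub("[[:punct:]]", " ", clean)      # replace punctuation with spaces
words <- unlist(strsplit(clean, "[[:space:]]+"))
words <- words[nzchar(words) & !(words %in% stop_words)]

freq <- sort(table(words), decreasing = TRUE) # frequency table, largest first
top <- head(freq, 50)                         # keep at most 50 words
top
```

With real data you would replace the sample vector with the tweet texts and the hand-written list with the stop words for the chosen language.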
For the wordcloud function, we enter the labels (the words themselves) of the generated table and the frequencies (the first two arguments in the example). This will determine the size of the words. The last arguments refer to the color order, the palette, and the maximum and minimum size for each word in the plot.
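A call along these lines could look like the sketch below. The wrapper plot_cloud() is a hypothetical helper, and the particular scale, palette and min.freq values are assumptions rather than the ones used in the original app; freq is expected to be a frequency table of words, as described above.

```r
# Hypothetical helper that draws a word cloud from a frequency table.
plot_cloud <- function(freq) {
  wordcloud::wordcloud(
    words = names(freq),         # the labels: the words themselves
    freq = as.numeric(freq),     # their counts, which drive word size
    scale = c(4, 0.5),           # largest and smallest word size
    min.freq = 1,                # include words that appear only once
    random.order = FALSE,        # place the most frequent words first
    colors = RColorBrewer::brewer.pal(8, "Dark2")
  )
}
```

Inside a Shiny server function, such a helper would typically be called from within renderPlot().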
And now you can place the output in main panel using tableOutput and plotOutput.
Done.
Explore the options, play with this data and read more about shiny here.
This is the only comprehensive and complete guide to extracting twitter data in R and making a Web Application using shiny. I have tweaked the code a little for deploying the application via GitHub. Here’s the source code https://github.com/mngujral/twitterFeedShinyApp.
And you can also run my version on your RStudio simply by the command:
runGitHub('twitterFeedShinyApp', 'mngujral')
Feel free to contact me and/or leave a comment if you have any questions, criticism or corrections.