Build better data science tools. Learn to design software for data tools, distribute R packages, and create custom visualizations.
Offered by: Coursera
No prior knowledge required
R is a free programming language and software environment for statistical computing and graphics that is widely used by analysts, data scientists, and statisticians.
This specialization deals with software development in R to build data science tools. As the field of data science evolves, it is becoming clear that software development skills are essential for producing and scaling useful data science results and products.
You will learn modern software development methods to build tools that are reusable, modular, and suitable for use in team environments or developer communities.
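In each of the courses, students will apply their newly acquired advanced R skills to hands-on projects: working with complex data sets, writing robust functions, building an R package, and creating custom visualization tools.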
These projects will yield a portfolio of R code that can be reused and built upon for real-world deployment.
This course provides an in-depth introduction to the R programming language, with an emphasis on using R to develop software for data science. Whether you are part of a data science team or working independently within a community of developers, this course will give you the knowledge of R you need to make meaningful contributions. As the first course in the specialization, it provides the foundations of R needed for the courses that follow. We will cover basic R concepts and language fundamentals, key concepts such as structured data and the related tools in the tidyverse, processing and manipulating large and complex data sets, handling textual data, and basic data science tasks. Upon completion of the course, students will be fluent in using the R console and will be able to create structured data sets from a wide range of possible data sources.
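As a small illustration of what creating a structured data set with tidyverse tools can look like, the sketch below reshapes a wide table into one row per observation and then summarizes it. It assumes the dplyr and tidyr packages are installed; the data frame and its column names (site, year, count) are made up for this example and do not come from the course materials.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide table: one column per year (names invented for illustration).
wide <- data.frame(
  site = c("A", "B"),
  `2020` = c(10, 20),
  `2021` = c(12, 18),
  check.names = FALSE
)

# Reshape to long form: one row per site-year observation.
long <- wide %>%
  pivot_longer(cols = -site, names_to = "year", values_to = "count") %>%
  mutate(year = as.integer(year))

# Summarize the long table by site.
by_site <- long %>%
  group_by(site) %>%
  summarise(mean_count = mean(count), .groups = "drop")
```

Keeping one observation per row is what lets the later grouping and summarizing steps stay short and readable.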
This course covers the advanced R programming topics required to develop powerful, robust, and usable data science tools. Topics include functional programming in R, robust error handling, object-oriented programming, performance profiling and testing, debugging, and proper function design. After completing the course, you will be able to identify common data analysis tasks and encapsulate them in user-facing functions. Since every data science environment faces unique data challenges, there is always a need to develop software tailored to your organization's needs. You will also define new data types in R and develop functionality specific to those data types, enabling clearer execution of data science tasks and stronger usability within a team.
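To make a few of these topics concrete, here is a minimal base R sketch that defines a new data type as an S3 class, adds a method for it, and wraps construction in error handling. The class name measurements and the helper names are hypothetical, chosen only for illustration; this is one possible design, not the course's own code.

```r
# Constructor for a hypothetical S3 class holding numeric values and a unit.
new_measurements <- function(values, unit) {
  stopifnot(is.numeric(values), is.character(unit), length(unit) == 1)
  structure(list(values = values, unit = unit), class = "measurements")
}

# S3 method: summary() dispatches here for objects of class "measurements".
# The statistics are computed by applying a list of functions (functional style).
summary.measurements <- function(object, ...) {
  vapply(
    list(mean = mean, median = stats::median, sd = stats::sd),
    function(f) f(object$values, na.rm = TRUE),
    numeric(1)
  )
}

# User-facing wrapper: turns a construction error into a warning and returns NULL.
safe_measurements <- function(values, unit) {
  tryCatch(
    new_measurements(values, unit),
    error = function(e) {
      warning("Could not build measurements: ", conditionMessage(e), call. = FALSE)
      NULL
    }
  )
}

m <- new_measurements(c(1.5, 2.0, 2.5), unit = "kg")
m_summary <- summary(m)
```

Whether a failure should become a warning, as here, or remain an error is a design choice that depends on how the function will be used by teammates.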
Writing good code for data science is only part of the job. To maximize the usability and reuse of data science software, the code must be organized and distributed in a way that meets community-based standards and provides a good user experience. This course covers the main ways in which R software is organized and distributed to others. We cover developing R packages, writing good documentation and vignettes, writing robust software, cross-platform development, continuous integration tools, and distributing packages via CRAN and GitHub. Students will produce R packages that meet the criteria for submission to CRAN.
The data science revolution has generated vast amounts of data from a wide variety of new sources. This new data is being used to answer new questions in ways that were previously unimaginable. Visualization remains one of the most powerful ways to draw conclusions from data, but the influx of new types of data requires the development of new visualization techniques. This course provides you with the skills to create those visualization tools. We will focus on the ggplot2 framework and show you how to use and extend the system to meet the specific needs of your organization or team. Upon completion of this course, students will be able to build the tools necessary to visualize a wide range of data types and will have the foundational knowledge needed to deal with new types of data as they arise.
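One lightweight way to adapt ggplot2 to a team's needs is to wrap a shared theme and a common plot type in reusable functions, sketched below. The function names theme_team and scatter_with_trend are hypothetical, and the example assumes the ggplot2 package is installed. Deeper extensions, such as new geoms, go beyond what this sketch shows.

```r
library(ggplot2)

# Hypothetical shared theme: theme_minimal() with a few overrides.
theme_team <- function(base_size = 12) {
  theme_minimal(base_size = base_size) +
    theme(
      plot.title = element_text(face = "bold"),
      panel.grid.minor = element_blank()
    )
}

# Reusable plot: scatter plot with a linear trend line.
# x and y are column names given as strings.
scatter_with_trend <- function(data, x, y, title = NULL) {
  ggplot(data, aes(x = .data[[x]], y = .data[[y]])) +
    geom_point(alpha = 0.7) +
    geom_smooth(method = "lm", se = FALSE, formula = y ~ x) +
    labs(title = title, x = x, y = y) +
    theme_team()
}

# Example with the built-in mtcars data set.
p <- scatter_with_trend(mtcars, "wt", "mpg", title = "Fuel economy vs. weight")
```

Because the function returns an ordinary plot object, callers can still add further layers or scales with the usual + operator.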
Capstone course for R programming

