Data science with R and the tidyverse
To do data science you need to be able to solve six main types of problems:

1. __Importing__ your data into your analysis environment of choice.
2. __Tidying__ your data into a consistent form.
3. __Transforming__ it to add new variables or create summaries.
4. __Visualising__ it to help refine your questions and to reveal both the mundane and the surprising.
5. __Modelling__ to scale to larger data volumes and to handle uncertainty in a principled way.
6. __Communicating__ your results to others.
In this talk, I'll discuss these challenges in the context of the tidyverse, a set of R packages designed to facilitate interactive data analysis.
Bio: Hadley is Chief Scientist at RStudio and a member of the R Foundation. He builds tools (both computational and cognitive) that make data science easier, faster, and more fun. His work includes packages for data science (the tidyverse: ggplot2, dplyr, tidyr, purrr, readr, ...) and principled software development (roxygen2, testthat, devtools). He is also a writer, educator, and frequent speaker promoting the use of R for data science. Learn more on his website, http://hadley.nz.