Big Data with R, Tidyverse, and Spark

1.1. Intro slides

1.1.1. R & Big Data

https://github.com/rstudio/webinars/blob/master/14-Work-with-big-data/14-Work-with-big-data.pdf

1.1.2. Tidyverse

1.1.3. Spark (sparklyr)

https://github.com/rstudio/webinars/blob/master/42-Introduction%20to%20sparklyr/Introducing%20sparklyr%20-%20Webinar.pdf

1.1.4. Speeding up Spark from R with Arrow

https://arrow.apache.org/blog/2019/01/25/r-spark-improvements/

1.2. Intro Exercise

See:
https://gitlab.com/radup/curs-r-introduccio/blob/master/codi/extra.tips.bigdata.R

1.3. References

Mastering Spark with R ("The R in Spark") book
https://therinspark.com/

RStudio Webinar: Introducing an R interface for Apache Spark
https://www.rstudio.com/resources/webinars/introducing-an-r-interface-for-apache-spark/
https://github.com/rstudio/webinars/blob/master/42-Introduction%20to%20sparklyr/sparklyr-webinar1.Rmd

Some online tutorials
- 30 GB dataset
- Text mining using sparklyr

Some cheatsheets:
https://www.rstudio.com/resources/cheatsheets/
- Spanish versions are on the same page, under "Spanish Translations – Traducciones en español"