sparklyr — R interface for Apache Spark

Original Post

  • Connect to Spark from R — the sparklyr package provides a complete dplyr backend.
  • Filter and aggregate Spark datasets then bring them into R for analysis and visualization.
  • Orchestrate distributed machine learning from R using either Spark MLlib or H2O Sparkling Water.
  • Create extensions that call the full Spark API and provide interfaces to Spark packages.
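
As a rough illustration of the first three points, here is a minimal sketch of a typical session. It assumes a local Spark installation is available (for example, one installed with spark_install()) and uses R's built-in mtcars data as a stand-in for a real Spark dataset; the table name "mtcars" and the chosen columns are illustrative only.

```r
library(sparklyr)
library(dplyr)

# Connect to a local Spark instance (assumes Spark is installed locally).
sc <- spark_connect(master = "local")

# Copy a small R data frame into Spark; in practice the data would usually
# already live in the cluster.
mtcars_tbl <- copy_to(sc, mtcars, "mtcars", overwrite = TRUE)

# Filter and aggregate inside Spark with dplyr verbs, then bring the small
# summarized result back into R with collect().
mpg_by_cyl <- mtcars_tbl %>%
  filter(hp > 100) %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg, na.rm = TRUE), n = n()) %>%
  collect()

print(mpg_by_cyl)

# Fit a model with Spark MLlib from R (illustrative formula).
fit <- ml_linear_regression(mtcars_tbl, mpg ~ wt + cyl)
print(fit)

spark_disconnect(sc)
```

The dplyr pipeline is translated to Spark SQL and runs in Spark; only the aggregated rows returned by collect() are held in R's memory.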