Apache Spark Assembly

Transformations and Their Design

  • map / flatMap
  • filter
  • repartition / coalesce
  • sample / randomSplit / takeSample
  • union / ++ / intersection
  • sortBy
  • glom
  • cartesian
  • groupBy
  • pipe
  • mapPartitions / mapPartitionsWithIndex
  • zip / zipPartitions
  • Extra

Last updated 4 years ago
