```scala
private[spark] def withScope[U](body: => U): U = RDDOperationScope.withScope[U](sc)(body)

/* 1. input */
def foo(f: T): RDD = withScope { /* 2. withScope */
  /* 3. closure */
  val cleanedF = sc.clean(f)
  /* 4. return value */
  new SomeRDD()  // if returning a new RDD
  this.bar()     // if transforming from itself
}
```
Not all, but most transformation (tf) functions are designed like the template above:
input RDD
withScope ; for details, see the operation scope concept.
cleanedF ; for details, see the SparkContext.clean() function.
returning RDD
Because every transformation returns an RDD, the method chaining technique falls out naturally.
(Not a strict rule; it is just how many of these functions happen to be written.)
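For a concrete instance, RDD.map follows the template step by step. The body below is sketched from Spark's RDD.scala (it lives inside class RDD[T]; the exact MapPartitionsRDD constructor arguments may differ across versions):

```scala
import scala.reflect.ClassTag

// 1. input: the user function f
// 2. withScope: the whole body runs inside withScope
def map[U: ClassTag](f: T => U): RDD[U] = withScope {
  // 3. closure: clean f before shipping it to executors
  val cleanF = sc.clean(f)
  // 4. return value: a new RDD, which is what makes chaining possible, e.g.
  //    rdd.map(f).filter(p).count()
  new MapPartitionsRDD[U, T](this, (_, _, iter) => iter.map(cleanF))
}
```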
Extra functions
For some RDD types, Spark introduces an implicit conversion to a special class that carries extra functions.
How is this implemented? The implicit conversions are defined in the RDD companion object:
object RDD
rddToPairRDDFunctions
rddToAsyncRDDActions
rddToSequenceFileRDDFunctions
... 3 more
Each of these adds its own transformations/actions; their design is introduced separately.
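A sketch of the mechanism: object RDD holds implicit defs that wrap an RDD in a functions class, so the extra methods appear to live on RDD itself. The signature below mirrors rddToPairRDDFunctions in Spark's RDD.scala (treat the details as version-dependent):

```scala
object RDD {
  // Whenever an RDD[(K, V)] is used where a PairRDDFunctions method is called,
  // the compiler inserts this conversion automatically.
  implicit def rddToPairRDDFunctions[K, V](rdd: RDD[(K, V)])
      (implicit kt: ClassTag[K], vt: ClassTag[V], ord: Ordering[K] = null): PairRDDFunctions[K, V] = {
    new PairRDDFunctions(rdd)
  }
}
```

That is why reduceByKey, which is defined on PairRDDFunctions rather than on RDD, compiles on any RDD[(K, V)], e.g. `sc.parallelize(Seq(("a", 1), ("a", 2))).reduceByKey(_ + _)`.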
DeterministicLevel
Used here and there throughout the codebase.
* Returns the deterministic level of this RDD's output. Please refer to [[DeterministicLevel]]
* for the definition.
*
* By default, a reliably checkpointed RDD, or an RDD without parents (a root RDD), is
* DETERMINATE. For RDDs with parents, a deterministic-level candidate is generated per parent
* according to the dependency. The deterministic level of the current RDD is the least
* deterministic of those candidates. Override [[getOutputDeterministicLevel]] to provide
* custom logic for calculating the output deterministic level.
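DeterministicLevel itself is just an enumeration, and its declaration order is what makes "least deterministic" computable. The enum below matches Spark's; pickLevel is a hypothetical condensation of the default rule quoted above, not Spark's actual method:

```scala
object DeterministicLevel extends Enumeration {
  // Declared from most to least deterministic, so a larger id = less deterministic.
  // DETERMINATE: same data set in the same order on rerun.
  // UNORDERED: same data set, but the order can differ on rerun.
  // INDETERMINATE: the data itself can differ on rerun.
  val DETERMINATE, UNORDERED, INDETERMINATE = Value
}

// Hypothetical condensation of the default rule: no parent candidates means
// DETERMINATE (a root or reliably checkpointed RDD); otherwise pick the least
// deterministic candidate, i.e. the one with the largest id.
def pickLevel(candidates: Seq[DeterministicLevel.Value]): DeterministicLevel.Value =
  if (candidates.isEmpty) DeterministicLevel.DETERMINATE
  else candidates.maxBy(_.id)
```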