Commit 138cd10

Add note on DataFrame recommendation over RDD
1 parent 34889ce commit 138cd10

1 file changed

+2
-0
lines changed

spark/spark.ipynb

+2
@@ -458,6 +458,8 @@
 "source": [
 "## RDDs\n",
 "\n",
+"Note: RDDs are included for completeness. In Spark 1.3, DataFrames were introduced which are recommended over RDDs. Check out the [DataFrames announcement](https://databricks.com/blog/2015/02/17/introducing-dataframes-in-spark-for-large-scale-data-science.html) for more info.\n",
+"\n",
 "Resilient Distributed Datasets (RDDs) are the fundamental unit of data in Spark. RDDs can be created from a file, from data in memory, or from another RDD. RDDs are immutable.\n",
 "\n",
 "There are two types of RDD operations:\n",
