Moderator: Concepts and Technologies for DS and BDP
In the exam preparation there was a question about the underlying data structure of Apache Spark, for which we are supposed to show a minimal example.
The data structure is the RDD, and the lecture on Spark/RDD contains the following example:
lines = spark.textFile("hdfs://...")
errors = lines.filter(lambda s: s.startswith("ERROR"))
messages = errors.map(lambda s: s.split('\t'))
messages.cache()
messages.filter(lambda s: "foo" in s).count()
Is this example sufficient for the exam?
Is the scope of this example also sufficient for other technologies like MapReduce and Scala?
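For anyone who wants to trace what the lecture snippet does without a cluster, here is a rough pure-Python sketch. It is not Spark's API: `ToyRDD`, `read_source`, and the sample log lines are made up for illustration, and the sketch only assumes the usual RDD behaviour, where transformations (`filter`, `map`) are lazy, actions (`count`) trigger the computation, and `cache()` keeps the computed result for reuse.

```python
# Toy stand-in for an RDD. NOT Spark's API: ToyRDD, read_source and the
# sample data are hypothetical and only illustrate lazy transformations,
# actions, and cache().

class ToyRDD:
    def __init__(self, compute):
        self._compute = compute        # zero-argument function returning an iterator
        self._cache_enabled = False
        self._cached = None

    def _iterate(self):
        if self._cached is not None:   # reuse the materialized result
            return iter(self._cached)
        if self._cache_enabled:        # first use after cache(): materialize once
            self._cached = list(self._compute())
            return iter(self._cached)
        return self._compute()

    def filter(self, predicate):       # transformation: nothing is computed yet
        return ToyRDD(lambda: (x for x in self._iterate() if predicate(x)))

    def map(self, func):               # transformation: nothing is computed yet
        return ToyRDD(lambda: (func(x) for x in self._iterate()))

    def cache(self):                   # only marks the dataset for caching
        self._cache_enabled = True
        return self

    def count(self):                   # action: triggers the computation
        return sum(1 for _ in self._iterate())


# Hypothetical tab-separated log lines; the last field is exactly "foo" or "bar".
log_lines = [
    "ERROR\tdisk\tfoo",
    "INFO\tdisk\tok",
    "ERROR\tnet\tbar",
    "ERROR\tcpu\tfoo",
]

reads = []  # records every scan of the source data

def read_source():
    reads.append(1)
    return iter(log_lines)

# Same pipeline as in the lecture example, on the toy class.
lines = ToyRDD(read_source)
errors = lines.filter(lambda s: s.startswith("ERROR"))
messages = errors.map(lambda s: s.split("\t"))
messages.cache()

scans_before_action = len(reads)   # still 0: only transformations so far

foo_count = messages.filter(lambda s: "foo" in s).count()
bar_count = messages.filter(lambda s: "bar" in s).count()
```

With this sample data `foo_count` is 2 and `bar_count` is 1, and the source is scanned only once because the second `count()` reuses the cached `messages`. Since `split('\t')` returns a list, `"foo" in s` tests whether some field equals "foo" exactly, not whether the line contains it as a substring.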
> Is the scope of this example also sufficient for other technologies like MapReduce and Scala?
Sorry, I don't understand the question. Can you please explain what you mean?