
MR - Reduce starting time

Posted: 11 Jul 2017, 17:07
by robhe
Hello,

I'm a little confused about when the reduce tasks start relative to the map tasks.
Is it correct that the reduce computation starts only after every map task has finished, but the reduce workers already read the results of the map tasks while the map tasks are still in progress?

Thank you
Robin

Re: MR - Reduce starting time

Posted: 12 Jul 2017, 09:54
by yc81reja
I wouldn't think that the reducers start reading while mappers are still working, because the scheduler shuffles the intermediate (key, value) pairs and assigns them to reducers. If reducers started reading before all mappers finished, the partitions they read would be incomplete.
Please take a look at page 15 of slide 4, where a barrier aggregating intermediate values sits between the map tasks and the reduce tasks.

Re: MR - Reduce starting time

Posted: 12 Jul 2017, 10:47
by robhe
Thank you for the answer, but I think this contradicts some statements in the MapReduce paper:
"When a map task is executed first by worker A and then later executed by worker B (because A failed), all workers executing reduce tasks are notified of the reexecution. Any reduce task that has not already read the data from worker A will read the data from worker B."

Re: MR - Reduce starting time

Posted: 12 Jul 2017, 11:11
by yc81reja
Well, I would assume that since each map worker stores its intermediate values on local storage and reducers read them directly from the mapper, a reducer cannot read the intermediate values of a dead worker even if that worker had already finished its map task, because the values were written to its local disk. I think this is also the reason why even completed map tasks must be re-executed when the worker on that node fails.
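Roughly what I imagine the master does on a worker failure (the Task class and field names below are just my own sketch, not the paper's actual data structures):

```python
from dataclasses import dataclass

@dataclass
class Task:
    kind: str                # "map" or "reduce"
    state: str               # "idle", "in_progress", or "completed"
    worker: str | None = None  # id of the worker currently/last assigned

def handle_worker_failure(failed_worker, tasks):
    for t in tasks:
        if t.worker != failed_worker:
            continue
        if t.kind == "map" and t.state in ("in_progress", "completed"):
            # Even a *completed* map task is rescheduled: its output lives
            # only on the dead worker's local disk and is now unreachable.
            t.state, t.worker = "idle", None
        elif t.kind == "reduce" and t.state == "in_progress":
            # Completed reduce output is already in the global file system,
            # so only in-progress reduce tasks need to be reset.
            t.state, t.worker = "idle", None
```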

Re: MR - Reduce starting time

Posted: 12 Jul 2017, 11:32
by robhe
Yes, I agree that this should be the reason why even completed map tasks must be re-executed when a worker on that node fails.

However, "Any reduce task that has not already read the data from worker A will read the data from worker B" means that there could be reduce tasks that have already read the data from worker A and this would imply that there are reading reduce tasks while mapping is not finished.

Re: MR - Reduce starting time

Posted: 12 Jul 2017, 12:03
by yc81reja
Oh, sorry, I think my previous replies were wrong. Now I see what you mean. You are right in your first post.
In section 3.1 of the paper, step 4 says the reducers start reading as soon as the mappers' output is written to local disk, but according to step 5 the reducers do not compute until they have finished reading and sorting all intermediate (key, value) pairs. The barrier stands between the reducers' reading/sorting and the reduce computation. And the shuffle is not handled by the reducers themselves, so they do not need to wait for the whole shuffle to be done before they start reading.
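To make the timing concrete, here is a minimal Python sketch of how I read steps 4 and 5: the reduce worker pulls each map task's output for its partition as soon as that map task is reported done (overlapping with still-running maps), but only sorts and runs the user's reduce function once all of them are in. The iterator interface is made up, not the paper's actual RPC mechanism:

```python
from itertools import groupby
from operator import itemgetter

def run_reduce_worker(completed_partitions, num_map_tasks, reduce_fn):
    buffered = []    # all intermediate (key, value) pairs read so far
    fetched = set()  # ids of map tasks whose output we already pulled

    # Phase 1: read each map task's output as soon as it is available;
    # this overlaps with map tasks that are still running.
    for map_id, pairs in completed_partitions:
        buffered.extend(pairs)
        fetched.add(map_id)
        if len(fetched) == num_map_tasks:
            break  # barrier: every map task's output has been read

    # Phase 2: only now sort by key and apply the user's reduce function.
    buffered.sort(key=itemgetter(0))
    return [
        (key, reduce_fn(key, [v for _, v in group]))
        for key, group in groupby(buffered, key=itemgetter(0))
    ]

# e.g. word count over this reducer's partition:
# pairs = iter([(0, [("a", 1), ("b", 1)]), (1, [("a", 1)])])
# run_reduce_worker(pairs, 2, lambda k, vs: sum(vs))  # -> [('a', 2), ('b', 1)]
```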
Thanks a lot for pointing out this subtle detail!

Re: MR - Reduce starting time

Posted: 12 Jul 2017, 12:17
by robhe
You are right, that section of the paper should be the answer.
Thanks a lot for the help!