what does RR actually do?


Since the InputFormat has already read the data as a list of lines, why do we need an RR in the structure?

My simple guess (taking word count as an example) is that the RR turns lines into [(word, 1)] for the mapper, so the only thing the mapper has to do is groupByKey. Is that correct?

Thanks in advance

Re: what does RR actually do?

I am not sure what you mean by rr. Is that RecordReader?
If so, I don't think the mapper only does groupByKey().
Please refer to this:
https://www.quora.com/What-is-the-purpo ... -in-Hadoop
Also, please note the table on page 33 of slide 4: when reading a text file, the rr provides (offset, line) pairs as input for the mapper.
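
To make that concrete, here is a rough sketch of the word count flow in plain Python. It is only a simulation, not the Hadoop API: the function names (record_reader, mapper, shuffle, reducer, word_count) are made up for illustration. The point is that the reader hands the mapper (offset, line) pairs, the mapper is the one that emits (word, 1), and the grouping by key happens afterwards in the shuffle step, not in the mapper.

```python
from collections import defaultdict


def record_reader(data: bytes):
    """Yield (byte_offset, line) pairs, like a line-oriented reader would."""
    offset = 0
    for raw in data.splitlines(keepends=True):
        # Key = byte offset where the line starts, value = line without newline.
        yield offset, raw.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw)


def mapper(offset, line):
    """Turn one (offset, line) record into (word, 1) pairs; offset is unused."""
    for word in line.split():
        yield word, 1


def shuffle(pairs):
    """Group values by key; this stands in for the framework's shuffle/sort."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reducer(key, values):
    """Sum the counts collected for one word."""
    return key, sum(values)


def word_count(data: bytes):
    mapped = [kv for off, line in record_reader(data) for kv in mapper(off, line)]
    return dict(reducer(k, vs) for k, vs in shuffle(mapped).items())


if __name__ == "__main__":
    text = b"hello world\nhello hadoop\n"
    print(list(record_reader(text)))
    print(word_count(text))
```

So in this sketch the reader knows nothing about words; it only decides where one record ends and the next begins.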


