MapReduce

Posted: 21 May 2018, 17:28
The data has the following format:

``[   [pk, (id, quantity, price per unit), ... (id, quantity, price per unit) ]  , .... ] ``
Here is a sample dataset:

```
[
[1, ("5464", 4, 9.99), ("8274", 18, 12.99), ("9744", 9, 44.95)],
[2, ("5464", 9, 9.99), ("9744", 9, 44.95)],
[3, ("5464", 9, 9.99), ("88112", 11, 24.99)],
[4, ("8732", 7, 11.99), ("7733", 11, 18.99), ("88112", 5, 39.95)]
]
```
Using map/reduce, produce a list of tuples giving the accumulated order value (quantity times price per unit, summed over all orders) for each id, e.g.:

`` [ ( '5464', 219.78 ), ... ] ``
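
To make the expected value concrete: id '5464' appears in orders 1, 2 and 3, so the figure above can be checked by hand. A minimal check, with the quantities and price copied from the dataset (`lines_5464` is just an illustrative name):

```python
# Order lines for id "5464", copied from the dataset: (quantity, price per unit)
lines_5464 = [(4, 9.99), (9, 9.99), (9, 9.99)]

# Accumulated order value = sum of quantity * price over all lines for that id
total = sum(quantity * price for quantity, price in lines_5464)
print(round(total, 2))  # 219.78
```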
Please show all the steps either in pseudo code or in any language of your choice.

Re: MapReduce

Posted: 21 May 2018, 17:45
My solution in Python 3 (incomplete):

```
from functools import reduce

orders = [
[1, ("5464", 4, 9.99),  ("8274", 18, 12.99), ("9744", 9, 44.95)],
[2, ("5464", 9, 9.99),  ("9744", 9, 44.95)],
[3, ("5464", 9, 9.99),  ("88112", 11, 24.99)],
[4, ("8732", 7, 11.99), ("7733", 11, 18.99), ("88112", 5, 39.95)]
]

# step1: drop the pk, keeping only the item tuples of each order
step1 = list(map(lambda x: x[1:], orders))

#[[('5464', 4, 9.99), ('8274', 18, 12.99), ('9744', 9, 44.95)], [('5464', 9, 9.99), ('9744', 9, 44.95)], [('5464', 9, 9.99), ('88112', 11, 24.99)], [('8732', 7, 11.99), ('7733', 11, 18.99), ('88112', 5, 39.95)]]

# step2: flatten the per-order lists into one list of item tuples
step2 = list(reduce(lambda x, y: x + y, step1))

#[('5464', 4, 9.99), ('8274', 18, 12.99), ('9744', 9, 44.95), ('5464', 9, 9.99), ('9744', 9, 44.95), ('5464', 9, 9.99), ('88112', 11, 24.99), ('8732', 7, 11.99), ('7733', 11, 18.99), ('88112', 5, 39.95)]

# step3: turn each item into (id, quantity * price per unit)
step3 = list(map(lambda x: (x[0], reduce(lambda a, b: a * b, x[1:])), step2))

#[('5464', 39.96), ('8274', 233.82), ('9744', 404.55), ('5464', 89.91), ('9744', 404.55), ('5464', 89.91), ('88112', 274.89), ('8732', 83.93), ('7733', 208.89), ('88112', 199.75)]

# step4: only sums the values whose id matches the first tuple; all other ids are dropped
step4 = reduce(lambda a, b: (a[0], a[1] + b[1]) if a[0] == b[0] else (a[0], a[1]), step3)

#('5464', 219.78)
```
By "incomplete" I mean the following: I flattened the data in step2 and computed the per-item values in step3, but step4 should have grouped the tuples by id, which I couldn't achieve using only map/reduce/filter. As a result I get just one tuple instead of one tuple per id.

I'm looking forward to a solution or hints, especially on how to group/regroup the data.

Thanks
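
A hint on the grouping step: one possible approach is to let `reduce` carry a dict of running totals keyed by id, instead of a single tuple. The following is only a sketch under that assumption; `add_pair`, `items`, `pairs` and `totals` are illustrative names, and rounding to two decimals is added purely for display.

```python
from functools import reduce

# Same sample data as in the question.
orders = [
    [1, ("5464", 4, 9.99), ("8274", 18, 12.99), ("9744", 9, 44.95)],
    [2, ("5464", 9, 9.99), ("9744", 9, 44.95)],
    [3, ("5464", 9, 9.99), ("88112", 11, 24.99)],
    [4, ("8732", 7, 11.99), ("7733", 11, 18.99), ("88112", 5, 39.95)],
]

# Flatten: drop the pk of each order and concatenate the item tuples.
items = reduce(lambda acc, order: acc + list(order[1:]), orders, [])

# Map: turn each (id, quantity, price) into an (id, value) pair.
pairs = list(map(lambda item: (item[0], item[1] * item[2]), items))


def add_pair(totals, pair):
    """Illustrative helper: fold one (id, value) pair into a dict of totals."""
    item_id, value = pair
    # Return a new dict so the fold has no side effects on its accumulator.
    return {**totals, item_id: totals.get(item_id, 0.0) + value}


# Reduce: the accumulator is a dict, so every id gets its own running total.
totals = reduce(add_pair, pairs, {})

# Convert to the requested list of tuples, rounded for display only.
result = [(item_id, round(total, 2)) for item_id, total in totals.items()]
print(result)
# [('5464', 219.78), ('8274', 233.82), ('9744', 809.1), ('88112', 474.64), ('8732', 83.93), ('7733', 208.89)]
```

The key difference from step4 above is the initial value `{}` passed to `reduce`: the accumulator no longer has to be one of the input tuples, so it can hold one entry per id.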