Let's do some aggregation based on dates. The most common kind of date-based aggregation is grouping by year, month, day or hour.
I will be using the Morphia library, and map/reduce to do the aggregation.
Let's assume that we have a collection (posts) which holds documents like the one below.
{
    "_id" : ObjectId("52236140e40247b854000002"),
    "author" : "6mRlPExfM03UQTvMDUkS",
    "body" : "uygu4ndeV0",
    "comments" : [ { "author" : "lalit", "body" : "This is a test" } ],
    "date" : ISODate("2013-09-01T15:46:08.140Z"),
    "permalink" : "Sbuw5zo5iAuMMAcD008F",
    "tags" : [ "zeHUw" ],
    "title" : "Sbuw5zo5iAuMMAcD008F"
}
Now we want to aggregate the posts so that we get a count per year / month / day / hour. In other words, we will be getting an hourly count of posts.
To do this, we need to define a map function and a reduce function and then call the Java map/reduce API.
Let's get a handle to our db and collection.
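To make the goal concrete, here is a minimal plain-Java sketch of the same hourly count done in memory, without MongoDB. The class `HourlyBuckets` and its methods are hypothetical names used only for illustration, the two extra dates are made up, and building the keys in UTC is an assumption.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of any MongoDB API: counts dates per hour bucket.
class HourlyBuckets {

    // Builds a sortable "year-month-day hour" key (in UTC) for one date.
    static String hourKey(Instant date) {
        ZonedDateTime t = date.atZone(ZoneOffset.UTC);
        return String.format(Locale.ROOT, "%04d-%02d-%02d %02d",
                t.getYear(), t.getMonthValue(), t.getDayOfMonth(), t.getHour());
    }

    // "Maps" every date to its key and "reduces" by adding 1 per occurrence.
    static Map<String, Integer> countByHour(List<Instant> dates) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Instant date : dates) {
            counts.merge(hourKey(date), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Instant> dates = List.of(
                Instant.parse("2013-09-01T15:46:08.140Z"), // date of the sample post
                Instant.parse("2013-09-01T15:05:00Z"),     // made-up date, same hour
                Instant.parse("2013-09-01T16:10:00Z"));    // made-up date, next hour
        // prints {2013-09-01 15=2, 2013-09-01 16=1}
        System.out.println(countByHour(dates));
    }
}
```

Map/reduce does the same thing on the server: the map step produces a key per post and the reduce step adds up the ones emitted for each key.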
// injecting the mongo bean into our grails service
def mongo

// service method for map reduce calculation
def mapReduce() {
    DBCollection posts = mongo.db.getCollection("posts")
}
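Here is our map function. It is JavaScript, kept in a Java string. Note that 18000000 is 5 hours in milliseconds, and that getMonth() is zero-based (0 is January).
private static final String mapHourly = ""
        + "function(){ "
        + " d = new Date( this.date.getTime() - 18000000 );"
        + " key = { year: d.getFullYear(), month: d.getMonth(), day: d.getDate(), hour: d.getHours() };"
        + " emit( key, {count: 1} );"
        + "}";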
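The key that the map function builds can be reproduced in plain Java to check what a given post will be counted under. This is only a sketch: `MapKeySketch` and `hourlyKey` are hypothetical names, and it assumes the server evaluates getFullYear(), getHours() and friends in UTC, so that subtracting 18000000 ms (5 * 60 * 60 * 1000, i.e. 5 hours) acts as a shift to a UTC-5 local time.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper mirroring the JavaScript map function's key.
class MapKeySketch {

    // Same constant as in the map function: 5 hours in milliseconds.
    static final long OFFSET_MILLIS = 18_000_000L;

    static Map<String, Integer> hourlyKey(Instant date) {
        // Assumption: the JavaScript Date getters run in UTC on the server.
        ZonedDateTime d = date.minusMillis(OFFSET_MILLIS).atZone(ZoneOffset.UTC);
        Map<String, Integer> key = new LinkedHashMap<>();
        key.put("year", d.getYear());
        key.put("month", d.getMonthValue() - 1); // JavaScript getMonth() is zero-based
        key.put("day", d.getDayOfMonth());
        key.put("hour", d.getHour());
        return key;
    }

    public static void main(String[] args) {
        // date of the sample post
        // prints {year=2013, month=8, day=1, hour=10}
        System.out.println(hourlyKey(Instant.parse("2013-09-01T15:46:08.140Z")));
    }
}
```

So under these assumptions the sample post, written at 15:46 UTC, lands in the 10 o'clock bucket of 1 September, with the month reported as 8.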
Here's our reduce function
public static final String reduce = "function(key, values) { "
        + "var total = 0; "
        + "values.forEach(function(v) { "
        + "total += v['count']; "
        + "}); "
        + "return {count: total};} ";
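One thing worth knowing about reduce: MongoDB may call it more than once for the same key and feed the result of an earlier call back in as one of the values. That is why the function returns {count: total}, the same shape that map emits, and why it sums the counts instead of counting the values. A small plain-Java sketch of that property, with `ReduceSketch` as a hypothetical name and the counts held as plain integers for simplicity:

```java
import java.util.List;

// Hypothetical stand-in for the JavaScript reduce function.
class ReduceSketch {

    // Sums the counts, exactly like total += v['count'] in the reduce function.
    static int reduce(List<Integer> counts) {
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        return total;
    }

    public static void main(String[] args) {
        // Five posts in the same bucket, reduced in one go.
        int direct = reduce(List.of(1, 1, 1, 1, 1));

        // The same five posts reduced in two batches, then re-reduced.
        int first = reduce(List.of(1, 1));
        int second = reduce(List.of(1, 1, 1));
        int reReduced = reduce(List.of(first, second));

        // prints 5 5
        System.out.println(direct + " " + reReduced);
    }
}
```

Had reduce returned values.length instead of the sum, the second path would give 2 rather than 5.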
Now we call the map/reduce API.
// setting the command: input collection, map, reduce,
// output collection (null, since the output is inline), output type, query (null for all documents)
MapReduceCommand cmd = new MapReduceCommand(posts, mapHourly, reduce, null, MapReduceCommand.OutputType.INLINE, null);
MapReduceOutput out = posts.mapReduce(cmd);

// printing the results
for (DBObject o : out.results()) {
    System.out.println(o.toString());
}
So, a complete service class might look something like the one below.
import com.mongodb.DBCollection
import com.mongodb.DBObject
import com.mongodb.MapReduceCommand
import com.mongodb.MapReduceOutput

class MapReduceService {

    // getting the bean
    def mongo

    // map to be used for hourly calculation
    private static final String mapHourly = ""
            + "function(){ "
            + " d = new Date( this.date.getTime() - 18000000 );"
            + " key = { year: d.getFullYear(), month: d.getMonth(), day: d.getDate(), hour: d.getHours() };"
            + " emit( key, {count: 1} );"
            + "}";

    // map for daily calculation
    private static final String mapDaily = ""
            + "function(){ "
            + " d = new Date( this.date.getTime() - 18000000 );"
            + " key = { year: d.getFullYear(), month: d.getMonth(), day: d.getDate(), dow:d.getDay() };"
            + " emit( key, {count: 1} );"
            + "}";

    // map for yearly calculation
    private static final String mapYearly = ""
            + "function(){ "
            + " d = new Date( this.date.getTime() - 18000000 );"
            + " key = { year: d.getFullYear() };"
            + " emit( key, {count: 1} );"
            + "}";

    // map for monthly calculation
    private static final String mapMonthly = ""
            + "function(){ "
            + " d = new Date( this.date.getTime() - 18000000 );"
            + " key = { year: d.getFullYear(), month: d.getMonth() };"
            + " emit( key, {count: 1} );"
            + "}";

    public static final String reduce = "function(key, values) { "
            + "var total = 0; "
            + "values.forEach(function(v) { "
            + "total += v['count']; "
            + "}); "
            + "return {count: total};} ";

    // service method for map reduce calculation
    def mapReduce() {
        DBCollection posts = mongo.db.getCollection("posts")

        // setting the command
        // the map function can be changed here to whichever one you want, e.g. mapDaily, mapYearly ...
        MapReduceCommand cmd = new MapReduceCommand(posts, mapHourly, reduce, null, MapReduceCommand.OutputType.INLINE, null);
        MapReduceOutput out = posts.mapReduce(cmd);

        // printing the results
        for (DBObject o : out.results()) {
            System.out.println(o.toString());
        }
    }
}
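The four map functions differ only in the fields of the key, so one option is to build the function text from the granularity you want. The following is just a sketch of that idea and not part of the service above: `MapFunctions`, `Granularity`, `keyFields` and `mapFor` are hypothetical names.

```java
// Hypothetical helper that generates the JavaScript map function text.
class MapFunctions {

    enum Granularity { YEARLY, MONTHLY, DAILY, HOURLY }

    // Key fields for each granularity, copied from the map functions above.
    static String keyFields(Granularity granularity) {
        switch (granularity) {
            case YEARLY:
                return "year: d.getFullYear()";
            case MONTHLY:
                return "year: d.getFullYear(), month: d.getMonth()";
            case DAILY:
                return "year: d.getFullYear(), month: d.getMonth(), day: d.getDate(), dow:d.getDay()";
            default: // HOURLY
                return "year: d.getFullYear(), month: d.getMonth(), day: d.getDate(), hour: d.getHours()";
        }
    }

    // Wraps the key fields in the same function body used by every map above.
    static String mapFor(Granularity granularity) {
        return "function(){ "
                + " d = new Date( this.date.getTime() - 18000000 );"
                + " key = { " + keyFields(granularity) + " };"
                + " emit( key, {count: 1} );"
                + "}";
    }

    public static void main(String[] args) {
        // Prints the generated text of the yearly map function.
        System.out.println(mapFor(Granularity.YEARLY));
    }
}
```

The returned string could then be passed to MapReduceCommand in place of mapHourly. Whether this is nicer than four constants is a matter of taste; the constants are easier to read, the generator avoids repeating the date offset.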