MapReduce on AWS and Azure HDInsight
Running a series of analyses on the Google n-grams dataset using MapReduce techniques
I ran several analyses on the Google n-grams dataset with Apache Pig to extract statistical information. I first used the interactive Pig shell over SSH to check the procedure step by step on a smaller dataset, then uploaded the corresponding Pig script, containing all the commands, to Amazon Elastic MapReduce (EMR).
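The kind of group-and-aggregate step such an analysis relies on can be sketched in plain Java. This is only an illustration, not the Pig script used in the project: the statistic (average occurrences per volume for each n-gram), the class name `NgramStats`, and the assumed tab-separated row layout `ngram, year, match_count, volume_count` are all assumptions made for the example.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class NgramStats {
    // Running totals for one n-gram, summed over all years (hypothetical helper type).
    static final class Totals {
        long matches;
        long volumes;

        double averagePerVolume() {
            return volumes == 0 ? 0.0 : (double) matches / volumes;
        }
    }

    // Assumed row layout: ngram <TAB> year <TAB> match_count <TAB> volume_count
    static Map<String, Totals> aggregate(List<String> lines) {
        Map<String, Totals> byNgram = new TreeMap<>();
        for (String line : lines) {
            String[] fields = line.split("\t");
            if (fields.length < 4) {
                continue; // skip malformed rows
            }
            // Group by n-gram, then accumulate both counts for that group.
            Totals t = byNgram.computeIfAbsent(fields[0], k -> new Totals());
            t.matches += Long.parseLong(fields[2]);
            t.volumes += Long.parseLong(fields[3]);
        }
        return byNgram;
    }

    public static void main(String[] args) {
        // Tiny made-up sample, standing in for the "smaller dataset" used for testing.
        List<String> sample = List.of(
            "data\t1990\t10\t5",
            "data\t1991\t30\t5",
            "pig\t1990\t4\t4");
        aggregate(sample).forEach((ngram, t) ->
            System.out.println(ngram + "\t" + t.averagePerVolume()));
    }
}
```

Checking the logic on a handful of rows first, as here, mirrors the workflow described above: validate each step on a small sample before submitting the full job to the cluster.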
In a similar project, I implemented a MapReduce program on Microsoft Azure HDInsight, written in Java for Hadoop, to compute several metrics of a large social media graph (Friendster).
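As a rough illustration of how a graph metric fits the MapReduce model, the sketch below computes vertex degree from an edge list. It is not the project's code: the choice of degree as the metric, the whitespace-separated `u v` edge format, and the class name `DegreeCount` are assumptions. It also runs as plain Java, with an in-memory grouping step standing in for Hadoop's shuffle; in an actual Hadoop job the `map` and `reduce` logic would live in `Mapper` and `Reducer` subclasses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class DegreeCount {
    // Map phase: each undirected edge "u v" emits (u, 1) and (v, 1).
    static List<Map.Entry<String, Integer>> map(String edgeLine) {
        List<Map.Entry<String, Integer>> out = new ArrayList<>();
        String[] ends = edgeLine.trim().split("\\s+");
        if (ends.length != 2) {
            return out; // ignore malformed lines
        }
        out.add(Map.entry(ends[0], 1));
        out.add(Map.entry(ends[1], 1));
        return out;
    }

    // Reduce phase: sum the counts collected for one vertex.
    static int reduce(List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Driver: groups emitted pairs by key (the role of the shuffle), then reduces each group.
    static Map<String, Integer> run(List<String> edges) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (String edge : edges) {
            for (Map.Entry<String, Integer> kv : map(edge)) {
                grouped.computeIfAbsent(kv.getKey(), k -> new ArrayList<>()).add(kv.getValue());
            }
        }
        Map<String, Integer> degrees = new TreeMap<>();
        grouped.forEach((vertex, counts) -> degrees.put(vertex, reduce(counts)));
        return degrees;
    }

    public static void main(String[] args) {
        // Made-up four-edge graph for illustration only.
        System.out.println(run(List.of("a\tb", "a\tc", "b\tc", "c\td")));
    }
}
```

Because summing is associative and commutative, the same `reduce` logic could also serve as a combiner on a real cluster, which reduces the data shuffled between nodes on a graph of Friendster's size.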