From: TMaR: a two-stage MapReduce scheduler for heterogeneous environments
Job | Description | CPU/IO-intensive | Shuffle-light/heavy |
---|---|---|---|
Wordcount | Counts the occurrences of each word in the input data | CPU-intensive | Shuffle-heavy |
K-means | A clustering analysis algorithm for multi-dimensional numerical samples in data mining | CPU-intensive | Shuffle-light |
TeraSort | A popular benchmark to sort one terabyte of randomly distributed data | IO-intensive | Shuffle-heavy |
Grep | Counts the occurrences of strings matching a target pattern in a text file | IO-intensive | Shuffle-light |
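
The shuffle-light/heavy distinction in the table can be illustrated with a toy, single-process sketch. It is not the paper's implementation or Hadoop code: the helper names (`wordcount_map`, `grep_map`, `shuffle`, `reduce_counts`, `run_job`) and the sample input are hypothetical, and counting intermediate key/value pairs is only a rough stand-in for real shuffle traffic. It only shows why Wordcount tends to move more intermediate data than Grep on the same input.

```python
import re
from collections import defaultdict

def wordcount_map(line):
    # Emits one (word, 1) pair per word, so the intermediate data
    # grows with the total number of words in the input.
    return [(word, 1) for word in line.split()]

def grep_map(line, pattern):
    # Emits a pair only for strings that match the pattern, so most
    # input lines contribute nothing to the shuffle.
    return [(match, 1) for match in re.findall(pattern, line)]

def shuffle(pairs):
    # Groups intermediate values by key, as the shuffle phase would.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_counts(groups):
    # Sums the values of each key to produce the final counts.
    return {key: sum(values) for key, values in groups.items()}

def run_job(lines, map_fn):
    # Returns the number of shuffled pairs and the reduced result.
    pairs = [pair for line in lines for pair in map_fn(line)]
    return len(pairs), reduce_counts(shuffle(pairs))

# Hypothetical sample input, for illustration only.
lines = ["the quick brown fox", "the lazy dog", "error: disk full"]

wc_shuffled, wc_counts = run_job(lines, wordcount_map)
grep_shuffled, grep_counts = run_job(lines, lambda l: grep_map(l, r"error"))

print("Wordcount shuffled pairs:", wc_shuffled)
print("Grep shuffled pairs:", grep_shuffled)
```

On this input, Wordcount shuffles one pair per word while Grep shuffles a pair only for the single matching string, which mirrors the shuffle-heavy and shuffle-light labels above.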