Hadoop: Open Source Map Reduce

December 29, 2007

What is Map:

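In Python, map() applies a given function to each element of a list and returns the results in the same order. In Python 2 the result is a list; in Python 3 it is a lazy iterator, which can be turned into a list with list().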
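As a small illustration of map(), here is a minimal sketch. The function name square and the sample numbers are made up for this example; the list() call is only needed on Python 3.

```python
def square(x):
    """Return x multiplied by itself."""
    return x * x

numbers = [1, 2, 3, 4]

# map() calls square() once per element; list() collects the results.
squares = list(map(square, numbers))
print(squares)
```

The input and output have the same length: one result per element.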

What is Reduce:

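Superficially, there is not much difference: reduce() also takes a function and runs it against the elements of a list. The difference is that the function takes two arguments, a running result and the next element, so the whole list is combined into a single item. In Python 3, reduce() lives in the functools module.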
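A minimal sketch of reduce(), using a made-up add function and sample numbers. The import is required on Python 3 and also works on Python 2.6 and later.

```python
from functools import reduce

def add(a, b):
    """Combine the running result a with the next element b."""
    return a + b

numbers = [1, 2, 3, 4]

# reduce() evaluates add(add(add(1, 2), 3), 4) and returns one value.
total = reduce(add, numbers)
print(total)
```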

What is MapReduce:

It is an architecture that allows functions to be executed across a distributed cluster. MapReduce is special because the map and reduce functions work on key-value pairs: map emits key-value pairs, the framework groups the values by key, and reduce is run once per key. Since each key can be handled independently, the work can be spread across distributed commodity servers.
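To make the key-value idea concrete, here is a rough single-process sketch of a word count. It only imitates the data flow on one machine and is not how Hadoop is actually invoked; the names map_phase, shuffle and reduce_phase, and the sample documents, are hypothetical and chosen for this example.

```python
from collections import defaultdict
from functools import reduce

def map_phase(document):
    """Emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Group values by key; a real framework does this between the phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Collapse all values seen for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values)

documents = ["the quick brown fox", "the lazy dog", "the fox"]

# Each document could be mapped on a different machine.
pairs = [pair for doc in documents for pair in map_phase(doc)]

# Each key could be reduced on a different machine.
counts = dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
print(counts)
```

In a real cluster the map calls and the per-key reduce calls would run on different servers, and the grouping step would move data over the network.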

What is Hadoop:

It is an open source implementation of MapReduce, written in Java.

Python obviously already has map and reduce functions, so what’s left is to figure out the distributed aspect of MapReduce. Below are two people who have already thought about a MapReduce implementation in Python:



You are currently reading Hadoop: Open Source Map Reduce at RAPD.

