Friday, October 7, 2011

Review: Pig Latin & Hive

Pig Latin & Hive are designed with similar goals in mind and thus share common functionality. Both execute their queries/plans as jobs on Hadoop, an open-source MapReduce implementation, over data stored in HDFS. Both maintain schemas for metadata. Both provide relatively simple query optimizations compared to a standard RDBMS.

Hive is a data warehouse infrastructure built on top of Hadoop that facilitates querying and managing large datasets residing in distributed storage. Hive also defines a simple SQL-like query language, called QL. These SQL-like queries are compiled into MapReduce jobs to be executed as efficiently as possible. Providing this SQL-like interface is better for system administrators, in my opinion, since it means sysadmins will already be familiar with the commands. Pig Latin, which was created by Yahoo!, is stated to be designed as the sweet spot between SQL & MapReduce. The nice thing is that Pig has a debugging environment for its language. On the other hand, Hive has a web interface for visualizing the various schemas and issuing queries, which would be a great help to developers.
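To make the "sweet spot" claim concrete, here is a minimal word-count sketch in Python written in the step-by-step dataflow style that Pig Latin exposes, in contrast to a single declarative SQL statement. The stage names loosely mirror Pig operators; the in-memory input stands in for data on HDFS.

```python
# A minimal sketch (in Python, for illustration) of the step-by-step
# dataflow style that Pig Latin exposes; not real Pig Latin syntax.
from collections import Counter

def load(records):
    # LOAD: ingest raw records (an in-memory list stands in for HDFS)
    for line in records:
        yield line

def foreach_tokenize(records):
    # FOREACH ... GENERATE: flatten each record into one word per row
    for record in records:
        for word in record.split():
            yield word

def group_and_count(words):
    # GROUP ... BY word, then COUNT each group
    return Counter(words)

pages = ["the cat sat", "the dog sat on the mat"]
counts = group_and_count(foreach_tokenize(load(pages)))
print(counts.most_common(3))  # [('the', 3), ('sat', 2), ('cat', 1)]
```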

Review: SCADS

SCADS looks at the problem of scaling today's storage backends. For example, if a company suddenly becomes popular, or, as in the case of eBay during Christmas, there will be a drastic increase in the number of queries hitting the system's backend. As stated in the paper, it promises data scale independence, adjusting the capacity of the system using machine-learning models. With the existence of Amazon EC2, it is possible to do this. It is also interesting that the system gives programmers the flexibility to trade off consistency and performance: programmers can specify the level of consistency to be provided (e.g., eventual consistency). Since there is no implementation yet, I'm quite unsure about the paper's impact, since many things that could go wrong remain undiscovered until the system has actually been built.
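To make the consistency/performance trade-off concrete, here is a hypothetical sketch of a per-request consistency knob. The Store class, its methods, and the consistency levels are my own invention for illustration, not SCADS's actual interface.

```python
# Hypothetical sketch of a per-request consistency knob in the spirit
# of SCADS's programmer-specified trade-offs; not SCADS's real API.
from enum import Enum

class Consistency(Enum):
    EVENTUAL = 1   # fast: read a single replica, result may be stale
    QUORUM = 2     # slower: require agreement from a majority

class Store:
    def __init__(self, replicas):
        self.replicas = replicas  # list of dicts acting as replicas

    def read(self, key, consistency=Consistency.EVENTUAL):
        if consistency is Consistency.EVENTUAL:
            return self.replicas[0].get(key)  # any one replica
        # quorum read: return the value seen by a majority of replicas
        votes = [r.get(key) for r in self.replicas]
        majority = len(self.replicas) // 2 + 1
        for v in set(filter(None, votes)):
            if votes.count(v) >= majority:
                return v
        return None

store = Store([{"x": 1}, {"x": 1}, {"x": 2}])
print(store.read("x", Consistency.QUORUM))  # 1
```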

Review: Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks

Dryad is a general-purpose distributed execution engine for coarse-grain data-parallel applications, presented as an alternative to the MapReduce paradigm. Data flows are represented as directed acyclic graphs (DAGs). Relative to MapReduce, Dryad gives programmers more flexibility, although at the expense of exposing more complexity to them. Looking at the sample programs, I think that complexity is not worth it, given how substantial in size the needed programs become. I believe another layer of abstraction (e.g., the Nebula scripting language) is required. Its influence is also questionable, as this is proprietary software written by Microsoft, and thus its impact on the general programming community is barely seen. From the experimental results presented, it looks promising, as the speed-up is pretty much linear.
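To sketch what the extra flexibility buys, here is a toy DAG of processing stages in Python. The helpers are invented for illustration (real Dryad vertices are C++ programs wired together by channels); the point is that the topology is not restricted to MapReduce's fixed map-shuffle-reduce shape, e.g. two branches can join.

```python
# Toy sketch, invented for illustration: a Dryad-style job as an
# explicit DAG whose vertices are ordinary sequential functions and
# whose edges carry their outputs. Here two upstream branches join.
def tokenize(lines):
    return [w for line in lines for w in line.split()]

def count(words):
    counts = {}
    for w in words:
        counts[w] = counts.get(w, 0) + 1
    return counts

def stoplist(lines):
    return set(w.strip() for w in lines)

def subtract(counts, stop):
    # join vertex: combines the outputs of two upstream branches
    return {w: n for w, n in counts.items() if w not in stop}

docs = ["the cat sat", "the dog sat"]
stops = ["the"]
result = subtract(count(tokenize(docs)), stoplist(stops))
print(result)  # {'cat': 1, 'sat': 2, 'dog': 1}
```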

Thursday, October 6, 2011

Review: MapReduce

MapReduce is a programming model for distributed systems that is currently very popular. Users specify a map function that processes a key/value pair to generate intermediate key/value pairs, which are then passed to a reduce function, also specified by the user, that merges all the intermediate values associated with the same key. This abstraction is very nice since users do not need to deal with the complexities of distributed systems (e.g., fault tolerance, consistency issues) and can just focus on coding the main algorithm of the task (e.g., counting word occurrences across the web). The paper also goes on to describe how common tasks, such as Unix-utility-style jobs, can be adapted to the MapReduce programming framework. However, simple as this abstraction may be, it restricts how creative programmers can be. Also, I feel that some low-level details are still exposed to programmers, such as being required to specify the number of map and reduce tasks they will need. I believe there are still many improvements to be made to the programming framework.
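A minimal sketch of the paper's canonical word-count example: the map and reduce functions mirror the paper's pseudocode, while the driver that simulates the shuffle is my own single-machine simplification, not the real distributed runtime.

```python
# Minimal in-process sketch of the paper's word-count example. The
# "shuffle" driver below groups intermediate pairs by key on one
# machine; the real runtime does this across a cluster.
from collections import defaultdict

def map_fn(doc_name, contents):
    # emit (word, 1) for every word in the document
    for word in contents.split():
        yield word, 1

def reduce_fn(word, counts):
    # sum all counts emitted for this word
    yield word, sum(counts)

def run_mapreduce(inputs, map_fn, reduce_fn):
    intermediate = defaultdict(list)
    for key, value in inputs:
        for ik, iv in map_fn(key, value):
            intermediate[ik].append(iv)        # shuffle: group by key
    results = []
    for ik, ivs in sorted(intermediate.items()):
        results.extend(reduce_fn(ik, ivs))
    return results

docs = [("d1", "the cat sat"), ("d2", "the dog sat")]
print(run_mapreduce(docs, map_fn, reduce_fn))
# [('cat', 1), ('dog', 1), ('sat', 2), ('the', 2)]
```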

Thursday, September 29, 2011

Review: Megastore

Megastore is a storage system developed to meet the requirements of the interactive online services built at Google. It combines the scalability offered by a NoSQL datastore with the convenience of a traditional RDBMS in a novel way that provides strong ACID semantics, and it uses Paxos for synchronous wide-area replication. The part of Megastore that may contribute most to its success, in my opinion, is its applicability to the variety of workloads that can be evenly divided into entity groups, for example email accounts. Partitioning data into entity groups is harder for complex structures such as a social-networking graph, though, since interactions between individuals are hard to separate cleanly.
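A toy sketch of the entity-group idea (the classes and fields are invented for illustration): transactions are cheap within a group, e.g. one user's mailbox, because all of the group's data lives in one partition with its own log, while cross-group operations would need costlier coordination.

```python
# Invented illustration of Megastore-style entity groups: each user's
# mailbox is one entity group, so a transaction touching only that
# mailbox commits via a single per-group write-ahead log.
class EntityGroup:
    def __init__(self, key):
        self.key = key      # e.g. the user id
        self.rows = {}
        self.log = []       # per-group write-ahead log

    def transact(self, updates):
        # ACID within the group: log, then apply all updates atomically
        self.log.append(dict(updates))
        self.rows.update(updates)

mailbox = EntityGroup("user:alice")
mailbox.transact({"msg:1": "hello", "unread_count": 1})
print(mailbox.rows)  # {'msg:1': 'hello', 'unread_count': 1}
```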

Review: GFS

Google File System is a scalable distributed file system for large distributed data-intensive applications that has been implemented and used at Google. The goal of the file system is the same as for any previous distributed file system: to achieve a compromise between performance, scalability, reliability & availability. To do that, they made different fundamental assumptions. First, component failures are the norm rather than the exception. Second, files are very big (on the order of multiple GB). Finally, there are barely any random writes in the system; most writes append new data. Architecture-wise, their system is very interesting too. There is a single master with a number of chunkservers. The master node is the applications' "contact point" for metadata business, such as retrieving the locations of the data they would like to read. The paper then goes on to describe how snapshots are implemented, etc. One thing I wonder is how the assumptions would differ if Google were to build their file system now (e.g., would it still be a mostly append-only system? Would it still have a single master?). Clearly, though, GFS has been very successful: it is the storage system used within Google, and its model is very influential in the distributed-systems literature.
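A toy sketch of the read path (names and data invented): the client asks the master only for metadata, then fetches the bytes directly from a chunkserver, which keeps the single master off the data path.

```python
# Invented sketch of the GFS read path: the master serves only metadata
# (chunk handles and replica locations); the bytes flow directly from a
# chunkserver, so the single master does not become a bottleneck.
CHUNK_SIZE = 64 * 1024 * 1024  # 64 MB chunks, as in the paper

class Master:
    def __init__(self):
        # (file name, chunk index) -> (chunk handle, replica addresses)
        self.chunk_table = {
            ("/logs/web.log", 0): ("handle-17", ["cs1", "cs2", "cs3"]),
        }

    def lookup(self, path, offset):
        return self.chunk_table[(path, offset // CHUNK_SIZE)]

class ChunkServer:
    def __init__(self):
        self.chunks = {"handle-17": b"GET /index.html 200 ..."}

    def read(self, handle, start, length):
        return self.chunks[handle][start:start + length]

master, cs1 = Master(), ChunkServer()
handle, replicas = master.lookup("/logs/web.log", 0)  # metadata only
data = cs1.read(handle, 0, 16)                        # bytes from chunkserver
print(replicas, data)
```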

Review: Paxos & Chubby Lock Service

Paxos is a consensus protocol for a network of unreliable computers, used to implement fault-tolerant distributed systems by ensuring that one consistent value is chosen. There are three basic roles for the participating nodes: proposers, acceptors & learners. Proposers try to get a "chosen" value across all of the participating nodes; acceptors receive proposals and vote on which proposals should be accepted; learners just learn the value that has been voted for by a majority of the acceptors.

The following is how the algorithm works on a typical system. First, we enter the prepare phase, in which a proposer selects a proposal number n and sends a prepare request to the acceptors. The request is rejected by any acceptor that has already responded to a prepare request numbered greater than n; otherwise, it is accepted. In the second phase, once the proposer has received responses to its proposal from a majority of acceptors, it sends an accept request to each of those acceptors with a value v, where v is the value of the highest-numbered proposal among the responses (or a value of its own choosing if none was reported). In my opinion, the structure of the paper makes Paxos harder to understand than it actually is, since the author proves each piece of the Paxos algorithm before leading up to the overview of the protocol. If the author had given us the overview of the protocol at the beginning, that would have been very helpful. However, the simplicity of Paxos itself has made it, as we know today, very popular, and many papers have been written to further optimize it. So I'd say this paper is very influential.
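Here is a single-decree sketch of the two phases in Python. It is my own simplification (synchronous calls, no failures, no learners), but the acceptor state is the state the paper requires: the highest prepare number promised and the highest-numbered proposal accepted so far.

```python
# Single-decree Paxos sketch: a simplification with synchronous calls
# and no failures, to show the prepare/accept phases described above.
class Acceptor:
    def __init__(self):
        self.promised = 0        # highest prepare number promised
        self.accepted = None     # (number, value) of accepted proposal

    def prepare(self, n):
        if n > self.promised:
            self.promised = n
            return True, self.accepted  # promise + any prior accepted value
        return False, None

    def accept(self, n, v):
        if n >= self.promised:
            self.promised = n
            self.accepted = (n, v)
            return True
        return False

def propose(acceptors, n, value):
    # Phase 1: prepare(n); proceed only with a majority of promises.
    replies = [a.prepare(n) for a in acceptors]
    promises = [acc for ok, acc in replies if ok]
    if len(promises) <= len(acceptors) // 2:
        return None
    # Must adopt the value of the highest-numbered accepted proposal, if any.
    prior = [acc for acc in promises if acc is not None]
    if prior:
        value = max(prior)[1]
    # Phase 2: accept(n, value); chosen once a majority accepts.
    acks = sum(a.accept(n, value) for a in acceptors)
    return value if acks > len(acceptors) // 2 else None

acceptors = [Acceptor() for _ in range(5)]
print(propose(acceptors, n=1, value="blue"))  # 'blue' is chosen
```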

The paper "Paxos Made Practical" takes one more step on how to actually implement the algorithm in a real-life situation, how to handle nodes that failed and how to add a state machine to maintain replicas of the data since previous works that claim to use Paxos as a consensus protocol leaves many details unspecified. However, I feel that for this paper, its influential for itself is not that much since it's too specific on one implementation although I guess, it will help people who will like to design a system that utilizes Paxos Protocol.

On the other hand, we have the Chubby lock service, which is intended to provide coarse-grained locking in a loosely-coupled distributed system (such as those at Google). It is interesting to note that, as the paper explicitly mentions, the Chubby lock service is an engineering effort, not research: it claims no new algorithms or techniques. As shown in figure 1 of the paper, Chubby runs a distributed consensus protocol (Paxos) on a small cell of five machines to elect a master; the master is a "contact point" that receives requests from clients until it goes down or its lease expires. The paper then goes on to describe other features of Chubby, such as how it does backups, mirroring, handling fail-overs, etc.
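A toy sketch of how an application might use such a coarse-grained lock for primary election. The API here is invented for illustration; real Chubby exposes a file-system-like namespace of lock files with client-side lease/session handling.

```python
# Invented illustration of coarse-grained locking for primary election,
# in the spirit of Chubby's usage: whichever replica acquires the lock
# becomes primary and holds it for a long time under a renewable lease.
import time

class LockService:
    def __init__(self, lease_seconds):
        self.holder = None
        self.expiry = 0.0
        self.lease = lease_seconds

    def try_acquire(self, client):
        now = time.time()
        if self.holder is None or now > self.expiry:
            self.holder, self.expiry = client, now + self.lease
            return True
        return self.holder == client  # renewal by the current holder

svc = LockService(lease_seconds=60)
for replica in ["server-a", "server-b", "server-c"]:
    if svc.try_acquire(replica):
        print(replica, "is primary")  # the first to acquire wins
        break
```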