Although I try to keep personal information to a relative minimum on this blog, here’s some news that’s relevant. Recently I accepted an offer to start work at Cloudera, a young company in the San Francisco area. Initially I’ll be working from the UK, with a view to a permanent move out to California when timing and visas allow.
Hadoop is Cloudera’s business. Hadoop is an open-source implementation of Google’s MapReduce. Cloudera provides support for Hadoop, as well as its own fully supported distribution of the Hadoop toolset. Hadoop allows very large scale distributed and parallel processing of huge data sets in a fault-tolerant and efficient manner. Data sets are getting bigger all the time, and there’s now a mismatch between the desire and the ability of companies to handle and process all those data. Cloud computing, at least in the form of dynamic provision of computing resources, and distributed processing technologies such as Hadoop are helping to make this problem tractable. Cloudera provides the expertise and the experience to help businesses make the best use of this seriously powerful tech.
I’m joining, as you might expect, as a distributed systems engineer. There are plenty of interesting problems to attack, both in Hadoop and in the ecosystem of technologies that support and extend it, such as HDFS, HBase, Pig, Hive and ZooKeeper. Cloudera already has a killer team (see some of the press from the New York Times), and I’m really looking forward to being a part of it.