For me there is nothing quite like doing research in my engineering job. Only two companies I've worked at (one of them being Google) have embraced this concept wholeheartedly. I hope more companies learn from this article.
Great article. I always try, sometimes with a little success, to apply ideas I read about to my own life. I am a one-man consulting shop, but I try to integrate experiments/research into my weekly workflow because that was what I was used to as an employee (about a day a week of IR&D at SAIC, lots of experiments done at Angel Studios on work for Nintendo and Disney, etc.)
Compared to companies like Facebook and Google, where I often go wrong is in not coupling experiments/side projects tightly enough to what my customers need. Instead, I spend my one-third experiment time more on things that just really interest me, and very often I don't apply what I learn to practical projects.
My Dad is a scientist (at 91 he is still a member of the National Academy of Sciences) and over breakfast today I was describing to him the really fast iterative business/research/deployment cycles at Google, Twitter, Facebook, etc. Very different time scale than what he was used to.
By closely connecting research and development Google is able to conduct experiments on an unprecedented scale, often resulting in new capabilities for the company.
The goal of research at Google is to bring significant, practical benefits to our users, and to do so rapidly, within a few years at most.
Because of the time frame and effort involved, Google’s approach to research is iterative and usually involves writing production, or near-production, code from day one.
Typically, a single team iteratively explores fundamental research ideas, develops and maintains the software, and helps operate the resulting Google services—all driven by real-world experience and concrete data. This long-term engagement serves to eliminate most risk to technology transfer from research to engineering.
[...] we blur the line between research and engineering activities and encourage teams to pursue the right balance of each, knowing that this balance varies greatly.
Overall, we undertake research work when we feel its substantially higher risk is warranted by a chance of more significant potential impact.
we just try to “factorize” [long-term research] into shorter-term, measurable components.
Even if we cannot fully factorize work, we have sometimes undertaken longer-term efforts. For example, we have started multiyear, large systems efforts (including Google Translate, Chrome, Google Health) that have important research components.
If the discrete steps required large leaps in vastly different directions, we admit that our primarily hill-climbing-based approach might fail. Thus, we have structured the Google environment as one where new ideas can be rapidly verified by small teams through large-scale experiments on real data, rather than just debated.
Organizationally, research is done in situ by the product team to achieve its goals. The most successful high-profile examples of this pattern are systems infrastructure projects such as MapReduce,
Google File System, and BigTable.
In our opinion, a research project is successful if it has academic or commercial impact, or ideally, both.
Another potential pitfall of the hybrid research model is that it is probably more conducive to incremental research. We therefore do support paradigmatic changes as well, as exemplified by our autonomous vehicles project, Google Chauffeur, among others.
Our hybrid approach to research enables us to conduct experiments at a scale that is generally unprecedented for research projects, generating stronger research results that can have a wider academic and commercial impact.
While our hybrid research model exploits a number of things particular to Google, we hypothesize that it may also serve as an interesting model for other technology companies.
The thing is that "MapReduce" is a concept that was in use, e.g. by LISP programmers, long before Google rediscovered it.
Much like Google's many acquisitions that the public perceives as resulting from "Google R&D", things like map-reduce are also viewed as coming from "unparalleled Google capabilities".
Let's get real. Google is a big company that employs thousands upon thousands of overqualified Java and C++ programmers. They are a fat cat. Not necessarily a cunning and agile one.
With the amount of cash they have on hand, indeed they should be producing some interesting research.
But I have a hard time seeing things like map-reduce as state-of-the-art R&D.
That many programmers, who have standards that consistently hover around varying levels of mediocrity, are satisfied with Google's design choices does not necessarily make what they do "state of the art". It just makes it the most popular. (Popularity is of course very important, perhaps all-important, in this business, but has little to do with research and pushing the envelope.)
I see you created a brand new account just to write this post.
MapReduce is the name of a piece of software, not simply the concept of map followed by reduce. That's something that every high school freshman invents on his own in Algebra 1 class. The interesting research area is making that concept scale to "run this command on every web page on the Internet" billions of times a day. I don't know about you, but I don't see anything to do that in my apt repositories.
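For contrast, the "freshman" version of the concept really is a one-liner; a minimal sketch in Python:

```python
from functools import reduce

# Map a function over some data, then reduce the results to one value.
# This is the trivial "concept" -- square four numbers and sum them.
squares = map(lambda x: x * x, [1, 2, 3, 4])
total = reduce(lambda a, b: a + b, squares)
print(total)  # 30
```

The research problem Google's MapReduce addresses is everything this snippet ignores: distributing that work across thousands of machines, surviving their failures, and doing it billions of times a day.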
Research at Google isn't about solving problems that are beyond the comprehension or reach of any average practitioner of programming. A good example is Street View. Anyone can understand strapping some cameras and sensors to a car and driving it around to make pictures of places available on the Internet. It's hard to call that "state of the art research" because it's such a simple idea, but before Google did the research, the product didn't exist. Now you can see almost any street address anywhere in the world in your browser. (The hard part is in the details. How do you map images to locations on the Earth in areas where GPS reception isn't good enough to provide enough accuracy? Scaling something to the entire Earth is not easy.)
That many programmers, who have standards that consistently hover around varying levels of mediocrity
First you say all programmers at Google have standards that "hover around mediocrity". Then you say you're amazed at Google's execution on world-changing ideas. Do you see a contradiction?
Argument over, you can close your trolling account! :)
What I see are your words, not mine. Yet they are attributed to me.
If this is an "argument" as you suggest, then you are doing a poor job at making your case.
There is a difference between 1. "research" and developing "novel" ideas and 2. executing well on large projects. It's possible to accomplish 2. without invoking 1., and vice versa.
There is no contradiction.
When a programmer complains there's nothing in "apt" to solve his problem, I'm never impressed. Nor am I ever surprised. Convenience makes some programmers very lazy.
MapReduce as a concept goes beyond Lisp implementations. On the surface it might seem like the point of MapReduce is expressing computations in terms of map and reduce functions. It isn't.
The point of MapReduce is reducing the problem of high-throughput, fault-tolerant distributed computation to a very efficient and reliable distributed sorting algorithm (the shuffle phase, which is provided by the framework, not by user code). If you can express all synchronization in your algorithm in terms of sorting, then whatever you do before sorting (map) or after it (reduce) is comparatively trivial, because the hard part is taken care of by the framework.
This abstraction is novel and profoundly useful, and that's the point of MapReduce, not so much the actual map() and reduce() functions.
Sorry, but applying an old concept to a new problem (actually just new buzzwords... it's only the size of the problem that's new) does not make a "novel" solution. Moreover, it's an obvious solution. But I guess that depends on who is doing the programming.
I would love to see how programmers with large clusters at their disposal were approaching large datasets before the moment they realized splitting the task into smaller pieces was what they should do.
It's not about splitting the task into smaller pieces. It's about factoring out the parts of the task that need synchronization among all machines into one specific subroutine (groupBy), which is what makes MapReduce so powerful.
If you speak with people experienced in multithreaded and distributed programming you will see that synchronization with fault-tolerance is _hard_, and mapreduce provides a widely-applicable set of sufficient conditions for an algorithm to be executable with implicit fault-tolerance and implicit synchronization.
Without MapReduce-like abstractions, every piece of software has to be responsible for its own (1) checkpointing (to recover from errors), (2) checksumming (to ensure that no errors happened), and (3) distributed communication (to make sure the global state becomes global and the local state becomes local).
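That division of labor can be sketched in a toy single-process model (names are illustrative, not Google's API): the user writes only map and reduce, while the "framework" owns the shuffle — here just a sort plus groupby, but in the real system a distributed, fault-tolerant sort.

```python
from itertools import groupby
from operator import itemgetter

def map_fn(doc):
    # User code: emit (key, value) pairs; here, word occurrences.
    for word in doc.split():
        yield (word, 1)

def reduce_fn(key, values):
    # User code: combine all values that arrive with the same key.
    return (key, sum(values))

def mapreduce(docs, map_fn, reduce_fn):
    # "Framework" code: the shuffle phase. All cross-record
    # synchronization is expressed as a sort that brings equal
    # keys together before reduce runs on each group.
    pairs = [kv for doc in docs for kv in map_fn(doc)]
    pairs.sort(key=itemgetter(0))
    return [reduce_fn(key, (v for _, v in group))
            for key, group in groupby(pairs, key=itemgetter(0))]

print(mapreduce(["a b a", "b c"], map_fn, reduce_fn))
# → [('a', 2), ('b', 2), ('c', 1)]
```

The user functions contain no locking, retries, or communication; everything that would need synchronization is funneled through the sort, which is exactly the contract the comments above describe.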