Friday, August 23, 2019

Hugh Brock on Red Hat Research

Hugh Brock is Research Director at Red Hat. In this podcast, Hugh discusses how open source makes the way Red Hat approaches research different from the way it's done at other companies. He also talks about how the research program got started and, in particular, the role that Boston University has played.

Transcript:


Gordon:  I thought I would get started off by talking about research programs in general at companies. There's a long history. Research labs, research organizations in corporations have often taken different forms. Sometimes there's a lot of fundamental research. Sometimes it's pre‑product development in a sense.
Can you take us through what some of the thinking was in Red Hat forming a research program and how you think about research, development, collaboration, intellectual property? That should take us a few minutes.
Hugh:  I think the way we're doing this at Red Hat is really exciting and also quite different from the way industry research has traditionally been done. The way companies...even the way companies have traditionally worked with universities. What we do at Red Hat Research is try to connect our engineers with researchers in universities via graduate student PhD projects.
The reason we work that way...well, there's two things, really. The reason we're able to work that way is because we're an open‑source company. The universities are happy to talk to us because they know that we're not interested in their intellectual property.
The reason it is an advantage for us to work that way is that it gives our engineers a chance to broaden their horizons, and it helps the universities focus what they're doing on something that is achievable in this now very fast moving world of IT and computer science research.
We think we have a winning model there. If you look at the way companies have traditionally done research, there's kind of two models. One model is you pay a researcher at a university to develop a project for you. I went to visit...Oh, I forget the guy's name. The research director at Mitsubishi Electric over here in Cambridge when I was first getting into this job.
One of the first things we learned in our conversation was that we have completely different jobs actually. Dick's whole enterprise is going to MIT and figuring out what he needs to pay them to do, which is a good model. It's good for MIT. It's good for Dick. He gets what he needs. All of that research then goes into a patent vault. Mitsubishi gets to use it but it doesn't get opened up until the patents expire.
Our model is completely different. What we're trying to do is not to pay researchers to do stuff that we want. It is to help them do what they want and get it into the open faster and more effectively than they could do it without our help.
So far, it's turned out to be a winning model. We hope that it will grow as we're able to contact more people and connect more people with our engineers.
Gordon:  One of the things that seems interesting in terms of academia, industry, and open source working together is...There are, of course, different objectives. There are also often different time scales in the way things happen.
In a way, I find academia interesting here because, on the one hand, academia can often work on things that, to someone sitting in a company, can look very pie in the sky, very speculative. Maybe we will see something related to this in 15 years.
On the other hand, there's this incredible pace of change that you alluded to in IT and in tech where some project like Kubernetes goes from something internal at Google to something everybody is using in the course of just a few years. At least from teaching classes and so forth, things don't get revised that quickly at universities.
Obviously, there's also a tension of, you don't want universities chasing the toolkit of the day too quickly either.
Hugh:  Yeah. This is absolutely right. It's a real tension for universities. When we launched our relationship with Boston University, which would have been going on three years ago ‑‑ it'll be three years this December that we launched the Red Hat‑BU Collaboratory ‑‑ the case that my opposite number over at BU made to our VP of product, Paul Cormier, was not only that Red Hat should do this because Red Hat will get interesting research out of it, but also that Red Hat should do this because we can help the universities do a better job of getting interesting research into the public.
My partner at BU, Dr. Orran Krieger, is a professor of computer engineering there. The case that he made to Paul was that industry is going so fast that academia needs to be pushed to keep up with them or academia will, in fact, become irrelevant. At least, in the computer science and computer engineering space. There's already a danger of that happening right now.
To some extent, there's a danger of it across the sciences with some of the AI developments that we're seeing. The point is that we really are in a unique position at Red Hat to push research in a direction that's going to keep up with what we need in industry, and to make a better and closer partnership that works, one that isn't parasitic or whatever, but actually serves the interests of both industry and university.
Gordon:  We'll get into some of that interesting AI work in a couple minutes. Before we dive down quite that deeply, let's talk about some of the threads that really came together in this whole program. Obviously, there's the universities, including universities in the Boston area.
Though certainly not limited to that. The original Massachusetts Open Cloud work, which you just mentioned. Then, a lot of the interesting medical research that's happening in the Boston area. Probably many of my listeners know this, but Boston is known for having some world‑class hospitals.
A lot of them are big research hospitals that are doing a lot of work. Can you talk about how all of this came together?
Hugh:  Yeah, it's actually a really interesting story. I mentioned the Red Hat Boston University Collaboratory. Red Hat Collaboratory at Boston University, I guess, is the official name. This became the foundation of our research program at Red Hat. The way it came into being is a very interesting story.
Dr. Krieger over at BU wrote and received an NSF grant with his partner Peter Desnoyers about six years ago now, I want to say, five, six years. The grant was to study the feasibility of developing what's called an Open Cloud Exchange, which to put it very concisely is a bazaar to the public cloud's cathedral.
If Amazon is operating a single owner monolithic set of services that they control from top to bottom, the Open Cloud that Orran wanted to study is a marketplace where anybody can play. This is an interesting concept. There's a number of things that you need in order to make it real. The first thing you need is a data center.
It turns out that the five major research universities in the Boston area, Harvard, MIT, BU, Northeastern, and the UMass system, collaborated a few years ago on a data center called the Mass Green High Performance Computing Center (MGHPCC), which is a very large data center in western Mass next to a hydro‑electric dam. Power's cheap.
They built this with the help of the Commonwealth of Massachusetts so that they could move all of their research IT infrastructure out there.
With that done, it now is possible all of a sudden to start thinking about, OK, how can we build an Open Cloud that allows all five of these universities and the state and industries, small businesses, manufacturing to all participate in what amounts to a marketplace for computing services that looks like a cloud, but is ultimately much more efficient because there is no single owner.
It's more price efficient. This was the thing that Orran wanted to study. He got a grant. He got some computers, got a network infrastructure, set the thing up, and set it up on OpenStack. Well, we at Red Hat are a major OpenStack vendor. Orran kept pinging us saying, "Hey, do you guys want to help with this? This is really interesting."
He was trying to do some stuff with some of the OpenStack services that we at the time didn't think made sense. For a while, we put him off, like, "No, we're not really interested."
Eventually, he got through. The thing he was able to offer to Paul that no one else really had before was an opening for doing research in a practical way. Research that would draw our engineering team into partnership. That was a huge deal.
That was the basis of this Red Hat BU Collaboratory, which is a million‑dollar‑a‑year partnership where we fund...Really, we fund basic research at BU, but we do it in a way that's connected to what we want to do as an engineering company.
Fast forward a couple of years, we've established the Collaboratory. We're working with Orran at BU. Orran teaches what's called a Cloud Computing course. It's a project‑based course in Cloud Computing. This was the first year he had done it.
He brings in lots of industry partners as mentors to lead projects. We had a bunch of projects over there. One of the projects that we found in the Cloud Computing course was a little app that somebody at Boston Children's Hospital had written to put a UI around medical image processing codes.
Codes that process MRI, for example, or any x‑rays or CT scans, or whatever, and do stuff with them. Not really AI, exactly, but advanced image processing. This project, as it turns out, was staffed by the wife of one of our researchers at BU. Even funnier, the sponsoring researcher at BU turns out to be Orran's wife. He didn't know this at the time, which is hysterical.
We didn't realize that they were both working on opposite ends of the same project for a long time. They never talk apparently. Anyway, we got into this project and we realized that we could take this thing and put it on OpenShift fairly easily. That actually would be not only a great demo project for us, but a really nice contribution that we could make back to this thing.
It has developed into a thing called the ChRIS project. That's how that got started. We're continuing to contribute to it, partly because it's a great demo project for what you can do with OpenShift [Container Platform] and our tooling. Also, because we've been able to demonstrate some of the more advanced research results that we've found at BU.
Things like multi‑party computing: we integrated that into ChRIS at the beginning of this year. There are more things coming there this year as well. That's been a really interesting story and it was a lot of fun.
Gordon:  The multi‑party computing and homomorphic encryption and differential privacy probably should be the topic for another podcast because they're really interesting. Essentially, ways that you can share data in a machine learning, AI context among institutions, and to have third‑party computing without compromising privacy.
That's actually a really interesting topic that...stay tuned, going to cover that in a later episode in more detail. In addition to the MPC and ChRIS, what are some other interesting projects that are going on with research right now?
Hugh:  At BU in particular, we have two parallel tracks. One of them is the privacy‑preserving AI that you just mentioned. All the range of technologies around how to do machine learning in a privacy‑preserving way. The other kind of major research thrust that we have is all around how industry, and the world, is going to deal with the end of Dennard scaling.
Dennard scaling is the idea that you can continue to increase the number of transistors per square millimeter on a chip by doing various tricks to make that possible [more precisely, that power density stays roughly constant as transistors shrink, which is what made ever‑denser chips practical]. It is a subject of a lot of debate exactly how long we will be able to continue to do Dennard scaling, but nobody is arguing that we'll be able to do it forever. It seems clear that we're approaching a point of diminishing returns.
What that means is that we are going to start looking again at specialty devices. Processors that are built for a particular purpose; the GPU is the most obvious example, but there are many of them. Many different types. The OS, which is our primary product here at Red Hat, is going to have to start understanding how to deal with these things.
You can imagine a typical computer system in the cloud in three or four years is going to have not just a whole bunch of general purpose CPUs, but also GPUs, FPGAs which are programmable processors. You can program the architecture on the fly. Other devices we haven't even really thought of yet. All of these things are going to live together in the same machine.
We have a number of really interesting research projects going on right now that are all around the different aspects of that problem. Everything from can we build a Linux unikernel that really works to what do we need to do to create an open source tool chain for FPGAs. Any of a wide range of other projects along these lines. Partitioning hypervisors is another kind of key piece.
We've been very fortunate that BU turns out to have a really strong operating systems department. We didn't know that when we made that partnership, so that worked out quite well for us. We think we're going to do some groundbreaking stuff there. The unikernel project, in particular.
The idea of the unikernel is you take your app and you build it with the kernel it's going to run on so that you’ve basically built a bootable app. There's all kinds of interesting reasons why you might want to do that. It turns out that our kernel folks are really interested in this with the PhD who's working on it right now.
They're basically telling him, "If you can get this last stage of the thing that you're working on right now to work, then we want to start looking at how we can actually use this in practice."
This has gone really quickly from a pie‑in‑the‑sky idea, where Orran and Ali Raza, who is the PhD, come in and say, "Hey, we want to look at unikernels and whether we can make a unikernel out of Linux and we think it's going to be really hard. We doubt it'll work," to, in the space of not even 18 months, Ali talking to our kernel engineers about the details of how we could make this real.
It is going very quickly. I think everybody's astonished that it's happened that quickly. It's an example of lots of different threads coming together at the same time. We've been really fortunate to be in the middle of it, but that's what we try to do at Red Hat Research. If we weren't pulling these threads together, then they would never intersect.
Gordon:  Probably topics for at least a couple more podcasts there. We have probably all the detail we can get into today. In closing out, where can people go to learn more about this? Many of these sound interesting.
Hugh:  The best place to look for anything that we're doing is our website, which is research.redhat.com. We try to maintain there a list of all of the active projects and what the status is at all of the universities that we work with.
We also post there details of the events that we sponsor, so colloquiums, workshops, things like that, as well as the quarterly research review magazine that we produce to go into detail on these projects that we do.
DevConf is an annual conference that was launched in our Brno office in the Czech Republic. It's been running there now for, I think, 12 years. We did the first one here in the US last summer [2018] at the Boston University Student Center, the GSU, George Sherman Union.
We'll be doing it again August 15th‑17th this summer. It should be really good, I think. All of our interns will be presenting something there, particularly all of the PhD projects that I just mentioned. We think it's going to be a lot of fun.
Gordon:  Well, thank you, Hugh. Anything you'd like to close with?
Hugh:  I just want to thank you for reaching out and making this happen, Gordon. Thanks for listening everybody. If you're interested, again, in participating or just in knowing what's going on, check out research.redhat.com.
