Tuesday, July 18, 2017

Red Hat's Mark Wagner on Hyperledger performance work

Mark Wagner Red Hat

Mark Wagner is a performance engineer at Red Hat. He heads the Hyperledger Performance and Scalability Working Group. In this podcast, he discusses how he approaches distributed ledger performance and what we should expect to see as this technology evolves.


Listen to MP3 [13:45]

Listen to OGG [13:45]


Podcast with Brian Behlendorf

Hyperledger Announces Performance and Scalability Working Group

MIT Tech Review Business of Blockchain event

MIT Sloan CIO Symposium: AI and blockchain's long games


Gordon Haff:   I'm sitting here with Senior Principal Performance Engineer, Mark Wagner. What we're going to talk about today is blockchain, Hyperledger, and some of the performance work that Mark's been doing around there. Mark, first introduce yourself.

Mark Wagner:  My name is Mark Wagner. I'm in my 10th year here at Red Hat. My degree, from when I started many years ago, was hardware. I switched to software. I got the bug to do performance work when I saw the performance improvements I could make in software, in how things ran.

Here at Red Hat, I've worked on everything from the kernel up through OpenShift and OpenStack at all the layers. My most recent assignment is in the blockchain area.

Gordon: A lot of people probably associate blockchain with Bitcoin. What is blockchain, really?

Mark: Blockchain itself is a technology where things are distributed. I like to think of it more as a distributed database at a really high level. Bitcoin is a particular implementation of it, but in general, blockchain ‑‑ and there's also a thing called distributed ledgers ‑‑ they're fairly similar in concept, but the blockchain itself is more for straight financial things like Bitcoin.

Distributed ledgers are coming up a lot more in their uses across many different vertical markets, such as healthcare, asset tracking, IoT, and of course the financial markets, commodity trading, things like that.

Gordon: As we've really seen over the last, I don't know, year or two years, there's still a lot of shaking out going on in terms of exactly what the use case is here, which of course makes the job for people like you harder when you don't know what the ultimate objectives necessarily are.

Mark: Yes. It's shaking out both in terms of new verticals being added and in terms of multiple implementations going on right now. In a sense they're competing, but in many cases they're aimed at different verticals, so they're not really competing, per se.

Gordon: Now you're working in Hyperledger. Introduce Hyperledger.

Mark: Hyperledger is a project in the Linux Foundation to bring open source distributed ledgers out into the world. I've been involved in it since December of 2016. Red Hat's been a member for two years.

There are multiple projects within Hyperledger. The two main ones that people know are Fabric from IBM and Sawtooth from Intel. There's a bunch of smaller projects as well to complement these technologies.

Both Fabric and Sawtooth are distributed ledger implementations with different consensus models and things like that, and both are getting to the point where they can do pluggable consensus models.

One of the things that no one was doing at Hyperledger, and where I felt I could help across all the projects, is performance and scalability. People see out in the world that the Bitcoin and Ethereum stuff is not scaling. When it hits scale issues, things go poorly.

I proposed in April that we have a Performance and Scale Working Group to go off, investigate this, and come up with some tests and ways to measure. It passed unanimously, but the scope was actually expanded from what I proposed, and they don't want it to just focus on Hyperledger but to focus industry‑wide.

Since that time, I've been in touch with the Enterprise Ethereum Alliance, with the person leading their performance and scale work. In principle, we've agreed to work together.

Gordon: I'm interested in some of the specific things that you've found in this performance and scale work. Maybe before we go into detail there, at a high level, where do you see the scalability and performance challenges with blockchain and distributed ledgers?

It's obviously early days. You've done performance work with the Linux kernel, which is about tweaking for very small increments of performance, where distributed ledgers are obviously in a very different place today.

Mark: The design of the original Bitcoin, and those technologies, is what was called proof of work. They gave you a large cryptographic hash you needed to go solve in order to prove that you actually did the work.

There were consensus algorithms based on that, on who got there first and who got to build the chain and add to it. It quickly got to the point where people started using GPU offload or going off and fabricating FPGAs directly to give themselves an advantage. There's a quick example of performance and scalability.
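[Editor's sketch, not part of the conversation: the proof-of-work idea Mark describes can be illustrated with a toy hash puzzle. This is a simplified assumption, not Bitcoin's actual scheme, which uses double SHA-256 against a numeric target. The function names `mine` and `verify` and the "leading zero hex digits" difficulty rule are hypothetical choices made for illustration.]

```python
import hashlib
from typing import Tuple


def mine(block_data: str, difficulty: int) -> Tuple[int, str]:
    """Search for a nonce whose SHA-256 digest starts with `difficulty` zero hex digits.

    Toy rule for illustration only; real networks compare against a numeric target.
    """
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


def verify(block_data: str, nonce: int, difficulty: int) -> bool:
    """Checking a claimed solution takes a single hash, unlike finding one."""
    digest = hashlib.sha256(f"{block_data}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


if __name__ == "__main__":
    nonce, digest = mine("example block", 3)
    print(nonce, digest, verify("example block", nonce, 3))
```

Each extra zero digit multiplies the expected search effort by 16 while verification stays at one hash, which is roughly why raw hashing speed from GPUs and custom hardware became an advantage.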

The other issue is, because it's consensus, everything gets shared. Everyone has to agree on it, or some large percentage has to agree on it. As the network grows, more and more nodes are involved in this, and it becomes a big scalability problem.
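[Editor's sketch, not part of the conversation: a rough way to see why agreement gets expensive as the network grows. It assumes a single voting round in which every node sends its vote directly to every other node. That all-to-all pattern is an illustrative assumption, not a description of any particular Hyperledger consensus protocol, and `all_to_all_messages` is a hypothetical helper.]

```python
def all_to_all_messages(nodes: int) -> int:
    """Messages in one round where each node sends one vote to every other node."""
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    return nodes * (nodes - 1) if nodes > 1 else 0


if __name__ == "__main__":
    for n in (4, 10, 100, 1000):
        print(n, all_to_all_messages(n))
```

Under this assumption the message count grows with the square of the node count, so a tenfold increase in nodes costs roughly a hundredfold increase in traffic per round.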

Gordon: Let's talk about the work that you've done so far. What have you been focusing on?

Mark: The Performance and Scale Working Group is really just getting started. Right now, we're trying to go through and identify three or four different vertical use cases. We're focusing more on distributed ledgers and their smart contracts, things like that.

Right now we're trying to go through and identify use cases that another working group within Hyperledger has already defined. We can take those, and then say, "These are the key characteristics of those," because some of these vertical markets may not need the most transactions per second. It may be more how much you can scale.

The other interesting thing is there are two types of implementations, or deployments I should say. One is permissioned, where you need permission. That's called private. The other is permissionless, which is public. Bitcoin is public. Anyone can join.

In the permissioned case, you need to be invited, so you can control the scale that way.

Gordon: Also, there's at least some discussion that in private distributed ledgers or blockchains, it's even possible you may not need proof of work.

Mark: Yes, a lot of it is working now towards proof of stake, where you prove that you're a stakeholder. There's less computation involved.

Gordon: Now, you mentioned at the beginning of this podcast that you can think of a distributed ledger as almost a form of ‑‑ not to put words in your mouth ‑‑ distributed database. There are obviously very different performance characteristics, at least as things stand now.

How do you see that interplay? Do you see one substituting for the other, or what do you see as the relationship between distributed ledgers, blockchain, and distributed databases?

Mark: Distributed databases are more focused on sharing data, spreading it out. With blockchain and distributed ledgers, everyone has the same copy. People are looking at sharding now. You can go off and do just the specific set of transactions, or something like that with sharding.

It's also referred to as collections. Certain sets of nodes can go off and be involved in some transactions, others in different ones. That's one way to get around the performance and scalability issues.
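[Editor's sketch, not part of the conversation: one common way to shard is to hash a key from each transaction and let the hash pick which group of nodes handles it. The transaction layout, the `account` field, and the helper names `shard_for` and `route` are hypothetical, and this is not how any specific Hyperledger project implements sharding or collections.]

```python
import hashlib
from typing import Dict, List


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index deterministically using SHA-256."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def route(transactions: List[dict], num_shards: int) -> Dict[int, List[dict]]:
    """Group transactions by the shard responsible for their account key."""
    shards: Dict[int, List[dict]] = {i: [] for i in range(num_shards)}
    for tx in transactions:
        shards[shard_for(tx["account"], num_shards)].append(tx)
    return shards


if __name__ == "__main__":
    txs = [
        {"account": "alice", "amount": 5},
        {"account": "bob", "amount": 7},
        {"account": "alice", "amount": 2},
    ]
    for shard_id, group in route(txs, 4).items():
        print(shard_id, group)
```

Because the mapping is deterministic, all transactions for one account land on the same shard, so only that subset of nodes needs to process and agree on them.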

Gordon: If you're looking back from, I don't know, five years from now or whatever, what do you think have been some of your toughest challenges that you've had to overcome in terms of improving the performance, usability, and so forth of distributed ledgers?

Mark: Five years from now, we'll look back, and we'll think how naive we were in trying to solve some of these issues. Again, there will be a big difference between public and private, but trying to come up with consensus algorithms, I think they'll keep evolving. The amount of work needed will change.

The other thing people will need to start thinking about is storage. How are you going to store all this data over time?

Gordon: What's Red Hat's interest in this?

Mark: Red Hat, right now, we have customers coming to us saying, "We like blockchain, but we'd like it to run on your enterprise‑class software."

One of the things I'm trying to do with Hyperledger is get things running on our OpenShift platform with Kubernetes with a RHEL base underneath it, looking at being able to contribute software so that it can become part of a CI environment once we get further along.

In general, right now our goal is to offer multiple blockchain solutions. Internally, we're figuring out what that means and how to do that. Right now, we're working with several.

Gordon: To your earlier "how naive we were" comment, that's one of the things we absolutely see today around blockchain, around distributed ledger, is really everyone's trying to figure out, "Where is this going to be a great fit?" Conversely, "We really thought we could use it for that? What were we thinking?"

I was at an event about a month ago, and Irving Wladawsky‑Berger, who basically ran Linux strategy for IBM when they were first developing a Linux strategy, was on the blockchain panel at the MIT Sloan CIO Symposium.

I think he's fairly representative of a lot of people who think that blockchain can very possibly be a very big deal, while also recognizing, as Irving said, that we're probably in the equivalent of the 1980s Internet. It takes a long time to build out these kinds of infrastructure.

Mark: That sums it up pretty well. One of the other things I heard when I first started with Hyperledger back in December at a conference in New York, was everyone agreed we're at the peak of the hype cycle, but also that it's still going to be very big.

Gordon: Actually, somebody made a very similar comment to me. It might have been the same event. They asked me where did I think it was in the hype cycle.

I actually looked up a Gartner "Emerging Technologies Hype Cycle" report and guess where blockchain was in that report? [At the peak of the hype cycle.] It scares me a little bit, but I agree with Gartner, to tell you the truth, but that was certainly their opinion.

Mark: Through my interactions here at Red Hat, I'm seeing lots of interest from healthcare, insurance. You can use this to cut down on paperwork for insurance companies, things like that.

"Here's the list of treatments that you're eligible for." The doctor goes in, says, "I did these," and he just gets paid. There's no going back through the review process, things like that.

Gordon: There certainly seem to be a lot of potential use cases out there. You have to believe that at least some of those are going to pan out.

Mark: Right.