Jignesh Patel’s Big Data Revolution
Over the past 40 years, Madison has played a significant but often unsung role in the rise of Big Data. Now one UW prof is carrying the torch, working on research with the potential to transform the digital revolution and democratize data as we know it.
Jignesh Patel's Quickstep project will make processing Big Data more efficient and available to more people.
PHOTO BY NOAH WILLMAN
"It's kind of like finding a needle in a haystack."
Jignesh Patel is sitting in a Madison café talking about big data. Between sips of coffee, the University of Wisconsin computer sciences professor uses the familiar expression to explain just what this buzzy tech phrase is all about before launching into a remarkable story about Madison’s connection to big data’s past, present and future.
The tech trend du jour, “big data” refers to the vast amount of information that’s out in the world, made possible by the digital trails we leave as an ever-increasing portion of our routine activities moves online: email, web searches, online banking and shopping, social media posts and so on. All of this digital activity, coupled with the minuscule cost of storing this information, enables those who can harness and analyze large sets of data to learn all kinds of valuable information about the world and those living in it. It’s big data to thank (or blame) when you see a web ad for a store you were just browsing online, and it’s what makes it possible for the weather app on your phone to tell you if it’s going to snow today. In Patel’s hay and needle narrative, the enormity of all the data that’s out there is the hay; the tiny portion that someone actually wants is the proverbial needle. That weather app can tell you it’s going to snow in Madison today at noon, yes, but it also has information on the temperature in New York, the wind speed in San Diego and the overnight low in Little Rock.
And while the term big data has been around for a decade, crossing over from geekspeak to mainstream media usage over the past year or two, what it stands for—having an infinite amount of information available at the swipe of a finger—is what defines this digital revolution we’re in.
Patel, forty-three, has spent the last twenty years, most of them in Madison, studying data. His work has earned him such prestigious honors as the National Science Foundation Career Award and faculty awards from companies like Google, IBM and Microsoft. Research from his PhD, conducted at UW in the 1990s, was commercialized by computer hardware giant NCR Corp., where Patel worked as a software engineering consultant for a year. He’s published more than eighty academic papers and has been courted by all the major social networking companies, including Twitter, which last August acquired Locomatix, a startup Patel co-founded. Locomatix uses large sets of data to power real-time mobile analytics for businesses. While Twitter’s plans for the Locomatix platform remain veiled, it’s clear that the pithy (and now public) social messaging company is after Patel’s kind of brainpower.
BIG DATA'S BIG BROTHER
Originally from Mumbai, India, Patel first came to UW in 1991 as a graduate student to study computer hardware. He took one fateful course in databases, a sub-field of computer sciences that deals with organizing and analyzing data sets, and was hooked. While the topic of databases itself was interesting, it was UW’s reputation in this field and the people pushing it forward that won him over.
“A lot of data processing stuff that now runs the world, the ideas for that were invented here,” Patel says. “The influence of Wisconsin, going back to its legacy as being the pioneer in ... building database technology, is at the heart of technology today.”
The groundbreaking research Patel references is the stuff of the Wisconsin Database Systems Group, a research collective launched by then-associate professor David DeWitt in the late 1970s. “It was one of the very few departments around the world at the time that actually saw the value of data,” says Patel, who received his master’s and PhD from UW in 1993 and 1998, respectively, with DeWitt as his advisor.
DeWitt, now professor emeritus after a thirty-two-year tenure in the UW computer sciences department, including four years as its chair, is big stuff. He is thought of as the father of parallel database systems, a technology that Patel says is the precursor to big data. Parallel database systems changed the game. It was like going from a single-lane highway plagued by traffic jams to a six-lane freeway where the cars are all still headed in the same direction but are able to move faster and more efficiently. “The world’s economy runs on databases,” DeWitt says.
With DeWitt at the helm, the group quickly grew to five members and became one of the premier academic research collectives of its kind. “Nobody else had five people in database systems,” DeWitt says. Considering that the “nobody else” here refers to Berkeley, MIT and the University of Michigan, the only other institutions with similar groups at the time, it’s no wonder Wisconsin’s early dedication to database systems secured its prominence in the field.
Jeff Naughton, current chair of UW’s computer sciences department, left his faculty post at Princeton to return to his hometown of Madison because of UW’s expertise in databases, his area of research.
“This was like the center of the universe for database systems,” he says.
And because UW’s computer sciences department played such a large role in developing this core technology, many higher-ups at technology companies are UW alumni. “Google has a strong influence of Wisconsin folks on the back-end,” Patel says. “Many were graduate students with me in the ’90s. Same thing with Facebook and Twitter.” Companies like Microsoft, Yahoo, Oracle and IBM have also had former members of the Wisconsin Database Systems Group within their senior leadership teams.
After receiving his PhD and consulting with NCR Corp. for a year, Patel moved to Ann Arbor to teach at the University of Michigan. He stayed there for nine years, eventually coming back to Madison in 2008 in a rare move for UW’s computer sciences department, which, according to DeWitt, seldom hires back its own PhD graduates, preferring to bring new blood into the department. But they wanted Patel back. He’s that good.
Back with the Wisconsin Database Systems Group, Patel developed a particular interest in how much energy is consumed in running database systems, especially when he thought about the energy needs his two young children’s generation will inherit. The problem? “You can’t put in more power than we already put in,” he says. The amount of data in the world doubles every eighteen months to two years. “But to match that we obviously can’t start to double our power budget every two years,” Patel says. “That’s unsustainable.” This power limit appears across the spectrum of computing, from a personal tablet or phone to hundreds of thousands of corporate- and government-owned servers across the globe.
Think of it this way: If you wanted your iPad to be twice as fast (to take half as much time to load a Netflix movie, for example), the extra power it would need would also heat up the device, making it twice as hot. “You probably wouldn’t like that very much,” Patel says. More than just making the device painful to the touch, such a heat increase would melt its inner parts.
So that’s what Patel’s working on now. His current research at UW, a project launched in 2011 called Quickstep, is working toward developing a next-generation database system that will organize and process big data more efficiently to make the best use of the power that’s available. If you can’t put more power in, you have to make use of what you have, doing more with less.
Patel and his ten graduate research assistants are taking a nuanced approach with the Quickstep project, developing both computer hardware and software together to complement each other and make this new database system as efficient as possible.
Craig Chasseur, a current PhD student and one of the Quickstep research assistants, says that the advances in software and hardware over the past few decades have not matched up. “Current database systems are leaving a lot of potential ... on the table. A big thing we’re addressing with Quickstep is unlocking all that potential.”