
Trumpageddon Shows Limits of Analytics

SAN FRANCISCO -- Structure 2016 -- How did analytics go so spectacularly wrong during this year's presidential election? How can enterprises and telecoms trust conclusions delivered by analytics anymore?

As the Structure conference kicked off its first day, speakers extolled the virtues of cloud computing. Among these is the ability to deliver sophisticated analytics using big data, which is just not practical to deliver on a wide scale without the cloud.

At the same time, American voters selected Donald Trump as President, after months in which the best big data analysts in the world concluded Hillary Clinton would win.

I spoke to a few experts at the conference and asked them if this outcome means analytics is baloney. They said that analytics is valuable, but the election fiasco was a valuable lesson in the limits of analytics.

"When you have humans involved, analytics can only tell you one piece of the puzzle," Eric Chiu, founder and president of HyTrust, which provides security software for the VMware Inc. (NYSE: VMW) stack, told Light Reading. "Analytics are great when it has to do with machines, patterns and behaviors of things that aren't thoughts, feelings and emotions."

People voted for Trump but didn't tell pollsters they would do it, Jeetu Patel, SVP of platform and chief strategy officer for cloud provider Box.net, told Light Reading. "There were a fair number of people that didn't overtly state their preference for Donald Trump that did go out and vote for him," Patel said.

The conclusion for companies starting to put their faith in analytics: Don't rely on what people say. Rely on what they do.

Also, unlike people, machines don't lie, deceive themselves, or change their minds. And much of the domain of analytics doesn't involve people -- it involves network management and Internet of Things.

"Instrumentation in how people use products and services is getting baked into businesses," said Patel.

Analytics used right can predict human behavior. The key is to measure what people do, not what they say they will do. Mobile apps, location data, and web tracking are valuable tools for that.

"There's a huge difference between fuzzy and inaccurate data like polling, which is not a reliable indicator of how people will vote, and application data and mobile data," Matt Wood, GM product strategy for AWS, said in a presentation at the conference here. Application and mobile data "is not a shadow or mirage of intent -- it is exactly what customers have done." And unlike political polling, application and mobile data doesn't rely on sampling -- it collects all the data. (See Why Amazon Web Services?)

In other words: "Garbage in, garbage out."

That's not an expression you hear very much anymore, though it was common in the mainframe era of computing. For those who weren't around, it means if you feed a computer bad information, you get bad answers.

The phrase was common when I was a pre-teen in the early 70s, learning in school about computers. And it's far older than that; the website Atlas Obscura finds an appearance in print in a 1957 Indiana newspaper, with hints that the phrase was already common among engineers by then.

Indeed, Atlas Obscura notes that "garbage in, garbage out" became self-referential, as Wikipedia attributed it wrongly to an IBM technician/instructor named George Fuechsel. That claim later appeared in a book -- perhaps originated by the Wikipedia entry. The book was added to the Wikipedia article as a citation, and then the information appeared in other books.

But Fuechsel himself wasn't sure; he posted a comment to a blog in 2004 wondering whether he was misremembering inventing the phrase.

The principle behind "garbage in, garbage out" dates to the 19th century, when Charles Babbage designed his mechanical calculating engines. Babbage wrote: "On two occasions I have been asked,— 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question."

Here in the 21st century, data scientists should tattoo those two sentences on their arms.

-- Mitch Wagner, Editor, Light Reading Enterprise Cloud

kq4ym 11/22/2016 | 1:18:46 PM
Re: Don't blame the analytics
One does wonder if there is some pollster bias set up that slanted questions as well as just who and how many were polled. Polls aren't always exactly science, although it would be great if there could be a "control" and "experimental" group just to avoid any bias in the polling. But then again, we don't always know how to predict just who and what to ask either in order to cover all the variations possible.
kq4ym 11/16/2016 | 11:27:22 AM
Re: Yea
As polls indicate what people say they will do, and what people say isn't always the truth, it's a bit sticky to bet the farm on polling. Analytics, on the other hand, should measure what people actually do. But even there, it's sometimes a guess that we're monitoring the data that will lead to a likely outcome. Polling and even analytics can sometimes be like betting at the race track. We can ask horse racing fans "who they like" and then monitor the actual betting on the odds board. But even seconds before the horses exit the starting gate, we don't know the winner until a particular horse crosses the finish line. The winner could be the odds-on favorite, or a long shot.
chuckj 11/14/2016 | 11:43:55 AM
Don't blame the analytics
The failure of the analytics for this election was not the analytics but those setting the parameters to skew the results to their globalist agenda. What was surprising was that despite a sea of analytics to skew public opinion or squelch turnout, Trump still won and won decisively. I have no problem trusting analytics on whether people like apples better or oranges better; there is no globalist agenda there.
JohnAnderson1985 11/13/2016 | 8:59:05 AM
Yea
I think that analysts weren't so wrong. It's just television showed Hilary cause she owed medias. It's obvious.
bosco_pcs 11/11/2016 | 3:01:17 PM
But what kind of analytics?
To begin with, phone polling is no longer sufficient. I mean, I routinely hang up on people when they say "survey." I don't mean to be mean, but nipping it in the bud is the best way to prevent social engineering.

That said, it may be unfair to blame it on analytics. Polling is at best predictive analytics. Very often, it is just descriptive big data (data lake) kind. A good refining algorithm might get from the Warehouse to a Mart. Then what? Allegedly, the victor was using Cambridge Analytica's microtargeting services. And allegedly, the same outfit was the power behind Brexit. So, now we are venturing into prescriptive analytics territory. Perhaps one can call it precision data.

Honestly, I have been away from data warehouse stuff for more than 5 years; but if I know this stuff, the pros running the campaigns should too. They shouldn't just listen to outfits like 538 for their forecast and prescription.

Finally, Mrs Clinton won the popular vote. So the polls weren't entirely wrong. They just didn't account for the Electoral College level of complexity. Their system was incomplete. That's a pity, considering the campaign had enough resources to take other things into consideration. And today Chairman Podesta said (h/t Mother Jones) the campaign did witness the erosion in the last 10 days (and you know what happened, right).
mendyk 11/11/2016 | 10:51:33 AM
BDA = BAD?
Good points, Mitch. In the rush to abdicate decision-making to non-humans, we kind of forgot about GIGO. The US election and the Brexit vote are stark reminders that output is ultimately dependent on input. BDA isn't useless, but we need to understand the limits of its usefulness and then determine whether the investment is warranted for specific applications. In the inevitable hype/disillusion/recovery process, BDA is now deeper into the disillusion phase.