Pull insights from a sample. Push actions to the census.
Too often we geeks, suits, and wonks live in our own worlds of making it work, making it sell, or making it right, respectively. We don’t seem to have the capacity to routinely cross-pollinate. This may be why knowledge grows exponentially and wisdom grows only linearly. Conferences like Strata give us a venue to escape from our mile-deep, inch-wide comfort zones.
Take the talk by Quid’s Amy Heineike on maps, not lists. She references Eli Pariser’s warning that ranked lists of search results spoon-feed us the “best” result at the top. Heineike argues that maps lend themselves to mindful exploration, even for non-spatial data. I certainly agree. What’s more, she also strikes a hopeful chord for data science. In The Filter Bubble (which I highly recommend), Pariser quotes Marshall McLuhan, who warns that “we shape our tools and thereafter our tools shape us.” Heineike provides a great example of how we then reshape our tools, something I discussed at Strata+Hadoop World last fall.
Twitter’s Nathan Marz sounded similar “people meet data” themes during his keynote on human fault-tolerance. His battle cry is for data systems that protect themselves from human error, much as we protect systems from hardware faults. His ground-truth axiom, that the worst problems result from lost or corrupt data (especially silent corruption), is spot-on. All else is recoverable. He recommends immutable systems where bugs can’t delete or corrupt data. Immutability resonates with my electrical engineering training, where outputs are the result of transfer functions applied to inputs. This is the essence of functional programming, which I’ve taken a liking to lately.
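To make the immutability idea concrete, here is a minimal Python sketch. It is not Marz's actual architecture, only an illustration under my own assumptions: the `Event` type, the `append` helper, and the account-balance example are all hypothetical. Raw events are only ever appended, and every view is recomputed from them by a pure function, so a buggy view can be thrown away and rebuilt without losing data.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Event:
    """Hypothetical raw fact: an account's balance changed by `delta`."""
    account: str
    delta: int


def append(log: Tuple[Event, ...], event: Event) -> Tuple[Event, ...]:
    """Return a new log with the event added; the old log is left untouched."""
    return log + (event,)


def balances(log: Tuple[Event, ...]) -> Dict[str, int]:
    """Pure function from inputs (the log) to outputs (a derived view)."""
    totals: Dict[str, int] = {}
    for event in log:
        totals[event.account] = totals.get(event.account, 0) + event.delta
    return totals


# Hypothetical data for illustration only.
log: Tuple[Event, ...] = ()
log = append(log, Event("alice", 100))
log = append(log, Event("bob", 50))
log = append(log, Event("alice", -30))

view = balances(log)
print(view)
```

If `balances` turned out to contain a bug, the fix would be to correct the function and recompute the view. The log itself is never edited in place, which is the property that makes human error recoverable in this sketch.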
Hosted by Strata Conference chair Alistair Croll, the Great Debate: Design Matters More Than Math pitted LinkedIn’s Monica Rogati and SkyTree’s Alexander Gray, arguing for math, against O’Reilly’s Julie Steele and ClearStory Data’s Douglas van der Molen, arguing for design. The design side made a valiant attempt, but math won in a landslide. I was on the losing design side for the simple reason that, yes, math defines what’s possible, but design defines what’s preferable. Math may be the source of knowledge, but design is the source of wisdom, something our species has in short supply.
Finally, my talk (video here) on the sensitive topic of criminal profiling attempted to push the technology and the debate. We designed a felon classifier based on a defendant’s publicly available non-felony criminal record and personal data. The resulting classifier is available on GitHub here.
One of the motivations for the talk was to prove that big data inferences are not a new category of thoughtcrime. However, actions based on those inferences could very well be criminal. For predictive policing, the courts will be the final human arbiter on the admissibility of such computerized informants. Here’s my interview with O’Reilly’s Mac Slocum that touches on some of these issues:
What I found most interesting about this exercise is that the technology can only take us so far. The classifier’s operating point determines how many innocents will be classified as felons (false positives) and how many felons will go undetected (false negatives). Only we fallible humans can choose the right trade-off between tyranny and anarchy. Such is the line that the responsible innovator must walk between high-tech mercenary, traditional capitalist, and social entrepreneur.
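The operating-point trade-off can be sketched in a few lines of Python. This is not the felon classifier from the talk. The scores, labels, and thresholds below are made up for illustration, and `confusion_counts` is a hypothetical helper. The sketch only shows how moving the threshold trades false positives for false negatives.

```python
from typing import Sequence, Tuple


def confusion_counts(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[int, int]:
    """Return (false_positives, false_negatives) when scores >= threshold are flagged."""
    false_positives = sum(
        1 for s, y in zip(scores, labels) if s >= threshold and y == 0
    )
    false_negatives = sum(
        1 for s, y in zip(scores, labels) if s < threshold and y == 1
    )
    return false_positives, false_negatives


# Hypothetical classifier scores and true labels (1 = felon, 0 = not).
scores = [0.10, 0.25, 0.40, 0.55, 0.60, 0.70, 0.85, 0.95]
labels = [0, 0, 1, 0, 1, 0, 1, 1]

# Sweep the operating point from permissive to strict.
for threshold in (0.3, 0.5, 0.8):
    fp, fn = confusion_counts(scores, labels, threshold)
    print(f"threshold={threshold}: false positives={fp}, false negatives={fn}")
```

On this toy data, the low threshold flags two innocents and misses no felons, while the high threshold flags no innocents and misses two felons. Nothing in the code says which of those is the right choice. That decision stays with us.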
Never mistake motion for progress. — with apologies to Ernest Hemingway
The earth probably sees plastic as just another one of its children. Could be the only reason the earth allowed us to be spawned from it in the first place. It wanted plastic for itself. Didn’t know how to make it. Needed us. Could be the answer to our age-old egocentric philosophical question:
Us: “Why are we here?”
Earth: “Plastic … asshole.”
— George Carlin
I just finished Ray Kurzweil’s How to Create a Mind: The Secret of Human Thought Revealed. The book is technical enough for the nerdy, but plainspoken enough for everyone else. It got me thinking or, as Kurzweil would have it, pattern matching.
Kurzweil expands on his decades-long thesis that the Law of Accelerating Returns (LOAR, as he’s coined it) drives the exponential increase in the price/performance of computing. By 2029, he argues, this growth in hardware and software will create an intelligence that rivals our brain’s wetware. The LOAR is based on five key concepts that underlie all computing:
Managers cannot single-handedly create value … but they can single-handedly destroy it.
For those who missed my talk at Strata+Hadoop 2012 in October, Big Data is a Hotbed of Thoughtcrime. So What?, the kind folks at O’Reilly have made it available on their YouTube Channel. Here’s the video link.
I also did an interview for the preview issue of Big Data Journal on the thoughtcrime topic. And for those who want to follow along with the video, here’s the presentation:
In great teams, there is a democracy of ideas, but a dictatorship of decisions. — Khoi Tu, author of Superteams
There’s a lot at stake in today’s election. As expected, the candidates have been vocal about education, health care, women’s rights, foreign policy, the environment, economics — even Big Bird. But my question to the candidates is: “What’s your stand on the balance between government regulation and technology innovation?”
If the Republicans win, we can count on business-friendly policies that drive innovation. If the Democrats win, we’ll see continued focus on consumer protections. Regardless of who wins, one thing is certain: the country loses if we fail to find the balance between the two.
A version of this post appeared on the TRUSTe blog.
Despite the ongoing discussions about online privacy among legislatures, regulators, and data conservationists, self-regulation remains the primary tool to ensure consumer information is handled responsibly. And rightly so.
Too often, privacy debates devolve into false dichotomies, dominated by arguments that we should always be anonymous or that privacy is dead. Both are wrong. Online privacy continues to be an important conversation because we humans are both social and autonomous creatures. And we need solutions that balance the values of both disclosure and discretion. The U.S. founders knew that.
If my life is fruitless, it does not matter who praises me, and if my life is fruitful, it does not matter who criticizes me. — John Bunyan