Just wrapping up my promotion of the Strata ‘Making Data Work’ Conference taking place on 1-3 February.
Last week I launched a quickfire Twitter raffle for one lucky person to win a free full pass to the Strata conference. Thanks to all who entered by re-tweeting or mentioning the post.
Entry closed at 12:00 UK time and, after a colleague randomly drew a name out of a hat (it was actually a cup), I have pleasure in announcing that Jérôme Cukier (jcukier) is the winner and recipient of the free pass discount code – I wish him a very enjoyable conference.
Commiserations to those who missed out – time is starting to run out before the event kicks off but you can still benefit from a 25% discount off the registration fee by using code str11vsd or by clicking here.
Unfortunately, I can’t attend the event, so for those who are lucky enough to be going I thought I’d share what I see as the best plan of action for the schedule, with recommendations for some of the great visualisation-related talks being held (for when you have to choose between concurrent sessions). Enjoy!
Tuesday 1st February
9:00am
Make People Fall in Love with Your Data: A Practical Tutorial for Data Visualization and UI Design Interfaces – Ken Hilburn (Juice Analytics), Zach Gemignani (Juice Analytics)
“Water, water everywhere, nor any drop to drink.” – Rime of the Ancient Mariner. People feel overwhelmed with data. But the problem is not with the amount of data. The problem is that data is not presented in a form that people can understand and use. Juice Analytics will present and demonstrate proven techniques to design information applications to present data in enjoyable and rewarding ways.
or… (maybe sneak out halfway?)
9:00am
Data Bootcamp – Joseph Adler (LinkedIn), Hilary Mason (bit.ly), Drew Conway (New York University), Jake Hofman (Yahoo!)
This tutorial offers a basic introduction to practicing data science. We’ll walk through several typical projects that range from conceptualization to acquiring data, to analyzing and visualizing it, to drawing conclusions.
1:30pm
Communicating Data Clearly – Naomi Robbins (NBR)
This tutorial describes how to draw clear, concise, accurate graphs that are easier to understand than many of the graphs one sees today. The tutorial emphasizes how to avoid common mistakes that produce confusing or even misleading graphs. Graphs for one, two, three, and many variables are covered as well as general principles for creating effective graphs.
Wednesday 2nd February
10:40am
Telling Great Data Stories Online – Jock Mackinlay (Tableau Software)
Interactive visualizations have become the new media for telling stories online. This session will focus on going from a good visualization to a great visualization by focusing on organization, user interface, and formatting. You should expect to leave this session confident in your ability to consistently create excellent interactive visuals.
11:30am
MAD Skills: A Magnetic, Agile and Deep Approach to Scalable Analytics – Brian Dolan (Discovix), Joe Hellerstein (UC Berkeley)
A discussion of Big Data approaches to analysis problems in marketing, forecasting, academia and enterprise computing. We focus on practices to enhance collaboration and employ rich statistical methods: a Magnetic, Agile and Deep (MAD) approach to analytics. While the approach is language-agnostic, we show that sophisticated statistics can be easily scaled in traditional environments like SQL.
1:40pm
Small is the New Big: Lessons in Visual Economy – Kim Rees (Periscopic)
While the majority of charts were designed to handle a variety of data, there is a certain novelty of presenting data in a very succinct way. By designing a presentation method restricted to specific data points, we can realize an economy of space and interface.
2:30pm
Big Data, Lean Startup: Data Science on a Shoestring – Philip Kromer (Infochimps)
How do you build a crack team of data scientists on a shoestring budget? In this 40-minute presentation from the co-founder of Infochimps, Flip Kromer will draw from his experiences as a teacher and his vast programming and data experience to share lessons learned in building a team of smart, enthusiastic hires.
4:10pm
Visualizing Shared, Distributed Data – Roman Stanek (GoodData) Moderated by: Alistair Croll
“Many hands make light work”, as the saying goes. That’s true when thousands of people can collaborate on a data set. In this session, we’ll look at collective interfaces that allow many distributed users to examine and share data with one another, and how that’s changing traditional desktop visualization tools.
or… (another sneak out halfway through?)
4:10pm
New Developments in Large Data Techniques – Joseph Turian (MetaOptimize)
Certain recent academic developments in large data have immediate and sweeping applications in industry. They offer forward-thinking businesses the opportunity to achieve technical competitive advantages. However, these little-known techniques have not been discussed outside academia–until now. What if you knew about important new large data techniques that your competition don’t yet know about?
5:00pm
Google Cloud for Data Crunchers – Patrick Chanezon (Google), Ryan Boyd (Google), Stefano Mazzocchi (Google, Inc.)
Many of the tools Google created to store, query, analyze, and visualize data are exposed to external developers. This talk will give you an overview of Google services for Data Crunchers: Google Storage for developers, BigQuery, Machine Learning API, App Engine, Visualization API.
7:45pm
Unleashing Twitter Data for Fun and Insight – Matthew Russell (Digital Reasoning Systems)
This talk demonstrates how an eclectic blend of storage, analysis, and visualization techniques can be used to gain a lot of serious insight from Twitter data, but also to answer fun questions such as “What do Justin Bieber and the Tea Party have (and not have) in common?”.
8:35pm
Avro Data – Doug Cutting (Cloudera)
Apache Avro provides an expressive, efficient standard for representing large data sets. Avro data is programming-language neutral and MapReduce-friendly. Hopefully it can replace gzipped CSV-like formats as a dominant format for data.
Thursday 3rd February
10:40am
Data Journalism: Applied Interfaces – Marshall Kirkpatrick (ReadWriteWeb), Simon Rogers (Guardian), Jer Thorp (The New York Times) Moderated by: Marshall Kirkpatrick
After Kennedy, you couldn’t win an election without TV. After Obama, it was social media. But tomorrow’s citizen gets their information from visualizations. In this panel, three acclaimed designers show how they apply visualization to big data, making complex, controversial topics easy to understand and explore.
11:30am
Realtime Analytics at Twitter – Kevin Weil (Twitter, Inc.)
Most analytics systems rely on large offline computations, which means results come in hours or days behind. Twitter is all about realtime, but with over 160 million users producing over 90 million tweets per day, we need realtime analytics that scale horizontally. This talk discusses the development of that infrastructure, as well as the products we are beginning to build on top of it.
1:40pm
AnySurface: Bringing Agent-based Simulation and Data Visualization to All Surfaces – Stephen Guerin (Santa Fe Complex)
Live demonstration of ambient computing using projector-camera pairs to scan the room and place interactive simulations into the space. All surfaces are rendered interactive. We will demonstrate a 3D sandtable for firefighter training and STEM education where the 3D sand becomes an interactive surface.
2:30pm
Beyond visualization: Productivity, Complexity and Information Overload – Creve Maples, Ph.D. (Event Horizon)
We will discuss the impact of the information explosion, the effectiveness of current technological directions, and explore the success that new perception-based, human-computer interfaces provide in analyzing and understanding complex data. Real examples will be used to illustrate that effective man-machine environments are essential in productively dealing with multi-dimensional information.
4:10pm
Data as Art – J.J. Toothman (NASA Ames Research Center)
Artistic visualizations and infographics tell the stories of rich data in unique, compelling ways and synthesize datasets in ways that allow them to be interpreted, absorbed, and experienced in ways beyond the spreadsheet, pie chart, and bar graph.
5:00pm
Predicting the Future: Anticipating the World with Data – Christopher Ahlberg (Recorded Future), Robert McGrew (Palantir Technologies) Moderated by: Alistair Croll
Data doesn’t just show us the past: it can help predict the future. Several new firms harvest massive amounts of open data, trying to anticipate everything from the right ad placement to the next terrorist attack. In this session, we bring together the founders of these firms to discuss the technology, and the ethics, of looking into the future.
