This is a guest post by Kevin Dalias.
I recently had the chance to attend Strata 2014 in Santa Clara, and since it was my first time at the conference, I tried to attend as many sessions as I could to understand what really makes data science tick these days. And of course, I heard plenty of the usual “a data scientist must be…” bullet points, but session after session, a new addition to the list began to take shape in my mind.
After hearing from about a dozen of the brightest contributors to the machine learning community, it finally hit me. Today’s best data scientists aren’t just scientists; they’re designers.
We’ve all heard the saying that a data scientist is a cross between a statistician, domain expert, and machine learning hacker, but in today’s landscape, that falls short. A good data scientist needs to be all of the above and also a great designer. A designer takes the time to think carefully about the how and why of a certain project that’s in front of her, and she contemplates which aspects and experiences will take a machine learning application from functional to extraordinary for her end users. In other words, a designer doesn’t forget why she’s designing in the first place: people.
So before you jump in the deep end on your next machine learning project, take the time to think like a designer.
1. Ask yourself the Who, What, How, and Why of your machine learning project
This one was inspired by Max Shron, the esteemed and incredibly charismatic former data scientist at OkCupid. In his session, Max emphasized the importance of context in any data science or machine learning project. It’s not enough to know the data and the methods to get from point A to an effective model.
In today’s machine learning, you need to take the time to understand the people behind the request – their motives, struggles, opinions, and values. So ask yourself a few questions before you get started:
- Who is asking me to help? What’s their role in the organization, and what do they personally need to accomplish to be successful in that role?
- What data are they typically working with, and more importantly, what meaning do they currently look for in that data? What additional insights might a machine learning application offer?
- How can I reliably access and clean the data I need to fuel my application? And once I have a steady data supply, how can I ensure my model evolves to meet changing needs?
- Why am I here? Why is this project a priority for the person who requested it? If I’m successful, what will change about the requester’s daily life?
By taking the time to think about the people involved in your project, you’re positioning yourself to deliver a final product that doesn’t feel like just-another-hard-to-understand tool but instead feels like a natural fit within their already complex workflow. Remember, our objective as machine learning practitioners is to give our users super-human data-crunching capabilities, and while we’re the experts in machine learning, our stakeholders and clients are truly the domain experts.
2. Design for transparency
Machine learning is an incredibly exciting field, but it’s one that’s still highly technical and hard for an average person to grasp. Because of this, a machine learning application can often feel like a black box to an end user, and this lack of transparency and understanding will make it hard for an average user to trust and rely on your machine learning algorithm. Luckily, this challenge is easy to overcome when you approach a machine learning project like a designer.
As you design your machine learning application, take the time to document and explain the inputs you’ve chosen, the methods you use to clean the data, and the underlying ideas or concepts that drive your application.
Also, make a point of using easily understood methods like decision trees or rule-based methods where you can. It’s entirely possible that your findings and insights will challenge or outright disprove some long-held beliefs, so it’s important that you make it possible for an end user to understand the process driving these learnings. This way, your users will feel empowered to explain and interpret your machine learning application’s outputs.
So, as you begin to think about your final product, do so with the understanding that it should be easy and intuitive for a user to drill down into the underlying data and information.
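To make the idea of a transparent, rule-based method a little more concrete, here is a minimal sketch in Python. It is only an illustration, not a recommended implementation: the rule names, the `price` field, the thresholds, and the labels are all hypothetical. The point is simply that every prediction comes back with the reason it was made.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Rule:
    name: str    # human-readable explanation shown to the user
    label: str   # label assigned when the rule fires
    test: Callable[[Dict[str, float]], bool]


# Hypothetical rules: the field name and thresholds are invented for illustration.
# Rules are checked in order, so the most specific rule comes first.
RULES: List[Rule] = [
    Rule("price above 500", "premium", lambda r: r["price"] > 500),
    Rule("price above 100", "mid-range", lambda r: r["price"] > 100),
]

DEFAULT_LABEL = "budget"


def classify(record: Dict[str, float]) -> Tuple[str, str]:
    """Return (label, explanation) so a user can see why a label was chosen."""
    for rule in RULES:
        if rule.test(record):
            return rule.label, f"matched rule: {rule.name}"
    return DEFAULT_LABEL, "no rule matched; used default label"


print(classify({"price": 250}))
# ('mid-range', 'matched rule: price above 100')
```

Because the explanation travels with the label, an interface built on top of something like this can show users exactly which rule drove each result.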
3. Think about how users will interact with your machine learning project
If you took the time to work through each question in step 1, you’re in the perfect spot to start to understand how an end user will be interacting with your machine learning application and its outputs. Start by asking a few questions:
- In what circumstances will this information be used?
- Is this information going to be shared externally?
- Will users want to change or help to guide classifications or identify clusters?
Long before you settle on a model or method, it’s critical that you put significant thought into how the interface that delivers this information will ultimately look and feel to users. When you carefully consider the questions above, you position yourself not just to deliver on expectations but to design a machine learning application and interface that fit easily into a user’s workflow.
For example, if your machine learning application is going to provide insights that will be shared with an external client, you can take the initiative to design an export or output report that’s prepped and approved for sharing. Taking this approach to understanding user experience will help you design a tool that supports a user’s day-to-day work rather than adding to it.
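As a rough sketch of what a share-ready export could look like, the Python snippet below writes only the columns that have been cleared for external sharing. The column names and the idea of an "approved" list are assumptions made for illustration; your own data and approval process will differ.

```python
import csv
import io
from typing import Dict, Iterable, Sequence

# Hypothetical column names: whatever your team has cleared for external sharing.
APPROVED_COLUMNS: Sequence[str] = ("segment", "predicted_label")


def export_report(
    rows: Iterable[Dict[str, object]],
    columns: Sequence[str] = APPROVED_COLUMNS,
) -> str:
    """Render rows as CSV text, keeping only the approved columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(columns),
        extrasaction="ignore",   # silently drop any column not on the approved list
        lineterminator="\n",
    )
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


rows = [{"segment": "A", "predicted_label": "premium", "internal_score": 0.93}]
print(export_report(rows))
```

Here the hypothetical `internal_score` column never reaches the report, so the user doesn't have to strip it out by hand before sharing.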
4. Put pen to paper and sketch a wireframe of your final product
While you should definitely start thinking about interface design from the start, it’s important that you don’t primp and polish the final output or interface of your machine learning application out of the gate. While it can be incredibly tempting to jump straight to designing a sleek interface in HTML/CSS through which a user will interact with your machine learning algorithm, this is fundamentally misguided.
Once you invest time in choosing colors and layout, a couple of things happen which won’t help your cause. First, you’ll personally become a bit attached to your design, and second, those for whom you’re designing will be much more hesitant to provide feedback, as they’ll feel like they’re criticizing rather than contributing. There’s an easy solution to this challenge: good old-fashioned pen and paper.
Start with a completely rough-and-tumble sketch of what your interface might look like, and begin the UI design conversation with this sketch. When you approach the end user with a design that’s simply a sketch on paper, they’re going to feel much more comfortable offering feedback and asking for changes because the product will feel much less set in stone.
You also have the added advantage of helping an end user to become a project contributor. This means they’ll feel stronger ownership over the machine learning project and will be much more likely to evangelize it amongst your co-workers and clients.
Now, with your sketch in hand, it’s time to start the first (of many) feedback rounds with your users.
5. Design easy ways to provide feedback
I’ve made peace with the fact that I never get a design right the first time. It’s not that I’m a bad designer or machine learning engineer; I just lack the domain expertise that my end users have. So, oftentimes, features or outputs that seem important to me are actually trivial or unnecessary in the real world. And for this reason, I’d encourage you to treat any development project as an ongoing dialogue.
Start by seeking feedback on your rough sketches and concepts, and allow users to offer input and guidance as to what’s wrong and what’s not quite there yet. Take this feedback to heart and head back to development.
Be sure to keep your customers up to date as development continues and elicit feedback as you go. This can be done as frequently as daily in brief morning standups or, for busier customers, weekly will certainly do the trick. This back-and-forth will yield additional insights that ensure your product is well-matched to users’ needs.
Even when your project is just about complete, I recommend you always build in easy ways for end users to make suggestions or request tweaks. Whether this is a small “suggestions” field built into the interface or just an email address, many users will offer insightful guidance and ideas. From this feedback, patterns will emerge, and you’ll be able to lock on to the most significant and impactful changes that need to be made.
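One simple way to spot those patterns is to tag each piece of feedback and count the tags. The Python sketch below assumes you (or your users) have already labeled each suggestion with a short tag; the tags shown are made up for illustration.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_requests(feedback_tags: Iterable[str], n: int = 3) -> List[Tuple[str, int]]:
    """Count how often each feedback tag appears and return the n most common."""
    # Normalize case and whitespace so "Export" and "export " count as the same tag.
    normalized = (tag.strip().lower() for tag in feedback_tags if tag.strip())
    return Counter(normalized).most_common(n)


# Hypothetical tags collected from a "suggestions" field.
tags = ["Export", "export ", "filters", "Export", "labels", "filters"]
print(top_requests(tags, 2))
# [('export', 3), ('filters', 2)]
```

Even a tally this basic can help you decide which change to tackle first.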
Once you’ve worked through all five steps, you’ll be well positioned to think like a designer in your next machine learning project. This approach will ultimately make your work more accessible, useful, and actionable. After all, what good is a machine learning application if your users don’t use it?
My name is Kevin Dalias, and I’m presently a Senior Analyst at Conversant in San Francisco.
I spend my days helping advertisers make smarter and faster decisions through machine learning and analysis of publishers and user behavior online.
Over the last year, I’ve fallen in love with machine learning, and my work in product classification and publisher clustering has delivered excellent growth in my clients’ affiliate programs.