Smashing Spotlight: Antonio Holguin, Associate Principal Designer
Ever wonder what makes Smashing Ideas so smashing? Our people! We sat down with Antonio Holguin, Associate Principal Designer, to talk shop, the top 5 things that designers can do to get Voice UI right, the steps designers can take to make more inclusive and accessible digital experiences, and as our resident expert on space, his thoughts on whether we’re truly alone in the universe.
You’ve been at Smashing for more than 10 years. What keeps you so invested in continuing to work here?
First and foremost, the people. I work with some of the most interesting, smart, funny, and talented people who always keep me laughing and learning.
Second, my personal direction. Or, rather, directions. Over the years I’ve had the opportunity to pursue a variety of focuses. Nothing is ever stagnant if I don’t want it to be. Smashing has given me the opportunities to explore. I started as a junior designer creating Flash ads, sites, and games, then moved to our Mobile and Devices group, where I designed iOS games and apps. Later I refocused on web design. Currently, I am designing and building Alexa Skills and Google Home Actions.
You have a strong interest in Voice User Interfaces. Do you think Voice UI will become as mainstream as its touch counterpart? If so, why?
Voice User Interfaces (VUIs) are already becoming mainstream. Currently, we think of VUIs as cognitive machines and artificial intelligences to some degree, but VUIs that control even simple computers have been around for decades.
Let’s back up in history just a little bit to see how we are poised to be very receptive to VUIs. You can find audible interactive experiences of some kind in everyday life. Calling a customer service hotline with a phone tree is a variation of a VUI – “press STAR to hear this menu again.” If you’ve ever gone through the self-checkout lane at the grocery store, you’ve been told how to use the scanner by the computer – “place your item in the bagging area.” In these instances, you may not be using your own voice to interact with the computer, but the interface presented to you is audible and may or may not have a visual counterpart. We are accustomed to listening to a computer respond to whatever input we give it.
Now, consider that voice input has been around for some time as well. Phone trees, or Interactive Voice Response (IVR) systems, have been able to respond to a caller’s voice since the 1970s, but have become more abundant in the past decade or so due to increased computational capabilities. Many of these are still a bit troublesome, resulting in many users yelling “Representative!” to eventually speak to a human.
We are finally seeing the rise of consumer devices with embedded Natural Language Processing (NLP), like Amazon’s Echo with Alexa, Google’s Home with Assistant, or Apple’s OSes with Siri, among others. These are backed by incredibly robust, offsite computational services which can handle language processing and return results at breakneck speeds (provided the internet connection is stable and fast). These same services are beginning to be more than simple instruction takers. Backed by machine learning (algorithms, data storage, etc.), these services are becoming “intelligent” enough to proactively help users.
Vocal or audible language is an interaction method the vast majority of users can understand and are ready to apply today. Vocal language is one of the first communication tools humans learn. So it’s only natural that VUIs become a popular mode of communicating with technology. It’ll only take increased access to devices with advancing NLP capabilities to make VUIs more abundant. Simultaneously, the artificial intelligences will become “smarter” and more capable of helping users, adding to their appeal, which will, in turn, increase the desire to access VUI-capable devices.
What do you think are the biggest hurdles for VUI to become more ubiquitous?
A. Privacy and Security. | The communications we have with our technology are quite private in nature, on a number of levels. A device plugged into the internet that is always listening is, frankly, scary. Users will need to trust that the device isn’t recording and sending anything it hears to anyone who could use it maliciously. Privacy is a major concern.
Using this technology in public is also problematic. Most people aren’t going to want strangers to overhear what a text message from a loved one says, or that they need to put milk on their shopping list. Public use of VUIs is a barrier to common use.
B. Speech Recognition. | Everyone speaks differently. VUIs will need to be able to recognize as many people’s voices and speech patterns as possible. A VUI that can’t understand the words we say gets really frustrating, really fast.
What does the research and design process for Voice UI look like?
Designing a VUI app starts out much like phone app or site design, but eventually it involves more writing, talking, and listening, rather than pushing pixels.
Here’s a general step-by-step:
- First, you need to define the goal or idea.
- Second, you need to define the users, and their needs, goals, and pains.
- Third, create a journey map to help visualize how the app is used within a user’s daily life.
- Fourth, you can begin to dig into the app itself and define a user flow. This is where VUI apps begin to diverge from other UX-centered projects. It helps if you start writing simple sentences. For instance, “I want to do X for Y and Z.” That sentence can most likely be said a number of different ways with the same expected outcome and response. Build upon the user flow and scripts. It helps to read your script out loud and have other people help as actors. Remember, building a VUI app is essentially designing a guided conversation.
- Fifth, once you begin to build the app, start interacting with it. Computer speech doesn’t sound like human speech, so there will be a considerable amount of time tweaking responses to have it sound just right.
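The fourth step above – many phrasings with one expected outcome – is the core mechanic of a voice interaction model. Here’s a minimal, hypothetical sketch of the idea (a toy matcher, not Amazon’s or Google’s actual API): several sample utterances all resolve to a single intent, the way an Alexa Skill or Google Action pairs sample phrases with one intent. The intent name and utterances are invented for illustration.

```python
import re

# Hypothetical interaction model: one intent, many ways to say it.
INTENTS = {
    "AddToListIntent": [
        "add milk to my shopping list",
        "put milk on my shopping list",
        "i need to buy milk",
    ],
}

def normalize(utterance):
    """Lowercase and strip punctuation so phrasings compare cleanly."""
    return re.sub(r"[^a-z\s]", "", utterance.lower()).strip()

def match_intent(utterance):
    """Return the intent whose sample utterances include this phrasing."""
    spoken = normalize(utterance)
    for intent, samples in INTENTS.items():
        if spoken in (normalize(s) for s in samples):
            return intent
    return None  # no match – a real VUI would ask a clarifying question
```

Real NLP services go far beyond exact matching – they generalize from the samples to unseen phrasings – but the design exercise is the same: write the sentences, say them out loud, and keep adding variants until the script covers how people actually talk.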
How do you go about pleasing both a client and the end-user?
Anything net positive for the user is a positive for the client. As long as we continue to focus on making a product that works as well as possible for the people using it, then we are doing our due diligence. The trick is making sure our product meets the needs of any business requirement, and that those requirements are not in contradiction to what users need or expect.
What are the top 5 things developers/designers can do to get Voice UI right?
- Read. | Read everything. Then read some more; not only blog posts and books about how to design for Voice, but everything you possibly can. I like reading science communication, because the authors have to be as clear and unambiguous as possible.
- Write (including editing). | Writing helps us express our words as clearly as possible. Writing gives us the chance to change the words, to perfect them. Then write again.
- Talk. | Talk through your ideas. Talk through your design. Talk about what you wrote and what you’ve read. Read your writing out loud.
- Gather feedback. | Have other people listen to you and give you feedback on what you said and how you said it. It’s like having an editor for your diction.
- Listen. | I mean, really listen. Different people say the same things different ways. Listen to how synonyms can clarify or obscure a thought. Listen to VUI responses, but don’t expect that what you write as your script will be read aloud by the computer as eloquently as a human. You’re going to have to go back and rewrite some speech.
Dream artistic project?
Not a project, but rather where my art ends up. Ideally, I’d love an installation or permanent hanging at a NASA, SpaceX, or Blue Origin admin building.
What improvements to this kind of technology do you want to see?
- Connected device hardware integration done well. | Nest in conjunction with Google Home is possibly the best example of this to date. Often, connected devices have a VUI component that feels either tacked on, or not very well thought out.
- Better speech recognition. | Toddlers are a good litmus test for VUIs. Google Assistant can understand my three-year-old with relatively decent accuracy. Alexa is only OK. Siri has serious trouble understanding him at all.
- More robust computing AI. | As I stated above, when the intelligences behind a voice interface become more proactively helpful, we’ll see a large rise in consumer usage.
- Better 3rd party developer support and access. | Right now Amazon provides the most access to their Alexa service, but it is still in its infancy, with changes, bugs, a confusing array of device-based capabilities, and a base of knowledge that is sporadic and difficult to navigate.
Accessibility within design is unfortunately not always a given. What steps can designers take to make more inclusive digital experiences?
To start, we collectively need to stop thinking of accessibility as another “problem” to find a solution for. Users with varying needs are not a defect that pose an obstacle to design, nor do they deserve their experiences minimized to an “accessibility pass” at the end of a project. Start with thinking about people who need assistance as a part of your user base, because they are.
If you could have dinner with any character in the Star Wars galaxy, who would it be and why?
- Bro-friend dinner: Finn
- Spiritual Advisor / Life Coach dinner: Leia
- Dinner like the ‘Royale with Cheese’ scene in Pulp Fiction (jabbering about silliness): Han (from Return of the Jedi era)
- Dinner like the dinner scene in The Untouchables: Poe – Gotta teach him about teamwork.