I was just re-reading Maida Napolitano’s article on voice picking, “Three Voices, Three Solutions” (Logistics Management, July 2010). In it, Ms. Napolitano highlights three different voice picking solutions from three different providers, each with a different architecture. Those architectural differences provide the foundation for the article’s segmentation of voice solutions.
Although the article is an excellent overview of three of the possible architectures for voice solutions, there is one small problem. There are actually FOUR different architectures available today, providing at least four different voice platforms.
Napolitano’s article covers the following designs:
- Proprietary Solutions – These are speaker-dependent solutions requiring custom voice hardware, and they are the oldest voice solutions on the market.
- Open Hardware – These can be speaker-dependent or -independent, and they utilize off-the-shelf mobile device hardware with thick-client applications provided by the voice solution provider.
- Intelligent Networks – These can be speaker-dependent or -independent, and they utilize a thin-client “approach”, with “more intelligence placed in the network” (quoted phrases are from the article).
Although this description comes very close to enumerating all the differences in voice architectures, it is missing one key design. It also separates two solutions that share a fundamental design element, and it lumps the missing design in with the last one.
Let me explain.
The first two options, Proprietary and Open Hardware, do offer different hardware choices. However, they are alike in that both rely on an API connection on the back end to interface with the WMS application. Leaving the hardware aside for a moment, this architecture requires that the voice part of the solution be tightly integrated with the rest of the WMS solution on the back-end/server side of the equation. This can lead to redundancy, where the voice solution must re-implement part of the WMS application, and to other additional costs, and it reduces the flexibility of the overall WMS application. Every software engineer knows that the more integration points a system has, the more complex it is to change any one part.
The last option, Intelligent Networks, is really about using VoIP phones to implement the voice part of the equation. While this is an interesting use of technology, like any good innovation it adds an entirely new set of challenges, risks, and potential costs. Plus, it mixes disparate parts of the IT infrastructure while attempting to remain ruggedized.
So, what’s missing?
The missing architectural design in this article is a totally client-side solution like Wavelink’s Speakeasy. Speakeasy puts the voice engine technology entirely on open hardware, yet requires NO API integration on the back-end/server side of the WMS application. Everything is done on the client side, i.e., the mobile device. This architecture has all the benefits of Ms. Napolitano’s Open Hardware AND Intelligent Networks, with none of the challenges induced by a server-side architecture. And it does not require a separate device, like a phone. It is open hardware, speaker-independent (or -dependent if you choose), and it relies on the high-performance, highly secure thin clients that are ubiquitous on mobile device platforms. Nothing is proprietary, and the end user gets maximum flexibility to implement as much, or as little, voice in their multi-modal solution as they require.
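To make the distinction concrete, here is a minimal sketch of the client-side idea, in Python. Every name in it is hypothetical (this is not Speakeasy’s actual API); the point is simply that the voice layer lives on the device, speaking the screen prompts the WMS already sends and turning recognized replies back into ordinary keyboard input, so the WMS server needs no voice-specific API at all.

```python
# Hypothetical sketch of a purely client-side voice layer.
# All names here are illustrative stand-ins, NOT a real product API.

def text_to_speech(prompt: str) -> str:
    """Stand-in for an on-device text-to-speech engine."""
    return f"[spoken] {prompt}"

def speech_to_text(utterance: str) -> str:
    """Stand-in for an on-device speech recognizer."""
    return utterance.strip().upper()

def voice_wrap(screen_prompt: str, worker_reply: str) -> tuple[str, str]:
    """Wrap one unmodified WMS screen interaction in voice.

    The WMS emits `screen_prompt` exactly as it would for a keyboard
    user; the device speaks it, then the worker's spoken reply is
    recognized and returned as if it had been typed. The server side
    never knows voice was involved.
    """
    spoken = text_to_speech(screen_prompt)   # device speaks the prompt
    typed = speech_to_text(worker_reply)     # reply becomes keystrokes
    return spoken, typed

spoken, typed = voice_wrap("Pick 3 from bin A-17", "three picked")
```

Because the wrapping happens entirely on the device, the same multi-modal screen can serve keyboard, scanner, and voice input with zero server-side changes, which is exactly the flexibility argument above.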
While I applaud Napolitano’s excellent article as an overview of three potential voice solutions (with great examples, by the way), I highly recommend that anyone considering a voice solution remember there is a fourth option, one that can provide much higher flexibility at much lower cost.
For more on Wavelink’s Speakeasy, I recommend viewing the video of Goya Foods’ implementation on Wavelink’s YouTube channel.