If any further proof were needed that podcasts are a big deal, the summer announcement of an exclusive deal between Joe Rogan and Spotify, reportedly worth $100 million, should do the trick.
Under the licensing deal, the entire library of “The Joe Rogan Experience”, one of America’s most popular podcasts, will become exclusive to Spotify by the end of 2020.
But navigating and searching a podcast library like Rogan’s can be extremely difficult. Google and its enterprise counterparts have focused their search engines on textual documents, leaving many other kinds of unstructured data untouched.
Vancouver startup Caption is tackling this problem. Caption allows customers to transcribe, index and search audio and video archives, either through a graphical interface or through an API that customers can use to power the search functionality on their own sites.
Entrepreneur Gary Vaynerchuk’s personal search engine is an excellent example of this type of technology at work, allowing you to search for anything he’s ever said.
Due to their ballooning popularity, podcasts have become one of Caption’s most important use cases. Episodes frequently run over three hours, and the long-form format of the conversations can make them challenging for listeners to navigate.
With this in mind, Caption has used its technology to build a podcast search engine that aggregates episodes from fifteen of the most popular podcasts and counting, including Lex Fridman’s podcast and Dan Carlin’s Hardcore History.
The site, which launched last week and was featured on Product Hunt, allows users to enter any keyword and find the exact moments where it’s mentioned inside the episodes. It also provides transcripts of the episodes.
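To illustrate how a keyword can be mapped to exact moments in an episode, here is a minimal Python sketch. It is not Caption’s implementation, which is proprietary; it assumes that transcripts arrive as timestamped segments, and the segment data, the function names and the simple inverted index are all hypothetical choices made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical transcript segments: (episode, start time in seconds, text).
# A real system would get these from a speech-to-text service.
SEGMENTS = [
    ("Episode A", 0.0, "Welcome back to the show"),
    ("Episode A", 754.5, "Let's talk about machine learning"),
    ("Episode B", 3905.0, "Learning history takes patience"),
]


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(segments):
    """Map each word to the (episode, start time) pairs where it is spoken."""
    index = defaultdict(list)
    for episode, start, text in segments:
        for word in set(tokenize(text)):
            index[word].append((episode, start))
    return index


def format_timestamp(seconds):
    """Render a time offset in seconds as HH:MM:SS."""
    total = int(seconds)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def search(index, keyword):
    """Return (episode, timestamp) pairs for every mention of the keyword."""
    hits = index.get(keyword.lower(), [])
    return [(episode, format_timestamp(start)) for episode, start in sorted(hits)]


index = build_index(SEGMENTS)
print(search(index, "learning"))
# [('Episode A', '00:12:34'), ('Episode B', '01:05:05')]
```

Because each indexed word keeps the start time of the segment it came from, a search result can link straight to the right point in the audio instead of to the episode as a whole.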
Caption was founded by Marin Smiljanic, a former engineer at Amazon in Vancouver. While working at the e-commerce giant, Smiljanic grew frustrated trying to access educational videos on the company portal.
Caption’s software builds on the latest advances in speech-to-text technology, now readily available from both AWS and Google Cloud Platform, and adds proprietary algorithms that power textual search, similarity queries, and recommendations.
So far, the number of podcasts Caption has indexed is relatively small, but Smiljanic’s goal is to secure deals with podcasters to license the software.