Hello, and thanks for joining me today.
I’m playing with an app called AIKO.
It’s an app that leverages Whisper, which is a technology made by OpenAI, the folks that brought us ChatGPT.
Now unless you’ve been living under a rock for the past couple of months, I’m sure you’ve heard quite a lot about ChatGPT and the fascinating possibilities it opens up to us.
Anyway, Whisper, and on top of that this AIKO app, allow transcription of audio.
The interesting thing about it is that you can record directly in the AIKO app, or you can import audio, say from a file that was pre-recorded.
For example, you might have a pre-recorded audio file of a lecture or a class.
You would be able to import it into this AIKO app, transcription would happen, and then you would have the output as text.
For my test today, I’m standing outside in front of my house recording on my Apple Watch with traffic going by.
And the reason I’m doing this is because I wanted to come up with a very sub-optimal recording environment, just to better understand how the technology would deal with audio recorded in such an environment.
I’m also trying to speak as naturally as I can without saying words like um and uh, things that I think often get said when speaking.
The interesting thing about AIKO and the way that it transcribes audio is that it supposedly is able to insert punctuation correctly.
I’m not sure if it does anything about paragraphs or not, but as the speaker, I don’t have any way of controlling format.
Once you run a file or recording through AIKO, the output is rendered as text.
However, there are a few things you can do with it.
First, you can of course copy the text into some other application.
The other thing that you can do is have the text be timestamped.
The reason that this can be handy is that you can use that then to create files that can be used as closed captioning for videos.
Anyway, it is kind of loud out here, and so I will go back inside.
I also didn’t want to make this too long because I’m not sure if it’ll work at all or how accurate it’ll be, but my plan is to post this to the blog without editing it.
Stop, stop, stop.
Aiko-generated transcription from my Apple Watch recording.
One final note: the dictation ends with the words “stop stop”. I didn’t actually speak those words, but because I have VoiceOver activated on my Apple Watch, they were picked up in the recording as I located and activated the stop button.

This is definitely incredible technology, and the price certainly can’t be beat. From an accessibility perspective, I found Aiko to be extremely accessible with VoiceOver on both Mac and iOS, and since it is a native app using native controls, I feel confident that it will work with other assistive technologies as well.

You can find more information about Aiko, including FAQs, links to App Store pages, and more here.
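For anyone curious how timestamped text becomes closed captions, here is a minimal Python sketch that turns timestamped segments into the widely used SubRip (.srt) caption format. It does not show how Aiko itself works. The segment shape (a list of dictionaries with `start`, `end`, and `text`) is an assumption for illustration, the function names are hypothetical, and the timings in the sample data are invented rather than taken from my recording.

```python
def format_timestamp(seconds: float) -> str:
    """Render a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments) -> str:
    """Build SRT text from segments.

    Assumed input shape (hypothetical, not Aiko's documented export):
    each segment is a dict with 'start' and 'end' in seconds and 'text'.
    """
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        # Each SRT cue: a counter, a time range, the caption text, a blank line.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


# Sample data: the text comes from the transcript, the timings are made up.
segments = [
    {"start": 0.0, "end": 2.5, "text": "Hello, and thanks for joining me today."},
    {"start": 2.5, "end": 5.0, "text": "I'm playing with an app called AIKO."},
]

print(segments_to_srt(segments))
```

Saving the printed text to a file ending in `.srt` gives a caption file that many video players and editors can load alongside a video.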