Playing around with Aiko, an amazing, accessible transcription app for Mac and iOS

I recently heard about a fantastic app called Aiko, available for both macOS and iOS, which leverages AI technology to transcribe audio. A few of the things that set Aiko apart from similar solutions:

  • It’s free, totally free.
  • Audio can be dictated directly into the app, or a pre-recorded file can be imported. I’m particularly excited about this second piece.
  • Everything happens on the end-user’s device; nothing is sent to the cloud.
  • Multiple languages are supported, and we’re talking a lot of languages: 100 according to Aiko’s home page.

I was excited to test out this fascinating technology, so to really put it through its paces in a sub-optimal recording environment, I decided to record some audio using my Apple Watch while standing outside with lots of traffic and other background noise. What follows is the unedited output of my little experiment. I’m also adding the actual recorded audio, so that you can get a sense of the crummy audio I gave Aiko to work with.

Hello, and thanks for joining me today.

I’m playing with an app called AIKO.

It’s an app that leverages Whisper, which is a technology made by OpenAI, the folks that brought us ChatGPT.

Now unless you’ve been living under a rock for the past couple of months, I’m sure you’ve heard quite a lot about ChatGPT and the fascinating possibilities it opens up to us.

Anyway, Whisper, and on top of that this AIKO app, allow transcription of audio.

The interesting thing about it is that you can record directly in the AIKO app, or you can import audio, say from a file that was pre-recorded.

For example, you might have a pre-recorded audio file of a lecture or a class.

You would be able to import it into this AIKO app, transcription would happen, and then you would have the output as text.

For my test today, I’m standing outside in front of my house recording on my Apple Watch with traffic going by.

And the reason I’m doing this is because I wanted to come up with a very sub-optimal recording environment, just to better understand how the technology would deal with audio recorded in such an environment.

I’m also trying to speak as naturally as I can without saying words like um and uh, things that I think often get said when speaking.

The interesting thing about AIKO and the way that it transcribes audio is that it supposedly is able to insert punctuation correctly.

I’m not sure if it does anything about paragraphs or not, but as the speaker, I don’t have any way of controlling format.

Once you run a file or recording through AIKO, the output is rendered as text.

However, there are a few things you can do with it.

First, you can of course copy the text into some other application.

The other thing that you can do is have the text be timestamped.

The reason that this can be handy is that you can use that then to create files that can be used as closed captioning for videos.

Anyway, it is kind of loud out here, and so I will go back inside.

I also didn’t want to make this too long because I’m not sure if it’ll work at all or how accurate it’ll be, but my plan is to post this to the blog without editing it.

Stop, stop, stop.

Aiko-generated transcription from my Apple Watch recording.

One final note: the dictation ends with the words “stop stop”. I didn’t actually speak those words, but because I have VoiceOver activated on my Apple Watch, they were picked up in the recording as I located and activated the stop button. This is definitely incredible technology and the price certainly can’t be beat. From an accessibility perspective, I found Aiko to be extremely accessible with VoiceOver on both Mac and iOS, and since it is a native app using native controls, I feel confident that it will work with other assistive technologies as well. You can find more information about Aiko, including FAQs, links to app store pages and more here.
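
The transcript mentions that timestamped text can be turned into closed caption files. As a rough illustration of that idea, here is a minimal Python sketch that writes timestamped segments in the SubRip (.srt) caption layout. The `Segment` shape and the timings in the demo are made up for illustration; this is not Aiko's actual export format, which may already produce caption files on its own.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One timestamped chunk of transcript (hypothetical shape, not Aiko's format)."""
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format seconds as the HH:MM:SS,mmm form used in .srt files."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(segments: list[Segment]) -> str:
    """Build caption text: a counter, a time range, then the caption line."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(seg.start)} --> {srt_timestamp(seg.end)}\n"
            f"{seg.text.strip()}\n"
        )
    # Caption blocks are separated from each other by a blank line.
    return "\n".join(blocks)


if __name__ == "__main__":
    # Invented timings, purely for demonstration.
    demo = [
        Segment(0.0, 2.4, "Hello, and thanks for joining me today."),
        Segment(2.4, 5.1, "I'm playing with an app called AIKO."),
    ]
    print(to_srt(demo))
```

Saving the printed text to a file ending in .srt should give something most video players can load as captions, assuming the timings line up with the audio.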


When Success Means Buying a Smaller Suit

Recently, I got to participate on the Parallel podcast talking about, of all things, accessibility and fitness. The reason I phrase it this way is that anyone who knows me probably knows that fitness and I don’t normally go together in the same sentence, let alone the same podcast. From the show description:

Starting or maintaining a fitness program is a challenge for anyone. If you have accessibility needs, you might experience barriers related to touchscreen devices, coaching that doesn’t address a hearing or visual disability, or a need for accommodations related to physical limitations. With its Fitness+ service, Apple has taken on some of these issues, and opened up the program to many more people with disabilities. We’ll talk with a Fitness+ user, and someone who has worked on Apple accessibility teams.

Talking about anything fitness related has always been challenging for me and so I want to particularly thank the ever-awesome Shelly Brisbin for being brave enough to include me. I also want to especially thank Sommer Panage and the other unsung heroes who dare to dream of a more accessible world, and work so hard to make that a reality.

Parallel can be found everywhere great podcasts can be found. More info about the episode and how to subscribe to Parallel, which you should totally consider doing whether you listen to this episode or not, can be found on Parallel’s home page.


100 Days of SwiftUI, my foray into understanding a bit more about how iOS works

Ever since I was able to accessibly use an iOS device, an iPhone 3GS, I’ve imagined how awesome it would be to be able to develop my own applications. That excitement was very short lived though, as I soon became aware of just how complicated developing an application really is. It’s a very involved process, or so it seemed to me, and for someone who hasn’t written any code since C++ was the talk of the town, it seemed like an impossibility. I wrongly assumed this was especially true for iOS because apps are often very visual and interactive and I just couldn’t imagine how I’d tackle that without vision. And so I quickly decided that iOS app development was just not for me.

Fast-forward quite a few years and Apple releases Swift and SwiftUI which, at the risk of oversimplifying things quite a bit, offer a more powerful and natural way to develop applications. Put another way, Swift and SwiftUI are intended to make application development easy enough for just about anyone to learn and do. Being a natural skeptic, I doubted that it could be quite as easy as Apple seemed to suggest, but the idea behind it seemed really interesting to me. Indeed, Swift and SwiftUI have taken the iOS development community by storm, with entire applications being developed using them. With only so many hours in the day though, my challenge was going to be finding the time to devote to learning it. And so again, I set the idea aside, figuring I might look into it whenever I had more time.

I’m not proud of this, but I have a long list of the things I want to do when I have more time. The thing is, the longer I wait to do any of the stuff on that list, the less time I’ll actually have to do any of it.

I initially learned about 100 Days of SwiftUI from Darcy and Holly of the Maccessibility Roundtable podcast. The idea behind this course is simple: learn SwiftUI gradually over (you guessed it) 100 days. The course suggests devoting an hour per day to learning and practicing the material. An hour per day doesn’t seem that bad to me; I probably spend at least an hour per day thinking about all the stuff I’d love to do, if only I had an hour per day. 🙂 While looking at the contents of the course is a little scary for someone like me who is just beginning, I love that there are days set aside for review and practice. In addition, there is emphasis on not trying to go it alone: students are encouraged to share progress and help one another. That sharing progress thing is actually one of the two rules of the course, as it can help with accountability and can also help the student make connections with others who are also learning.

So, what do I hope to ultimately accomplish? Sure, I’d absolutely love to get to the point where I can start developing or working on apps that are useful to someone, but that’s not actually my goal. I want to understand more about iOS apps because so often, when I report an accessibility issue, I feel like I really don’t have a way to describe what’s not working for me other than to say that something just isn’t working. I’m hoping that by learning the basics of SwiftUI, I might be in a slightly better position to provide more constructive feedback. Whether I’m able to develop my own apps, or help other developers improve theirs, I figure it’s a win either way and so I’m excited to get to learning. For anyone else who might also be interested, let’s definitely connect and learn together.


The Ultimate Blog Challenge for October 2022 and a brief intro

Many of you might remember that last year, I took part in something called the Ultimate Blog Challenge, a challenge that encourages bloggers to post every day over a given month. I really enjoyed the experience which not only helped me to write more regularly, but which also gave me the opportunity to connect with other really interesting bloggers who are passionate about so many fascinating things.

I really enjoyed participating in the challenge last year and so am excited to be participating again for the month of October. Just to mix things up a little bit, I thought that this year, in addition to writing, I might also try to produce more audio content because experiencing the world through audio is something that, unfortunately, many people don’t really take the time to do.

So, who am I?

For those who don’t know me, let me give you a quick intro. I’m Steve Sawczyn and I’ve worked in the accessibility field for, well, for a very long time. I was born blind and over my life, have witnessed the incredible impact technology has had on my ability, and the ability of others, to have access to information. That access to information thing is incredibly impactful, paving the way for incredible possibilities. When people have access to information, they are empowered to make informed decisions and are better able to be an active part of society. I also view accessibility as a springboard for innovation, not something that impedes progress. Indeed, many of the things we take for granted resulted from some sort of accessibility-related innovation — maybe a topic for a future article?

So, what do I blog about? Initially, I thought my blog should have a focus, a very specific focus and accessibility should be that focus. The thing is though, while accessibility is a big part of my life, my life is far more than just accessibility. I try and blog about a myriad of topics, accessibility being a big one, but certainly not the only one. After all, the URL for my blog is, not Steves.profession. I also love it when people take the time to comment on posts because that introduces additional perspectives into the conversation and that’s something from which everyone can learn.

In closing, whether you have followed me and this blog for quite a while, or whether you have just discovered me, I want to thank you for reading, for joining me as I attempt the Ultimate Blog Challenge and most of all, for being part of the conversation.


It’s a boat! It’s a tank! It’s the physical description of the Nokia X100 budget phone

In my last post, I mentioned that I would provide a physical description of the Nokia X100, the budget phone I’m using to re-discover Android. As of this writing, T-Mobile offers the X100 for $252; however, promotions can bring this price down even further.

When I first beheld the Nokia X100, my initial impression was one of solidity. This phone only weighs 7.65 ounces, but somehow, it feels much heavier, possibly because of its aluminum construction. When I placed the phone on my desk, my immediate thought was that while empires may rise and fall, this phone will stay exactly where I put it, defying the forces of nature and time if need be.

The Nokia X100 display measures 6.7″ diagonally from corner to corner. In practical terms, this means that the display is larger than the decks of many cruise ships. A small aircraft could land on the X100’s display and easily have enough room to take off again. For those that are into specific measurements, the X100 measures 6.74″ long, by 3.14″ wide, by 0.36″ thick. I realize that phone size is a personal preference, but I find the X100 a bit too large for my liking: I often carry a phone in my pocket and use it one-handed, both of which are tricky to do with a device of this size. That said, if you prefer a larger screen, you will not be disappointed. Speaking of the display, the Nokia boasts a Max Vision HD+ display. I have no idea what that means, but it’s a highlighted feature, so obviously it must be important. 🙂

I absolutely love the way controls and ports are laid out on the Nokia. Along the right-hand edge is a volume control and also a slightly recessed button which serves as the lock/unlock/power button and integrated fingerprint sensor. Having the fingerprint sensor integrated directly into the lock button makes total sense to me: since you have to touch that button to unlock the device anyway, why not have it read and verify the fingerprint at the same time? I don’t know what company was the first to integrate the fingerprint sensor into the lock button. My first introduction to this bit of awesome was with Apple’s iPad Air 4th generation, and ever since then, I’ve been wondering why more companies aren’t doing this; that Nokia and other Android manufacturers are doing this fills me with much joy. As a quick aside, many Android devices still have fingerprint sensors. For me, this is a major advantage because while I have learned to live with Apple’s Face ID, I have not learned to like it. Back to the X100: the right-hand side has the volume control and the power/lock/fingerprint sensor and that’s it. Along the bottom edge of the device are a speaker, a microphone, a USB-C port, and a headphone jack. That’s right, in an era when most devices have done away with the headphone jack, the X100 still makes one available; it’s like coming home to an old friend. Along the left-hand edge of the X100 is a single button, a dedicated button to activate the Google Assistant. At first, I found it a bit disappointing that this button couldn’t be reassigned to some other application or function, but as I realized just how much I could actually do with the Google Assistant, I’ve come to appreciate having a dedicated button to activate it. There are no controls along the top edge, just solid aluminum, probably thick enough to come in handy during those times when you need to break your way through an ice jam, or hammer stone from a quarry.

The back of the device is relatively flat with the only prominent feature being a slightly raised circular glass housing which contains the 48MP Quad Camera System.

One aspect of the X100 that I absolutely cannot fault is its battery life. I have tried and tried and tried to drain its battery and yet usually I’m the one who winds up drained and needing to recharge. According to the T-Mobile spec page, the X100 has a 4470 mAh battery capable of delivering “up to 2-day battery life”. More specifically, they claim 25 hours of talk time and 39 days (yeah, days) of standby time. I haven’t experienced this much battery life in a mobile device since, well, since the last time I owned a Nokia back in 2005. Having enough battery power to get through my day has been a real challenge, often requiring me to bring along an external battery pack if I’m away from home for any length of time. With my not quite two-year-old iPhone 12 Mini, the low battery conversation goes something like this.

Phone, “Hey, alert! 20% battery remaining.”

Me, “OK, hang on, let me get your charger.”

Phone, “Hurry up, I was just kidding about that 20%, it’s actually more like 15% now.”

Me, “Seriously? How? It’s only been like five minutes since you told me you were at 20%.”

Phone, “Yeah I know, I just figured you could use some false hope in your day. 10% now by the way.”

I should note that I’ve been trying to get Apple to replace my iPhone’s battery, but apparently, it hasn’t lost enough total capacity yet. Put another way, I just haven’t suffered enough.

In contrast, the low battery experience with the Nokia is very different:

Phone, “Hey, just thought I’d let you know, my battery is at 20%.”

Me, “Oh shoot, I have a bunch of Apple chargers around, where the heck did I leave the USB-C charger?”

Phone, “Hey, don’t stress, you can take the next day or two to find it, I mean any time this week is probably fine.”

There’s nothing more frustrating than running low on battery power and the idea of having a device that can get me through my day, while having enough battery left over to possibly power a small village, is a definite win.
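
For anyone curious what those spec-sheet claims imply, here is a quick back-of-the-envelope calculation in Python. It is only a sketch under loud assumptions: it reads the 4470 figure as milliamp-hours, assumes the whole capacity is usable, and treats the claimed 25 hours of talk time and 39 days of standby as exact. None of this is a measurement of the actual phone.

```python
# Figures quoted from the T-Mobile spec page; everything derived is an estimate.
CAPACITY_MAH = 4470   # assumed to mean milliamp-hours
TALK_HOURS = 25       # claimed talk time
STANDBY_DAYS = 39     # claimed standby time


def average_draw_ma(capacity_mah: float, hours: float) -> float:
    """Average current (mA) that would drain the full capacity in `hours`."""
    return capacity_mah / hours


talk_draw = average_draw_ma(CAPACITY_MAH, TALK_HOURS)
standby_draw = average_draw_ma(CAPACITY_MAH, STANDBY_DAYS * 24)

print(f"Implied average draw while talking: {talk_draw:.1f} mA")
print(f"Implied average draw on standby:   {standby_draw:.2f} mA")
```

Under those assumptions, the standby claim works out to an average draw of under 5 mA, compared with roughly 179 mA while talking, which gives a feel for how little the phone would need to sip while idle to hit that number.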

There are a few more aspects and specifications of the X100 that I should call out. First, the processor: the X100 has a Qualcomm® Snapdragon™ 480. This is hardly the newest or fastest processor available on Android devices, but given the price point of this phone, it seems more than adequate. My usage and testing have admittedly been limited thus far, but I have not encountered any significant issues attributable to this processor. Another thing worth mentioning is that the X100 is a 5G phone, meaning that the device can function on the latest mobile networks. More specifically, the X100 supports the following frequencies and bands. Don’t worry if you don’t know what these numbers mean; basically, the phone works on a bunch of different networks in a bunch of different countries, with a bunch of different providers:

  • GSM: 850 MHz, 900 MHz, 1800 MHz, 1900 MHz
  • UMTS: Band I (2100), Band II (1900), Band IV (1700/2100), Band V (850), Band VIII (900)
  • 5G: n25, n26, n66, n71
  • LTE: 2, 4, 5, 12, 25, 26, 41, 66, 71
  • LTE Roaming: 1, 3, 7, 8, 13, 20, 38, 39, 40

Regarding memory, it is possible to expand the 128 GB of built-in storage with a memory card, up to 1 TB according to T-Mobile’s specifications. I don’t anticipate needing more storage than the built-in 128 GB, but it’s nice to know I have the option to add more if I’m wrong.

I’m really pleased with the Nokia X100. While I personally prefer smaller devices, the X100 is a very solid phone at an extremely attractive price point. The X100 may not have all the bells and whistles found in higher priced Android devices, but when it comes to getting stuff done, the X100 seems more than up to the task.


Diving into Android, a journey of rediscovery

As those of you who may have followed this blog for a while probably know, every few years or so, I generally switch my primary mobile operating system between iOS and Android. I’ve done this for a few reasons, first because I feel it’s important that I keep up with how each operating system is evolving and second, … OK there really isn’t a second, I’m just a geek at heart and it gives me an excuse to play with the other operating system.

While I am not planning on actually switching from iOS to Android this time, there are a few reasons, beyond the geek thing, which have caused me to want to dive into Android again and better understand how that platform has evolved from an accessibility experience perspective. First, it’s been a few years and both operating systems have evolved quite a bit in that time. Many of the issues that caused me to switch back from Android to iOS have been addressed and I’m really curious to see what the newer experience is like. The second and more important reason though is that Android devices exist at just about every possible price point and I still don’t feel that this is truly the case with iOS. Don’t get me wrong, iOS devices are fantastic, but for many, they are still very unaffordable and with the cost of everything increasing, this becomes an even bigger challenge for many people with disabilities. This point was emphasized during a conversation I recently had. In short, I was talking to someone about all the amazing things we can do with mobile devices and her comment was that she felt very shut out, shut out because iOS devices, even used devices, were beyond her family’s budget. The conversation quickly turned toward Android, but when she started asking about the capabilities of lower priced devices, I found that I really didn’t have any answers for her. Obviously so-called budget devices are not going to be the fastest and aren’t going to have the latest and greatest features, but can they work well enough to help someone not feel so “shut out”? The more I looked into this, the more I started realizing that yes, yes they probably can, but without getting my hands on such a device, it would be difficult to really understand what that experience might be like.

I’m starting my Android rediscovery journey with a Nokia X100 budget phone. As of this writing, the X100 is available from T-Mobile for a cash price of $252; however, as with most devices purchased from a carrier, this price can be decreased with various offers such as adding a new line of service. I’ll cover my first impressions of the device in another post, but while this device certainly doesn’t sport all the latest and greatest features, I’m really impressed with just how many capabilities it does have, especially at this price point.

As always whenever I blog about something, my hope is that this will evolve into a conversation, a conversation that fosters learning and understanding. If I get something wrong, feel free to jump in and let me know. If I do something and you think you know of a better way, jump in and let me know that too.

I’m excited to see where this Android rediscovery journey will take me, and I thank you for coming along.


Tip: Does the FaceTime control bar sometimes get in your way? There’s an accessible way to dismiss it.

One of the new features introduced in iOS 15 is a call control bar which provides FaceTime audio controls across the top of the iOS screen during a FaceTime audio call.

Screen shot of Steve's very messy iOS home screen with the FaceTime control bar across the top. Visible controls, from left to right, are leave call, open messages, Audio route, Mute, camera, share content.
Screen shot of FaceTime control bar

I actually really like this new control bar because it gives me the option to mute/unmute from wherever I am and for me, this is much faster than having to switch back to the FaceTime app each and every time. That said, there are times when this control bar gets in the way. For example, sometimes I’ll be in an application and I know there’s a “back” button, but I can’t get to it with VoiceOver because it’s obscured by the FaceTime audio control bar. I mentioned my frustration about this to a sighted friend and she told me that visually, it’s possible to swipe this control bar away. At first, I thought we might have an accessibility issue of some sort as I could not find a way to do this when using VoiceOver. Eventually, I remembered the two-finger scrub gesture and like magic, away it went.

For anyone unfamiliar with it, the two-finger scrub gesture is a VoiceOver command that can be used in a few different ways depending on context. If a keyboard is visible, the two-finger scrub gesture will dismiss it. If an application has a “back” button, the two-finger scrub gesture will perform that action. The easiest way to think about the purpose of this gesture is that it can help you get out of something by dismissing a control, navigating back, or closing a pop-up or menu; in many ways, this is similar to what might happen when pressing the escape key in a desktop application. To perform this gesture, place two fingers on the screen and move them quickly in a scrubbing motion such as right, left, right.

Putting it all together

If you ever have a reason to temporarily dismiss the FaceTime Audio call control bar and need to do so using VoiceOver, here’s how to do it.

  1. Touch the FaceTime Audio control bar with one finger; this will set VoiceOver’s focus to the correct place. This is important because otherwise, VoiceOver’s focus will remain on your home screen or on whatever application screen you have open and the scrub gesture will not dismiss the control bar.
  2. Perform the two-finger scrub gesture. If successful, the control bar will go away. If not, double check that you have correctly set VoiceOver focus to the control bar as just described. If the two-finger scrub gesture isn’t performed correctly, it is possible that focus may inadvertently move away from the FaceTime Audio control bar.

A few more things to note. First, I don’t know of a way to permanently dismiss the FaceTime Audio control bar and so you will have to repeat these steps whenever you need to dismiss it. Second, if you dismissed the control bar and then want to have it back, you can make it reappear by double tapping the call indicator located on the iOS status bar.

I really like the new FaceTime Audio control bar and find it super useful to have call controls available regardless of which app I’m in or which screen I’m on. For those times though where it might come in handy to move that bar out of the way, I’m glad there’s an accessible way to do so.


Sharing: New trend in tactile currencies

I recently came across this fascinating post which I am sharing because I think it might be of interest to my own readers. Tactile markings on currency are something that has fascinated me for a while and this post is a fantastic explanation and historical account of them. I definitely encourage anyone with interest in the subject to check it out and to follow this blog.

Among the most recent tactile currency markings, a new trend is emerging: indicating the value using dot patterns. The idea is not new, it has been …

New trend in tactile currencies

On this Thanksgiving, a quick note of thanks

As we celebrate Thanksgiving here in the US today, I wanted to send out a quick note of thanks to all of you: for reading my words, for providing encouragement as I continue my blogging journey, and for engaging in some really amazing conversation along the way. I have a lot to be thankful for this year, but there is one group of folks I want to recognize in particular: those developers who work extra hard to ensure their apps are accessible.

There are many developers who work tirelessly to make their apps accessible, not because they necessarily have to, but because they simply realize it’s the right thing to do. There are many accessibility resources out there that can help developers make their apps accessible, but finding those resources, understanding them, and figuring out how to implement them can be a real challenge, especially for developers with extremely limited resources.

I’d like to encourage everyone to think about an app that makes a real difference to them, whether for accessibility or other reasons, and consider writing the developer a positive review of thanks today. I’ve spoken with many developers who have indicated to me that while it may seem like a small thing, positive reviews make a real difference. First, the more stars an app receives, the more likely it will be discovered by others. Second, a kind review is a great way to show appreciation in a public way. And finally, your review might make a difference to someone who appreciates the hard work a developer has put into making their app accessible; I know I’ve felt more comfortable purchasing apps when I see a review like, “works well with VoiceOver” or “very accessible”. Writing a quick review is a great way to say thank you. It’s something that makes a real difference, something that is appreciated, and something that only takes a few minutes to do.

Again, thank you all for reading my words, supporting me, and for continuing the conversation. To those who celebrate, have a happy Thanksgiving.


Quick tip: how to get rid of the iOS bubble sound when typing or using Braille Screen Input

I’ve been using Braille Screen Input on iOS for years, as it helps me to type more efficiently. One thing that has bothered me though, whether typing with Braille Screen Input or the on-screen keyboard, is this bubble sound that VoiceOver occasionally makes. While that sound does have a purpose and an important one at that, I find it distracting and have always lamented that I didn’t have a way to disable it. Little did I know that there actually is a way to disable it.

I received many replies on Twitter, some from people experiencing the same frustration as me, and others offering a solution I likely never would have found on my own.

As it turns out, there are actually a lot of sound customizations that can be made in VoiceOver, many of which are off by default and so I never even knew they existed. Not only that, but it’s possible to preview each of the VoiceOver sounds which is a great way to learn what they actually mean. I recorded a brief video showcasing these settings in the hopes it might be useful to others.

Demo of the VoiceOver sounds dialog

Disabling the VoiceOver auto fill sound has made a world of difference for me. Now I can use Braille Screen Input without being distracted every couple of words. In fact, I’ve written this very entry solely using Braille Screen Input.

I would like to thank Rachel, Matthew, and Kara for getting back to me so quickly with what proved to be the perfect solution. Twitter can be an awesome place for conversation and I’m glad these awesome people are a part of it.