The ChatGPT app uses the speech-to-text model Whisper, and it’s always spot on in English. Whisper is open source. I don’t understand why it’s not widespread, but hopefully it, or similarly capable software, will be soon.
Google Colab can run Whisper, but Whisper can most likely also run on an old, cheap laptop that you already have.
Fractional scaling works fine for me. Am I doing something wrong? How do I break it?
Oof that sucks. Thank you for answering :)
Ten favorites on Mastodon?! Woah! That’s a hit toot!
Does Lemmy not support alt text? I can’t add alt text from Infinity for Lemmy; is that because Lemmy doesn’t support it or just my client? Because they really should.
I don’t know who has to hear this but
that’s a phrase, not a quote.
I’ll likely change to Infinity for Everything. Infinity is awesome and is still working for free for now.
Edit: using Infinity for Lemmy
I ain’t using that goofy orange app.
Fun fact: you can’t illegally download all of Wikipedia, because you can only do it legally. The English Wikipedia is less than 100 GB with images and half that without images. You can also download the top 50,000 articles, which, with images, are less than 7 GB.
See @Kiwix@mastodon.social (I’m not sure how well-supported cross-galaxy mentions are here or if they even work)
edit: see Kiwix.org
On YouTube, I change the speed constantly, and sometimes I think I’m watching at x1 when I’m actually watching at x1.5 or even x2.
I can usually tell x1 from other speeds, but when there’s a slow talker and no music, it’s actually quite easy to miss unless I think hard about it or, realistically, just check, especially since I’m very used to it, having done that for around two years.
However, watching a show for 3020 minutes without realizing it is unrealistic, especially since there are a lot of sound effects and it’s 3020 minutes!