Today's article is not a very simple article...


Jul 23, 2020
But, the rewards are great...

Basically, today's article is about an application that will help you extract text from a whole host of file types. Seriously, check out the list of file types. It can even extract the text from music files, assuming the singing is intelligible - so I'm told. I tested a bunch of stuff but not that particular one. You should test it and leave a comment!

It's not easy. It requires reading the article. It's not terribly hard, but it's not as easy as my normal stuff. Still, it's worth the effort if you want to extract text from documents, images, music files, etc...


This is quite a powerful application, assuming one nails the dependencies. If you can't nail them all, it's still useful. It was a couple of years ago that I first found the application and added it to my system and my bookmarks. Oddly, I wasn't planning on an article about it - even though I was working on my site at the time.

I figured it was just a tad too complicated for an article. I used it recently and really figured that it was worth an article. So, I wrote said article.

There are a lot of great Python applications that can be installed with pip. That's another tool that I think folks overlook.
An article about pyenv vs virtualenv vs that new fancy thing whose name I forgot would be interesting too.
Fitting that you consider that to be the "not easy" part.

I used to 'heat map'. That is, I tracked scrolling, mouse movements, and clicks.

Statistically speaking, most people skip reading a bunch of the article. They almost always scroll to the middle. I'm guilty of this as well, but they also skip the end - not even reading all of the middle section.

It was anonymized data and didn't include things like IP addresses.

Way outside of my wheelhouse for that.

I do have a user-generated article that I need to get published. So, there's that.

