Machine Learning-based Voice ITC Translator Software now available
Since early 2019, I have been working on software to extract voices from physical noise/signals. My earliest attempts used other people's software, mainly an algorithm called "spectral subtraction" in the ReaFir noise reduction plugin. This converts the noise into its frequency spectrum, where slight imprints of voice can be discovered and emphasized.
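For anyone unfamiliar with spectral subtraction, here is a minimal sketch of the textbook idea in Python with NumPy. It is not the ReaFir implementation and not the released code. The function name, frame size, hop size and spectral floor are all assumptions chosen for illustration. It averages the magnitude spectrum of a noise-only sample, subtracts that from each frame of the input, and rebuilds the audio by overlap-add.

```python
import numpy as np

def spectral_subtract(signal, noise_profile, frame_len=512, hop=256, floor=0.02):
    """Illustrative magnitude spectral subtraction (hypothetical helper).

    signal        -- 1-D array containing the noisy recording
    noise_profile -- 1-D array of noise only, at least frame_len samples long
    floor         -- fraction of the original magnitude kept as a minimum
    """
    window = np.hanning(frame_len)

    # Average magnitude spectrum of the noise-only sample.
    noise_mags = [
        np.abs(np.fft.rfft(noise_profile[i:i + frame_len] * window))
        for i in range(0, len(noise_profile) - frame_len + 1, hop)
    ]
    noise_mag = np.mean(noise_mags, axis=0)

    # Zero-pad so every real sample is covered by overlapping frames.
    padded = np.pad(signal, frame_len, mode="constant")
    out = np.zeros(len(padded))
    norm = np.zeros(len(padded))

    for start in range(0, len(padded) - frame_len + 1, hop):
        spec = np.fft.rfft(padded[start:start + frame_len] * window)
        mag, phase = np.abs(spec), np.angle(spec)
        # Subtract the noise estimate, but never go below a small floor.
        clean_mag = np.maximum(mag - noise_mag, floor * mag)
        frame = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len)
        # Weighted overlap-add with the same window.
        out[start:start + frame_len] += frame * window
        norm[start:start + frame_len] += window ** 2

    out /= np.maximum(norm, 1e-8)
    return out[frame_len:frame_len + len(signal)]
```

In use, you would pass a stretch of the raw noise source as `noise_profile` and the full recording as `signal`. Whatever stands out above the average noise spectrum is what survives.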
It's now 2022. Spectral subtraction is still a very valuable tool, but it is only the beginning of the process I've developed for extracting voices. I've created machine-learning-based models to find and emphasize voices. I've also made a program that finds and generates "formants," the peaks in the harmonic buzz of the human voice.
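To make "finding formants" a little more concrete, here is a small sketch of one common textbook approach, linear predictive coding (LPC). This is a swapped-in technique for illustration and is not necessarily how the released program does it. The function name, the default model order and the frequency and bandwidth cutoffs are all assumptions. It fits an all-pole model to a short frame of audio and reads the resonance frequencies off the pole angles.

```python
import numpy as np

def estimate_formants(frame, sample_rate, order=10, min_freq=90.0, max_bw=400.0):
    """Illustrative LPC formant estimate (hypothetical helper).

    Returns a sorted list of resonance frequencies in Hz.
    """
    frame = np.asarray(frame, dtype=float)
    windowed = frame * np.hamming(len(frame))

    # Autocorrelation at lags 0..order.
    full = np.correlate(windowed, windowed, mode="full")
    r = full[len(windowed) - 1:len(windowed) + order]

    # Solve the normal equations R a = r[1:] for the predictor coefficients.
    idx = np.arange(order)
    R = r[np.abs(idx[:, None] - idx[None, :])]
    a = np.linalg.solve(R, r[1:order + 1])

    # Poles of the all-pole model are the roots of 1 - sum(a_k z^-k).
    roots = np.roots(np.concatenate(([1.0], -a)))
    roots = roots[np.imag(roots) > 0]  # one of each conjugate pair

    freqs = np.angle(roots) * sample_rate / (2 * np.pi)
    bandwidths = -np.log(np.abs(roots)) * sample_rate / np.pi

    # Keep only narrow resonances above a minimum frequency.
    keep = (freqs > min_freq) & (bandwidths < max_bw)
    return sorted(float(f) for f in freqs[keep])
```

For real speech, a common rule of thumb is a model order of roughly 2 plus the sample rate in kHz, applied to frames of a few tens of milliseconds.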
I'm finally releasing my full software, in Python. I use a very similar version of this code in all of my experiments (FPGAs, radio noise, etc.).
I would've liked to share it as an executable, like I did with Spiricam, but Python executable-makers are notoriously buggy. Another reason I hesitated to share the code sooner is that it used to require some heavy GPU resources. However, thanks to some software developments by Google, my ML models seem to run pretty well on the CPU in real time.
So if you want to try out my code, you'll have to do some command-line steps, and at a minimum you'll have to install a free program called Miniconda (or its larger version, Anaconda) with Python version 3.8, 64-bit. A few GB of disk storage may be required.
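Once Python is installed, a quick way to confirm you have the 3.8, 64-bit interpreter is a few lines using only the standard library. This is just a convenience sketch and is not part of the released code. The function name is made up for illustration.

```python
import struct
import sys

def describe_interpreter():
    """Return the Python version as 'major.minor' and the pointer size in bits."""
    version = "{}.{}".format(*sys.version_info[:2])
    bits = struct.calcsize("P") * 8  # 64 on a 64-bit interpreter
    return version, bits

version, bits = describe_interpreter()
print(f"Python {version}, {bits}-bit")
if version != "3.8" or bits != 64:
    print("This differs from the Python 3.8, 64-bit setup mentioned above.")
```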
Here's the link to the code: https://drive.google.com/drive/folders/1fu6hAuE0AbhbQjx0Ts_3Ju0QRJ0awxRM?usp=sharing
In the directory is a README.txt, which I'll update as we iron out the instructions.
When I've resolved most of the common issues, I'll make the code into a ZIP file for the Downloads section.
For now, feel free to ask questions in the comments. As I like to say, "The spirits are waiting!"