2013-11-19 02:51:27 +00:00
dejavu
==========
2013-12-06 16:57:24 +00:00
Audio fingerprinting and recognition algorithm implemented in Python, see the explanation here:
[How it works ](http://willdrevo.com/fingerprinting-and-audio-recognition-with-python.html )
2013-11-19 02:51:27 +00:00
Dejavu can memorize audio by listening to it once and fingerprinting it. Then by playing a song and recording microphone input, Dejavu attempts to match the audio against the fingerprints held in the database, returning the song being played.
2013-12-16 16:41:33 +00:00
## Dependencies:
2013-11-19 02:51:27 +00:00
I've only tested this on Unix systems.
* [`pyaudio` ](http://people.csail.mit.edu/hubert/pyaudio/ ) for grabbing audio from microphone
* [`ffmpeg` ](https://github.com/FFmpeg/FFmpeg ) for converting audio files to .wav format
* [`pydub` ](http://pydub.com/ ), a Python `ffmpeg` wrapper
* [`numpy` ](http://www.numpy.org/ ) for taking the FFT of audio signals
* [`scipy` ](http://www.scipy.org/ ), used in peak finding algorithms
2013-12-16 16:41:33 +00:00
* [`matplotlib` ](http://matplotlib.org/ ), used for spectrograms and plotting
2013-11-19 02:51:27 +00:00
* [`MySQLdb` ](http://mysql-python.sourceforge.net/MySQLdb.html ) for interfacing with MySQL databases
For installing `ffmpeg` on Mac OS X, I highly recommend [this post ](http://jungels.net/articles/ffmpeg-howto.html ).
## Setup
First, install the above dependencies.
Second, you'll need to create a MySQL database where Dejavu can store fingerprints. For example, on your local setup:
$ mysql -u root -p
Enter password: ** ********
mysql> CREATE DATABASE IF NOT EXISTS dejavu;
Next, create a configuration file. Let's call it `dejavu.cnf` :
[connection]
username: root
password: < password above >
database: < name of database you created above >
hostname: 127.0.0.1
Now you're ready to start fingerprinting your audio collection!
## Fingerprinting
Let's say we want to fingerprint all of July 2013's VA US Top 40 hits.
Start by loading the configuration file and creating a Dejavu object.
```python
>>> from dejavu.control import Dejavu
>>> from ConfigParser import ConfigParser
>>> config = ConfigParser()
>>> config.read("dejavu.cnf")
>>> dejavu = Dejavu(config)
```
Next, give the `fingerprint()` command four arguments:
* input directory to look for audio files
* output directory to place .wav files (for computing fingerprints)
* audio extensions to look for in the input directory
* number of processes
```python
>>> dejavu.fingerprint("va_us_top_40/mp3", "va_us_top_40/wav", [".mp3"], 3)
```
For a large amount of files, this will take a while. However, Dejavu is robust enough you can kill and restart without affecting progress: Dejavu remembers which songs it fingerprinted and converted and which it didn't, and so won't repeat itself.
You'll have a lot of fingerprints once it completes a large folder of mp3s:
```python
>>> print dejavu.fingerprinter.db.get_num_fingerprints()