Mirror of https://github.com/correl/dejavu.git (synced 2024-11-23 11:09:52 +00:00)
Merge branch 'tuxdna-master'
commit 19679f8d11
13 changed files with 288 additions and 180 deletions
@ -1,31 +0,0 @@
# Dependencies required by dejavu

* [`pyaudio`](http://people.csail.mit.edu/hubert/pyaudio/)
* [`ffmpeg`](https://github.com/FFmpeg/FFmpeg)
* [`pydub`](http://pydub.com/)
* [`numpy`](http://www.numpy.org/)
* [`scipy`](http://www.scipy.org/)
* [`matplotlib`](http://matplotlib.org/)
* [`MySQLdb`](http://mysql-python.sourceforge.net/MySQLdb.html)

## Dependency installation for Mac OS X

Tested on OS X Mavericks. An option is to install [Homebrew](http://brew.sh) and do the following:

```
brew install portaudio
brew install ffmpeg

sudo easy_install pyaudio
sudo easy_install pydub
sudo easy_install numpy
sudo easy_install scipy
sudo easy_install matplotlib
sudo easy_install pip

sudo pip install MySQL-python

sudo ln -s /usr/local/mysql/lib/libmysqlclient.18.dylib /usr/lib/libmysqlclient.18.dylib
```

However installing `portaudio` and/or `ffmpeg` from source is also doable.
63
INSTALLATION.md
Normal file
@ -0,0 +1,63 @@
# Installation of dejavu

So far dejavu has only been tested on Unix systems.

* [`pyaudio`](http://people.csail.mit.edu/hubert/pyaudio/) for grabbing audio from microphone
* [`ffmpeg`](https://github.com/FFmpeg/FFmpeg) for converting audio files to .wav format
* [`pydub`](http://pydub.com/), a Python `ffmpeg` wrapper
* [`numpy`](http://www.numpy.org/) for taking the FFT of audio signals
* [`scipy`](http://www.scipy.org/), used in peak finding algorithms
* [`matplotlib`](http://matplotlib.org/), used for spectrograms and plotting
* [`MySQLdb`](http://mysql-python.sourceforge.net/MySQLdb.html) for interfacing with MySQL databases

For installing `ffmpeg` on Mac OS X, I highly recommend [this post](http://jungels.net/articles/ffmpeg-howto.html).

## Fedora 20+

### Dependency installation on Fedora 20+

Install the dependencies

    sudo yum install numpy scipy python-matplotlib ffmpeg portaudio-devel
    pip install PyAudio
    pip install pydub

Now set up virtualenv ([howto?](http://www.pythoncentral.io/how-to-install-virtualenv-python/))

    pip install virtualenv
    virtualenv --system-site-packages env_with_system

Install from PyPI

    source env_with_system/bin/activate
    pip install PyDejavu


You can also install the latest code from GitHub:

    source env_with_system/bin/activate
    pip install https://github.com/tuxdna/dejavu/zipball/master

## Mac OS X

### Dependency installation for Mac OS X

Tested on OS X Mavericks. An option is to install [Homebrew](http://brew.sh) and do the following:

```
brew install portaudio
brew install ffmpeg

sudo easy_install pyaudio
sudo easy_install pydub
sudo easy_install numpy
sudo easy_install scipy
sudo easy_install matplotlib
sudo easy_install pip

sudo pip install MySQL-python

sudo ln -s /usr/local/mysql/lib/libmysqlclient.18.dylib /usr/lib/libmysqlclient.18.dylib
```

However, installing `portaudio` and/or `ffmpeg` from source is also doable.
1
MANIFEST.in
Normal file
@ -0,0 +1 @@
include requirements.txt
35
README.md
|
@ -6,19 +6,9 @@ Audio fingerprinting and recognition algorithm implemented in Python, see the ex
|
||||||
|
|
||||||
Dejavu can memorize audio by listening to it once and fingerprinting it. Then by playing a song and recording microphone input, Dejavu attempts to match the audio against the fingerprints held in the database, returning the song being played.
|
Dejavu can memorize audio by listening to it once and fingerprinting it. Then by playing a song and recording microphone input, Dejavu attempts to match the audio against the fingerprints held in the database, returning the song being played.
|
||||||
|
|
||||||
## Dependencies:
|
## Installation and Dependencies:
|
||||||
|
|
||||||
I've only tested this on Unix systems.
|
Read [INSTALLATION.md](INSTALLATION.md)
|
||||||
|
|
||||||
* [`pyaudio`](http://people.csail.mit.edu/hubert/pyaudio/) for grabbing audio from microphone
|
|
||||||
* [`ffmpeg`](https://github.com/FFmpeg/FFmpeg) for converting audio files to .wav format
|
|
||||||
* [`pydub`](http://pydub.com/), a Python `ffmpeg` wrapper
|
|
||||||
* [`numpy`](http://www.numpy.org/) for taking the FFT of audio signals
|
|
||||||
* [`scipy`](http://www.scipy.org/), used in peak finding algorithms
|
|
||||||
* [`matplotlib`](http://matplotlib.org/), used for spectrograms and plotting
|
|
||||||
* [`MySQLdb`](http://mysql-python.sourceforge.net/MySQLdb.html) for interfacing with MySQL databases
|
|
||||||
|
|
||||||
For installing `ffmpeg` on Mac OS X, I highly recommend [this post](http://jungels.net/articles/ffmpeg-howto.html).
|
|
||||||
|
|
||||||
## Setup
|
## Setup
|
||||||
|
|
||||||
|
@ -100,6 +90,19 @@ An example configuration is as follows:
|
||||||
>>> djv = Dejavu(config)
|
>>> djv = Dejavu(config)
|
||||||
```
|
```
|
||||||
|
|
||||||
|
## Tuning
|
||||||
|
|
||||||
|
Inside `fingerprint.py`, you may want to adjust the following parameters (some example values are given below).
|
||||||
|
|
||||||
|
FINGERPRINT_REDUCTION = 30
|
||||||
|
PEAK_SORT = False
|
||||||
|
DEFAULT_OVERLAP_RATIO = 0.4
|
||||||
|
DEFAULT_FAN_VALUE = 10
|
||||||
|
DEFAULT_AMP_MIN = 15
|
||||||
|
PEAK_NEIGHBORHOOD_SIZE = 30
|
||||||
|
|
||||||
|
These parameters are described in detail in `fingerprint.py`. Read the comments there to understand the impact of changing these values.
|
||||||
|
|
||||||
## Recognizing
|
## Recognizing
|
||||||
|
|
||||||
There are two ways to recognize audio using Dejavu. You can recognize by reading and processing files on disk, or through your computer's microphone.
|
There are two ways to recognize audio using Dejavu. You can recognize by reading and processing files on disk, or through your computer's microphone.
|
||||||
|
@ -109,7 +112,7 @@ There are two ways to recognize audio using Dejavu. You can recognize by reading
|
||||||
Through the terminal:
|
Through the terminal:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
$ python dejavu.py recognize file sometrack.wav
|
$ python dejavu.py --recognize file sometrack.wav
|
||||||
{'song_id': 1, 'song_name': 'Taylor Swift - Shake It Off', 'confidence': 3948, 'offset_seconds': 30.00018, 'match_time': 0.7159781455993652, 'offset': 646L}
|
{'song_id': 1, 'song_name': 'Taylor Swift - Shake It Off', 'confidence': 3948, 'offset_seconds': 30.00018, 'match_time': 0.7159781455993652, 'offset': 646L}
|
||||||
```
|
```
|
||||||
|
|
||||||
|
@ -132,10 +135,10 @@ With scripting:
|
||||||
and with the command line script, you specify the number of seconds to listen:
|
and with the command line script, you specify the number of seconds to listen:
|
||||||
|
|
||||||
```bash
|
```bash
|
||||||
$ python dejavu.py recognize mic 10
|
$ python dejavu.py --recognize mic 10
|
||||||
```
|
```
|
||||||
|
|
||||||
## Testing (New!)
|
## Testing
|
||||||
|
|
||||||
Testing out different parameterizations of the fingerprinting algorithm is often useful as the corpus becomes larger and larger, and inevitable tradeoffs between speed and accuracy come into play.
|
Testing out different parameterizations of the fingerprinting algorithm is often useful as the corpus becomes larger and larger, and inevitable tradeoffs between speed and accuracy come into play.
|
||||||
|
|
||||||
|
@ -163,7 +166,7 @@ rm -rf ./results ./temp_audio
|
||||||
|
|
||||||
###########
|
###########
|
||||||
# Fingerprint files of extension mp3 in the ./mp3 folder
|
# Fingerprint files of extension mp3 in the ./mp3 folder
|
||||||
python dejavu.py fingerprint ./mp3/ mp3
|
python dejavu.py --fingerprint ./mp3/ mp3
|
||||||
|
|
||||||
##########
|
##########
|
||||||
# Run a test suite on the ./mp3 folder by extracting 1, 2, 3, 4, and 5
|
# Run a test suite on the ./mp3 folder by extracting 1, 2, 3, 4, and 5
|
||||||
|
|
143
dejavu.py
|
@ -1,98 +1,93 @@
|
||||||
#!/usr/bin/python
|
#!/usr/bin/python
|
||||||
|
|
||||||
|
import os
|
||||||
import sys
|
import sys
|
||||||
import json
|
import json
|
||||||
import warnings
|
import warnings
|
||||||
|
import argparse
|
||||||
|
|
||||||
from dejavu import Dejavu
|
from dejavu import Dejavu
|
||||||
from dejavu.recognize import FileRecognizer
|
from dejavu.recognize import FileRecognizer
|
||||||
from dejavu.recognize import MicrophoneRecognizer
|
from dejavu.recognize import MicrophoneRecognizer
|
||||||
from dejavu.recognize import FileRecognizer
|
from argparse import RawTextHelpFormatter
|
||||||
|
|
||||||
warnings.filterwarnings("ignore")
|
warnings.filterwarnings("ignore")
|
||||||
|
|
||||||
def init():
|
DEFAULT_CONFIG_FILE = "dejavu.cnf.SAMPLE"
|
||||||
# load config from a JSON file (or anything outputting a python dictionary)
|
|
||||||
with open("dejavu.cnf") as f:
|
|
||||||
config = json.load(f)
|
def init(configpath):
|
||||||
|
"""
|
||||||
|
Load config from a JSON file
|
||||||
|
"""
|
||||||
|
try:
|
||||||
|
with open(configpath) as f:
|
||||||
|
config = json.load(f)
|
||||||
|
except IOError as err:
|
||||||
|
print("Cannot open configuration: %s. Exiting" % (str(err)))
|
||||||
|
sys.exit(1)
|
||||||
|
|
||||||
# create a Dejavu instance
|
# create a Dejavu instance
|
||||||
return Dejavu(config)
|
return Dejavu(config)
|
||||||
|
|
||||||
def showHelp():
|
|
||||||
print ""
|
|
||||||
print "------------------------------------------------"
|
|
||||||
print "DejaVu audio fingerprinting and recognition tool"
|
|
||||||
print "------------------------------------------------"
|
|
||||||
print ""
|
|
||||||
print "Usage: dejavu.py [command] [arguments]"
|
|
||||||
print ""
|
|
||||||
print "Available commands:"
|
|
||||||
print ""
|
|
||||||
print " Fingerprint a file"
|
|
||||||
print " dejavu.py fingerprint /path/to/file.extension"
|
|
||||||
print ""
|
|
||||||
print " Fingerprint all files in a directory"
|
|
||||||
print " dejavu.py fingerprint /path/to/directory extension"
|
|
||||||
print ""
|
|
||||||
print " Recognize what is playing through the microphone"
|
|
||||||
print " dejavu.py recognize mic number_of_seconds"
|
|
||||||
print ""
|
|
||||||
print " Recognize a file by listening to it"
|
|
||||||
print " dejavu.py recognize file /path/to/file"
|
|
||||||
print ""
|
|
||||||
print " Display this help screen"
|
|
||||||
print " dejavu.py help"
|
|
||||||
print ""
|
|
||||||
exit
|
|
||||||
|
|
||||||
if len(sys.argv) > 1:
|
if __name__ == '__main__':
|
||||||
command = sys.argv[1]
|
parser = argparse.ArgumentParser(
|
||||||
else:
|
description="Dejavu: Audio Fingerprinting library",
|
||||||
showHelp()
|
formatter_class=RawTextHelpFormatter)
|
||||||
|
parser.add_argument('-c', '--config', nargs='?',
|
||||||
|
help='Path to configuration file\n'
|
||||||
|
'Usages: \n'
|
||||||
|
'--config /path/to/config-file\n')
|
||||||
|
parser.add_argument('-f', '--fingerprint', nargs='*',
|
||||||
|
help='Fingerprint files in a directory\n'
|
||||||
|
'Usages: \n'
|
||||||
|
'--fingerprint /path/to/directory extension\n'
|
||||||
|
'--fingerprint /path/to/directory')
|
||||||
|
parser.add_argument('-r', '--recognize', nargs=2,
|
||||||
|
help='Recognize what is '
|
||||||
|
'playing through the microphone\n'
|
||||||
|
'Usage: \n'
|
||||||
|
'--recognize mic number_of_seconds \n'
|
||||||
|
'--recognize file path/to/file \n')
|
||||||
|
args = parser.parse_args()
|
||||||
|
|
||||||
if command == 'fingerprint': # Fingerprint all files in a directory
|
if not args.fingerprint and not args.recognize:
|
||||||
|
parser.print_help()
|
||||||
|
sys.exit(0)
|
||||||
|
|
||||||
djv = init()
|
config_file = args.config
|
||||||
|
if config_file is None:
|
||||||
|
config_file = DEFAULT_CONFIG_FILE
|
||||||
|
# print "Using default config file: %s" % (config_file)
|
||||||
|
|
||||||
if len(sys.argv) == 4:
|
djv = init(config_file)
|
||||||
|
if args.fingerprint:
|
||||||
|
# Fingerprint all files in a directory
|
||||||
|
if len(args.fingerprint) == 2:
|
||||||
|
directory = args.fingerprint[0]
|
||||||
|
extension = args.fingerprint[1]
|
||||||
|
print("Fingerprinting all .%s files in the %s directory"
|
||||||
|
% (extension, directory))
|
||||||
|
djv.fingerprint_directory(directory, ["." + extension], 4)
|
||||||
|
|
||||||
directory = sys.argv[2]
|
elif len(args.fingerprint) == 1:
|
||||||
extension = sys.argv[3]
|
filepath = args.fingerprint[0]
|
||||||
print "Fingerprinting all .%s files in the %s directory" % (extension, directory)
|
if os.path.isdir(filepath):
|
||||||
|
print("Please specify an extension if you'd like to fingerprint a directory!")
|
||||||
|
sys.exit(1)
|
||||||
|
djv.fingerprint_file(filepath)
|
||||||
|
|
||||||
djv.fingerprint_directory(directory, ["." + extension], 4)
|
elif args.recognize:
|
||||||
|
# Recognize audio source
|
||||||
|
song = None
|
||||||
|
source = args.recognize[0]
|
||||||
|
opt_arg = args.recognize[1]
|
||||||
|
|
||||||
else:
|
if source in ('mic', 'microphone'):
|
||||||
|
song = djv.recognize(MicrophoneRecognizer, seconds=opt_arg)
|
||||||
filepath = sys.argv[2]
|
elif source == 'file':
|
||||||
djv.fingerprint_file(filepath)
|
song = djv.recognize(FileRecognizer, opt_arg)
|
||||||
|
print(song)
|
||||||
elif command == 'recognize': # Recognize audio
|
|
||||||
|
|
||||||
source = sys.argv[2]
|
|
||||||
song = None
|
|
||||||
|
|
||||||
if source in ['mic', 'microphone']:
|
|
||||||
|
|
||||||
seconds = int(sys.argv[3])
|
|
||||||
djv = init()
|
|
||||||
song = djv.recognize(MicrophoneRecognizer, seconds=seconds)
|
|
||||||
|
|
||||||
elif source == 'file':
|
|
||||||
|
|
||||||
djv = init()
|
|
||||||
sourceFile = sys.argv[3]
|
|
||||||
song = djv.recognize(FileRecognizer, sourceFile)
|
|
||||||
|
|
||||||
else:
|
|
||||||
|
|
||||||
showHelp()
|
|
||||||
|
|
||||||
print song
|
|
||||||
|
|
||||||
else:
|
|
||||||
|
|
||||||
showHelp()
|
|
||||||
|
|
||||||
|
sys.exit(0)
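Note on the new command-line interface: a minimal sketch, assuming only the three options shown in this diff, of how `argparse` hands the values back. The `recognize_seconds` helper is hypothetical and not part of the commit. It is there to show that `argparse` returns the `--recognize` arguments as strings, so a caller that needs a number of seconds would have to convert it (the old code did this with `int(sys.argv[3])`).

```python
import argparse


def build_parser():
    # Mirrors the three options added in this commit (help texts omitted).
    parser = argparse.ArgumentParser(
        description="Dejavu: Audio Fingerprinting library")
    parser.add_argument('-c', '--config', nargs='?')
    parser.add_argument('-f', '--fingerprint', nargs='*')
    parser.add_argument('-r', '--recognize', nargs=2)
    return parser


def recognize_seconds(args):
    # Hypothetical helper: convert the second --recognize value to an int
    # when the source is the microphone; return None otherwise.
    source, opt_arg = args.recognize
    if source in ('mic', 'microphone'):
        return int(opt_arg)
    return None


if __name__ == '__main__':
    parser = build_parser()
    args = parser.parse_args(['--recognize', 'mic', '10'])
    print(args.recognize)           # a list of two strings
    print(recognize_seconds(args))  # the seconds as an integer
    args = parser.parse_args(['-f', './mp3/', 'mp3'])
    print(args.fingerprint)         # directory and extension
```

With `nargs='*'`, `--fingerprint` accepts either one value (a file path) or two (directory and extension), which is why the script branches on `len(args.fingerprint)`.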
|
||||||
|
|
|
@ -3,6 +3,9 @@ import dejavu.decoder as decoder
|
||||||
import fingerprint
|
import fingerprint
|
||||||
import multiprocessing
|
import multiprocessing
|
||||||
import os
|
import os
|
||||||
|
import traceback
|
||||||
|
import sys
|
||||||
|
|
||||||
|
|
||||||
class Dejavu(object):
|
class Dejavu(object):
|
||||||
|
|
||||||
|
@ -27,7 +30,7 @@ class Dejavu(object):
|
||||||
# if we should limit seconds fingerprinted,
|
# if we should limit seconds fingerprinted,
|
||||||
# None|-1 means use entire track
|
# None|-1 means use entire track
|
||||||
self.limit = self.config.get("fingerprint_limit", None)
|
self.limit = self.config.get("fingerprint_limit", None)
|
||||||
if self.limit == -1: # for JSON compatibility
|
if self.limit == -1: # for JSON compatibility
|
||||||
self.limit = None
|
self.limit = None
|
||||||
self.get_fingerprinted_songs()
|
self.get_fingerprinted_songs()
|
||||||
|
|
||||||
|
@ -79,9 +82,7 @@ class Dejavu(object):
|
||||||
break
|
break
|
||||||
except:
|
except:
|
||||||
print("Failed fingerprinting")
|
print("Failed fingerprinting")
|
||||||
|
|
||||||
# Print traceback because we can't reraise it here
|
# Print traceback because we can't reraise it here
|
||||||
import traceback, sys
|
|
||||||
traceback.print_exc(file=sys.stdout)
|
traceback.print_exc(file=sys.stdout)
|
||||||
else:
|
else:
|
||||||
sid = self.db.insert_song(song_name)
|
sid = self.db.insert_song(song_name)
|
||||||
|
@ -94,13 +95,12 @@ class Dejavu(object):
|
||||||
pool.join()
|
pool.join()
|
||||||
|
|
||||||
def fingerprint_file(self, filepath, song_name=None):
|
def fingerprint_file(self, filepath, song_name=None):
|
||||||
|
songname = decoder.path_to_songname(filepath)
|
||||||
songname = decoder.path_to_songname(filepath)
|
song_name = song_name or songname
|
||||||
song_name = song_name or songname
|
# don't refingerprint already fingerprinted files
|
||||||
# don't refingerprint already fingerprinted files
|
|
||||||
if song_name in self.songnames_set:
|
if song_name in self.songnames_set:
|
||||||
print "%s already fingerprinted, continuing..." % song_name
|
print "%s already fingerprinted, continuing..." % song_name
|
||||||
else:
|
else:
|
||||||
song_name, hashes = _fingerprint_worker(filepath,
|
song_name, hashes = _fingerprint_worker(filepath,
|
||||||
self.limit,
|
self.limit,
|
||||||
song_name=song_name)
|
song_name=song_name)
|
||||||
|
@ -129,9 +129,9 @@ class Dejavu(object):
|
||||||
song_id = -1
|
song_id = -1
|
||||||
for tup in matches:
|
for tup in matches:
|
||||||
sid, diff = tup
|
sid, diff = tup
|
||||||
if not diff in diff_counter:
|
if diff not in diff_counter:
|
||||||
diff_counter[diff] = {}
|
diff_counter[diff] = {}
|
||||||
if not sid in diff_counter[diff]:
|
if sid not in diff_counter[diff]:
|
||||||
diff_counter[diff][sid] = 0
|
diff_counter[diff][sid] = 0
|
||||||
diff_counter[diff][sid] += 1
|
diff_counter[diff][sid] += 1
|
||||||
|
|
||||||
|
@ -149,15 +149,16 @@ class Dejavu(object):
|
||||||
return None
|
return None
|
||||||
|
|
||||||
# return match info
|
# return match info
|
||||||
nseconds = round(float(largest) / fingerprint.DEFAULT_FS * \
|
nseconds = round(float(largest) / fingerprint.DEFAULT_FS *
|
||||||
fingerprint.DEFAULT_WINDOW_SIZE * \
|
fingerprint.DEFAULT_WINDOW_SIZE *
|
||||||
fingerprint.DEFAULT_OVERLAP_RATIO, 5)
|
fingerprint.DEFAULT_OVERLAP_RATIO, 5)
|
||||||
song = {
|
song = {
|
||||||
Dejavu.SONG_ID : song_id,
|
Dejavu.SONG_ID: song_id,
|
||||||
Dejavu.SONG_NAME : songname,
|
Dejavu.SONG_NAME: songname,
|
||||||
Dejavu.CONFIDENCE : largest_count,
|
Dejavu.CONFIDENCE: largest_count,
|
||||||
Dejavu.OFFSET : largest,
|
Dejavu.OFFSET: largest,
|
||||||
Dejavu.OFFSET_SECS : nseconds }
|
Dejavu.OFFSET_SECS: nseconds
|
||||||
|
}
|
||||||
|
|
||||||
return song
|
return song
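Note on `offset_seconds`: the expression above converts an offset measured in spectrogram windows into seconds. The sketch below repeats that arithmetic as a standalone function, using the constant values that appear in `fingerprint.py` in this diff. The function name `offset_to_seconds` is an illustration, not something defined in the repository.

```python
# Constants as they appear in fingerprint.py in this commit.
DEFAULT_FS = 44100            # sampling rate in Hz
DEFAULT_WINDOW_SIZE = 4096    # FFT window size in samples
DEFAULT_OVERLAP_RATIO = 0.5   # fraction by which consecutive windows overlap


def offset_to_seconds(offset,
                      fs=DEFAULT_FS,
                      window_size=DEFAULT_WINDOW_SIZE,
                      overlap_ratio=DEFAULT_OVERLAP_RATIO):
    # Same arithmetic as the nseconds computation in Dejavu.align_matches:
    # each offset step advances window_size * overlap_ratio samples.
    return round(float(offset) / fs * window_size * overlap_ratio, 5)


if __name__ == '__main__':
    # The README sample output reports 'offset': 646 with
    # 'offset_seconds': 30.00018; the formula reproduces that value.
    print(offset_to_seconds(646))
```

Because the result depends on `DEFAULT_OVERLAP_RATIO`, changing that constant (as the Tuning section of the README suggests) also changes how stored offsets map to seconds.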
|
||||||
|
|
||||||
|
|
|
@ -12,8 +12,8 @@ IDX_TIME_J = 1
|
||||||
|
|
||||||
######################################################################
|
######################################################################
|
||||||
# Sampling rate, related to the Nyquist conditions, which affects
|
# Sampling rate, related to the Nyquist conditions, which affects
|
||||||
# the range frequencies we can detect.
|
# the range frequencies we can detect.
|
||||||
DEFAULT_FS = 44100
|
DEFAULT_FS = 44100
|
||||||
|
|
||||||
######################################################################
|
######################################################################
|
||||||
# Size of the FFT window, affects frequency granularity
|
# Size of the FFT window, affects frequency granularity
|
||||||
|
@ -23,15 +23,15 @@ DEFAULT_WINDOW_SIZE = 4096
|
||||||
# Ratio by which each sequential window overlaps the last and the
|
# Ratio by which each sequential window overlaps the last and the
|
||||||
# next window. Higher overlap will allow a higher granularity of offset
|
# next window. Higher overlap will allow a higher granularity of offset
|
||||||
# matching, but potentially more fingerprints.
|
# matching, but potentially more fingerprints.
|
||||||
DEFAULT_OVERLAP_RATIO = 0.5
|
DEFAULT_OVERLAP_RATIO = 0.5
|
||||||
|
|
||||||
######################################################################
|
######################################################################
|
||||||
# Degree to which a fingerprint can be paired with its neighbors --
|
# Degree to which a fingerprint can be paired with its neighbors --
|
||||||
# higher will cause more fingerprints, but potentially better accuracy.
|
# higher will cause more fingerprints, but potentially better accuracy.
|
||||||
DEFAULT_FAN_VALUE = 15
|
DEFAULT_FAN_VALUE = 15
|
||||||
|
|
||||||
######################################################################
|
######################################################################
|
||||||
# Minimum amplitude in spectrogram in order to be considered a peak.
|
# Minimum amplitude in spectrogram in order to be considered a peak.
|
||||||
# This can be raised to reduce number of fingerprints, but can negatively
|
# This can be raised to reduce number of fingerprints, but can negatively
|
||||||
# affect accuracy.
|
# affect accuracy.
|
||||||
DEFAULT_AMP_MIN = 10
|
DEFAULT_AMP_MIN = 10
|
||||||
|
@ -39,13 +39,13 @@ DEFAULT_AMP_MIN = 10
|
||||||
######################################################################
|
######################################################################
|
||||||
# Number of cells around an amplitude peak in the spectrogram in order
|
# Number of cells around an amplitude peak in the spectrogram in order
|
||||||
# for Dejavu to consider it a spectral peak. Higher values mean less
|
# for Dejavu to consider it a spectral peak. Higher values mean less
|
||||||
# fingerprints and faster matching, but can potentially affect accuracy.
|
# fingerprints and faster matching, but can potentially affect accuracy.
|
||||||
PEAK_NEIGHBORHOOD_SIZE = 20
|
PEAK_NEIGHBORHOOD_SIZE = 20
|
||||||
|
|
||||||
######################################################################
|
######################################################################
|
||||||
# Thresholds on how close or far fingerprints can be in time in order
|
# Thresholds on how close or far fingerprints can be in time in order
|
||||||
# to be paired as a fingerprint. If your max is too low, higher values of
|
# to be paired as a fingerprint. If your max is too low, higher values of
|
||||||
# DEFAULT_FAN_VALUE may not perform as expected.
|
# DEFAULT_FAN_VALUE may not perform as expected.
|
||||||
MIN_HASH_TIME_DELTA = 0
|
MIN_HASH_TIME_DELTA = 0
|
||||||
MAX_HASH_TIME_DELTA = 200
|
MAX_HASH_TIME_DELTA = 200
|
||||||
|
|
||||||
|
@ -56,7 +56,7 @@ MAX_HASH_TIME_DELTA = 200
|
||||||
PEAK_SORT = True
|
PEAK_SORT = True
|
||||||
|
|
||||||
######################################################################
|
######################################################################
|
||||||
# Number of bits to throw away from the front of the SHA1 hash in the
|
# Number of bits to throw away from the front of the SHA1 hash in the
|
||||||
# fingerprint calculation. The more you throw away, the less storage, but
|
# fingerprint calculation. The more you throw away, the less storage, but
|
||||||
# potentially higher collisions and misclassifications when identifying songs.
|
# potentially higher collisions and misclassifications when identifying songs.
|
||||||
FINGERPRINT_REDUCTION = 20
|
FINGERPRINT_REDUCTION = 20
|
||||||
|
@ -136,26 +136,20 @@ def generate_hashes(peaks, fan_value=DEFAULT_FAN_VALUE):
|
||||||
sha1_hash[0:20] time_offset
|
sha1_hash[0:20] time_offset
|
||||||
[(e05b341a9b77a51fd26, 32), ... ]
|
[(e05b341a9b77a51fd26, 32), ... ]
|
||||||
"""
|
"""
|
||||||
fingerprinted = set() # to avoid rehashing same pairs
|
|
||||||
|
|
||||||
if PEAK_SORT:
|
if PEAK_SORT:
|
||||||
peaks.sort(key=itemgetter(1))
|
peaks.sort(key=itemgetter(1))
|
||||||
|
|
||||||
for i in range(len(peaks)):
|
for i in range(len(peaks)):
|
||||||
for j in range(1, fan_value):
|
for j in range(1, fan_value):
|
||||||
if (i + j) < len(peaks) and not (i, i + j) in fingerprinted:
|
if (i + j) < len(peaks):
|
||||||
|
|
||||||
freq1 = peaks[i][IDX_FREQ_I]
|
freq1 = peaks[i][IDX_FREQ_I]
|
||||||
freq2 = peaks[i + j][IDX_FREQ_I]
|
freq2 = peaks[i + j][IDX_FREQ_I]
|
||||||
|
|
||||||
t1 = peaks[i][IDX_TIME_J]
|
t1 = peaks[i][IDX_TIME_J]
|
||||||
t2 = peaks[i + j][IDX_TIME_J]
|
t2 = peaks[i + j][IDX_TIME_J]
|
||||||
|
|
||||||
t_delta = t2 - t1
|
t_delta = t2 - t1
|
||||||
|
|
||||||
if t_delta >= MIN_HASH_TIME_DELTA and t_delta <= MAX_HASH_TIME_DELTA:
|
if t_delta >= MIN_HASH_TIME_DELTA and t_delta <= MAX_HASH_TIME_DELTA:
|
||||||
h = hashlib.sha1(
|
h = hashlib.sha1(
|
||||||
"%s|%s|%s" % (str(freq1), str(freq2), str(t_delta)))
|
"%s|%s|%s" % (str(freq1), str(freq2), str(t_delta)))
|
||||||
yield (h.hexdigest()[0:FINGERPRINT_REDUCTION], t1)
|
yield (h.hexdigest()[0:FINGERPRINT_REDUCTION], t1)
|
||||||
|
|
||||||
# ensure we don't repeat hashing
|
|
||||||
fingerprinted.add((i, i + j))
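Note on the removed `fingerprinted` set: the nested loops visit each index pair `(i, i + j)` with `j >= 1` exactly once, so the set never filtered anything. The sketch below is a Python 3 adaptation of the pairing loop, under the assumption that peaks are `(frequency, time)` tuples as the `IDX_FREQ_I` / `IDX_TIME_J` indices suggest. It sorts a copy of the input and encodes the string before hashing, which the Python 2 original does not need to do.

```python
import hashlib
from operator import itemgetter

# Values as they appear in fingerprint.py in this commit.
DEFAULT_FAN_VALUE = 15
MIN_HASH_TIME_DELTA = 0
MAX_HASH_TIME_DELTA = 200
FINGERPRINT_REDUCTION = 20


def generate_hashes(peaks, fan_value=DEFAULT_FAN_VALUE):
    # Sort by time so that t_delta is non-negative for every pair.
    peaks = sorted(peaks, key=itemgetter(1))
    for i in range(len(peaks)):
        for j in range(1, fan_value):
            if i + j < len(peaks):
                freq1, t1 = peaks[i]
                freq2, t2 = peaks[i + j]
                t_delta = t2 - t1
                if MIN_HASH_TIME_DELTA <= t_delta <= MAX_HASH_TIME_DELTA:
                    key = "%s|%s|%s" % (freq1, freq2, t_delta)
                    # Python 3 requires bytes for hashing.
                    h = hashlib.sha1(key.encode("utf-8"))
                    # Keep only the leading characters of the hex digest.
                    yield (h.hexdigest()[0:FINGERPRINT_REDUCTION], t1)


if __name__ == '__main__':
    # Three made-up peaks; only the first pair is within MAX_HASH_TIME_DELTA.
    for fingerprint, offset in generate_hashes([(10, 0), (20, 5), (30, 300)]):
        print(fingerprint, offset)
```

A larger `fan_value` pairs each peak with more of its successors, which yields more hashes per track; a smaller `FINGERPRINT_REDUCTION` shortens each stored hash at the cost of more collisions.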
|
|
||||||
|
|
|
@ -207,10 +207,16 @@ class DejavuTest(object):
|
||||||
log_msg('file: %s' % f)
|
log_msg('file: %s' % f)
|
||||||
|
|
||||||
# get column
|
# get column
|
||||||
col = self.get_column_id(re.findall("[0-9]*sec",f)[0])
|
col = self.get_column_id(re.findall("[0-9]*sec", f)[0])
|
||||||
song = path_to_songname(f).split("_")[0] # format: XXXX_offset_length.mp3
|
# format: XXXX_offset_length.mp3
|
||||||
line = self.get_line_id (song)
|
song = path_to_songname(f).split("_")[0]
|
||||||
result = subprocess.check_output(["python", "dejavu.py", 'recognize', 'file', self.test_folder + "/" + f])
|
line = self.get_line_id(song)
|
||||||
|
result = subprocess.check_output([
|
||||||
|
"python",
|
||||||
|
"dejavu.py",
|
||||||
|
'-r',
|
||||||
|
'file',
|
||||||
|
self.test_folder + "/" + f])
|
||||||
|
|
||||||
if result.strip() == "None":
|
if result.strip() == "None":
|
||||||
log_msg('No match')
|
log_msg('No match')
|
||||||
|
|
13
example.py
|
@ -16,12 +16,19 @@ djv.fingerprint_directory("mp3", [".mp3"])
|
||||||
# Recognize audio from a file
|
# Recognize audio from a file
|
||||||
from dejavu.recognize import FileRecognizer
|
from dejavu.recognize import FileRecognizer
|
||||||
song = djv.recognize(FileRecognizer, "mp3/Sean-Fournier--Falling-For-You.mp3")
|
song = djv.recognize(FileRecognizer, "mp3/Sean-Fournier--Falling-For-You.mp3")
|
||||||
|
print "From file we recognized: %s\n" % song
|
||||||
|
|
||||||
# Or recognize audio from your microphone for 10 seconds
|
# Or recognize audio from your microphone for `secs` seconds
|
||||||
from dejavu.recognize import MicrophoneRecognizer
|
from dejavu.recognize import MicrophoneRecognizer
|
||||||
song = djv.recognize(MicrophoneRecognizer, seconds=2)
|
secs = 5
|
||||||
|
song = djv.recognize(MicrophoneRecognizer, seconds=secs)
|
||||||
|
if song is None:
|
||||||
|
print "Nothing recognized -- did you play the song out loud so your mic could hear it? :)"
|
||||||
|
else:
|
||||||
|
print "From mic with %d seconds we recognized: %s\n" % (secs, song)
|
||||||
|
|
||||||
# Or use a recognizer without the shortcut, in anyway you would like
|
# Or use a recognizer without the shortcut, in anyway you would like
|
||||||
from dejavu.recognize import FileRecognizer
|
from dejavu.recognize import FileRecognizer
|
||||||
recognizer = FileRecognizer(djv)
|
recognizer = FileRecognizer(djv)
|
||||||
song = recognizer.recognize_file("mp3/Josh-Woodward--I-Want-To-Destroy-Something-Beautiful.mp3")
|
song = recognizer.recognize_file("mp3/Josh-Woodward--I-Want-To-Destroy-Something-Beautiful.mp3")
|
||||||
|
print "No shortcut, we recognized: %s\n" % song
|
9
requirements.txt
Normal file
@ -0,0 +1,9 @@
# requirements file

### BEGIN ###
pydub>=0.9.4
PyAudio>=0.2.7
numpy>=1.8.2
scipy>=0.12.1
matplotlib>=1.3.1
### END ###
28
run_tests.py
|
@ -8,28 +8,28 @@ import shutil
|
||||||
usage = "usage: %prog [options] TESTING_AUDIOFOLDER"
|
usage = "usage: %prog [options] TESTING_AUDIOFOLDER"
|
||||||
parser = OptionParser(usage=usage, version="%prog 1.1")
|
parser = OptionParser(usage=usage, version="%prog 1.1")
|
||||||
parser.add_option("--secs",
|
parser.add_option("--secs",
|
||||||
action="store",
|
action="store",
|
||||||
dest="secs",
|
dest="secs",
|
||||||
default=5,
|
default=5,
|
||||||
type=int,
|
type=int,
|
||||||
help='Number of seconds starting from zero to test')
|
help='Number of seconds starting from zero to test')
|
||||||
parser.add_option("--results",
|
parser.add_option("--results",
|
||||||
action="store",
|
action="store",
|
||||||
dest="results_folder",
|
dest="results_folder",
|
||||||
default="./dejavu_test_results",
|
default="./dejavu_test_results",
|
||||||
help='Sets the path where the results are saved')
|
help='Sets the path where the results are saved')
|
||||||
parser.add_option("--temp",
|
parser.add_option("--temp",
|
||||||
action="store",
|
action="store",
|
||||||
dest="temp_folder",
|
dest="temp_folder",
|
||||||
default="./dejavu_temp_testing_files",
|
default="./dejavu_temp_testing_files",
|
||||||
help='Sets the path where the temp files are saved')
|
help='Sets the path where the temp files are saved')
|
||||||
parser.add_option("--log",
|
parser.add_option("--log",
|
||||||
action="store_true",
|
action="store_true",
|
||||||
dest="log",
|
dest="log",
|
||||||
default=True,
|
default=True,
|
||||||
help='Enables logging')
|
help='Enables logging')
|
||||||
parser.add_option("--silent",
|
parser.add_option("--silent",
|
||||||
action="store_false",
|
action="store_false",
|
||||||
dest="silent",
|
dest="silent",
|
||||||
default=False,
|
default=False,
|
||||||
help='Disables printing')
|
help='Disables printing')
|
||||||
|
@ -38,13 +38,13 @@ parser.add_option("--log-file",
|
||||||
default="results-compare.log",
|
default="results-compare.log",
|
||||||
help='Set the path and filename of the log file')
|
help='Set the path and filename of the log file')
|
||||||
parser.add_option("--padding",
|
parser.add_option("--padding",
|
||||||
action="store",
|
action="store",
|
||||||
dest="padding",
|
dest="padding",
|
||||||
default=10,
|
default=10,
|
||||||
type=int,
|
type=int,
|
||||||
help='Number of seconds to pad choice of place to test from')
|
help='Number of seconds to pad choice of place to test from')
|
||||||
parser.add_option("--seed",
|
parser.add_option("--seed",
|
||||||
action="store",
|
action="store",
|
||||||
dest="seed",
|
dest="seed",
|
||||||
default=None,
|
default=None,
|
||||||
type=int,
|
type=int,
|
||||||
|
@ -62,27 +62,27 @@ except:
|
||||||
os.mkdir(options.results_folder)
|
os.mkdir(options.results_folder)
|
||||||
|
|
||||||
# set logging
|
# set logging
|
||||||
if options.log == True:
|
if options.log:
|
||||||
logging.basicConfig(filename=options.log_file, level=logging.DEBUG)
|
logging.basicConfig(filename=options.log_file, level=logging.DEBUG)
|
||||||
|
|
||||||
# set test seconds
|
# set test seconds
|
||||||
test_seconds = ['%dsec' % i for i in range(1, options.secs + 1, 1)]
|
test_seconds = ['%dsec' % i for i in range(1, options.secs + 1, 1)]
|
||||||
|
|
||||||
# generate testing files
|
# generate testing files
|
||||||
for i in range(1, options.secs + 1, 1):
|
for i in range(1, options.secs + 1, 1):
|
||||||
generate_test_files(test_folder, options.temp_folder,
|
generate_test_files(test_folder, options.temp_folder,
|
||||||
i, padding=options.padding)
|
i, padding=options.padding)
|
||||||
|
|
||||||
# scan files
|
# scan files
|
||||||
log_msg("Running Dejavu fingerprinter on files in %s..." % test_folder,
|
log_msg("Running Dejavu fingerprinter on files in %s..." % test_folder,
|
||||||
log=options.log, silent=options.silent)
|
log=options.log, silent=options.silent)
|
||||||
|
|
||||||
tm = time.time()
|
tm = time.time()
|
||||||
djv = DejavuTest(options.temp_folder, test_seconds)
|
djv = DejavuTest(options.temp_folder, test_seconds)
|
||||||
log_msg("finished obtaining results from dejavu in %s" % (time.time() - tm),
|
log_msg("finished obtaining results from dejavu in %s" % (time.time() - tm),
|
||||||
log=options.log, silent=options.silent)
|
log=options.log, silent=options.silent)
|
||||||
|
|
||||||
tests = 1 # djv
|
tests = 1 # djv
|
||||||
n_secs = len(test_seconds)
|
n_secs = len(test_seconds)
|
||||||
|
|
||||||
# set result variables -> 4d variables
|
# set result variables -> 4d variables
|
||||||
|
|
60
setup.py
Normal file
@ -0,0 +1,60 @@
from setuptools import setup, find_packages
# import os, sys


def parse_requirements(requirements):
    # load from requirements.txt
    with open(requirements) as f:
        lines = [l for l in f]
        # remove spaces
        stripped = map((lambda x: x.strip()), lines)
        # remove comments
        nocomments = filter((lambda x: not x.startswith('#')), stripped)
        # remove empty lines
        reqs = filter((lambda x: x), nocomments)
        return reqs

PACKAGE_NAME = "PyDejavu"
PACKAGE_VERSION = "0.1.2"
SUMMARY = 'Dejavu: Audio Fingerprinting in Python'
DESCRIPTION = """
Audio fingerprinting and recognition algorithm implemented in Python

See the explanation here:

`http://willdrevo.com/fingerprinting-and-audio-recognition-with-python/`__

Dejavu can memorize recorded audio by listening to it once and fingerprinting
it. Then by playing a song and recording microphone input or on disk file,
Dejavu attempts to match the audio against the fingerprints held in the
database, returning the song or recording being played.

__ http://willdrevo.com/fingerprinting-and-audio-recognition-with-python/
"""
REQUIREMENTS = parse_requirements("requirements.txt")

setup(
    name=PACKAGE_NAME,
    version=PACKAGE_VERSION,
    description=SUMMARY,
    long_description=DESCRIPTION,
    author='Will Drevo',
    author_email='will.drevo@gmail.com',
    maintainer="Saleem Ansari",
    maintainer_email="tuxdna@gmail.com",
    url='http://github.com/tuxdna/dejavu',
    license='MIT License',
    include_package_data=True,
    packages=find_packages(),
    platforms=['Unix'],
    install_requires=REQUIREMENTS,
    classifiers=[
        'Development Status :: 4 - Beta',
        'Environment :: Console',
        'Intended Audience :: Developers',
        'License :: OSI Approved :: MIT License',
        'Operating System :: OS Independent',
        'Topic :: Software Development :: Libraries :: Python Modules',
    ],
    keywords="python, audio, fingerprinting, music, numpy, landmark",
)
|
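Note on `parse_requirements` in `setup.py`: the committed version chains `map` and `filter`, which return lists on Python 2 (the version this code targets) but lazy iterators on Python 3. The sketch below is one possible equivalent that returns a plain list on both; it is an illustration and not the code in the commit. The temporary file in the demo stands in for `requirements.txt`.

```python
import os
import tempfile


def parse_requirements(path):
    # Read the file, strip whitespace, then drop comments and empty lines.
    with open(path) as f:
        stripped = [line.strip() for line in f]
    return [line for line in stripped
            if line and not line.startswith('#')]


if __name__ == '__main__':
    # Demo input modeled on the requirements.txt added in this commit.
    sample = ("# requirements file\n\n### BEGIN ###\n"
              "pydub>=0.9.4\nPyAudio>=0.2.7\n### END ###\n")
    with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False) as tmp:
        tmp.write(sample)
    try:
        print(parse_requirements(tmp.name))
    finally:
        os.remove(tmp.name)
```

Returning a concrete list keeps the value reusable if it is read more than once, which a one-shot iterator would not allow.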
@ -8,7 +8,7 @@ rm -rf ./results ./temp_audio
|
||||||
|
|
||||||
###########
|
###########
|
||||||
# Fingerprint files of extension mp3 in the ./mp3 folder
|
# Fingerprint files of extension mp3 in the ./mp3 folder
|
||||||
python dejavu.py fingerprint ./mp3/ mp3
|
python dejavu.py -f ./mp3/ mp3
|
||||||
|
|
||||||
##########
|
##########
|
||||||
# Run a test suite on the ./mp3 folder by extracting 1, 2, 3, 4, and 5
|
# Run a test suite on the ./mp3 folder by extracting 1, 2, 3, 4, and 5
|
||||||
|
@ -22,4 +22,4 @@ python run_tests.py \
|
||||||
--padding 8 \
|
--padding 8 \
|
||||||
--seed 42 \
|
--seed 42 \
|
||||||
--results ./results \
|
--results ./results \
|
||||||
./mp3
|
./mp3
|
||||||
|
|