Merge branch 'tuxdna-master'

Will Drevo 2014-12-15 21:31:44 -05:00
commit 19679f8d11
13 changed files with 288 additions and 180 deletions

@ -1,31 +0,0 @@
# Dependencies required by dejavu
* [`pyaudio`](http://people.csail.mit.edu/hubert/pyaudio/)
* [`ffmpeg`](https://github.com/FFmpeg/FFmpeg)
* [`pydub`](http://pydub.com/)
* [`numpy`](http://www.numpy.org/)
* [`scipy`](http://www.scipy.org/)
* [`matplotlib`](http://matplotlib.org/)
* [`MySQLdb`](http://mysql-python.sourceforge.net/MySQLdb.html)
## Dependency installation for Mac OS X
Tested on OS X Mavericks. An option is to install [Homebrew](http://brew.sh) and do the following:
```
brew install portaudio
brew install ffmpeg
sudo easy_install pyaudio
sudo easy_install pydub
sudo easy_install numpy
sudo easy_install scipy
sudo easy_install matplotlib
sudo easy_install pip
sudo pip install MySQL-python
sudo ln -s /usr/local/mysql/lib/libmysqlclient.18.dylib /usr/lib/libmysqlclient.18.dylib
```
However installing `portaudio` and/or `ffmpeg` from source is also doable.

63
INSTALLATION.md Normal file

@ -0,0 +1,63 @@
# Installation of dejavu
So far dejavu has only been tested on Unix systems.
* [`pyaudio`](http://people.csail.mit.edu/hubert/pyaudio/) for grabbing audio from microphone
* [`ffmpeg`](https://github.com/FFmpeg/FFmpeg) for converting audio files to .wav format
* [`pydub`](http://pydub.com/), a Python `ffmpeg` wrapper
* [`numpy`](http://www.numpy.org/) for taking the FFT of audio signals
* [`scipy`](http://www.scipy.org/), used in peak finding algorithms
* [`matplotlib`](http://matplotlib.org/), used for spectrograms and plotting
* [`MySQLdb`](http://mysql-python.sourceforge.net/MySQLdb.html) for interfacing with MySQL databases
For installing `ffmpeg` on Mac OS X, I highly recommend [this post](http://jungels.net/articles/ffmpeg-howto.html).
## Fedora 20+
### Dependency installation on Fedora 20+
Install the dependencies
sudo yum install numpy scipy python-matplotlib ffmpeg portaudio-devel
pip install PyAudio
pip install pydub
Now set up virtualenv ([howto?](http://www.pythoncentral.io/how-to-install-virtualenv-python/))
pip install virtualenv
virtualenv --system-site-packages env_with_system
Install from PyPI
source env_with_system/bin/activate
pip install PyDejavu
You can also install the latest code from GitHub:
source env_with_system/bin/activate
pip install https://github.com/tuxdna/dejavu/zipball/master
## Mac OS X
### Dependency installation for Mac OS X
Tested on OS X Mavericks. An option is to install [Homebrew](http://brew.sh) and do the following:
```
brew install portaudio
brew install ffmpeg
sudo easy_install pyaudio
sudo easy_install pydub
sudo easy_install numpy
sudo easy_install scipy
sudo easy_install matplotlib
sudo easy_install pip
sudo pip install MySQL-python
sudo ln -s /usr/local/mysql/lib/libmysqlclient.18.dylib /usr/lib/libmysqlclient.18.dylib
```
However installing `portaudio` and/or `ffmpeg` from source is also doable.

1
MANIFEST.in Normal file

@ -0,0 +1 @@
include requirements.txt

@ -6,19 +6,9 @@ Audio fingerprinting and recognition algorithm implemented in Python, see the ex
Dejavu can memorize audio by listening to it once and fingerprinting it. Then by playing a song and recording microphone input, Dejavu attempts to match the audio against the fingerprints held in the database, returning the song being played.
## Dependencies:
## Installation and Dependencies:
I've only tested this on Unix systems.
* [`pyaudio`](http://people.csail.mit.edu/hubert/pyaudio/) for grabbing audio from microphone
* [`ffmpeg`](https://github.com/FFmpeg/FFmpeg) for converting audio files to .wav format
* [`pydub`](http://pydub.com/), a Python `ffmpeg` wrapper
* [`numpy`](http://www.numpy.org/) for taking the FFT of audio signals
* [`scipy`](http://www.scipy.org/), used in peak finding algorithms
* [`matplotlib`](http://matplotlib.org/), used for spectrograms and plotting
* [`MySQLdb`](http://mysql-python.sourceforge.net/MySQLdb.html) for interfacing with MySQL databases
For installing `ffmpeg` on Mac OS X, I highly recommend [this post](http://jungels.net/articles/ffmpeg-howto.html).
Read [INSTALLATION.md](INSTALLATION.md)
## Setup
@ -100,6 +90,19 @@ An example configuration is as follows:
>>> djv = Dejavu(config)
```
## Tuning
Inside `fingerprint.py`, you may want to adjust the following parameters (example values are shown below).
FINGERPRINT_REDUCTION = 30
PEAK_SORT = False
DEFAULT_OVERLAP_RATIO = 0.4
DEFAULT_FAN_VALUE = 10
DEFAULT_AMP_MIN = 15
PEAK_NEIGHBORHOOD_SIZE = 30
These parameters are described in detail in `fingerprint.py`. Read it to understand the impact of changing these values.
## Recognizing
There are two ways to recognize audio using Dejavu. You can recognize by reading and processing files on disk, or through your computer's microphone.
@ -109,7 +112,7 @@ There are two ways to recognize audio using Dejavu. You can recognize by reading
Through the terminal:
```bash
$ python dejavu.py recognize file sometrack.wav
$ python dejavu.py --recognize file sometrack.wav
{'song_id': 1, 'song_name': 'Taylor Swift - Shake It Off', 'confidence': 3948, 'offset_seconds': 30.00018, 'match_time': 0.7159781455993652, 'offset': 646L}
```
@ -132,10 +135,10 @@ With scripting:
and with the command line script, you specify the number of seconds to listen:
```bash
$ python dejavu.py recognize mic 10
$ python dejavu.py --recognize mic 10
```
## Testing (New!)
## Testing
Testing out different parameterizations of the fingerprinting algorithm is often useful as the corpus becomes larger and larger, and inevitable tradeoffs between speed and accuracy come into play.
@ -163,7 +166,7 @@ rm -rf ./results ./temp_audio
###########
# Fingerprint files of extension mp3 in the ./mp3 folder
python dejavu.py fingerprint ./mp3/ mp3
python dejavu.py --fingerprint ./mp3/ mp3
##########
# Run a test suite on the ./mp3 folder by extracting 1, 2, 3, 4, and 5

143
dejavu.py

@ -1,98 +1,93 @@
#!/usr/bin/python
import os
import sys
import json
import warnings
import argparse
from dejavu import Dejavu
from dejavu.recognize import FileRecognizer
from dejavu.recognize import MicrophoneRecognizer
from dejavu.recognize import FileRecognizer
from argparse import RawTextHelpFormatter
warnings.filterwarnings("ignore")
def init():
# load config from a JSON file (or anything outputting a python dictionary)
with open("dejavu.cnf") as f:
config = json.load(f)
DEFAULT_CONFIG_FILE = "dejavu.cnf.SAMPLE"
def init(configpath):
"""
Load config from a JSON file
"""
try:
with open(configpath) as f:
config = json.load(f)
except IOError as err:
print("Cannot open configuration: %s. Exiting" % (str(err)))
sys.exit(1)
# create a Dejavu instance
return Dejavu(config)
def showHelp():
print ""
print "------------------------------------------------"
print "DejaVu audio fingerprinting and recognition tool"
print "------------------------------------------------"
print ""
print "Usage: dejavu.py [command] [arguments]"
print ""
print "Available commands:"
print ""
print " Fingerprint a file"
print " dejavu.py fingerprint /path/to/file.extension"
print ""
print " Fingerprint all files in a directory"
print " dejavu.py fingerprint /path/to/directory extension"
print ""
print " Recognize what is playing through the microphone"
print " dejavu.py recognize mic number_of_seconds"
print ""
print " Recognize a file by listening to it"
print " dejavu.py recognize file /path/to/file"
print ""
print " Display this help screen"
print " dejavu.py help"
print ""
exit
if len(sys.argv) > 1:
command = sys.argv[1]
else:
showHelp()
if __name__ == '__main__':
parser = argparse.ArgumentParser(
description="Dejavu: Audio Fingerprinting library",
formatter_class=RawTextHelpFormatter)
parser.add_argument('-c', '--config', nargs='?',
help='Path to configuration file\n'
'Usages: \n'
'--config /path/to/config-file\n')
parser.add_argument('-f', '--fingerprint', nargs='*',
help='Fingerprint files in a directory\n'
'Usages: \n'
'--fingerprint /path/to/directory extension\n'
'--fingerprint /path/to/directory')
parser.add_argument('-r', '--recognize', nargs=2,
help='Recognize what is '
'playing through the microphone\n'
'Usage: \n'
'--recognize mic number_of_seconds \n'
'--recognize file path/to/file \n')
args = parser.parse_args()
if command == 'fingerprint': # Fingerprint all files in a directory
if not args.fingerprint and not args.recognize:
parser.print_help()
sys.exit(0)
djv = init()
config_file = args.config
if config_file is None:
config_file = DEFAULT_CONFIG_FILE
# print "Using default config file: %s" % (config_file)
djv = init(config_file)
if args.fingerprint:
# Fingerprint all files in a directory
if len(args.fingerprint) == 2:
directory = args.fingerprint[0]
extension = args.fingerprint[1]
print("Fingerprinting all .%s files in the %s directory"
% (extension, directory))
djv.fingerprint_directory(directory, ["." + extension], 4)
if len(sys.argv) == 4:
elif len(args.fingerprint) == 1:
filepath = args.fingerprint[0]
if os.path.isdir(filepath):
print("Please specify an extension if you'd like to fingerprint a directory!")
sys.exit(1)
djv.fingerprint_file(filepath)
directory = sys.argv[2]
extension = sys.argv[3]
print "Fingerprinting all .%s files in the %s directory" % (extension, directory)
elif args.recognize:
# Recognize audio source
song = None
source = args.recognize[0]
opt_arg = args.recognize[1]
djv.fingerprint_directory(directory, ["." + extension], 4)
else:
filepath = sys.argv[2]
djv.fingerprint_file(filepath)
elif command == 'recognize': # Recognize audio
source = sys.argv[2]
song = None
if source in ['mic', 'microphone']:
seconds = int(sys.argv[3])
djv = init()
song = djv.recognize(MicrophoneRecognizer, seconds=seconds)
elif source == 'file':
djv = init()
sourceFile = sys.argv[3]
song = djv.recognize(FileRecognizer, sourceFile)
else:
showHelp()
print song
else:
showHelp()
if source in ('mic', 'microphone'):
song = djv.recognize(MicrophoneRecognizer, seconds=opt_arg)
elif source == 'file':
song = djv.recognize(FileRecognizer, opt_arg)
print(song)
sys.exit(0)
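The switch from positional `sys.argv` handling to `argparse` can be sketched in isolation as follows. This is a minimal sketch, not the committed script: `build_parser` and `describe` are hypothetical helper names, and `describe` returns a tuple instead of calling `Dejavu` so that it runs without a database. Converting the microphone duration with `int()` is an assumption carried over from the removed `sys.argv` code; the new code in the diff passes the string through unchanged.

```python
import argparse


def build_parser():
    """Build a parser with the same three options as the diff above."""
    parser = argparse.ArgumentParser(
        description="Dejavu: Audio Fingerprinting library",
        formatter_class=argparse.RawTextHelpFormatter)
    parser.add_argument('-c', '--config', nargs='?',
                        help='Path to configuration file')
    # nargs='*' collects zero or more values into a list.
    parser.add_argument('-f', '--fingerprint', nargs='*',
                        help='--fingerprint /path/to/directory extension\n'
                             '--fingerprint /path/to/file')
    # nargs=2 requires exactly two values: a source and its argument.
    parser.add_argument('-r', '--recognize', nargs=2,
                        help='--recognize mic number_of_seconds\n'
                             '--recognize file path/to/file')
    return parser


def describe(argv):
    """Return (action, details) for an argument list (hypothetical helper)."""
    args = build_parser().parse_args(argv)
    if args.fingerprint:
        if len(args.fingerprint) == 2:
            return ('fingerprint_directory', tuple(args.fingerprint))
        if len(args.fingerprint) == 1:
            return ('fingerprint_file', args.fingerprint[0])
    elif args.recognize:
        source, opt_arg = args.recognize
        if source in ('mic', 'microphone'):
            # Assumption: seconds should be an integer, as in the old code.
            return ('recognize_mic', int(opt_arg))
        if source == 'file':
            return ('recognize_file', opt_arg)
    return ('help', None)


print(describe(['--recognize', 'mic', '10']))
```

Because `--fingerprint` uses `nargs='*'`, passing the flag with no values yields an empty list, which is falsy and therefore falls through to the help branch in this sketch.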

@ -3,6 +3,9 @@ import dejavu.decoder as decoder
import fingerprint
import multiprocessing
import os
import traceback
import sys
class Dejavu(object):
@ -27,7 +30,7 @@ class Dejavu(object):
# if we should limit seconds fingerprinted,
# None|-1 means use entire track
self.limit = self.config.get("fingerprint_limit", None)
if self.limit == -1: # for JSON compatibility
if self.limit == -1: # for JSON compatibility
self.limit = None
self.get_fingerprinted_songs()
@ -79,9 +82,7 @@ class Dejavu(object):
break
except:
print("Failed fingerprinting")
# Print traceback because we can't reraise it here
import traceback, sys
traceback.print_exc(file=sys.stdout)
else:
sid = self.db.insert_song(song_name)
@ -94,13 +95,12 @@ class Dejavu(object):
pool.join()
def fingerprint_file(self, filepath, song_name=None):
songname = decoder.path_to_songname(filepath)
song_name = song_name or songname
# don't refingerprint already fingerprinted files
songname = decoder.path_to_songname(filepath)
song_name = song_name or songname
# don't refingerprint already fingerprinted files
if song_name in self.songnames_set:
print "%s already fingerprinted, continuing..." % song_name
else:
else:
song_name, hashes = _fingerprint_worker(filepath,
self.limit,
song_name=song_name)
@ -129,9 +129,9 @@ class Dejavu(object):
song_id = -1
for tup in matches:
sid, diff = tup
if not diff in diff_counter:
if diff not in diff_counter:
diff_counter[diff] = {}
if not sid in diff_counter[diff]:
if sid not in diff_counter[diff]:
diff_counter[diff][sid] = 0
diff_counter[diff][sid] += 1
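The nested `diff_counter` dictionary above counts how often each (song id, offset difference) pair occurs among the matches. A minimal sketch of the same counting idea using `collections.Counter` is shown here; `best_alignment` is a hypothetical name, and the sample `matches` list is made up for illustration, not taken from a real database.

```python
from collections import Counter


def best_alignment(matches):
    """Return (song_id, offset_diff, count) for the most frequent pair.

    `matches` is an iterable of (sid, diff) tuples, as unpacked in the
    loop above. Returns None when there are no matches.
    """
    counts = Counter((sid, diff) for sid, diff in matches)
    if not counts:
        return None
    (sid, diff), count = counts.most_common(1)[0]
    return sid, diff, count


# Made-up matches: song 1 aligns three times at offset difference 646.
matches = [(1, 646), (1, 646), (2, 10), (1, 646), (2, 11)]
print(best_alignment(matches))
```

This swaps the two-level dictionary for a single counter keyed by the pair; it is an alternative formulation, not the code in the commit.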
@ -149,15 +149,16 @@ class Dejavu(object):
return None
# return match info
nseconds = round(float(largest) / fingerprint.DEFAULT_FS * \
fingerprint.DEFAULT_WINDOW_SIZE * \
fingerprint.DEFAULT_OVERLAP_RATIO, 5)
nseconds = round(float(largest) / fingerprint.DEFAULT_FS *
fingerprint.DEFAULT_WINDOW_SIZE *
fingerprint.DEFAULT_OVERLAP_RATIO, 5)
song = {
Dejavu.SONG_ID : song_id,
Dejavu.SONG_NAME : songname,
Dejavu.CONFIDENCE : largest_count,
Dejavu.OFFSET : largest,
Dejavu.OFFSET_SECS : nseconds }
Dejavu.SONG_ID: song_id,
Dejavu.SONG_NAME: songname,
Dejavu.CONFIDENCE: largest_count,
Dejavu.OFFSET: largest,
Dejavu.OFFSET_SECS: nseconds
}
return song
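The `nseconds` expression above converts an offset measured in spectrogram frames into seconds. The sketch below works through that arithmetic with assumed constants: `DEFAULT_FS = 44100`, `DEFAULT_WINDOW_SIZE = 4096`, and `DEFAULT_OVERLAP_RATIO = 0.5` are not shown in this diff and are chosen here because they reproduce the README example (`'offset': 646` with `'offset_seconds': 30.00018`). The function name `offset_to_seconds` is hypothetical.

```python
# Assumed values; the real constants are defined in fingerprint.py.
DEFAULT_FS = 44100             # samples per second
DEFAULT_WINDOW_SIZE = 4096     # samples per FFT window
DEFAULT_OVERLAP_RATIO = 0.5    # fraction of a window shared with the next one


def offset_to_seconds(offset):
    """Convert a frame offset to seconds, mirroring the expression above."""
    return round(float(offset) / DEFAULT_FS *
                 DEFAULT_WINDOW_SIZE *
                 DEFAULT_OVERLAP_RATIO, 5)


# 646 * 4096 * 0.5 / 44100 = 1323008 / 44100, which is about 30.00018
print(offset_to_seconds(646))
```

Changing `DEFAULT_OVERLAP_RATIO` or `DEFAULT_WINDOW_SIZE` changes how many seconds one frame represents, so reported offsets are only comparable between runs that use the same settings.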

@ -136,26 +136,20 @@ def generate_hashes(peaks, fan_value=DEFAULT_FAN_VALUE):
sha1_hash[0:20] time_offset
[(e05b341a9b77a51fd26, 32), ... ]
"""
fingerprinted = set() # to avoid rehashing same pairs
if PEAK_SORT:
peaks.sort(key=itemgetter(1))
for i in range(len(peaks)):
for j in range(1, fan_value):
if (i + j) < len(peaks) and not (i, i + j) in fingerprinted:
if (i + j) < len(peaks):
freq1 = peaks[i][IDX_FREQ_I]
freq2 = peaks[i + j][IDX_FREQ_I]
t1 = peaks[i][IDX_TIME_J]
t2 = peaks[i + j][IDX_TIME_J]
t_delta = t2 - t1
if t_delta >= MIN_HASH_TIME_DELTA and t_delta <= MAX_HASH_TIME_DELTA:
h = hashlib.sha1(
"%s|%s|%s" % (str(freq1), str(freq2), str(t_delta)))
yield (h.hexdigest()[0:FINGERPRINT_REDUCTION], t1)
# ensure we don't repeat hashing
fingerprinted.add((i, i + j))
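The hunk above removes the `fingerprinted` set. Since `j` starts at 1, each index pair `(i, i + j)` is visited exactly once by the two loops, so the set never filtered anything. The following is a standalone sketch of the pairing-and-hashing loop, not the module itself: the constant values are assumptions, peaks are assumed to be `(frequency, time)` tuples, and the string is encoded to bytes so the sketch also runs on Python 3.

```python
import hashlib

# Assumed values for illustration; the real ones live in fingerprint.py.
FAN_VALUE = 3
MIN_HASH_TIME_DELTA = 0
MAX_HASH_TIME_DELTA = 200
FINGERPRINT_REDUCTION = 20


def pair_hashes(peaks, fan_value=FAN_VALUE):
    """Yield (truncated_sha1_hex, t1) for each peak and its next neighbours.

    `peaks` is a list of (frequency_bin, time_bin) tuples sorted by time.
    """
    for i in range(len(peaks)):
        for j in range(1, fan_value):
            if i + j >= len(peaks):
                break
            freq1, t1 = peaks[i]
            freq2, t2 = peaks[i + j]
            t_delta = t2 - t1
            if MIN_HASH_TIME_DELTA <= t_delta <= MAX_HASH_TIME_DELTA:
                key = "%s|%s|%s" % (freq1, freq2, t_delta)
                digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
                # Keep only the first FINGERPRINT_REDUCTION hex characters.
                yield digest[:FINGERPRINT_REDUCTION], t1


# Made-up peaks, already sorted by time.
peaks = [(10, 0), (25, 3), (40, 7), (12, 9)]
hashes = list(pair_hashes(peaks))
print(len(hashes))
```

A smaller `FINGERPRINT_REDUCTION` stores shorter hashes at the cost of more collisions, and a larger fan value produces more hashes per peak.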

@ -207,10 +207,16 @@ class DejavuTest(object):
log_msg('file: %s' % f)
# get column
col = self.get_column_id(re.findall("[0-9]*sec",f)[0])
song = path_to_songname(f).split("_")[0] # format: XXXX_offset_length.mp3
line = self.get_line_id (song)
result = subprocess.check_output(["python", "dejavu.py", 'recognize', 'file', self.test_folder + "/" + f])
col = self.get_column_id(re.findall("[0-9]*sec", f)[0])
# format: XXXX_offset_length.mp3
song = path_to_songname(f).split("_")[0]
line = self.get_line_id(song)
result = subprocess.check_output([
"python",
"dejavu.py",
'-r',
'file',
self.test_folder + "/" + f])
if result.strip() == "None":
log_msg('No match')
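The test code above relies on file names of the form `XXXX_offset_length.mp3` to recover the song name and the clip length. A small sketch of that parsing step follows. `parse_test_filename` is a hypothetical helper, `os.path` stands in for `path_to_songname` (whose implementation is not part of this diff), and the pattern uses `[0-9]+sec` rather than the diff's `[0-9]*sec`, because the latter also matches a bare `sec` inside a name such as `second`.

```python
import os
import re


def parse_test_filename(filename):
    """Split a name like 'Song_30_5sec.mp3' into (song, length_label)."""
    # Require at least one digit before 'sec'.
    length_label = re.findall("[0-9]+sec", filename)[0]
    # Stand-in for path_to_songname: drop the directory and the extension.
    stem = os.path.splitext(os.path.basename(filename))[0]
    song = stem.split("_")[0]
    return song, length_label


print(parse_test_filename("Sean-Fournier_30_5sec.mp3"))
```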

@ -16,12 +16,19 @@ djv.fingerprint_directory("mp3", [".mp3"])
# Recognize audio from a file
from dejavu.recognize import FileRecognizer
song = djv.recognize(FileRecognizer, "mp3/Sean-Fournier--Falling-For-You.mp3")
print "From file we recognized: %s\n" % song
# Or recognize audio from your microphone for 10 seconds
# Or recognize audio from your microphone for `secs` seconds
from dejavu.recognize import MicrophoneRecognizer
song = djv.recognize(MicrophoneRecognizer, seconds=2)
secs = 5
song = djv.recognize(MicrophoneRecognizer, seconds=secs)
if song is None:
print "Nothing recognized -- did you play the song out loud so your mic could hear it? :)"
else:
print "From mic with %d seconds we recognized: %s\n" % (secs, song)
# Or use a recognizer without the shortcut, in any way you would like
from dejavu.recognize import FileRecognizer
recognizer = FileRecognizer(djv)
song = recognizer.recognize_file("mp3/Josh-Woodward--I-Want-To-Destroy-Something-Beautiful.mp3")
print "No shortcut, we recognized: %s\n" % song

9
requirements.txt Normal file

@ -0,0 +1,9 @@
# requirements file
### BEGIN ###
pydub>=0.9.4
PyAudio>=0.2.7
numpy>=1.8.2
scipy>=0.12.1
matplotlib>=1.3.1
### END ###

@ -8,28 +8,28 @@ import shutil
usage = "usage: %prog [options] TESTING_AUDIOFOLDER"
parser = OptionParser(usage=usage, version="%prog 1.1")
parser.add_option("--secs",
action="store",
action="store",
dest="secs",
default=5,
type=int,
help='Number of seconds starting from zero to test')
parser.add_option("--results",
action="store",
action="store",
dest="results_folder",
default="./dejavu_test_results",
help='Sets the path where the results are saved')
parser.add_option("--temp",
action="store",
action="store",
dest="temp_folder",
default="./dejavu_temp_testing_files",
help='Sets the path where the temp files are saved')
parser.add_option("--log",
action="store_true",
action="store_true",
dest="log",
default=True,
help='Enables logging')
parser.add_option("--silent",
action="store_false",
action="store_false",
dest="silent",
default=False,
help='Disables printing')
@ -38,13 +38,13 @@ parser.add_option("--log-file",
default="results-compare.log",
help='Set the path and filename of the log file')
parser.add_option("--padding",
action="store",
action="store",
dest="padding",
default=10,
type=int,
help='Number of seconds to pad choice of place to test from')
parser.add_option("--seed",
action="store",
action="store",
dest="seed",
default=None,
type=int,
@ -62,27 +62,27 @@ except:
os.mkdir(options.results_folder)
# set logging
if options.log == True:
logging.basicConfig(filename=options.log_file, level=logging.DEBUG)
if options.log:
logging.basicConfig(filename=options.log_file, level=logging.DEBUG)
# set test seconds
test_seconds = ['%dsec' % i for i in range(1, options.secs + 1, 1)]
# generate testing files
for i in range(1, options.secs + 1, 1):
generate_test_files(test_folder, options.temp_folder,
i, padding=options.padding)
generate_test_files(test_folder, options.temp_folder,
i, padding=options.padding)
# scan files
log_msg("Running Dejavu fingerprinter on files in %s..." % test_folder,
log=options.log, silent=options.silent)
log=options.log, silent=options.silent)
tm = time.time()
djv = DejavuTest(options.temp_folder, test_seconds)
log_msg("finished obtaining results from dejavu in %s" % (time.time() - tm),
log=options.log, silent=options.silent)
log=options.log, silent=options.silent)
tests = 1 # djv
tests = 1 # djv
n_secs = len(test_seconds)
# set result variables -> 4d variables

60
setup.py Normal file

@ -0,0 +1,60 @@
from setuptools import setup, find_packages
# import os, sys
def parse_requirements(requirements):
# load from requirements.txt
with open(requirements) as f:
lines = [l for l in f]
# remove spaces
stripped = map((lambda x: x.strip()), lines)
# remove comments
nocomments = filter((lambda x: not x.startswith('#')), stripped)
# remove empty lines
reqs = filter((lambda x: x), nocomments)
return list(reqs)  # filter() returns an iterator on Python 3
PACKAGE_NAME = "PyDejavu"
PACKAGE_VERSION = "0.1.2"
SUMMARY = 'Dejavu: Audio Fingerprinting in Python'
DESCRIPTION = """
Audio fingerprinting and recognition algorithm implemented in Python
See the explanation here:
`http://willdrevo.com/fingerprinting-and-audio-recognition-with-python/`__
Dejavu can memorize recorded audio by listening to it once and fingerprinting
it. Then, given a song played into the microphone or an audio file on disk,
Dejavu attempts to match the audio against the fingerprints held in the
database, returning the song or recording being played.
__ http://willdrevo.com/fingerprinting-and-audio-recognition-with-python/
"""
REQUIREMENTS = parse_requirements("requirements.txt")
setup(
name=PACKAGE_NAME,
version=PACKAGE_VERSION,
description=SUMMARY,
long_description=DESCRIPTION,
author='Will Drevo',
author_email='will.drevo@gmail.com',
maintainer="Saleem Ansari",
maintainer_email="tuxdna@gmail.com",
url='http://github.com/tuxdna/dejavu',
license='MIT License',
include_package_data=True,
packages=find_packages(),
platforms=['Unix'],
install_requires=REQUIREMENTS,
classifiers=[
'Development Status :: 4 - Beta',
'Environment :: Console',
'Intended Audience :: Developers',
'License :: OSI Approved :: MIT License',
'Operating System :: OS Independent',
'Topic :: Software Development :: Libraries :: Python Modules',
],
keywords="python, audio, fingerprinting, music, numpy, landmark",
)
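The `parse_requirements` helper in `setup.py` above strips whitespace, drops `#` comments, and drops empty lines. The same filtering can be sketched with a list comprehension, which returns a real list on both Python 2 and Python 3. `parse_requirement_lines` is a hypothetical name, and it takes lines rather than a file path so that it can be tried without a file on disk; the sample text is the `requirements.txt` content from this commit.

```python
def parse_requirement_lines(lines):
    """Return requirement specifiers, skipping blanks and '#' comments."""
    stripped = (line.strip() for line in lines)
    return [line for line in stripped if line and not line.startswith('#')]


sample = """# requirements file
### BEGIN ###
pydub>=0.9.4
PyAudio>=0.2.7
numpy>=1.8.2
scipy>=0.12.1
matplotlib>=1.3.1
### END ###
"""

reqs = parse_requirement_lines(sample.splitlines())
print(reqs)
```

With a file, the call would be `parse_requirement_lines(open("requirements.txt"))`, ideally inside a `with` block as in the original helper.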

@ -8,7 +8,7 @@ rm -rf ./results ./temp_audio
###########
# Fingerprint files of extension mp3 in the ./mp3 folder
python dejavu.py fingerprint ./mp3/ mp3
python dejavu.py -f ./mp3/ mp3
##########
# Run a test suite on the ./mp3 folder by extracting 1, 2, 3, 4, and 5