Releases: MTG/essentia
Essentia 2.1 beta5
Essentia 2.1 beta5 is our current preliminary version of the forthcoming 2.1 release. This pre-release includes the following changes:
- Algorithms updates and bug-fixes
- Fix the `slaneyMel` scale implementation in MelBands and MFCC (#849). Introduced in 2.1-beta4, it was erroneously computing the HTK Mel scale. Set `htkMel` as the default scale to ensure backward compatibility with all previous versions of MelBands/MFCC.
- New option `unit_tri` for triangle area normalization in MelBands, MFCC, and TriangularBands.
- New parameter `silenceThreshold` in MFCC and GFCC. Set default threshold to `1e-10` (#543).
- TriangularBands: faster unit-sum normalization and an improved check for insufficient spectrum resolution (#142).
- ConstantQ and the related Chromagram and SpectrumCQ are reimplemented from scratch and now function correctly. The `maxFrequency` parameter is replaced by `numberBins`.
- New `negativeFrequencies` parameter in FFTC to include negative frequencies in the output.
- New `normalize` parameter for IFFT size normalization.
- FFTC now supports KissFFT and Accelerate.
- PoolAggregator: new aggregation method `last` to get the last value. Fix possible nan/inf values in kurtosis and skewness (#689). Apply aggregation for pool values that contain only one vector too.
- New `checkRange` parameter in Trimmer and StereoTrimmer.
- PitchFilter: improve consistency between input and output stream types (#674).
- PitchMelodia: fix missing output `pitchConfidence` in streaming mode.
- MultiPitchMelodia: `peakFrameThreshold` and `peakFrameThreshold` parameters now work correctly (they were overridden by hardcoded values).
- New `tolerance` parameter in PitchYinFFT. When the pitch confidence is lower than the tolerance value the output pitch is set to 0. A `tolerance` of 1 disables this feature.
- Fix occasional negative values output by Danceability (#483).
- LoudnessEBUR128:
  - Fix memory leaks and warnings on empty input. Set a larger internal buffer size to avoid buffer resizes.
  - New parameter `startFromZero` to zero-center the first window for loudness estimation.
- Fix a memory leak in AudioLoader.
- BeatTrackerDegara output is now deterministic (#860).
- ChordsDetectionBeats: add new parameter `chromaPick` and fix a beat segment indexing bug in the case of very close consecutive beats.
- New `minPeakDistance` parameter in PeakDetection.
- Fix invalid memory access in PCA (#727).
- Update Key and KeyExtractor algorithms with new pitch class profiles and new parameters for detuning correction and low-energy HPCP bin thresholding. Use the new `bgate` profile by default. Add spectral whitening step to KeyExtractor. Change output key naming. Add a new function `equivalentKey` to match between equivalent names.
- Proper mutex implementation for all FFT* algorithms.
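
The `slaneyMel` fix above concerns two different mel-scale conventions. As a rough illustration, the textbook forms of both scales can be sketched in plain Python. This is not Essentia's code: the function names are hypothetical, and the constants are the commonly cited HTK and Slaney (Auditory Toolbox) values, which Essentia's implementation may or may not match exactly.

```python
import math

def hz_to_mel_htk(hz):
    """HTK-style mel scale: one logarithmic formula over the whole range."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def hz_to_mel_slaney(hz):
    """Slaney-style mel scale: linear below 1 kHz, logarithmic above."""
    hz_per_mel = 200.0 / 3.0            # slope of the linear region
    break_hz = 1000.0                   # where the scale turns logarithmic
    break_mel = break_hz / hz_per_mel   # 15 mel at the break point
    log_step = math.log(6.4) / 27.0     # 27 log-spaced steps between 1 kHz and 6.4 kHz
    if hz < break_hz:
        return hz / hz_per_mel
    return break_mel + math.log(hz / break_hz) / log_step

# The two conventions give very different numbers for the same frequency,
# so band edges computed with the wrong one end up in different places.
for hz in (100.0, 1000.0, 4000.0):
    print(hz, hz_to_mel_htk(hz), hz_to_mel_slaney(hz))
```

Because the two scales place band edges differently, a filterbank built with one convention is not interchangeable with one built using the other, which is why the default scale matters for backward compatibility.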
- New algorithms
- Invertible Constant-Q based on Non-Stationary Gabor frames: NSGConstantQ, NSGIConstantQ, NSGConstantQStreaming.
- Chromaprinter (fingerprinting) wrapper for the Chromaprint library.
- NNLSChroma and LogSpectrum (derived from the original NNLS Chroma code).
- TriangularBarkBands (more configurable than BarkBands) and BFCC (bark-frequency cepstrum coefficients).
- New algorithms for audio problems detection: ClickDetector, DiscontinuityDetector, FalseStereoDetector, GapsDetector, HumDetector, NoiseBurstDetector, SNR, SaturationDetector, StartStopCut, TruePeakDetector.
- New algorithms for probabilistic Yin (pYIN) pitch estimation: PitchYinProbabilistic, PitchYinProbabilities, PitchYinProbabilitiesHMM.
- StereoTrimmer and StereoMuxer.
- Welch (power spectral density estimation).
- New algorithm IFFTC for inverse complex STFT.
- Histogram.
- Updated music and sound feature extractors `streaming_extractor_music` and `streaming_extractor_freesound`. Both extractors are now also available as algorithms: MusicExtractor and FreesoundExtractor. New MusicExtractorSVM algorithm allows applying SVM models to the output of MusicExtractor.
- Fix possible memory leaks in MusicExtractor
- Proper logging for "out of memory" errors
- Skip aggregation for some descriptors
- Add audio `length` to metadata and remove `end_time`
- Add number of audio channels to metadata (`number_channels`)
- Better grouping of metadata related to audio analysis
- Updated key/chords estimation parameters
- Estimate key using three different key profiles (`temperley`, `krumhansl`, `edma`)
- Updated descriptors in MusicExtractor:
  - New LoudnessEBUR128 loudness descriptors
  - Add `melbands128` high-resolution melbands
  - Compute `hpcp_crest`
  - Compute `bpm_histogram`
  - New `stdev` aggregate statistics in addition to `var`
- Updated descriptors in FreesoundExtractor:
  - Add `melbands96` high-resolution melbands
  - Add `stdev` statistic
  - Remove `frequency_bands`
  - Do not output `bpm_confidence` when configured to use 'degara' for beat tracking
  - `spectral_contrast` and `scvalleys` are now called `spectral_contrast_coeffs` and `spectral_contrast_valleys` for consistency with MusicExtractor
  - `startFrame` and `stopFrame` are now called `sound_start_frame` and `sound_stop_frame`
- New extractors
- Add a new extractor for spectrograms and log-energy Mel-spectrograms (`streaming_spectrogram`).
- Python bindings updates
- Add support for Python 3.
- Update all tutorials and code examples to Python 3.
- New `essentia.pyutils` submodule provides useful functions for a number of use-cases (spectrograms, CQ-grams, batch processing with extractors, etc.)
- Fix a memory bug in Pool on an `isSingleValue` check in Python.
- Faster VECTOR_VECTOR_REAL conversion from Python types.
- Build scripts updates
- Add script for Python packaging (python.py) and wheels.
- Travis CI and build scripts for manylinux wheels.
- Update Waf to 2.0.10.
- The code is now partly C++11.
- Build flags for MSVC.
- Fixes for cross-compilation with Mingw-w64.
- Default `--prefix=$VIRTUAL_ENV` when inside a virtualenv.
- Read `PKG_CONFIG_PATH` and add new flag `--pkg-config-path` for custom lib paths.
- New flag `--only-python` to build Python extension separately from libessentia.
- Link only to libessentia when building examples.
- Generate a proper `essentia.pc` pkg-config file.
- Static builds updates.
- Replace LibAv with FFmpeg, build with muxers.
- Update Taglib version to 1.11.1, build with zlib.
- Update Gaia to 2.4.5.
- Miscellaneous
- Updated documentation, tutorials, and examples including a significant web redesign.
- Improve build scripts for documentation.
- Every algorithm page now has links to related algorithms.
- An updated list of research works using Essentia.
- New python examples.
- New QA scripts for audio problems detection and HPCPs.
- A usual assortment of code cleanup, updated and expanded unit tests, and better logging (more informative log and exception messages).
Essentia 2.1 beta4
This pre-release includes the following changes:
- Improved algorithms
- AudioLoader now supports audio sources with multiple audio streams (new parameter 'audioStream')
- PoolAggregator now outputs stdev in addition to var (#342)
- SpectralContrast: Improve precision for computation of subband bin intervals
- Danceability now also outputs a DFA exponent vector
- HPCP can now optionally apply unit sum normalization (#348)
- HPCP: 'splitFrequency' parameter is now called 'bandSplitFrequency'
- LoudnessEBUR128: Warn on empty input in the streaming mode
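
The `stdev` statistic that PoolAggregator now outputs alongside `var` is the square root of the variance. A minimal sketch in plain Python, not Essentia's implementation: the function name `aggregate` is hypothetical, and dividing by N (population variance) is an assumption about the convention used.

```python
import math

def aggregate(values):
    """Return mean, variance and standard deviation of a list of frame values.

    Assumption: population variance (divide by N), which may differ from the
    convention a given library uses.
    """
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / n
    return {"mean": mean, "var": var, "stdev": math.sqrt(var)}

# Example: a descriptor computed on four frames.
stats = aggregate([1.0, 2.0, 3.0, 4.0])
print(stats)
```

Unlike `var`, `stdev` is expressed in the same units as the descriptor itself, which makes it easier to compare against the mean.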
- Updates to Mel and ERB energy band algorithms
- Add support for extracting MelBands and MFCCs 'the htk way'
- Add support for DCT type III in DCT algorithm
- New parameter 'dctType' in DCT, MFCC and GFCC
- New 'liftering' parameter in DCT and MFCC
- New parameters 'normalize', 'type', 'scale' and 'weighting' in MelBands and MFCC
- New 'type' parameter in GFCC
- New 'logType' parameter in MFCC, GFCC
- New 'log' parameter in TriangularBands and MelBands
- ERBBands: 'type' parameter value "energy" is now called "power"
- TriangularBands is now faster
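
For readers unfamiliar with the DCT types mentioned above: with orthonormal scaling, DCT type III is the inverse of DCT type II. The sketch below shows that relationship in plain Python. It is a textbook illustration under that scaling assumption, with hypothetical function names; the scaling Essentia applies for each `dctType` may differ.

```python
import math

def _scale(k, n):
    # Orthonormal scaling factor (assumption: other conventions exist).
    return math.sqrt((1.0 if k == 0 else 2.0) / n)

def dct2(x):
    """Orthonormal DCT type II of a list of floats."""
    n = len(x)
    return [
        _scale(k, n) * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]

def dct3(c):
    """Orthonormal DCT type III, which inverts dct2 under this scaling."""
    n = len(c)
    return [
        sum(_scale(k, n) * c[k] * math.cos(math.pi * (i + 0.5) * k / n) for k in range(n))
        for i in range(n)
    ]

signal = [1.0, 2.0, 3.0, 4.0]
roundtrip = dct3(dct2(signal))
print(roundtrip)  # matches `signal` up to floating-point error
```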
- New algorithms
- SpectrumToCent for computing cent scale from frequency bins
- New algorithm IDCT for inverse DCT
- New algorithm SpectrumCQ
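
The cent scale referred to by SpectrumToCent is logarithmic: an octave spans 1200 cents relative to a reference frequency. A minimal sketch of the frequency-to-cent mapping, assuming a hypothetical helper `hz_to_cents` and a reference chosen by the caller; how SpectrumToCent selects its reference and groups bins into bands is not covered here.

```python
import math

def hz_to_cents(hz, reference_hz):
    """Distance in cents from reference_hz to hz (1200 cents per octave)."""
    return 1200.0 * math.log2(hz / reference_hz)

# Centre frequencies of a few FFT bins, mapped to cents above a 55 Hz
# reference (the reference value is an assumption made for this example).
sample_rate = 44100.0
fft_size = 2048
for bin_index in (10, 20, 40):
    hz = bin_index * sample_rate / fft_size
    print(bin_index, hz, hz_to_cents(hz, 55.0))
```

Doubling the bin index always adds 1200 cents, regardless of the reference.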
- Bug-fixes in algorithms:
- MelBands and TriangularBands: Add checks for insufficient spectrum resolution (#142)
- Fix PitchYin out of range error (#376)
- Fix Inf values in OddToEvenHarmonicEnergyRatio
- Fix reset() in LowLevelSpectralExtractor and LowLevelSpectralEqloudExtractor
- Fix occasional exception in BeatsLoudness (#199)
- Danceability: Fix NaN danceability value occurring on very short input signals
- Fix memory leak in MelBands
- Fix memory bug in Vibrato
- SpectralContrast: Force non-zero 'lowFrequencyBound' parameter to avoid division by zero (#568)
- AudioLoader: Fix memory bug on exceptions while opening an audio file in AudioLoader
- Updates to Python wrapper:
- FrameGenerator now inherits the default parameters from FrameCutter
- FrameGenerator now has a new method frame_times() to compute frame positions in time
- Fix array memory corruption when passing NumPy array views to Essentia algorithms (#240)
- Fix memory deallocation for streaming algorithms to avoid a memory leak
- Extractors:
- Freesound extractor now stores all results in json
- Logging:
- Remove colors in log messages when piped to file; do not print colors on Windows
- Build scripts updates:
- Update waf to 1.9.5
- Update script for computing algorithm dependencies
- Code cleanup and unit tests updates
- Re-designed and expanded documentation:
- Updated installation instructions
- Reorganized and improved Python tutorials. Notebook tutorials are now also rendered as html
- Updated algorithm descriptions
- Added examples of industrial applications and academic studies using Essentia
Essentia 2.1 beta3
This pre-release includes the following changes:
- Build script updates:
- Cross-compilation for iOS and Android
- Support for javascript using Emscripten
- Updated dependencies in static extractors (LibAv 11.2, Taglib 1.10)
- Fixed cross-compilation for Windows
- Homebrew formula for easy installation on OSX
- Updated Debian packaging
- All dependencies are now optional. Algorithms and examples relying on missing dependencies will be ignored.
- New flags for building lightweight versions of Essentia
  - `--lightweight=LIBS` to specify dependencies to be included
  - `--include-algos=ALGOS` and `--ignore-algos=ALGOS` to specify algorithms to be included
- New algorithms:
- SuperFlux algorithm for real-time onset detection (SuperFluxExtractor, SuperFluxNovelty)
- Algorithms for sound modeling
- Overlap-add (OverlapAdd)
- Sine model analysis/synthesis (SineModelAnal, SineModelSynth)
- Sine subtraction (SineSubtraction)
- Sinusoidal plus Residual model analysis/synthesis (SprModelAnal, SprModelSynth)
- Melody Analysis (monophonic/predominant)
- HarmonicMask
- Signal resampling (ResampleFFT)
- New pitch-related algorithms
- Multi-pitch estimation in polyphonic music (MultiPitchKlapuri, MultiPitchMelodia)
- Adaptation of Melodia algorithm for monophonic signals (PitchMelodia)
- Yin pitch detection algorithm (PitchYin)
- Pitch contour segmentation into notes (PitchContourSegmentation)
- Vibrato detection (Vibrato)
- BPM estimation on loops (PercivalEnhanceHarmonics, PercivalEvaluatePulseTrains, LoopBpmConfidence, LoopBpmEstimator, PercivalBpmEstimator)
- STFT on complex inputs (FFTC)
- ConstantQ and Chromagram (still in experimental stage)
- TriangularBands
- Lightweight spectral centroid implementation (SpectralCentroidTime)
- Chords detection on beat segments (ChordsDetectionBeats)
- VectorRealAccumulator
- Improved algorithms:
- LoudnessEBUR128 algorithms are now finalized (includes bug-fixes)
- FFT now supports KissFFT and Accelerate FFT libraries as an alternative to FFTW
- New profiles for Key estimation (including profiles for electronic music)
- New 'generalized' parameter in Autocorrelation algorithm
- New 'scale' and 'shift' parameters in UnaryOperator algorithm
- New 'normalized' parameter in Windowing algorithm
- New 'inputSize' parameter in GFCC algorithm
- Added support for 8kHz for EqualLoudness algorithm
- LogAttackTime now outputs attack times
- BpmHistogramDescriptors now outputs a complete histogram
- ChordsDescriptors now throws exception on incorrect chords
- Refactored AudioLoader and AudioWriter algorithms. Use libavresample, remove support for libswresample
- Rename PitchFilterMakam to PitchFilter. Allow filtering negative energy values. Remove optional 'octaveFilter' parameter
- Rename PredominantMelody algorithm to PredominantPitchMelodia
- Bug-fixes:
- Fix wrong behavior of HarmonicPeaks that was indirectly affecting results in HPCP, Key, Tristimulus and OddToEvenHarmonicEnergyRatio
- Fixed filter coefficients in BandReject and BandPass
- Fixed weightings in NoveltyCurve
- Different key profiles in Key streaming algorithm now work correctly
- Bug fixes in Envelope, TonicIndianArtMusic, RhythmExtractor2013, PitchYinFFT, BpmHistogramDescriptors, ReplayGain streaming
- Updated extractors (including Freesound extractor)
- Improved documentation
- Fresh new design
- Algorithms are now organized by categories.
- Improved and rewritten algorithm descriptions
- New python examples and tutorials
- More minor fixes, improvements and code cleanup
- Updated unit tests. Audio files for tests are now hosted in a separate repository
Known issues:
- Some unit tests fail (#316)
Essentia 2.1 beta2
Changes:
- Build scripts updates:
- New scripts for static builds on Linux, OSX and (cross-compilation) Windows
- New flag `--with-example` to build only specific examples
- New git commit SHA hash value accessible via Essentia library API for better versioning
- Algorithm updates:
- AudioLoader now outputs codec and bitrate, and computes md5 hash values over undecoded audio
- MetadataReader now uses new TagLib 1.9 API and is able to read any tags
- YamlInput now supports json
- New Entropy algorithm
- EffectiveDuration now accepts a threshold parameter
- Fixed incorrect computation of onset rate in OnsetRate
- New algorithm LoudnessEBUR128 for measuring loudness according to the EBU R128 standard (still in experimental stage)
- New BinaryOperator algorithm
- PitchYinFFT algorithm now includes peak interpolation
- Revised and updated extractors:
- Revised, refactored and expanded music extractor (streaming_extractor_music) including new functionality and descriptors
- Updated Freesound extractor, including new descriptors
- Some updates in core Essentia code
- Updated documentation and examples
- Bugfixes and unit tests updates
Dependencies: Libav 9, Taglib 1.9
Ubuntu/Debian Libav/Taglib compatibility:
- Debian Jessie - the required package versions are already in the repository
- Debian Wheezy - install libav/libtag1-dev packages from wheezy-backports repository
- libav 6:10.1
- libtag1-dev 1.9.1
- Ubuntu Trusty (14.04 LTS), Utopic (14.10) and Vivid (15.04) - the required package versions are already in the repository
Essentia 2.0.1
Essentia 2.0.1:
- Added pre-trained high-level classifier models for genres, moods, rhythm and instrumentation (to be used with the `streaming_extractor_archivemusic` extractor)
- Fixed scheduler in streaming mode
- Fixed compilation with clang/libc++/c++11
- PitchYinFFT now supports parabolic interpolation
- Updated Vamp plugin
- Updated documentation and tutorials
- Minor bugfixes, more unittests, etc.
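
Parabolic interpolation, mentioned above for PitchYinFFT, refines a peak location beyond the resolution of the sampled bins by fitting a parabola through the peak sample and its two neighbours. The sketch below shows the generic technique in plain Python; the function name is hypothetical and this is not PitchYinFFT's actual code.

```python
def parabolic_peak(values, i):
    """Refine the peak at index i (0 < i < len(values) - 1).

    Fits a parabola through the samples at i - 1, i and i + 1 and returns
    (interpolated_position, interpolated_height).
    """
    a, b, c = values[i - 1], values[i], values[i + 1]
    denom = a - 2.0 * b + c
    if denom == 0.0:
        # The three points are collinear: no curvature to exploit.
        return float(i), b
    offset = 0.5 * (a - c) / denom
    return i + offset, b - 0.25 * (a - c) * offset

# Samples of a parabola whose true maximum lies between indices 2 and 3.
samples = [-(x - 2.3) ** 2 + 5.0 for x in range(5)]
position, height = parabolic_peak(samples, 2)
print(position, height)
```

For a pitch estimator, the refined position translates into a frequency estimate that is not restricted to the grid of bin centres.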
For post-release bugfixes (including Ubuntu 14.04 compatibility) use the 2.0.1 branch.
Ubuntu/Debian Libav compatibility:
- Debian Wheezy - libav 6:0.8.17
- Ubuntu Precise (12.04 LTS) - libav 4:0.8.17
- Ubuntu Trusty (14.04 LTS) - libav 6:9.18
Essentia 2.0
- First release to be publicly available as free software released under AGPLv3
- Refactoring of the core API
  - Fixed small API annoyances in the standard mode
  - Refactored the streaming mode. It is now much better defined, using sound computer science techniques (the visible network is a directed acyclic graph, the composites have better-defined semantics, and the order of execution of the algorithms is the topological sort of the transitive reduction of the visible network after the composites have been expanded). In particular, the scheduler that runs the algorithms in streaming mode is now much more correct, which made it possible to remove the small hacks that had accumulated in the algorithms themselves during the 1.x releases to compensate for the deficiencies of the initial scheduler.
- New algorithms for onset detection, beat tracking and melody extraction
- New and updated features extractors
- Updated Vamp plugin
- Much better documentation, more python examples
- Bugfixes, more unittests, etc.
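
The streaming-mode notes above say that algorithms run in the topological order of a directed acyclic network. As a rough illustration of that idea only, here is Kahn's algorithm applied to a toy network in plain Python. The function name and the example graph are hypothetical, and the sketch does not perform the transitive reduction or composite expansion that the real scheduler is described as using.

```python
from collections import deque

def topological_order(edges):
    """Return the nodes of a DAG so that every node comes after its inputs.

    `edges` maps a node name to the list of nodes it feeds.
    """
    nodes = set(edges)
    for targets in edges.values():
        nodes.update(targets)
    indegree = {node: 0 for node in nodes}
    for targets in edges.values():
        for target in targets:
            indegree[target] += 1
    # Start from the sources (nodes with no inputs); sorted for repeatability.
    ready = deque(sorted(node for node in nodes if indegree[node] == 0))
    order = []
    while ready:
        node = ready.popleft()
        order.append(node)
        for target in edges.get(node, ()):
            indegree[target] -= 1
            if indegree[target] == 0:
                ready.append(target)
    if len(order) != len(nodes):
        raise ValueError("network contains a cycle")
    return order

# Toy network loosely modelled on an audio analysis chain (illustrative only).
network = {
    "Loader": ["FrameCutter"],
    "FrameCutter": ["Windowing"],
    "Windowing": ["Spectrum"],
    "Spectrum": ["MelBands", "MFCC"],
}
print(topological_order(network))
```

Requiring the network to be acyclic is what guarantees that such an order exists; a cycle is reported as an error rather than scheduled.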
For post-release bugfixes use the 2.0 branch.
Ubuntu/Debian Libav compatibility:
- Debian Wheezy - libav 6:0.8.17
- Ubuntu Precise (12.04 LTS) - libav 4:0.8.17
- Ubuntu Trusty (14.04 LTS) - libav 6:9.18