A recent libmagic regression (detected on Gentoo and Arch) appears to be causing webm files to be incorrectly identified. If you have them in your mono-collection, now might be a good time to ask for a patrolling read against your by-id index.

We have received some complaints that the *nix binaries are built against a WAY too new glibc, so they will now be built on the latest release of Debian instead of bleeding-edge Gentoo.

Breaking Changes

Risk: moderate. Deprecated source_* parameters have been dropped
- This affects qualifier expressions of all stages of the pipeline
- This also affects transform argument generation
Risk: moderate. Store qualifiers and path generation no longer bind file_* attributes (except for file_extension)
- Offering files to stores is a self-contained process. Hoppers can be configured to invoke this process automatically after certain files are ingested, but should not change it. Conveying extra information when it is auto-invoked by hoppers is contrary to this design
- If we need per-file attributes, let's design them properly as opposed to hacking pieces onto two colocated features

New Features

Added support for inline named capture groups in regex
- Realized through the PCRE2 library
- Yes, these are still applied at a lower precedence than named constants
- Yes, this means we now support match-specific group attributes
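As a rough illustration of the precedence rule above, here is a minimal sketch in Python. It uses Python's `re` module rather than PCRE2, and the `NAMED_CONSTANTS` table, the `group_attributes` helper, and the sample pattern are all hypothetical names invented for this example. It assumes "lower precedence" means a named constant wins when a capture group has the same name.

```python
import re

# Hypothetical named constants (not taken from the project). Per the notes,
# these are assumed to take precedence over captured groups of the same name.
NAMED_CONSTANTS = {"artist": "unknown"}

def group_attributes(pattern: str, value: str) -> dict:
    """Collect named capture groups as attributes, letting constants win."""
    match = re.search(pattern, value)
    if match is None:
        return {}
    # Drop groups that did not participate in the match.
    captured = {k: v for k, v in match.groupdict().items() if v is not None}
    # Constants are merged last so they override captures of the same name.
    return {**captured, **NAMED_CONSTANTS}

# (?P<name>...) is the named-group syntax Python's re accepts; PCRE2 accepts it too.
attrs = group_attributes(r"(?P<artist>[a-z]+)_(?P<page>\d+)", "alice_042.png")
print(attrs)  # {'artist': 'unknown', 'page': '042'}
```

Here `page` comes from the match itself, while `artist` is overridden by the constant.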
Regex qualifiers now support minimum match length thresholds
- The new value format for the include config directive is PROPERTY /EXPRESSION/FLAGS THRESHOLD
- e.g. require the expression to match at least 50% of the value: include = x /\d+/ 50%
- e.g. require the expression to match at least 12 characters: include = x /\d+/ 12
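The threshold semantics can be sketched as follows. This is only an approximation under stated assumptions: `parse_include` and `meets_threshold` are hypothetical helpers, the parsing regex is a guess at the directive grammar, FLAGS are parsed but ignored, and the match length is taken from the first match found by Python's `re.search` rather than by PCRE2.

```python
import re

def parse_include(value: str):
    """Split 'PROPERTY /EXPRESSION/FLAGS THRESHOLD' into its parts (assumed grammar)."""
    m = re.fullmatch(r"(\S+)\s+/(.*)/([a-z]*)(?:\s+(\d+)(%?))?", value.strip())
    if m is None:
        raise ValueError(f"unrecognised include value: {value!r}")
    prop, expr, flags, amount, percent = m.groups()
    # The threshold is optional: None means any match is enough.
    threshold = None if amount is None else (int(amount), percent == "%")
    return prop, expr, flags, threshold

def meets_threshold(expr: str, value: str, threshold) -> bool:
    """Check that the first match of expr covers enough of value."""
    match = re.search(expr, value)
    if match is None:
        return False
    if threshold is None:
        return True
    amount, is_percent = threshold
    matched = match.end() - match.start()
    if is_percent:
        # Integer comparison avoids floating point rounding at the boundary.
        return matched * 100 >= amount * len(value)
    return matched >= amount

_, expr, _, half = parse_include(r"x /\d+/ 50%")
print(meets_threshold(expr, "12345abc", half))  # True: 5 of 8 characters match
print(meets_threshold(expr, "12abcdef", half))  # False: 2 of 8 characters match
```

A bare number such as `12` is treated as an absolute character count, and a trailing `%` switches to a share of the whole value.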

Behavior Changes

Workflows resumed through WIP files now bypass hopper evaluation
- WIP files now contain group attributes as well as workflow parameters, allowing manual touch ups
Store qualifiers and path generation now bind file_extension from the file identification process instead of copying it verbatim from the imported file's path
Order assignment now sorts all files by name length, then by character codes
- This ensures semantically correct order for variable length numbers in file names: 0, 1, 2, 3, 10, 11 instead of 0, 1, 10, 11, 2, 3 (the order without length factoring)
- Another happy coincidence is this tends to cluster together similarly named files
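The length-then-character-code ordering is easy to reproduce. The sketch below is not the project's code: `order_key` is a hypothetical helper, and it relies on Python comparing strings by code point.

```python
def order_key(name: str):
    """Sort key: shorter names first, ties broken by character codes."""
    return (len(name), name)

names = ["10", "2", "0", "11", "3", "1"]

plain = sorted(names)                     # character codes only
by_length = sorted(names, key=order_key)  # length first, then character codes

print(plain)      # ['0', '1', '10', '11', '2', '3']
print(by_length)  # ['0', '1', '2', '3', '10', '11']

# The same rule applied to whole file names sharing a prefix:
pages = sorted(["page10.png", "page2.png"], key=order_key)
print(pages)      # ['page2.png', 'page10.png']
```

This only yields numeric order when the names differ solely in the variable-length number, as in the examples here.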

Performance

Removed extraneous memory allocations from INI parsing
Removed unnecessary memory allocations for attribute matching at the cost of a bit of short-lived heap fragmentation
Time complexity of matching files has been improved from O(m log n) to O(m + n)
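The notes do not say how the matching was reworked. Purely as an illustration of how a drop from m log(n) to m + n can come about, the sketch below swaps a per-query binary search over a sorted index for a hash set built once. Both function names are hypothetical, and the m + n bound for the second version holds in expectation.

```python
from bisect import bisect_left

def match_binary_search(queries, sorted_index):
    """m binary searches over n sorted entries: O(m log n)."""
    hits = []
    for q in queries:
        i = bisect_left(sorted_index, q)
        if i < len(sorted_index) and sorted_index[i] == q:
            hits.append(q)
    return hits

def match_hash(queries, index):
    """Build a set once (n), then m constant-time lookups: expected O(m + n)."""
    known = set(index)
    return [q for q in queries if q in known]

index = sorted(["a.png", "b.png", "c.webm", "d.jpg"])
queries = ["c.webm", "z.gif", "a.png"]

print(match_binary_search(queries, index))  # ['c.webm', 'a.png']
print(match_hash(queries, index))           # ['c.webm', 'a.png']
```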

Bug Fixes

Reduced FFmpeg warning spam when dealing with JPEG files
- A side effect of this change is that phash has started producing slightly different results
- So do not be alarmed if you see a lot of phash corrections while patrolling by-id

0.4.0

Breaking Changes

New Features

Behavior Changes

Performance

Bug Fixes