GitHub - steffenfritz/FileTrove: FileTrove indexes files and creates metadata from them.

STATUS: Development

VERSION: v1.0.0-DEV-16

NOTES:

Release DEV-16 is the last DEV release. The next one will be a feature freeze BETA version. (2024-04-23)
Preparing FileTrove for BETA release by working on OpenSSF issues: https://scorecard.dev/viewer/?uri=github.com/steffenfritz/FileTrove (2024-05-12)

About

FileTrove indexes files and creates metadata from them.

The single binary application walks a directory tree and identifies all regular files by type with siegfried, giving you the

MIME type
PRONOM identifier
Format version
Identification proof and note
filename extension

os.Stat() gives you the

File size
File creation time
File modification time
File access time
and the same for directories
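
As a rough illustration of the os.Stat() step, the sketch below reads the fields that Go's portable os.FileInfo interface exposes. The type FileInfoSummary and the function statSummary are hypothetical names, not FileTrove's own. Creation and access times are not part of os.FileInfo; they need platform-specific data (for example via FileInfo.Sys()) or a helper library, so they are left out here.

```go
package main

import (
	"fmt"
	"os"
	"time"
)

// FileInfoSummary holds the portable subset of metadata that os.Stat exposes.
// The type and its field names are illustrative, not taken from FileTrove.
type FileInfoSummary struct {
	Name    string
	Size    int64
	ModTime time.Time
	IsDir   bool
}

// statSummary calls os.Stat on path and copies the portable fields.
func statSummary(path string) (FileInfoSummary, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return FileInfoSummary{}, err
	}
	return FileInfoSummary{
		Name:    fi.Name(),
		Size:    fi.Size(),
		ModTime: fi.ModTime(),
		IsDir:   fi.IsDir(),
	}, nil
}

func main() {
	s, err := statSummary(".")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%s size=%d modified=%s dir=%t\n",
		s.Name, s.Size, s.ModTime.Format(time.RFC3339), s.IsDir)
}
```

The same call works for files and directories, which matches the list above.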

Furthermore it creates and calculates

UUIDv4s as unique identifiers (not stable across sessions)
hash sums (md5, sha1, sha256, sha512 and blake2b-512)
the entropy of each file (up to 1GB)

It also extracts some EXIF metadata, and you can add your own Dublin Core Elements metadata to scans.

FileTrove also checks if the file is in the NSRL (https://www.nist.gov/itl/ssd/software-quality-group/national-software-reference-library-nsrl).

This check requires a 4.0 GB BoltDB database, which FileTrove can download during installation.

You can also create your own database for the NSRL check. You just need a text file with SHA1 hashes, one per line, and the tool admftrove from this repository. With this tool you can also add your own hashes to an existing database.

All results are written into a SQLite database and can be exported to TSV files.

How to install

Download a release from https://github.com/steffenfritz/FileTrove/releases or compile from source (using task build in cmd/ftrove (https://taskfile.dev)).
Copy the file to where you want to install ftrove. (The downloaded file has a suffix, which is omitted in the following documentation.)
Run ./ftrove --install . (Mind the period)

a) If you don't already have an NSRL database, you have to download it. Please be patient.

b) If you already have an NSRL database, copy or move it to the "db" directory that ftrove just created.
You are ready to go!

How to run

./ftrove -h gives you all flags ftrove understands.

A run with only the required flags looks like this:

./ftrove -i $DIRECTORY

where $DIRECTORY is the directory you want to use as a starting point. FileTrove will walk this directory recursively.

How to see the results

You can export the results via ./ftrove -t $UUID where $UUID is the session id. Every indexing run gets its own session id. You get a list of all sessions using ./ftrove -l.

Example:

./ftrove -l
./ftrove -t 926be141-ab75-4106-8236-34edfcf102f2

This will create several TSV files that can be read with Excel, Numbers, or your preferred text editor.

You can also work with SQL on the database, using sqlite on the console or a GUI like sqlitebrowser (https://sqlitebrowser.org/). Sqliteviz is also a neat tool to visualize the data (https://sqliteviz.com/app/#/).

Background

FileTrove is the successor of filedriller and is based on my iPres 2021 paper "Marrying siegfried and the National Software Reference Library".
