feat: checksums for all manifest download urls #423

Open
swarnimarun opened this issue Oct 26, 2023 · 2 comments

Comments

@swarnimarun
Contributor

Feature/Goal

Provide checksums in manifests for all downloadable URLs, e.g.:

"binaries": {
    "aarch64-apple-darwin": {
        "url": "https://github.com/binary/bin/release/download/.../binary_mac",
        "checksum": "835acc0ae8636450bb69b257d56fbb4160d84bcf"
    },
    "arm-linux-gnu": {
        "url": "https://github.com/binary/bin/release/download/.../binary_linux",
        "checksum": "4452d71687b6bc2c9389c3349fdc17fbd73b833b"
    }
}

Then verify the existing or freshly downloaded binary against the checksum, to ensure that the correct version is present and that we haven't downloaded the wrong binary.
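A minimal sketch of that verification step, in Python. The issue does not name a hash algorithm, so the `algorithm` parameter (defaulting to SHA-256) is an assumption, and `file_digest` and `matches_manifest` are hypothetical helper names, not part of any existing Prem code.

```python
import hashlib
import hmac
from pathlib import Path


def file_digest(path: Path, algorithm: str = "sha256") -> str:
    """Return the hex digest of a file, reading it in 1 MiB chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_manifest(path: Path, entry: dict, algorithm: str = "sha256") -> bool:
    """True if the file exists and its digest equals entry["checksum"].

    `entry` is one value of the "binaries" object, i.e. a dict with
    "url" and "checksum" keys as proposed in this issue.
    """
    if not path.is_file():
        return False
    actual = file_digest(path, algorithm)
    return hmac.compare_digest(actual, entry["checksum"].lower())
```

The same check would cover both cases mentioned above: a binary that is already on disk, and one that has just been downloaded.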

Motivation

Currently we don't verify that downloaded artifacts are the correct version, or even that they are the expected binary at all. This can cause a few issues:

  • The locally present binary doesn't match what the manifest requests even if the binary names are the same.

  • The binary behind the URL may have been changed or updated, so the version downloaded no longer matches the version that was tested when the service manifest was written.

  • Or the URL now points to a malicious binary, for whatever reason, and we don't want that binary to be executed.

FAQ

Who is in charge of generating the checksum: the service developer, or an internal GitHub Action in the registry?

  • Developers should be in charge, as our GitHub Actions would have the same problem: they cannot identify version drift/mismatch or a malicious binary either.

Do we plan to provide tools for automating this process?

  • We have discussed providing a prem-cli for developing services and building manifests with all these fields filled in automatically.
    But we haven't finalized it internally yet.

Do we want to be relaxed and still download services without checksum, simply showing in UI as “dangerous” because not verified?

  • Downloading a compromised binary is not a problem as long as we never execute it. We could perhaps download with the least permissions into a safe/temp directory first(?). Also, we can't verify the checksum of a file before downloading it.
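The download-then-verify idea in the last answer could look roughly like the sketch below. It is only one possible reading: `fetch_verified` is a hypothetical name, SHA-256 is an assumed default, and the final `0o755` mode is an assumption about what an installed binary needs. The file is written through `tempfile.mkstemp`, which creates it readable and writable by the owner only (no execute bit), and it is moved into place and made executable only after the digest matches.

```python
import hashlib
import os
import shutil
import tempfile
import urllib.request
from pathlib import Path


def fetch_verified(url: str, expected_hex: str, dest: Path,
                   algorithm: str = "sha256") -> Path:
    """Download `url` to a non-executable temp file, verify it, then install it."""
    digest = hashlib.new(algorithm)
    fd, tmp_name = tempfile.mkstemp(prefix="unverified-")
    try:
        # Hash while streaming so the file is read only once.
        with os.fdopen(fd, "wb") as tmp, urllib.request.urlopen(url) as response:
            while chunk := response.read(1 << 20):
                digest.update(chunk)
                tmp.write(chunk)
        if digest.hexdigest() != expected_hex.lower():
            raise ValueError(f"checksum mismatch for {url}")
        shutil.move(tmp_name, dest)
        os.chmod(dest, 0o755)  # grant execute permission only after verification
        return dest
    finally:
        # On any failure the unverified file is discarded.
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
```

With this shape, an unverified binary never exists at its final path and never carries an execute bit.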

Does this apply to all download urls?

  • Yes, this could also apply to any models we download that aren't "safetensors" or one of the other safe model formats. (We will also be working on proper guidelines around safety for local execution.)
@tiero
Contributor

tiero commented Oct 27, 2023

I would start by doing the following, @swarnimarun:

  1. Edit the manifest as you suggest (adding an object with both URL and checksum) in your own fork of prem-registry. A small suggestion: add a signature field right away, which may come in handy in the future if we want to guarantee cryptographically signed builds.

  2. Bump the version to 1.1, so the App can gracefully handle the new JSON spec.

  3. Provide a bash implementation to serve as the "specification" for generating checksums on macOS and Linux. This can become a simple GitHub Action as well.
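As an illustration of what such a generator has to produce, here is a rough sketch. It is written in Python rather than the bash proposed above, purely to show the output shape. The `url` and `checksum` keys come from the issue; the `signature` and `version` key names, the SHA-256 default, and the `build_manifest_fragment` helper are all assumptions.

```python
import hashlib
import json
from pathlib import Path


def build_manifest_fragment(artifacts: dict[str, tuple[str, Path]],
                            algorithm: str = "sha256") -> dict:
    """Build a manifest fragment from {target: (download_url, local_file)}."""
    binaries = {}
    for target, (url, path) in sorted(artifacts.items()):
        checksum = hashlib.new(algorithm, path.read_bytes()).hexdigest()
        binaries[target] = {
            "url": url,
            "checksum": checksum,
            "signature": None,  # placeholder for future signed builds (assumed key name)
        }
    # "version" is an assumed key name for the bumped spec version.
    return {"version": "1.1", "binaries": binaries}


def to_json(fragment: dict) -> str:
    """Serialize the fragment the way it might be pasted into a manifest."""
    return json.dumps(fragment, indent=4)
```

A developer would run something like this against the locally built artifacts before publishing, so the recorded checksum reflects the binary that was actually tested.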

BONUS

Let's take this opportunity to introduce a JSON Schema (https://json-schema.org) so we can maintain the manifest easily.
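A possible starting point for such a schema, expressed as a Python dict so it can be dumped to JSON. This is a sketch under assumptions: it covers only the "binaries" object from this issue, treats `checksum` as a required hex string, and uses an assumed `signature` key that may be a string or null. The real manifest has other fields that are not modelled here.

```python
import json

# Hypothetical schema for the "binaries" section only (JSON Schema draft 2020-12).
BINARIES_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "required": ["binaries"],
    "properties": {
        "binaries": {
            "type": "object",
            # Each key is a target triple such as "aarch64-apple-darwin".
            "additionalProperties": {
                "type": "object",
                "required": ["url", "checksum"],
                "properties": {
                    "url": {"type": "string", "format": "uri"},
                    "checksum": {"type": "string", "pattern": "^[0-9a-fA-F]+$"},
                    "signature": {"type": ["string", "null"]},
                },
            },
        },
    },
}

if __name__ == "__main__":
    print(json.dumps(BINARIES_SCHEMA, indent=2))
```

A validator such as the third-party `jsonschema` package could consume the dumped schema in CI. If checksums end up optional for some URLs, `checksum` would simply move out of `required`.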

@casperdcl
Contributor

casperdcl commented Oct 27, 2023

for OSS services we control, I very very strongly suggest using URLs with built-in checksums instead, e.g. https://github.com/premAI-io/prem-services/releases/download/v1/cht-llama-cpp-mistral-1.1.2-aarch64-apple-darwin rather than https://github.com/premAI-io/prem-services/releases/download/v1/cht-llama-cpp-mistral-1-aarch64-apple-darwin, because

  • they're OSS and https: and made by us, so we know we can trust them
  • no annoying process of copy-pasting hashes whenever we release a new service

For external URLs, sure we can have an (optional) checksum field.
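One way to read this proposal as a policy, sketched in Python. The two example URLs above differ in that the first carries a full version string (`1.1.2`) in the file name while the second carries only `1`; the sketch treats a first-party release URL with such a full version as pinned and requires a checksum for everything else. The function names, the regular expression, and the decision to key on the `premAI-io/prem-services` release path are all assumptions, not an agreed rule.

```python
import re
from urllib.parse import urlparse

# Assumed first-party location, taken from the example URLs in this thread.
FIRST_PARTY_PREFIX = "/premAI-io/prem-services/releases/download/"
# Assumed shape of a fully pinned version inside the file name, e.g. "-1.1.2-".
PINNED_VERSION = re.compile(r"-\d+\.\d+\.\d+-")


def is_pinned_first_party(url: str) -> bool:
    """True for https release URLs of ours whose file name has a full version."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or parsed.netloc != "github.com":
        return False
    if not parsed.path.startswith(FIRST_PARTY_PREFIX):
        return False
    filename = parsed.path.rsplit("/", 1)[-1]
    return PINNED_VERSION.search(filename) is not None


def checksum_required(entry: dict) -> bool:
    """External or unpinned URLs need a checksum; pinned first-party ones may omit it."""
    return not is_pinned_first_party(entry["url"])
```

Whether a pinned first-party URL should still be allowed to carry a checksum is left open; the sketch only decides when one is mandatory.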
