Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Infer additional secret properties in engine, from schemata #16187

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

lunaris
Copy link
Contributor

@lunaris lunaris commented May 13, 2024

This commit modifies calls to RegisterResource so that they load resource schemata, enabling the engine to modify resource properties according to the schema instead of the caller (e.g. the SDK). As part of this commit, the engine will now extend additionalSecretOutputs options with Secret properties it finds in the schema. In a later commit, this will allow us to stop generating SDK code to set these properties correctly client-side. The schema may be used in subsequent pieces of work to handle other concerns, such as defaults, though this is not tackled in this commit.

Note that for this to even work at all, we need to cache provider loads when providers are loaded for the purpose of schema lookup. Ordinarily, provider loads are not cached -- if a program instantiates the same name/version of a provider twice with different parameters, for instance, we want those instantiations to yield two separate, independent instances. However, when loading schema, not only does this not matter (two providers will have the same schema if and only if they share the same name/version), it's of critical importance that we cache provider loads by name/version, lest we instantiate a provider for every resource in the program that depends on it. This commit thus introduces such a cache to the pluginLoader backing schema loads (which thankfully is separate from the means we use to load providers during program execution).

Note that in a program context, this may mean we have a number of potentially avoidable provider loads: if there are P distinct providers in a program, we will perform 2P loads where we previously performed P. We are betting that this will not have a severe impact on performance, but could be wrong. Ideally we could avoid the double loading by sharing a "schema load" (GetSchema being akin to a "static" method call) with an "instantiation load" (Check, Diff etc. being akin to "instance" methods). Unfortunately, as it stands today, provider loads are coupled/the same as provider instances, so this would likely require a substantial rethink/rework to make it possible (there is no way to load the "class definition" without instantiating it, to continue the static/instance analogy).

Note: This work is almost entirely aped from master...zaid/additional-secret-properties-from-schema as preparation for looking at how we handle other things (names, defaulting, etc.) in engine.

@lunaris lunaris requested a review from a team as a code owner May 13, 2024 14:49
@pulumi-bot
Copy link
Contributor

pulumi-bot commented May 13, 2024

Changelog

[uncommitted] (2024-06-04)

Features

  • [engine] Extend additionalSecretOutputs in engine based on resource schemata
    #16187

Args: args,
Target: target,
}, defaultProviderVersions, dryRun), nil
schemaLoader := schema.NewPluginLoader(plugctx.Host)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this do any caching or do we reload the schema each time we get a resource for it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed -- caching is necessary for this to even work. I've fix this and added comments and updated the PR description/commit message to (hopefully) explain it all.

@lunaris lunaris force-pushed the wjones/engine-schema branch 3 times, most recently from 1bf571a to fff22a6 Compare May 14, 2024 16:25
pkg/resource/deploy/source_eval.go Outdated Show resolved Hide resolved
Comment on lines 1790 to 1805
//if parsedVersion, err := semver.Parse(req.GetVersion()); err == nil {
// version = &parsedVersion
//}

//if typeToken, err := tokens.ParseTypeToken(req.Type); err == nil {
// packageName := typeToken.Package().String()
// if pkgReference, err := rm.schemaLoader.LoadPackageReference(packageName, version); err == nil {
// if resourceSchema, found, err := pkgReference.Resources().Get(req.Type); err == nil && found {
// for _, outputProperty := range resourceSchema.Properties {
// if outputProperty.Secret {
// additionalSecretOutputsSet.Add(outputProperty.Name)
// }
// }
// }
// }
//}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are making the schema load bearing. Going between having and not having a schema will start to cause diffs on some providers, since [secret] => "foo" is a change that should show up in state.

I would be very careful to distinguish between tolerable errors (there is no schema for this resource, technically OK) and non-tolerable errors (we tried to load the schema, but failed to launch), correctly erroring for non-tolerable errors.

We should log for tolerable errors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A) This shouldn't actually cause diffs because SDK-gen should have been setting the same things secret anyway
B) A provider returning an empty schema is fine and we should tolerate that and just do nothing. I think anything else is an error, I don't expect any providers to error on this method or return invalid schemas.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A) is true only as long as the generated SDK agrees with the provider. This may not hold now for Go, since they provider may be a different version then the SDK used to consume it.

B) Yes

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A) True, but I'd consider that already buggy and the diff would be more correct. I don't think we should try to avoid that.

This commit modifies calls to `RegisterResource` so that they load
resource schemata, enabling the engine to modify resource properties
according to the schema instead of the caller (e.g. the SDK). As part of
this commit, the engine will now extend `additionalSecretOutputs`
options with `Secret` properties it finds in the schema. In a later
commit, this will allow us to stop generating SDK code to set these
properties correctly client-side. The schema may be used in subsequent
pieces of work to handle other concerns, such as defaults, though this
is not tackled in this commit.

Note that for this to even work at all, we need to cache provider loads
when providers are loaded for the purpose of schema lookup. Ordinarily,
provider loads are not cached -- if a program instantiates the same
name/version of a provider twice with different parameters, for
instance, we want those instantiations to yield two separate,
independent instances. However, when loading schema, not only does this
not matter (two providers will have the same schema if and only if they
share the same name/version), it's of critical importance that we
cache provider loads by name/version, lest we instantiate a provider for
_every resource_ in the program that depends on it. This commit thus
introduces such a cache to the `pluginLoader` backing schema loads
(which thankfully is separate from the means we use to load providers
during program execution).

Note that in a program context, this may mean we have a number of
potentially avoidable provider loads: if there are `P` distinct
providers in a program, we will perform `2P` loads where we previously
performed `P`. We are betting that this will not have a severe impact on
performance, but could be wrong. Ideally we could avoid the double
loading by sharing a "schema load" (`GetSchema` being akin to a "static"
method call) with an "instantiation load" (`Check`, `Diff` etc. being
akin to "instance" methods). Unfortunately, as it stands today, provider
loads are coupled/the same as provider instances, so this would likely
require a substantial rethink/rework to make it possible (there is no
way to load the "class definition" without instantiating it, to continue
the static/instance analogy).
return nil, err
}

l.providers[key] = provider
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't we need to shutdown these cached provider instances at some point?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants