Infer additional secret properties in engine, from schemata #16187

lunaris · 2024-05-13T14:49:32Z

This commit modifies calls to RegisterResource so that they load resource schemata, enabling the engine to modify resource properties according to the schema instead of the caller (e.g. the SDK). As part of this commit, the engine will now extend additionalSecretOutputs options with Secret properties it finds in the schema. In a later commit, this will allow us to stop generating SDK code to set these properties correctly client-side. The schema may be used in subsequent pieces of work to handle other concerns, such as defaults, though this is not tackled in this commit.
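To make the intended behaviour concrete, here is a minimal sketch of extending a caller-supplied `additionalSecretOutputs` list with schema properties marked secret. The `Property` type and `extendSecretOutputs` function are hypothetical simplifications for illustration, not the engine's actual types or code.

```go
package main

import (
	"fmt"
	"sort"
)

// Property is a hypothetical, simplified stand-in for a schema output property.
type Property struct {
	Name   string
	Secret bool
}

// extendSecretOutputs returns the caller-supplied secret outputs extended with
// every schema property marked Secret, deduplicated and sorted by name.
func extendSecretOutputs(fromCaller []string, props []Property) []string {
	set := map[string]struct{}{}
	for _, name := range fromCaller {
		set[name] = struct{}{}
	}
	for _, p := range props {
		if p.Secret {
			set[p.Name] = struct{}{}
		}
	}

	out := make([]string, 0, len(set))
	for name := range set {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

func main() {
	props := []Property{{Name: "password", Secret: true}, {Name: "endpoint"}}
	fmt.Println(extendSecretOutputs([]string{"token"}, props))
}
```

Because the result is a set union, an SDK that already lists a property as secret and a schema that marks the same property secret produce a single entry.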

Note that for this to even work at all, we need to cache provider loads when providers are loaded for the purpose of schema lookup. Ordinarily, provider loads are not cached -- if a program instantiates the same name/version of a provider twice with different parameters, for instance, we want those instantiations to yield two separate, independent instances. However, when loading schema, not only does this not matter (two providers will have the same schema if and only if they share the same name/version), it's of critical importance that we cache provider loads by name/version, lest we instantiate a provider for every resource in the program that depends on it. This commit thus introduces such a cache to the pluginLoader backing schema loads (which thankfully is separate from the means we use to load providers during program execution).
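As a rough illustration of caching by name/version, the sketch below memoizes an expensive load behind a map keyed on both fields. All names here (`cachingLoader`, `schemaProvider`, `providerKey`) are hypothetical; this is not the `pluginLoader` implementation from the PR, only the general shape of the technique.

```go
package main

import (
	"fmt"
	"sync"
)

// providerKey identifies a provider for schema purposes: name plus version.
type providerKey struct {
	name    string
	version string
}

// schemaProvider is a hypothetical stand-in for a loaded provider plugin.
type schemaProvider struct {
	name, version string
}

// cachingLoader memoizes provider loads by name/version, so a provider is
// started at most once however many resources reference it.
type cachingLoader struct {
	mu        sync.Mutex
	providers map[providerKey]*schemaProvider
	load      func(name, version string) (*schemaProvider, error)
	loads     int // number of real (uncached) loads, kept for illustration
}

func newCachingLoader(load func(name, version string) (*schemaProvider, error)) *cachingLoader {
	return &cachingLoader{providers: map[providerKey]*schemaProvider{}, load: load}
}

// get returns the cached provider for name/version, loading it on first use.
func (l *cachingLoader) get(name, version string) (*schemaProvider, error) {
	l.mu.Lock()
	defer l.mu.Unlock()

	key := providerKey{name, version}
	if p, ok := l.providers[key]; ok {
		return p, nil
	}
	p, err := l.load(name, version)
	if err != nil {
		return nil, err // failed loads are not cached
	}
	l.loads++
	l.providers[key] = p
	return p, nil
}

func main() {
	l := newCachingLoader(func(name, version string) (*schemaProvider, error) {
		return &schemaProvider{name, version}, nil
	})
	for i := 0; i < 3; i++ {
		_, _ = l.get("aws", "6.0.0")
	}
	fmt.Println("real loads:", l.loads)
}
```

Holding the mutex across the load keeps the sketch simple, at the cost of serializing loads of different providers.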

Note that in a program context, this may mean we have a number of potentially avoidable provider loads: if there are P distinct providers in a program, we will perform 2P loads where we previously performed P. We are betting that this will not have a severe impact on performance, but could be wrong. Ideally we could avoid the double loading by sharing a "schema load" (GetSchema being akin to a "static" method call) with an "instantiation load" (Check, Diff etc. being akin to "instance" methods). Unfortunately, as it stands today, provider loads are coupled/the same as provider instances, so this would likely require a substantial rethink/rework to make it possible (there is no way to load the "class definition" without instantiating it, to continue the static/instance analogy).

Note: This work is almost entirely aped from master...zaid/additional-secret-properties-from-schema as preparation for looking at how we handle other things (names, defaulting, etc.) in engine.

pulumi-bot · 2024-05-13T15:02:16Z

Changelog

[uncommitted] (2024-06-04)

Features

[engine] Extend additionalSecretOutputs in engine based on resource schemata #16187

Frassle · 2024-05-13T17:15:59Z

pkg/engine/update.go

-		Args:        args,
-		Target:      target,
-	}, defaultProviderVersions, dryRun), nil
+	schemaLoader := schema.NewPluginLoader(plugctx.Host)


Does this do any caching or do we reload the schema each time we get a resource for it?

lunaris · 2024-06-04

Indeed -- caching is necessary for this to even work. I've fixed this, added comments, and updated the PR description/commit message to (hopefully) explain it all.

iwahbe · 2024-05-14T18:05:52Z

pkg/resource/deploy/source_eval.go

+	//if parsedVersion, err := semver.Parse(req.GetVersion()); err == nil {
+	//  version = &parsedVersion
+	//}
+
+	//if typeToken, err := tokens.ParseTypeToken(req.Type); err == nil {
+	//  packageName := typeToken.Package().String()
+	//  if pkgReference, err := rm.schemaLoader.LoadPackageReference(packageName, version); err == nil {
+	//    if resourceSchema, found, err := pkgReference.Resources().Get(req.Type); err == nil && found {
+	//      for _, outputProperty := range resourceSchema.Properties {
+	//        if outputProperty.Secret {
+	//          additionalSecretOutputsSet.Add(outputProperty.Name)
+	//        }
+	//      }
+	//    }
+	//  }
+	//}


We are making the schema load-bearing. Going between having and not having a schema will start to cause diffs on some providers, since [secret] => "foo" is a change that should show up in state.

I would be very careful to distinguish between tolerable errors (there is no schema for this resource, technically OK) and non-tolerable errors (we tried to load the schema, but failed to launch), correctly erroring for non-tolerable errors.

We should log for tolerable errors.
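One possible way to draw that line, sketched under assumptions: the sentinel `errNoSchema`, the `lookupSecrets` function type, and `secretsFromSchema` are all hypothetical names invented for this example, and the real loader may signal "no schema" differently (for instance via a `found` boolean rather than an error).

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// errNoSchema is a hypothetical sentinel meaning "there is no schema for this
// resource", which is treated as tolerable.
var errNoSchema = errors.New("no schema for resource")

// errLaunchFailed is a hypothetical example of a non-tolerable failure.
var errLaunchFailed = errors.New("provider failed to launch")

// lookupSecrets is a hypothetical lookup returning the secret property names
// declared in the schema for a resource type.
type lookupSecrets func(resourceType string) ([]string, error)

// secretsFromSchema returns schema-derived secret outputs. A missing schema is
// logged and yields no extra secrets; any other failure is returned to the caller.
func secretsFromSchema(lookup lookupSecrets, resourceType string) ([]string, error) {
	secrets, err := lookup(resourceType)
	switch {
	case err == nil:
		return secrets, nil
	case errors.Is(err, errNoSchema):
		log.Printf("no schema for %s; not inferring secret outputs", resourceType)
		return nil, nil
	default:
		return nil, fmt.Errorf("loading schema for %s: %w", resourceType, err)
	}
}

func main() {
	missing := func(string) ([]string, error) { return nil, errNoSchema }
	broken := func(string) ([]string, error) { return nil, errLaunchFailed }

	secrets, err := secretsFromSchema(missing, "pkg:index:Thing")
	fmt.Println(secrets, err)

	_, err = secretsFromSchema(broken, "pkg:index:Thing")
	fmt.Println(err)
}
```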

Frassle · 2024-05-14

A) This shouldn't actually cause diffs, because SDK-gen should have been setting the same things secret anyway.
B) A provider returning an empty schema is fine, and we should tolerate that and just do nothing. I think anything else is an error; I don't expect any providers to error on this method or return invalid schemas.

iwahbe · 2024-05-14

A) is true only as long as the generated SDK agrees with the provider. This may not hold now for Go, since the provider may be a different version than the SDK used to consume it.

B) Yes

Frassle · 2024-05-15

A) True, but I'd consider that already buggy and the diff would be more correct. I don't think we should try to avoid that.


Frassle · 2024-06-07T10:33:53Z

pkg/codegen/schema/loader.go

+		return nil, err
+	}
+
+	l.providers[key] = provider


Don't we need to shutdown these cached provider instances at some point?
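For illustration only, one shape such a shutdown could take is a `Close` method on the cache that closes every cached instance and aggregates failures. The `providerCache` type and its `"name@version"` keys are hypothetical, and the sketch assumes each cached provider can be treated as an `io.Closer`; it is not the loader code from this PR.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// providerCache is a hypothetical cache of provider instances keyed by
// "name@version". Each instance is assumed to expose Close via io.Closer.
type providerCache struct {
	providers map[string]io.Closer
}

func newProviderCache() *providerCache {
	return &providerCache{providers: map[string]io.Closer{}}
}

// add registers a provider instance under the given key.
func (c *providerCache) add(key string, p io.Closer) {
	c.providers[key] = p
}

// Close shuts down every cached provider, forgets it, and reports all shutdown
// failures joined into a single error (nil if every Close succeeded).
func (c *providerCache) Close() error {
	var errs []error
	for key, p := range c.providers {
		if err := p.Close(); err != nil {
			errs = append(errs, fmt.Errorf("closing %s: %w", key, err))
		}
		delete(c.providers, key)
	}
	return errors.Join(errs...)
}

// fakeProvider records whether it has been closed.
type fakeProvider struct{ closed bool }

func (f *fakeProvider) Close() error {
	f.closed = true
	return nil
}

func main() {
	p := &fakeProvider{}
	cache := newProviderCache()
	cache.add("aws@6.0.0", p)
	fmt.Println(cache.Close(), p.closed, len(cache.providers))
}
```

`errors.Join` requires Go 1.20 or later. The owner of the loader would be responsible for calling `Close` once schema lookups are finished.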

lunaris requested a review from a team as a code owner May 13, 2024 14:49

lunaris force-pushed the wjones/engine-schema branch from 38aa4af to a714606 Compare May 13, 2024 17:11

Frassle reviewed May 13, 2024

lunaris force-pushed the wjones/engine-schema branch 3 times, most recently from 1bf571a to fff22a6 Compare May 14, 2024 16:25

iwahbe reviewed May 14, 2024

lunaris force-pushed the wjones/engine-schema branch from fff22a6 to 21db4f3 Compare May 15, 2024 08:02

lunaris force-pushed the wjones/engine-schema branch from 21db4f3 to 502e77d Compare June 4, 2024 16:55

lunaris force-pushed the wjones/engine-schema branch from 502e77d to 1fb4d6a Compare June 4, 2024 17:07

pulumi-bot mentioned this pull request Jun 5, 2024

[DOWNSTREAM TEST][PLATFORM]Test: Upgrade pulumi/{pkg,sdk} to 1fb4d6a4c28476236eea13ef99dd5af5371539b2 pulumi/pulumi-aws#4022

Closed

Frassle reviewed Jun 7, 2024
