feat(storage): introduce gRPC client-side metrics#10639
Conversation
4863273 to
892e55c
Compare
tritone
left a comment
There was a problem hiding this comment.
Overall looks pretty good. We should talk about default behavior and user facing options.
| } | ||
|
|
||
| func (c *grpcStorageClient) Close() error { | ||
| if c.settings.meterProvider != nil { |
There was a problem hiding this comment.
Is it problematic if this never gets called? e.g. will some data not be flushed?
I just ask because calling client.Close() is basically optional; the program could just terminate first. But maybe that's fine.
| func newGRPCStorageClient(ctx context.Context, opts ...storageOption) (storageClient, error) { | ||
| s := initSettings(opts...) | ||
| s.clientOption = append(defaultGRPCOptions(), s.clientOption...) | ||
| // TODO: detect project id |
There was a problem hiding this comment.
I am assuming we'd have to add an option to enable/disable metrics as well?
| return provider, nil | ||
| } | ||
|
|
||
| func togRPCDialOption(provider *sdkmetric.MeterProvider) option.ClientOption { |
There was a problem hiding this comment.
nit: name seems misleading, maybe clientOptionFromMeterProvider. Or just do this inline since the function is only called once.
| {Key: "host_id", Value: attribute.StringValue(mr_host_id_label_default)}} | ||
| } | ||
|
|
||
| func getPreparedResourceUsingGCPDetector(ctx context.Context, project string) (*resource.Resource, error) { |
There was a problem hiding this comment.
nit: This func also seems superfluous and might be clearer in-line
There was a problem hiding this comment.
I wrote it this way to write tests; should I be doing something different to make it testable?
| metric_lb_locality_label = "grpc.lb.locality" | ||
| ) | ||
|
|
||
| func getMetricsToEnable() []estats.Metric { |
There was a problem hiding this comment.
This and getMetricsEnabledByDefault could probably be local vars rather than functions given that they are just static data.
|
|
||
| func TestMetrics(t *testing.T) { | ||
| ctx := context.Background() | ||
| grpcClient, err := NewGRPCClient(ctx) |
There was a problem hiding this comment.
Does this work without auth? If not we should probably make it an integration test. Or try option.WithoutAuthentication
There was a problem hiding this comment.
Requires auth; I need to update to not run with short tests.
There was a problem hiding this comment.
If it's an integration test, move to the file with the other integration tests.
ce08cc5 to
210280d
Compare
55f8e46 to
e7abe02
Compare
e7abe02 to
f91c109
Compare
|
|
||
| The gRPC client emits metrics by default and will export the | ||
| gRPC telemetry discussed in [gRFC/66] and [gRFC/78] to | ||
| [Google Cloud Monitoring]. The metrics are accessible to through Cloud Monitoring API and you incur no additional cost for publishing the metrics.Google Cloud Support can use this information to more quickly diagnose |
There was a problem hiding this comment.
| [Google Cloud Monitoring]. The metrics are accessible to through Cloud Monitoring API and you incur no additional cost for publishing the metrics.Google Cloud Support can use this information to more quickly diagnose | |
| [Google Cloud Monitoring]. The metrics are accessible through Cloud Monitoring API and you incur no additional cost for publishing the metrics. Google Cloud Support can use this information to more quickly diagnose |
| } | ||
|
|
||
| func enableClientMetrics(ctx context.Context, s *settings) (*internalMetricsContext, error) { | ||
| _, ep, err := htransport.NewClient(ctx, s.clientOption...) |
There was a problem hiding this comment.
I believe we have this endpoint captured under the client as xml host/scheme.
| // lack of credentials after initial failure. | ||
| // https://pkg.go.dev/go.opentelemetry.io/otel/sdk/metric@v1.28.0#Exporter | ||
| func (e *exporterLogSuppressor) Export(ctx context.Context, rm *metricdata.ResourceMetrics) error { | ||
| if err := e.exporter.Export(ctx, rm); err != nil && !e.emittedFailure { |
There was a problem hiding this comment.
Maybe we should do the check for emittedFailure earlier? Or is the intent that the user can fix the permission issue while the code is still running without having to restart?
There was a problem hiding this comment.
The intention is that once it's deployed it can recover once appropriate permissions are provided otherwise a redeployment would be required.
Introduce client-side metrics for gRPC client only.
Pending