[bug] OpenTelemetry Trace hierarchy broken #3117

Open
genofire opened this issue Jul 19, 2024 · 7 comments
Labels
bug Something isn't working

Comments

@genofire

Describe the bug with a clear and concise description of what the bug is.

The "X-Request-Id" header is used for the trace ID.

But OpenTelemetry and W3C use the "Traceparent" header; for documentation see here:

Its content is:
<version>-<trace-id>-<parent-span-id>-<trace-flags>

so the trace ID has to be stripped out of the header value.
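To make the header layout above concrete, here is a minimal Python sketch that splits a Traceparent value into its four fields. It is only an illustration of the format, not how GoToSocial handles the header; the names `parse_traceparent` and `TraceParent` are made up for this example, and it only handles the four-field form.

```python
import re
from typing import NamedTuple

# Four lowercase-hex fields separated by dashes:
# 2 (version) - 32 (trace-id) - 16 (parent span id) - 2 (trace flags).
_TRACEPARENT_RE = re.compile(
    r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$"
)


class TraceParent(NamedTuple):
    """Hypothetical container for the parsed fields."""
    version: str
    trace_id: str
    parent_span_id: str
    trace_flags: str


def parse_traceparent(value: str) -> TraceParent:
    """Split a Traceparent header value into its fields.

    Raises ValueError if the value does not match the expected layout,
    or if the trace ID or parent span ID is all zeros.
    """
    match = _TRACEPARENT_RE.match(value.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {value!r}")
    parsed = TraceParent(*match.groups())
    if parsed.trace_id == "0" * 32 or parsed.parent_span_id == "0" * 16:
        raise ValueError("trace-id and parent span id must not be all zeros")
    return parsed


if __name__ == "__main__":
    header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
    print(parse_traceparent(header).trace_id)
```

The trace ID is the second field, so a load balancer and the application only end up in the same trace if both read that field rather than a separate request ID.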

What's your GoToSocial Version?

0.16.0

GoToSocial Arch

amd64 container

What happened?

Tracing is not mapped with the load balancer.

What you expected to happen?

Tracing is mapped with the load balancer.

How to reproduce it?

No response

Anything else we need to know?

No response

@genofire genofire added the bug Something isn't working label Jul 19, 2024
@Tsuribori
Contributor

There's a request-id-header config option that controls what incoming header the request ID is extracted from https://docs.gotosocial.org/en/latest/configuration/observability/#settings. Would setting request-id-header: "Traceparent" work?

@genofire
Author

genofire commented Jul 23, 2024

You are correct, it works:

[image: screenshot]
So we could close this issue.

But why is the SQL query span not associated with its parent span (the request)? Do they run in parallel or sequentially?

@Tsuribori
Contributor

Tsuribori commented Jul 23, 2024

I checked my traces, and it seems that all the SQL query spans have a parentSpanId value that does not exist. There are also traces that contain only a single SQL span with a non-existent parent span, which should probably be associated with a request. So there is definitely something wrong.
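For anyone who wants to run the same check on exported trace data, a small Python sketch along these lines might help. It assumes each span is a dict with "spanId" and "parentSpanId" keys (the latter empty or missing for root spans); that shape and the function name `find_orphan_spans` are assumptions for this example, not the export format of any particular backend.

```python
from typing import Any, Dict, List

Span = Dict[str, Any]


def find_orphan_spans(spans: List[Span]) -> List[Span]:
    """Return spans whose parentSpanId does not match any spanId in the list.

    Root spans (no parentSpanId, or an empty one) are not reported.
    """
    known_ids = {span["spanId"] for span in spans}
    return [
        span
        for span in spans
        if span.get("parentSpanId") and span["parentSpanId"] not in known_ids
    ]


if __name__ == "__main__":
    # Made-up sample data: one request span, one correctly attached SQL span,
    # and one SQL span that points at a parent that is not in the trace.
    trace = [
        {"spanId": "aaaa", "parentSpanId": "", "name": "GET /api/v1/instance"},
        {"spanId": "bbbb", "parentSpanId": "aaaa", "name": "SELECT"},
        {"spanId": "cccc", "parentSpanId": "ffff", "name": "SELECT"},
    ]
    for orphan in find_orphan_spans(trace):
        print(orphan["spanId"], "has missing parent", orphan["parentSpanId"])
```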

@Tsuribori
Contributor

I tested with the testrig used for development (DEBUG=1 GTS_PORT=8080 ./gotosocial testrig start), and tracing works correctly there: the SQL spans are child spans of the request.

@genofire genofire changed the title from "[bug] Wrong usage of Trace-Header for OpenTelemetry" to "[bug] OpenTelemetry Trace hierarchy broken" Jul 26, 2024
@Tsuribori
Contributor

I tried running on my "production" setup with the env vars:

OTEL_TRACES_SAMPLER: "traceidratio"
OTEL_TRACES_SAMPLER_ARG: "0.01"
OTEL_BSP_MAX_QUEUE_SIZE: "10000"

to see whether the problem was caused by the queue filling up, but spans were still missing as before, so I guess it's something else. Perhaps the testrig and a production server differ in how they function or are set up?
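As a rough illustration of what a trace-ID-ratio sampler with an argument of 0.01 does, here is a Python sketch of the decision. It is modeled on my understanding of the ratio-based sampler in opentelemetry-go (compare the low 8 bytes of the trace ID against a threshold); treat the exact arithmetic as an assumption, and note that `should_sample` is a name invented for this example.

```python
def should_sample(trace_id_hex: str, ratio: float) -> bool:
    """Decide whether a trace is kept, based only on its trace ID.

    Assumption: the low 8 bytes of the 16-byte trace ID are read as a
    big-endian integer, shifted right by one bit, and compared against
    ratio * 2**63.
    """
    if ratio >= 1:
        return True
    if ratio <= 0:
        return False
    trace_id = bytes.fromhex(trace_id_hex)
    if len(trace_id) != 16:
        raise ValueError("trace ID must be 16 bytes (32 hex characters)")
    upper_bound = int(ratio * (1 << 63))
    value = int.from_bytes(trace_id[8:16], "big") >> 1
    return value < upper_bound


if __name__ == "__main__":
    trace_id = "4bf92f3577b34da6a3ce929d0e0e4736"
    for ratio in (0.01, 0.5, 0.7, 1.0):
        print(ratio, should_sample(trace_id, ratio))
```

If the decision really depends only on the trace ID, every span in one trace gets the same answer, so sampling on its own would be expected to drop whole traces rather than individual parent spans.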

@genofire
Author

genofire commented Aug 5, 2024

I just run it with the following (with a plain Tempo):

GTS_TRACING_ENABLED: "true"
GTS_TRACING_ENDPOINT: tempo.monitoring.svc:4317
GTS_TRACING_INSECURE_TRANSPORT: "true"
GTS_TRACING_TRANSPORT: grpc

Do you run the collector (OTEL_*)?

@Tsuribori
Contributor

Tsuribori commented Aug 5, 2024

> I just run it with the following (with a plain Tempo):
>
> GTS_TRACING_ENABLED: "true"
> GTS_TRACING_ENDPOINT: tempo.monitoring.svc:4317
> GTS_TRACING_INSECURE_TRANSPORT: "true"
> GTS_TRACING_TRANSPORT: grpc
>
> Do you run the collector (OTEL_*)?

GoToSocial uses opentelemetry-go for traces, which has some settings that can be set through OTEL_* environment variables, documented at https://pkg.go.dev/go.opentelemetry.io/otel/sdk/trace

I also use Tempo with the following configuration:

GTS_TRACING_ENABLED: "true"
GTS_TRACING_TRANSPORT: "http"
GTS_TRACING_ENDPOINT: "<redacted>:4318"
GTS_TRACING_INSECURE_TRANSPORT: "true"
