-
Notifications
You must be signed in to change notification settings - Fork 61
Description
Hi! I noticed a significant impact on the CPU load from a Redis service instrumented with OBI. I whittled it down to a minimal reproduction: https://github.com/paweljw/redis-impact-repro I generally run run_load.sh first, then run_obi.sh.
This minimal configuration file seems to be enough to trigger it:
discovery:
instrument:
- open_ports: 6379
trace_printer: text
Here are the results I'm getting with the synthetic reproduction on a spare, otherwise empty 4-core server (not anywhere close to 100% loaded):
load-generator-1 | [STATS] avg_latency=10.73ms | redis_cpu=27.9% | ops/sec=9813
load-generator-1 | [STATS] avg_latency=9.49ms | redis_cpu=29.0% | ops/sec=10792
load-generator-1 | [STATS] avg_latency=8.49ms | redis_cpu=29.0% | ops/sec=12281
load-generator-1 | -------------- OBI STARTED -----------------
load-generator-1 | [STATS] avg_latency=9.55ms | redis_cpu=28.2% | ops/sec=10899
load-generator-1 | [STATS] avg_latency=10.75ms | redis_cpu=47.1% | ops/sec=9654
load-generator-1 | [STATS] avg_latency=10.32ms | redis_cpu=47.2% | ops/sec=10035
load-generator-1 | [STATS] avg_latency=10.78ms | redis_cpu=46.7% | ops/sec=9555
load-generator-1 | [STATS] avg_latency=11.64ms | redis_cpu=45.7% | ops/sec=8773
load-generator-1 | [STATS] avg_latency=10.92ms | redis_cpu=46.4% | ops/sec=9350
load-generator-1 | [STATS] avg_latency=10.41ms | redis_cpu=46.7% | ops/sec=9827
load-generator-1 | [STATS] avg_latency=11.45ms | redis_cpu=45.5% | ops/sec=8985
load-generator-1 | [STATS] avg_latency=11.78ms | redis_cpu=40.7% | ops/sec=8709
load-generator-1 | -------------- OBI STOPPED -----------------
load-generator-1 | [STATS] avg_latency=8.54ms | redis_cpu=28.2% | ops/sec=12126
load-generator-1 | [STATS] avg_latency=9.78ms | redis_cpu=28.2% | ops/sec=10658
load-generator-1 | [STATS] avg_latency=9.00ms | redis_cpu=28.1% | ops/sec=11524
load-generator-1 | [STATS] avg_latency=9.49ms | redis_cpu=28.0% | ops/sec=10930
load-generator-1 | [STATS] avg_latency=8.03ms | redis_cpu=29.2% | ops/sec=12774
When excluding - exe_path: "**/redis-server", the performance hit is gone (and so are the traces, of course):
load-generator-1 | [STATS] avg_latency=10.09ms | redis_cpu=28.4% | ops/sec=9888
load-generator-1 | [STATS] avg_latency=10.28ms | redis_cpu=28.6% | ops/sec=10069
load-generator-1 | [STATS] avg_latency=7.87ms | redis_cpu=29.6% | ops/sec=13077
load-generator-1 | -------------- OBI STARTED -----------------
load-generator-1 | [STATS] avg_latency=9.93ms | redis_cpu=27.6% | ops/sec=10517
load-generator-1 | [STATS] avg_latency=10.90ms | redis_cpu=28.8% | ops/sec=9479
load-generator-1 | [STATS] avg_latency=8.30ms | redis_cpu=29.5% | ops/sec=12507
load-generator-1 | [STATS] avg_latency=10.91ms | redis_cpu=27.6% | ops/sec=9767
load-generator-1 | [STATS] avg_latency=8.46ms | redis_cpu=29.3% | ops/sec=12169
load-generator-1 | -------------- OBI STOPPED -----------------
load-generator-1 | [STATS] avg_latency=8.80ms | redis_cpu=29.0% | ops/sec=11718
load-generator-1 | [STATS] avg_latency=8.64ms | redis_cpu=28.8% | ops/sec=11898
My first thought was eBPF, after some previous run-ins :) but it seems just auto-instrumenting the process is enough. Not limited to Python-based workloads, I'm seeing this with other clients too.
This unfortunately blocks me from continuing to roll out OBI - the synthetic benchmark takes an extra ~20% percentage point hit, a real server with high load is predictably worse. The ops/sec hit isn't so great either for production usecases π
I'm not sure whether this is a bug, or expected performance hit from instrumentation. Could you please advise? Thank you! π