
Adding Otel and Prometheus Support to MCP

Matthew Lenhard
MCP, Otel, Prometheus, Monitoring, Observability

Introduction

Recently, I set out to add OpenTelemetry (Otel) and Prometheus support to an MCP server. My goal was to capture all tool call requests and responses, providing better observability into how tools are used. I also wanted to do this without having to instrument each tool individually.

Since this was for a remote server, I needed a solution that didn't require modifying the client or adding a local gateway. If you want to see all of the code in action you can find it here.

Understanding the SDK

The MCP SDK provides a high-level interface for defining tools on the server side using the McpServer class. Tools are registered via the tool method, which takes a name, description, parameter schema, and a handler function. Every time a client calls a tool, the handler is invoked. This makes the tool method the perfect interception point for adding instrumentation.

The Monkey Patch Approach

Instead of forking the SDK or submitting a patch upstream, I opted for a monkey patch. This means replacing the tool method on the McpServer prototype at runtime, so every tool registered gets automatically wrapped with my instrumentation logic.

The core of the patch is in the Metrics.initialize method. When called, it replaces McpServer.prototype.tool with a new function that wraps the original handler. This wrapper does several things:

  • Starts an OpenTelemetry span for tracing.
  • Increments a Prometheus counter for every tool call.
  • Measures and records the latency of each call.
  • Increments an error counter and marks the span as failed if an exception is thrown.

This is all done transparently: tool authors don't need to change their code at all.
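To make the idea concrete before looking at the real patch, here is a minimal sketch of the same technique against a stand-in class. `FakeServer`, `calls`, and `errors` are hypothetical names used only for illustration; they are not part of the MCP SDK, and plain objects stand in for the Prometheus counters.

```javascript
// Stand-in for the SDK's server class (hypothetical; not the real McpServer).
class FakeServer {
  constructor() {
    this.tools = new Map();
  }
  tool(name, handler) {
    this.tools.set(name, handler);
  }
}

// Simple in-memory tallies standing in for Prometheus counters.
const calls = {};
const errors = {};

// Save the original method, then replace it on the prototype.
const originalTool = FakeServer.prototype.tool;
FakeServer.prototype.tool = function (name, handler) {
  const wrapped = async (...args) => {
    calls[name] = (calls[name] || 0) + 1;
    try {
      return await handler(...args);
    } catch (err) {
      errors[name] = (errors[name] || 0) + 1;
      throw err; // re-throw so callers still see the failure
    }
  };
  // Register the wrapped handler through the original method.
  return originalTool.call(this, name, wrapped);
};

// Tool authors register tools exactly as before.
const server = new FakeServer();
server.tool("echo", async (text) => text);
```

Because the replacement lives on the prototype, every instance created before or after the patch picks it up, but only tools registered after the patch get wrapped.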

How the Monkey Patch Works

The heart of the instrumentation is in the Metrics.initialize static method. This method is responsible for setting up the metrics server and, crucially, for monkey patching the MCP SDK so that all tool registrations are automatically instrumented.

The patching is done by saving a reference to the original McpServer.prototype.tool method, then replacing it with a new function. This new function intercepts all calls to tool, the method used to register tools with the server.

static initialize(metricsPort = 9090, options = {}) {
  if (!Metrics.instance) {
    Metrics.instance = new Metrics(options);
    
    // Patch the McpServer prototype to add instrumentation
    const originalTool = McpServer.prototype.tool;
    McpServer.prototype.tool = function(...args) {

The replacement function is designed to be compatible with both supported signatures of the tool method: one with and one without an options argument. It extracts the tool's name, description, parameters, and handler from the arguments. If options are present, it includes them as well.

      // The tool method can be called with multiple signatures:
      // (name, description, parameters, handler)
      // (name, description, parameters, options, handler)
      const name = args[0];
      const description = args[1];
      const parameters = args[2];
      
      // Last arg is always the handler
      const handler = args[args.length - 1];
      
      // Check if options are present (second to last argument when length > 4)
      const hasOptions = args.length > 4;
      const options = hasOptions ? args[3] : undefined;
      
      // Create new args array with wrapped handler
      const newArgs = hasOptions 
          ? [name, description, parameters, options]
          : [name, description, parameters];
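The positional parsing above assumes the description and parameters are always passed. If the SDK version you are using also accepts shorter call shapes (an assumption worth checking against its type definitions), a signature-agnostic variant may be safer: leave every argument untouched and replace only the last one. The sketch below uses a hypothetical helper, `wrapLastArg`, and a toy wrapper to show the idea.

```javascript
// Hypothetical helper: swap the final argument (the handler) for a wrapped
// version while leaving every other argument untouched.
function wrapLastArg(args, wrap) {
  const handler = args[args.length - 1];
  if (typeof handler !== "function") {
    return args; // nothing to wrap; pass the call through unchanged
  }
  return [...args.slice(0, -1), wrap(handler)];
}

// Toy wrapper that tags results so the effect is visible.
const tag = (handler) => (...a) => `wrapped:${handler(...a)}`;

// Works the same for a two-argument and a five-argument call.
const shortArgs = wrapLastArg(["ping", () => "pong"], tag);
const longArgs = wrapLastArg(
  ["ping", "description", { x: 1 }, { opt: true }, () => "pong"],
  tag
);
```

The trade-off is that the wrapper no longer knows which argument is the description or the schema, which is fine here because the instrumentation only needs the tool name (always the first argument) and the handler.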

The key step is that it wraps the original handler function with a new async function that adds tracing and metrics logic before and after the handler is called:

      // Add wrapped handler that includes instrumentation
      newArgs.push(async (...handlerArgs) => {
        const span = Metrics.instance.tracer.startSpan(`tool.${name}`);
        const startTime = process.hrtime();
        try {
          Metrics.instance.metrics.toolCalls.inc({ tool_name: name });
          const result = await handler(...handlerArgs);
          span.setStatus({ code: SpanStatusCode.OK });
          return result;
        } catch (error) {
          Metrics.instance.metrics.toolErrors.inc({ tool_name: name });
          span.setStatus({
            code: SpanStatusCode.ERROR,
            message: error instanceof Error ? error.message : String(error)
          });
          throw error;
        } finally {
          const [seconds, nanoseconds] = process.hrtime(startTime);
          const duration = seconds + nanoseconds / 1e9;
          Metrics.instance.metrics.toolLatency.observe({ tool_name: name }, duration);
          span.end();
        }
      });
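The wrapper records latency in seconds, which matches the usual Prometheus convention for duration histograms. As an aside, newer Node.js versions also offer process.hrtime.bigint(), which returns a single nanosecond value instead of a [seconds, nanoseconds] pair. A minimal sketch of the same measurement with that API follows; `elapsedSeconds` is a hypothetical helper name, not something from the original code.

```javascript
// Hypothetical helper: seconds elapsed since a process.hrtime.bigint() reading.
function elapsedSeconds(startNs) {
  const elapsedNs = process.hrtime.bigint() - startNs; // BigInt nanoseconds
  return Number(elapsedNs) / 1e9; // convert to fractional seconds
}

const start = process.hrtime.bigint();
for (let i = 0; i < 1e5; i++) {} // placeholder for the awaited handler call
const duration = elapsedSeconds(start);
```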

Finally, it calls the original tool method with the new arguments, ensuring that the tool is registered with the instrumented handler instead of the original one:

      // Call original tool method with new args
      return originalTool.apply(this, newArgs);
    };

This approach means that every tool registered after Metrics.initialize() is automatically instrumented, without requiring any changes to the tool registration code elsewhere in your project.

How the Instrumentation Works

When a tool is registered, the monkey patch intercepts the registration and wraps the handler function. Here's what happens on each tool call:

  const span = Metrics.instance.tracer.startSpan(`tool.${name}`);
  const startTime = process.hrtime();
  try {
    Metrics.instance.metrics.toolCalls.inc({ tool_name: name });
    const result = await handler(...handlerArgs);
    span.setStatus({ code: SpanStatusCode.OK });
    return result;
  } catch (error) {
    Metrics.instance.metrics.toolErrors.inc({ tool_name: name });
    span.setStatus({
      code: SpanStatusCode.ERROR,
      message: error instanceof Error ? error.message : String(error)
    });
    throw error;
  } finally {
    const [seconds, nanoseconds] = process.hrtime(startTime);
    const duration = seconds + nanoseconds / 1e9;
    Metrics.instance.metrics.toolLatency.observe({ tool_name: name }, duration);
    span.end();
  }

  • Tracing: Each tool call is traced using OpenTelemetry, with spans named after the tool. This allows you to see the flow of requests in your tracing backend (e.g., Jaeger, Zipkin, or an OTLP-compatible collector).
  • Metrics: Prometheus counters and histograms are updated for every call, error, and latency measurement. These metrics are exposed via an HTTP endpoint (/metrics) using an embedded Express server.
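For readers who have not looked at a /metrics response before, the sketch below shows roughly what a labeled counter looks like in the Prometheus text exposition format. In practice a Prometheus client library produces this output for you; the metric name `mcp_tool_calls_total`, the sample values, and the `renderCounter` helper are all hypothetical and chosen only for illustration.

```javascript
// Hypothetical in-memory counter values keyed by tool name.
const toolCalls = new Map([
  ["search", 3],
  ["fetch", 1],
]);

// Render one counter family in the Prometheus text exposition format.
function renderCounter(name, help, values) {
  const lines = [`# HELP ${name} ${help}`, `# TYPE ${name} counter`];
  for (const [tool, count] of values) {
    lines.push(`${name}{tool_name="${tool}"} ${count}`);
  }
  return lines.join("\n") + "\n";
}

const body = renderCounter("mcp_tool_calls_total", "Total tool calls", toolCalls);
console.log(body);
```

Each tool gets its own time series through the tool_name label, which is what lets you graph call rates and error rates per tool.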

Setting Up the Metrics Server

The metrics server is started automatically when you call Metrics.initialize(). By default, it listens on port 9090, but you can configure this and other options (like the Otel endpoint and service name):

import { Metrics } from './metrics';

Metrics.initialize(9090, {
  enableTracing: true,
  otelEndpoint: 'http://localhost:4318/v1/traces',
  serviceName: 'my-mcp-server'
});

Once running, you can scrape Prometheus metrics from http://localhost:9090/metrics and view traces in your preferred backend.

Downsides to Monkey Patching

While monkey patching offers a quick and powerful way to add cross-cutting features like observability, it comes with several important caveats. The most significant risk is that it relies on the internal implementation details of the SDK, which may change in future releases.

If the SDK maintainers refactor or rename the tool method, your patch could silently break or cause subtle bugs that are hard to diagnose. This can make upgrades more difficult, as you need to carefully review SDK changes for compatibility.
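One way to reduce the risk of a silent break is to check that the target method exists before patching and to log loudly when it does not. The following is a minimal sketch of such a guard; `patchMethod` and both stand-in classes are hypothetical and only simulate what a rename in a future SDK release might look like.

```javascript
// Hypothetical guard: patch only if the target method exists,
// and report whether the patch was applied.
function patchMethod(proto, methodName, makeReplacement) {
  const original = proto[methodName];
  if (typeof original !== "function") {
    console.warn(`Cannot patch "${methodName}": method not found; instrumentation disabled`);
    return false;
  }
  proto[methodName] = makeReplacement(original);
  return true;
}

// Stand-in classes: one with the expected method, one after a rename.
class CurrentServer {
  tool(name) {
    return `registered ${name}`;
  }
}
class RenamedServer {
  registerTool(name) {
    return `registered ${name}`;
  }
}

// A replacement that simply forwards to the original method.
const passthrough = (original) =>
  function (...args) {
    return original.apply(this, args);
  };

const patchedCurrent = patchMethod(CurrentServer.prototype, "tool", passthrough);
const patchedRenamed = patchMethod(RenamedServer.prototype, "tool", passthrough);
```

A guard like this does not catch changes to the argument order or handler contract, so pinning the SDK version and re-testing the patch on upgrade is still worthwhile.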

Conclusion

Monkey patching is sometimes frowned upon, but in this case it let me add tracing and metrics to every tool call without instrumenting each tool individually. It was also a great way to better understand the internals of the SDK.

I'm hopeful that this type of functionality is eventually added natively to the MCP specification, so that workarounds like this are not necessary.