
Advanced features

Use these patterns to harden your Langfuse instrumentation, protect sensitive data, and adapt the SDK to complex environments. The examples below use the Python SDK.

Mask sensitive data

Provide a mask function when instantiating the client to scrub inputs, outputs, and metadata before they leave your infrastructure.

from langfuse import Langfuse
from typing import Any
import re
 
def pii_masker(data: Any, **kwargs) -> Any:
    if isinstance(data, str):
        return re.sub(r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+", "[EMAIL_REDACTED]", data)
    elif isinstance(data, dict):
        return {k: pii_masker(data=v) for k, v in data.items()}
    elif isinstance(data, list):
        return [pii_masker(data=item) for item in data]
    return data
 
langfuse = Langfuse(mask=pii_masker)
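
Before wiring the masker into the client, it is worth exercising it on nested data; a standalone sketch reusing the same regex and recursion (no Langfuse dependency):

```python
import re

EMAIL_RE = re.compile(r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+")

def pii_masker(data, **kwargs):
    # Recursively redact email addresses in strings, dicts, and lists;
    # leave every other type untouched.
    if isinstance(data, str):
        return EMAIL_RE.sub("[EMAIL_REDACTED]", data)
    if isinstance(data, dict):
        return {k: pii_masker(data=v) for k, v in data.items()}
    if isinstance(data, list):
        return [pii_masker(data=item) for item in data]
    return data

masked = pii_masker(data={"user": "alice@example.com", "messages": ["hi from bob@example.com"]})
print(masked)
# {'user': '[EMAIL_REDACTED]', 'messages': ['hi from [EMAIL_REDACTED]']}
```

The mask function receives every input, output, and metadata payload, so it must handle arbitrary types and always return a value of the same shape.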

Logging & debugging

The SDK uses Python's standard logging module; all SDK messages are emitted under the "langfuse" logger.

import logging
 
langfuse_logger = logging.getLogger("langfuse")
langfuse_logger.setLevel(logging.DEBUG)

Alternatively, pass debug=True when instantiating the client, or set the environment variable LANGFUSE_DEBUG="True".

Sampling

Sample traces directly on the client by setting sample_rate to a value between 0 and 1; only that fraction of traces is sent to Langfuse.

from langfuse import Langfuse
 
langfuse = Langfuse(sample_rate=0.2)
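
Conceptually, sample_rate=0.2 keeps roughly one in five traces. A minimal sketch of rate-based sampling (illustrative only — OTEL-style samplers typically decide deterministically from the trace ID so all spans of a trace share one decision):

```python
import random

def keep_trace(sample_rate: float) -> bool:
    # Illustrative only: keep a trace with probability sample_rate.
    return random.random() < sample_rate

random.seed(0)
kept = sum(keep_trace(0.2) for _ in range(10_000))
print(f"kept {kept} of 10000 simulated traces")
```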

Filter exported spans

Exclude spans from specific instrumentation scopes.

from langfuse import Langfuse
 
langfuse = Langfuse(blocked_instrumentation_scopes=["sqlalchemy", "psycopg"])
⚠️

Filtering parent spans may result in orphaned children in Langfuse.

Tracer provider isolation

Create a dedicated OTEL TracerProvider for Langfuse spans.

from opentelemetry.sdk.trace import TracerProvider
from langfuse import Langfuse
 
langfuse_tracer_provider = TracerProvider()
langfuse = Langfuse(tracer_provider=langfuse_tracer_provider)
langfuse.start_span(name="isolated").end()
⚠️

TracerProviders still share the same context, so mixing providers may create orphaned spans.

Multi-project setups

Instantiate a dedicated client per project and pass langfuse_public_key where needed so each observation is routed to the correct project.

from langfuse import Langfuse, observe
 
project_a = Langfuse(public_key="pk-lf-project-a-...", secret_key="sk-lf-project-a-...")
project_b = Langfuse(public_key="pk-lf-project-b-...", secret_key="sk-lf-project-b-...")
 
@observe
def process_data_for_project_a(data, langfuse_public_key="pk-lf-project-a-..."):
    return {"processed": data}
 
@observe
def process_data_for_project_b(data, langfuse_public_key="pk-lf-project-b-..."):
    return {"processed": data}

You can also route OpenAI or LangChain integrations by passing langfuse_public_key on each call.

Environment-specific considerations

Thread pools and multiprocessing

Use the OpenTelemetry threading instrumentor so context flows across worker threads.

from opentelemetry.instrumentation.threading import ThreadingInstrumentor
 
ThreadingInstrumentor().instrument()
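
The underlying issue is that OTEL context lives in contextvars, which new worker threads do not inherit; the instrumentor copies the context for you. A stdlib-only illustration of the problem and the manual fix (no OTEL dependency, hypothetical variable name):

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

current_trace = contextvars.ContextVar("current_trace", default=None)

def worker():
    return current_trace.get()

current_trace.set("trace-123")

with ThreadPoolExecutor() as pool:
    # A fresh worker thread starts with an empty context: the value is lost.
    naive = pool.submit(worker).result()
    # Running the task inside a copied context (what ThreadingInstrumentor
    # automates for OTEL) makes the value visible in the worker thread.
    ctx = contextvars.copy_context()
    propagated = pool.submit(ctx.run, worker).result()

print(naive, propagated)
# None trace-123
```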

For multiprocessing, follow the OpenTelemetry guidance. If you use Pydantic Logfire, enable distributed_tracing=True.

Distributed tracing

Prefer native OTEL propagation when linking services. The trace_context argument should be a last resort because it forces root-span semantics server-side.

Time to first token (TTFT)

from langfuse import get_client
import datetime, time
 
langfuse = get_client()
 
with langfuse.start_as_current_observation(as_type="generation", name="TTFT-Generation") as generation:
    time.sleep(3)
    generation.update(
        completion_start_time=datetime.datetime.now(),
        output="some response",
    )
 
langfuse.flush()
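
Setting completion_start_time is what lets TTFT be derived: it is simply the gap between the observation's start and the first token. The arithmetic, with hypothetical timestamps:

```python
import datetime

# Hypothetical timestamps: when the request started vs. when the
# first token of the completion arrived.
request_start = datetime.datetime(2024, 1, 1, 12, 0, 0)
completion_start = datetime.datetime(2024, 1, 1, 12, 0, 3, 250_000)

ttft = completion_start - request_start
print(ttft.total_seconds())
# 3.25
```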

Self-signed TLS certificates

Point the OTLP exporter at your certificate in .env:

OTEL_EXPORTER_OTLP_TRACES_CERTIFICATE="/path/to/my-selfsigned-cert.crt"

Then configure an httpx client that trusts it:

import os, httpx
from langfuse import Langfuse
 
httpx_client = httpx.Client(verify=os.environ["OTEL_EXPORTER_OTLP_TRACES_CERTIFICATE"])
langfuse = Langfuse(httpx_client=httpx_client)
⚠️

Understand the security implications before trusting self-signed certificates.

Evaluation & scoring

Use observation methods, context-aware helpers, or low-level APIs to submit scores.

from langfuse import get_client
 
langfuse = get_client()
 
with langfuse.start_as_current_observation(as_type="generation", name="summary_generation") as gen:
    gen.update(output="summary text...")
    gen.score(name="conciseness", value=0.8, data_type="NUMERIC")
    gen.score_trace(name="user_feedback_rating", value="positive", data_type="CATEGORICAL")
 
with langfuse.start_as_current_observation(as_type="span", name="complex_task"):
    langfuse.score_current_span(name="task_component_quality", value=True, data_type="BOOLEAN")
 
langfuse.create_score(
    name="fact_check_accuracy",
    value=0.95,
    trace_id="abcdef1234567890abcdef1234567890",
    observation_id="1234567890abcdef",
    data_type="NUMERIC",
    comment="Source verified for 95% of claims.",
)

Dataset runs

from langfuse import get_client
 
langfuse = get_client()
 
dataset = langfuse.get_dataset(name="my-eval-dataset")
for item in dataset.items:
    print(item.input, item.expected_output)
 
langfuse.create_dataset(name="new-summarization-tasks")
langfuse.create_dataset_item(
    dataset_name="new-summarization-tasks",
    input={"text": "Long article..."},
    expected_output={"summary": "Short summary."}
)

Observation types

Specify observation types via decorators or context managers.

from langfuse import observe
 
@observe(as_type="tool")
def retrieve_context(query):
    return vector_store.get(query)
 
from langfuse import get_client
 
langfuse = get_client()
 
with langfuse.start_as_current_observation(as_type="chain", name="retrieval-pipeline") as chain:
    with langfuse.start_as_current_observation(as_type="retriever", name="vector-search") as retriever:
        retriever.update(output={"results": perform_vector_search("user question")})