# Advanced features
Use these patterns to harden your Langfuse instrumentation, protect sensitive data, and adapt the SDKs to complex environments. The examples below show the Python setup; equivalent options exist in the JS/TS SDK.
## Mask sensitive data
Provide a mask function when instantiating the client to scrub inputs, outputs, and metadata before they leave your infrastructure.
```python
import re
from typing import Any

from langfuse import Langfuse

def pii_masker(data: Any, **kwargs) -> Any:
    if isinstance(data, str):
        # Replace anything that looks like an email address.
        return re.sub(
            r"[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\.[a-zA-Z0-9-.]+",
            "[EMAIL_REDACTED]",
            data,
        )
    elif isinstance(data, dict):
        return {k: pii_masker(data=v) for k, v in data.items()}
    elif isinstance(data, list):
        return [pii_masker(data=item) for item in data]
    return data

langfuse = Langfuse(mask=pii_masker)
```

## Logging & debugging
The SDK uses Python’s standard `logging` module under the `"langfuse"` logger.
```python
import logging

langfuse_logger = logging.getLogger("langfuse")
langfuse_logger.setLevel(logging.DEBUG)
```

Alternatively, set `debug=True` or `LANGFUSE_DEBUG="True"` when instantiating the client.
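Because the SDK logs through the standard `logging` module, you can attach any handler to the `"langfuse"` logger. A small sketch that captures log output in memory; the log message here is illustrative, not real SDK output:

```python
import logging
from io import StringIO

# Route everything from the "langfuse" logger into an in-memory buffer.
buffer = StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(logging.Formatter("%(name)s %(levelname)s %(message)s"))

langfuse_logger = logging.getLogger("langfuse")
langfuse_logger.setLevel(logging.DEBUG)
langfuse_logger.addHandler(handler)

# Stand-in for a debug message the SDK would emit.
langfuse_logger.debug("exporting batch of 3 spans")
print(buffer.getvalue().strip())
```

The same pattern works for file handlers or structured-logging adapters.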
## Sampling
Sample traces directly on the client.
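Conceptually, a `sample_rate` of 0.2 means each trace is kept with that probability, and the decision is made once per trace so all of its spans stay together. A stdlib-only sketch of that behavior (not the SDK's internal code):

```python
import random

def keep_trace(sample_rate: float, rng: random.Random) -> bool:
    # Keep a trace when a uniform draw falls below the sample rate.
    return rng.random() < sample_rate

rng = random.Random(0)
kept = sum(keep_trace(0.2, rng) for _ in range(10_000))
print(f"{kept} of 10000 traces exported")  # roughly 2,000
```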
```python
from langfuse import Langfuse

langfuse = Langfuse(sample_rate=0.2)
```

## Filter exported spans
Exclude spans from specific instrumentation scopes.
```python
from langfuse import Langfuse

langfuse = Langfuse(blocked_instrumentation_scopes=["sqlalchemy", "psycopg"])
```

Note that filtering parent spans may result in orphaned children in Langfuse.
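Schematically, each exported span carries the name of the instrumentation library that created it, and spans from blocked scopes are dropped before export. An illustrative sketch with made-up span data (not SDK internals):

```python
blocked_scopes = {"sqlalchemy", "psycopg"}

# Each span records which instrumentation scope produced it.
spans = [
    {"name": "SELECT * FROM users", "scope": "sqlalchemy"},
    {"name": "chat-completion", "scope": "langfuse-sdk"},
]

# Only spans whose scope is not blocked reach the exporter.
exported = [span for span in spans if span["scope"] not in blocked_scopes]
print([span["name"] for span in exported])
```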
## Tracer provider isolation
Create a dedicated OTEL TracerProvider for Langfuse spans.
```python
from opentelemetry.sdk.trace import TracerProvider

from langfuse import Langfuse

langfuse_tracer_provider = TracerProvider()
langfuse = Langfuse(tracer_provider=langfuse_tracer_provider)
langfuse.start_span(name="isolated").end()
```

TracerProviders still share the same context, so mixing providers may create orphaned spans.
## Multi-project setups
Instantiate a dedicated client per project and pass `langfuse_public_key` where needed.
```python
from langfuse import Langfuse, observe

project_a = Langfuse(public_key="pk-lf-project-a-...", secret_key="sk-lf-project-a-...")
project_b = Langfuse(public_key="pk-lf-project-b-...", secret_key="sk-lf-project-b-...")

@observe
def process_data_for_project_a(data, langfuse_public_key="pk-lf-project-a-..."):
    return {"processed": data}

@observe
def process_data_for_project_b(data, langfuse_public_key="pk-lf-project-b-..."):
    return {"processed": data}
```

You can also route the OpenAI or LangChain integrations by passing `langfuse_public_key` on each call.
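The routing works because the decorator resolves which client to use from the `langfuse_public_key` it receives at call time. A simplified, hypothetical sketch of that mechanism (`observe_sketch` and `CLIENTS` are illustrations, not the real implementation):

```python
import functools

# Hypothetical registry; the strings stand in for Langfuse(...) instances.
CLIENTS = {
    "pk-lf-project-a": "client_a",
    "pk-lf-project-b": "client_b",
}

def observe_sketch(fn):
    @functools.wraps(fn)
    def wrapper(*args, langfuse_public_key=None, **kwargs):
        # Pick the client for this call; fall back to a default otherwise.
        client = CLIENTS.get(langfuse_public_key, "default_client")
        print(f"tracing via {client}")
        return fn(*args, **kwargs)
    return wrapper

@observe_sketch
def process_data(data):
    return {"processed": data}

result = process_data("payload", langfuse_public_key="pk-lf-project-a")
```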
## Environment-specific considerations
### Thread pools and multiprocessing
Use the OpenTelemetry threading instrumentor so context flows across worker threads.
```python
from opentelemetry.instrumentation.threading import ThreadingInstrumentor

ThreadingInstrumentor().instrument()
```

For multiprocessing, follow the OpenTelemetry guidance. If you use Pydantic Logfire, enable `distributed_tracing=True`.
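The instrumentor is needed because Python worker threads do not inherit the calling thread's context variables, which is where OTEL stores the active span. A stdlib-only illustration of the problem and the fix, using `contextvars` directly:

```python
import contextvars
from concurrent.futures import ThreadPoolExecutor

# Stand-in for OTEL's context slot holding the active trace.
current_trace = contextvars.ContextVar("current_trace", default=None)
current_trace.set("trace-123")

def read_trace():
    return current_trace.get()

with ThreadPoolExecutor() as pool:
    # Without propagation, the worker thread sees an empty context.
    lost = pool.submit(read_trace).result()
    # Propagating a snapshot restores it (what ThreadingInstrumentor automates).
    ctx = contextvars.copy_context()
    kept = pool.submit(ctx.run, read_trace).result()

print(lost, kept)
```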
### Distributed tracing
Prefer native OTEL propagation when linking services. The `trace_context` argument should be a last resort because it forces root-span semantics server-side.
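On the wire, native OTEL propagation means services exchange a W3C `traceparent` header: the caller injects it, and the callee extracts it to continue the same trace instead of starting a new root. A minimal stdlib sketch of the header format (real code should use OTEL's propagator APIs; the IDs below are illustrative):

```python
# traceparent layout: version-traceid-spanid-flags
trace_id = "0af7651916cd43dd8448eb211c80319c"
parent_span_id = "b7ad6b7169203331"
traceparent = f"00-{trace_id}-{parent_span_id}-01"

def extract(header: str) -> dict:
    version, tid, span_id, flags = header.split("-")
    return {"trace_id": tid, "parent_span_id": span_id, "sampled": flags == "01"}

print(extract(traceparent))
```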
## Time to first token (TTFT)

Set `completion_start_time` when the first token arrives so Langfuse can compute the TTFT for a generation.
```python
import datetime
import time

from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_observation(as_type="generation", name="TTFT-Generation") as generation:
    time.sleep(3)  # stands in for waiting on the model's first token
    generation.update(
        completion_start_time=datetime.datetime.now(),
        output="some response",
    )

langfuse.flush()
```

## Self-signed TLS certificates
```bash
OTEL_EXPORTER_OTLP_TRACES_CERTIFICATE="/path/to/my-selfsigned-cert.crt"
```

```python
import os

import httpx

from langfuse import Langfuse

httpx_client = httpx.Client(verify=os.environ["OTEL_EXPORTER_OTLP_TRACES_CERTIFICATE"])
langfuse = Langfuse(httpx_client=httpx_client)
```

Understand the security implications before trusting self-signed certificates.
## Evaluation & scoring
Use observation methods, context-aware helpers, or low-level APIs to submit scores.
```python
from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_observation(as_type="generation", name="summary_generation") as gen:
    gen.update(output="summary text...")
    gen.score(name="conciseness", value=0.8, data_type="NUMERIC")
    gen.score_trace(name="user_feedback_rating", value="positive", data_type="CATEGORICAL")

with langfuse.start_as_current_observation(as_type="span", name="complex_task"):
    langfuse.score_current_span(name="task_component_quality", value=True, data_type="BOOLEAN")
```

The low-level API attaches scores to an existing trace or observation by ID:

```python
langfuse.create_score(
    name="fact_check_accuracy",
    value=0.95,
    trace_id="abcdef1234567890abcdef1234567890",
    observation_id="1234567890abcdef",
    data_type="NUMERIC",
    comment="Source verified for 95% of claims.",
)
```

## Dataset runs
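At its core, a dataset run loops over items, calls the application under test, and compares outputs against expectations. A stdlib-only sketch of that loop, where `items` and `summarize` are hypothetical stand-ins for `dataset.items` and your application:

```python
# Hypothetical dataset items mirroring the shape used below.
items = [
    {"input": "Long article...", "expected_output": "Short summary."},
    {"input": "Another article...", "expected_output": "Another summary."},
]

def summarize(text: str) -> str:
    return "Short summary."  # placeholder application under test

# Exact-match is the simplest possible metric; real runs often use LLM judges.
matches = sum(
    1 for item in items if summarize(item["input"]) == item["expected_output"]
)
print(f"exact-match: {matches}/{len(items)}")
```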
```python
from langfuse import get_client

langfuse = get_client()

dataset = langfuse.get_dataset(name="my-eval-dataset")
for item in dataset.items:
    print(item.input, item.expected_output)

langfuse.create_dataset(name="new-summarization-tasks")
langfuse.create_dataset_item(
    dataset_name="new-summarization-tasks",
    input={"text": "Long article..."},
    expected_output={"summary": "Short summary."},
)
```

## Observation types
Specify observation types via decorators or context managers.
```python
from langfuse import observe

@observe(as_type="tool")
def retrieve_context(query):
    return vector_store.get(query)
```

```python
from langfuse import get_client

langfuse = get_client()

with langfuse.start_as_current_observation(as_type="chain", name="retrieval-pipeline") as chain:
    with langfuse.start_as_current_observation(as_type="retriever", name="vector-search") as retriever:
        retriever.update(output={"results": perform_vector_search("user question")})
```

Here `vector_store` and `perform_vector_search` stand in for your own retrieval code.