INTELLIGENT ROUTING FOR LLM SYSTEMS

It's just a good thinking game 🎲

XUNZHUO

FOCUS: INTELLIGENT ROUTING / LLM SYSTEMS
WORK: OPEN SOURCE / RESEARCH / INFRASTRUCTURE

I study intelligent routing and system intelligence for LLMs.

My current work connects semantic routing with the Workload-Router-Pool architecture, which uses workload signals, policy loops, and coordinated serving pools to jointly optimize quality, cost, latency, privacy, and safety.

Latest log: The Second Half of LLM Routing
  • 01:00

    SEMANTIC ROUTING

    Signal-driven orchestration for model, tool, and policy selection.

    • Reasoning-aware routing strategies
    • Latency, cost, and quality controls
    • Decision loops that optimize system behavior
  • 02:35

    OPEN SOURCE SYSTEMS

    Pragmatic infrastructure work across gateways, inference stacks, and control planes.

    • Gateway and Envoy ecosystem leadership
    • Inference platform design and operations
    • Portable patterns for production AI systems
  • 03:20

    RESEARCH AND STANDARDS

    Writing, evaluation, and community work that reframes routing as a systems problem.

    • Research on routing, search, and reasoning
    • Community roles across CNCF and Kubernetes
    • Standards work for AI gateway architecture
Primary Track

vLLM Semantic Router

Co-Founder

Signal-driven decision routing for mixture-of-modality deployments.

Node 01

Envoy Gateway

Steering Committee and Maintainer

Manages Envoy Proxy as a standalone or Kubernetes-based application gateway.

Node 02

Envoy AI Gateway

Maintainer

Manages unified access to generative AI services built on Envoy Gateway.

Node 03

vLLM AIBrix

Maintainer

Cost-efficient and pluggable infrastructure components for GenAI inference.

Node 04

Higress

Approver

AI gateway and AI-native API gateway.

Node 05

Istio

Maintainer

Connects, secures, controls, and observes services.

Node 06

Kiali

Maintainer

Observability console for the Istio service mesh.

Paper Archive

Recent papers on routing, systems, and inference optimization.

15 papers / 2025-2026

A selection from recent work on semantic routing, agent behavior, and AI infrastructure. More papers are collected on Works.

Co-Chair

Kubernetes AI Gateway Working Group

Leading the community effort to define standards for AI gateways in the Kubernetes ecosystem.

Fall 2023 Ambassador

CNCF Ambassador

Representing and promoting Cloud Native Computing Foundation projects and values globally.

2024 Program

Linux Foundation APAC Open Source Evangelist

Advocating for open source adoption and best practices across the Asia-Pacific region.

KubeCon 2024 Hong Kong

KubeCon Program Committee

Reviewing and selecting talks for one of the largest cloud-native conferences.