← browse the library
Model inference serving preview
template

Model inference serving

Editable deployment figure composing the cloud, server, gpu, and database icons: client requests flow over the internet through a load balancer to GPU-backed inference nodes that read from a feature store, with a dashed inference-cluster boundary. Parametric spacing and labels.

This template ships an edit contract (in its meta.json) that the repo-wide using-opentikz skill reads to edit it reliably — the parameters and safe operations are listed below.

idinference-serving
typetemplate
domainsystems, ml
venueOSDI, NSDI, MLSys
requirestikz, arrows.meta, backgrounds, fit, positioning, shapes.symbols
licenseCC0-1.0
authorOpenTikZ contributors

inferenceservingdeploymentload balancergpucloudfeature storesystem diagram

Download SVG
template.tex
\documentclass[border=10pt]{standalone}

% --- packages (mirror these in figure.meta.json "requires") ---
\usepackage{tikz}
\usetikzlibrary{positioning, arrows.meta, shapes.symbols, fit, backgrounds}

% --- palette (canonical source: reference/color-palettes/color-palettes.md; light variant) ---
\definecolor{otblue}{HTML}{0072B2}
\definecolor{otorange}{HTML}{E69F00}
\definecolor{otteal}{HTML}{009E73}
\definecolor{otpurple}{HTML}{CC79A7}
\definecolor{otgray}{HTML}{5A5A5A}

% ===== reusable icon sub-pictures (adapted from icons/) ==================
\newcommand{\cloudpic}{%
  \begin{tikzpicture}
    \node[cloud, cloud puffs=11, cloud puff arc=110, aspect=2.2,
      draw=otblue!75!black, fill=otblue!15, line width=0.7pt,
      minimum width=2cm, minimum height=1.2cm]{};
  \end{tikzpicture}}
\newcommand{\serverpic}{%
  \begin{tikzpicture}[line width=0.55pt]
    \foreach \i in {0,1,2}{
      \filldraw[draw=otblue!75!black, fill=otblue!12, rounded corners=1pt]
        (0,\i*0.34) rectangle (1.0,\i*0.34+0.26);
      \filldraw[otteal] (0.13,\i*0.34+0.13) circle[radius=0.04];
    }
  \end{tikzpicture}}
\newcommand{\gpupic}{%
  \begin{tikzpicture}[line width=0.5pt]
    \filldraw[draw=otteal!70!black, fill=otteal!12, rounded corners=1pt] (0,0) rectangle (1.0,0.62);
    \begin{scope}[shift={(0.31,0.31)}]
      \filldraw[draw=otteal!75!black, fill=otteal!22] (0,0) circle[radius=0.17];
      \foreach \a in {0,90,180,270}{\draw[otteal!70!black, line width=0.4pt] (0,0)--(\a:0.15);}
    \end{scope}
    \node[font=\sffamily\tiny\bfseries, text=otteal!80!black] at (0.74,0.31){GPU};
  \end{tikzpicture}}
\newcommand{\dbpic}{%
  \begin{tikzpicture}[line width=0.7pt]
    \def\rx{0.55}\def\ry{0.20}\def\bh{1.05}
    \filldraw[draw=otteal!75!black, fill=otteal!15]
      (-\rx,0)--(-\rx,-\bh) arc[start angle=180,end angle=360,x radius=\rx,y radius=\ry]
      --(\rx,0) arc[start angle=0,end angle=180,x radius=\rx,y radius=\ry]--cycle;
    \filldraw[draw=otteal!75!black, fill=otteal!25] (0,0) ellipse[x radius=\rx,y radius=\ry];
    \draw[draw=otteal!75!black, line width=0.5pt]
      (-\rx,-0.5) arc[start angle=180,end angle=360,x radius=\rx,y radius=\ry];
  \end{tikzpicture}}
% ========================================================================

\begin{document}
% ==== parameters (edit these) ============================================
\def\colsep{3.0}    % horizontal gap between stages: client -> cloud -> lb -> nodes -> store (cm)
\def\rowsep{1.5}    % vertical offset of the stacked inference nodes (cm)
% labels
\def\clientlabel{Client}
\def\internetlabel{Internet}
\def\lblabel{Load\\balancer}
\def\nodeonelabel{Inference node 1}
\def\nodetwolabel{Inference node 2}
\def\storelabel{Feature store}
\def\featurelabel{features}
\def\clusterlabel{Inference cluster}
\def\titlelabel{Model inference serving}
% =========================================================================

\begin{tikzpicture}[
    >={Stealth[length=2.4mm]},
    icon/.style={inner sep=1pt},
    compbox/.style={draw=otgray!55, rounded corners=3pt, fill=otgray!8,
      align=center, font=\sffamily\small, minimum height=0.9cm, inner sep=5pt},
    nodebox/.style={draw=otgray!45, rounded corners=4pt, fill=otgray!4, inner sep=5pt},
    slbl/.style={font=\sffamily\footnotesize, text=otgray},
    req/.style={draw=otgray!75, line width=1pt, ->},
    read/.style={draw=otteal!70!black, line width=1pt, ->, dashed},
  ]

  \node[compbox] (client) at (0,0) {\clientlabel};
  \node[icon] (cloud) at (\colsep,0) {\cloudpic};
  \node[slbl, below=0pt of cloud] {\internetlabel};
  \node[compbox] (lb) at (2*\colsep,0) {\lblabel};

  % GPU-backed inference nodes (server + gpu composed in one box)
  \node[nodebox] (node1) at (3*\colsep,\rowsep)  {\serverpic\hspace{4pt}\gpupic};
  \node[nodebox] (node2) at (3*\colsep,-\rowsep) {\serverpic\hspace{4pt}\gpupic};
  \node[slbl, below=1pt of node1] (n1lab) {\nodeonelabel};
  \node[slbl, below=1pt of node2] (n2lab) {\nodetwolabel};

  \node[icon] (db) at (4.3*\colsep,0) {\dbpic};
  \node[slbl, below=2pt of db] {\storelabel};

  % request path
  \draw[req] (client.east) -- (cloud.west);
  \draw[req] (cloud.east) -- (lb.west);
  \draw[req] (lb.east) -- (node1.west);
  \draw[req] (lb.east) -- (node2.west);
  % feature reads
  \draw[read] (node1.east) -- (db.west);
  \draw[read] (node2.east) -- (db.west);
  \node[slbl, text=otteal!70!black] at (3.85*\colsep,0.55) {\featurelabel};

  % inference cluster boundary (fits the nodes and their labels)
  \begin{scope}[on background layer]
    \node[draw=otgray!50, dashed, rounded corners=6pt, fill=otgray!4, inner sep=10pt,
          fit=(node1)(node2)(n1lab)(n2lab)] (cluster) {};
    \node[slbl, anchor=south west] at ([xshift=2pt, yshift=2pt]cluster.north west) {\clusterlabel};
  \end{scope}

  % title
  \node[font=\sffamily\bfseries] at (2.17*\colsep,{\rowsep+1.7}) {\titlelabel};

\end{tikzpicture}
\end{document}

Edit contract — how the AI edits this template

using-opentikz skill →
Parameters & safe edit operations

Parameters

\colsephorizontal gap between request-path stages (client -> cloud -> lb -> nodes -> store) default 3.0
\rowsepvertical offset of the two stacked inference nodes from the centerline default 1.5
\clientlabelclient box label
\internetlabellabel under the cloud
\lblabelload-balancer label (use \\ for two lines)
\nodeonelabeltop inference-node label
\nodetwolabelbottom inference-node label
\storelabelfeature-store label
\featurelabellabel on the feature-read edges
\clusterlabeldashed inference-cluster boundary label
\titlelabelfigure title

Node naming

fixed semantic names along the request path: (client) (cloud) (lb); inference nodes (node1)(node2) with their labels (n1lab)(n2lab); (db) feature store; (cluster) dashed boundary box. New nodes get role-based names and join the (cluster) fit list if inside the boundary

Operations

  • rename-node — edit the matching label macro (\clientlabel, \lblabel, \node*label, \storelabel, \titlelabel)
  • add-inference-node — declare \node[nodebox] (node3) at (3*\colsep, y) {\serverpic\hspace{4pt}\gpupic} plus its slbl label, route a req edge from (lb.east) and a read edge to (db.west), then add the node and its label to the (cluster) fit list
  • change-spacing — edit \colsep (stage gap) or \rowsep (node offset); node and store x-positions are multiples of \colsep, title position is derived from \colsep/\rowsep
  • swap-icon — replace a node body picture macro (\cloudpic, \serverpic, \gpupic, \dbpic) with another icon sub-picture adapted from icons/
  • recolor — change the palette name in the req/read styles or an icon macro; keep request vs feature-read flows in distinct colors; never inline hex

Use it

The file compiles on its own (\documentclass{standalone}). Drop it into your project and \input it, or copy the tikzpicture into your figure. Colours come from the shared palette defined in the preamble — edit those named colours, not raw hex.

Graphic content is CC0 1.0 (public domain) — reuse freely, no attribution required.