% Copyright 2019 by Till Tantau
%
% This file may be distributed and/or modified
%
% 1. under the LaTeX Project Public License and/or
% 2. under the GNU Free Documentation License.
%
% See the file doc/generic/pgf/licenses/LICENSE for more details.


\section{Visualizers}
\label{section-dv-visualizers}

\subsection{Overview}

In a data visualization a long stream of data points is \emph{visualized} using
\emph{visualizers}. Recall that it is the job of the axis systems as described
in Section~\ref{section-dv-axes} to determine \emph{where} data points are
visualized. It is the job of the visualizers to determine \emph{how} they are
visualized.

The most basic and common visualizer is the \emph{line visualizer}. It simply
connects subsequent data points by straight lines, either to indicate that the
points on these lines interpolate between the real data points or to indicate
the order in which the data points appear. A different, more ``conservative''
visualizer is the \emph{scatter visualizer} or
\emph{mark visualizer}, which just places a small mark at each data point. Such
a visualizer does not imply any interpolation or ordering between the data
points.

Visualizers may, however, also be more complicated. For instance, a visualizer
used for a box plot could visualize a data point as a box with a median value,
standard deviation, outliers, and other information; a rectangle visualizer
might visualize data points as larger areas; a projection visualizer might
visualize the projection of data points onto different axes; and so on.

Creating a new visualizer is not quite trivial since a new \pgfname\ class
needs to be implemented. Fortunately, using visualizers is much simpler: For
each kind of visualizer there is a key that allows you to create such a
visualizer. You can then use further keys to configure the visualizer and to
connect it to the data.

In a data visualization multiple visualizers may exist at the same time. This
happens in different situations:
%
\begin{itemize}
    \item A data visualization may contain several independent data sets that
        are to be visualized. There might be a line plot, for which a line
        visualizer is used, and also a scatter plot, for which a scatter
        visualizer would be used.

        In this case, for each data point only one visualizer will do anything.
        To achieve this, each data point has an attribute called |visualizer|
        which tells the visualizer objects whether they should ``react'' to the
        data point or not.
    \item A single data point might be visualized several times. For instance,
        a scatter visualizer might draw a mark at the data point's position on
        the page and a projection visualizer might draw, additionally, a mark
        at the projected position.
\end{itemize}


\subsection{Usage}

\subsubsection{Using a Single Visualizer}

The simplest scenario for using visualizers is a data visualization in which
there is only a single data set that is visualized in one style. In this case,
all that needs to be done in order to choose a visualizer is to use one of the
options starting with |visualize as ...| together with the |\datavisualization|
command:
%
\begin{codeexample}[preamble={\usetikzlibrary{datavisualization}}]
% Define a data set:
\tikz \datavisualization data group {example} = {
data {
  x, y
  0, 0
  0.5, 2
  1, 2
  1.5, 1.5
  2, 0.5
}};
\tikz \datavisualization [school book axes, visualize as line]        data group {example};
\qquad
\tikz \datavisualization [school book axes, visualize as smooth line] data group {example};
\qquad
\tikz \datavisualization [school book axes, visualize as scatter]     data group {example};
\end{codeexample}

Methods for styling visualizers are discussed in
Section~\ref{section-dv-visualizer-styling}.


\subsubsection{Using Multiple Visualizers}

A data visualization may contain multiple data sets and for each data set we
might wish to use a different visualizer. In this case, we need some way of
telling the data visualization engine which visualizer should be used with
which data points.

To solve this problem, you can \emph{name} a visualizer. The visualizer's name
can then both be used to configure the visualizer and also to indicate that
data points ``belong'' to the visualizer.

Naming a visualizer is quite simple: The |visualize as ...| keys actually take
a single parameter, which is the name of the visualizer. For instance, the
following code creates three visualizers, named |sin|, |cos|, and |tan|:
%
\begin{codeexample}[code only]
visualize as line=sin,
visualize as line=cos,
visualize as scatter=tan
\end{codeexample}

(When you just say |visualize as line| without providing a name, the name
|line| is chosen as a default, for |visualize as scatter| the name |scatter| is
the default, and so on.)

In order to indicate which data points should be visualized by which of these
visualizers, the following key is important:

\begin{key}{/data point/set}
    A visualizer will only act on a data point when its name matches the value
    of this key. Initially, this key is set to the last visualizer created, so
    if there is only one, there is no need to set or worry about this key.
\end{key}

Since the |set| key has the path prefix |/data point|, it can be set like any
other attribute of a data point:
%
\begin{codeexample}[width=7cm,preamble={\usetikzlibrary{datavisualization}}]
\tikz \datavisualization
 [scientific axes=clean,
  visualize as line=sin,
  visualize as line=cos,
  visualize as scatter=tan]
data {
  x, y, set
  0, 0, sin
  1, 1, sin
  2, 0, sin
  3, -1, sin
  4, 0, sin
  0, 1, cos
  1, 0, cos
  0, 0, tan
  1, 1, tan
  2, 2, tan
  3, 4, tan
  2, -1, cos
  3, 0, cos
  4, 1, cos
};
\end{codeexample}

As can be seen, the data points with the same |set| attribute do not need to be
consecutive.

The above method of specifying the visualizer works nicely, but in most cases
it would be more natural to keep the |set| attribute out of the table. This is
easy to achieve by using multiple |data| blocks together with the following key:

\begin{key}{/pgf/data/set=\meta{name}}
    Shorthand for |/data point/set=|\meta{name}.
    %
\begin{codeexample}[width=7cm,preamble={\usetikzlibrary{datavisualization}}]
\tikz \datavisualization
 [scientific axes=clean,
  visualize as line=sin,
  visualize as line=cos]
data [set=sin] {
  x, y
  0, 0
  1, 1
  2, 0
  3, -1
  4, 0
}
data [set=cos] {
  x, y
  0, 1
  1, 0
  2, -1
  3, 0
  4, 1
};
\end{codeexample}
    %
\end{key}

When you need to visualize several similar things in a single plot (like ten
lines that all get visualized by |visualize as line|), it is somewhat
cumbersome having to write this ten times. In this case you can shorten your
code by making use of the |.list| key handler: When you add it to a key, the
``value'' passed to the key is parsed as a list of values. The key is then
executed once for each of these values:
%
\begin{codeexample}[
    width=7cm,
    preamble={\usetikzlibrary{datavisualization.formats.functions}},
]
\tikz \datavisualization
 [scientific axes=clean,
  visualize as line/.list={sin, cos, tan}]
data [set=sin, format=function] {
  var x : interval[0:3*pi];
  func y = sin(\value x r);
}
data [set=cos, format=function] {
  var x : interval[0:3*pi];
  func y = cos(\value x r);
}
data [set=tan, format=function] {
  var x : interval[0:pi/2.2];
  func y = tan(\value x r);
};
\end{codeexample}
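
To make the effect of |.list| concrete, the visualizer setup from the example
above can be spelled out by hand. This sketch relies only on the behavior just
described, namely that the key is executed once for each entry of the list:
%
\begin{codeexample}[code only]
% Short form using the .list handler:
visualize as line/.list={sin, cos, tan}

% Long form with the same effect:
visualize as line=sin,
visualize as line=cos,
visualize as line=tan
\end{codeexample}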


\subsubsection{Styling a Visualizer}
\label{section-dv-visualizer-styling}

In order to style a visualizer that has been created using, for instance,
|visualize as line=|\meta{visualizer name}, you can use the following key:

\begin{key}{/tikz/data visualization/\meta{visualizer name}=\meta{options}}
    For each visualizer, a key of the same name is created with the path prefix
    |/tikz/data visualization|. This key takes the \meta{options} and executes
    them with the path prefix
    %
\begin{codeexample}[code only]
/tikz/data visualization/visualizer options/
\end{codeexample}
    %
    These options are then used to configure the appearance of the current
    visualizer. (This is quite similar to the way options are passed to an axis
    in order to configure the axis.) Possible options include |style|, but also
    |label in legend| and |label in data|. The latter two options are discussed
    in Section~\ref{section-dv-labels-in}, the first option below.
    %
\begin{codeexample}[
    width=7cm,
    preamble={\usetikzlibrary{datavisualization.formats.functions}},
]
\tikz \datavisualization
 [scientific axes=clean,
  visualize as smooth line/.list={sin, cos},
  sin={style=red},
  cos={style=blue}]
data [set=sin, format=function] {
  var x : interval[0:3*pi];
  func y = sin(\value x r);
}
data [set=cos, format=function] {
  var x : interval[0:3*pi];
  func y = cos(\value x r);
};
\end{codeexample}
    %
\end{key}
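
Because a visualizer created without an explicit name receives a default name,
the same mechanism should also apply to it. The following is a minimal sketch,
assuming that |visualize as line| without a value creates a visualizer named
|line|, as described earlier. The data values are made up for illustration:
%
\begin{codeexample}[code only]
\tikz \datavisualization
 [school book axes,
  visualize as line,           % no name given, so the name is "line"
  line={style={red, thick}}]   % configure the visualizer via its default name
data {
  x, y
  0, 0
  1, 1
  2, 0.5
};
\end{codeexample}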

\begin{key}{/tikz/data visualization/visualizer options/style=\meta{options}}
    The \meta{options} given to this key should be normal \tikzname\ options.
    They will be executed when the visualizer is used.
    %
\begin{codeexample}[
    width=7cm,
    preamble={\usetikzlibrary{datavisualization.formats.functions}},
]
\tikz \datavisualization
 [scientific axes=clean,
  visualize as smooth line=sin,
  sin={style={red, densely dotted}},
  visualize as smooth line=cos,
  cos={style={mark=x}},
]
data [set=sin, format=function] {
  var x : interval[0:3*pi];
  func y = sin(\value x r);
}
data [set=cos, format=function] {
  var x : interval[0:3*pi];
  func y = cos(\value x r);
};
\end{codeexample}

    When you have multiple visualizers in a single data visualization, you can
    use the |style| option with each visualizer to configure their different
    appearances as in the above example. However, it is usually much better
    (and easier) to use a style sheet, see
    Section~\ref{section-dv-style-sheets}.
    %
\begin{codeexample}[
    width=7cm,
    preamble={\usetikzlibrary{datavisualization.formats.functions}},
]
\tikz \datavisualization
 [scientific axes={clean, end labels},
  x axis={label=$x$}, y axis={grid={major also at=0}},
  visualize as smooth line/.list={sin,cos,sin 2,cos 2},
  legend={below, rows=2},
  sin={label in legend={text=$\sin x$}},
  cos={label in legend={text=$\cos x$}},
  sin 2={label in legend={text=$\sin 2x$}},
  cos 2={label in legend={text=$\cos 2x$}},
  style sheet=strong colors]
data [set=sin, format=function] {
  var x : interval[0:3*pi];
  func y = sin(\value x r);
}
data [set=cos, format=function] {
  var x : interval[0:3*pi];
  func y = cos(\value x r);
}
data [set=sin 2, format=function] {
  var x : interval[0:3*pi];
  func y = sin(2*\value x r);
}
data [set=cos 2, format=function] {
  var x : interval[0:3*pi];
  func y = cos(2*\value x r);
};
\end{codeexample}
    %
\end{key}

\begin{key}{/tikz/data visualization/visualizer options/ignore style sheets}
    This option, which should be passed to a visualizer after its creation and
    before another visualizer is created, causes style sheets \emph{not} to
    apply to the visualizer (but the |style| option will still have an effect).
    This allows you to create visualizers that are used for special purposes
    and that do not ``take part'' in the usual styling. For instance, a
    visualizer might be used internally to depict a regression line, even
    though the regression line itself should not participate in the usual
    styling by, say, dashing or different coloring.
\end{key}
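
The regression-line scenario for |ignore style sheets| might be set up roughly
as follows. This is only a sketch: the visualizer names |measured| and |fit|
and all data values are hypothetical, and the ``regression line'' is simply
given by two hand-picked points. Note that |fit| receives its options directly
after its creation and before any further visualizer is created:
%
\begin{codeexample}[code only]
\tikz \datavisualization
 [scientific axes=clean,
  visualize as scatter=measured,  % styled by the style sheet
  visualize as line=fit,          % special-purpose visualizer
  fit={ignore style sheets, style={black, thin}},
  style sheet=strong colors]
data [set=measured] {
  x, y
  0, 0.1
  1, 0.9
  2, 2.2
  3, 2.9
}
data [set=fit] {
  x, y
  0, 0
  3, 3
};
\end{codeexample}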

In addition to the options passed to a visualizer via |style|, the following
style is also executed when a visualizer is used:

\begin{stylekey}{/tikz/data visualization/every visualizer}
    This style is used with every visualizer. Note that it should contain
    normal \tikzname\ keys.
    %
\begin{codeexample}[
    width=7cm,
    preamble={\usetikzlibrary{datavisualization.formats.functions}},
]
\tikz \datavisualization
 [scientific axes=clean,
  every visualizer/.style={dashed},
  visualize as smooth line]
data [format=function] {
  var x : interval[0:3*pi];
  func y = sin(\value x r);
};
\end{codeexample}
    %
\end{stylekey}


\subsection{Reference: Basic Visualizers}

\subsubsection{Visualizing Data Points Using Lines}

\begin{key}{/tikz/data visualization/visualize as line=\meta{visualizer name} (default line)}
    Creates a new visualizer named \meta{visualizer name}. Basically, this
    visualizer connects all data points for which the |/data point/set|
    attribute equals \meta{visualizer name} by a line that is styled by the
    visualizer's style.

    In more detail, the following happens:
    %
    \begin{enumerate}
        \item A new object is created (of class |plot handler visualizer|) that
            is configured to collect the canvas positions of all data points
            whose |set| attribute equals \meta{visualizer name}.
        \item At the end of the data visualization, \pgfname's plotting
            mechanism (see Section~\ref{section-plots}) is used to plot the
            stream of recorded data points.

            This means that, in principle, all of the plot handlers available
            in \tikzname\ could be used for the visualization (such as the
            |smooth| handler). However, some plot handlers such as, say, the
            |xcomb| handler are unsuitable since they do not support
            the advanced axis handling done by the data visualization engine.
            Because of this (and also for other reasons), you cannot set the
            plot handler directly, but must use one of the options like
            |straight line|, |smooth line| and others, documented in a moment.
        \item Additionally, plot marks can be drawn at the collected data
            points. All of the options offered by \tikzname\ for drawing and
            configuring plot marks are available, such as |mark repeat|:
            %
\begin{codeexample}[
    width=7cm,
    preamble={\usetikzlibrary{datavisualization.formats.functions}},
]
\tikz \datavisualization
 [scientific axes=clean,
  visualize as line=my data,
  my data={style={mark=x, mark repeat=3}}]
data [format=function] {
  var x : interval [0:pi] samples 10;
  func y = sin(\value x r);
};
\end{codeexample}
    \end{enumerate}

    The line visualizer also provides a method of dealing with gaps in a line.
    Take for instance the function $f(x) = \tan x$. When this function is
    plotted over the interval $[0,\pi]$, then the function will go to $\pm
    \infty$ at $\pi/2$. When we plot this, we might plot the function in the
    interval $[0,\frac{\pi}{2}-\epsilon]$ and then continue in the interval
    $[\frac{\pi}{2}+\epsilon,\pi]$. However, we do not want the point at
    coordinate $\bigl(\frac{\pi}{2}- \epsilon, \tan(\frac{\pi}{2}-
    \epsilon)\bigr)$ to be connected to the coordinate $\bigl(\frac{\pi}{2}+
    \epsilon, \tan(\frac{\pi}{2}+ \epsilon)\bigr)$ by a line. Rather, there
    should be a ``gap'' or a ``jump'' between these coordinates. To achieve
    this, the following key can be used:
    %
    \begin{key}{/data point/outlier=\meta{value} (default true, initially \normalfont empty)}
        When this key is set to any non-empty value, a visualizer will
        consider this data point to be an ``outlier''. For a line visualizer
        this means that the point is not shown and that the current line ends
        at the previous data point and a new line starts at the next data
        point.
        %
\begin{codeexample}[
    width=7cm,
    preamble={\usetikzlibrary{datavisualization.formats.functions}},
]
\tikz \datavisualization
 [scientific axes=clean, x axis={grid={major at=(pi/2)}},
  visualize as smooth line]
data [format=function] {
  var x : interval[0:pi/2-0.1];
  func y = tan(\value x r);
}
data point [outlier]
data [format=function] {
  var x : interval[pi/2+0.1:pi];
  func y = tan(\value x r);
};
\end{codeexample}
    \end{key}
\end{key}

\begin{key}{/tikz/data visualization/visualize as smooth line=\meta{visualizer name} (default line)}
    A shorthand for |visualize as line=|\meta{visualizer name} followed by
    \meta{visualizer name}|=smooth line|.
\end{key}
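
Spelled out, the shorthand corresponds to the following pair of keys. The
visualizer name |my data| is merely a placeholder chosen for this sketch:
%
\begin{codeexample}[code only]
% Shorthand:
visualize as smooth line=my data

% Long form, as described above:
visualize as line=my data,
my data={smooth line}
\end{codeexample}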

\begin{key}{/tikz/data visualization/visualizer options/straight line}
    Causes the data points to be connected by straight lines.
    %
\begin{codeexample}[preamble={\usetikzlibrary{datavisualization.formats.functions}}]
\tikz [scale=.55] \datavisualization
 [scientific axes=clean, all axes={ticks=few},
  visualize as smooth line=my data,  my data={straight line}]
data [format=function] {
  var t : interval [0:4] samples 5;
  func x = cos(\value t r);
  func y = sin(\value t r);
};
\end{codeexample}
    %
\end{key}

\begin{key}{/tikz/data visualization/visualizer options/straight cycle}
    Causes the data points to be connected by a polygon.
    %
\begin{codeexample}[preamble={\usetikzlibrary{datavisualization.formats.functions}}]
\tikz [scale=.55] \datavisualization
 [scientific axes=clean, all axes={ticks=few},
  visualize as smooth line=my data,  my data={straight cycle}]
data [format=function] {
  var t : interval [0:4] samples 5;
  func x = cos(\value t r);
  func y = sin(\value t r);
};
\end{codeexample}
    %
\end{key}

\begin{key}{/tikz/data visualization/visualizer options/polygon}
    This is an alias for |straight cycle|.
\end{key}

\begin{key}{/tikz/data visualization/visualizer options/smooth line}
    Causes the data points to be connected by a line that is smoothed at the
    joins:
    %
\begin{codeexample}[preamble={\usetikzlibrary{datavisualization.formats.functions}}]
\tikz [scale=.55] \datavisualization
 [scientific axes=clean, all axes={ticks=few},
  visualize as smooth line=my data,  my data={smooth line}]
data [format=function] {
  var t : interval [0:4] samples 5;
  func x = cos(\value t r);
  func y = sin(\value t r);
};
\end{codeexample}
    %
\end{key}

\begin{key}{/tikz/data visualization/visualizer options/smooth cycle}
    Causes the data points to be connected by a closed line that is smoothed
    at the joins:
    %
\begin{codeexample}[preamble={\usetikzlibrary{datavisualization.formats.functions}}]
\tikz [scale=.55] \datavisualization
 [scientific axes=clean, all axes={ticks=few},
  visualize as smooth line=my data,  my data={smooth cycle}]
data [format=function] {
  var t : interval [0:4] samples 5;
  func x = cos(\value t r);
  func y = sin(\value t r);
};
\end{codeexample}
    %
\end{key}

\begin{key}{/tikz/data visualization/visualizer options/gap line}
    This key causes the data points to be connected by lines that ``do not
    quite touch'' the data points. This is implemented by using the
    |\pgfplothandlergaplineto|, see Section~\ref{section-plot-gapped}.
    %
\begin{codeexample}[preamble={\usetikzlibrary{datavisualization.formats.functions}}]
\tikz [scale=.55] \datavisualization
 [scientific axes=clean, all axes={ticks=few},
  visualize as smooth line=my data,  my data={gap line}]
data [format=function] {
  var t : interval [0:4] samples 5;
  func x = cos(\value t r);
  func y = sin(\value t r);
};
\end{codeexample}
    %
\end{key}

\begin{key}{/tikz/data visualization/visualizer options/gap cycle}
    Like |gap line|, only with a cycle:
    %
\begin{codeexample}[preamble={\usetikzlibrary{datavisualization.formats.functions}}]
\tikz [scale=.55] \datavisualization
 [scientific axes=clean, all axes={ticks=few},
  visualize as smooth line=my data,  my data={gap cycle}]
data [format=function] {
  var t : interval [0:4] samples 5;
  func x = cos(\value t r);
  func y = sin(\value t r);
};
\end{codeexample}
    %
\end{key}

\begin{key}{/tikz/data visualization/visualizer options/no lines}
    Suppresses the line. This option only makes sense when the |mark| option is
    used.
    %
\begin{codeexample}[preamble={\usetikzlibrary{datavisualization.formats.functions}}]
\tikz [scale=.55] \datavisualization
 [scientific axes=clean, all axes={ticks=few},
  visualize as smooth line=my data,  my data={no lines, style={mark=x}}]
data [format=function] {
  var t : interval [0:4] samples 5;
  func x = cos(\value t r);
  func y = sin(\value t r);
};
\end{codeexample}
    %
\end{key}


\subsubsection{Visualizing Data Points Using Marks}

\begin{key}{/tikz/data visualization/visualize as scatter=\meta{visualizer name} (default scatter)}
    A shorthand for |visualize as line=|\meta{visualizer name} followed by
    \meta{visualizer name}|=no lines| and setting the |style| of the visualizer
    so that it will use |mark=x| (plus some size adjustments) to draw marks at
    the data points.
    %
\begin{codeexample}[
    width=7cm,
    preamble={\usetikzlibrary{datavisualization.formats.functions}},
]
\tikz \datavisualization
 [scientific axes=clean,
  visualize as scatter]
data [format=function] {
  var x : interval [0:pi] samples 10;
  func y = sin(\value x r);
};
\end{codeexample}
    %
\end{key}
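
As a rough sketch, the scatter shorthand can be approximated by the long form
below. The name |my data| is a placeholder, and the size adjustments mentioned
above are deliberately left out, so the marks may not look identical to those
produced by |visualize as scatter|:
%
\begin{codeexample}[code only]
% Shorthand:
visualize as scatter=my data

% Approximate long form (without the size adjustments):
visualize as line=my data,
my data={no lines, style={mark=x}}
\end{codeexample}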


\subsection{Advanced: Creating New Visualizers}

Creating a new visualizer is a two-stage process that does, unfortunately,
require in-depth knowledge of the data visualization backend:
%
\begin{enumerate}
    \item First, you need to create a new class using |\pgfooclass| whose
        instances react to the signal |visualize datapoint signal|. This
        requires detailed knowledge of the data visualization engine, see
        Section~\ref{section-dv-backend}.
    \item Second, you should provide keys on the \tikzname\ level for creating
        the necessary objects. These keys invoke the key |new visualizer|
        internally.
\end{enumerate}

\begin{key}{/tikz/data visualization/new visualizer=\marg{name}\marg{options}\marg{legend entry options}}
    This key configures a new visualizer named \meta{name}. This entails the
    following actions:
    %
    \begin{itemize}
        \item The key |/tikz/data visualization/|\meta{name} is created. As
            described earlier, this key can be used to pass for instance
            |style| options to the visualizer.
        \item The style key
            |/tikz/data visualization/visualizers/|\meta{name}|/styling| is
            created and made empty. This is the key in which the |style| key
            will store the options passed to the visualizer.
        \item The style key
            |/tikz/data visualization/visualizers/|\meta{name}|/label in legend options|
            is set to \meta{legend entry options}. These options are used to
            configure how the visualizer should be rendered in a legend, see
            Section~\ref{section-dv-legend-entries} for details.
        \item The key |/data point/set/|\meta{name} is set to a number that is
            increased for each visualizer in the current data visualization.
            This number is important for style sheets, see
            Section~\ref{section-dv-style-sheets}.
        \item The key |/data point/|\meta{name}|/execute at begin| is set to
            code that creates a |{scope}| that executes the following styles as
            options:
            %
            \begin{enumerate}
                \item The \meta{options} passed to the |new visualizer| key.
                \item The |every visualizer| style.
                \item The styling from the currently active style sheets, see
                    Section~\ref{section-dv-style-sheets}.
                \item The styling stored in the |styling| key mentioned above.
            \end{enumerate}
            %
        \item The key |/data point/|\meta{name}|/execute at end| is set to code
            that will finish all paths that may have been created by the
            visualizer and closes the scope.
    \end{itemize}

    All of the above means the following in practice:
    %
    \begin{itemize}
        \item Inside a new |visualize as ...| key, you pass the name of the
            to-be-created visualizer to |new visualizer| as the first parameter
            and any special default styling setup of the visualizer as the
            second parameter.
        \item The new |visualize as ...| key should also create a visualizer
            object using |new object|.
        \item When this object finally is about to create the actual
            visualization, it should surround the code by invoking the code
            stored in the |execute at begin| and the |execute at end| keys of
            the visualizer.
    \end{itemize}

    Everything else is usually taken care of by the |new visualizer| key
    automatically.
\end{key}

As an example, let us create a simple visualizer that creates a circle whose
radius is dictated by the |radius| attribute. To keep things simple in this
example, this attribute cannot be configured.

First, we need the visualizer class. For this example I have boiled it down to
a minimum:
%
\begin{codeexample}[code only]
\pgfooclass{circle visualizer}
{
  % Stores the name of the visualizer. This is needed for filtering and configuration
  \attribute name;

  % The constructor. Just set up the attribute.
  \method circle visualizer(#1) { \pgfooset{name}{#1} }

  % Connect to visualize signal.
  \method default connects() {
    \pgfoothis.get handle(\me)
    \pgfkeysvalueof{/pgf/data visualization/obj}.connect(\me,visualize,visualize datapoint signal)
  }

  % This method is invoked for each data point. It checks whether the data point belongs to the correct
  % visualizer and, if so, calls the macro \dovisualization to do the actual visualization.
  \method visualize() {
    \pgfdvfilterpassedtrue
    \pgfdvnamedvisualizerfilter
    \ifpgfdvfilterpassed
      \dovisualization
    \fi
  }
}
\end{codeexample}

The |\dovisualization| macro must now do the correct visualization.
%
\begin{codeexample}[code only]
\def\dovisualization{
  \pgfkeysvalueof{/data point/\pgfoovalueof{name}/execute at begin}
    \pgfpathcircle{\pgfpointdvdatapoint}{\pgfkeysvalueof{/data point/radius}}
    % \pgfusepath is done by |execute at end|
  \pgfkeysvalueof{/data point/\pgfoovalueof{name}/execute at end}
}
\end{codeexample}

Finally, we create a |visualize as| key:
%
\begin{codeexample}[code only]
\tikzdatavisualizationset{
  visualize as circle/.style={
    new object={
      when=after survey,
      store=/tikz/data visualization/visualizers/#1,
      class=circle visualizer,
      arg1=#1
    },
    new visualizer={#1}{%
      color=visualizer color,        % a color set up by the style sheet
      every path/.style={fill,draw}, % fill and draw the circle by default,
    }{}, % let's ignore legends in this example
    /data point/set=#1
  },
  visualize as circle/.default=circle
}
\end{codeexample}

Now, let's see how this works:

% TODOsp: codeexamples: This stuff is all needed for the next `codeexample`
%         but cannot be stored (simply) in `setup code`, `preamble` or `pre`
\pgfooclass{circle visualizer} {
  % Stores the name of the visualizer. This is needed for filtering
  % and configuration
  \attribute name;

  % The constructor. Just set up the attribute.
  \method circle visualizer(#1) { \pgfooset{name}{#1} }

  % Connect to visualize signal.
  \method default connects() {
    \pgfoothis.get handle(\me)
    \pgfkeysvalueof{/pgf/data visualization/obj}.connect(\me,visualize,visualize datapoint signal)
  }

  % This method is invoked for each data point. It checks whether the
  % data point belongs to the correct visualizer and, if so, calls the
  % macro \dovisualization to do the actual visualization.
  \method visualize() {
    \pgfdvfilterpassedtrue
    \pgfdvnamedvisualizerfilter
    \ifpgfdvfilterpassed
      \dovisualization
    \fi
  }
}

\def\dovisualization{
  \pgfkeysvalueof{/data point/\pgfoovalueof{name}/execute at begin}
    \pgfpathcircle{\pgfpointdvdatapoint}{\pgfkeysvalueof{/data point/radius}}
    % \pgfusepath is done by |execute at end|
  \pgfkeysvalueof{/data point/\pgfoovalueof{name}/execute at end}
}

\tikzdatavisualizationset{
  visualize as circle/.style={
    new object={
      when=after survey,
      store=/tikz/data visualization/visualizers/#1,
      class=circle visualizer,
      arg1=#1
    },
    new visualizer={#1}{%
      color=visualizer color,        % a color set up by the style sheet
      every path/.style={fill,draw}, % fill and draw the circle by default,
    }{},
    /data point/set=#1
  },
  visualize as circle/.default=circle
}

\begin{codeexample}[width=7cm,preamble={\usetikzlibrary{datavisualization}}]
\tikz \datavisualization [
  scientific axes=clean,
  visualize as circle/.list={a, b, c},
  style sheet=strong colors]
data [set=a] {
  x, y, radius
  0, 0, 2pt
  1, 1, 3pt
  1, 2, 3pt
  2, 0, 1pt
}
data [set=b] {
  x, y, radius
  0.5, 0.5, 5pt
  1, 1.5, 2pt
  1, 2.5, 3pt
  0, 2, 4pt
}
data [set=c] {
  x, y, radius
  3, 2, 3pt
  2.5, 0.5, 4pt
};
\end{codeexample}