Yoix / YDAT, the Yoix Data Analysis Tool
New YDAT Instantiation Available: YDATCLF, for both CLF formats (common log or combined log), helps you make sense of activity on your web server.
Interested in a 3D YDAT? Let us know.
YDAT is a data visualization module for Yoix.
It consists of a data manager, graph, histogram and event plot canvases, and axis, palette and other auxiliary components.
It is included in the Yoix jar and is also licensed as open source.
The Data Manager
The function of the data manager is primarily to maintain an in-core database that is used to coordinate any and all data visualization components.
The manager responds to inputs either from the visualization components or from programmatic directives and provides input to all interested YDAT components.
As data records are selected or deselected in one visualization component, the data manager is informed and the other visualization components are updated.
In some components, the data-record to visualization-element mapping might be one-to-one;
in others, it can be many-to-one, in which case the appropriate proportion of an element that is selected or deselected is indicated
by adjustments to background fill shading.
For example, consider records containing cancer rate data with patient county of residence and gender information.
Figure 1a shows YDAT histograms of patient counts by county and gender.
When the male gender histogram bar is deselected (figure 1b), the data manager causes the county
patient counts to be adjusted since only female patient records are now selected.
A secondary function of the data manager is to keep track of the
various visualization components and to coordinate requests coming
from them so that deadlock situations do not arise.
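The linked-selection behavior described above can be illustrated with a small Python sketch. This is not YDAT's implementation or API: the `DataManager` class, its method names and the sample records are all hypothetical, invented here only to show how one selection change can drive the selected and total counts that a histogram bar would use for its colored and greyed portions.

```python
from collections import Counter


class DataManager:
    """Toy stand-in for a YDAT-style data manager (hypothetical, not the Yoix API)."""

    def __init__(self, records):
        self.records = list(records)
        self.selected = [True] * len(self.records)
        self.listeners = []  # callables invoked after every selection change

    def subscribe(self, listener):
        self.listeners.append(listener)

    def set_selected(self, predicate, state):
        """Select or deselect every record matching predicate, then notify listeners."""
        for i, rec in enumerate(self.records):
            if predicate(rec):
                self.selected[i] = state
        for listener in self.listeners:
            listener(self)

    def histogram(self, field):
        """Return {category: (selected_count, total_count)} for one field."""
        total = Counter(rec[field] for rec in self.records)
        chosen = Counter(
            rec[field] for rec, on in zip(self.records, self.selected) if on
        )
        return {cat: (chosen[cat], total[cat]) for cat in total}


# Made-up sample records, for illustration only.
records = [
    {"county": "Taos", "gender": "F"},
    {"county": "Taos", "gender": "M"},
    {"county": "Taos", "gender": "M"},
    {"county": "Luna", "gender": "F"},
]
manager = DataManager(records)
updates = []
manager.subscribe(lambda m: updates.append(m.histogram("county")))

# Deselecting the male bar shrinks the selected share of each county bar.
manager.set_selected(lambda rec: rec["gender"] == "M", False)
county_bars = manager.histogram("county")
```

Each county entry pairs the selected count with the total count, so a display component could color the selected fraction and grey out the rest.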
The Visualization Components: Event Plots, Histograms and Graphs
Currently, there are three types of visualization components: event plots, histograms and graphs.
As time permits or the need arises, additional linked components, such as pie charts or scatter plots, may be added.
Figure 2 illustrates some of the features discussed in this section and the next.
Event plots are used to display individual data observations having a certain distinct value in one dimension and some magnitude in another dimension.
In the case of call detail, as an example, the observations are telephone calls occurring at some point in time and with some duration so the display
consists of time along the x-axis and duration along the y-axis with the observations drawn as vertical lines.
Mouse interactions allow users to deselect or reselect the data lines.
A deselected line is made invisible, and its contribution to other visualization components is drawn in grey (greyed out).
Data lines may also be deselected by adjusting the axes range.
Histograms are used to cluster data observations within various categories.
In other words and continuing with the call detail example, a category for a call
might be the area code in which the call terminated.
In this case, a histogram bar is drawn for each area code in which calls terminate
and the length of each bar is determined by the number of calls
terminating in the particular area code.
As calls are deselected, for example via the event plot, the appropriate proportion of the
bar is greyed out.
The display table contains data observations chosen from the event plot by sweeping out a rectangle.
In current versions of YDAT, the data display table is an extended form of a JTable.
Any data observation line whose top point falls within the rectangle is shown in the display table, with a color coded column corresponding to the
shading used in the event plot.
The color bars all have the same fixed width since each bar only contains one observation.
The text columns give detailed information about the observation.
In current versions of YDAT, the column headers are buttons that can be used to sort the table.
A left-button (button 1) click causes an ascending sort (and the column header takes on a green shading);
a right-button (either button 2 or 3) click causes a descending sort (and the column header takes on a red shading).
Additional sort columns can be added by depressing the control (CTRL) key while performing the button click.
Additional sort columns are shaded in increasingly darker colorings so that the order of the sorting can be determined visually.
Requesting the same sort on an already sorted column clears the sort in that column (the control key needs to be depressed to prevent the click from
being interpreted as an entirely new sort request).
Incidentally, this sort behavior is available on any Yoix table by the addition of a small amount of Yoix code, which can specifically be found in the
YWAIT constructors file.
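As a rough illustration of the header-click behavior described above, the following Python sketch tracks the list of sort columns and applies them with stable sorts. It is not the Yoix code from the YWAIT constructors file; the function names are hypothetical, and the handling of a CTRL click that reverses the direction of a column already in the sort is an assumption, since the text does not cover that case.

```python
def click_header(sort_keys, column, descending, ctrl=False):
    """Return the new list of (column, descending) sort keys after a header click."""
    request = (column, descending)
    if not ctrl:
        # A plain click starts a new sort; repeating the same request clears it.
        return [] if sort_keys == [request] else [request]
    if request in sort_keys:
        # CTRL plus the same request drops that column from the sort.
        return [key for key in sort_keys if key != request]
    if any(col == column for col, _ in sort_keys):
        # Assumption: CTRL with the opposite direction flips the column in place.
        return [request if col == column else (col, desc) for col, desc in sort_keys]
    return sort_keys + [request]


def apply_sort(rows, sort_keys):
    """Sort rows (dicts) by the keys in priority order using stable sorts."""
    result = list(rows)
    # Sorting by the least significant key first keeps earlier keys dominant.
    for column, descending in reversed(sort_keys):
        result.sort(key=lambda row: row[column], reverse=descending)
    return result


# Made-up call records: terminating area code and duration in seconds.
rows = [
    {"area": 505, "secs": 30},
    {"area": 212, "secs": 95},
    {"area": 505, "secs": 12},
]
keys = click_header([], "area", descending=False)              # B1 on "area"
keys = click_header(keys, "secs", descending=True, ctrl=True)  # CTRL-B3 on "secs"
ordered = apply_sort(rows, keys)
```

The position of a column in `keys` is its place in the sort order, which is the information the darker header shading conveys visually.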
YDAT graphs display nodes and edges whose meaning depends on the data.
In the Figure 2 call detail example, nodes represent telephone numbers and directed edges represent traffic flows,
originating at one number and terminating at another with some flows being bidirectional.
Node shapes can be any general path with optional labeling;
edges can be lines or splines with or without arrowheads and labeling.
Nodes, edges or both can be deselected and reselected.
Deselected elements are greyed out or, optionally, made invisible both to reduce graph clutter and to aid in information extraction.
An option to deselect attached edges when a node is deselected is also available.
In addition, nodes can be dragged to new positions.
Of course attached edges are repositioned accordingly.
A record of repositioned nodes and their original positions is maintained to facilitate undoing any repositioning.
As an option, node fill shading can either behave like histogram shading, in that deselected data observations cause the appropriate proportion of
the fill, in one dimension, to be greyed out (see figure 2), or be mapped to a data value (see figure 3).
Edges grey out when the represented flows go to zero.
Clicking on a graph element can trigger a program action.
Moreover, since there are capabilities in Yoix for working with graphs, program actions can cause behavior based on graph structure
like deselecting all nodes within two degrees of separation from a given node.
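A structure-based action such as the two-degrees-of-separation example can be sketched as a bounded breadth-first search. The Python below is only an illustration under stated assumptions: it does not use the Yoix graph facilities, the function name and edge list are hypothetical, edge direction is ignored, and the starting node is included in the result.

```python
from collections import deque


def nodes_within(edges, start, max_hops=2):
    """Return the nodes reachable from start in at most max_hops edges.

    Edge direction is ignored (an assumption); the start node is included.
    """
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if hops[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for nxt in neighbors.get(node, ()):
            if nxt not in hops:
                hops[nxt] = hops[node] + 1
                queue.append(nxt)
    return set(hops)


# Made-up traffic flows between placeholder node names.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "A")]
to_deselect = nodes_within(edges, "A")
```

Here node D is three hops from A, so it stays selected while A, B, C and E would be deselected.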
The fact that node shapes are described by a general path allows the graph display component to be used for other types of visual displays such as maps.
Figure 3 shows the graph display component being used to display a county map of New Mexico.
In that example, which is available as one of the demo programs in the Yoix distribution,
node fill shading is used to indicate data volumes within the counties.
Auxiliary Components
There are several auxiliary components in the data analysis module.
The two most notable are the axis and palette components.
The axis component not only provides labeling for either the X or Y axis of a plot, but a built in slider mechanism also allows the
data range end-points to be adjusted, which is useful for interactively restricting a data set to a subset of particular interest.
Moreover, this subset range represented by the slider can be dragged across the length of the axis (using button 2) so that one can observe, for example,
how the characteristics of a 24-hour period of data change continuously over the course of several months.
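Dragging a fixed-width range along an axis amounts to a sliding window over the data. The Python sketch below is illustrative only; the function names, the half-open range convention and the sample timestamps are assumptions, not part of YDAT.

```python
def window_positions(axis_min, axis_max, width, step):
    """Yield (low, high) ranges of constant width stepped across the axis."""
    low = axis_min
    while low + width <= axis_max:
        yield (low, low + width)
        low += step


def select_in_range(observations, low, high):
    """Keep observations in the half-open range [low, high) (an assumed convention)."""
    return [obs for obs in observations if low <= obs < high]


# Made-up event times in hours since the start of the data set.
hours = [1, 5, 26, 30, 49, 70]
windows = list(window_positions(0, 72, width=24, step=24))
counts = [len(select_in_range(hours, lo, hi)) for lo, hi in windows]
```

A smaller `step` would approximate the continuous drag described above, with each window position yielding a different selected subset for the linked displays.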
The palette component provides a way to specify how to color a set of data values over the range of those values.
Color values can be generated from a fixed set of array values, or by picking values over a range of the color spectrum,
or by whatever programmatic means one cares to devise.
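One way to picture "picking values over a range of the color spectrum" is a linear mapping from data value to hue. The Python sketch below uses the standard `colorsys` module; the function name, the default red-to-blue hue range and the sample values are assumptions made for illustration and do not reflect how the YDAT palette component is configured.

```python
import colorsys


def spectrum_palette(values, hue_low=0.0, hue_high=2.0 / 3.0):
    """Map each value to an (r, g, b) tuple by interpolating hue over the value range.

    With the default hues, the smallest value is red and the largest is blue.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero when all values are equal
    colors = {}
    for v in values:
        t = (v - lo) / span
        hue = hue_low + t * (hue_high - hue_low)
        r, g, b = colorsys.hsv_to_rgb(hue, 1.0, 1.0)
        colors[v] = (round(r * 255), round(g * 255), round(b * 255))
    return colors


# Made-up data values.
palette = spectrum_palette([0, 50, 100])
```

A fixed array of colors, the other option mentioned above, would simply replace the hue interpolation with a lookup by bin index.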
Using YDAT
Currently YDAT is made available to Yoix scripts by means of the following import statement:
import yoix.ydat.*;
As the previous section indicated, YDAT has a great many features, and trying to write a Yoix program from scratch to use them can be somewhat
daunting. Consequently, a configuration file mechanism has been constructed to make it easier to create instantiations of YDAT.
All the examples you see on this page were constructed using that mechanism.
The configuration file is itself a Yoix script, but it mainly consists of Arrays that are organized in a manner suggestive of tables, which is how the
downstream scripts effectively interpret them.
In practice, most users will initially access YDAT as a stand-alone application by running the start-up script in the download package.
The script kicks off Yoix and loads a set of six Yoix scripts that configure an instantiation of YDAT based
on a configuration file that users construct to suit their needs.
Included in the download package are two specific demo configuration files and a general configuration file for display of
DOT format graphs (see the GraphViz website
for information about the DOT graph description language).
An alternative way of running YDAT is as a plugin to a Yoix application.
The YWAIT package shows how to run YDAT as a plugin.
Demo1, included in the download package, displays some publicly available brain cancer incidence data for most of the counties in the
state of New Mexico.
Screen shots of the demo are shown in Figures 3, 4 and 5.
To run this demo, execute:
ydat_demo1
Figure 5 again shows the New Mexico data.
In this case, only the years 1982 through 1986, inclusive, have been selected; the event plot is displayed as lines connecting data points;
all displays are colored based on the gender information; the histograms are stacked based on gender; and the county histogram is sorted by the
selected counts rather than the total counts. The sort order of both columns is decreasing, which shows how the header column coloring is
darker for the second sort index.
Demo2 displays some online auction sales information for various vehicles.
A screen shot of the demo is below.
To run this demo, execute:
ydat_demo2
Demo3 displays a fabricated network example.
A screen shot of the demo is above in Figure 2.
To run this demo, execute:
ydat_demo3
Graphs expressed in the DOT graph description language of GraphViz can
be converted into input suitable for YDAT by passing the following options to the layout engine (in this example, the layout engine is
dot, but the other layout engines can also be used and would take the same options):
dot -Txdot -y < input_dot_file > output_xdot_file
YDAT can then be used to display the graph using:
ydot output_xdot_file
Note that the command above is spelled y-d-o-t, not y-d-a-t.
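For scripted conversions, the same layout command can be driven from Python. This is only a convenience sketch: the function names are hypothetical, the file paths are supplied by the caller, and it assumes the GraphViz layout engine is installed and on the PATH. The options are exactly those given above (-Txdot and -y).

```python
import shutil
import subprocess


def xdot_command(engine="dot"):
    """Build the layout command from the text: <engine> -Txdot -y."""
    return [engine, "-Txdot", "-y"]


def convert_to_xdot(input_path, output_path, engine="dot"):
    """Run the layout engine, reading DOT from input_path and writing xdot to output_path."""
    if shutil.which(engine) is None:
        raise FileNotFoundError(f"{engine} not found on PATH; is GraphViz installed?")
    with open(input_path, "rb") as src, open(output_path, "wb") as dst:
        # Equivalent to: dot -Txdot -y < input_path > output_path
        subprocess.run(xdot_command(engine), stdin=src, stdout=dst, check=True)
```

The resulting file can then be passed to the ydot command as described above.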
An example of graph display using the xdot configuration file is shown in Figure 7.
To run this demo, execute:
ydat_demo_xdot1
Another example of graph display using the same xdot configuration file, but with different data is shown in Figure 8.
To run this demo, execute:
ydat_demo_xdot2
Some Things to Try
In this brief list of usage tips for YDAT end-users, B1 refers to mouse button 1, the left-most button of a three-button mouse; B2 is the center button and B3 is the right-most button. SHFT and CTRL indicate the shift and control keys, respectively. A sequence such as SHFT-B1 means that the shift key should be depressed and held while mouse button 1 is being used.
-
Try the following in the Event Plot window using either demo1 or demo2,
though the text below refers specifically to demo1:
-
Depress B1 and sweep out a box that includes some of the data bars (for demo2, you will need to include the tops of the bars; for demo1, just be sure to include the tops of any of the stack bar striations).
When you release B1, the "Data Detail" table should become populated.
See the section below for tips on sorting the data detail.
-
Depress and hold B1 on the grey line beneath the X axis or to the left of the Y axis.
You will notice that, starting from the end of the axis closest to your cursor, the grey line disappeared up to where your cursor is and the data bars disappeared as well.
In addition, the colored portions of the histogram bars in the "Age Group" window shortened and were replaced by a grey shading.
The map shading also changes as counties with less data assume shades that move from purple to red, then through the spectrum to green, before greying out.
The light blue shading indicates that no data is available for those counties in the data set.
The shading of the bars in the table will also grey out if the associated data
disappears.
Congratulations, you just did your first bit of data filtering with YDAT: you deselected from the data set the years of data corresponding to the bars in the event plot that disappeared.
With B1 still depressed, drag it along the axis.
Year bars will disappear or reappear depending on your position and the other
displays will be concurrently updated.
Before we leave this exercise, shorten the grey line until only four year
bars are visible. Now slowly shorten it a little more until the fourth bar just disappears, leaving three bars.
Now, release B1 and try the next experiment.
-
Place your mouse cursor on what remains of the grey line and depress and hold B2.
Nothing should happen, but now, with B2 still depressed, try dragging your mouse
cursor along the axis.
You should see years appear and disappear, but with three years always visible.
Note how the other displays behave.
-
Reset to the original data set by going to the File->Load->All menubar option
in the event plot window.
While you are up in the menubar, try Filter->Race and then
Filter->Gender and, finally, View->Style->Stacked Polygons.
Come back later and try other menu bar options to see what they do.
-
Without depressing any buttons, just try moving your mouse cursor around
on the data in the event plot.
Note the shading shifts and tooltip information.
You can lock down the position of the tooltip text box by pressing CTRL-d; a second CTRL-d unlocks it.
The escape key (ESC) toggles tooltips off or on.
-
Click in the event plot window to make sure it has the focus.
Now, place the cursor somewhere in the plot and move the mouse wheel forwards
and backwards.
Depress the shift key (SHFT) and move the mouse wheel.
In the first case, the x-axis should have zoomed in or out centered on the
cursor position.
In the second case, the zooming should have happened on the y-axis (though
locked down at the zero value).
While this feature works with any set of data and display configuration, it
is probably more useful with the sort of event plot in demo2.
-
In any of the filter windows (at this point you should have "Age Group", "Gender" and "Race" visible):
-
Depress B1 on the little grey diamond at the bottom of the window and drag it
back and forth. It adjusts how much space the bars and text labels occupy in the
window.
-
Click in the window to make sure it has the keyboard focus and then press and
release the F2 key on your keyboard.
The cursor should change to a hand symbol.
Now move the cursor to one of the histogram bars, depress B1 while on the
bar and drag over to the Data Detail table and release B1.
The currently selected elements that comprised the histogram bar you selected
should now be displayed in the table.
Go back to the filter window.
Press and release the F2 key to return to the normal cursor and turn off this
feature.
Incidentally, the same sort of operation will also work in the event plot
and map windows.
-
Click B3 on a bar. It should grey out (deselect). Then click B1 on the same bar
and it should become selected again.
Depress B3 and drag the mouse cursor over several bars to deselect them.
Do the same with B1 to select them again.
-
In the window's menu bar, select Show->Transient. Now, depress B3 and drag the
mouse cursor over the bars.
Select Show->All Off and now depress B1 and drag the mouse cursor over the
bars.
Set them back to normal by toggling Show->Transient and selecting
Show->All On.
-
Go to the "Gender" filter window and select Show->Color Data.
The event plot coloring changed to just red and blue, corresponding to the two
colors of the genders in the filter window.
Try clicking B3 then B1 on the Female bar and then the Male bar.
-
Go back to the "Age Group" filter window and select View->Stacks.
Now the shading matches the gender shading and shows female/male contributions
for each age group.
-
Now go to the "Race" filter window and select Show->Color Data.
This time, both the event plot and the "Age Group" shading changed to indicate
contributions in the data based on race.
-
Once again, experiment with other choices in the filter window menu bar.
-
In the Data Detail table (you will probably need to sweep out some data in
the event plot to populate the table as described earlier):
-
In the column headers, click B3 to reverse sort on that column (header
gets a reddish tint) and click B1 to forward sort (header gets a greenish tint).
Making the same sort request twice in a row on the same column, regardless of
the elapsed time, will reset the sort to the original data order (and return
the header color to its original yellowish tint).
-
To add additional sort columns, use CTRL-B1 or CTRL-B3.
Note how the header tinting darkens for each additional column in an attempt
to give a visual indication that the column is further along in the sort
order.
Any sorting column can be dropped from the sort ordering by making the
same sort request on it while CTRL is depressed.
In other words, for a reddish tinted column header, use CTRL-B3 to drop
that column from contributing to the table's sort, while on a greenish
tinted column use CTRL-B1 to drop that column.
Again note the lightening of column header tinting for columns later in
the sort order from the column that was dropped.
-
Each time you sweep out data from the event plot, the table will be cleared
before the new data is added to the table.
If you want to retain the data from a previous sweep and simply add more to
it, select Show->Accumulate from the menu bar in the Data Detail window.
-
In the Graph window that is displaying a county map of New Mexico
(note: there is no Graph window in demo2, but what is described here will
work on the xdot1 and xdot2 graph demos):
-
The drop-down menu in the lower-right corner should be on "Select"; if it is
not, then select "Select" (you can just roll your mouse wheel over the widget
to easily change the selection when the Graph window is the active window).
-
Move the Zoom slider to zoom in or out.
The zoom factor is shown in the blue shaded label at the lower-left.
Leave the zoom at some value other than 1.0.
Click B1 on the blue label and the map should return to the 1.0 zoom level.
Click B1 again on that label and the map should return to its previous zoom
value.
-
Move the scroll bars to position the map so it is off center.
Click B3 on the blue zoom factor label in the lower-left.
The map should re-center.
-
Depress B3 and drag it over the counties to deselect them and use B1 to re-select them.
As usual, note the coordinated effect on the data in the other windows.
-
Depress B2 somewhere on the map and drag the cursor around; the map should drag
with it (a "Grab" operation).
B2 can always be used to grab the map in this manner.
A B3 click on the zoom label will again re-center.
-
Choose "Grab" from the pull-down menu in the lower-right.
Now press B1 on the map and drag it around.
This is another way to achieve the grab action.
In this mode, B3 will center the map where it is clicked.
-
Use the drop-down to change to "Move" mode.
Depress B1 while on one of the counties in the map and drag to someplace outside the map.
Just the county should have dragged along with you.
Release B1 to leave the county somewhere.
Repeat as often as desired.
Click B3 on a displaced county.
Its shading will change and the location where it belongs will be filled in with
similar shading.
If you depress and hold B3 to drag the county back to its home, you can release
B3 to properly return it where it belongs.
You can use Show->All Moved->Reset to put all the displaced counties back.
-
"Press" mode allows you to use B3 to highlight/mark a county.
-
"Scroll" mode lets B1 control the scroll bars while in the graph area.
Depress B1 and drag it around in the map
to affect both scroll bars simultaneously.
-
"Zoom" mode lets you use B1 to sweep an area of the map. Releasing B1 will
cause a zoom to that area to be performed with the map centered on the
center of the zoom box (as indicated by the crossed lines).
If you perform multiple zooms, then each sweep with B3 will allow you to step
back through the zooms just performed.
-
In any mode, use the mouse wheel to zoom in or out centered on the current
mouse cursor position.
We encourage you to play around with the various menu bar options to
learn about other YDAT features.
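The cursor-centered zooming mentioned in several of the tips above comes down to scaling the visible range about the cursor so that the value under the cursor stays put. The following Python sketch shows that arithmetic for one axis; the function name and the convention that a factor below 1 zooms in are assumptions, and this is not taken from the YDAT source.

```python
def zoom_axis(low, high, cursor, factor):
    """Scale the visible range [low, high] by factor, keeping cursor at the same relative spot.

    A factor below 1 zooms in and a factor above 1 zooms out (assumed convention).
    """
    if not low <= cursor <= high:
        raise ValueError("cursor must lie within the visible range")
    new_low = cursor - (cursor - low) * factor
    new_high = cursor + (high - cursor) * factor
    return new_low, new_high


# Zoom in by half around a cursor a quarter of the way along the axis.
zoomed = zoom_axis(0.0, 100.0, cursor=25.0, factor=0.5)
```

Passing the lower end of the range as the cursor leaves that end fixed, which is one way to picture a y-axis zoom that stays locked at zero.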
Getting Help
Unfortunately, documentation for YDAT is largely absent.
There is some information in the Yoix README, which has been excerpted
here.
Also, some comments in the Yoix source code give some information on the syntax used
for graph or map drawing, which has been excerpted
here.
Also, the examples should provide some guidance in the construction of configuration files, but probably not enough.
We thought it was more important to get this tool out and build up the documentation later.
To remedy this situation,
we would like to try the following approach to help us build up that documentation:
join the
Yoix mailing list
and ask questions about usage in that forum.
We will respond there, and when a critical mass of useful instructions has been generated we will gather them together to
construct a usage document for YDAT.
Yoix is a registered trademark of AT&T Intellectual Property.