Programming with Timespans in Interactive Visualizations
Yifan Wu, Remco Chang, Eugene Wu, Joe Hellerstein

TL;DR
This paper introduces DIEL, a declarative programming model that simplifies the development of interactive visualizations by managing concurrency and interface state through declarative queries over event history.
Contribution
The paper presents DIEL, a novel declarative framework that helps developers handle concurrency issues in interactive visualizations more easily.
Findings
DIEL reduces code complexity for managing concurrency.
Developers can specify interface states declaratively instead of procedural updates.
Conflict resolution in visualizations can be achieved with minimal code.
Abstract
Modern interactive visualizations are akin to distributed systems, where user interactions, background data processing, remote requests, and streaming data read and modify the interface at the same time. This concurrency is crucial to provide an interactive user experience---forbidding it can cripple responsiveness. However, it is notoriously challenging to program distributed systems, and concurrency can easily lead to ambiguous or confusing interface behaviors. In this paper, we present DIEL, a declarative programming model to help developers reason about and reconcile concurrency-related issues. Using DIEL, developers no longer need to procedurally describe how the interface should update based on different input events, but rather declaratively specify what the state of the interface should be as queries over event history. We show that resolving conflicts from concurrent processes…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsData Visualization and Analytics · Multimedia Communication and Technology · Advanced Database Systems and Queries
\onlineid
0 \vgtccategoryResearch \vgtcinsertpkg
Programming with Timespans in Interactive Visualizations
Yifan Wu
UC Berkeley e-mail: [email protected]
Remco Chang
Tufts University e-mail: [email protected]
Eugene Wu
Columbia University e-mail: [email protected]
Joe Hellerstein
UC Berkeley e-mail: [email protected]
Abstract
Modern interactive visualizations are akin to distributed systems, where user interactions, background data processing, remote requests, and streaming data read and modify the interface at the same time. This concurrency is crucial to provide an interactive user experience—forbidding it can cripple responsiveness. However, it is notoriously challenging to program distributed systems, and concurrency can easily lead to ambiguous or confusing interface behaviors. In this paper, we present diel, a declarative programming model to help developers reason about and reconcile concurrency-related issues. Using diel, developers no longer need to procedurally describe how the interface should update based on different input events, but rather declaratively specify what the state of the interface should be as queries over event history. We show that resolving conflicts from concurrent processes in real-world interactive visualizations can be done in a few lines of diel code.
1 Introduction
Today’s interactive visualization applications are complex and difficult to develop. Applications need to deal with numerous concurrent processes all happening at once: user interactions in the interface; data and query processing requests to potentially large or remote databases [37, 20, 15, 24]; real-time data that streams into the interface[46, 41, 19, 47]. Each of these processes has the capability to change contents of the user interface, and their concurrent nature can lead to modifications that confuse the user.
As a concrete example, consider the real-time Twitter analytics example in Figure 1. The left view renders a map that is overlaid with the locations of tweets as orange dots. New tweets stream into the visualization and dynamically add new points in the view. The user can draw a brush on the map (the yellow box) to highlight tweets of interest. The brush interaction interactively updates the bar chart on the right, which shows a distribution by hour of when the selected tweets were composed. The user can click the left/right/up/down panning controls in the lower right of the map to move the map view. Finally, the user can also click on a tweet to show the tweet’s content in a pop up window.
In this example, the two processes—the user’s brushing interaction, and the tweet stream—introduce ambiguity about what happens when both attempt to update the visualization concurrently. For instance, the user intends to select the tweets on the left side of the map and is in the middle of dragging their mouse. But what if new tweets are added to the map visualization that are within the user’s brush? The user did not intend to select the new tweets, but they are now within the current brush.
Example 1**.**
Figure 2 isolates this specific issue. The user brushes to select point A; two new points, B and C, arrive. We show two timelines at the bottom of the figure: the blue timeline shows mouse events related to the user’s brushing while the gray timeline shows the new tweet arrival times. Although it is clear that A should be selected, there are two ambiguous choices for the developer to make. B arrived during the brushing interaction (at step 2); should B be part of the selection? Further, C arrived after the interaction ended (at step 4); should C be part of the selection? An application developer needs to choose a behavior that fits their use case, and communicate the behavior to the user.
The user interface updates based on events, both from the user inputs and data updates. Events change the interface in relation to events that have happened in the past. Without concurrent processes, such as the brush and the tweet update, the relationships between events are simple to maintain. For instance, for a static visualization of tweets, the events can be handled synchronously and in isolation—the brush is evaluated and the results are rendered in lockstep. However with the added real-time tweets, the relationships between the events are made more complex. Consider the additional event-handling logic needed to implement the policy of selecting tweet C: when handling new tweets, the developer now needs to check if the new tweet falls in the the brushed region, and if so, update the bar chart to the right to include the new tweet. Application developers need to be aware of the concurrent processes and be able to create a policy about how the interface should update with interleaved events from these concurrent processes. The policies should make sense to users for their specific use case, and be easy to program.
1.1 Hello, DIEL
In response to these concerns, we developed Data Interaction Event Log (diel). In diel, the current state of the visualization is defined as queries over the sequence of all events (including user interaction events and data update events, etc.) that have occurred in the visualization. The queries help the developer directly access relationships between past events. The queries are expressed using familiar SQL-like syntax and semantics. After an event, diel stores the event to history, and updates the new state of the visualization based on the queries specified by the developer. diel manages the state of the application in the face of concurrent processes and does not replace the entire application—it integrates within existing applications through a JavaScript interface.
For example, Listing 3 shows the example of how to express a policy for the brushing example in Figure 2. Lines 1-4 of Listing 3 define the history of events in diel, specifying what the attributes are for each type of event—the brush event contains the bounding box as well as what the current mouse state is (down, move, or up), and the tweet event contains the id of the tweet, the geo location associated with the tweet, and the time it was tweeted. The application can use simple APIs to inject the events into diel.
Upon the arrival of any event (either a new brush event or new tweet event), the regionSelection view in lines 6-10 of Listing 3 provide SQL-like diel queries over those tables. It defines the tweets currently selected by the brush by computing a join between the tweet data events and the latest brush interaction (LATEST brushEvent). The join identifies tweets whose lat, lon coordinates are within the bounding box of the current brush (is_within_box()). Because of the use of the LATEST brush coordinates, all points A, B, and C will be selected since all points fall within the bounding box of the brush (and the timing of the events is not considered). Other policies, such as to only select A, or to cancel the brushing interaction if a new tweet arrives that changes the set of selected tweets, are also easy to implement in diel, as we show in Section 6.
The last two lines (12-13) of Listing 3 provide the query to retrieve data for visualizing the barchart in Figure 1 where information about the points A, B, and C (which are selected by the brush) are shown.
Given this specification, diel will ensure that the interface will automatically include the new tweet when it arrives and falls into the current brush, along with any other state of the visualization that may depend on it; in this case the bar chart to the right. This is illustrated in Figure 4, a simplified diagram of the diel execution process.
To summarize, we contribute by (1) identifying a class of challenges caused by concurrent timespans in interactive visualizations, and (2) developing a framework, diel, for state management in interactive visualizations that simplifies handling of concurrent timespans. We demonstrate that diel facilitates simple, and we believe elegant programming of the underlying state of interactive visualizations that deal with concurrent events, while interoperating with a wide range of toolkits for visualizing data. We also show that a wide range of other functionalities for visualizations, such as undo/replay [36], logging, and interaction summaries [11], can be naturally supported by diel.
2 Timespans in Interactive Visualizations
We begin by providing a conceptual model to think about concurrent processes during interactive visualization, which helps clarify the problem space. We use the language of events and timespans to discuss the challenge of concurrent processes:
Event**.**
An event is any change from outside the application that is captured by the application, such as a user click, or a data response from server.
Timespan**.**
Events form a linear history that can be numbered sequentially to define a logical notion of time, and a timespan is a tuple composed of a time of a starting event, and time of an ending event.
We now provide example categories of timespans and the corresponding events using a model of visualization by van Wijk [40], reproduced in Figure 5. The application uses data and a visualization specification to define a visualization. This is perceived by the user and adds to the user’s knowledge. Based on this new knowledge, the user then explores new specifications, which starts the process again. By examining different arrows in the model, we can characterize common instances of timespans. Broadly, events and timespans can be classified as either from user interactions or from data updates, and timespans may contain a mixture of different event types.
Timespans are necessarily developer-defined, as their definition are inherently determined by the application’s semantics. The taxonomy provides developers with a structure to identify relevant timespans for their application.
2.0.1 Data Timespans
- •
Data & SpecificationVis: data processing timespan. Applications may need time to process the specification when the data is large or lives on a remote server. To get from data to the final image, the application may incur network transmission, server processing, and perhaps rendering delays. Different types of events may be part of this timespan, including the interaction that initiated the request, the processing request, the response, and rendering events.
- •
DataVis: data streaming timespan: real-time data are generated over time (e.g., tweet streams), which creates a timespan that starts from the start of the visualization user’s session to the end of the session.
2.0.2 User Timespans
- •
PerceptionExploration: reaction and think timespans: users may take time from observing a change on the screen to taking action [3, 18, 2].
- •
ExplorationSpecification: user action timespan: some interactions, such as drawing a brush, or making multiple selections by clicking on each item, take time to perform the physical action.
- •
SpecificationVis: user selection timespan: often selections persist on the screen after the event is completed, such as a brush. This selection is perceived by the viewer and may still impact other events (e.g., new tweets, or a map resize). The start of the selection timespan is the creation of the selection until the removal of the selection.
The above is not intended to be an exhaustive list, but serves to showcase the rich varieties of timespans that may be present in modern, highly interactive visualizations. These different timespans are often due to concurrent processes (e.g., server processing, users, network, etc), which we show is a major source of concurrency ambiguities below.
2.1 Overlapping Timespans in the Wild
The ubiquity of timespans in modern visualizations makes overlaps between timespans highly likely. We now present examples from each of the three classes of timespan overlaps: user-user, user-data, and data-data.
2.1.1 User-User Timespan Overlap
The following is an example of the overlap due to two user interactions.
Example 2**.**
Some user interactions can have persistent effects to the visualization. For example, suppose that the user can draw a persistent brush (i.e. a brush that stays on the screen to indicate a user’s interest in monitoring a specific region in the map). Then the user pans the map. These two user selection timespans overlap each other because the brush’s timespan continues “infinitely” into the future. This results in ambiguities in how the system should respond. Should the brush continue to persist? If so, and user’s intent was to monitor a geographic region, then the brush could essentially move off-screen after the pan. But how would they know that it is still active and affecting the visualizations linked? Otherwise, should the brush stay in the same pixel region, and select different geographic regions during the pan?
2.1.2 User-Data Timespan Overlap
User timespans may overlap with data timespans, and are common because the system runs concurrently to the user. In addition to the example in the introduction, we now describe two additional examples.
Example 3**.**
The user perceives an interesting tweet, A, and proceeds to click on it. However, 10 milliseconds before they were able to click, the map data updates and the mouse position the user clicked on corresponds now to tweet B. The application now updates the charts with the selection of tweet B. However, we know that no human user can observe and act on a visual change in 10 milliseconds [3], so the selection of B must have been unintentional. This type of overlap is common across selection widgets where the contents may change due to fine-grained user interaction, such as drop-down menus for auto-completion or search suggestions, or a user accidentally clicking on a banner ad in a webpage due to asynchronous loading using AJAX.
Often in interactive visualizations, there are implicit dependencies between events. For instance, if the user is drawing a brush on a map, the brush depends on the position of the map to remain stable. Below is an example of such dependency potentially being violated due to overlapping timespans.
Example 4**.**
Suppose that instead of streaming tweets, the tweets are being fetched from the server for each new map position. The map view does not change position until the data has been loaded (data processing timespan). A user first clicks the right arrow to navigate to a new map region, but for a short while, this map does not change. During this time, the user notices an interesting tweet, and clicks on it to see the content. However immediately after, the data for the new map region (from the first interaction) arrives. Should the application block the new map region? Or should it have blocked the click on the tweet?
2.1.3 Data-Data Timespan Overlap
In traditional interfaces that execute synchronously, data processing follows lock step after each interaction. However, with a client-server architecture that introduces data-processing timespans, this will no longer be the case; instead, limits should be posed on what sequence of events are acceptable, depending on the application.
Example 5**.**
Consider an interactive cross-filter interface over bar charts (e.g., [26]). As the user brushes a line chart, fine-grained requests are being sent to fetch the filtered data for each new range selected. Ideally, the sequence of request and response should be in lock step, (q1 r1 q2 r2 q3 r3) where q is the request and r is the response. However, these requests experience varying response delay and instead the following event sequence happens (q1 q2 r1 q3 r3 r2). For each of the events, what response should the application render? Should it render r1 r3, then r2? Or should it ignore r1 and r2 and just render r3?
3 Reasoning About Concurrency
While each of the examples above can be resolved using any popular programming paradigm for user interface design, we will argue, in this section, that we have a simpler approach for managing timespans in interactive visualizations. The crux of our argument is that by keeping a log or history of all events, timespans can easily be specified as queries over that history, and these queries can be extended to specify how overlapping timespans should be treated. We begin by explaining how it is not possible to simplify interface programming by preventing concurrency.
3.1 The High Cost of Preventing Concurrency
Overlaps can clearly cause problems, and it is not simple to reason about how to handle overlaps effectively—they resemble concurrency issues in distributed systems, which are notoriously complex. As a result, a common design approach is to preclude such overlaps entirely. Prevention can be achieved by forcing each new event to wait, and starting only after the previous event has completed (blocking). Alternatively, it can be achieved by forcing the previous timespan to complete early when the later timespan begins (preemption). We illustrate the two choices in Figure 6. Unfortunately, both options can have high costs, as we now show.
The first option, blocking, is benign when the backend and network are fast enough that each timespan completes before another or is perceived to start and when there are no real-time events. This is often not the case and improving performance may present a significant challenge. One way to avoid backend delays is to load all the data needed in advance, so that interactions can proceed synchronously with negligible latency. This is common practice in commercial charting programs like Tableau [39]. However, pre-loading all data on the client complicates the application logic and may not be possible when the data set is large.
The second option, programmatic preemption, is common among frameworks for front-end application development (e.g., [8]). While this may seem like a simple solution, it may not always apply. For instance, in Example 4, the event handlers do not know that there is a dependency between the brush and the map; to program this dependency, the developer would have to coordinate the handling of the map and brush interactions. Preemption may also not be the best design. For instance in Example 5, the user may want to see intermediate results that are only a few milliseconds late.
Given that overlap prevention is not a panacea, how should a developer reason about and program concurrent events? To account for the overlap between timespans, we need to handle each event based on past events. To coordinate with past events, a developer can either (1) derive and maintain sufficient information needed from past events, or (2) store all past events and examine past events for the information needed. The former programs without history, and the latter with history. We discuss these two alternatives and highlight pain points to motivate the diel model.
3.2 Programming Without History
To maintain relevant information from past events, a developer can create and manage variables in their event handlers, and use these variables to control how events are handled.
We walk through how to deal with the example in Figure 2. Under traditional event handling, A and B are selected, since a call back re-evaluates the selection criteria each time the brush moves. In order to select all of A, B, and C, the developer first needs to save the bounding box of the brush, and filter every new Tweet (e.g., C) to check if it falls within the bounding box. If so, then the new Tweet is added to a separate variable that stores the selected Tweets. This logic then needs to be disabled once the brush is cancelled. Now if the developer want to change it such that only A is selected, since it is the only tweet that was in the region before the brush has happened, the developer need to mark the new tweets as having started after the brush has started, and when evaluating the brush, skip those marked.
Without history, the programmer needs to perform imperative bookkeeping, and as a result when the design changes, they need to modify different parts of the event handling and bookkeeping logic. In most UI packages, the state captured is scattered across variables in multiple handlers, which further exacerbates the challenge of derived state management.
3.3 Programming With History
When programming with history, the developer defines declaratively, what the next state of the UI is, without needing to procedurally describe how to get there from the current state. The relevant state of the UI can be captured as data; the graphical elements can be fully derived from that data, via a graphical grammar [42]. Interactions can also be seen as additional data to be visualized, e.g., the state of a brush on a map at any point in time is defined by two coordinates (latMin, lonMin, latMax,lonMax), and map as a semi-transparent rectangle to the screen.
As an example, Listing 3 simply defines the state of the UI as queries over history—the developer no longer needs to track the current state of the variables and UI, nor do they need to define the different ways that one state could change to the next.
There are many ways to represent event history, and even more ways to program with it. We chose to program history with SQL because it is (a) a “declarative” or “functional” approach where the output is computed at any time from a spec over history, (b) a simple language with modest temporal power—just enough to support simple overlap checks among timestamps and timespans, and the ability to query history with simple predicates like “most recent event matching a condition”, (c) a language that is familiar to our target audience, namely developers of data visualizations, and (d) widely implemented on browsers, mobile devices, and backend servers; this accelerated our development of diel.
We cannot claim in any empirical sense that the language in diel is the perfect choice, but we believe it is a good fit to our design considerations, and as a result a good fit to taming the complexity of overlapping timespans.
4 The DIEL Model
We now introduce the diel model: how different data in the interactive visualization is modeled (data model), and how application state evolves (execution model).
4.1 Data Model
Data in diel are represented as relational tables with rows and columns. There are four types of data in any application: input, history, static, and output; we discuss each type of table in turn.
4.1.1 Input Tables
contain events from the world that may change the application state. The events may be from the user (e.g., clicks), or asynchronous data inputs (e.g., new tweets). Inputs are unpredictable, and the application (and diel) has no control over when they might arrive. Within a diel program, input tables are read-only. They are populated by the application through the diel API.
4.1.2 History Tables
are used by a diel developer to persist data derived from input events, or storing data that does not to be updated with each event. They are INSERT-only tables that the developer can define and populate in the diel program. Deletions and updates are not allowed—we keep history immutable to make reasoning about program behavior easier in the face of concurrency [16, 5, 10]. The allBrushes table in Listing 17 is an example for supporting UNDO functionality.
4.1.3 Static Tables
store pre-populated data that is READ-only within the diel program. For instance, the map tiles in the Tweet visualization is fixed and does not change. Since diel is evaluated using a database, static tables are simply existing tables in the database.
4.1.4 Output Views
represent the state that the application ultimately reads and renders in the visualization. The purpose of the diel program is to automatically maintain these views by executing queries over the input, history, and static tables. For instance, hourDistOutput in Listing 7 defines data used to render the bar chart in Figure 1.
4.2 Execution Model
The developer defines the above tables, and writes state programs that specify how the history tables are populated when different types of events arrive. Developers can also define database views to organize the overall diel program to help define the Output Views. A database view is a table whose rows are not explicitly stored in the database but are computed as needed from a query [31]. Once these are defined, diel automatically performs the following sequence of actions to update the Output Views on every new event that arrives in the Input Tables:
Increment a global logical timestep, and annotate event with logical timestep and physical wall-clock time. 2. 2.
Store input event in corresponding Input Table. 3. 3.
Execute the corresponding state programs to populate history tables. 4. 4.
Execute/maintain the Output Views 5. 5.
Invoke upcalls to the visualization for each Output View that has changed.
5 The DIEL Prototype
To evaluate the feasibility and expressivity of diel, we implemented a prototype that runs in a standard web browser. The diel language is SQL with some syntactic sugar and and diel program compiles to a JavaScript library that a developer can reference. The library runs on a browser-based SQLite engine [21] This section introduces the syntax through a naive implementation of the tweet example. We walk through syntax for setting up the inputs, outputs and how to use the compiled JavaScript object. In the next section, we will show how to use diel to resolve overlapping timespans. For the ease of reading, we will follow the convention of naming inputs with suffix Event, and outputs with Output.
5.1 Input/Output Syntax
Listing 3 lines 1-4 shows how to create input tables. The diel keyword CREATE INPUT takes after CREATE TABLE syntax in SQL, with the specification for the column names, their data types, and additional constraints. Listing 7 lines 1-3 show the other two events: map panning (mapEvent) and click to select an individual Tweet (clickEvent).
Listing 7 lines 5-7 show the syntax for creating output views. CREATE OUTPUT specifies a query over diel tables. mapOutput queries for the most recent map panning event. The LATEST keyword is syntactic sugar to find the records with the highest logical timestep. The query for mapOutput is the same as SELECT * FROM mapEvent WHERE timestep =(SELECT MAX(timestep) FROM mapEvent.
5.2 API Syntax
Given a diel program, diel generates a Javascript object diel the application can import and use. The input specifications are translated into API function calls of the form diel.input.<inputName>. We show examples of how the application can invoke it within its user interaction event handlers (Listing 8) and for streaming data (Listing 9).
The output views are complied into API calls of the form diel.bindOutput.<outputName>. The application passes in a callback function responsible for taking the full set of records in the updated Output View and rendering it on in the UI (Listing 10). We expect that the function is state-less and idempotent, meaning that calling it multiple times with the same records results in the same output rendering. Supporting deltas and incremental updates is an optimization left for future work.
5.3 State Program and History Tables Syntax
The syntax of a state program is CREATE PROGRAM BEGIN <program Content> END;. For instance, in Listing 11, the history table is selections History and the program content is the INSERT clause, which saves the selection of tweets that was executed at each timestep. If the state program should run after specific events, the developer can use the following syntax: CREATE PROGRAM AFTER <inputName>.
6 Resolving Timespans With DIEL
This section will use diel to address the examples provided in Section 1 and Section 2. We show how we can change the overlap policies easily, often with just a few lines of code. The goal of our discussion is not to prescribe specific design policies, but rather to facilitate implementation.
6.1 User-User Timespan Overlap
Recall Example 2, where the brush bounding box persists and the user pans the map. We show two possible ways to resolve the ambiguity. Recall that the brushOutput table defines the state of the user’s brush. We can specify new policies by redefining it based on the brush’s relationship with the panning input table.
Alternative 1 in Listing 12 implements a policy that “removes” the brush after a map interaction by redefining the brush state to be brushPreemptedOutput, which will be empty if a map interaction exists after the last brush interaction event. Alternative 2 implements brushStillInOutput, which is empty if the latest brush’s bounding box is no longer within the latest view box as defined by the panning event.
6.2 User-Data Timespan Overlaps
6.2.1 User Action and Data Streaming Overlap
Listing 3 demonstrated one possible policy to resolve the conflict in Example 1 by selecting all points (A, B, C). An alternative policy is to only select the tweets that were on the screen at the time the brush began (i.e. point A). Thus, neither B nor C should be part of the selection. In Listing 13, initialSelection (lines 2-5) expresses this by finding the most recent mouse down event from brushEvent and ensuring that only the tweets in the bounding box (regionSelection) that also have a lower timestamp are kept.
Alternatively, if the designer chooses to invalidate a brush when new tweets arrive, they can do so by selecting the most recent brush only there does not exist a panning interaction that is more recent (Listing 13, lines 9-14.)
For completeness, event deletions are also possible, albeit slightly more cumbersome, when using this immutable history paradigm. This can be achieved by keeping track of events as either additions or deletions (tweetEvent in Listing 14), and writing a query that ignores events that have a later deletion event (tweetOutput).
6.2.2 User Reaction and Data Streaming Overlap
We now discuss Example 3, where the user inadvertently clicks on the wrong tweet because it was inserted immediately before the user’s click action. Listing 15 shows two possible policies to address this issue.
skipUnintendedClick checks that the time of the user’s click event did not occur within 200ms [18] of the most recent tweet event in the diel system. If so, the click event is simply discarded. The 200ms can and should be adjusted based on the use case and users.
intendedSelect illustrates a second policy that simply ignores tweets that were introduced within 200ms of the user’s mouse click. This is done by identifying the tweet in the clicked location with a timestamp that is least 200ms earlier than the click event’s timestamp.
6.3 Data-Data Timespan Overlap
Example 5 discusses what happens when the brush sends asynchronous requests and the responses arrive out of order. One possible overlap policy is to display the response as soon as it arrives if it is more recent that that shown previously. For example, assume an intended sequence of requests [1, 2, 3, 4, 5]. If the the received order of the sequence is [1, 4, 3, 5, 2], then following this policy, the rendered result will be [1, 4, 4, 5, 5] for each timestep. This shows partial results that are slightly out of sync, but also prevents the potential issue of showing an incorrect final result, as well as out of order updates.
Listing 16 shows an implementation of this policy. The brush response input event keeps track of what the timestep of the initial request as the attribute requestTimestep. carrier and flightCount are the actual response payloads. barChartOutput picks the response of the most recent request by sorting the responses in descending order of requestTimestep.
To summarize, all the design policies in the examples can be expressed by straightforward selection and join queries that are easier to write and understand than state that is manipulated and stored in variables throughout the application logic. This allows the developer to quickly iterate on different concurrency policies to find those most effective for their application.
7 From Fixing Conflicts to Creating Features
The previous sections demonstrated how diel can be used to help developers resolve different types of timespan conflicts. However, the power of diel is not limited to this capability. In fact, the key concept behind diel– the use of event history to reason about visualization state – can be extended to implement other capabilities.
Shneiderman has argued for recording a history of actions to support undo, replay, and progressive refinement [36]. Since diel records history, these important features are easy to support. Below we present how diel can be used to support undo, provide history-based scents, perform logging and debugging, and facilitate information sharing in a client-server architecture.
7.0.1 Undo
History is immutable, thus diel models an undo as another interaction event. We will show linear undo, used in applications such as Emacs [29], in the Tweet example. For simplicity, we assume that the user click on Tweets to select them. She clicks Tweets A, B, and C, and then presses undo twice. The sequence of actions is (A, B, C, Undo, Undo), but what the user will see is the sequence of selections: (A, B, C, B, C). where B mean that B is selected due to an undo action.
Listing 17 provides an implementation. We first define an undo input table; allSelections will track the sequence of selections as described above, where the rank is simply the sequential id. The state program populates allSelections with the actual selected tweet for each logical timestep (i.e., each click and undo event). To compute the selected tweet, it first checks whether the most recent interaction were undo events or normal clickEvents (lines 23-25). If there are undo events, then it subtracts their count from the maximum rank in allSelections to identify the selection that the undos represent (e.g., B after the first undo above). Otherwise, currentUndoSel is empty.
currentSelection uses the SQL coalesce function to either pick the selection from the currentUndoSel, or if it is empty, to pick the most recent brush event. Finally, the state program adds the current selection.
7.0.2 History-Based Scents
The HindSight design technique represents interaction history directly in the data visualizations to enhance the user experience [11]. Figure 18 illustrates an example for the tweet visualization, where the regions that the user has brushed are visualized in a world map view widget. The visualization is that of the final selection (filter by mouse up), and only selects distinct values (a convenient SQL functionality).
7.0.3 Logging and Debugging
Since all the events are logged, developers can save the diel tables from user sessions to the server, which can be queried at a later time, to answer questions such as, how many interactions were made, and what the average latency was per requests. The authors have used this functionality to replay past user sessions as part of debugging the visualization [27, 32, 7].
7.0.4 Client-Server Architectures
Accessing remote servers directly is another functionality that is supported by the diel model. Listing 19 shows a join between a diel table mapEvent and a table remote.tweets that is stored on a remote database server. Here, the new input tweetEvents is defined as a query over a remote data source based on the user’s panning interactions. This allows the request to be evaluated asynchronously, and be handled simply as another diel input table. This functionality is not yet implemented, because it requires a federated SQL layer [35] that executes across multiple databases. However, the techniques are well understood, and we plan to borrow techniques from existing federated systems such as BigDAWG [13].
Another common functionality for applications with client-server architectures is to upload the current state of a client application to the server, useful for migrating a session from one device to another [14] or for restoring a user session after a network disconnection. To implement the feature, the developer shares diel tables on the client to servers, and the complete session can be directly reconstructed base on diel tables.
8 Related Work
Our work presents the challenge of timespans in interactive visualization, which broadly falls in the category of “concurrency” issues, which has been explored in much depth in other fields such as databases and collaborative groupware. While the root cause of the problems are similar in that events do not occur in lockstep, the challenges they cause are very different. In this section, we show how timespans differ from prior models of concurrency/asynchrony, and how diel uniquely addresses the issues of timespans.
8.1 Concurrency in Databases
Databases developed the formalism of transactions as the foundation of concurrent execution of reads and writes to the tables. The properties that a transaction must maintain is that the transactions are atomic—all the reads/writes are executed in full, or none are and that the transactions are isolated from other concurrent transactions. While diel makes use of the relational representation, it does not use the transaction formalism to reason about timespans. In diel there is only one thread running, appending one event at a time. When diel discusses concurrency, it’s the concurrency between timespans, not between insertions. For future work, we could investigate how to map timespans into transactions. However that will be work building on top of diel, not instead of it.
There is no notion of locking in diel’s context since there are no concurrent record access. Note that this should not be confused with blocking in the UI; blocking is handled in diel not by rejecting input record but by not outputing the most recent event.
8.2 Operational Transforms
Collaborative editors also face the issue of “concurrent” operations across users. However, interactive visualizations express queries over data for a single user, and groupware express modifications of shared objects for applications like shared documents and multi-player games.
The focus in collaborative editing is on distributed consistency: the issues that arise when multiple views (replicas of system state) are subject to viewing and modification, and each replica may learn about updates to system state in different orders resulting in potential inconsistency [9, 38]. diel instead addresses the case where multiple processes (e.g., user interactions, responses to previous asynchronous requests, streaming data) modify a single visualization state. Thus, all events are strictly ordered and recorded in the log, and the diel system is able to see the timespans formed by pairs of ordered events. This is arguably a simpler problem than the above (which has a reputation for being complex [38] and difficult to prove).
diel has a single event log and does not need to keep replicas of the application state in sync, which is much simpler. diel provides a programming abstract to enable programmers to design the correct behavior easier, and not an algorithm that imposes a design choice. That being said, integrating the two approaches for collaborative visualizations is interesting future work.
8.3 Programming Abstractions for Concurrency and Asynchrony
Outside of application verticals, there has been several domain specific languages designed to deal with concurrency, ranging from functional reactive programming (FRP), temporal logic, to constraint programming. We discuss each in turn.
In Reactive Extensions, events are modeled as streams that can be composed with functional operators such as switch and reduce [25]. Reactive Vega is another popular functional reactive programming framework [34], with a focus on interactive visualizations. Reactive Vega generates a dataflow graph based on declarative specifications and can efficiently process events.
Both systems are fundamentally streaming (“push”) driven systems, which encourage a reactive coding style similar to state machines, where state transitions and emits output as input flows through. However it is feasible to use these FRP languages as a substrate to incrementally maintain both a history and the result of standing queries over that history: this is how stream queries and relational materialized view maintenance work. In short, one could probably implement diel-like logic as an idiosyncratic design pattern in an FRP language like Rx or Vega, but doing so would not be a task for a basic visualization developer.
Bloom provides a simple temporal model, consisting only of timestamped data, event-driven timesteps, and relational queries run at each timestep [1]. diel applies principles from the Bloom language to interactive applications, and adapts a SQL interface as opposed to Datalog, which is lesser known and presents a deeper learning curve.
diel is also related to the Forward [12] system and the DeVIL language [45, 44]. Both are database abstractions for application programming. Forward is a “full stack” framework where developers can directly access remote tables. Forward differs from diel in that it synchronizes the client and server state and does not provide abstractions for asynchronous events. DeVIL is similar to diel in that it defines declarative SQL-like abstractions to express interactions and how they change the visualization state. However, it does not explicitly store event history, nor let developers query the history.
Constraint programming [17, 28] automatically maintain developer specified constraints over variables, and allow the constraints to be context-dependent to the input sequence via FSM abstractions. diel does not allow developers to reason about state transitions for individual events. Although our language is more restrictive, it is sufficiently expressive for programming concurrent events in interactive visualizations—as opposed to generic interfaces.
diel expose the log as a simple query-able table, and make it easy to express and directly see the effects of a wide variety of reconciliation policies within the visualization, in order to choose the policy that provides good user experience.
9 Conclusion and Future Work
In this paper, we present DIEL, a model that allows a developer to reason about timespans and write programs for resolving conflicts from overlapping timespans. We implemented a prototype DSL and demonstrate the ease with which DIEL supports different designs for a variety of real-world examples. We also showed that the history diel keeps facilitates functionalities such as undo, replay, and logging.
Future work directions include further system improvements, as well as exploration of the design space and policies for overlapping timespans. In the near term we plan to open source diel and release a gallery of examples to get feedback from the interactive visualization community. On the system front, we plan to enhance the performance of the diel engine by exploring (a) storage reclamation for unreachable history [6], (b) avoidance of recomputation via materialized view maintenance techniques [4] and lineage [30], and (c) optimized federated query execution across remote and local data [35]. On the policy level, we are exploring the design space for timespans involving asynchronous remote calls, performing user studies to better understand how design affordances can enable users to exploit and keep track of the multitasking that is made possible by well-managed overlapping timespans. Over time, we hope to generalize to higher-level principles for design recommendations for timespans, much as the work on grammars of graphics have led to design recommendations for static visualizations [22, 23, 43].
10 Acknowledgments
This work was supported by the National Science Foundation under Grant No. 1564351, 1527765, and 1564049.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1[1] P. Alvaro, N. Conway, J. M. Hellerstein, and W. R. Marczak. Consistency analysis in bloom: a calm and collected approach. In CIDR , pp. 249–260, 2011.
- 2[2] H. Benchmark. Reaction time, 2018.
- 3[3] S. Card, T. Moran, and A. Newell. The model human processor- an engineering model of human performance. Handbook of perception and human performance. , 2:45–1, 1986.
- 4[4] R. Chirkova, J. Yang, et al. Materialized views. Foundations and Trends® in Databases , 4(4):295–405, 2012.
- 5[5] M. Coblenz, J. Sunshine, J. Aldrich, B. Myers, S. Weber, and F. Shull. Exploring language support for immutability. In Proceedings of the 38th International Conference on Software Engineering , pp. 736–747. ACM, 2016.
- 6[6] N. Conway, P. Alvaro, E. Andrews, and J. M. Hellerstein. Edelweiss: Automatic storage reclamation for distributed programming. Proceedings of the VLDB Endowment , 7(6):481–492, 2014.
- 7[7] E. Czaplicki. The perfect bug report, 2016.
- 8[8] E. Czaplicki. Effects examples, 2017.
