Ensemble Data Tool
The Ensemble Data Tool (EDT) is used to visualize and analyze the RDF results of an MRM run within RiverWare. Figure 3.12 shows a screenshot of a sample Ensemble Data Set with over-plotted lines for each trace. This section describes the Ensemble Data Tool and how to use it.
Figure 3.12 Screenshot of the Ensemble Data Tool
Ensemble Data Tool Sequence
To use the Ensemble Data Tool, you must have RDF output from an MRM run. An example Ensemble Data Tool use case is described by the following sequence. Many of these steps can be automated using scripts, as described in Script Actions for Working with Ensemble Data Sets.
3. Use the Import RDF button and select the RDF file. An Ensemble Data Set object is created on the workspace and shown in the tool.
4. If the Ensemble Data Set contains series data, use the Create Data Views button to show over-plotted lines of all the traces for one or more slots.
5. Use the Legend/Filter button to see a legend of the traces and to filter the visibility of curves shown.
6. To perform analysis, select the data set and select Analyze.
7. Select the main category of analysis and then select any filtering options for slots or traces. Select the specific analysis type (mean, min, max, etc.) along with any other relevant options. Select Perform Analysis when ready.
8. Based on the selections, the resulting data will be added to slots on new or existing Ensemble Data Set objects. Open the objects from the EDT or the workspace and view or plot the data. In the future, additional plotting and visualization options may be available from the EDT.
9. If desired, use the “Visualize” button to create custom data views of the RDF data or the analysis results. Data Views with the following types can be added:
a. RDF Slot line curves on a plot
b. Analysis results curves and/or regions on a plot
10. If specified, the Ensemble Data Set objects will be saved in the model file and can be accessed for additional analysis.
Accessing the Ensemble Data Tool
Access the Ensemble Data Tool from any of the following locations:
• From the workspace, use the Utilities and then Ensemble Data Tool menu.
• From the Multiple Run Control dialog, use the View and then Ensemble Data Tool menu.
• From an Ensemble Data Set object’s open object dialog, use the Open Ensemble Data Tool button.
The Ensemble Data Tool then opens. An empty Ensemble Data Tool is shown in Figure 3.13.
Figure 3.13 Ensemble Data Tool with no Ensemble Data Sets
How to Use the Ensemble Data Tool
The Ensemble Data Tool allows you to create, view and analyze Ensemble Data Sets that are created from RDF files. This section describes these operations.
Creating, Organizing and Deleting Data Sets
Use the Operations and then Import from RDF menu or use the Import RDF button. Then use the file chooser to select the RDF file and select Open. The Ensemble Data Tool reads the RDF file and creates an Ensemble Data Set object on the workspace and shows it as an item in the Data Sets panel.
The default name and metadata are shown in the Data Sets panel and described below. Right-click and choose Rename Selected Data Set to change the data set name. Right-click options also include Expand All and Collapse All for items in the tree view.
Use the up and down arrow buttons to move the selected data set up or down in the list.
Use the Delete Set button to delete the data set and the ensemble data set object from the tool and model.
Viewing Data Set Properties
The Data Sets panel presents the list of data sets as an expandable tree view. Selecting a data set expands its row in the tree view.
For each ensemble data set, the following metadata and properties are shown in the list:
• Save With Model: Whether the ensemble data set should be saved in the model as an object or is temporary and will not be saved
• RDF File Path: The path of the RDF file from which the data set was created
• RDF File Creation Time: The date and time at which the RDF file was created
• RDF Number of Timesteps: The number of timesteps for series slots in the RDF file
• RDF Start Date: The earliest date/time of series data in the RDF file
• RDF End Date: The latest date/time of series data in the RDF file
• RDF Timestep size: The size of the timestep for series data
• RDF Number of Runs: The number of runs in the RDF file
• RDF Trace Range: The range of trace numbers for traces in the RDF file
• Number of Data Set Slots: The total number of slots in the data set, found by multiplying the number of RDF slots by the number of traces
• RDF Slots: An expandable list of all of the slots in the set. With this expanded, select one slot and create a Data View or Plot Page using the buttons or right-click context menus.
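The Number of Data Set Slots value can be checked with simple arithmetic. The sketch below is illustrative only: the slot names and trace range are hypothetical, and it assumes the RDF Trace Range is inclusive at both ends.

```python
# Hypothetical values, for illustration only (not taken from a real RDF file).
rdf_slots = ["Reservoir A.Outflow", "Reservoir A.Pool Elevation", "Reach 1.Inflow"]
trace_range = (1, 30)  # assumed inclusive: traces 1 through 30


def number_of_data_set_slots(slot_names, first_trace, last_trace):
    """Return the number of RDF slots multiplied by the number of traces."""
    num_traces = last_trace - first_trace + 1
    return len(slot_names) * num_traces


total = number_of_data_set_slots(rdf_slots, *trace_range)
print(total)  # 3 slots x 30 traces = 90
```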
For each Analysis data set, the following are shown:
• Save With Model: Whether the ensemble data set should be saved in the model as an object or is temporary and will not be saved
• Analyzed Slots: For each slot analyzed, the following is shown:
– Analysis Action: The script action that was used to perform the analysis.
– Analysis Type: For certain analysis actions, the type performed, often the statistical approach used.
– Timestamp: The time and date at which the analysis was performed.
• Results Slots: An expandable list of slots that were the result of the analysis. Select one slot and create a Data View or Plot Page using the buttons or right-click context menus.
Data Views - Plots of Data Sets Series Slots
The EDT displays graphical visualizations as Data Views in the panel on the right. Each Data View is a tab in the panel. A sample screenshot with the Data Views panel is shown in Figure 3.14.
Figure 3.14 Screenshot of Ensemble Data Tool Selected Data Set panel
To create a view, there are two approaches:
1. In the EDT, select either a Data Set or a slot and use the Create Data Views button to quickly create line plots. With an entire Data Set selected, there will be one Data View / tab for each slot. If you have an individual slot selected, only that slot will be shown as a new tab in the Data Views.
2. Use the Visualize button to create a custom data view. Select the desired slots, traces, time range and destination for the data views. See Visualizing Data Sets for more information.
The following Data Views are currently supported:
• Series data from RDF data sets can be plotted as over-plotted lines, also called spaghetti plots.
• The results of a Multiple Statistics analysis can be plotted as lines and shaded regions.
• Results of an Analyze Ensemble Data Set action can be plotted as a single line.
Use the following buttons to perform operations on the Data Views.
• Close View: Close the current view only.
• Create Plot Page: Create a Plot Page in the Output Manager for the current view. Once in the Output Manager, you can further customize all aspects of the plot.
Note: You can create a Plot Page from a Multiple Statistics analysis that includes shading between curves, but the Plot Page has little support for the shading. There is currently no capability to change its colors or opacity.
• Close All Views: Close all the Data View tabs.
View the legend and filter the visibility of the curves using the Legend/Filter button, as shown in Figure 3.15.
Figure 3.15 Legend and Filter panel shown in the EDT
Select individual curves to hide or show, or use the All/None button as needed. Once the filter is configured as desired, you can copy and apply it using one of the following approaches:
• Select the Copy Visibility Filter button, switch to a different Data View, and then select the Apply Visibility Filter button.
• In the Operations menu, select the options to Copy/Apply Visibility Filter from Data View.
• Use the right-click context menu on the plot itself to Copy Curve Visibility Filter, and use the similar option to apply it to a different Data View.
Within the Data View, the operations are similar to other plots in RiverWare. Hover over a curve to see the run/trace number, date and value.
Draw a rectangle to zoom in. Click and drag the middle mouse button to pan the plot. Right-click on a plot area for additional plotting functionality, as shown in Figure 3.16.
Figure 3.16 Ensemble Data Tool Plotting controls.
Tip: To further customize the plot, use the Create Plot Page button. Then you can edit and save the Plot Page.
Analyzing Data Sets
Use the Analyze button to perform analysis on an Ensemble Data Set. Currently, only RDF data sets can be analyzed; in the future there may be ways to perform additional analysis on Analysis data sets.
The following types of analysis actions are supported by the Ensemble Data Tool. These correspond directly to the types of analysis supported by the ensemble data analysis script action types. See the script documentation for the specific details of each analysis type.
Analysis Action (data type) | Link to Script Information |
---|---|
Analyze Ensemble Data Set (per timestep, across traces) | |
Compute Regression (on scalar data, across traces) | |
Compute Duration Curve (on series data, across traces) | |
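To illustrate what a duration curve computed on series data across traces represents, the following minimal sketch pools the values from every trace and timestep, sorts them, and assigns each a percent exceedance. It is not RiverWare's implementation: the function name, the sample flows, and the Weibull plotting position (rank / (n + 1)) are assumptions made for this example, and the script documentation remains the authority on the actual method.

```python
def duration_curve(traces):
    """Pool values from all traces and timesteps, then rank them.

    Returns a list of (percent_exceedance, value) pairs, largest value first.
    The Weibull plotting position rank / (n + 1) is an assumption of this
    sketch, not a statement about RiverWare's method.
    """
    pooled = sorted((v for trace in traces for v in trace), reverse=True)
    n = len(pooled)
    return [(100.0 * rank / (n + 1), value)
            for rank, value in enumerate(pooled, start=1)]


# Hypothetical flows: 2 traces x 3 timesteps.
traces = [
    [120.0, 80.0, 100.0],  # trace 1
    [90.0, 110.0, 70.0],   # trace 2
]

curve = duration_curve(traces)
for pct, value in curve:
    print(f"{value:6.1f} is exceeded roughly {pct:4.1f}% of the time")
```

Pooling every trace before ranking is what distinguishes this from a duration curve built for a single run: the resulting curve summarizes the whole ensemble.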
When you select Analyze, the tool presents a modal dialog that allows you to configure and perform the analysis, as shown in Figure 3.17.
Figure 3.17 Screenshot of Analyze Data Set dialog
Select any item in the right column to initiate editing. For example, select the Mean value and then select from the menu to change the analysis type, as shown in Figure 3.18. The various options are described in the script documentation; see the links in the table above in Analyzing Data Sets.
Figure 3.18 Screenshot of editing while using the Analyze Ensemble Data Set dialog
Warning: Ensemble data analysis is not supported for aggregate series slots. Including aggregate series slots in ensemble data analysis will result in an error.
Visualizing Data Sets
Use the Visualize button to create line plots as Data Views in the EDT and/or Plot Pages. The tool presents a modal dialog that allows you to configure the visualization, as shown in Figure 3.19.
Figure 3.19 Screenshot of Visualize Ensemble Data Set dialog
Select any item in the right column to initiate editing. For example, select the “All Traces” value and then select from the menu to change which traces to show, as shown in Figure 3.20. The various options are described in the script documentation; see Visualize Ensemble Data Set in Automation Tools.
Figure 3.20 Screenshot of editing while using the Visualize Ensemble Data Set dialog
Once satisfied, select Create Visualization and the specified Data Views and/or Plot Pages will be created.
About Ensemble Data Sets
An Ensemble Data Set is an object that contains either RDF trace data or the results of analysis of that data.
In addition to being listed in the Ensemble Data Tool, all ensemble data sets are displayed on the workspace. When a data set is created, its workspace object is automatically placed on the workspace in the lower right corner of the region containing existing objects.
Figure 3.21 shows a workspace containing several ensemble data sets. Variations in the icon distinguish between the set's data source (RDF or Analysis).
Figure 3.21 Workspace with ensemble data set objects
Ensemble Data Sets are simulation objects. For example, the data are represented as slots, sometimes referred to as “data set slots”, which can be series, scalar, or table slots. An Ensemble Data Set can be opened in the Object Viewer, which gives access to the wide range of RiverWare functionality that applies to slot data. For example, you can visualize ensemble data in plots or canvas charts and write the data using DMIs or model reports.
Ensemble Data Sets in the Object Viewer have an additional row of controls, allowing you to control whether the data set will be included in the model file and providing a button with access to the Ensemble Data Tool, as shown in Figure 3.22.
Note: The object is read-only; many operations that change the object, slots, and data are disabled.
Figure 3.22 Object Viewer displaying an Ensemble Data Set
Script Actions for Working with Ensemble Data Sets
The following script actions operate on the Ensemble Data Tool:
• Analyze Ensemble Data Set - Compute a statistic on an Ensemble Data Set, per timestep, across traces. The built-in statistics are exceedance, max, mean, median, min, percentile, and sum. In addition, you can specify a global RPL function to compute the per-timestep statistic across traces.
• Compute Duration Curve - Compute a duration curve for a set of series slots across multiple traces and timesteps from an open ensemble data set.
• Compute Regression - Perform a multiple linear regression analysis on scalar data, across traces in the ensemble data set.
• Delete Ensemble Data Set - Close and delete the specified ensemble data set from the Ensemble Data Tool and the model, freeing the memory associated with the data set.
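The phrase "per timestep, across traces" used for the Analyze Ensemble Data Set action can be made concrete with a small sketch. The code below is plain Python, not a RiverWare script or RPL; the function name and the sample data are hypothetical. For each timestep it gathers the value from every trace and reduces them to a single number, producing one result series.

```python
import statistics


def analyze_per_timestep(traces, stat):
    """Apply `stat` to the values of all traces at each timestep.

    `traces` is a list of equal-length series (one per trace); the result
    is a single series with one value per timestep.
    """
    return [stat(values_at_t) for values_at_t in zip(*traces)]


# Hypothetical data: 3 traces x 4 timesteps.
traces = [
    [10.0, 20.0, 30.0, 40.0],  # trace 1
    [12.0, 18.0, 33.0, 44.0],  # trace 2
    [14.0, 25.0, 27.0, 36.0],  # trace 3
]

mean_series = analyze_per_timestep(traces, statistics.mean)
max_series = analyze_per_timestep(traces, max)
median_series = analyze_per_timestep(traces, statistics.median)

print(mean_series)    # [12.0, 21.0, 30.0, 40.0]
print(max_series)     # [14.0, 25.0, 33.0, 44.0]
print(median_series)  # [12.0, 20.0, 30.0, 40.0]
```

Passing a different reducing function as `stat` is loosely analogous to supplying a global RPL function in place of a built-in statistic.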
A sample use case would be the following sequence of script actions:
1. Create Ensemble Data Set action to read an RDF file and create the data set.
2. Open Ensemble Data Tool action to open the dialog box.
3. Visualize Ensemble Data Set action to create plots of the imported RDF data.
4. Memo action to pause while you look at the plots.
5. Analyze Ensemble Data Set to perform statistical analysis on the data set.
6. Open Slot to show the slot populated by the Analyze Ensemble Data Set.
7. Visualize Ensemble Data Set action to create plots of the resulting statistics.
8. Delete Ensemble Data Set to remove the data set from memory.