Society is increasingly data-driven. Charts abound in scientific papers, the media and government reports, where they are used to make an argument or justify a decision. Yet most of those charts are static, even on websites: users cannot interact further with the data, for example to look for correlations other than the ones presented.
iScatter is my contribution to improving what can be done with a scatterplot, a very common and general chart type, applicable to any domain. iScatter is more sophisticated than the scatterplot widgets I've seen – where the 'interaction' is often little more than hovering over a data point to see its precise x and y coordinates – whilst being far easier to use (and necessarily much simpler and less flexible) than professional statistical packages, which have a steep learning curve.
In the rest of this page I show an example of what iScatter can do and explain how you can create your own data for presentation with iScatter in your own web pages.
If you spot an error in iScatter, if you have any query or suggestion, or need help including iScatter in your site, please add a comment below or e-mail me directly.
The example is taken from education. The fictitious data is about students. Their attributes are: id, gender, age, ethnicity, exam and assignment scores, tutor (who also marked the assignments), exam marker and exam script id.
Here is the chart. Click on the "?" icon to start a guided help tour. You need to use Chrome or Safari on a desktop (not a tablet) to interact with iScatter.
Here are some exercises to try out iScatter, with some hints if you're stuck:
- Are the mean scores higher for women or men? Do assignments seem to prepare students for the exam? See this chart.
- Are tutor groups diverse in terms of gender and ethnic background? See this chart and then hover over each tutor in the legend.
- Is any marker lenient or severe? Were tutor groups shuffled among markers? Did any tutor's students achieve better results? See this chart and then hover on the tutors.
As the last example shows, subsets in the legend serve two purposes: to colour circles and to have associated reference lines. Tutors are listed first to colour the circles and to answer the second question of tutor vs marker allocation. Markers are listed next in the legend, so that reference lines for their median scores can be displayed.
iScatter has some keyboard shortcuts: + to zoom in, - to zoom out, = to auto-zoom, u, d, l, r to pan up, down, left or right, and ? to start the guided help. If there are several charts on the page, the keyboard controls only one of them.
For another example of an interactive scatterplot, with real data and more data points, see this post.
My colleague Tony Hirst wrote a free online data visualisation tutorial that uses iScatter in places. The tutorial accompanies some videos Hans Rosling recorded for the Open University, about how to compare countries according to a range of indicators.
Creating the data
Use your favourite application to create two spreadsheets, one defining the attributes (the data schema) and the other the data points. The schema has 6 columns and one row per attribute. Here's an excerpt of the exam schema:
| Id | Name | Description | Unit | Type | Level |
|---|---|---|---|---|---|
| marker | Marker | Exam script marker | | string | nominal |
| oes | Exam | Overall exam score | | number | ratio |
The header row must be as shown, but in lowercase. The id, name, type and level columns must be filled; the description and the unit, both arbitrary text, are optional (see last row). The type must be string or number. The level must be one of nominal, ordinal, interval, ratio. iScatter turns the attribute name and unit into the axis title, and the description into a tooltip.
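The rules above can be checked mechanically before handing a schema to iScatter. The following is a minimal sketch in Python, not part of iScatter itself: the function name `check_schema` is my own, and the check only confirms that the six lowercase column names are present, without enforcing any column order. The two sample rows are the ones from the excerpt above.

```python
import csv
import io

REQUIRED = ("id", "name", "type", "level")   # must be filled in every row
OPTIONAL = ("description", "unit")           # may be left empty
TYPES = {"string", "number"}
LEVELS = {"nominal", "ordinal", "interval", "ratio"}


def check_schema(csv_text):
    """Return a list of problems found in a schema CSV (empty if none)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    problems = []

    # The header must name all six columns, in lowercase.
    missing = [c for c in REQUIRED + OPTIONAL if c not in header]
    if missing:
        problems.append("header lacks column(s): " + ", ".join(missing))
        return problems

    # Row 1 is the header, so the first attribute is on row 2.
    for row_number, row in enumerate(reader, start=2):
        for column in REQUIRED:
            if not (row[column] or "").strip():
                problems.append(f"row {row_number}: '{column}' is empty")
        kind = (row["type"] or "").strip()
        if kind and kind not in TYPES:
            problems.append(
                f"row {row_number}: type '{kind}' is not string or number")
        level = (row["level"] or "").strip()
        if level and level not in LEVELS:
            problems.append(
                f"row {row_number}: level '{level}' is not a valid level")
    return problems


# The two attributes from the schema excerpt, with the unit left empty.
SCHEMA = """id,name,description,unit,type,level
marker,Marker,Exam script marker,,string,nominal
oes,Exam,Overall exam score,,number,ratio
"""

print(check_schema(SCHEMA))
```

An empty list means the schema follows the rules listed above; each string in a non-empty list names the offending row and column.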
As for the data spreadsheet, it must have one row per data point and one column per attribute. The columns are headed by the attribute ids you defined in the schema. Here's an excerpt of the exam data:
Finally, use your spreadsheet application to export each spreadsheet to a CSV (comma-separated values) file. For the student example, I created an Excel workbook with two spreadsheets, and used 'Save As...' to get the CSV schema and data files. The workbook and CSV files are in the iScatter resource bundle (see below).
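Once both files are exported, it can be worth confirming that the data file matches the schema: one column per attribute id, and numeric values wherever the schema says `number`. Here is a minimal sketch in Python, again not part of iScatter. The function name `check_data` and the two data rows are made up for illustration, and empty cells are skipped because I make no assumption here about how iScatter treats missing values.

```python
import csv
import io


def check_data(schema_text, data_text):
    """Return a list of mismatches between a data CSV and its schema CSV."""
    schema = list(csv.DictReader(io.StringIO(schema_text)))
    types = {row["id"]: row["type"] for row in schema}

    reader = csv.DictReader(io.StringIO(data_text))
    header = reader.fieldnames or []
    problems = []

    # Every attribute id needs a column, and every column needs an attribute.
    for attribute in types:
        if attribute not in header:
            problems.append(f"no column for attribute '{attribute}'")
    for column in header:
        if column not in types:
            problems.append(f"column '{column}' is not in the schema")

    # Values of 'number' attributes must parse as numbers (empty cells skipped).
    for row_number, row in enumerate(reader, start=2):
        for attribute, kind in types.items():
            value = (row.get(attribute) or "").strip()
            if kind == "number" and value:
                try:
                    float(value)
                except ValueError:
                    problems.append(
                        f"row {row_number}: '{value}' in column "
                        f"'{attribute}' is not a number")
    return problems


SCHEMA = """id,name,description,unit,type,level
marker,Marker,Exam script marker,,string,nominal
oes,Exam,Overall exam score,,number,ratio
"""

# Hypothetical data points, invented for this example only.
DATA = """marker,oes
M1,62
M2,48.5
"""

print(check_data(SCHEMA, DATA))
```

In practice you would read the two exported files from disk instead of using the inline strings.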
Creating the web page
iScatter has a programmatic interface that allows you to change what is displayed (as seen in the above exercise hints) but also to restrict what is displayed. For example, you may have noticed that:
- script numbers and student identifiers can't be shown on either axis, although they're part of a student's attributes, as seen when hovering over a circle;
- the assignment scores can only appear on the x axis and the exam score only on the y axis;
- there are three fixed lines (distinction, merit, pass) for exam and assignments, but only two (benchmark 1 and 2) for the age attribute.
The best way to create your own page and see iScatter's programmatic interface at work is to download the resource bundle below and modify the bare-bones template as needed.
The following files are available:
- a two-page summary
- my poster for the 2014 Higher Education Academy STEM conference, which won the best poster award (academic/support staff category)
- a zip file with iScatter's code, some data sets, and the bare-bones template.
iScatter works on Chrome 31 (Mac or Windows), Safari 6 (Mac), and Internet Explorer 11 (Windows), although Internet Explorer garbles the tooltips. iScatter doesn't work with Firefox. I haven't tried other browsers or operating systems.
iScatter requires mouse hovering to present information on demand about the plotted data points and their statistics. As such, half the functionality is not available on tablets.