Lecture Notes

Lecture Pod 6 – The Beauty of Data Visualisation

break-up-times

*Image taken directly from the lecture.

The main point of the lecture is that Information is Beautiful. Data visualisation helps us to home in on the information that matters. Visualising data helps us to see patterns and connections that we would not otherwise have seen. Data can be massaged, shaped and compared to create new insights, and once shaped it can be used to tell a story that no one has noticed before. Data visualisation is really the combination of the languages of both the eye and the brain: images and words combined to create new meaning. It is also a form of data compression, making a ridiculous amount of data fit into a small space while remaining understandable.

It can also just look cool.

 

billion-dollar-o-gram

*Image taken directly from the lecture.

The main thing that I learnt from this lecture was that for information to be understood, it needs a good visualisation. It needs to be quickly understood and also open to further interrogation. If people are scrolling on social media, they are not going to stop for a huge paragraph of text, but they may stop for a visualisation.

References

McCandless, D. (2016). The beauty of data visualization. Ted.com. Retrieved 18 October 2016, from http://www.ted.com/talks/david_mccandless_the_beauty_of_data_visualization


Lecture Pod 5 – Data Journalism

Olympic Medal Tables.png

*This image was taken from the lecture about data journalism in action at the London Olympics (referenced at the end).

This lecture was all about data journalism: what it is, its history and its use in something like the Olympics. The key definition here is that data journalism is telling a story using data to bolster your argument or point of view. The point of data journalism is to tell a story in a way that will keep the reader entertained. The good thing about data journalism is that it can (to a degree) take opinions out of the news and leave you with cold hard facts. Most stories have something to do with data. Data journalism is now simply journalism: because people don’t trust reporters anymore, reporters need to bolster their arguments with data.

There is a myth that data journalism did not exist before 2009, but that is not true. Data journalism has been used ever since the first issue of The Guardian back in the 1800s. It has always been a part of the news; it has just been popularised recently. Even just using text and simple lines could be considered an effective data visualisation. These data visualisations have often been used to explain things about war to the public.

The Olympics are a good example of data visualisation. Everyone wants to know who is the best, and medal tables are how they find out. But this data may not tell the whole story. If the data were adjusted for things like GDP or population, what effect would that have on the medal tally? A visualisation like this was made by The Guardian for the 2012 Olympics. It allowed people to interrogate the data themselves and it could be updated in real time.
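
As a rough sketch of what adjusting a medal table for population could look like, the Python below ranks countries by medals per million people. The country names and figures are made up purely for illustration and are not real Olympic data, and `medals_per_million` is a hypothetical helper, not something taken from The Guardian's visualisation.

```python
# Hypothetical figures for illustration only (not real Olympic data).
countries = {
    "Country A": {"medals": 100, "population": 300_000_000},
    "Country B": {"medals": 30, "population": 20_000_000},
    "Country C": {"medals": 10, "population": 2_000_000},
}


def medals_per_million(medals, population):
    """Return medals won per one million inhabitants."""
    return medals / (population / 1_000_000)


# Ranking by raw medal count.
raw_ranking = sorted(
    countries, key=lambda c: countries[c]["medals"], reverse=True
)

# Ranking by population-adjusted medal count.
adjusted_ranking = sorted(
    countries,
    key=lambda c: medals_per_million(
        countries[c]["medals"], countries[c]["population"]
    ),
    reverse=True,
)

print("Raw:", raw_ranking)
print("Per million people:", adjusted_ranking)
```

With these made-up numbers the order flips completely once population is taken into account, which is the kind of "whole story" a raw medal table can hide.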

The most important point here, though, is the speed at which a data visualisation can now be produced. This speed was not available 20 or 30 years ago. Now visualisations can be made quickly and updated in real time rather than being fixed, which is a great asset when covering something ongoing, like the Olympic medal tables.

References

Data journalism in action: the London Olympics. (2016). YouTube. Retrieved 15 October 2016, from https://www.youtube.com/watch?v=WyjBJzigm0w

History of Data Journalism at The Guardian. (2016). YouTube. Retrieved 15 October 2016, from https://www.youtube.com/watch?v=iIa5EoxyvZI

What is data journalism at The Guardian? (2016). YouTube. Retrieved 15 October 2016, from https://www.youtube.com/watch?v=IBOhZn28TsE

 

Lecture Pod 4 – Data Presentation Styles: Why do we use graphs?

Bar-Chart.jpg

*This image was taken from the lecture.

To put it simply, this lecture was about why we use graphs. I know, it’s an exciting topic, but an important one. We use graphs to make comparisons easier. That’s why bar charts are so good: they allow simple comparisons. Bubble charts are not so great at this. They only give us a general idea. Readers of a graph often simply compare the height instead of the area, making a bubble chart less powerful, despite its aesthetic appeal. We always tend to underestimate the size difference.
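
The height-versus-area problem can be shown with a little arithmetic. The sketch below assumes the bubble's area is drawn proportional to the value (a common convention, though not one the lecture spells out), and `radius_for_value` is a hypothetical helper written for this example.

```python
import math


def radius_for_value(value):
    """Radius of a circle whose AREA equals the given value."""
    return math.sqrt(value / math.pi)


small = radius_for_value(10)
large = radius_for_value(40)

# How much wider (or taller) the big bubble looks.
width_ratio = large / small

# How much bigger the big bubble actually is by area.
area_ratio = (math.pi * large ** 2) / (math.pi * small ** 2)

print(f"width ratio: {width_ratio:.2f}, area ratio: {area_ratio:.2f}")
```

The second value is four times the first, but its bubble is only twice as wide. A reader who compares heights will see "double" where the data says "quadruple", whereas in a bar chart the height is the value.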

We also looked at three common types of graphs. First was the bar chart, which is easy to use and familiar to the audience. It is often used to compare data across categories. Next was the line chart, which is just as common but is better used to display trends over time. Last were pie charts, which are commonly used and misused. They are best used to show relative percentages of information.

But the most important point made here was that designers often choose how to display their data incorrectly. They don’t think about what type of data they are presenting and what would be best for it. They tend to pick their charts based on other things like aesthetics which, while important, need to be balanced with actually communicating the information clearly. Designers can also be influenced by the data visualisation trends of the time, which can work out terribly.

References

Cmielewski, L. (2016). leonGraphsPod720p. Retrieved from https://vimeo.com/177306425

Lecture Pod 3 – Visualisation: Historical and Contemporary Visualisation Methods

Nightingale.png

*This image was taken from the lecture.

This lecture pod was about why we use data visualisations at all and their various uses throughout history. It showed various historical uses of data visualisation, such as the visualisation of Napoleon’s invasion of Moscow, Florence Nightingale’s charts of causes of death among British soldiers, Otto Neurath’s work, and serialised charts, all the way to the recent work by Alberto Cairo, The Functional Art.

Numerous key points were made in this lecture, all of which related to the historical examples given. The first was that visualisation is used to help an audience grasp complex ideas and difficult concepts quickly. Similar points were made, such as that extracting meaning from raw data is difficult but a graphic makes it simple, saving us time and effort. Another point was that a visualisation’s aim is for your eyes and your brain to perceive what lies beyond their natural reach. Another good point made in the lecture is that data visualisations are more complex today because we have access to much more data than ever before.

But I believe the most important point made in this lecture was really very simple: what you show in a visualisation can be just as important as what you don’t. This point struck home for me because it made me realise that sometimes showing only a small amount of data can be more effective than showing all of it. But it also made me realise something else: just how easy it is to convince people of a statistic through the simple act of omitting other data from a visualisation.

References

Cmielewski, L. (2016). Visualisation: Historical and contemporary visualisation methods- Part 1. Retrieved from https://vimeo.com/176255824

Cmielewski, L. (2016). Visualisation: Historical and contemporary visualisation methods- Part 2. Retrieved from https://vimeo.com/176255825

Lecture Pod 2 – Data Types

lecture 2 image

*This image was taken directly from the lecture.

This lecture pod was about one thing and one thing only – Data Types. There are four different types of data listed in this lecture. First was Nominal, then Ordinal, Interval and lastly Ratio.

Nominal data (pertaining to names) is made up of named categories that can be counted and used to calculate percentages, but you cannot take averages from them. The best example of this would be a supermarket’s sections (e.g. dairy, produce, canned & frozen): each category would be classed as nominal data.
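
To make the "count and percentage, but no average" idea concrete, here is a small sketch in Python. The list of items is invented for this example; only the section names come from the supermarket example above.

```python
from collections import Counter

# Hypothetical basket: each item is labelled with its supermarket section.
sections = [
    "dairy", "produce", "dairy", "canned",
    "frozen", "dairy", "produce", "canned",
]

# Counting is meaningful for nominal data.
counts = Counter(sections)

# So are percentages of the total.
total = sum(counts.values())
percentages = {name: 100 * n / total for name, n in counts.items()}

# The most common category (the mode) is also meaningful...
most_common_section, _ = counts.most_common(1)[0]

# ...but there is no sensible "average" of dairy and produce.
print(counts)
print(percentages)
print(most_common_section)
```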

Ordinal data is all about order. This type of data can be counted and be used to calculate percentages. There is also a debate about whether this data can be used to take averages, but I won’t go into it (my opinion is yes, it can). This data has no true mathematical value with numbers being assigned to make data analysis easy. The numbers just need to be in order. The best example of this data would be calculating which line will get you out of the supermarket quickest (long line, medium line, short line etc.).
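
A minimal sketch of ordinal data in Python, using the checkout line example. The observations are invented, and the numbers assigned to each label only record the order. The median is used here because it sidesteps the debate about averaging: it only relies on the order, not on the gaps between the labels.

```python
import statistics

# The order of the labels matters; the size of the gaps between them is undefined.
ORDER = ["short", "medium", "long"]
rank = {label: position for position, label in enumerate(ORDER)}

# Hypothetical observations of checkout lines.
observed = ["long", "short", "medium", "long", "medium"]

# Because the labels are ordered, they can be compared and sorted.
shortest = min(observed, key=rank.get)
in_order = sorted(observed, key=rank.get)

# median_low always returns one of the ranks actually present in the data,
# so it never invents a value that sits between two labels.
middle_rank = statistics.median_low([rank[label] for label in observed])
middle = ORDER[middle_rank]

print(shortest, in_order, middle)
```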

Interval data refers to data in which the interval between each point is fixed. This type of data is numeric. The best example of this would be time on a clock: there is always a set number of seconds in each minute, making it interval data. In this type of data, 0 does not mean that nothing is there (0˚C does not mean that there is no temperature).

Lastly there is ratio data. This type of data is numeric, like interval data, but differs from it in one key aspect: it has a meaningful 0 point. To put it clearly, in this type of data, 0 means the absence of the thing being measured. Examples of this data would be height, age, weight and money.
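
The difference between interval and ratio data shows up as soon as you try to say "twice as much". The sketch below uses made-up temperature and weight values; `celsius_to_kelvin` is a small helper written for this example, using the standard offset of 273.15.

```python
def celsius_to_kelvin(celsius):
    """Convert Celsius (interval scale) to Kelvin (ratio scale, true zero)."""
    return celsius + 273.15


cool_c, warm_c = 10.0, 20.0

# Differences are meaningful on an interval scale: 10 degrees warmer.
difference = warm_c - cool_c

# Ratios are not: 20 degrees C is not "twice as hot" as 10 degrees C,
# because 0 degrees C is an arbitrary point, not the absence of heat.
naive_ratio = warm_c / cool_c

# On a scale with a true zero the ratio is meaningful (roughly 1.035 here).
kelvin_ratio = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cool_c)

# Weight is ratio data: 0 kg means no weight, so 80 kg really is
# twice as heavy as 40 kg.
weight_ratio = 80.0 / 40.0

print(difference, naive_ratio, round(kelvin_ratio, 3), weight_ratio)
```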

I believe, however, that the most important point in this lecture was made early on, about whether data types matter. The answer is obviously yes, otherwise we wouldn’t use them, but the reason is exceptionally simple. We use data types to prevent mistakes, like a postcode being interpreted as a PIN or something like that. This is the most important point because it drives home one simple idea for me: it is easy to make mistakes, but it is easier to avoid them if you label your data correctly.
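
Here is a small illustration of the kind of mistake that labelling data correctly prevents. It is only a sketch: the postcodes are example values, and the point is simply that a postcode is a name (nominal data) that happens to be written with digits.

```python
# Example four-digit postcodes, stored as text because they are labels.
postcodes_as_text = ["0800", "2000", "0872"]

# Treating a postcode as a number silently drops any leading zero...
as_numbers = [int(code) for code in postcodes_as_text]

# ...so converting back to text no longer gives the original postcode.
round_tripped = [str(number) for number in as_numbers]

# Python will happily average the numbers, but an "average postcode"
# means nothing, because postcodes are nominal data.
meaningless_mean = sum(as_numbers) / len(as_numbers)

print(as_numbers)
print(round_tripped)
print(meaningless_mean)
```

Keeping the postcodes as text from the start avoids both problems.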

References

Waterson, S. (2016). DataVis POD02- Data Types. Retrieved from https://vimeo.com/176274669

Lecture Pod 1 – Introduction to Data Visualisation

There were a few important points made throughout this first lecture. First was that there is more data now than at any point throughout history. 23 exabytes (1 exabyte = 1 billion gigabytes) of data was recorded and replicated in 2002 (UC Berkeley’s School of Information Management and Systems, 2003). We now do that in seven days. Richard Saul Wurman (1997) said “There is a tsunami of data that is crashing onto the beaches of the civilised world. This is a tidal wave of unrelated, growing data formed in bits and bytes, coming in an unorganised, uncontrolled, incoherent cacophony of foam. None of it is easily related, none of it comes with any organisation methodology…”. Another important point made is that data itself has no meaning. It does not become information until someone interprets it.

But out of all these I think the most important point was the first one made. Data visualisation is a mass medium. It has millions of viewers, award shows and even celebrities. It is an essential part of communication: a data-driven story without some form of visualisation is like a fashion story without a photo. I think this is the most important point because it made me realise just how big data visualisation is. There is so much data around today that it now blends into the background, and this point made me realise just how much data we consume every day without even registering it.

Where-People-Run

The featured image here is a data visualisation of the most popular running routes in major cities; this one is New York (Yu, 2014). The data was pulled from the workout app Runkeeper. Darker means there is more traffic, lighter means less travelled.

References

UC Berkeley’s School of Information Management and Systems. (2003). How much Information? University of California, USA. Retrieved from http://www2.sims.berkeley.edu/research/projects/how-much-info-2003/execsum.htm

Waterson, S. (2016). DataVis POD01- What is Data Vis?. Retrieved from https://vimeo.com/175177926

Wurman, R. S. (1997). Information Architects. USA: Graphis Inc.

Yu, N. (2014). 10 Cool Big Data Visualizations | MastersinDataScience.org. Master’s in Data Science. Retrieved 25 July 2016, from http://www.mastersindatascience.org/blog/10-cool-big-data-visualizations/