Data Art with BBC Backstage

News Cloud

  • Climate tags: Search for 'Climate' and related tags

NewsCloud is a news archive visualisation that shows how the themes relating to a search term fluctuate over time: a temporal tag cloud.
For example, a search for ‘energy’ under science news returns terms relating to different energy sources, the science of energy (dark matter and the hadron collider), and environmental and political themes.

Each term is represented as a circle that changes scale in proportion to its occurrence in news headlines over a particular timeframe. As the time slider changes, the circles grow and shrink, giving a picture of which terms are coming to prominence at any one time. To see the overall tag cloud, you can extend the time slider to the full search period. For more detail, you can click on a term circle to see the list of stories where it appears. The headlines link through to the online article. Try the term ‘future’ under the ‘energy’ search.
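As a rough illustration of the sizing idea, the sketch below counts how many headlines in a date window mention a term and maps that count to a circle radius. The sample stories, the function names and the linear radius mapping are all assumptions made for the example; the project's actual scaling rule is not described here.

```python
from datetime import date

# Hypothetical sample data: (publication date, headline) pairs.
stories = [
    (date(2008, 1, 10), "Oil price hits record"),
    (date(2008, 6, 2), "Saudi output rises as oil climbs"),
    (date(2009, 3, 15), "Oil falls on weak demand"),
]

def term_count(term, stories, start, end):
    """Count headlines dated within [start, end] that contain the term as a word."""
    term = term.lower()
    return sum(
        1
        for day, headline in stories
        if start <= day <= end and term in headline.lower().split()
    )

def circle_radius(count, max_count, max_radius=60.0):
    """Scale the radius linearly with the count (an assumed mapping)."""
    if max_count == 0:
        return 0.0
    return max_radius * count / max_count
```

Moving the time slider corresponds to calling `term_count` with a different `start` and `end`, then recomputing each radius against the largest count in that window.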

A good example of the temporal aspect of the visualisation is the search ‘oil prices’ under world news. You can then track the relative prominence of ‘saudi’, ‘iraq’, ‘russia’, ‘china’, ‘iran’, and ‘libya’ over the last decade.

There are two versions of the project: BBC Science News and Guardian Open Platform. We’re using the Open Platform because it provides an API that can be queried by date and an archive going back over 10 years.

Launch NEWS CLOUD //

Launch BBC News Cloud     Launch Guardian News Cloud

Instructions //

  • In the Guardian version you can select the news section and the date range.
  • Enter a search term.
  • When the terms appear, click on a circle to see the stories appear on the right. Roll over the stories to see the description and click to link to the online article.
  • Use the time slider at the bottom to change the date range, to play and to step forwards and backwards. To get an overview, drag the date handles to cover the full date range.
  • Change the ‘Word Count’ slider to see more or fewer terms.
  • For searches with a large number of results on the Guardian version you’ll get a warning message suggesting you use a more specific search. For more than ten thousand results the search is blocked.

Search BBC version //

Here are some of the searches we’ve tried:

climate, energy, genome, space, species, telescope, mars, carbon

Search Guardian version //

Here are some of the searches we’ve tried:

World section:
aids, oil prices, credit crunch, globalisation, anti-globalisation, flu, diamonds, gold, wmd, recession, data, New Orleans

Politics section:
AV, PR, expenses

Technology section:
twitter, facebook

  • Oil prices: World News - Oil prices
  • Hacking: Media - Hacking
  • Nuclear: Environment - Nuclear
  • Placebo: The placebo effect

How it was Built //

Data for the BBC version was harvested using Yahoo! Search Web Services and Jungle; it isn’t the complete RSS archive. The Guardian version uses the Open Platform API.
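For readers who want to try a date-bounded query themselves, here is a minimal sketch that builds a search URL for the Guardian content API. It assumes the current public endpoint and its `q`, `section`, `from-date`, `to-date`, `page-size` and `api-key` parameters; the helper name `build_search_url` is hypothetical, and the requests NewsCloud itself makes are not documented on this page.

```python
from urllib.parse import urlencode

# Assumed endpoint: the current public Guardian content API search URL.
BASE_URL = "https://content.guardianapis.com/search"

def build_search_url(term, section, from_date, to_date, api_key, page_size=50):
    """Build a search URL limited to one section and a date range (YYYY-MM-DD)."""
    params = {
        "q": term,
        "section": section,
        "from-date": from_date,
        "to-date": to_date,
        "page-size": page_size,
        "api-key": api_key,
    }
    return BASE_URL + "?" + urlencode(params)

# Example: 'oil prices' in the world section over a decade.
# "YOUR-KEY" is a placeholder, not a working key.
url = build_search_url("oil prices", "world", "2001-01-01", "2011-01-01", "YOUR-KEY")
```

The function only builds the URL; fetching and paging through the results is left out to keep the sketch short.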

The related terms are based on the individual words appearing in the story headlines and descriptions, ignoring a list of stop words.

The list of terms is sorted by frequency, and the top 80 are used for the visualisation. As a further step we could try extracting terms using Alchemy or Open Calais. The obvious caveat is that a query would need to be made for each story result, and there may be thousands of them.
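The term extraction described above can be sketched in a few lines. This is an illustration under assumptions rather than the project's code: the stop word list is a tiny placeholder, words are split on runs of letters, and the function name `top_terms` is made up for the example.

```python
import re
from collections import Counter

# Illustrative stop words only; the project's actual list is not published here.
STOP_WORDS = {"the", "a", "an", "of", "to", "in", "and", "on", "for", "is", "as"}

def top_terms(texts, stop_words=STOP_WORDS, limit=80):
    """Count non-stop words across headlines and descriptions, most frequent first."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in stop_words:
                counts[word] += 1
    return counts.most_common(limit)

# Example with two hypothetical headlines.
terms = top_terms(["The price of oil", "Oil and gas in Iraq"])
```

`limit=80` mirrors the cut-off mentioned above; the ‘Word Count’ slider would correspond to passing a smaller or larger value.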

The physics of the term circles is a modification of the Flare library. Particles have a radius property, and two custom forces have been added: a soft collision between circles and a gravitational force towards the central circle.
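To give a feel for how the two forces interact, here is a small standalone simulation step. It does not use Flare or its API; the `Particle` class, the `step` function, the constants and the treatment of the central circle as a fixed point are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass

@dataclass
class Particle:
    x: float
    y: float
    radius: float
    vx: float = 0.0
    vy: float = 0.0

def step(particles, centre=(0.0, 0.0), gravity=0.05, stiffness=0.5, damping=0.9):
    """Advance every particle by one time step under the two forces."""
    forces = [[0.0, 0.0] for _ in particles]
    for i, p in enumerate(particles):
        # Gravity: pull towards the centre, proportional to distance.
        forces[i][0] += gravity * (centre[0] - p.x)
        forces[i][1] += gravity * (centre[1] - p.y)
        for j in range(i + 1, len(particles)):
            q = particles[j]
            dx, dy = q.x - p.x, q.y - p.y
            dist = math.hypot(dx, dy) or 1e-6  # avoid division by zero
            overlap = p.radius + q.radius - dist
            if overlap > 0:
                # Soft collision: push apart in proportion to the overlap.
                push = stiffness * overlap
                ux, uy = dx / dist, dy / dist
                forces[i][0] -= push * ux
                forces[i][1] -= push * uy
                forces[j][0] += push * ux
                forces[j][1] += push * uy
    for p, (fx, fy) in zip(particles, forces):
        # Damped integration so the layout settles instead of oscillating.
        p.vx = (p.vx + fx) * damping
        p.vy = (p.vy + fy) * damping
        p.x += p.vx
        p.y += p.vy
```

Because the collision push grows with the overlap rather than acting as a hard constraint, circles can briefly interpenetrate when they grow and then ease apart over the following steps.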