September 2013

Thematic issue: Content technologies

A word from the guest editor

Caj Södergård
Guest Editor
VTT – Technical Research Centre of Finland, Espoo

E-mail: Caj.Sodergard@vtt.fi

Content technologies provide tools for processing content to be delivered via any media to the target audience. These tools are applied in numerous ways in media production. Research into content technologies is very active and opens new possibilities to improve production efficiency as well as to enhance the user experience and thereby the business value of media products and services.
This thematic issue focuses on several applications of content technologies. All papers address the user. The ability to objectively measure and predict the responses that content evokes in users is a much-needed tool for the media professional. An emerging application proposed in this issue helps journalists find interesting topics for articles from the overabundance of information available on the internet. Another class of applications dealt with here is recommending content to users. Relevant recommendations motivate the user to visit and spend time on a web service. Recommenders are therefore important in designing attractive – and monetizable – digital services. As a consequence, this technology is found in many services recommending media items such as music, books, television programmes and news articles. The papers on recommenders in this issue cover the three main methods in the field – content-based, knowledge-based and collaborative – and they bring new perspectives to all three. One such novel perspective, which has been evaluated in user studies, is that of a portable personal profile.

Most of the included papers are outcomes of the Finnish Next Media research program (www.nextmedia.fi) of Digile Oy. Next Media has run from 2010 through 2013 with the participation of 57 companies and eight research organisations. The volume of the program has been substantial: around 80 person-years annually, with half of the work done by companies and half by research partners. The program has three foci: e-reading, personal media day, and hyperlocal. The papers in this issue represent only a small part of the results of Next Media. As an example, during 2012 the program produced 101 reports, most of which are available on the web.

Even though this thematic issue is centred on work done within the Finnish Next Media program, content technologies are of course studied in many other places around the world. The paper by NTNU in Norway presented here is just one example. Computer and information technology departments at universities and research institutes often pursue content-related topics ranging from multimedia “big data” analysis to multimodal user interfaces and user experience. In the upcoming EU Horizon 2020 program, “Content technologies and information management” is a major topic covering eight challenges. This will keep the theme of this thematic issue at the forefront of European research during the years to come.

3-13

Media experience as a predictor of future news reading

Simo Järvelä¹, Matias Kivikangas¹, Timo Saari³, Niklas Ravaja¹,²
E-mails: simo.jarvela2@aalto.fi; matias.kivikangas@aalto.fi; niklas.ravaja@aalto.fi; timo.s.saari@tut.fi
¹ School of Business, Aalto University, Finland
² Department of Social Research, Helsinki Institute for Information Technology, University of Helsinki, Finland
³ Department of Pervasive Computing, Tampere University of Technology, Finland

Abstract

The newspaper medium is forced to evolve in the digital age. In order to transfer the core media experience of newspaper reading to new digital formats, its very nature must be examined. In an experiment with 24 readers of a digital newspaper, responses to seven different news sections (people, city, culture, opinion, business, foreign, sports) were measured with psychophysiological methods and self-reports, and the differences in responses between the sections were examined. These data were then compared to actual reading behavior during a six-week follow-up period to investigate how immediate media experience predicts future news reading. It was found that the news sections were differentiated by self-reported emotional responses and other message ratings (e.g., relevance to the self, interestingness, reliability), but not by physiological responses. In addition, both self-reports and physiological responses (facial electromyography and heart rate) predicted news reading during the follow-up period, but the strength or direction of the association varied by news section.

Different kinds of emotions predict future reading for different news sections, suggesting that people expect differential emotional experiences from different sections.

Keywords: newspaper, readership, media experience, psychophysiology

JPMTR 021 ⎮ 1310 Original scientific paper
UDC 655.326.1:676:22

Received: 2013-06-28
Accepted: 2013-11-18

Software Newsroom – an approach to automation of news search and editing

Juhani Huovelin¹, Oskar Gross², Otto Solin¹, Krister Lindén³, Sami Maisala¹, Tero Oittinen¹, Hannu Toivonen², Jyrki Niemi³, Miikka Silfverberg³
E-mails: juhani.huovelin@helsinki.fi; otto.solin@helsinki.fi; sami.maisala@helsinki.fi; tero.oittinen@helsinki.fi; oskar.gross@cs.helsinki.fi; hannu.toivonen@cs.helsinki.fi; krister.linden@helsinki.fi; jyrki.niemi@helsinki.fi; miikka.silfverberg@helsinki.fi
¹ Division of Geophysics and Astronomy, Department of Physics, University of Helsinki, Finland
² Department of Computer Science and HIIT, University of Helsinki, Finland
³ Department of Modern Languages, University of Helsinki, Finland

Abstract

We have developed tools and applied methods for automated identification of potential news from textual data for an automated news search system called Software Newsroom. The purpose of the tools is to analyze data collected from the internet and to identify items that have a high probability of containing new information. The identified information is summarized in order to help understand the semantic contents of the data, and to assist the news editing process.

It has been demonstrated that words with a certain set of syntactic and semantic properties are effective when building topic models for English. We demonstrate that words with the same properties in Finnish are useful as well. Extracting such words requires knowledge about the special characteristics of the Finnish language, which are taken into account in our analysis.

Two different methodological approaches have been applied for the news search. The first method is based on topic analysis and applies Multinomial Principal Component Analysis (MPCA) for topic model creation and data profiling. The second method is based on word association analysis and applies the log-likelihood ratio (LLR). For the topic mining, we have created English and Finnish language corpora from Wikipedia and Finnish corpora from several Finnish news archives, and we have used bag-of-words representations of these corpora as training data for the topic model. We have performed topic analysis experiments with both the training data itself and with arbitrary text parsed from internet sources. The results suggest that the effectiveness of news search strongly depends on the quality of the training data and its linguistic analysis.

In the association analysis, we use a combined methodology for detecting novel word associations in the text. For detecting novel associations we use a background corpus from which we extract common word associations. In parallel, we collect the statistics of word co-occurrences from the documents of interest and search for associations with larger likelihood in these documents than in the background. We have demonstrated the applicability of these methods for Software Newsroom. The results indicate that the background-foreground model has significant potential in news search. The experiments also indicate great promise in employing background-foreground word associations for other applications.
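The background-foreground idea can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes whitespace tokenization, document-level co-occurrence, and a standard log-likelihood ratio (G²) over a 2x2 table of pair counts. The function names `pair_counts`, `llr` and `novel_associations` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def pair_counts(documents):
    """Count unordered word pairs that co-occur within the same document."""
    counts = Counter()
    for doc in documents:
        words = sorted(set(doc.lower().split()))  # naive tokenization (assumption)
        counts.update(combinations(words, 2))
    return counts


def llr(k_fg, n_fg, k_bg, n_bg):
    """G^2 statistic for a 2x2 table: pair vs. other pairs, foreground vs. background."""
    table = [[k_fg, n_fg - k_fg], [k_bg, n_bg - k_bg]]
    rows = [n_fg, n_bg]
    cols = [k_fg + k_bg, (n_fg - k_fg) + (n_bg - k_bg)]
    total = n_fg + n_bg
    g2 = 0.0
    for i in range(2):
        for j in range(2):
            observed = table[i][j]
            expected = rows[i] * cols[j] / total
            if observed > 0:  # a zero cell contributes nothing to the sum
                g2 += observed * math.log(observed / expected)
    return 2.0 * g2


def novel_associations(foreground, background, top=5):
    """Rank pairs that are relatively more frequent in the foreground documents.

    Assumes both corpora yield at least one word pair.
    """
    fg, bg = pair_counts(foreground), pair_counts(background)
    n_fg, n_bg = sum(fg.values()), sum(bg.values())
    scored = []
    for pair, k_fg in fg.items():
        k_bg = bg.get(pair, 0)
        if k_fg / n_fg > k_bg / n_bg:  # keep only foreground-heavy pairs
            scored.append((llr(k_fg, n_fg, k_bg, n_bg), pair))
    scored.sort(reverse=True)
    return scored[:top]
```

A pair that occurs in the same proportion in both corpora scores zero, while pairs concentrated in the documents of interest rise to the top of the list. A realistic system would replace the tokenizer with proper linguistic analysis, which the abstract stresses is essential for Finnish.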

A combined application of the two methods is planned, as well as the application of the methods to social media using a pre-translator of social media language.

Keywords: internet, social media, data mining, topic analysis, machine learning, word associations, linguistic analysis

JPMTR 022 ⎮ 1311 Original scientific paper
UDC 054:004.4

Received: 2013-06-28
Accepted: 2013-11-07

Portable profiles and recommendation based media services: will users embrace them?

Asta Bäck and Sari Vainikainen
E-mails: asta.back@vtt.fi; sari.vainikainen@vtt.fi
VTT Technical Research Centre of Finland, Vuorimiehentie 3, P. O. Box 1000, FI-02044 VTT, Finland

Abstract

User data and user profiles are very important in current web based business models and applications. Advertising revenue is to a large extent generated based on implicit and explicit user data. We propose the use of semantic, portable and user-controllable profiles to capture and model user data that can be used for personalising media services and particularly for making recommendations.

In this paper, we present results from four user tests where users created semantic profiles and received personalised recommendations based on these profiles. We have studied users’ expectations and requirements for profile portability as well as how users have experienced the creation of a semantic profile using different data sources. We have also studied how users experienced recommendation based media services in connection with user-controlled interest profiles. In all four test cases, users have created the profiles and received the applications using the prototypes we have developed. User feedback was gathered either through interviews or using web surveys.

Users welcomed the idea of being able to control their profile data but they also had questions and concerns about privacy if the profile is shared between services. Users had dual concerns about the recommendations: some users were afraid of too limited a view if the service only relies on the user’s profile; some users were afraid of being overwhelmed by too many recommendations.

Keywords: user profiles, portable profiles, media services, recommendations, semantic web technologies, linked data

JPMTR 023 ⎮ 1316 Research paper
UDC 659.3:81’37

Received: 2013-08-21
Accepted: 2013-10-30

Knowledge-based recommendations of media content – case magazine articles

Sari Vainikainen, Magnus Melin, Caj Södergård
E-mails: sari.vainikainen@vtt.fi; magnus.melin@vtt.fi; caj.sodergard@vtt.fi
VTT Technical Research Centre of Finland, Vuorimiehentie 3, P. O. Box 1000, FI-02044 VTT, Finland

Abstract

A successful media service must ensure that its content grabs the attention of the audience. Recommendations are a central way to gain attention. The drawback of current collaborative and content-based recommendation systems is their shallow understanding of the user and the content.

In this work, we propose recommenders with a deep semantic knowledge of both user and content. We express this knowledge with the tools of semantic web and linked data, making it possible to capture multilingual knowledge and to infer additional user interests and content meanings. In addition, linked data allows knowledge to be automatically derived from various sources with minimal user input. We apply our methods to magazine articles and show, in a user test with 119 participants, that semantic methods generate relevant recommendations. Semantic methods are especially strong when there is little initial information about the user and the content. We also show how user modelling can help avoid recommending unsuitable items.

Keywords: recommendation systems, personalization, semantics, semantic web, linked data, media services, metadata, user profiles, ontology

JPMTR 024 ⎮ 1315 Research paper
UDC 004.42:81’37

Received: 2013-08-19
Accepted: 2013-11-01

Learning user profiles in mobile news recommendation

Jon Atle Gulla¹, Jon Espen Ingvaldsen¹, Arne Dag Fidjestøl², John Eirik B. Nilsen¹, Ken Robin Haugen¹, Xiaomen Su²
E-mail: jag@idi.ntnu.no
¹ Department of Computer and Information Science, Norwegian University of Science and Technology, Sem Sælands vei 7, Gløshaugen NO-7499 Trondheim, Norway
² Research and Future Studies Telenor Group, Norway

Abstract

Mobile news recommender systems help users retrieve relevant news stories from numerous news sources with minimal user interaction. The overall objective is to find ways of representing news stories, users and their relationships that allow the system to predict which news would be interesting to read for which users. Even though research shows that the quality of these recommendations depends on good user profiles, most systems have no or very simple profiles, because users are reluctant to give explicit feedback on articles’ desirability. In this paper we present a user profiling approach adopted in the SmartMedia news recommendation project. We are building a mobile news recommender app that sources news from all major Norwegian newspapers and uses a hybrid recommendation strategy to rank the news according to the users’ context and interests. The user profiles in SmartMedia are built in real-time on the basis of implicit feedback from the users and contain information about the users’ general interests in news categories and particular interests in events or entities. Experiments with content-based filtering show that the profiles lead to more targeted recommendations and provide an efficient way of monitoring and representing users’ interests over time.
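As a rough illustration of profile learning from implicit feedback, the sketch below keeps a weight per category or entity, fades old weights on every click, and ranks articles by their overlap with the profile. It is a simplified assumption rather than the SmartMedia design: the class `UserProfile`, the `decay` parameter and the `terms` field are hypothetical, and the context-aware, hybrid part of the strategy is not covered.

```python
from collections import defaultdict


class UserProfile:
    """Interest weights per category or entity, learned from clicks only."""

    def __init__(self, decay=0.9):
        self.decay = decay  # hypothetical fading factor for older interests
        self.weights = defaultdict(float)

    def record_click(self, article_terms):
        """Treat a click as implicit positive feedback on the article's terms."""
        for term in self.weights:
            self.weights[term] *= self.decay  # older interests fade over time
        for term in article_terms:
            self.weights[term] += 1.0

    def score(self, article_terms):
        """Content-based score: sum of profile weights of the article's terms."""
        return sum(self.weights.get(term, 0.0) for term in article_terms)


def rank(profile, articles):
    """Order articles from most to least relevant for this profile."""
    return sorted(articles, key=lambda a: profile.score(a["terms"]), reverse=True)
```

Because every click decays the existing weights, the profile follows a user's shifting interests without requiring any explicit ratings.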

Keywords: recommender systems, personalization, Big Data, user click analysis, news apps, content-based filtering

JPMTR 025 ⎮ 1312 Research paper
UDC 004.78:004.58

Received: 2013-07-08
Accepted: 2013-11-11

UPCV – Distributed recommendation system based on token exchange

Ville Ollikainen¹, Aino Mensonen¹, Mozhgan Tavakolifard²
E-mails: ville.ollikainen@vtt.fi; aino.mensonen@vtt.fi; mozhgan@idi.ntnu.no
¹ VTT Technical Research Centre of Finland, Espoo, Finland
² Department of Computer and Information Science, Norwegian University of Science and Technology, Sem Sælands vei 7-9, N-7491 Trondheim, Norway

Abstract

Most conventional recommendation systems are based on service-specific data repositories containing both user and item data. In this paper, we introduce an alternative approach called UPCV (Ubiquitous Personal Context Vectors) that inherently supports distributed computing and distributed data repositories. The principal idea is that each user-item interaction can update the data associated with both the user and the item. When updating, item data is made to slightly resemble user data and vice versa, leading to increasing similarity between them. Through interactions, similarity will spread from users to items, from items to users, making it possible to inherently provide user-item, item-item, item-user and user-user recommendations. The principle introduced in this paper can be used as a baseline for the design of different types of collaborative recommender systems. The main advantages of this method are that it requires no content analysis, preserves users’ privacy and supports scalability. The method was evaluated using data from 1575 book club members: the members were asked which books they had read and liked. The quantitative analysis indicates that the most promising results are obtained for active readers. However, even for less active readers and without content analysis, the recommendation list tends to be populated by the same authors and/or authors of the same genre that the readers have liked, leading to meaningful recommendations.
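The mutual update can be sketched in a few lines. The abstract does not specify the token exchange mechanism, so this example assumes plain numeric vectors and a linear step of size `rate`. The functions `nudge`, `interact` and `cosine` are hypothetical names for illustration only.

```python
import math


def nudge(target, source, rate):
    """Move target a small step toward source, in place."""
    for i, value in enumerate(source):
        target[i] += rate * (value - target[i])


def interact(user_vec, item_vec, rate=0.1):
    """One user-item interaction makes each vector slightly resemble the other."""
    user_before = list(user_vec)  # update the item from the pre-update user state
    nudge(user_vec, item_vec, rate)
    nudge(item_vec, user_before, rate)


def cosine(a, b):
    """Cosine similarity, usable for user-item, item-item or user-user pairs."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

If two users interact with the same item, both drift toward that item's vector and therefore toward each other, which is how similarity can spread without a central repository or any content analysis.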

Keywords: recommendation, collaborative filtering, distributed computing, cloud computing, scalability, privacy, deniability

JPMTR 026 ⎮ 1314 Original scientific paper
UDC 004.42:004.58

Received: 2013-08-01
Accepted: 2013-10-25