Information overload has forced us to reconsider the information retrieval process and the IR tools that seemed effective up to a certain point. It has become clear that successful retrieval does not depend only on improving search algorithms, IR models and the computational power of IR frameworks; new approaches that bring information seeking closer to the end user are needed. Such approaches include research on user interfaces better adapted to the user's operational environment, and on systems that understand the user's needs and whose intelligence extends beyond the algorithmic query-document match of the conventional "Laboratory Model" of IR discussed in [26]. This has resulted, for instance, in the emergence of the interactive TREC track and a rise of great interest in user-centered and cognitive IR research. IR systems seek to incorporate the human factor in order to improve the quality of their results. Information seeking today is considered in a dynamic context and situation rather than in static settings, and the human is its essential and central part, actively processing (receiving and interpreting) and even contributing information. Contextual information about the user is obtained from his/her behaviors collected by the system the user interacts with, organized and stored in user profiles or other user modeling structures, and applied to provide a personalized information seeking experience.
In this section we introduce endeavors to improve Web IR by means of user interface improvements and support for exploration activities, and focus on personalization as the most widespread approach to user-centric IR. We discuss the user profile (UP) as the core element of most personalization techniques and show its structural variety and construction methods.
2.1. Improving Web Information Retrieval
It is well known that, alongside search engine performance improvements and functionality enhancements, one of the determining factors in user acceptance of any search service is the interface. To build a truly user-centric information seeking system, this factor must not be underestimated. Here we show its importance by considering mobile Web search, where the need for improvements is particularly tangible due to the small-screen limitations of the handheld devices most of us possess today.
Landay and Kaufmann [28] in 1993 noted that "researchers continue to focus on transferring their workstation environments to these machines (portable computers) rather than studying what tasks more typical users wish to perform." In spite of all the advances in mobile devices, probably the same can be said about mobile Web search, judging from its state today. Search today is poorly adapted to the mobile context; often it is a simplistic modification of search results from PC-oriented search services. For instance, many commercial mobile Web services, like those of Yahoo!, provide search results that consist of titles, summaries and URLs only. Although all redundant information, such as advertisements, is removed to facilitate search on handheld devices, users may still face extensive scrolling due to long summaries. To improve the experience, some services, like Google, reduce the size of summary snippets. However, this hardly leads to improvement and, on the contrary, can thwart the search. As shown in Figure 2, a mobile user searching for "fireplace" cannot know that the result page is about plasma and does not match his/her needs, and has to load the page to find this out. According to the investigation by Sweeney and Crestani [29] into the effects of screen size on the presentation of retrieval results, it is best to show a summary of the same length regardless of whether it is displayed on a laptop, PDA or smartphone.
Improvements to mobile Web search made in academia go further. For example, De Luca and Nürnberger [31] implement search result categorization to improve retrieval performance and present the information in three separate screens: a screen for search and presentation of the results in a tree, a screen showing search results, and a bookmarks screen. Church et al. [32] replace the summary snippets that accompany each result item with the related queries of like-minded individuals, i.e., queries that led to the selection of a particular Web page in the search result list. The researchers argue that such queries can be as informative as summary snippets, and with this approach they fit more search results on one screen.
In contrast to the existing approaches, Shtykh et al. [33] (see also [30]) do not modify the search results, but propose an interface to handle the results provided by any conventional search service. The approach eliminates fatigue-inducing scrolling while preserving the "quality" summaries of PC-oriented Web search. The proposed interface, called the slide-film interface (SFI), is akin to the "paging" technique. Most mobile Web search services truncate the summary snippets of search result items to reduce scrolling and ease navigation through the results, which can often make the content of a particular result hard to understand. By contrast, because SFI devotes one screen-sized slide to each search result, it can use the greater part of the slide to show the full summary without making the search tiresome. SFI was compared with the conventional method of mobile Web search, and the experimental results showed that, although there was no statistically significant difference in search speed between the two interfaces, SFI was rated highly for the viewability of search results and for how easily the interface is remembered after the first interaction.
Although such approaches to improving search with a focus on the user and usability are very important, they treat the user without regard to his/her contextual and situational information. As already mentioned and discussed further in Section 3, information need and human behavior are highly contextual. Therefore the peculiarities of information behavior, proclivities, preferences and everything that can give a better conception of the user, his/her behavioral patterns and needs must be considered in order to provide a truly personalized information seeking experience. Although in this paper we focus on information seeking specifically, the application area of personalization extends far beyond it. It is applied to Web recommendations and information filtering, user adaptation of Smart Home and wireless devices, etc.
Throughout our research we have been particularly interested in personalizing and facilitating a person's interactions with various Web services. Search is not the only activity users engage in within the Web information space. As empirical studies show [34], most of the time users rediscover things they have found in the past, and they often browse either without any specific purpose, discovering the information space around them, or with a particular purpose, such as learning miscellaneous information. To support such discovery, we designed an exploratory information space [35] that makes use of the human-centered power of bookmarking for information selection. The information space is built as a result of a search for something a user intends to discover, and serves as a place for rediscovery of personal findings, socialization and exploration inside the discovery chains of other participants of the system.
2.2. Personalization
Today personalization is a term we often relate to Web search personalization, such as in Google's iGoogle, the recommendation system of Amazon.com, or contextual advertisements on Web sites. It is also associated with the Decentralised-Me [36] of the emerging Web 3.0, and it is an essential part of Mitra's [37] formula Web 3.0 = (4C + P + VS), where 4C is Content, Commerce, Community, and Context, P is personalization, and VS is vertical search. However, the notion of personalization is much more diverse than that. It differs with regard to its application area and changes over time as research advances. It is sometimes synonymous with customization and often with adaptation. It overlaps with information filtering and recommendation.
In 1999 Hansen et al. [38] outlined two knowledge management strategies for business: codification, i.e., impersonal storage of knowledge in databases for reuse, and personalization, which focuses on dialogue that helps people communicate knowledge. The authors claim that emphasizing the wrong strategy or pursuing both at the same time can undermine a business. Today, however, in a situation of information overload, the two strategies often complement each other. Greer and Murtaza [39] define personalization as "a technique used to generate individualized content for each customer" and investigate the factors that influence the acceptance of personalization on an organization's Web sites. Their research finds that ease of use, compatibility with an individual's values, intents and expectations, and trialability ("the degree to which personalization can be used on a trial basis") are the key factors for personalization adoption. Monk and Blom [40] in their earlier works define personalization as "a process that changes the functionality, interface, information content, or distinctiveness of a system to increase its personal relevance to an individual," and Fan and Poole [41] extend this definition to "a process that changes the functionality, interface, information access and content, or distinctiveness of a system to increase its personal relevance to an individual or a category of individuals," which serves as the working definition for this paper.
Such diversity in the understanding of what personalization is makes it difficult to produce a holistic view of personalization, to share findings among researchers in different fields, and to compare approaches. This is one conceivable reason why current approaches focus on "how to do personalization" rather than "how personalization can be done well," as Fan and Poole [41] have noted. Most personalization approaches on the Web are system-initiated, i.e., they rely on adaptivity, the ability to adapt to a user automatically based on some knowledge or assumptions about the user. But another concept, adaptability, which is a user-initiated (or explicit, in the terms of Fan and Poole [41]) approach in which the user modifies the system's parameters to adapt its functionality to his/her particular contexts, is also important when considering personalization. Monk and Blom [40] emphasized that people always personalize their surroundings, and their Web environment is no exception, and presented their theory of user-initiated personalization of appearance.
Personalization has many advantages over impersonal approaches, some of which are obvious and some of which are hidden and have to be demonstrated empirically. For instance, Guida and Tardieu [42] show that personalization, similarly to long-term working memory, helps to overcome working memory limitations, expanding the storage and processing capabilities of human beings. Although the personalization they discuss is understood as the creation of a situation of individual expertise, which is generally not exactly what modern personalization systems can provide, such an approach indicates the need to better consider context and situation in order to fully exploit its merits.
2.3. Modeling User Interests
In order to be user-centric, a service has to know each user it interacts with. This is the task personalization attempts to fulfill with a variety of methods in various work task and environmental settings. Personalization systems extract the user's interests, infer his/her preferences, and update and rely on knowledge about the user that is accumulated and structured in user profiles. These profiles differ in the data used for their definition, in their structure and complexity, and in their construction approaches.
At this point we have to note that in modeling user interests we do not distinguish between Web search personalization, recommendation and information filtering, because the differences in their methods and goals are very subtle. All such approaches use a certain scheme to learn the user's preferences in order to adapt to his/her future interactions with the system and the information it provides, and constructing user profiles (or user modeling) is the most popular method. It has been used extensively since the days of the first information filtering systems, for instance as a user-specified profile or a bag-of-words extracted from the documents accessed by the user, and today it takes many richer and more diverse forms to meet the requirements of a variety of information systems.
2.3.1. Relevance Feedback as a Modeling Material
As the above discussion suggests, relevance feedback is very important for personalization and widely used. Let us see what types of feedback exist and what kinds of data are used for feedback.
Feedback Types
Relevance feedback is extensively used in Web IR for efficient collection of user behavioral data for further analysis and modeling of user behavior. Relevance feedback can be explicit (provided explicitly by the user) or implicit (observed during user-system interaction). The former is high-cost in terms of user effort; the latter is low-cost but requires thorough analysis to reduce the noise it normally contains. Implicit relevance feedback in IR systems consists of a number of elements, such as query history, clickthrough history, time spent on a certain page or domain, and others, which can be considered in general as a collection of implicit behaviors of users interacting with the information retrieval system. It is collected without interrupting user activities, unlike explicit feedback, which requires direct user intervention; this is why it attracts keen interest. Interested readers are referred to [43] for a survey of classic relevance feedback methods, to [44] for an extensive bibliography of papers on implicit feedback, or to any modern information retrieval (IR) textbook for a detailed introduction to relevance feedback.
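To make the notion of implicit feedback more tangible, the following minimal Python sketch aggregates dwell time per clicked page and discards very short visits as probable noise. The record layout (`InteractionEvent`), the function name and the 5-second threshold are assumptions made purely for illustration; they are not taken from any of the systems cited here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class InteractionEvent:
    """One implicitly observed user action (hypothetical schema)."""
    query: str            # query that produced the result list
    url: str              # result the user clicked
    dwell_seconds: float  # time spent on the page after the click


def aggregate_implicit_feedback(events, min_dwell=5.0):
    """Sum dwell time per URL, skipping visits shorter than min_dwell.

    The threshold is an arbitrary illustrative value for noise reduction.
    """
    totals = defaultdict(float)
    for event in events:
        if event.dwell_seconds >= min_dwell:
            totals[event.url] += event.dwell_seconds
    return dict(totals)


events = [
    InteractionEvent("fireplace", "http://example.com/a", 42.0),
    InteractionEvent("fireplace", "http://example.com/b", 2.0),
    InteractionEvent("fireplace insert", "http://example.com/a", 18.0),
]
feedback = aggregate_implicit_feedback(events)
```

In this sketch the second click is dropped because the visit lasted only two seconds, which illustrates why implicit feedback needs filtering before it can be used as evidence of interest.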
With the emergence of social networks, new types of feedback have become available. Social bookmarking and tagging, as described in [45], are a sui generis mixture of implicit and explicit relevance feedback. On the one hand, bookmarking is an explicit action performed by the user rather than monitored by the system; on the other hand, in contrast to explicit feedback, it is normally not a burden for the user. We would classify such feedback as motivated explicit feedback, since it is motivation that removes the burden from the explicit nature of the feedback.
Another emerging type of relevance feedback worth mentioning is contextual relevance feedback, which again shows the increasing attention paid to context in personalization. In fact, it often does not differ from many other approaches based on user profiles. For example, in the approach of [46], contextual relevance feedback is feedback on a search result list that filters it based on user-collected document piles. Another example is the contextual relevance feedback architecture by Limbu et al. [47], which, in addition to profiles, utilizes ontologies and lexical databases.
Types of Data for Relevance Feedback
As to the types of data used for profile construction, their choice depends on the application domain of the system to be personalized. For IR systems, relevance feedback data normally consists of documents, queries, network session duration and everything related to the information search process on the Web and beyond. For instance, Teevan et al. [48] extend the conventional relevance feedback model to include information "outside of the Web corpus": implicit feedback data is derived not only from search histories but also from documents, emails and other information resources found on the user's PC. The type of data changes with the application domain. For instance, mobile device features and location can be considered for profile construction in nomadic systems [49], and user interests can be learnt from TV watching habits, as in [50]. Naturally, any user behavior can be considered a source for inferring his/her interests and for further user profiling, and there are as many decisions about which feedback type to use as there are systems that use them. Fu [51] proposes examining a variety of behavioral evidence in Web searches to find the kinds that can be captured in natural search settings and reliably indicate users' interests.
2.3.2. Modeling Methods
With the aforementioned data, user interests can be inferred and user profiles (models) can be created in a number of ways. Most methods use vector-space and probabilistic modeling approaches; some are based on neural networks or graphs. It is hard to classify all of them clearly, since many depend heavily on the domain and its data, and thus their methods are very specific. User interest modeling is often done specifically for the system it is applied to, with regard to its application domain and based on the specific data that can be obtained from user-system interactions in that particular system. Consequently, such modeling methods are constrained to that type of system, in contrast to more generic modeling approaches.
For instance, the personalized peer-to-peer television system by Wang et al. [50] infers user interests from TV watching habits. For user $u_k$ the interest in program $i_m$ is calculated as
$x_k^m = \frac{\mathrm{WatchedLength}(m, k)}{\mathrm{OnAirLength}(m) / \mathrm{freq}(m)}$ (2.1)
where WatchedLength(m, k) is the duration of program $i_m$ in seconds watched by user $u_k$, OnAirLength(m) is the full duration of program $i_m$, and freq(m) denotes the number of times it has been broadcast. Models in e-learning, in addition to interests, often consider learning styles and performance, cognitive aspects of a learner, etc. They are complex and require explicit directives and assessments from an instructor. For instance, the student profile in [52] consists of four components: 1) cognitive style, 2) cognitive controls, 3) learning style and 4) performance. It is created when a student registers for the course and is complemented by the instructor's and psychological experts' surveys of the user's cognitive and learning styles. It is updated with the student's feedback, monitored performance and the instructor's decisions based on the user's learning history.
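Returning to the TV-watching interest measure described above, a small numeric sketch may help. It assumes that the watched length is normalized by the average length of a single broadcast, OnAirLength(m) / freq(m); this normalization, the function name and the sample values are our illustrative assumptions rather than a definitive rendering of the cited system.

```python
def program_interest(watched_length, on_air_length, freq):
    """Interest of one user in one program (illustrative sketch).

    Assumed form: seconds watched divided by the average length of a
    single broadcast, i.e. on_air_length / freq.
    """
    if on_air_length <= 0 or freq <= 0:
        raise ValueError("on_air_length and freq must be positive")
    return watched_length / (on_air_length / freq)


# A program was aired 4 times for 7200 s in total (1800 s per broadcast);
# the user watched 900 s of it.
interest = program_interest(watched_length=900, on_air_length=7200, freq=4)
```

Under these assumptions the user has watched half of one average broadcast, so the interest value is 0.5; values above 1 would indicate repeated viewing of reruns.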
2.3.3. Structural Components
There is a great variety of profile structure types. The simplest and most widespread is to represent user interests learnt from relevance feedback as document term vectors, one for each interest category. Shapira et al. [53] enhance such vectors with sociological data (profession, position, status). Profiles in Sobecki [54] are attribute-value tuples, where the attributes characterize usage, such as visited pages or past purchases, or demographic data, such as name, sex, occupation, etc. In the agent-based approach of Ligon et al. [55], user profiles are a combination of information categories and a preference database containing search histories related to the categories.
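The simplest structure mentioned above, one term vector per interest category, can be sketched in a few lines of Python. The whitespace tokenization, raw term-frequency weighting, cosine matching and all function names are simplifying assumptions for illustration; real systems typically add stop-word removal, stemming and TF-IDF weighting.

```python
import math
from collections import Counter


def term_vector(text):
    """Bag-of-words term-frequency vector (naive whitespace tokenization)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_profile(feedback_docs):
    """Map each interest category to the summed term vector of its documents."""
    profile = {}
    for category, text in feedback_docs:
        profile.setdefault(category, Counter()).update(term_vector(text))
    return profile


def best_category(profile, text):
    """Return the category whose term vector is most similar to the text."""
    vec = term_vector(text)
    return max(profile, key=lambda c: cosine(profile[c], vec))


profile = build_profile([
    ("home", "fireplace insert wood stove"),
    ("home", "gas fireplace installation"),
    ("tv", "plasma television screen"),
])
match = best_category(profile, "plasma screen review")
```

Here a new document about plasma screens is matched to the hypothetical "tv" category, since it shares no terms with the "home" vector.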
User profiles become more elaborate and complex as they try to reflect the dynamics of constantly changing user context and interests. For instance, Bahrami et al. [56] distinguish static and dynamic user interests for profile construction in their information retrieval framework. Barbu and Simina [57] distinguish Recent and Long-Term continuously learnt user profiles and apply them to information filtering tasks. Further, information systems used on mobile devices often extend the notion of the user profile found in conventional IR systems by bringing specific contextual information into it. For instance, Carrillo-Ramos et al. [49], in an attempt to adapt information to a nomadic user by taking the context of use into consideration, introduce the Contextual User Profile, which consists of user preferences and the current context of use (location, mobile device features, access rights, user activities). Ferscha et al. [58] propose a context-aware profile description language (PPDL) expressing mobile peers' preferences with respect to a particular situation. Finally, there are attempts at more holistic approaches to profile structuring, such as Gargi's [59] Information Navigation Profile (INP), which defines attributes for characterizing IR interfaces, interaction and presentation modes; these result in complex profiles that consist of multiple search criteria.
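One simple way to picture the distinction between recent and long-term profiles is to maintain two term-weight profiles that are updated at different rates. The sketch below uses an exponential moving average for this purpose; the update rule, the rate values and the names are our own illustrative assumptions and are not the methods of the works cited above.

```python
def update_profile(profile, doc_terms, rate):
    """Exponential moving average of term weights (illustrative assumption).

    A rate close to 1 makes the profile follow recent documents quickly;
    a rate close to 0 makes it change slowly, approximating long-term
    interests.
    """
    terms = set(profile) | set(doc_terms)
    return {
        t: (1.0 - rate) * profile.get(t, 0.0) + rate * doc_terms.get(t, 0.0)
        for t in terms
    }


recent, long_term = {}, {}
# Two documents about fireplaces followed by one about plasma displays.
stream = [{"fireplace": 1.0}, {"fireplace": 1.0}, {"plasma": 1.0}]
for doc in stream:
    recent = update_profile(recent, doc, rate=0.8)
    long_term = update_profile(long_term, doc, rate=0.1)
```

After the third document the fast-moving profile ranks "plasma" above "fireplace", while the slow-moving profile still favors "fireplace", which mirrors the idea of short-lived versus stable interests.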
2.3.4. On User Contexts
As we have already noted, personalization with a better focus on user contexts and situations is a topic that deserves closer investigation in the near future. As personalization depends heavily on the user's intents and expected results, it is essential to assess his/her contextual characteristics accurately.
Despite the fact that a number of personalization approaches today use the notion of context, such 'context' is usually derived from queries and retrieved documents and/or inferred from user actions. These approaches are unlikely to capture accurately the situation and the context, which include far more factors than they take into account. Furthermore, the definition of context differs from one solution to another. Naturally, the diversity grows in mobile and ubiquitous personalization approaches because of the peculiarities of their contexts. For instance, while the context of a user is learnt from documents and ontologies in [60], multiple context attributes such as environmental and other properties (time, location, temperature, space, speed, etc.) are considered in [61] to define context-aware profiles. Probably because of such differences related to application domains, there is very little exchange of verified practices among researchers working on personalization in different areas and, despite the similarities across domains, one-sided views of context are not rare. There are endeavors to utilize context and situation in a holistic fashion (e.g., [26]); however, they are mostly at the level of theory. We believe that accurate and timely estimates of contextual information will greatly contribute to the field of personalization; therefore further endeavors to characterize context, to develop methods for capturing it and to systematize knowledge about it should be continued, deepened and corroborated with empirical studies.