Personal data
Personal data in this paper include emails, photographs, telephone call histories, GPS histories, and health data such as body weight and the number of steps walked. They also include tweets on Twitter, blogs, and schedules, as well as home energy use and costs.
It is necessary to study four main items to manage and organize personal data.
- Common metadata to manage heterogeneous data from a variety of data sources
- Management of data permission and user authorization
- Unified user interfaces to explore data
- User assistance to recall memories from a mixture of heterogeneous data
This paper especially focuses on the latter two. Several viewpoints and corresponding views are studied with the design of unified user interfaces in mind. Summaries and landmarks are proposed to help users recall noteworthy experiences.
Viewpoints and scale
Heterogeneous personal data need to be organized along some of their attributes before they can be visualized and explored. For example, data with location attributes can be displayed on a map, and data with timestamps can be displayed on a calendar or a timeline list. The most popular concept used to organize information is the set of 5W1H questions (Who, What, Where, When, Why, and How). LATCH is another concept [6] that includes 'Location', 'Alphabetic', 'Time', 'Category', and 'Hierarchy'.
The axes used for this kind of organization are called viewpoints in this paper. We studied three viewpoints: time, location, and people. Time is a major viewpoint because all personal data have timestamps.
Scales were also considered for all viewpoints as seen in Figure 1. Data should be displayed differently depending on the scale of the viewpoint to enable proper visualization. For example, a map at the scale of a country does not need every point in a GPS history; it is better to display representative trajectories. Similarly, displaying every WWW browsing history for an entire year is rarely essential from the temporal viewpoint. Scale also limits what can be shown: as home energy costs are usually calculated per month, accurate charges per day cannot be obtained.
Time
All personal logs have timestamps. However, there are various points of view even within time. For example, some activities extend over a certain period of time. Personal logs also include time series, such as GPS histories and monitored pulses. In addition, home energy costs, including electricity and gas bills, are totaled every month.
The change in scale for time corresponds to the change in the period, such as the year, month, and day.
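The correspondence between temporal scale and period can be sketched as a grouping of timestamped records. This is only an illustration under our own assumptions: the function `bucket_by_scale`, the `SCALE_FORMATS` table, and the sample records are hypothetical and are not part of the prototype.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical scale names; the text only mentions year, month, and day.
SCALE_FORMATS = {"year": "%Y", "month": "%Y-%m", "day": "%Y-%m-%d"}


def bucket_by_scale(records, scale):
    """Group (timestamp, payload) pairs by the period implied by the scale."""
    fmt = SCALE_FORMATS[scale]
    buckets = defaultdict(list)
    for timestamp, payload in records:
        buckets[timestamp.strftime(fmt)].append(payload)
    return dict(buckets)


# Illustrative records only.
records = [
    (datetime(2012, 3, 1, 9, 0), "email"),
    (datetime(2012, 3, 1, 18, 30), "photo"),
    (datetime(2012, 3, 15, 12, 0), "tweet"),
    (datetime(2012, 4, 2, 8, 0), "schedule"),
]
by_month = bucket_by_scale(records, "month")
by_day = bucket_by_scale(records, "day")
```

Changing the scale argument changes only the grouping key, so the same records can feed a yearly, monthly, or daily view.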
Location
Most personal logs have location attributes. Some of them carry the latitudes and longitudes of locations. Others, such as entries in a schedule or on a calendar, carry place attributes given as place names, addresses, or names of shops. Occasionally, places are given as homes, offices, stations, or schools, whose actual locations depend on the individual user.
The change in scale at locations corresponds to the change in the geographical region.
Humans
All personal data are related to people. In other words, all data have owner attributes. Personal data are usually related to people other than the owner, such as senders of emails, colleagues at meetings, and families in photographs.
The changes in scale for humans correspond to changes in groups of people.
Category
Category is a supplementary axis that enables personal data to be selected. A text tag is one kind of category information. Category is also useful for filtering large amounts of data selected with the above viewpoints.
Views
This section explains the views that correspond to the viewpoints. Even a single viewpoint allows a variety of visualizations; the temporal viewpoint, for example, can be shown as a calendar or a timeline.
Views that feature temporal information
The most popular view that features temporal information is a calendar. A calendar view usually provides daily, weekly, monthly, and yearly forms. The amount of data to be displayed generally increases substantially as the time interval expands. Therefore, only some representative data are displayed on the screen. Another view that features time is timeline visualization such as AllofMe [7].
A kind of zooming user interface is proposed in this paper to enable interaction from the temporal viewpoint. A zooming user interface (ZUI) is a graphical user interface that provides a visual scaling function [8–10]. Users can continuously change the size of the view to see more or less detail with the interface.
Figure 2 shows an overview of temporal zooming. A later section explains it in more detail.
There are various methods of display that feature temporal information. For home energy costs, monthly usage and cost are displayed as figures on a monthly view, and a bar chart in which 12 bars represent monthly use is displayed on a yearly scale.
As previously described, visualization changes depending on the temporal scale and characteristics of the data. For instance, location data are usually measured every few minutes or seconds, and it would be pointless to display all of them on a yearly scale.
Three user interfaces are considered to feature temporal information.
In addition, three kinds of views can be used: text labels that display characters, charts (e.g., bar and line charts), and animations of time-series data.
Views that feature locations
The most natural view that features locations is a map. Although location data are easy to monitor using GPS, detailed names of places cannot be understood solely from the latitude and longitude monitored by GPS. However, users occasionally write the names of places where they have been on Twitter. Also, location information such as 'homes', 'offices', and 'stations' is used on calendars. This means we use various levels of locational information in daily life.
Data are usually placed on a map by their latitude and longitude. Therefore, in this research, personal data that originally had no location data were assigned a latitude and longitude by matching their timestamps to the timestamps of GPS histories.
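The timestamp matching can be sketched as a nearest-in-time lookup. This is a minimal sketch under our own assumptions: the function `locate_by_timestamp`, the `max_gap` tolerance, and the sample fixes are hypothetical, and the matching rule actually used in the prototype may differ.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def locate_by_timestamp(item_time, gps_history, max_gap=timedelta(minutes=10)):
    """Return (lat, lon) of the GPS fix nearest in time, or None if too far.

    gps_history: list of (timestamp, lat, lon) tuples sorted by timestamp.
    max_gap: hypothetical tolerance, not taken from the paper.
    """
    if not gps_history:
        return None
    times = [fix[0] for fix in gps_history]
    i = bisect_left(times, item_time)
    # Only the fixes just before and just after item_time can be nearest.
    candidates = gps_history[max(i - 1, 0):i + 1]
    nearest = min(candidates, key=lambda fix: abs(fix[0] - item_time))
    if abs(nearest[0] - item_time) > max_gap:
        return None
    return nearest[1], nearest[2]


# Illustrative GPS history only.
gps = [
    (datetime(2012, 5, 1, 10, 0), 35.0, 139.0),
    (datetime(2012, 5, 1, 10, 5), 35.1, 139.1),
    (datetime(2012, 5, 1, 11, 0), 36.0, 140.0),
]
photo_location = locate_by_timestamp(datetime(2012, 5, 1, 10, 4), gps)
```

Returning `None` when no fix is close enough in time avoids pinning an item to a place the user had already left.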
Others
Personal data can be classified by related individuals from the viewpoint of people. The classified data are displayed in a list or as a graph structure that represents the relationships among people.
A category viewpoint is usually used for filtering information. A tag cloud user interface for this kind of view has recently become very popular on the WWW.
Summaries and Landmarks
An effective navigation system is essential to enable interaction with large amounts of personal data. Furthermore, summaries of information and special landmarks are useful for recalling experiences by navigating personal data [11]. Summaries are essentially digests of daily life. Landmarks represent important events, such as parties, ceremonies, travel, and important meetings. They provide cues for recalling memories and exploring related information and events. A summary contains several landmarks. Summaries and landmarks change depending on the viewpoints and their scale.
This paper proposes six main landmarks.
- Landmark user-generated data (e.g., photographs, videos, blogs, mail messages)
- Landmark locations
- Landmark people
- Landmark tags
- Landmark values (e.g., outliers)
- Public landmarks
A variety of methods for clustering photographs have been proposed [12, 13]. A simple method of clustering using only the creation time was applied to photographs in our prototype. The photograph closest to the center of each cluster is considered to be a representative photo and is displayed as a temporal landmark.
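The idea of choosing a representative photograph per temporal cluster can be sketched as follows. Note that this sketch substitutes a simple gap-based split for the clustering algorithm used in the prototype; the functions `cluster_by_time` and `representative`, the two-hour `max_gap`, and the sample times are our own assumptions.

```python
from datetime import datetime, timedelta


def cluster_by_time(timestamps, max_gap=timedelta(hours=2)):
    """Split timestamps into clusters wherever the gap exceeds max_gap.

    Gap-based splitting is an illustrative stand-in, not the paper's method.
    """
    clusters = []
    for t in sorted(timestamps):
        if clusters and t - clusters[-1][-1] <= max_gap:
            clusters[-1].append(t)
        else:
            clusters.append([t])
    return clusters


def representative(cluster):
    """Pick the timestamp closest to the cluster's mean creation time."""
    origin = cluster[0]
    mean_offset = sum((t - origin for t in cluster), timedelta()) / len(cluster)
    center = origin + mean_offset
    return min(cluster, key=lambda t: abs(t - center))


# Illustrative creation times only.
day = datetime(2012, 5, 1)
photo_times = [
    day + timedelta(hours=10),
    day + timedelta(hours=10, minutes=10),
    day + timedelta(hours=10, minutes=50),
    day + timedelta(hours=19),
    day + timedelta(hours=19, minutes=5),
    day + timedelta(hours=19, minutes=7),
]
clusters = cluster_by_time(photo_times)
landmarks = [representative(c) for c in clusters]
```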
GPS histories are divided with a clustering algorithm using only latitude and longitude. The center of each cluster is considered to be a location landmark. Also, daily living areas and other areas can be distinguished by how frequently each cluster appears. Places where people have rarely gone in daily life are also landmarks. Here, we used a simple expectation maximization (EM) algorithm implemented in WEKA [14] to cluster photographs and GPS histories.
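The frequency-based distinction between daily living areas and rarely visited places can be sketched once each GPS fix has a cluster label. The sketch below assumes that clustering has already been done elsewhere; the function `classify_clusters`, the `daily_share` cutoff, and the labels are hypothetical.

```python
from collections import Counter


def classify_clusters(cluster_ids, daily_share=0.2):
    """Label each cluster 'daily' or 'rare' by its share of all GPS fixes.

    cluster_ids: one cluster label per GPS fix (clustering done beforehand).
    daily_share: hypothetical cutoff, not taken from the paper.
    """
    counts = Counter(cluster_ids)
    total = len(cluster_ids)
    return {
        cid: ("daily" if n / total >= daily_share else "rare")
        for cid, n in counts.items()
    }


# Illustrative labels only: six fixes at home, three at the office, one on a trip.
fix_labels = ["home"] * 6 + ["office"] * 3 + ["trip"]
place_kinds = classify_clusters(fix_labels)
```

Clusters labeled `rare` are the candidates for location landmarks in the sense described above.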
Other candidates for landmarks are human landmarks. These include family members who frequently appear in photographs, colleagues with whom the user frequently communicates, old friends met again after a long time, and pop stars whose songs the user listens to very often.
Landmarks of tags are defined by the frequency of tags that are assigned to each item of personal data. A tag that has been in heavy use during a period of time is a candidate for a landmark. A tag that has rarely been used during a long period of time is also a candidate for a landmark.
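Both kinds of tag landmarks can be sketched with simple frequency counts. This is a sketch under our own assumptions: the function `tag_landmarks`, the thresholds `heavy_min` and `rare_max`, and the sample tags are hypothetical.

```python
from collections import Counter
from datetime import date


def tag_landmarks(tagged_items, start, end, heavy_min=3, rare_max=1):
    """Return (heavy, rare) tag candidates for the period [start, end].

    heavy: tags used at least heavy_min times inside the period.
    rare: tags seen in the period but used at most rare_max times overall.
    Both thresholds are hypothetical.
    """
    in_period = Counter(tag for day, tag in tagged_items if start <= day <= end)
    overall = Counter(tag for _, tag in tagged_items)
    heavy = {tag for tag, n in in_period.items() if n >= heavy_min}
    rare = {tag for tag in in_period if overall[tag] <= rare_max}
    return heavy, rare


# Illustrative tag assignments only.
items = [
    (date(2012, 1, 10), "work"),
    (date(2012, 3, 12), "work"),
    (date(2012, 5, 1), "kyoto"),
    (date(2012, 5, 2), "kyoto"),
    (date(2012, 5, 2), "wedding"),
    (date(2012, 5, 3), "kyoto"),
    (date(2012, 5, 4), "work"),
]
heavy, rare = tag_landmarks(items, date(2012, 5, 1), date(2012, 5, 7))
```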
Outliers are candidates for landmarks in time-series data, such as home energy use, the number of steps walked, and histories of body weight. Data that exceed pre-defined or user-defined thresholds are also candidates. For example, we have often gone out on days when we walked more steps than on other days, so such landmarks help us find special events.
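Value landmarks of both kinds can be sketched with a z-score test and an optional threshold. The z-score rule is only one possible definition of an outlier and is our own assumption, as are the function `value_landmarks`, its default `z_limit`, and the sample step counts.

```python
from statistics import mean, pstdev


def value_landmarks(series, z_limit=2.0, threshold=None):
    """Return labels whose values are outliers or exceed a threshold.

    series: list of (label, value) pairs.
    z_limit: hypothetical z-score cutoff for outliers.
    threshold: optional user-defined upper limit.
    """
    values = [v for _, v in series]
    mu, sigma = mean(values), pstdev(values)
    marks = []
    for label, v in series:
        is_outlier = sigma > 0 and abs(v - mu) / sigma > z_limit
        over_threshold = threshold is not None and v > threshold
        if is_outlier or over_threshold:
            marks.append(label)
    return marks


# Illustrative daily step counts only.
steps = [
    ("Mon", 5000), ("Tue", 5200), ("Wed", 4800), ("Thu", 5100),
    ("Fri", 4900), ("Sat", 15000), ("Sun", 5000),
]
outlier_days = value_landmarks(steps)
```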
Other landmarks are public landmarks, which include shocking public news, bestsellers, blockbuster films, and annual rankings of top Web-search words. We can recall our own experiences on those days from these landmarks.
Exploration
Figure 3 outlines exploration using the zooming user interface [8–10] that we propose. Users control the scale of the view to change the time interval. The time interval is shortened by zooming in and extended by zooming out. Users can also scroll right and left to move to the next and previous time intervals. Summaries, landmarks, and visual forms change appropriately with changes in the temporal scale or interval, where visual forms include text labels and charts.
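The zooming and scrolling operations can be sketched as functions over a temporal scale and an anchor date. This sketch uses three discrete scales for simplicity, whereas a zooming user interface may change scale continuously; the names `SCALES`, `interval`, `zoom`, and `scroll` are our own assumptions.

```python
import calendar
from datetime import date, timedelta

# Hypothetical discrete scales, ordered from widest to narrowest interval.
SCALES = ["year", "month", "day"]


def interval(anchor, scale):
    """Return the (start, end) dates of the interval containing anchor."""
    if scale == "year":
        return date(anchor.year, 1, 1), date(anchor.year, 12, 31)
    if scale == "month":
        last = calendar.monthrange(anchor.year, anchor.month)[1]
        return anchor.replace(day=1), anchor.replace(day=last)
    return anchor, anchor


def zoom(scale, direction):
    """direction=+1 zooms in (shorter interval); -1 zooms out (longer)."""
    i = SCALES.index(scale) + direction
    return SCALES[min(max(i, 0), len(SCALES) - 1)]


def scroll(anchor, scale, step):
    """Move the anchor by step intervals; step=-1 is the previous interval."""
    if scale == "day":
        return anchor + timedelta(days=step)
    months = step * 12 if scale == "year" else step
    index = anchor.year * 12 + (anchor.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)


# Illustrative navigation only.
view_scale = zoom("year", +1)
view_range = interval(date(2012, 2, 10), view_scale)
next_anchor = scroll(date(2012, 12, 15), view_scale, +1)
```

After each zoom or scroll, the summaries and landmarks for the new interval would be recomputed for display.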
Landmarks contain representative data within a period of time. When users click on landmarks, related personal data appear. In Figure 4, since landmark 'M2' is representative of data 'M21 ~ M27', these data appear when landmark 'M2' is clicked.
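The relationship between a landmark and the data it represents can be sketched as a small data structure. The class `Landmark` and the handler `on_click` are hypothetical names; only the labels 'M2' and 'M21' to 'M27' come from Figure 4.

```python
from dataclasses import dataclass, field


@dataclass
class Landmark:
    """A landmark and the personal data items it represents."""
    label: str
    members: list = field(default_factory=list)


def on_click(landmark):
    """Return the related personal data to show when a landmark is clicked."""
    return list(landmark.members)


# Landmark 'M2' represents data 'M21' to 'M27', as in Figure 4.
m2 = Landmark("M2", [f"M2{i}" for i in range(1, 8)])
shown = on_click(m2)
```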