TalkMine and the Adaptive Recommendation Project

LUIS MATEUS ROCHA
Complex Systems Modeling Team
Modeling, Algorithms, and Informatics Group (CCS-3)
Los Alamos National Laboratory, MS B265
Los Alamos, New Mexico 87545, USA
e-mail: rocha@lanl.gov or rocha@santafe.edu

Rocha, Luis M. [1999d]. In: Proceedings of the Association for Computing Machinery (ACM) - Digital Libraries 99. U.C. Berkeley, August 1999. pp. 242-243.


ABSTRACT: TalkMine is an adaptive recommendation system which is both content-based and collaborative, and which further allows the crossover of information among the multiple databases searched by users. In this way, different databases learn new keywords and adapt existing ones to the categories recognized by their communities of users. TalkMine is based on several theories of uncertainty, as well as on biologically inspired adaptationist ideas. This system is currently being implemented for the research library of the Los Alamos National Laboratory under the Adaptive Recommendation Project. In the present work we discuss the shortcomings of current recommendation systems for distributed information systems and propose how TalkMine can address these shortcomings.

INDEX TERMS: Recommendation systems, information retrieval, knowledge management, fuzzy sets, evidence sets, distributed information systems, evolutionary systems, human-machine interaction, artificial intelligence.

1. Distributed Information Systems and Information Retrieval

Distributed Information Systems (DIS) are collections of networked information resources in interaction with communities of users; examples include the Internet, the World Wide Web, and library information retrieval systems. Traditional information retrieval systems are based solely on keywords that index (semantically characterize) documents and on a query language used to retrieve documents from centralized databases in terms of these keywords. This setup leads to four major flaws:

Passive Environments. There is no genuine interaction between user and system: the user pulls information from a passive database and therefore needs to know how to query relevant information with appropriate keywords. Furthermore, such impersonal interfaces cannot respond to queries in a user-specific fashion because they do not keep user-specific information, or user profiles. The net result is that users must know in advance how to characterize the information they need before pulling it from the environment.

Idle Structure. Structural relationships between documents, keywords, and information retrieval patterns are not utilized. Different kinds of structural relationships are available, but not typically used, for different DIS, e.g. citation structure in scientific library databases, the link structure in the WWW, the clustering of keyword relationships into different meanings of keywords, temporal patterns of retrieval, etc.

Fixed Semantics. Keywords are initially provided by document authors (or publishers, librarians, and indexers), and do not necessarily reflect the evolving semantic expectations of users.

Isolated Information Resources. No relationships are created, and no information is exchanged, among documents and/or keywords in different information resources such as databases, web sites, etc. Each resource is accessed with a private set of keywords and its own query language.

These flaws prevent current information retrieval processes in DIS from achieving any kind of interesting coupling with users. No system-user co-adaptation and learning can be achieved because of the following fundamental limitations:

2. TalkMine

TalkMine is currently being developed as a testbed environment for the Research Library at the Los Alamos National Laboratory, more specifically, for its Library Without Walls project (1) under the Adaptive Recommendation Project (ARP).

The architecture of TalkMine has both user-side and system-side components. Each user owns a browser (or a plug-in to an existing Internet browser), which functions as a consolidated interface to all information resources searched. This individual browser stores user preferences and tracks information retrieval patterns and relationships, which it uses to adapt to the user. User preferences are stored as a set of local knowledge contexts which the user has constructed while using the system under a set of different interests. These local knowledge contexts store both semantic semi-metric and structural proximity information. In this way, user preferences are much more than a list of keywords used or documents retrieved (e.g. a list of "Bookmarks"), because they also keep associative information between keywords and between documents, which continually adapts according to the user's information retrieval history. This training can be done for distinct sets of user interests; that is, users can choose to train their browser when they retrieve information as, say, a scientist or a sports aficionado. Each of the associated local knowledge contexts can be seen as a sort of surrogate "personality" which can be used to automate the question-answering process of the TalkMine algorithm [Rocha, 1999a, 1999b].
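To make the idea of a local knowledge context more concrete, the following is a minimal Python sketch. It is not the TalkMine implementation: the class name, the method names, and the choice of a co-occurrence (Jaccard-style) proximity with distance 1/proximity - 1 are all assumptions made for illustration. The sketch only shows how associative information between keywords could be kept per interest and updated with each retrieval, and why the resulting distance is only a semi-metric (it need not satisfy the triangle inequality).

```python
from collections import defaultdict
from itertools import combinations


class LocalKnowledgeContext:
    """Hypothetical per-interest user profile; all names here are illustrative."""

    def __init__(self, interest):
        self.interest = interest
        # Number of retrieved documents in which each keyword appeared.
        self.keyword_count = defaultdict(int)
        # Number of retrieved documents in which a pair of keywords co-appeared.
        self.pair_count = defaultdict(int)

    def record_retrieval(self, keywords):
        """Update keyword associations from the keywords of one retrieved document."""
        unique = sorted(set(keywords))
        for keyword in unique:
            self.keyword_count[keyword] += 1
        for a, b in combinations(unique, 2):
            self.pair_count[(a, b)] += 1

    def proximity(self, a, b):
        """Assumed proximity in [0, 1]: retrievals with both / retrievals with either."""
        if a == b:
            return 1.0 if self.keyword_count.get(a, 0) else 0.0
        both = self.pair_count.get(tuple(sorted((a, b))), 0)
        either = self.keyword_count.get(a, 0) + self.keyword_count.get(b, 0) - both
        return both / either if either else 0.0

    def distance(self, a, b):
        """Assumed semi-metric distance: 0 for identical usage, infinite if never associated."""
        p = self.proximity(a, b)
        return float("inf") if p == 0 else 1.0 / p - 1.0
```

A user could keep one such object per interest, for example `LocalKnowledgeContext("scientist")` and `LocalKnowledgeContext("sports")`, and call `record_retrieval` with the keywords of each document retrieved under that interest. Two keywords that are each close to a third keyword can still be infinitely far from each other, which is the sense in which the distance is semi-metric.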

Where existing information retrieval is strictly unidirectional and query-based, TalkMine relies on an interactive, conversational, multi-directional exchange between user-side and system-side components. Each user's browser engages in the interactive algorithm with the information resources it queries. The first result of this interaction is a list of document and related-topic recommendations issued according to the user's profile and present interests, integrating knowledge from the several information resources queried. The second result is that all sides exchange information, so all parties can potentially learn new information in an adaptive fashion. Indeed, information resources can learn new keywords from users and from other information resources, and will adapt the associations between keywords and documents according to the expectations of their users.
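The two outcomes of this exchange, recommendation and adaptation of the resource, can be illustrated with a small Python sketch. This is not the TalkMine algorithm itself, which is described in [Rocha, 1999a, 1999b]. The class `InformationResource`, the weighted keyword-document index, and the update rule with its `rate` parameter are hypothetical choices made only to show how a resource might acquire a keyword it did not previously index.

```python
from collections import defaultdict


class InformationResource:
    """Hypothetical database with an adaptable keyword-document index (illustrative only)."""

    def __init__(self, index):
        # keyword -> {document id: association weight in [0, 1]}
        self.index = defaultdict(dict)
        for keyword, docs in index.items():
            self.index[keyword] = dict(docs)

    def recommend(self, keywords):
        """Rank documents by their summed association weight with the query keywords."""
        scores = defaultdict(float)
        for keyword in keywords:
            for doc, weight in self.index.get(keyword, {}).items():
                scores[doc] += weight
        # Highest score first; ties broken by document id for a stable order.
        return sorted(scores, key=lambda doc: (-scores[doc], doc))

    def learn_from_user(self, known_keyword, user_keyword, user_proximity, rate=0.5):
        """Assumed update rule: move the weights of user_keyword toward the weights of
        known_keyword, scaled by how closely the user associates the two keywords."""
        for doc, weight in list(self.index.get(known_keyword, {}).items()):
            old = self.index[user_keyword].get(doc, 0.0)
            target = weight * user_proximity
            self.index[user_keyword][doc] = old + rate * (target - old)
```

For example, a resource that indexes documents only under "fuzzy sets" returns nothing for the query "evidence sets". After a call such as `learn_from_user("fuzzy sets", "evidence sets", user_proximity=0.5)`, the same query returns the documents indexed under "fuzzy sets" with reduced weights, and repeated interactions strengthen or weaken those weights according to what users actually associate.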

TalkMine tackles the flaws of information retrieval in DIS as depicted in section 1 in the following manner:

Therefore, TalkMine overcomes the limitations of information retrieval outlined in section 1:

Through all of these characteristics, TalkMine establishes an open-ended human-machine symbiosis, which can be used in the automatic, adaptive organization of knowledge in DIS such as library databases or the Internet, facilitating the rapid dissemination of relevant information and the discovery of new knowledge.

REFERENCES

[1] Rocha, Luis M. [1999a]. "Evidence sets: modeling subjective categories." International Journal of General Systems. In Press.

[2] Rocha, Luis M. [1999b]. "Adaptive recommendation and open-ended semiosis." International Journal of Human-Computer Studies. In Press.

1. More details of this project at http://informatics.indiana.edu/rocha/lww.


For more information contact Luis Rocha at rocha@indiana.edu.
Last Modified: September 02, 2004