JOURNAL OF DIGITAL INFORMATION MANAGEMENT

(ISSN 0972-7272) A peer-reviewed journal


Volume 3 Issue 2 June 2005

Abstracts

Managing Trust in Peer-to-Peer Networks

Wen Tang, Yun-Xiao Ma, Zhong Chen
School of Electronics Engineering and Computer Science, Peking University
100871, Beijing, P.R. China
Email: {tangwen, mayx, chen}@infosec.pku.edu.cn


Abstract

The notion of trust is fundamental in open networks for enabling peers to share resources and services. Since trust is related to subjective observers (peers), it is necessary to consider the fuzzy nature of trust when representing, estimating and updating trustworthiness in peer-to-peer networks. In this paper, we present a fuzzy theory-based trust model for trust evaluation, recommendation and reasoning in peer-to-peer networks. Fuzzy theory is an appropriate formal method for handling the fuzziness that constantly arises in trust between peers, and fuzzy logic provides a useful and flexible tool: fuzzy IF-THEN rules can model the knowledge, experience and criteria of trust reasoning that people use in everyday life.
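
As a rough illustration of how fuzzy IF-THEN rules can drive trust evaluation, the Python sketch below combines two fuzzy observations about a peer (satisfaction with past transactions and recommendations from other peers) into a defuzzified trust score. The membership functions, rule base and input names are assumptions made for this example, not the model defined in the paper.

    # Illustrative sketch only: a Mamdani-style evaluation of fuzzy IF-THEN trust rules.
    # The membership functions, rule base and inputs are assumptions, not the paper's model.

    def tri(x, a, b, c):
        """Triangular membership function with optional flat shoulders (a == b or b == c)."""
        if x < a or x > c:
            return 0.0
        if x < b:
            return (x - a) / (b - a) if b > a else 1.0
        return (c - x) / (c - b) if c > b else 1.0

    # Fuzzy sets over [0, 1], used for both the inputs and the output "trustworthiness".
    LOW, MEDIUM, HIGH = (0.0, 0.0, 0.5), (0.25, 0.5, 0.75), (0.5, 1.0, 1.0)

    def trust_score(satisfaction, recommendation):
        """Fire three IF-THEN rules, clip the output sets, defuzzify by centroid."""
        rules = [
            # IF satisfaction is HIGH AND recommendation is HIGH THEN trust is HIGH
            (min(tri(satisfaction, *HIGH), tri(recommendation, *HIGH)), HIGH),
            # IF satisfaction is MEDIUM OR recommendation is MEDIUM THEN trust is MEDIUM
            (max(tri(satisfaction, *MEDIUM), tri(recommendation, *MEDIUM)), MEDIUM),
            # IF satisfaction is LOW THEN trust is LOW
            (tri(satisfaction, *LOW), LOW),
        ]
        num = den = 0.0
        for i in range(101):                      # sample the output universe [0, 1]
            x = i / 100
            mu = max(min(weight, tri(x, *fuzzy_set)) for weight, fuzzy_set in rules)
            num += x * mu
            den += mu
        return num / den if den else 0.0

    print(trust_score(0.8, 0.7))   # a peer with good history and recommendations scores high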


A Peer-to-Peer Workflow Model for Distributing Large-Scale Workflow Data onto Grid/P2P

Kwang-Hoon Kim1 and Hak-Sung Kim2

1 Collaboration Technology Research Lab
Department of Computer Science
Kyonggi University
San 94-6 Yiuidong Youngtongku Suwonsi, Kyonggido, 442-760, South Korea
http://ctrl.kyonggi.ac.kr
kwang@kyonggi.ac.kr

2 Division of Engineering Science
Dongnam Health College
937 Jungjadong Janganku Suwonsi, Kyonggido, 440-714, South Korea
amang@dongnam.ac.kr

Abstract

In the workflow technology literature, very large-scale workflow architectures, systems and applications have long sought distributed computing infrastructures that maximize performance and efficiency while using minimal resources. Almost all conventional workflow systems are based on client-server or cluster computing environments. However, as Grid/P2P has emerged as a very feasible infrastructure for very large-scale information systems, we need to explore reasonable approaches for applying Grid/P2P as an infrastructure for very large-scale workflow systems. This paper is one such attempt to find an approach that fits well with the nature of Grid/P2P. It proposes a scheme that generates a Grid/P2P configuration implementing resource management and scheduling functionality for workflow procedures, yielding the best distribution of workflow data when enacting a workflow process over Grid/P2P. The scheme's essential idea is a peer-to-peer workflow model that is automatically generated from the workflow process (represented by ICN, an Information Control Net) by an algorithm devised in this paper. The peer-to-peer workflow model then becomes a workflow data distribution model for enacting the workflow process at runtime over Grid/P2P. This paper describes the peer-to-peer workflow model, which provides a theoretical basis for peer-to-peer workflow distribution and enactment over Grid/P2P.
 


Semantic Data Integration Framework in Peer-to-Peer
based Digital Libraries

Hao Ding, Ingeborg T. Sølvberg
Information Management Group, Norwegian Univ. of Sci. & Tech.
Sem Sælands vei 7-9, NO-7491, Trondheim, Norway
Email: {hao.ding, ingeborg.solvberg}@idi.ntnu.no

Abstract

This paper presents our approach to integrating heterogeneous metadata records in Peer-to-Peer (P2P) based digital libraries (DLs). We first present the advantages of adopting a P2P network over other approaches for searching information among moderate-sized digital libraries. Before presenting the semantic integration solution, we describe a P2P architecture built on the JXTA protocol. By adopting JXTA, peers can automatically discover other peers that can provide the most appropriate answers; this feature is realized by the advertisement functionality, which we introduce into the query process in this paper. For metadata integration, since resources may adopt distinct metadata, standardized or not, we employ the widely adopted Dublin Core [17] as the globally shared metadata schema to support interoperation. The paper also describes the mechanism of applying inference rules to convert heterogeneous metadata into the local repository's schema.
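
A minimal sketch of the kind of rule-based conversion from a local, non-standard metadata schema to Dublin Core that the abstract alludes to. The local field names and the mapping rules are assumptions for illustration; they are not the inference rules used by the authors.

    # Illustrative sketch only: rule-based mapping of a local, non-standard metadata
    # record onto Dublin Core elements as the shared global schema. The local field
    # names and rules below are assumptions, not the mappings used in the paper.

    # Each rule maps a local field to a Dublin Core element, optionally transforming the value.
    MAPPING_RULES = {
        "paper_title":    ("dc:title",   lambda v: v.strip()),
        "written_by":     ("dc:creator", lambda v: [name.strip() for name in v.split(";")]),
        "year":           ("dc:date",    str),
        "topic_keywords": ("dc:subject", lambda v: [k.strip() for k in v.split(",")]),
    }

    def to_dublin_core(local_record):
        """Apply the mapping rules; unknown local fields are kept aside for manual review."""
        dc_record, unmapped = {}, {}
        for field, value in local_record.items():
            if field in MAPPING_RULES:
                dc_element, convert = MAPPING_RULES[field]
                dc_record[dc_element] = convert(value)
            else:
                unmapped[field] = value
        return dc_record, unmapped

    record = {"paper_title": "Semantic Data Integration", "written_by": "Ding, H.; Solvberg, I.", "year": 2005}
    print(to_dublin_core(record))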
 


Using J2EE/.NET Clusters for Parallel Computations of
Join Queries in Distributed Databases


Yosi Ben-Asher, Shlomo Berkovsky , Eduard Gelzin, Ariel Tammam, Miri Vilkhov
Computer Science Department, Haifa University, Haifa, Israel

Edi Shmueli
I.B.M. Research Center, Haifa, Israel


Abstract

We consider the problem of parallel execution of the Join operation on J2EE/.NET clusters. These clusters are basically intended for coarse-grain distributed processing of multiple queries/business transactions over the Web. Thus, the possibility of using J2EE/.NET clusters for fine-grain parallel computations (parallel Joins in our case) is intriguing and of practical interest. We have developed a new variant of the SFR algorithm for parallel Join operations and proved its optimality in terms of communication/execution-time tradeoffs via a simple lower bound. Two variants of the SFR algorithm were implemented over the J2EE and .NET platforms. The experimental results show that, despite the fact that J2EE/.NET are considered platforms with complex interfaces and software entities, J2EE/.NET clusters can be used efficiently to execute the Join operation in parallel.
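
To make the general idea concrete, here is a minimal hash-partitioned parallel join in Python: both relations are partitioned on the join key, each (simulated) node joins its partitions locally, and the results are unioned. This is a generic textbook scheme, not the authors' SFR variant; the relation layout and node count are assumptions.

    # Illustrative sketch only: a generic hash-partitioned parallel join, NOT the SFR variant.

    from collections import defaultdict

    def partition(relation, key_index, n_nodes):
        """Route each tuple to a node by hashing its join key."""
        parts = defaultdict(list)
        for row in relation:
            parts[hash(row[key_index]) % n_nodes].append(row)
        return parts

    def local_join(r_part, s_part, r_key, s_key):
        """Classic hash join executed independently on one node's partitions."""
        index = defaultdict(list)
        for r in r_part:
            index[r[r_key]].append(r)
        return [r + s for s in s_part for r in index.get(s[s_key], [])]

    def parallel_join(R, S, r_key, s_key, n_nodes=4):
        r_parts, s_parts = partition(R, r_key, n_nodes), partition(S, s_key, n_nodes)
        result = []
        for node in range(n_nodes):            # in a real cluster these would run concurrently
            result.extend(local_join(r_parts[node], s_parts[node], r_key, s_key))
        return result

    R = [(1, "alice"), (2, "bob"), (3, "carol")]
    S = [(1, "order-17"), (3, "order-42")]
    print(parallel_join(R, S, r_key=0, s_key=0))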


Data Quality Management in a Database Cluster with Lazy Replication


Cecile Le Pape - Stephane Gancarski - Patrick Valduriez^
Firstname.Lastname@lip6.fr, LIP6, Paris, France
^Patrick.Valduriez@inria.fr, INRIA and LINA, Nantes, France


Abstract

We consider the use of a database cluster with lazy replication. In this context, controlling the quality of replicated data based on users' requirements is important for improving performance. However, existing approaches are limited to a particular aspect of data quality. In this paper, we propose a general model of data quality which distinguishes between the "freshness" and the "validity" of data. Data quality is expressed through divergence measures with respect to data of perfect quality. Users can thus specify the minimum level of quality for their queries, and this information can be exploited to optimize query load balancing. We implemented our approach in our Refresco prototype. The results show that freshness control can increase query throughput significantly. They also show a significant improvement when freshness requirements are specified at the relation level rather than at the database level.
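
The sketch below illustrates one way freshness requirements can steer load balancing in a lazily replicated cluster: a query carrying a maximum tolerated staleness is routed to the least-loaded replica that satisfies it, and a refresh is triggered only when no replica qualifies. The staleness measure and routing policy are assumptions for this example, not the divergence measures or the Refresco algorithms from the paper.

    # Illustrative sketch only: freshness-aware query routing in a lazily replicated cluster.
    # Staleness (seconds behind the master) and the node structure are assumptions.

    from dataclasses import dataclass

    @dataclass
    class Replica:
        name: str
        staleness: float   # how far this copy lags behind the master, in seconds
        load: int          # number of queries currently running on this node

    def route(replicas, max_staleness):
        """Pick the least-loaded replica that is fresh enough; otherwise refresh one first."""
        fresh_enough = [r for r in replicas if r.staleness <= max_staleness]
        if fresh_enough:
            return min(fresh_enough, key=lambda r: r.load)
        # No replica meets the bound: refresh the least stale one before executing the query.
        target = min(replicas, key=lambda r: r.staleness)
        target.staleness = 0.0
        return target

    cluster = [Replica("n1", 2.5, 3), Replica("n2", 0.5, 7), Replica("n3", 10.0, 1)]
    print(route(cluster, max_staleness=3.0).name)   # -> "n1"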

 


 Content-Aware Segment-Based Video Adaptation



Mulugeta Libsie
Addis Ababa University
P. O. Box 30312, Addis Ababa, Ethiopia
mlibsie@yahoo.com
Harald Kosch*
University Klagenfurt
Universitätstrasse 65-67, Klagenfurt, Austria
Email: harald.kosch@itec.uni-klu.ac.at



Abstract

Video adaptation is an active research area aiming at delivering heterogeneous content to equally heterogeneous devices under varying network conditions. It is an important component of multimedia data management, addressing the problem of delivering multimedia data in distributed heterogeneous environments. This paper presents a novel method of video adaptation called segment-based adaptation. It applies different reduction methods to different segments based on their physical content. The video is first partitioned into homogeneous segments based on physical characteristics. Optimal reduction methods are then selected and applied to each segment with the objective of minimizing quality loss and/or maximizing data size reduction during adaptation. In addition to this new method of variation creation, the commonly used reduction methods are also implemented. To realize variation creation, a unifying framework called the Variation Factory is developed. It is extended to the Multi-Step Variation Factory, which allows intermediary videos to serve as variations and also as sources for further variations. Our proposals are implemented as part of a server component, called the Variation Processing Unit (VaPU), that generates different versions of the source and an MPEG-7 metadata document.
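
As a very rough sketch of per-segment adaptation, the example below picks, for each homogeneous segment, the candidate reduction method with the smallest quality loss that still meets a target size ratio. The segment types, candidate methods and cost numbers are invented for illustration and do not reflect the Variation Factory's actual selection logic.

    # Illustrative sketch only: greedy per-segment selection of a reduction method.
    # Segment attributes, candidate methods and costs are assumptions.

    # (method name, size reduction factor, quality loss score) per segment type.
    CANDIDATES = {
        "low_motion":  [("frame_dropping", 0.5, 0.1), ("resolution_halving", 0.4, 0.3)],
        "high_motion": [("requantization", 0.7, 0.2), ("frame_dropping", 0.5, 0.6)],
        "static":      [("color_reduction", 0.6, 0.1), ("frame_dropping", 0.3, 0.05)],
    }

    def adapt(segments, target_ratio):
        """For each segment pick the lowest-quality-loss method whose size factor
        still meets the target ratio (greedy, per-segment)."""
        plan = []
        for name, seg_type, size in segments:
            options = sorted(CANDIDATES[seg_type], key=lambda m: m[2])  # least quality loss first
            chosen = next((m for m in options if m[1] <= target_ratio), options[-1])
            plan.append((name, chosen[0], size * chosen[1]))
        return plan

    segments = [("s1", "static", 10.0), ("s2", "high_motion", 40.0), ("s3", "low_motion", 25.0)]
    for name, method, new_size in adapt(segments, target_ratio=0.6):
        print(f"{name}: apply {method}, new size -> {new_size:.1f} MB")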


Content adaptation in distributed multimedia systems


Girma Berhe, Lionel Brunie, Jean-Marc Pierson
Lyon Research Center for Images and Intelligent Information Systems (LIRIS)
Institut National des Sciences Appliquees de Lyon, France
Batiment Blaise Pascal (502)
20 Avenue Albert Einstein
69621 Villeurbanne, France
Email: {girma.berhe, lionel.brunie, jean-Marc.pierson}@liris.cnrs.fr


Abstract

New developments in computing and communication technology facilitate mobile data access for multimedia application domains such as healthcare, tourism and emergency services. In these applications, users can access information with a variety of devices having heterogeneous capabilities. One of the requirements in such applications is to adapt the content to the user's preferences, the device's capabilities and the network conditions. In this paper we present a distributed content adaptation approach for distributed multimedia systems. In this approach, content adaptation is performed in several steps, and the adaptation tools are implemented as external services, which we call adaptation services. In order to determine the type and sequence of the adaptation services, we build an adaptation graph based on the client profile, the network conditions, the content profile (metadata) and the available adaptation services. Different quality criteria are used to optimize the adaptation graph.
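
A small sketch of how an adaptation path might be derived from profiles: adaptation services are modelled as edges with preconditions, postconditions and costs, and a cheapest sequence transforming the content profile into one matching the client profile is found with Dijkstra's algorithm. Service names, profile attributes and costs are assumptions, not the adaptation-graph construction or the quality criteria from the paper.

    # Illustrative sketch only: composing a cheapest sequence of adaptation services
    # that turns the source content profile into one matching the client profile.
    # Services, profile attributes and costs are assumptions.

    import heapq

    # Each service: (name, precondition it needs, postcondition it produces, cost).
    SERVICES = [
        ("transcode_mpeg_to_3gp", {"format": "mpeg"}, {"format": "3gp"}, 5.0),
        ("downscale_to_qcif",     {"format": "3gp"},  {"resolution": "qcif"}, 2.0),
        ("reduce_color_depth",    {},                 {"colors": "256"}, 1.0),
    ]

    def satisfied(profile, target):
        return all(profile.get(k) == v for k, v in target.items())

    def plan(source_profile, target_profile):
        """Dijkstra over content profiles; edges are applicable adaptation services."""
        start = tuple(sorted(source_profile.items()))
        queue, seen = [(0.0, start, [])], set()
        while queue:
            cost, state, path = heapq.heappop(queue)
            profile = dict(state)
            if satisfied(profile, target_profile):
                return path, cost
            if state in seen:
                continue
            seen.add(state)
            for name, pre, post, c in SERVICES:
                if satisfied(profile, pre):
                    nxt = {**profile, **post}
                    heapq.heappush(queue, (cost + c, tuple(sorted(nxt.items())), path + [name]))
        return None, float("inf")

    print(plan({"format": "mpeg", "resolution": "cif"},
               {"format": "3gp", "resolution": "qcif"}))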


Reducing Communication Overhead over Distributed Data Streams by Filtering Frequent Items



Dongdong Zhang1+, Jianzhong Li 1,2, Weiping Wang 1, Longjiang Guo1,2, Chunyu Ai 2
1School of Computer Science and Technology, Harbin Institute of Technology, China
2School of Computer Science and Technology, Heilongjiang University, China
Email: {zddhit, lijzh, wpwang}@hit.edu.cn, ljguo_1234@sina.com, chunyu_ai@263.net


Abstract

In distributed data stream systems, the available communication bandwidth is a bottleneck resource. To make better use of the available bandwidth, communication overhead should be reduced as much as possible while preserving the precision of queries. In this paper, a new approach is proposed for transferring data streams in distributed data stream systems. By transferring the estimated occurrence counts of frequent items, instead of the raw frequent items themselves, it saves substantial communication overhead. Meanwhile, in order to guarantee the precision of queries, the difference between the estimated and true occurrence counts of each frequent item is also sent to the central stream processor. We present an algorithm for processing frequent items over distributed data streams and a method for supporting aggregate queries over the preprocessed frequent items. Finally, experimental results demonstrate the efficiency of our method.
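
The sketch below captures the core idea as described: a remote site sends, for each frequent item in a window, a coarse estimated count together with its maximum error instead of the raw occurrences, and the central processor answers count aggregates as guaranteed intervals. The thresholds and the rounding scheme are assumptions for illustration, not the paper's algorithm.

    # Illustrative sketch only: shipping (item, estimated count, error bound) per window
    # instead of raw occurrences. Thresholds and rounding are assumptions.

    from collections import Counter

    def summarize(window, support=0.1, relative_error=0.05):
        """Summarize one window: keep items above the support threshold; report the
        count rounded to a coarse grain plus the maximal rounding error."""
        counts = Counter(window)
        n = len(window)
        grain = max(1, int(relative_error * n))
        summary = {}
        for item, c in counts.items():
            if c >= support * n:
                estimate = (c // grain) * grain      # coarse count: cheaper to transmit
                summary[item] = (estimate, grain)    # (estimated count, max error)
        return summary

    def central_count(summaries, item):
        """Combine per-site summaries into a count interval [low, high] for an item."""
        low = sum(s.get(item, (0, 0))[0] for s in summaries)
        high = sum(sum(s.get(item, (0, 0))) for s in summaries)
        return low, high

    site1 = summarize(["a"] * 40 + ["b"] * 5 + ["c"] * 55)
    site2 = summarize(["a"] * 20 + ["c"] * 80)
    print(central_count([site1, site2], "a"))   # an interval containing the true count 60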

 


An Original Solution to Evaluate Location-Dependent Queries in Wireless Environments



Marie Thilliez, Thierry Delot, Sylvain Lecomte
LAMIH Laboratory - CNRS UMR 8530
University of Valenciennes
Le Mont Houy
59313 Valenciennes Cedex 9 France
Email: {firstname.lastname}@univ-valenciennes.fr


Abstract

The recent emergence of handheld devices and wireless networks has provoked an exponential increase in the number of mobile users. These users are potential consumers of new applications, such as the Location-Dependent Applications (LDA) examined in this article. As their name implies, these applications depend on location information, which is used to adapt and customize the application for each user. In this article, we focus on the problem of information localization, particularly the evaluation of Location-Dependent Queries (LDQ). Such queries allow, for example, a mobile user in an airport to locate the closest bus stop for getting to the university. To evaluate these queries, the client's position must be retrieved. Positioning systems such as GPS are often used for this purpose; however, not all mobile clients are equipped with such systems, and they are not well suited to every environment. To remedy this, we propose a positioning solution based on environment metadata, which provides an approximate client position sufficient for evaluating LDQs. This paper presents both the positioning system and its optimization with regard to minimizing response time and economizing mobile device resources.
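
A minimal sketch of the idea of positioning without GPS: the client's position is approximated from environment metadata (here, the known coordinates of the zone or cell it is attached to), which is enough to evaluate a nearest-neighbour location-dependent query such as "closest bus stop". The metadata layout, coordinates and distance model are assumptions, not the solution proposed in the paper.

    # Illustrative sketch only: approximate positioning from environment metadata,
    # then a nearest-neighbour LDQ. Zones, coordinates and distances are assumptions.

    import math

    # Environment metadata: each network cell / access zone has known coordinates.
    ZONES = {"airport-terminal-1": (50.64, 3.08), "campus-gate": (50.61, 3.14)}

    BUS_STOPS = {"stop A": (50.642, 3.079), "stop B": (50.65, 3.10), "stop C": (50.60, 3.15)}

    def approximate_position(zone_id):
        """The client position is approximated by the centre of its current zone."""
        return ZONES[zone_id]

    def closest(position, points_of_interest):
        """Nearest-neighbour LDQ under plain Euclidean distance (adequate locally)."""
        return min(points_of_interest.items(),
                   key=lambda kv: math.dist(position, kv[1]))

    pos = approximate_position("airport-terminal-1")
    print(closest(pos, BUS_STOPS))   # -> ('stop A', ...)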
 


Indexing Location Dependent Data in Broadcast Environment



Debopam Acharya and Vijay Kumar
SCE, Computer Networking
University of Missouri-Kansas City
5100 Rockhill Road
Kansas City, MO 64110, USA
Email: {dargc, kumarv}@umkc.edu


Abstract

Wireless data dissemination is an effective way to deliver a large amount of data to a large mobile user population. Our project, called “DAYS (DAta in Your Space)”, investigates information management issues in wireless space. One of the main objectives of DAYS is to build a system that provides transactional and web service facilities globally. This paper discusses the creation of a location domain and proposes an algorithm for creating location dependent data. A new indexing scheme for accessing Location Dependent Data (LDD) in DAYS is also presented; it not only provides a bound on the access time but also allows significant energy conservation in mobile devices.

 


 Energy Efficient Cache Invalidation in a Mobile Environment



Narottam Chand, Ramesh Chandra Joshi and Manoj Misra
Electronics & Computer Engineering Department
Indian Institute of Technology, Roorkee - 247 667 INDIA
Email: {narotdec, joshifcc, manojfec}@iitr.ernet.in


Abstract

Caching in a mobile computing environment has emerged as a potential technique to improve data access performance and availability by reducing the interaction between client and server. A cache invalidation strategy ensures that a cached item at a mobile client has the same value as on the origin server. To maintain cache consistency, the server periodically broadcasts an invalidation report (IR) so that each client can invalidate obsolete data items from its cache. The IR strategy suffers from long query latency, long tuning time and poor utilization of wireless bandwidth. Using updated invalidation reports (UIR), the long query latency can be reduced. This paper presents a caching strategy that preserves the advantages of existing IR- and UIR-based strategies and improves on their disadvantages. Simulation results show that our strategy yields better performance than IR- and UIR-based strategies.
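
The sketch below shows the basic IR mechanism the abstract builds on: the server periodically broadcasts the identifiers and update times of items changed during a recent window, and each client invalidates cached copies older than the reported update time (or purges its cache entirely if it has been disconnected longer than the window). The window size and data layout are assumptions; the paper's own strategy extends the IR/UIR baseline shown here.

    # Illustrative sketch only: baseline processing of a periodic invalidation report (IR).
    # Window size and data layout are assumptions, not the paper's strategy.

    BROADCAST_INTERVAL = 20          # seconds between IRs (assumed)
    WINDOW = 3 * BROADCAST_INTERVAL  # an IR covers updates from the last few intervals

    def build_ir(server_updates, now):
        """Server side: report (item, update time) for items changed within the window."""
        return {item: t for item, t in server_updates.items() if now - t <= WINDOW}

    def apply_ir(cache, ir, last_heard, now):
        """Client side: if disconnected longer than the window, drop everything;
        otherwise invalidate items the IR marks as updated after they were cached."""
        if now - last_heard > WINDOW:
            cache.clear()
            return
        for item, updated_at in ir.items():
            entry = cache.get(item)
            if entry and entry["cached_at"] < updated_at:
                del cache[item]

    cache = {"x": {"value": 1, "cached_at": 100}, "y": {"value": 2, "cached_at": 150}}
    ir = build_ir({"x": 140, "z": 155}, now=160)
    apply_ir(cache, ir, last_heard=140, now=160)
    print(cache)   # "x" is invalidated, "y" is kept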

 


Composing Optimal Invalidation Reports for Mobile Databases



Wen-Chi Hou Chih-Fang Wang
Department of Computer Science, Southern Illinois University at Carbondale
Carbondale IL, 62901, U.S.A
Email: {hou, cfw}@cs.siu.edu

Meng Su
Department of Computer Science, Penn State Erie
Erie, PA 16509, USA
Email: mengsu@psu.edu


Abstract

Caching can reduce expensive data transfers and improve the performance of mobile computing. In order to reuse caches after short disconnections, invalidation reports are broadcast to clients to identify outdated items. Detailed reports may not be desirable because they can consume too much bandwidth. On the other hand, false invalidations may occur if accurate timing of updates is not provided. In this research, we aim to reduce the false invalidation rates of cached items. From our analysis, we found that false invalidation rates are closely related to clients' reconnection patterns (i.e., the distribution of the time spans between disconnections and reconnections). We show that, in theory, for any given reconnection pattern a report with a minimal false invalidation rate can be derived. Experimental results confirm that our method is more effective in reducing the false invalidation rate than others.
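
To illustrate what a false invalidation is, the small simulation below estimates how often a reconnecting client drops a still-valid copy when the report only states that an item changed at some point before the end of a coarse window: every client whose copy was cached after the real update but not after that window boundary invalidates needlessly. The distributions and window choice are assumptions, not the derivation of the optimal report in the paper.

    # Illustrative sketch only: estimating the false invalidation rate caused by a coarse
    # report. The exponential reconnection pattern and the window choice are assumptions.

    import random

    def false_invalidation_rate(update_time, window_end, reconnect_gap_sampler,
                                now=100.0, trials=10_000):
        """Fraction of reconnecting clients that falsely invalidate the item."""
        false = 0
        for _ in range(trials):
            cached_at = now - reconnect_gap_sampler()   # when this client last refreshed the item
            if update_time < cached_at <= window_end:
                false += 1                              # copy is valid, yet the report forces a drop
        return false / trials

    # The item really changed at t=60, but the report only locates the change before t=90.
    # Clients reconnect after exponentially distributed gaps with a mean of 20 time units.
    rate = false_invalidation_rate(update_time=60.0, window_end=90.0,
                                   reconnect_gap_sampler=lambda: random.expovariate(1 / 20.0))
    print(f"estimated false invalidation rate: {rate:.2f}")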

 


Ontology-based heterogeneous XML data integration



Christophe Cruz
Active3D-Lab, 2 rue Renée Char – BP 66606, 21066 Dijon Cedex, France
Email: christophe.cruz@khali.u-bourgogne.fr

Christophe Nicolle
Le2i – UMR CNRS 5158, Université de Bourgogne, BP 47870 – 21078 Dijon, France
Email: cnicolle@u-bourgogne.fr


Abstract

In this paper we present an ontology-based method for formalizing implicit semantics, and we suggest mechanisms to semantically integrate XML schemas and documents as well. After a survey of database interoperability, we present our semantic integration approach by explaining the nature of the ontology. The article then presents our integration method for XML data and schemas using a generic ontology.
 

 


Beyond Keywords and Hierarchies



Ian Hopkins and Julita Vassileva
Department of Computer Science,
University of Saskatchewan,
57 Campus Drive, Saskatoon S7N 5A9, Canada
Email: [ikh328@mail.usask.ca; jiv@cs.usask.ca]


Abstract

As our ability to store information increases, the mechanisms we employ to access that information become ever more important. In this paper, we present Archosum, a prototype of an organizational system that attempts to encapsulate the benefits of both hierarchical and keyword systems. By introducing abstract entities, Archosum provides a simple interface with which users can build and maintain powerful relationship-based organizations. We compared Archosum to two alternative systems in a user study. Through this study we begin to expose some of the advantages and disadvantages of each of these three approaches to designing an organizational system. Furthermore, we begin to consider how organizational systems will work when distinct users create organizations for collections, and how sharing might be facilitated using Archosum.

 


CYCLADES: An Environment for the Cooperative Management of Digital Information



Tom Gross1
Faculty of Media
Bauhaus-University Weimar
Bauhausstr. 11
99423 Weimar, Germany
Email: tom.gross(at)medien.uni-weimar.de


Dian Tan1
DaimlerChrysler AG, Research and Technology
P.O. Box 2360
89013 Ulm, Germany
Email: dian.tan(at)daimlerchrysler.com


Wido Wirsam
Fraunhofer Institute for Applied Information Technology FIT
Schloss Birlinghoven
53754 St. Augustin Germany
Email: wido.wirsam(at)fit.fraunhofer.de


Abstract

Knowledge management is often viewed as a structured process of eliciting knowledge, storing knowledge, and later retrieval by individuals. In this paper we argue that knowledge management should be seen as a dynamic process—an interaction between experts. Therefore, environments should support the cooperative management of information in workgroups or online communities. We start with a motivation for this cooperative perspective of supporting knowledge management through support for the creation and exchange of knowledge in communities. We introduce the CYCLADES environment—an open cooperative virtual archive environment based on open archives. We report in detail on the requirements analysis and functional design, on the specification, on the implementation, as well as on the evaluation of the environment. Finally, we draw conclusions for the design and implementation of cooperative knowledge management.
 

