Please use this identifier to cite or link to this item: http://archive.nnl.gov.np:8080/handle/123456789/20
Title: Supporting consistencies in multi-language knowledge sharing
Authors: Pariyar, Amit
Keywords: Knowledge sharing
Cross-site content
Websites
Content categories
Issue Date: 16-Jan-2018
Abstract: The goal of this thesis is to support the design of multi-language knowledge sharing systems, with a focus on consistency in the content shared among communities. Although unprecedented growth in online collaboration has attracted diverse communities to participate in knowledge sharing, for example resource-rich and resource-poor communities, it has also increased the possibility of inconsistency in the shared content. This is problematic for a multi-language knowledge sharing system because it is not practical to state consistency rules in advance for content shared among communities. Consequently, the design of multi-language knowledge sharing has to shift its focus from consistency rules to the cases that cause inconsistency in the shared content. Cases such as omitted content, content updates that are not shared, and the presence of conflicting content are expected to occur in collaboration and are potential causes of inconsistency. Although such cases may seem trivial at first, the complexity increases because each community participates in its own language, so inconsistent content is shared in several languages. Such cases can also cause inconsistency at global and local scales, leading to globally and locally shared inconsistent content. Regional discrepancies, arising from inconsistent content shared with communities in several geographic regions, are equally anticipated in knowledge sharing. Another problem is the constraint on content consistency imposed by the divergent knowledge sharing goals of communities. Where the goal is to leverage knowledge equally, exact correspondence in shared content is preferred, with a rigid consistency policy; where the goal is to customize knowledge sharing, sharing needs to be restricted to specific languages and specific communities, with a non-rigid consistency policy.
Building on the consequences of sharing inconsistent content and on the constraint on content consistency that arises from the disparate knowledge sharing goals of communities, this thesis makes the following contributions towards the design of multi-language knowledge sharing systems.
1. Synchronization of User Editing Activities to Detect Inconsistency in Multilingual Content. The challenge of leveraging knowledge equally among communities is heightened by participation in several languages. Inconsistency due to omitted content, updated content that is not shared, and content conflict occurs among languages, which is undesirable to communities. To deal with inconsistency in multilingual content, a process-based technique is proposed to detect missing content, updated facts or information, and content conflict between languages. The technique is based on the concept of synchronizing user editing activities, which provides an alternative to content-based techniques. To realize this concept, a state transition model is proposed that defines the states of multilingual content, a set of actions, and transition functions. Inconsistency detection rules are then designed using combinations of states in multilingual content. Applying the proposed process-based technique to multilingual Wikipedia articles in English and Nepali gave satisfactory results, with an average precision of 88% and a recall of 86% in detecting inconsistency. Because the technique is not language specific, it has an advantage over content-based techniques in supporting a variety of languages.
2. Guidelines on Consistency from Preferences in Sharing Specific Content Categories. Given that several content categories are published on websites and shared among communities, an analysis based on propagation is proposed to examine the influence of specific content categories on preferences in sharing.
The approach is to qualitatively compare the content of webpages and examine its propagation among country-specific websites, first in the website graph (interconnecting the available websites) and then in website pairs. For this study, 480 webpages from 80 websites representing 10 global brands (Nivea, 3M, Starbucks, Acer, Samsung, KPMG, HP, Nestle, Avon, John Deere) are analyzed. A total of 480 comparisons of webpages in the website graph and 1680 comparisons in website pairs are performed to determine the preferences in sharing specific content categories. Examining propagation in the website graph revealed that “Corporate Information” tends to be shared globally and “Customer Support Information” tends to be shared locally, while “Product Information” tends to be suitable for local and regional sharing. The implication is a set of guidelines on the content consistency needed for specific content: for example, global consistency is required for ‘corporate related information’, while local consistency is required for ‘customer support related information’. Examining propagation in website pairs revealed coupling between websites, with high coupling for ‘corporate related information’ that decreases as the content becomes more local. The implication is a set of guidelines on setting priority, where high coupling means higher priority for content consistency; for example, ‘corporate related information’ has high priority in content consistency. Such guidelines are useful in dealing with global and local inconsistency in cross-site content.
3. Guidelines on Consistency from Preferences in Sharing within and beyond Geographic Regions. Country-specific websites that offer various content categories also represent geographic regions such as Europe, Asia Pacific, and North America. This is important to consider because regional discrepancies in cross-site content are found to be present in such websites.
An analysis based on propagation is proposed to determine the preferences of communities in sharing within or beyond specific geographic regions. The approach is to qualitatively compare the content of webpages and examine its propagation in several geographic regions. For this study, 80 websites from the geographic regions North America, Asia Pacific, Europe, and Middle East-Africa are analyzed. A total of 240 comparisons of webpages within regions and 1440 comparisons among regions are performed to determine the preferences in sharing for specific regions. Examining propagation within geographic regions revealed high coupling among websites of countries in Europe and low coupling among websites inside North America. Websites in Europe tend to be more dependent and prefer to share most content, while websites inside North America tend to be autonomous and prefer to participate less in sharing. The implication is the guideline that, among all regions, the European region is more vulnerable to intra-regional discrepancies and has higher priority for content consistency. Examining propagation among geographic regions further suggested the autonomous nature of websites in North America. Guidelines on higher priorities for content consistency are suggested among Asia Pacific, Europe, and Middle East-Africa to avoid inter-regional discrepancies in cross-site content.
4. Deploying Pattern of Sharing to Propagate Content Updates. To support content consistency while allowing community preferences in customizing knowledge sharing, a technique is proposed that is based solely on the concept of propagating content updates restricted to specific languages or specific communities. A pattern of sharing, (a) Internationalization, (b) Regionalization, or (c) Localization, with rules for restricting the publication and description of content to specific languages or communities, is deployed in knowledge sharing.
Community preferences specified with a pattern of sharing can deal with cross-site content inconsistency by scaling content specificity for global, regional, or local communities and by propagating content updates confined to specific communities. The advantage is its simplicity: it can be applied automatically or executed manually as policies.
5. Support for Consistency without Reliance on Content Processing. The limited support for resource-poor languages stems from the dependence on content processing and the need for massive linguistic corpora to train systems, which are unfortunately not available for resource-deprived communities. To support content consistency in a variety of languages, including resource-poor languages, the techniques proposed in this thesis do not require content processing. The techniques are based on the novel concepts of synchronizing user editing actions and restricting content updates with propagation, which are not language specific and hence support community participation, including that of resource-deprived communities. Through techniques that are simple and applicable to a variety of languages, along with guidelines for content consistency to deal with (a) inconsistency in multilingual content, (b) global and local inconsistency, and (c) regional discrepancies in cross-site content, this thesis contributes to the design of multi-language knowledge sharing systems that cater to the knowledge sharing goals of communities, both for leveraging knowledge equally and for customization in knowledge sharing.
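The pattern-of-sharing idea in contribution 4 can be illustrated with a minimal Python sketch. This is not the thesis implementation: the `Site` record, the `Pattern` enum, the `propagation_targets` function, and the sample sites are all hypothetical names introduced here, and the rule it applies (Internationalization reaches every other site, Regionalization reaches sites in the same region, Localization reaches none) is an assumed simplification of the restriction rules the abstract mentions.

```python
from dataclasses import dataclass
from enum import Enum


class Pattern(Enum):
    """Hypothetical encoding of the three patterns named in the abstract."""
    INTERNATIONALIZATION = "global"   # update is shared with every community
    REGIONALIZATION = "regional"      # update stays inside the source's region
    LOCALIZATION = "local"            # update stays on the source site only


@dataclass(frozen=True)
class Site:
    """A country-specific website (illustrative fields, not from the thesis)."""
    country: str
    region: str
    language: str


def propagation_targets(source, sites, pattern):
    """Return the sites that should receive an update made on `source`.

    Assumed rule: the pattern alone decides how far an update travels.
    """
    others = [s for s in sites if s != source]
    if pattern is Pattern.INTERNATIONALIZATION:
        return others
    if pattern is Pattern.REGIONALIZATION:
        return [s for s in others if s.region == source.region]
    return []  # LOCALIZATION: nothing is propagated


# Made-up sample sites, used only to exercise the function.
SITES = [
    Site("NP", "Asia Pacific", "ne"),
    Site("JP", "Asia Pacific", "ja"),
    Site("DE", "Europe", "de"),
    Site("US", "North America", "en"),
]

if __name__ == "__main__":
    source = SITES[0]
    for pattern in Pattern:
        targets = propagation_targets(source, SITES, pattern)
        print(pattern.name, [s.country for s in targets])
```

Because the rule looks only at site metadata and never at the text of the content, a sketch like this stays independent of language, which is the property the abstract emphasizes for resource-poor languages.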
URI: http://103.69.125.248:8080/xmlui/handle/123456789/20
Appears in Collections: 000 Computer science, information & general works

Files in This Item:
File: phd-thesis-amit-final.pdf
Size: 3.51 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.