Abdelsalam Almarimi, Pokorný Jaroslav
Schema Management for Data Integration: A Short Survey
In: Acta Polytechnica, Volume: 45, No: 1, Czech Technical University in Prague, Prague, 2005, pp. 24-27.
Schema management is a basic problem in many database application domains such as data integration systems. Users need to access and manipulate data from several databases. In this context, in order to integrate data from distributed heterogeneous database sources, data integration systems demand the resolution of several issues that arise in managing schemas. In this paper, we present a brief survey of the problem of schema matching which is used for solving problems of schema integration processing. Moreover, we propose a technique for integrating and querying distributed heterogeneous XML schemas.
Abdelsalam Almarimi, Pokorný Jaroslav
A Mediation Layer for Heterogeneous XML Schemas
In: International Journal of Web Information Systems, Volume: 1, No: 1, Troubador Publishing LTD, 2005, pp. 25-32.
Presented at: iiWAS2004 Information Integration and Web Based Applications & Services, 27-29.09.2004, Jakarta,
Indonesia.
This paper describes an approach for the mediation of heterogeneous XML schemas. The approach is proposed as a tool for an XML data integration system. A global XML schema is specified by the designer to provide a homogeneous view over heterogeneous XML data. An XML mediation layer is introduced to manage two tasks: (1) establishing appropriate mappings between the global schema and the schemas of the sources; (2) querying XML data sources in terms of the global schema. The XML data sources are described in the XML Schema language. The former task is performed through a semi-automatic process that generates local and global paths. A tree structure for each XML schema is constructed and represented in a simple form. This form is in turn used for manually assigning indices that match local paths to the corresponding global paths. By gathering all paths with the same index, the equivalent local and global paths are grouped automatically, and an XML Metadata Document is constructed. For the latter task, an XML Query Translator is described that translates a global user query into local queries by using the mappings defined in the XML Metadata Document.
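The index-based grouping described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the schema names, the paths, and the functions `group_paths_by_index` and `translate` are hypothetical, and the in-memory dictionary merely stands in for the XML Metadata Document.

```python
from collections import defaultdict


def group_paths_by_index(indexed_paths):
    """Group (index, schema, path) triples so that paths sharing an index end up together."""
    groups = defaultdict(lambda: {"global": [], "local": []})
    for index, schema, path in indexed_paths:
        kind = "global" if schema == "global" else "local"
        groups[index][kind].append((schema, path))
    return dict(groups)


def translate(global_path, groups):
    """Return the (source, local_path) pairs that were matched to a global path."""
    for entry in groups.values():
        if any(path == global_path for _, path in entry["global"]):
            return entry["local"]
    return []


# Hypothetical schemas and paths, invented for illustration only.
# The integer is the manually assigned index that links equivalent paths.
indexed_paths = [
    (1, "global", "/library/book/title"),
    (1, "source_a", "/books/item/name"),
    (1, "source_b", "/catalog/publication/title"),
    (2, "global", "/library/book/author"),
    (2, "source_a", "/books/item/writer"),
]

groups = group_paths_by_index(indexed_paths)
local_targets = translate("/library/book/title", groups)
```

Here a query over the global path `/library/book/title` would be rewritten into one local query per entry in `local_targets`; a global path with no assigned index yields an empty list.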
Bednárek David
Statická typová kontrola XSLT programů [Static Type Checking of XSLT Programs]
In: Proceedings of ITAT 2005, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), Prírodovedecká fakulta Univerzity Pavla Jozefa Šafárika, Košice, 2005, pp. 393-401.
ISBN: 80-7097-609-8
Presented at: ITAT 2005, 20.9. - 25.9.2005, Račkova dolina,
Slovakia.
Bednárek David, Obdržálek David, Yaghob Jakub, Zavoral Filip
Data Integration Using DataPile Structure
In: Proceedings of the 9th East-European Conference on Advances in Databases and Information Systems, Tallinn, 2005, pp. 178-188.
Presented at: 9th East-European Conference on Advances in Databases and Information Systems (ADBIS 2005), 12.9.-15.9.2005, Tallinn,
Estonia.
One of the areas of data integration covers systems that maintain coherence among a heterogeneous set of databases. Such a system repeatedly collects data from the local databases, synchronizes them, and pushes the updates back. One of the key problems in this architecture is conflict resolution. When data in a less relevant data source changes, it should not cause any data change in a store with higher relevancy. To meet such requirements, we propose a DataPile structure with the following main advantages: effective storage of historical versions of data, straightforward adaptation to global schema changes, separation of data conversion and replication logic, and simple implementation of data relevance. The key usage of such mechanisms is in projects with the following traits or requirements: integration of heterogeneous data from sources with different reliability, data coherence of databases whose schemas differ, data changes performed on local databases, and minimal load on the central database.
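The relevance rule stated in this abstract (a change in a less relevant source must not override data from a more relevant one, while historical versions are kept) can be sketched as follows. The abstract does not describe the actual DataPile layout, so the class `AttributeHistory`, its fields, the source names, and the numeric relevance scale are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeHistory:
    """Accepted versions of one attribute value, oldest first (hypothetical structure)."""
    versions: list = field(default_factory=list)  # entries: (value, source, relevance)

    def current(self):
        """Return the most recently accepted version, or None if there is none yet."""
        return self.versions[-1] if self.versions else None

    def offer(self, value, source, relevance):
        """Accept an update unless the current value came from a more relevant source."""
        cur = self.current()
        if cur is not None and relevance < cur[2]:
            return False  # a less relevant source must not override the stored value
        if cur is not None and cur[0] == value:
            return False  # value unchanged, so no new version is stored
        self.versions.append((value, source, relevance))
        return True


# Hypothetical sources and relevance levels (higher means more relevant).
phone = AttributeHistory()
first = phone.offer("111", "hr_system", 10)   # accepted: no previous value
second = phone.offer("222", "web_form", 3)    # rejected: lower relevance
third = phone.offer("333", "hr_system", 10)   # accepted: same relevance, new value
```

Because rejected updates never reach `versions`, the list doubles as the history of accepted values, which is one simple way to keep older versions available.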
Dokulil Jiří, Yaghob Jakub, Zavoral Filip
Evoluce replikačních algoritmů v stohově orientovaných systémech [Evolution of Replication Algorithms in Pile-Oriented Systems]
In: Proceedings of ITAT 2005, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), Prírodovedecká fakulta Univerzity Pavla Jozefa Šafárika, Košice, 2005, pp. 393-401.
ISBN: 80-7097-609-8
Presented at: ITAT 2005, 20.9. - 25.9.2005, Račkova dolina,
Slovakia.
Obdržálek David, Kulhánek Jiří
Statická typová kontrola XSLT programů [Static Type Checking of XSLT Programs]
In: Proceedings of ITAT 2005, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), Prírodovedecká fakulta Univerzity Pavla Jozefa Šafárika, Košice, 2005, pp. 393-401.
ISBN: 80-7097-609-8
Presented at: ITAT 2005, 20.9. - 25.9.2005, Račkova dolina,
Slovakia.
Pokorný Jaroslav
Digitální knihovny v prostředí Sémantického webu [Digital Libraries in the Semantic Web Environment]
In: Sborník z 10. ročníku semináře AKP 2005 (automatizace knihovnických procesů - 10.), (Ed. D. Tkačíková, B. Ramajzlová), VIC ČVUT, 2005, pp. 64-73.
Presented at: AKP 2005 (Automatizace knihovnických procesů) 10. ročník semináře, 3.5.-4.5.2005, Liberec,
Czech Republic.
Digital libraries (DLs) contribute to the development of the Semantic Web and can at the same time make use of its technological components. This makes it possible to achieve better data management in DLs and easier integration of multiple DLs, as well as greater opportunities for interaction with other information sources. The idea behind the Semantic Web is to extend web pages with markup that captures at least part of the meaning of the page content. This semantic markup means adding certain metadata that provide formal semantics for web content. Semantic Web projects build on technologies that are being developed as standards. These include the languages XML, XML Schema, RDF and RDF Schema. These languages are used to record metadata, part of which is organized in ontologies. A further level of the Semantic Web uses logic languages. The basis of processing in a web conceived in this way is provided by programs called software agents. The aim of the paper is to introduce the technologies of the Semantic Web and to show their application in building DLs.
Pokorný Jaroslav
Směrem k Sémantickému Webu [Towards the Semantic Web]
In: Sborník příspěvků 20. ročníku konference Moderní databáze, KOMIX, Roudnice nad Labem, 2005, pp. 15-24.
Presented at: 20. ročník konference Moderní databáze, 26.5.-27.5.2005, Hotel Amber, Roudnice nad Labem,
Czech Republic.
Current web search engines based on text information retrieval techniques are unable to exploit the semantic knowledge inside a web page and therefore cannot give satisfactory answers to user queries. A possible solution appears to be the so-called Semantic Web, which Tim Berners-Lee described in his vision at the end of the 1990s. The idea behind the Semantic Web is to extend web pages with markup that captures at least part of the meaning of the page content. This semantic markup means adding certain metadata that provide formal semantics for web content. Semantic Web projects build on several technologies, of which the fundamental ones are already standardized or at least recommended. These include the languages XML, XML Schema, RDF and RDF Schema. These languages are used to record metadata, some of which are organized in so-called ontologies. A further level of the Semantic Web uses logic languages. The basis of processing in a web conceived in this way is provided by software agents, i.e. programs that work autonomously and proactively. The aim of the paper is to introduce the technologies supporting the creation of the Semantic Web, to show its architecture, and to mention some projects already under way that aim at building intelligent web information services, personalized web sites and semantically enhanced search engines.
Pokorný Jaroslav, Smižanský J.
Page Content Rank: an Approach to the Web Content Mining
In: Proceedings of IADIS International Conference Applied Computing, Volume: 1, IADIS Press, 2005, pp. 289-296.
ISBN: 3-540-31198-X
Presented at: IADIS International Conference Applied Computing, 22.2.-25.2.2005, Algarve,
Portugal.
Methods of web data mining can be divided into several categories according to the kind of information mined and the goals each category sets: Web Structure Mining (WSM), Web Usage Mining (WUM), and Web Content Mining (WCM). The objective of this paper is to propose a new WCM method of page relevance ranking based on exploration of the page content. The method, which we call Page Content Rank (PCR), combines a number of heuristics that seem to be important for analysing the content of Web pages. The importance of a page is determined on the basis of the importance of the terms it contains. The importance of a term is specified with respect to a given query q and is based on its statistical and linguistic features. As the source set of pages for mining we use the set of pages returned by a search engine in response to the query q. PCR uses a neural network as its inner classification structure. We describe an implementation of the proposed method and compare its results with another existing classification system, the PageRank algorithm.
Pokorný Jaroslav
Database architectures: current trends and their relationships to environmental data management
In: Proceedings of the 19th Conference EnviroInfo, Masaryk University, Brno, 2005, pp. 24-28.
Presented at: 19th Conference EnviroInfo (Informatics for Environmental Protection, Networking Environmental Information), 7.9.-9.9.2005, Brno,
Czech Republic.
Ever increasing environmental demands from customers, authorities and governmental organizations, as well as new business control functions, are being integrated into environmental management systems (EMSs). With the production of huge data sets and their processing in real-time applications, the need for environmental data management has grown significantly. Current trends in database development and associated research meet these challenges. The paper discusses recent advances in database technologies and attempts to highlight them with respect to the requirements of EMSs.
Pokorný Jaroslav, Reschke J.
Exporting relational data into a native XML store
Skopal Tomáš, Pokorný Jaroslav, Snášel Václav
Nearest Neighbours Search using the PM-tree
In: Proceedings of The 10th International Conference on Database Systems for Advanced Applications, LNCS 3453, Springer-Verlag, 2005, pp. 803-815.
Presented at: DASFAA 2005, 17.4.-20.4.2005, Beijing,
China.
Snášel Václav, Moravec Pavel, Pokorný Jaroslav
WordNet Ontology Based Model for Web Retrieval
In: Proceedings of International Workshop on Challenges in Web Information Retrieval and Integration (WIRI) 2005, IEEE Computer Society Press, 2005, pp. 231-236.
Presented at: International Workshop on Challenges in Web Information Retrieval and Integration, 8.4.-9.4. 2005, Tokyo,
Japan.
It is well known that ontologies will become a key piece, as they allow making the semantics of Semantic Web content explicit. In spite of the big advantages that the Semantic Web promises, there are still several problems to solve. Those concerning ontologies include their availability, development and evolution. In the area of information retrieval, the dimension of document vectors plays an important role. Firstly, with higher index dimensions the indexing structures suffer from the "curse of dimensionality" and their efficiency rapidly decreases. Secondly, we may not use the exact words when looking for a document, and thus we miss some relevant documents. LSI is a numerical method which discovers latent semantics in documents by creating concepts from existing terms. In this paper we present a basic method of mapping LSI concepts onto a given ontology (WordNet), used both for retrieval recall improvement and dimension reduction. We offer experimental results for this method on a subset of the TREC collection consisting of Los Angeles Times articles.
Snášel Václav, Moravec Pavel, Pokorný Jaroslav
Using BFA with wordnet ontology based model for web retrieval
In: Proceedings of the First IEEE International Conference on Signal-Image Technology & Internet-Based Systems (SITIS`05), 2005, pp. 254-259.
Presented at: First IEEE International Conference on Signal-Image Technology & Internet-Based Systems (SITIS`05), 27.11.-1.12.2005, Yaoundé,
Cameroon.
In the area of information retrieval, the dimension of document vectors plays an important role. We may need to find a few words or concepts which characterize the document based on its contents, in order to overcome the "curse of dimensionality", which makes indexing of high-dimensional data problematic. To do so, we earlier proposed a WordNet and WordNet+LSI (Latent Semantic Indexing) based model for dimension reduction. While LSI works on the whole collection, another procedure of feature extraction (and thus dimension reduction) exists, using binary factorization. The procedure is based on the search for attractors in a Hopfield-like associative memory. Separation of true attractors (factors) from spurious ones is based on calculation of their Lyapunov function. When applied to textual data, the procedure performed well and, moreover, showed sensitivity to the context in which the words were used. In this paper, we suggest that the binary factorization may benefit from the WordNet filtration.
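The attractor search and the Lyapunov function mentioned in this abstract can be illustrated with a classical bipolar Hopfield network. This is a textbook stand-in, not the Hopfield-like binary factorization procedure of the paper: the Hebbian training rule, the +1/-1 coding, the stored patterns, and all function names are assumptions chosen for brevity.

```python
def train_hebbian(patterns):
    """Build a symmetric weight matrix with zero diagonal from +1/-1 patterns."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w


def energy(w, s):
    """Lyapunov (energy) function E(s) = -1/2 * sum_ij w_ij * s_i * s_j."""
    n = len(s)
    return -0.5 * sum(w[i][j] * s[i] * s[j] for i in range(n) for j in range(n))


def settle(w, state, max_sweeps=20):
    """Update units one at a time until no unit changes, i.e. an attractor is reached."""
    s = list(state)
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(s)):
            local_field = sum(w[i][j] * s[j] for j in range(len(s)))
            new_value = 1 if local_field >= 0 else -1
            if new_value != s[i]:
                s[i] = new_value
                changed = True
        if not changed:
            break
    return s


# Two hypothetical, mutually orthogonal patterns to store.
stored = [
    [1, 1, 1, 1, -1, -1, -1, -1],
    [1, -1, 1, -1, 1, -1, 1, -1],
]
w = train_hebbian(stored)

noisy = [1, 1, 1, -1, -1, -1, -1, -1]  # first pattern with one unit flipped
attractor = settle(w, noisy)
```

In this toy setting the noisy input settles into the first stored pattern, and the energy of that attractor is lower than the energy of the starting state. Comparing energies of candidate attractors is the kind of check that an energy-based separation of true and spurious attractors relies on.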
Vojtáš Peter
Proceedings of ITAT 2005, Information Technologies - Applications and Theory
In: Proceedings of ITAT 2005, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), Prírodovedecká fakulta Univerzity Pavla Jozefa Šafárika, Košice, 2005.
ISBN: 80-7097-609-8
Vojtáš Peter
Fuzzy Logic as an Optimization Task
In: Fuzzy Logic and Technology, (Ed. Sobrevilla P., Montseny E.), Barcelona, 2005, pp. 781-786.
ISBN: 84-7683-872-3
Presented at: EUSFLAT - LFA 2005. Conference of the European Society for Fuzzy Logic and Technology /13./, Rencontres Francophones sur la Logique Floue et ses Applications /11./, 7.9.-9.9.2005, Barcelona, Spain.