Abdelsalam Almarimi, Pokorný Jaroslav
Schema Management for Data Integration: A Short Survey
In: Acta Polytechnica, Volume: 45, No: 1, Czech Technical University in Prague, Prague, 2005, pp. 24-27.
Schema management is a basic problem in many database application domains such as data integration systems. Users need to access and manipulate data from several databases. In this context, in order to integrate data from distributed heterogeneous database sources, data integration systems demand the resolution of several issues that arise in managing schemas. In this paper, we present a brief survey of the problem of schema matching which is used for solving problems of schema integration processing. Moreover, we propose a technique for integrating and querying distributed heterogeneous XML schemas.
Abdelsalam Almarimi, Pokorný Jaroslav
A Mediation Layer for Heterogeneous XML Schemas
In: International Journal of Web Information Systems, Volume: 1, No: 1, Troubador Publishing LTD, 2005, pp. 25-32.
Presented at: iiWAS2004 Information Integration and Web Based Applications & Services, 27-29.09.2004, Jakarta,
Indonesia.
This paper describes an approach for mediation of heterogeneous XML schemas. The approach is proposed as a tool for an XML data integration system. A global XML schema is specified by the designer to provide a homogeneous view over heterogeneous XML data. An XML mediation layer is introduced to manage: (1) establishing appropriate mappings between the global schema and the schemas of the sources; (2) querying XML data sources in terms of the global schema. The XML data sources are described in the XML Schema language. The former task is performed through a semi-automatic process that generates local and global paths. A tree structure for each XML schema is constructed and represented in a simple form. This form is in turn used for manually assigning indices that match local paths to the corresponding global paths. By gathering all paths with the same indices, the equivalent local and global paths are grouped automatically, and an XML Metadata Document is constructed. For the latter task, an XML Query Translator is described that translates a global user query into local queries by using the mappings defined in the XML Metadata Document.
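The grouping step in the abstract above (gather all paths that carry the same manually assigned index, then use the groups to rewrite a global path into local ones) can be illustrated with a minimal sketch. The schema names, paths, and the dictionary layout below are hypothetical and are not taken from the paper; the sketch also assumes at most one path per schema for each index.

```python
from collections import defaultdict

# Hypothetical index assignments: (schema name, path, manually assigned index).
ASSIGNMENTS = [
    ("global", "/library/book/title", 1),
    ("global", "/library/book/author", 2),
    ("source_a", "/catalog/item/name", 1),
    ("source_a", "/catalog/item/writer", 2),
    ("source_b", "/books/entry/heading", 1),
]


def build_mappings(assignments):
    """Group paths that share an index: global path -> {source: local path}."""
    by_index = defaultdict(dict)
    for schema, path, index in assignments:
        by_index[index][schema] = path
    mappings = {}
    for paths in by_index.values():
        global_path = paths.get("global")
        if global_path is None:
            continue  # no global counterpart, so there is nothing to map
        mappings[global_path] = {
            schema: path for schema, path in paths.items() if schema != "global"
        }
    return mappings


def translate(global_path, mappings):
    """Return the local path for each source that maps the given global path."""
    return mappings.get(global_path, {})


MAPPINGS = build_mappings(ASSIGNMENTS)
```

A real metadata document would be serialized as XML and a query translator would rewrite whole queries; the dictionary here only stands in for the path-level mapping.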
Bartoň Stanislav, Zezula Pavel
RhoIndex - An Index for Graph Structured Data
Presented at: 8th International DELOS Workshop on Future Digital Library Management Systems, 29.3.-1.4.2005, Schloss Dagstuhl,
Germany.
This paper introduces rho-index, an indexing structure for path search in graph-structured data. It is based on a graph segmentation S(G) that is meant to represent the indexed graph G in a simpler manner while retaining properties similar to those of G. This is achieved using graph transformations and a special type of matrix that represents the transformed graph.
Bednárek David
Statická typová kontrola XSLT programů [Static Type Checking of XSLT Programs]
In: Proceedings of ITAT 2005, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), Prírodovedecká fakulta Univerzity Pavla Jozefa Šafárika, Košice, 2005, pp. 393-401.
ISBN: 80-7097-609-8
Presented at: ITAT 2005, 20.9. - 25.9.2005, Račkova dolina,
Slovakia.
Bednárek David, Obdržálek David, Yaghob Jakub, Zavoral Filip
Data Integration Using DataPile Structure
In: Proceedings of the 9th East-European Conference on Advances in Databases and Information Systems, Tallinn, 2005, pp. 178-188.
Presented at: 9th East-European Conference on Advances in Databases and Information Systems (ADBIS 2005), 12.9.-15.9.2005, Tallinn,
Estonia.
One area of data integration covers systems that maintain coherence among a heterogeneous set of databases. Such a system repeatedly collects data from the local databases, synchronizes them, and pushes the updates back. One of the key problems in this architecture is conflict resolution. When data in a less relevant data source changes, it should not cause any data change in a store with higher relevance. To meet such requirements, we propose a DataPile structure with the following main advantages: effective storage of historical versions of data, straightforward adaptation to global schema changes, separation of data conversion and replication logic, and simple implementation of data relevance. Such mechanisms are mainly useful in projects with the following traits or requirements: integration of heterogeneous data from sources with different reliability, data coherence among databases whose schemas differ, data changes performed on local databases, and minimal load on the central database.
Dokulil Jiří, Yaghob Jakub, Zavoral Filip
Evoluce replikačních algoritmů v stohově orientovaných systémech [Evolution of Replication Algorithms in Pile-Oriented Systems]
In: Proceedings of ITAT 2005, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), Prírodovedecká fakulta Univerzity Pavla Jozefa Šafárika, Košice, 2005, pp. 393-401.
ISBN: 80-7097-609-8
Presented at: ITAT 2005, 20.9. - 25.9.2005, Račkova dolina,
Slovakia.
Farský Miroslav, Neruda Martin, Neruda Roman
Mass and energy flows in consequences of company environmental accounting
In: Proceeding of the Environmental Accounting - Sustainable Development Indicators, International Conference EA-SDI 2005, (Ed. Ritschelová I.), Jan Evangelista Purkyně University and Charles University, Ústí nad Labem and Prague, 2005, pp. 356-362.
ISBN: 80-7044-676-5
Presented at: International Conference EA-SDI 2005, 26.9.-27.9.2005,
Czech Republic.
During the implementation of an environmental accounting system in a company, one of the most important pieces of information to obtain is a detailed understanding of material flows (raw materials, semi-finished products, final products, and wastes) and of flows of different types of energy (purchase, sale, and wastage), considered in terms of their consequences for the company. In the article, the authors 1) study the quantification of the flows and the accuracy of their measurement, and 2) provide an environmental accounting statement with the help of standards, indices, and statistical trend analysis.
Hájek Petr
Making fuzzy description logic more general
In: Fuzzy Sets and Systems, Volume: 154, 2005, pp. 1-15.
A version of fuzzy description logic based on the basic (continuous t-norm based) fuzzy predicate logic BL is presented. Problems of satisfiability, validity, and subsumption of concepts are discussed and reduced to problems of fuzzy propositional logic known to be decidable for any continuous t-norm. For the Łukasiewicz t-norm, some stronger results are obtained.
Holeňa Martin
Získávání logických tvrzení z dat jako významný směr dobývání znalostí z dat [Extraction of Logical Statements from Data as an Important Direction in Data Mining]
In: Proceedings of ITAT 2005, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), Prírodovedecká fakulta Univerzity Pavla Jozefa Šafárika, Košice, 2005, pp. 311-322.
ISBN: 80-7097-609-8
Presented at: ITAT 2005, 20.9. - 25.9.2005, Račkova dolina,
Slovakia.
The contribution deals with the extraction of logical statements from data, that is, with those data mining methods whose results can be expressed in the language of some formal logic. It gives a very brief overview of the broad spectrum of methods of this type, both methods based on statistical approaches and methods resting on the principles of machine learning, and it points out the specific character of methods based on artificial neural networks. For illustration, two particular methods of extracting logical statements from data are described in more detail. One is the GUHA method, which is based on observational logic and is probably the oldest method of extracting rules from data. The other is a method based on piecewise-linear multilayer perceptrons.
Húsek Dušan, Snášel Václav, Owais Suhail S. J., Krömer Pavel
Using Genetic Algorithms for Boolean Queries Optimization
In: Proceedings of the Ninth IASTED International Conference INTERNET AND MULTIMEDIA SYSTEMS AND APPLICATIONS, ACTA Press, 2005, pp. 178-184.
Presented at: Ninth IASTED International Conference INTERNET AND MULTIMEDIA SYSTEMS AND APPLICATIONS, 15.8.-17.8.2005, Honolulu,
Hawaii, USA.
Most information retrieval systems depend on Boolean queries. The performance of an information retrieval system is usually measured in terms of two different criteria, precision and recall. The optimization of any of its components is therefore a clear example of a multiobjective problem. However, although evolutionary algorithms have been widely applied in the information retrieval area, in all of these applications both criteria have been combined into a single scalar fitness function by means of a weighting scheme. In this paper, we deal with the use of genetic algorithms in information retrieval, specifically in the optimization of a Boolean query.
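The two criteria named in the abstract above, and the weighted scalar fitness it mentions, can be written down in a few lines. This is a minimal sketch using the standard set-based definitions of precision and recall; the function names and the weight `alpha` are illustrative assumptions and are not taken from the paper.

```python
def precision_recall(retrieved, relevant):
    """Standard set-based precision and recall of a retrieved document set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def weighted_fitness(retrieved, relevant, alpha=0.5):
    """Combine both criteria into one scalar; alpha is an assumed weight in [0, 1]."""
    precision, recall = precision_recall(retrieved, relevant)
    return alpha * precision + (1.0 - alpha) * recall
```

In a genetic algorithm, each candidate Boolean query would be evaluated against a collection to obtain its retrieved set, and a scalar such as `weighted_fitness` would rank the candidates. The choice of `alpha` fixes the trade-off between the two objectives in advance.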
Kudová Petra, Neruda Roman
Kernel Based Learning Methods: Regularization Networks and RBF Networks
In: Proceedings of the Sheffield Machine Learning Workshop, Springer Verlag, 2005, pp. 124-136.
ISBN: 3-540-29073-7
Presented at: Sheffield Machine Learning Workshop, 7.9.-10.9.2004, Sheffield,
Great Britain.
Kernel based learning methods are a subject of great interest at present. We discuss two kernel based learning methods, namely the Regularization Networks (RN) and the Radial Basis Function networks (RBF networks).
The RNs are derived from regularization theory, have been studied thoroughly from a function approximation point of view, and therefore have a very good theoretical background.
The RBF networks represent a model of artificial neural networks with both neuro-physiological and mathematical motivation. In addition, they may be treated as a generalised form of Regularization Networks, i.e. RN with an increased number of kernel functions.
We demonstrate the performance of both approaches in experiments that include both benchmark and real-life learning tasks. We claim that the performance of RN and RBF networks is comparable in terms of generalisation error. The RN approach usually leads to solutions with higher model complexity (a high number of base units). In these situations, the RBF networks can be used as a 'cheaper' alternative.
Kůrková Věra, Sanguineti Marcello
Learning with generalization capability by kernel methods of bounded complexity
In: Journal of Complexity, Volume: 21, Elsevier, 2005, pp. 350-367.
Learning from data with generalization capability is studied in the framework of minimization of regularized empirical error functionals over nested families of hypothesis sets with increasing model complexity. For Tikhonov's regularization with kernel stabilizers, minimization over restricted hypothesis sets containing for a fixed integer n only linear combinations of all n-tuples of kernel functions is investigated. Upper bounds are derived on the rate of convergence of suboptimal solutions from such sets to the optimal solution achievable without restrictions on model complexity. The bounds are of the form 1/sqrt(n) multiplied by a term that depends on the size of the sample of empirical data, the vector of output data, the Gram matrix of the kernel with respect to the input data, and the regularization parameter.
Linková Zdeňka
Data Integration in VirGIS and in the Semantic Web
Technical Report: V-922, ICS AS CR, Prague, 2005, 11 p.
Integration has been an acknowledged data processing problem for a long time. However, there is no universal tool for general data integration. Because of varying data descriptions, data heterogeneity, and the fact that data are not machine-readable, integration is not an easy task. The Semantic Web could improve this situation. Its idea is based on machine-understandable web data, which offers an opportunity for better automated processing. The Semantic Web is still a future vision, but there are already some features we can use. The paper describes how integration is solved in the mediation integration system VirGIS and discusses the use of currently available Semantic Web features to improve it. In line with the proposed changes, a new ontology that covers the data used in VirGIS is presented.
Linková Zdeňka
The Logic Summer School 2004
Technical Report: V-925, ICS AS CR, Prague, 2005, 10 p.
Logic is the foundational discipline of many sciences. Part mathematics, part philosophy, and part computing science, logic remains a core intellectual study and is increasingly relevant to practical concerns. It spreads into planning, program synthesis, circuit design, and discourse analysis. It underpins the entire science of artificial intelligence. To deepen my knowledge of the field of logic, I participated in the Logic Summer School. This report summarizes some information about it.
Linková Zdeňka, Nedbal Radim, Řimnáč Martin
Building Ontologies for GIS
Technical Report: V-932, ICS AS CR, 2005, 9 p.
Knowledge representation in geographic information systems (GIS) and the associated data processing present many challenges for researchers. Using ontologies for knowledge representation is among the most topical problems to solve. This involves ontology development as well as ontology reuse. The goal of the research described in this paper is to develop a specific ontology for a given GIS area.
Linková Zdeňka, Nedbal Radim
Building Ontologies for GIS - Part 2
Technical Report: V-938, ICS AS CR, 2005, 12 p.
Ontologies play an important role in knowledge representation. The GIS data area is one of the various fields in which ontologies can be useful. We consider data in a specific GIS domain and develop a new ontology. The result is described in this paper.
Linková Zdeňka
Data Integration in VirGIS and in the Semantic Web
In: Doktorandský den 05, (Ed. Hakl F.), MATFYZPRESS, Prague, 2005, pp. 87-93.
ISBN: 80-86732-56-8
Presented at: Institute of Computer Science Ph.D. Student's Days 05, 5.10.-7.10.2005, Nový Dvůr,
Czech Republic.
Integration has been an acknowledged data processing problem for a long time. However, there is no universal tool for general data integration. Because of varying data descriptions, data heterogeneity, and the fact that data are not machine-readable, integration is not an easy task. The Semantic Web could improve this situation. Its idea is based on machine-understandable web data, which offers an opportunity for better automated processing. The Semantic Web is still a future vision, but there are already some features we can use. The paper describes how integration is solved in the mediation integration system VirGIS and discusses the use of currently available Semantic Web features to improve it. In line with the proposed changes, a new ontology that covers the data used in VirGIS is presented.
Linková Zdeňka, Nedbal Radim
Building Ontology for VirGIS System
In: Proceedings of ITAT 2005, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), Prírodovedecká fakulta Univerzity Pavla Jozefa Šafárika, Košice, 2005, pp. 233-242.
ISBN: 80-7097-609-8
Presented at: ITAT 2005, 20.9. - 25.9.2005, Račkova dolina,
Slovakia.
Ontologies play an important role in knowledge representation, which involves ontology development as well as ontology reuse. The GIS (Geographical Information System) data area is one of the various fields in which ontologies can be useful. The goal of the research described in this paper is to develop a specific ontology for a given GIS domain. First, we describe a general methodology and the main tools for ontology development. Then a new ontology that covers the data used in the VirGIS integration system is presented. The paper describes the VirGIS-specific ontology as well as a list of spatio-temporal data ontologies that are available and can be used for describing general data features.
Nedbal Radim
Relational Databases with Ordered Relations
In: Logic Journal of the IGPL, Volume: 13, 2005, pp. 587-597.
Presented at: ERCIM 2004, 12.-17.07.2004, Vienna,
Austria.
The paper deals with expressing preferences in the framework of the relational data model. Preferences usually take the form of a partial ordering. The question therefore arises of how to provide the relational data model with such an ordering.
Neruda Roman, Vaculín Roman
Concept nodes architecture within the Bang3 system
Technical Report: V-947, ICS AS CR, 2005
In this paper we present an architecture for decision making of software agents that allows the agent to behave autonomously. Our target area is computational agents — encapsulating various neural networks, genetic algorithms, and similar methods — that are expected to solve problems of different nature within an environment of a hybrid computational multi-agent system. The architecture is based on the vertically-layered and belief-desire-intention architectures. Several experiments with computational agents were conducted to demonstrate the benefits of the architecture.
Neruda Roman, Farský Miroslav, Neruda Martin
Mass and energy flows in consequences of company environmental accounting (abstract)
In: Environmental Accounting - Sustainable Development Indicators, International Conference EA-SDI 2005, Collection of Abstracts, (Ed. Ritschelová I.), Jan Evangelista Purkyně University and Charles University, Ústí nad Labem and Prague, 2005, p. 51.
ISBN: 80-7044-674-9
Presented at: International Conference EA-SDI 2005, 26.9.-27.9.2005,
Czech Republic.
During the implementation of an environmental accounting system in a company, one of the most important pieces of information to obtain is a detailed understanding of material flows (raw materials, semi-finished products, final products, and wastes) and of flows of different types of energy (purchase, sale, and wastage), considered in terms of their consequences for the company. In the article, the authors 1) study the quantification of the flows and the accuracy of their measurement, and 2) provide an environmental accounting statement with the help of standards, indices, and statistical trend analysis.
Neruda Roman, Krušina Pavel
Estimating and Measuring Performance of Computational Agents
In: Proceedings of the 2005 IEEE/WIC/ACM International Conference on Intelligent Agent technology IAT 2005, IEEE Computer Society Press, 2005, pp. 615-618.
ISBN: 0-7695-2416-8
Presented at: 2005 IEEE/WIC/ACM International Conference on Intelligent Agent technology IAT 2005, 19.9.-22.9.2005,
France.
We study and design multi-agent systems for computational intelligence modeling. Agents typically reside in a high-performance parallel environment, such as a cluster of computational nodes, and use non-blocking asynchronous communication. The need for accurate predictions of run time and other characteristics of complex parallel asynchronous processes leads us to design a new methodology for creating parallel models. In this article, our approach is briefly described and a test case is shown and discussed.
Nováček Vít, Smrž Pavel
BOLE - A New Bio-Ontology Learning Platform
In: Proceedings of ECCB'05 Workshop, Workshop on Biomedical Ontologies and Text Processing, 2005.
Presented at: ECCB'05 Workshop, Workshop on Biomedical Ontologies and Text Processing, 28.9.2005, Madrid,
Spain.
This paper presents BOLE, a new platform for bottom-up generation and merging of bio-ontologies. In contrast to other ontology-learning systems that are currently available, BOLE is characterized by a modular architecture that enables integrating and comparing various methods for the automatic acquisition of semantic relations. We introduce the architecture of the tool and discuss the methodology of the employed synthetic bottom-up approach. OLITE, the central component responsible for the automatic acquisition of semantic relations from texts, is described in detail. The presented preliminary results demonstrate the efficiency of the implemented framework. We also provide a brief comparative overview of other relevant approaches and outline future work on the representation of uncertain knowledge for bio-ontology merging.
Nováček Vít, Smrž Pavel
OLE - A New Ontology Learning Platform
In: Proceedings of International Workshop on Text Mining Research, Practice and Opportunities, Incoma Ltd., 2005, pp. 12-16.
ISBN: 954-91743-1-X
Presented at: International Workshop on Text Mining Research, Practice and Opportunities, 24.9.2005, Borovets,
Bulgaria.
This paper presents OLE, a new platform for bottom-up generation and merging of ontologies. In contrast to other ontology-learning systems that are currently available, OLE is characterized by a modular architecture that enables integrating and comparing various methods for the automatic acquisition of semantic relations. We introduce the architecture of the tool and discuss the methodology of the employed synthetic bottom-up approach. OLITE, the central component responsible for the automatic acquisition of semantic relations from texts, is described in detail. The presented preliminary results demonstrate the efficiency of the implemented framework. We also provide a brief comparative overview of other relevant approaches and outline future work on the representation of uncertain knowledge for ontology merging.
Novák David, Zezula Pavel
Indexing the Distance Using Chord: A Distributed Similarity Search Structure
Presented at: 8th International DELOS Workshop on Future Digital Library Management Systems, 29.3.-1.4.2005, Schloss Dagstuhl,
Germany.
The need for search mechanisms based on data content rather than attribute values has recently led to the formation of metric-based similarity retrieval. The computational complexity of such retrieval and the large volume of processed data call for distributed processing. In this paper, we propose chiDistance, a distributed data structure for similarity search in metric spaces. The structure is based on the idea of iDistance, a vector-based index method that transforms similarity search into a one-dimensional range search problem. A Peer-to-Peer system based on the Chord protocol is created to distribute the storage space and to parallelize the execution of similarity queries. In experiments conducted on our prototype implementation, we study the system performance, concentrating on several aspects of the parallelism of the range search algorithm.
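The transformation of similarity search into one-dimensional range search, mentioned in the abstract above, can be sketched for the centralized iDistance idea. The sketch below is not the distributed chiDistance structure and does not involve Chord. It assumes Euclidean distance as the metric, pivots chosen by the caller, and a constant `c` larger than any distance in the data; the class and method names are illustrative.

```python
import bisect
import math


def dist(a, b):
    """Euclidean distance, standing in for an arbitrary metric."""
    return math.dist(a, b)


class IDistanceIndex:
    def __init__(self, points, pivots, c=1000.0):
        # c is assumed to exceed every distance in the data set, so that
        # the key ranges of different partitions never overlap.
        self.pivots = pivots
        self.c = c
        self.max_radius = [0.0] * len(pivots)
        entries = []
        for p in points:
            # Assign each point to its nearest pivot.
            i = min(range(len(pivots)), key=lambda j: dist(p, pivots[j]))
            d = dist(p, pivots[i])
            self.max_radius[i] = max(self.max_radius[i], d)
            # One-dimensional key: partition offset plus distance to pivot.
            entries.append((i * c + d, p))
        entries.sort()
        self.keys = [k for k, _ in entries]
        self.points = [p for _, p in entries]

    def range_search(self, q, r):
        """Return all indexed points within distance r of q."""
        result = []
        for i, pivot in enumerate(self.pivots):
            dq = dist(q, pivot)
            if dq - r > self.max_radius[i]:
                continue  # the query ball cannot reach this partition
            lo = i * self.c + max(0.0, dq - r)
            hi = i * self.c + min(self.max_radius[i], dq + r)
            start = bisect.bisect_left(self.keys, lo)
            end = bisect.bisect_right(self.keys, hi)
            # The key interval only yields candidates; verify each one.
            for p in self.points[start:end]:
                if dist(q, p) <= r:
                    result.append(p)
        return result
```

Each partition contributes one key interval per query, so the one-dimensional intervals could be answered by any ordered structure; a distributed variant would spread the key space over peers rather than keep it in one sorted list.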
Obdržálek David, Kulhánek Jiří
Statická typová kontrola XSLT programů [Static Type Checking of XSLT Programs]
In: Proceedings of ITAT 2005, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), Prírodovedecká fakulta Univerzity Pavla Jozefa Šafárika, Košice, 2005, pp. 393-401.
ISBN: 80-7097-609-8
Presented at: ITAT 2005, 20.9. - 25.9.2005, Račkova dolina,
Slovakia.
Pala Karel
The Balkanet Experience
In: Proceedings of the GLDV (German Linguistische Daten Vorarbeitung) Conference, Bonn, 2005.
Presented at: GLDV (German Linguistische Daten Vorarbeitung) Conference, 30.3.-1.4.2005, Bonn,
Germany.
This paper describes the exhaustive results obtained within the IST 290388 Project Balkanet, which ran from 2001 to 2004. Attention is paid to the restructuring and final shaping of the individual Balkan WordNets. In comparison with the EuroWordNet Project, some new results have been obtained:
1. The sets of Base Concepts have been extended and a set of Balkanet Common Synsets has been introduced (8,000 synsets). These were relinked to Princeton WordNet 2.0 (PWN) and converted to a standard XML format,
2. The language-specific synsets that do not have translation equivalents in PWN 2.0 have been established for the Balkanet languages,
3. Valency frames have been developed for Czech, Bulgarian, and Romanian,
4. Domains have been added to the Balkanet WordNets and implemented in the VisDic browser,
5. Derivational relations have been integrated into the Czech WordNet, and semantic relations have been added to the Turkish WordNet by exploiting Turkish derivational morphology,
6. Links to the SUMO/MILO Ontology have been added and implemented in VisDic.
Pokorný Jaroslav
Digitální knihovny v prostředí Sémantického webu [Digital Libraries in the Semantic Web Environment]
In: Sborník z 10. ročníku semináře AKP 2005 (automatizace knihovnických procesů - 10.), (Ed. D. Tkačíková, B. Ramajzlová), VIC ČVUT, 2005, pp. 64-73.
Presented at: AKP 2005 (Automatizace knihovnických procesů) 10. ročník semináře, 3.5.-4.5.2005, Liberec,
Czech Republic.
Digital libraries (DLs) contribute to the development of the Semantic Web and can at the same time use its technological elements. This makes it possible to achieve better data management in DLs and easier integration of multiple DLs, as well as greater possibilities for interaction with other information sources. The idea behind the Semantic Web is to extend web pages with markup that captures at least part of the meaning of the page content. This semantic markup means adding certain metadata that provide formal semantics for web content. Semantic Web projects build on technologies that are being developed as standards. These include the languages XML, XML Schema, RDF, and RDF Schema. These languages are used to write metadata, part of which is organized in ontologies. The next level of the Semantic Web uses languages of logic. Processing in a web conceived this way is based on programs, namely software agents. The aim of the article is to introduce Semantic Web technologies and to show their application in building DLs.
Pokorný Jaroslav
Směrem k Sémantickému Webu [Towards the Semantic Web]
In: Sborník příspěvků 20. ročníku konference Moderní databáze, KOMIX, Roudnice nad Labem, 2005, pp. 15-24.
Presented at: 20. ročník konference Moderní databáze, 26.5.-27.5.2005, Hotel Amber, Roudnice nad Labem,
Czech Republic.
Current web search engines, based on text information retrieval techniques, cannot exploit the semantic knowledge inside a web page and therefore cannot give satisfactory answers to user queries. A possible solution appears to be the Semantic Web, which Tim Berners-Lee described in his vision at the end of the 1990s. The idea behind the Semantic Web is to extend web pages with markup that captures at least part of the meaning of the page content. This semantic markup means adding certain metadata that provide formal semantics for web content. Semantic Web projects build on several technologies, the basic ones of which are already standardized or at least recommended. These include the languages XML, XML Schema, RDF, and RDF Schema. These languages are used to write metadata, some of which are organized in so-called ontologies. The next level of the Semantic Web uses languages of logic. Processing in a web conceived this way is based on software agents, i.e. programs that work autonomously and proactively. The aim of the article is to introduce the technologies that support the creation of the Semantic Web, to show its architecture, and to mention some projects already under way that aim at intelligent web information services, personalized websites, and semantically enhanced search engines.
Pokorný Jaroslav, Smižanský J.
Page Content Rank: an Approach to the Web Content Mining
In: Proceedings of IADIS International Conference Applied Computing, Volume: 1, IADIS Press, 2005, pp. 289-296.
ISBN: 3-540-31198-X
Presented at: IADIS International Conference Applied Computing, 22.2.-25.2.2005, Algarve,
Portugal.
Methods of web data mining can be divided into several categories according to the kind of mined information and the goals that the particular categories set: Web Structure Mining (WSM), Web Usage Mining (WUM), and Web Content Mining (WCM). The objective of this paper is to propose a new WCM method for page relevance ranking based on exploring the page content. The method, which we call Page Content Rank (PCR), combines a number of heuristics that seem to be important for analysing the content of Web pages. The importance of a page is determined on the basis of the importance of the terms it contains. The importance of a term is specified with respect to a given query q and is based on the term's statistical and linguistic features. As the source set of pages for mining, we use the set of pages returned by a search engine in response to the query q. PCR uses a neural network as its inner classification structure. We describe an implementation of the proposed method and compare its results with those of another existing classification system, the PageRank algorithm.
Pokorný Jaroslav
Database architectures: current trends and their relationships to environmental data management
In: Proceedings of the 19th Conference EnviroInfo, Masaryk University, Brno, 2005, pp. 24-28.
Presented at: 19th Conference EnviroInfo (Informatics for Environmental Protection, Networking Environmental Information), 7.9.-9.9.2005, Brno,
Czech Republic.
Ever-increasing environmental demands from customers, authorities, and governmental organizations, as well as new business control functions, are being integrated into environmental management systems (EMSs). With the production of huge data sets and their processing in real-time applications, the need for environmental data management has grown significantly. Current trends in database development and associated research meet these challenges. The paper discusses recent advances in database technologies and attempts to highlight them with respect to the requirements of EMSs.
Pokorný Jaroslav, Reschke J.
Exporting relational data into a native XML store
Řimnáč Martin
Web Integration Tool: Data Structure Modelling
In: Proceedings of the 2005 International Conference on Data Mining, CSREA Press, 2005.
ISBN: 1-932415-79-3
Presented at: DMIN'05 - International Conference on Data Mining, 20.-23.06.2005, Las Vegas,
USA.
The paper describes a method for estimating a relational data model from input web data, together with the use of this method. It also covers the method's principal limitations and shows how the model can be used for more effective storage in a repository. The repository is implemented as the universal relation. The properties of the model are described as well.
Řimnáč Martin
Odhadování struktury dat pomocí pravidlových systémů [Data Structure Estimation Using Rule-Based Systems]
In: Doktorandský den 05, (Ed. Hakl F.), MATFYZPRESS, Prague, 2005, pp. 124-133.
ISBN: 80-86732-56-8
Presented at: Institute of Computer Science Ph.D. Student's Days 05, 5.10.-7.10.2005, Nový Dvůr,
Czech Republic.
The data structure estimation method connects the vision of the Semantic Web with today's web data sources, which mostly contain no accompanying semantics for the information they present. To make these sources usable by advanced Semantic Web tools, the semantics of the presented data must at least be estimated. The contribution describes such a method, shows its use for inductive logic programming tasks, and lists the advantages of using rule-based systems for its implementation.
Řimnáč Martin
Odhad struktury dat a induktivní logické programování [Data Structure Estimation and Inductive Logic Programming]
In: Proceedings of ITAT 2005, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), Prírodovedecká fakulta Univerzity Pavla Jozefa Šafárika, Košice, 2005, pp. 124-133.
ISBN: 80-7097-609-8
Presented at: ITAT 2005, 20.9. - 25.9.2005, Račkova dolina,
Slovakia.
Data structure estimation is one way of interpreting data automatically. The data can be described by a model of functional dependencies, and building such a model can be compared to some machine learning techniques. This contribution summarizes selected basic techniques of inductive logic programming and analyzes them from the viewpoint of the data structure estimation method. It turns out that inductive logic programming techniques can in some cases be reduced to data structure estimation.
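The model of functional dependencies mentioned in the abstract above can be illustrated by a minimal check of which dependencies are consistent with a given set of rows. This is a generic sketch of the textbook definition (equal left-hand values imply equal right-hand values), not the estimation method of the paper; the sample rows and function names are hypothetical.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = tuple(row[a] for a in rhs)
        # The first value seen for a key must be repeated by every later row.
        if seen.setdefault(key, value) != value:
            return False
    return True


def single_attribute_dependencies(rows):
    """List all pairs (a, b) such that a -> b holds in the given rows."""
    attrs = sorted(rows[0]) if rows else []
    return [
        (a, b)
        for a in attrs
        for b in attrs
        if a != b and holds(rows, [a], [b])
    ]


# Hypothetical sample data.
ROWS = [
    {"city": "Praha", "zip": "11000", "country": "CZ"},
    {"city": "Brno", "zip": "60200", "country": "CZ"},
    {"city": "Praha", "zip": "11000", "country": "CZ"},
    {"city": "Wien", "zip": "1010", "country": "AT"},
]
```

A dependency found this way is only an estimate: it is consistent with the observed rows, and a later row may still contradict it.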
Skopal Tomáš, Pokorný Jaroslav, Snášel Václav
Nearest Neighbours Search using the PM-tree
In: Proceedings of The 10th International Conference on Database Systems for Advanced Applications, LNCS 3453, Springer-Verlag, 2005, pp. 803-815.
Presented at: DASFAA 2005, 17.4.-20.4.2005, Beijing,
China.
Snášel Václav, Moravec Pavel, Pokorný Jaroslav
WordNet Ontology Based Model for Web Retrieval
In: Proceedings of International Workshop on Challenges in Web Information Retrieval and Integration (WIRI) 2005, IEEE Computer Society Press, 2005, pp. 231-236.
Presented at: International Workshop on Challenges in Web Information Retrieval and Integration, 8.4.-9.4. 2005, Tokyo,
Japan.
It is well known that ontologies will become a key piece, as they allow making the semantics of Semantic Web content explicit. In spite of the big advantages that the Semantic Web promises, there are still several problems to solve. Those concerning ontologies include their availability, development and evolution. In the area of information retrieval, the dimension of document vectors plays an important role. Firstly, with higher index dimensions the indexing structures suffer from the "curse of dimensionality" and their efficiency rapidly decreases. Secondly, we may not use exact words when looking for a document, and thus we miss some relevant documents. LSI is a numerical method which discovers latent semantics in documents by creating concepts from existing terms. In this paper we present a basic method of mapping LSI concepts onto a given ontology (WordNet), used both for improving retrieval recall and for dimension reduction. We offer experimental results for this method on a subset of the TREC collection, consisting of Los Angeles Times articles.
Snášel Václav, Moravec Pavel, Pokorný Jaroslav
Using BFA with wordnet ontology based model for web retrieval
In: Proceedings of the First IEEE International Conference on Signal-Image Technology & Internet-Based Systems (SITIS'05), 2005, pp. 254-259.
Presented at: First IEEE International Conference on Signal-Image Technology & Internet-Based Systems (SITIS'05), 27.11.-1.12.2005, Yaoundé,
Cameroon.
In the area of information retrieval, the dimension of document vectors plays an important role. We may need to find a few words or concepts that characterize the document based on its contents, to overcome the problem of the "curse of dimensionality", which makes indexing of high-dimensional data problematic. To do so, we earlier proposed a WordNet and WordNet+LSI (Latent Semantic Indexing) based model for dimension reduction. While LSI works on the whole collection, another procedure of feature extraction (and thus dimension reduction) exists, using binary factorization. The procedure is based on the search for attractors in a Hopfield-like associative memory. Separation of true attractors (factors) from spurious ones is based on calculation of their Lyapunov function. Applied to textual data, the procedure performed well and, moreover, showed sensitivity to the context in which the words were used. In this paper, we suggest that the binary factorization may benefit from WordNet filtration.
Špánek Roman
Sharing information in a Large Network of Users
In: Doktorandský den 05, (Ed. Hakl F.), MATFYZPRESS, Prague, 2005, pp. 134-140.
ISBN: 80-86732-56-8
Presented at: Institute of Computer Science Ph.D. Student's Days 05, 5.10.-7.10.2005, Nový Dvůr,
Czech Republic.
The paper describes a possible treatment of data sharing in a large network of users. The mathematical model is based on weighted hypergraphs whose nodes and edges denote the users and their relations, respectively. Its flexibility keeps the basic relations between users robust under frequent changes in the network connections. The approach addresses the communication and computing issues from a different point of view, based on structure evolution and its further optimization so as to keep the parallel space and time complexities low. Although the idea is aimed at the field of mobile computing, it can be generalized in a straightforward way to other similar environments. An experimental application is also proposed and discussed in the paper.
Špánek Roman
Data pozičně závislá a jejich dopad v mobilních databázích
In: Proceedings of ITAT 2005, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), Prírodovedecká fakulta Univerzity Pavla Jozefa Šafárika, Košice, 2005, pp. 273-278.
ISBN: 80-7097-609-8
Presented at: ITAT 2005, 20.9. - 25.9.2005, Račkova dolina,
Slovakia.
The paper describes selected problems and possible solutions for position management in mobile computing. The proposed scheme extends existing approaches. The main idea is to reduce the number of possible solutions produced by a movement prediction algorithm by applying constraints commonly found in real life. Existing solutions and possibilities for future research are also described.
Vaculín Roman, Neruda Roman
Autonomous behavior of computational agents
In: Adaptive and Natural Computing Algorithms, Springer, Wien, 2005, pp. 514-517.
Presented at: ICANNGA 2005, 21.-23.03.2005, Coimbra,
Portugal.
In this paper we present an architecture for decision making of software agents that allows the agent to behave autonomously. Our target area is computational agents (encapsulating various neural networks, genetic algorithms, and similar methods) that are expected to solve problems of a different nature within the environment of a hybrid computational multi-agent system. The architecture is based on the vertically layered and belief-desire-intention architectures. Several experiments with computational agents were conducted to demonstrate the benefits of the architecture.
Vojtáš Peter
Proceedings of ITAT 2005, Information Technologies - Applications and Theory
In: Proceedings of ITAT 2005, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), Prírodovedecká fakulta Univerzity Pavla Jozefa Šafárika, Košice, 2005.
ISBN: 80-7097-609-8
Vojtáš Peter
Fuzzy Logic as an Optimization Task
In: Fuzzy Logic and Technology, (Ed. Sobrevilla P., Montseny E.), Barcelona, 2005, pp. 781-786.
ISBN: 84-7683-872-3
Presented at: EUSFLAT - LFA 2005. Conference of the European Society for Fuzzy Logic and Technology /13./, Rencontres Francophones sur la Logique Floue et ses Applications /11./, 7.9.-9.9.2005, Barcelona, Spain.
Wiedermann Jiří
Globular Universe and Autopoietic Automata: A Framework for Artificial Life
In: Advances in Artificial Life, (Ed. Bentley P. J., Capcarrere M., Freitas A. A., Johnson C. G.), Springer Verlag, Berlin, 2005, pp. 21-30.
Presented at: ECAL 2005, European Conference on Artificial Life, 5.9.-9.9.2005, Canterbury,
UK.
We present two original computational models, the globular universe and autopoietic automata, that capture the basic aspects of evolution: the construction of self-reproducing automata by self-assembly and the transfer of algorithmically modified genetic information over generations. Within this framework we show an implementation of autopoietic automata in a globular universe. Further, we characterize the computational power of lineages of autopoietic automata via interactive Turing machines and show unbounded growth of the computational power of automata during evolution. Finally, we define the problem of sustainable evolution and show its undecidability.
Wiedermann Jiří
Can Cognitive and Intelligent Systems Outperform Turing Machines?
In: Proceedings of Czech-Argentinian Workshop 'e-Golems' (Interdisciplinary Aspects of Human-Machine Co-existence and Co-operation), (Ed. Marik et al.), CTU, Prague, 2005, pp. 82-86.
Presented at: Czech-Argentinian Workshop 'e-Golems' (Interdisciplinary Aspects of Human-Machine Co-existence and Co-operation), 2.7.-5.7.2005, Prague,
Czech Republic.
We look for the computational limits of artificial, natural and hybrid cognitive and intelligent systems. The common basis for such studies is offered by computationalism, i.e., the belief that cognitive or intelligent processes are in essence computational processes. We show that in principle cognitive systems might exist whose computational power exceeds that of Turing machines, and that even in practice we observe the rudiments of such systems. These results point to the fact that the so-called Church-Turing Thesis, dealing with the central position of Turing machines in the world of computations and algorithms, must be seen in the context of the physical principles exploited by the cognitive systems, and in that of the communication scenario between the system and its environment.
Wiedermann Jiří
Neomezený evoluční růst výpočetní síly sebereprodukčních automatů v globulárním vesmíru a jiné výsledky
In: Kognice a umělý život, (Ed. Kelemen J., Kvasnička V., Pospíchal J.), Slezská univerzita, Ostrava, 2005, pp. 613-623.
ISBN: 80-7248-310-2
Presented at: Kognícia a umelý život V, 30.5.-2.6.2005, Smolenice,
Slovakia.
We describe original computational models, the globular universe and autopoietic automata, that capture essential computational aspects of evolution: the construction of self-reproducing evolving automata by self-assembly and the transfer of algorithmically modifiable genetic information to the offspring. Within this framework we show unbounded growth of the computational power of automata during evolution, and we characterize the computational power of lineages of automata by means of an interactive Turing machine.