Abdelsalam Almarimi, Pokorný Jaroslav
Schema Management for Data Integration: A Short Survey
In: Acta Polytechnica, Volume: 45, No: 1, Czech Technical University in Prague, Prague, 2005, pp. 24-27.
Schema management is a basic problem in many database application domains such as data integration systems. Users need to access and manipulate data from several databases. In this context, in order to integrate data from distributed heterogeneous database sources, data integration systems demand the resolution of several issues that arise in managing schemas. In this paper, we present a brief survey of the problem of schema matching which is used for solving problems of schema integration processing. Moreover, we propose a technique for integrating and querying distributed heterogeneous XML schemas.
Abdelsalam Almarimi, Pokorný Jaroslav
A Mediation Layer for Heterogenous XML Schemas
In: International Journal of Web Information Systems, Volume: 1, No: 1, Troubador Publishing LTD, 2005, pp. 25-32.
Presented at: iiWAS2004 Information Integration and Web Based Applications & Services, 27-29.09.2004, Jakarta,
Indonesia.
This paper describes an approach for mediation of heterogeneous XML schemas. The approach is proposed as a tool for an XML data integration system. A global XML schema is specified by the designer to provide a homogeneous view over heterogeneous XML data. An XML mediation layer is introduced to manage: (1) establishing appropriate mappings between the global schema and the schemas of the sources; (2) querying XML data sources in terms of the global schema. The XML data sources are described in the XML Schema language. The former task is performed through a semi-automatic process that generates local and global paths. A tree structure for each XML schema is constructed and represented in a simple form. This form is in turn used for manually assigning indices that match local paths to the corresponding global paths. By gathering all paths with the same index, the equivalent local and global paths are grouped automatically, and an XML Metadata Document is constructed. For the latter task, an XML Query Translator is described that translates a global user query into local queries by using the mappings defined in the XML Metadata Document.
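The grouping step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the tuple layout, and the sample paths are hypothetical, and the real system stores the result in an XML Metadata Document rather than a Python dictionary.

```python
from collections import defaultdict


def group_paths_by_index(indexed_paths):
    """Group paths that share a manually assigned index.

    indexed_paths: iterable of (index, schema_name, path) tuples, where the
    schema name "global" marks a path of the global schema (an assumption
    made for this sketch).
    """
    groups = defaultdict(lambda: {"global": [], "local": []})
    for index, schema, path in indexed_paths:
        kind = "global" if schema == "global" else "local"
        groups[index][kind].append((schema, path))
    return dict(groups)


def translate_path(global_path, groups):
    """Return the local (schema, path) pairs grouped with a global path."""
    result = []
    for entry in groups.values():
        if any(path == global_path for _, path in entry["global"]):
            result.extend(entry["local"])
    return result


# Hypothetical paths with hand-assigned indices.
paths = [
    (1, "global", "/library/book/title"),
    (1, "source_a", "/books/item/name"),
    (1, "source_b", "/catalog/publication/title"),
    (2, "global", "/library/book/author"),
    (2, "source_a", "/books/item/writer"),
]
mapping = group_paths_by_index(paths)
local_title_paths = translate_path("/library/book/title", mapping)
```

Here `local_title_paths` holds the two source paths that carry index 1, which is the information a query translator would need to rewrite a global path expression into per-source expressions.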
Ali K., Pokorný Jaroslav
XML-based Temporal Models
Technical Report: DC-2006-02, Dep. of Comp. Sc. and Engineering, FEE TU, Prague, 2006, 39 p.
Much research work has recently focused on the problem of representing historical information in XML. This report describes a number of temporal XML data models and provides their comparison according to the following properties: time dimension (valid time, transaction time), support of temporal elements and attributes, querying possibilities, association to XML Schema/DTD, and influence on XML syntax.
Ali K., Pokorný Jaroslav
A Three-Dimensional XML-Based Model
In: SOFSEM 2008: Theory and Practice of Computer Science, LNCS 4910, Springer, 2008, pp. 659-671.
Presented at: 34th International Conference on Current Trends in Theory and Practice of Computer Science, 19.-25.1.2008, Nový Smokovec, High Tatras,
Slovakia.
Much research work has recently focused on the problem of
representing historical information in XML. In this paper, we describe
ongoing work on representing XML changes. Our model is a three-dimensional
XML-based model (3D_XML for short) for representing and querying histories
of XML documents. The proposed model incorporates three time dimensions
(valid time, transaction time, and efficacy time) without extending the syntax of
XML. We use XQuery to express complex temporal queries on the evolution of
the document contents. We believe that native XML databases (NXDs) present
a viable alternative to relational temporal databases when complex
time-dependent data has to be manipulated and stored, so NXDs are our choice.
Basovník Stanislav, Dekár Martin, Jusko Pavol, Mikulík Andrej, Obdržálek David, Pechal Radim, Petrůšek Tomáš, Piták Roman
Logion - A Robot Which Collects Rocks
In: Proc. of International Conference on Research and Education in Robotics, 2008, pp. 276-287.
ISBN: 978-80-7378-042-5
Presented at: EUROBOT 2008: International Conference on Research and Education in Robotics, 21.-24.5.2008, Heidelberg, Germany.
Batko Michal, Skopal Tomáš, Lokoč Jakub
New Dynamic Construction Techniques for M-tree
In: Journal of Discrete Algorithms, Elsevier, Amsterdam, The Netherlands, 2008.
(in print)
Since its introduction in 1997, the M-tree has become a respected metric access method (MAM), while remaining, together with its descendants, the only database-friendly MAM, that is, a dynamic structure persistent in a paged index. Although many other MAMs have been developed over the last decade, most of them require either static or expensive indexing. By contrast, the dynamic M-tree construction allows us to index very large databases in subquadratic time, and simultaneously the index can be kept up to date (i.e., it supports arbitrary insertions and deletions). In this article we propose two new techniques improving dynamic insertions in the M-tree: the forced reinsertion strategies and the so-called hybrid-way leaf selection. Both techniques preserve the logarithmic asymptotic complexity of a single insertion, while they aim to produce more compact M-tree hierarchies (which leads to faster query processing). In particular, the former technique reuses the well-known principle of forced reinsertions, where the new insertion algorithm tries to re-insert the content of an M-tree leaf that is about to split in order to avoid that split. The latter technique constitutes an efficiency-scalable selection of a suitable leaf node into which a new object is to be inserted. In the experiments we show that the proposed techniques bring a clear improvement (speeding up both indexing and query processing) and also provide a tool for tuning the trade-off between indexing and querying efficiency. Moreover, a combination of the new techniques exhibits a synergistic effect, resulting in the best strategy for dynamic M-tree construction proposed so far.
Batko Michal, Novák David, Falchi Fabrizio, Zezula Pavel
Scalability Comparison of Peer-to-Peer Similarity Search Structures
In: Future Generation Computer Systems, Volume: 24, No: 8, Elsevier, Amsterdam, The Netherlands, 2008, pp. 834-848.
Bednárek David
Output-Driven XQuery Evaluation
In: Proc. of 2nd International Symposium on Intelligent Distributed Computing, Studies in Computational Intelligence, (Ed. C. Badica et al.), Volume: 162, Springer-Verlag, Heidelberg, 2008, pp. 55-64.
ISBN: 978-3-540-85256-8
Presented at: IDC 2008: 2nd International Symposium on Intelligent Distributed Computing, 18.-19.9.2008, Catania, Italy.
Bednárek David
Reducing Temporary Trees in XQuery
In: Proc. of 12th Advances in Databases and Information Systems, LNCS 5207, Springer-Verlag, Berlin, 2008, pp. 30-45.
ISBN: 978-3-540-85712-9
Presented at: ADBIS 2008: 12th Advances in Databases and Information Systems, 5.-9.9.2008, Pori, Finland.
Bednárek David, Yaghob Jakub, Zavoral Filip
Fine Grained Access Rights Definition in a Three Tiered Information System
In: Proc. of 5th International Conference on Innovations in Information Technology, IEEE Computer Society Press, 2008.
(in print)
Presented at: Innovations 2008: 5th International Conference on Innovations in Information Technology, 16.-18.12.2008, Al Ain, United Arab Emirates.
Bednárek David
Extending Datalog to Cover XQuery
In: Proc. of Information Technologies - Application and Theory, (Ed. P. Vojtáš), PONT, Seňa, 2008, pp. 1-6.
ISBN: 978-80-969184-9-2
Presented at: Konferencia o informačných (inteligentných) technológiách - aplikácie a teória 2008, 22.-26.9.2008, High Tatras,
Slovakia.
Bednárek David
Statická typová kontrola XSLT programů
In: Proceedings of ITAT 2005, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), Prírodovedecká fakulta Univerzity Pavla Jozefa Šafárika, Košice, 2005, pp. 393-401.
ISBN: 80-7097-609-8
Presented at: ITAT 2005, 20.9. - 25.9.2005, Račkova dolina,
Slovakia.
Bednárek David, Obdržálek David, Yaghob Jakub, Zavoral Filip
Data Integration Using DataPile Structure
In: Proceedings of the 9th East-European Conference on Advances in Databases and Information Systems, Tallin, 2005, pp. 178-188.
Presented at: 9th East-European Conference on Advances in Databases and Information Systems (ADBIS 2005), 12.9.-15.9.2005, Tallin,
Estonia.
One of the areas of data integration covers systems that maintain coherence among a heterogeneous set of databases. Such a system repeatedly collects data from the local databases, synchronizes them, and pushes the updates back. One of the key problems in this architecture is conflict resolution. When data in a less relevant data source changes, it should not cause any data change in a store with higher relevance. To meet such requirements, we propose a DataPile structure with the following main advantages: effective storage of historical versions of data, straightforward adaptation to global schema changes, separation of data conversion and replication logic, and simple implementation of data relevance. The key use of such mechanisms is in projects with the following traits or requirements: integration of heterogeneous data from sources with different reliability, data coherence of databases whose schemas differ, data changes performed on local databases, and minimal load on the central database.
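The relevance rule in this abstract (a change from a less relevant source must not overwrite data held with higher relevance, while historical versions are kept) can be sketched as follows. This is only an illustration under stated assumptions, not the DataPile structure itself: the class name, the numeric relevance scale, and the source names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeHistory:
    """Versioned values of one attribute; each version records its source."""

    # Each version is a (value, source, relevance) tuple, oldest first.
    versions: list = field(default_factory=list)

    def current(self):
        """Return the latest accepted version, or None if there is none."""
        return self.versions[-1] if self.versions else None

    def propose(self, value, source, relevance):
        """Accept the update unless it comes from a less relevant source."""
        cur = self.current()
        if cur is not None and relevance < cur[2]:
            return False  # rejected: the current value is more relevant
        self.versions.append((value, source, relevance))
        return True


# Hypothetical sources with assumed relevance levels.
email = AttributeHistory()
accepted_first = email.propose("a@example.org", "hr_system", 10)
accepted_low = email.propose("b@example.org", "web_form", 3)
accepted_high = email.propose("c@example.org", "registry", 20)
```

In this sketch the update from `web_form` is rejected because its relevance is lower than that of the stored value, while the earlier accepted version stays in `versions` as history.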
Bednárek David
Turingovské vzory v XSLT programech
In: Proceedings of ITAT 2006, Information Technologies - Applications and Theory, 2006.
ISBN: 80-969184-4-3
Presented at: ITAT 2006, 26.9.-1.10.2006, Chata Kosodrevina, Bystrá dolina, Nízke Tatry,
Slovakia.
Bednárek David
Optimizing XQuery/XSLT programs using backward analysis
In: Proceedings of ITAT 2007, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), PONT s.r.o., Seňa, 2007, pp. 17-22.
Presented at: Konferencia o informačných (inteligentných) technológiách - aplikácie a teória 2007, 21.-27.9.2007, Polana,
Slovakia.
Benda J., Obdržálek David
GFE - Graphical Finite State Machine Editor for Parallel Execution
In: Workshop on Educational Robotics, DIEES, 2006, pp. 41-47.
Presented at: Workshop on Educational Robotics 2006, 1.6.2006, Acireale, Italy.
Bustos B., Skopal Tomáš
Dynamic Similarity Search in Multi-Metric Spaces
In: Proceedings of ACM MIR 2006 (a workshop at ACM Multimedia 2006), ACM Press, Santa Barbara, CA, USA, 2006.
Presented at: ACM MIR 2006, 26.10.-27.10.2006, Santa Barbara,
CA, USA.
Dědek Jan, Eckhardt Alan, Vojtáš Peter
Experiments with Czech Linguistic Data and ILS
In: Inductive Logic Programming (Late Breaking Papers), Action M, Prague, 2008, pp. 20-25.
ISBN: 978-80-86742-26-7
Presented at: ILP 2008: Inductive Logic Programming, 10.-12.9.2008, Prague, Czech Republic.
Dědek Jan, Vojtáš Peter
Computing aggregations from linguistic web resources: a case study in Czech Republic sector/traffic accidents
In: Proc. of International Conference on Advanced Engineering Computing and Applications in Science, IEEE Computer Society Press, 2008, pp. 7-12.
ISBN: 978-0-7695-3369
Presented at: ADVCOMP 2008: International Conference on Advanced Engineering Computing and Applications in Science, 29.9.-4.10.2008, Valencia, Spain.
Dědek Jan, Vojtáš Peter
Extrakce informací z textově orientovaných zdrojů webu
In: Znalosti 2008, (Ed. V. Snášel), Vydavatelstvo STU, Bratislava, 2008.
Presented at: Znalosti 2008, 13.-15.2.2008, Bratislava,
Slovakia.
In this paper we deal with information extraction from web
resources of a predominantly textual nature. For this purpose we tried
to use several linguistic tools for processing natural-language
text in Czech, namely the tools of the Prague PDT project
and the Czech WordNet. The aim of the paper is to outline the possibilities
these tools offer for information extraction from text. We address
information extraction primarily in the context of the Semantic Web and examine
how these tools can be used to automate the semantic annotation
of pages of the current web.
Dědek Jan, Eckhardt Alan, Galamboš Leo, Vojtáš Peter
Sémantický web
In: DATAKON 2008, (Ed. Řepa V., Svatoš O.), Masaryk university, 2008, pp. 12-30.
Presented at: DATAKON 2008, 18.-21.10.2008, Brno,
Czech Republic.
Dokulil Jiří, Katreniaková J.
Visual Exploration of RDF Data
In: SOFSEM 2008: Theory and Practice of Computer Science, LNCS 4910, Springer, 2008, pp. 572-583.
ISBN: 978-3-540-77565-2
Presented at: 34th International Conference on Current Trends in Theory and Practice of Computer Science, 19.-25.1.2008, Nový Smokovec, High Tatras,
Slovakia.
Dokulil Jiří, Yaghob Jakub, Zavoral Filip
Evoluce replikačních algoritmů v stohově orientovaných systémech
In: Proceedings of ITAT 2005, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), Prírodovedecká fakulta Univerzity Pavla Jozefa Šafárika, Košice, 2005, pp. 393-401.
ISBN: 80-7097-609-8
Presented at: ITAT 2005, 20.9. - 25.9.2005, Račkova dolina,
Slovakia.
Dokulil Jiří, Yaghob Jakub, Zavoral Filip
Infrastruktura pro dotazování nad semantickými daty
In: Inteligentní modely, algoritmy, metody a nástroje pro vytváření sémantického webu, Ústav informatiky AV ČR, Prague, 2006, pp. 10-26.
ISBN: 80-903298-7-X
Presented at: Inteligentní modely, algoritmy, metody a nástroje pro vytváření sémantického webu - Seminář projektu programu Informační společnost, 5.10.-7.10.2006, Zadov,
Czech Republic.
The idea of the Semantic Web has been widely discussed in the professional
community for many years. Although a number of technologies, languages,
resources, and even software tools have been developed, hardly anyone has ever
seen a real semantic web. We consider the lack of the necessary
infrastructure for operating the Semantic Web to be one of the main reasons
for this state. In our paper we describe the design of such an infrastructure, which is
based on using and extending the data pile technology and the tools
developed for it, and on combining them with web search engines and other
tools and resources.
Dokulil Jiří
Použití relačních databází pro vyhodnocování SPARQL dotazů
In: Proceedings of ITAT 2006, Information Technologies - Applications and Theory, 2006.
ISBN: 80-969184-4-3
Presented at: ITAT 2006, 26.9.-1.10.2006, Chata Kosodrevina, Bystrá dolina, Nízke Tatry,
Slovakia.
Dokulil Jiří
Evaluation of SPARQL queries using relational databases
In: Proceedings of 5th International Semantic Web Conference, ISWC, 2006, (Ed. Cruz I.), LNCS 4273, Springer Verlag, Athens, GA, USA, 2006, pp. 972-973.
Basic storage and querying of RDF data using a relational
database can be done in a very simple manner. Such an approach can run
into trouble when used on large and complex data. This paper presents
such data and several sample queries together with an analysis of their performance.
It also describes two possible ways of improving the performance
based on this analysis.
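The "very simple manner" of storing RDF in a relational database is commonly a single three-column table of triples, queried with self-joins. The sketch below shows that general idea with Python's standard `sqlite3` module; the table layout, the sample data, and the query are assumptions for illustration and are not taken from the paper.

```python
import sqlite3

# One row per RDF triple; an in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE triples (subject TEXT, predicate TEXT, object TEXT)")
conn.executemany(
    "INSERT INTO triples VALUES (?, ?, ?)",
    [
        ("ex:alice", "foaf:name", "Alice"),
        ("ex:alice", "foaf:mbox", "alice@example.org"),
        ("ex:bob", "foaf:name", "Bob"),
    ],
)

# A pattern with two triples about the same subject becomes a self-join.
# Roughly: SELECT ?name ?mbox WHERE { ?x foaf:name ?name . ?x foaf:mbox ?mbox }
rows = conn.execute(
    """
    SELECT n.object, m.object
    FROM triples AS n
    JOIN triples AS m ON n.subject = m.subject
    WHERE n.predicate = 'foaf:name' AND m.predicate = 'foaf:mbox'
    ORDER BY n.object
    """
).fetchall()
conn.close()
```

Every additional triple pattern adds another join over the same table, which is one plausible reason such a layout can struggle on large and complex data.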
Dokulil Jiří, Tykal J., Yaghob Jakub, Zavoral Filip
Semantic Web Infrastructure
In: Proc. of the First IEEE International Conference on Semantic Computing, IEEE, 2007, pp. 209-215.
Presented at: ICSC 2007, 17.-19.9.2007, Irvine,
California.
The Semantic Web is not as widespread as its founders expected. This is partially caused by the lack of a standard and working infrastructure for the Semantic Web. We have built a working, portable, stable, high-performance infrastructure for the Semantic Web. This paper focuses on the tasks performed by the infrastructure.
Dokulil Jiří, Tykal J., Yaghob Jakub, Zavoral Filip
Semantic Web Repository and Interfaces
In: Proc. of SEMAPRO (Int. Conf. on Advances in Semantic Processing), IEEE, 2007.
Presented at: SEMAPRO (Int. Conf. on Advances in Semantic Processing), 4.-9.11.2007, Papeete,
French Polynesia (Tahiti).
The Semantic Web is not as widespread as
its founders expected. This is partially caused by
the lack of a standard and working infrastructure for the Semantic
Web. We have built a working, portable, stable,
high-performance infrastructure for the Semantic
Web. This enables various experiments with the Semantic
Web in the real world.
Dokulil Jiří, Katreniaková J.
Visualization of large schemaless RDF data
In: Proc. of SEMAPRO (Int. Conf. on Advances in Semantic Processing), IEEE, 2007, pp. 243-248.
Presented at: SEMAPRO (Int. Conf. on Advances in Semantic Processing), 4.-9.11.2007, Papeete,
French Polynesia (Tahiti).
Since many XML documents do not contain any schema definition, we expect that there will also be RDF documents without an RDF schema or ontology. Then the data can only be viewed as a general labeled directed graph, and the idea of presenting the data to the user by drawing the graph seems natural. Because the data can be extremely large, it is impossible to display the whole graph at one time. Only a suitable start node is displayed, and the rest of the graph can be explored by incremental navigation. To conserve space and show the user possible directions of further navigation, we have come up with a technique called node merging. By combining suitable graph drawing and navigation techniques we get a tool that can give the user a good idea of the structure and content of the data.
Dokulil Jiří, Tykal J., Yaghob Jakub, Zavoral Filip
Experimental Platform for the Semantic Web
In: Proceedings of ITAT 2007, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), PONT s.r.o., Seňa, 2007, pp. 67-72.
Presented at: Konferencia o informačných (inteligentných) technológiách - aplikácie a teória 2007, 21.-27.9.2007, Polana,
Slovakia.
Dokulil Jiří, Katreniaková J.
Vizualizácia RDF dát pomocou techniky zlučovania vrcholov
In: Proceedings of ITAT 2007, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), PONT s.r.o., Seňa, 2007, pp. 23-28.
Presented at: Konferencia o informačných (inteligentných) technológiách - aplikácie a teória 2007, 21.-27.9.2007, Polana,
Slovakia.
Dokulil Jiří, Katreniaková J.
Visual Exploration of RDF Data
In: SOFSEM 2008: Theory and Practice of Computer Science, LNCS 4910, Springer, 2008, pp. 672-683.
Presented at: 34th International Conference on Current Trends in Theory and Practice of Computer Science, 19.-25.1.2008, Nový Smokovec, High Tatras,
Slovakia.
We have developed and implemented [1,2] an infrastructure and
RDF storage for the Semantic Web. When we filled it with data, the need
for a tool that could explore the data became evident. Unfortunately,
none of the existing solutions fulfills the requirements imposed by the data and
the users' expectations. This paper presents our RDF visualizer, which was
designed specifically to handle large RDF data by means of incremental
navigation. A detailed description of the algorithm is given as well as
actual results produced by the visualizer.
Dvořáková Jana, Zavoral Filip
Xord: An Implementation Framework for Efficient XSLT Processing
In: Proc. of 2nd International Symposium on Intelligent Distributed Computing, Studies in Computational Intelligence, (Ed. C. Badica et al.), Volume: 162, Springer-Verlag, Heidelberg, 2008, pp. 95-104.
ISBN: 978-3-540-85256-8
Presented at: IDC 2008: 2nd International Symposium on Intelligent Distributed Computing, 18.-19.9.2008, Catania, Italy.
Dvořáková Jana, Zavoral Filip
BUXT Engine in Xord: Fragment Buffers for Streaming XSLT Transformations
In: Proc. of 5th International Conference on Innovations in Information Technology, IEEE Computer Society Press, 2008.
(in print)
Presented at: Innovations 2008: 5th International Conference on Innovations in Information Technology, 16.-18.12.2008, Al Ain, United Arab Emirates.
Dvořáková Jana, Zavoral Filip
Schema-Based Analysis of XSLT Streamability
In: Proc. of International Conference on Advanced Engineering Computing and Applications in Science, IEEE Computer Society Press, 2008, pp. 187-192.
ISBN: 978-0-7695-3369
Presented at: ADVCOMP 2008: International Conference on Advanced Engineering Computing and Applications in Science, 29.9.-4.10.2008, Valencia, Spain.
Dvořáková Jana, Zavoral Filip
A Low-Memory Streaming Algorithm for XSLT Processing Implemented in Xord Framework
In: Proc. of 1st International Conference on the Applications of Digital Information and Web Technologies, IEEE Computer Society Press, 2008, pp. 239-247.
ISBN: 978-1-4244-2624-9
Presented at: ICADIWT 2008: 1st International Conference on the Applications of Digital Information and Web Technologies, 4.-6.8.2008, Ostrava, Czech Republic.
Dvořáková Jana, Zavoral Filip
Determining XSLT Streamability Using New Hierarchical XSD Model
In: Proc. of Information Technologies - Application and Theory, (Ed. P. Vojtáš), PONT, Seňa, 2008, pp. 7-12.
ISBN: 978-80-969184-9-2
Presented at: Konferencia o informačných (inteligentných) technológiách - aplikácie a teória 2008, 22.-26.9.2008, High Tatras,
Slovakia.
Eckhardt Alan, Horváth T., Maruščák D., Novotný R., Vojtáš Peter
Uncertainty Issues in Automating Process Connecting Web and User
In: Proc. of Uncertainty Reasoning for the Semantic Web Workshop 2007, (Ed. F. Bobillo), CEUR Workshop Proc., 2007, pp. 1-12.
Presented at: Dateso 2008: Annual International Workshop on DAtabases, TExts, Specifications and Objects, 16.4.-18.4.2008, Desná - Černá Říčka,
Czech Republic.
Eckhardt Alan, Horváth T., Vojtáš Peter
Learning different user profile annotated rules for fuzzy preference top-k querying
In: Scalable Uncertainty Management, Springer, LNAI 4772, Berlin, 2007, pp. 116-130.
Presented at: SUM 2007 International Conference, 10.10.-12.10.2007, Washington,
US.
Uncertainty querying of large data can be solved by providing top-k answers according to a user's fuzzy ranking/scoring function. Usually, different users have different fuzzy scoring functions, that is, different user preference models. The main goal of this paper is to assign a preference model to a user automatically. To achieve this, we decompose the user's fuzzy ranking function into orderings of particular attributes and a combination function. To solve the problem of automatic assignment of a user model, we design two algorithms: one for learning the user's preference on a particular attribute and a second for learning the combination function. The methods were integrated into a Fagin-like top-k querying system with some new heuristics and tested.
Eckhardt Alan, Vojtáš Peter
Uživatelské preference při hledání ve webovských zdrojích
In: Znalosti 2007, Fakulta elektrotechniky a informatiky, VŠB - Technická univerzita Ostrava, 2007, pp. 179-190.
Presented at: Znalosti 2007, 21.2.-23.2.2007, Ostrava,
Czech Republic.
Eckhardt Alan, Vojtáš Peter
Towards ontology language handling imperfection
In: Proceeding of the 1st Workshop on Intelligent and Knowledge oriented Technologies, 2006, pp. 124-125.
Presented at: 1st Workshop on Intelligent and Knowledge oriented Technologies, 28.11.-29.11.2006, Bratislava,
Slovakia.
Eckhardt Alan
Inductive Models of User Preferences for Semantic Web
In: Proceedings of the Dateso 2007, CEUR Workshop Proc., 2007, pp. 103-114.
Presented at: Dateso 2007 Annual International Workshop on DAtabases, TExts, Specifications and Objects, 18.4.-20.4.2007, Desná - Černá Říčka,
Czech Republic.
User preferences have recently become a hot topic. The massive
use of internet shops and social websites requires user modelling,
which helps users orient themselves on a page. There are many
different approaches to modelling user preferences. In this paper, we
survey the current state of the art in the acquisition of user
preferences and their induction. The main focus is on the models of user
preferences and on the induction of these models, but the process of
extracting preferences from user behaviour is also studied. We
also present our contribution to probabilistic user models.
Eckhardt Alan, Pokorný Jaroslav, Vojtáš Peter
Integrating user and group preferences for top-k search from distributed web resources
In: Proc. of DEXA Workshop Decision Support for Structural Health Monitoring and Flexible Query Processing, (Ed. Tjoa A.M., Wagner R.R.), IEEE, 2007, pp. 317-322.
Presented at: DEXA Workshop, 3.-7.9.2007, Regensburg,
Germany.
We discuss models of user and group preferences in social networks and the Semantic Web. We construct a model for user and group preference querying over RDF data as well as for ordering answers by aggregating particular attribute rankings. We have implemented our methods and heuristics in the Tokaf middleware framework prototype. We also describe experiments with Tokaf.
Eckhardt Alan, Pokorný Jaroslav, Vojtáš Peter
A system recommending top-k objects for multiple users preference
In: Proc. of FUZZ-IEEE 2007 International Conference on Fuzzy Systems, IEEE, 2007, pp. 1101-1106.
Presented at: FUZZ-IEEE 2007, 23.-26.7.2007, London,
UK.
We discuss models of user preferences in the Web environment. We construct a model for user preference querying over a number of data sources and for ordering answers by a combination of particular attribute rankings. We generalize Fagin's algorithm in two directions: we develop some new heuristics for top-k search in the model without random access, and we propose a method of ordering lists of objects by a user fuzzy function. To enable different user preferences, our system does not require objects to be sorted; instead we use a B+-tree on each of the attribute domains. This leads to a more realistic model of Web services. We implement our methods and heuristics for searching top-k answers in the Tokaf middleware framework prototype. We describe experiments with Tokaf and compare different performance measures with some other methods.
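The scoring model in this abstract (per-attribute fuzzy preferences merged by a combination function, then the k best objects returned) can be illustrated with a small sketch. It deliberately uses a naive full scan with `heapq.nlargest` rather than Fagin's algorithm or the heuristics of the paper, and the fuzzy functions, weights, and hotel data are all hypothetical.

```python
import heapq


def cheap(price):
    """Assumed fuzzy preference: 1.0 up to 100, falling linearly to 0.0 at 300."""
    return max(0.0, min(1.0, (300 - price) / 200))


def close(distance_km):
    """Assumed fuzzy preference: 1.0 at 0 km, falling linearly to 0.0 at 10 km."""
    return max(0.0, min(1.0, (10 - distance_km) / 10))


def combine(scores, weights):
    """Weighted average used here as the combination function."""
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


def top_k(objects, k, weights=(2, 1)):
    """Return the k objects with the highest combined score (naive scan)."""
    def score(obj):
        return combine((cheap(obj["price"]), close(obj["distance_km"])), weights)
    return heapq.nlargest(k, objects, key=score)


# Hypothetical objects described by two attributes.
hotels = [
    {"name": "A", "price": 100, "distance_km": 8},
    {"name": "B", "price": 250, "distance_km": 1},
    {"name": "C", "price": 150, "distance_km": 2},
]
best = [h["name"] for h in top_k(hotels, 2)]
```

A different user would be modelled by swapping in other fuzzy functions or weights. Threshold-style algorithms avoid scoring every object by reading per-attribute lists in preference order and stopping early.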
Feuerlicht George, Pokorný Jaroslav, Richta Karel
Object-Relational Database Design: Can your application benefit from SQL:2003?
In: The Inter-Networked World: ISD Theory, Practice, and Education, (Ed. Barry C., Lang M., Wojtkowski W., Wojtkowski G., Wrycza S., Zupancic J.), Springer-Verlag, New York, 2008.
ISBN: 978-0387304038
Galamboš Leo
Dynamic Inverted Index Maintenance
In: International Journal of Computer Science, Volume: 1, No: 2, 2006, pp. 157-162.
Galamboš Leo
Inverted Index Maintenance
In: Inteligentní modely, algoritmy, metody a nástroje pro vytváření sémantického webu, Ústav informatiky AV ČR, Prague, 2006, pp. 27-38.
ISBN: 80-903298-7-X
Presented at: Inteligentní modely, algoritmy, metody a nástroje pro vytváření sémantického webu - Seminář projektu programu Informační společnost, 5.10.-7.10.2006, Zadov,
Czech Republic.
This paper presents a method for dynamization which may
be used for fast and effective inverted index maintenance. Experimental
results show that the dynamization process is possible and that it guarantees the response time for the query operation and for index updates.
Galamboš Leo, Lánský Jan, Chernik K.
Compression of Semistructured Documents
In: International Enformatika Conference IEC 2006, Enformatika, Transactions on Engineering, Computing and Technology, 2006, pp. 222-227.
Galamboš Leo
Vyhledávání na Webu
In: DATAKON 2007, (Ed. Popelínský L., Výborný O.), Masaryk university, 2007, pp. 17-24.
Presented at: DATAKON 2007, 20.10.-23.10.2007, Brno,
Czech Republic.
Galamboš Leo, Lánský Jan, Žemlička M., Chernik K.
Compression of Semistructured Documents
In: International Journal of Information Technology, Volume: 4, No: 1, Elsevier, 2007, pp. 11-17.
EGOTHOR is a search engine that indexes the Web
and allows us to search Web documents. Its hit list contains the URL
and title of each hit, and also a snippet which tries to briefly
show a match. The snippet can almost always be assembled by an
algorithm that has full knowledge of the original document (mostly
an HTML page). This implies that the search engine is required to store
the full text of the documents as a part of the index.
Such a requirement leads us to choose an appropriate compression
algorithm which would reduce the space demand. One solution
could be to use common compression methods, for instance
gzip or bzip2, but it might be preferable to develop a new method
which would take advantage of the document structure, or rather, the
textual character of the documents.
There already exist special text compression algorithms and methods
for the compression of XML documents. The aim of this paper is
an integration of the two approaches to achieve an optimal
compression ratio.
Gurský Peter, Horváth T., Novotný R., Vaneková Veronika, Vojtáš Peter
UPRE: User preference based search system
In: Proceeding of the IEEE/WIC/ACM International Conference on Web Intelligence, ACM IEEE WIC, 2006, pp. 4.
Presented at: IEEE/WIC/ACM International Conference on Web Intelligence WI-06, 18.12.-22.12.2006, Hong-Kong.
Gurský Peter, Vojtáš Peter
Multikriteriálne vyhľadávanie najlepších objektov s podporou viacerých užívateľov
In: Znalosti 2007, Fakulta elektrotechniky a informatiky, VŠB - Technická univerzita Ostrava, 2007, pp. 52-62.
Presented at: Znalosti 2007, 21.2.-23.2.2007, Ostrava,
Czech Republic.
Gurský Peter, Horváth T., Jirásek J., Krajči S., Novotný R., Vaneková Veronika, Vojtáš Peter
Web Search with Variable User Model
In: DATAKON 2007, (Ed. Popelínský L., Výborný O.), Masaryk university, 2007, pp. 111-121.
Presented at: DATAKON 2007, 20.10.-23.10.2007, Brno,
Czech Republic.
Húsek Dušan, Pokorný Jaroslav, Řezanková Hana, Snášel Václav
Data clustering: From documents to the Web
In: Web Data Management Practices: Emerging Techniques and Technologies, (Ed. Vakali A., Pallis G.), Idea Group Inc., 2007, pp. 1-33.
The chapter provides a survey of some clustering methods relevant to clustering document collections and, in consequence, Web data. We start with classical methods of cluster analysis which seem relevant to clustering Web data. Graph clustering is also described, since its methods contribute significantly to clustering Web data. The use of artificial neural networks for clustering has the same motivation. Based on the previously presented material, the core of the chapter provides an overview of approaches to clustering in the Web environment. In particular, we focus on clustering web search results, in which clustering search engines arrange the search results into groups around a common theme. We conclude with some general considerations concerning the justification of so many clustering algorithms and their application in the Web environment.
Jusko Pavol, Obdržálek David, Petrůšek Tomáš
Software-Hardware Mapping in a Robot Design
In: Proc. of International Conference on Research and Education in Robotics, LNCS, Springer-Verlag, Heidelberg, 2008, pp. 42-51.
ISBN: 978-80-7378-042-5
Presented at: EUROBOT 2008: International Conference on Research and Education in Robotics, 21.-24.5.2008, Heidelberg, Germany.
Kochánek Jiří, Lánský Jan, Uzel Petr, Žemlička M.
Multistream Compression
In: Proc. of Data Compression Conference, IEEE Computer Society Press, 2008, pp. 557.
ISBN: 978-0-7695-3121-2
Presented at: DCC 2008: Data Compression Conference, 25.-27.3.2008, Snowbird, Utah, USA.
Kudělka Miloš, Snášel Václav, Lehečka Ondřej, El-Qawasmeh Eyas, Pokorný Jaroslav
Web pages reordering and clustering based on web patterns
In: SOFSEM 2008: Theory and Practice of Computer Science, LNCS 4910, Springer, 2008, pp. 731-742.
Presented at: 34th International Conference on Current Trends in Theory and Practice of Computer Science, 19.-25.1.2008, Nový Smokovec, High Tatras,
Slovakia.
Kuthan T., Lánský Jan
Genetic Algorithms in Syllable-Based text Compression
In: Proceedings of the Dateso 2007, CEUR Workshop Proc., 2007, pp. 21-34.
Presented at: Dateso 2007 Annual International Workshop on DAtabases, TExts, Specifications and Objects, 18.4.-20.4.2007, Desná - Černá Říčka,
Czech Republic.
Syllable-based text compression is a new approach to compression by symbols. In this concept, syllables are used as the compression symbols instead of the more common characters or words. The technique has proven worthwhile especially on short to medium-length text files. The effectiveness of the compression is greatly affected by the quality of the dictionaries of syllables characteristic of a given language. These dictionaries are usually created by a straightforward analysis of text corpora. In this paper we introduce another way of obtaining these dictionaries, using a genetic algorithm. We believe that dictionaries built this way may help us lower the compression ratio. We measure this effect on a set of Czech and English texts.
Lánský Jan, Galamboš Leo, Chernik K.
Komprese webového uložiště
In: Proceedings of ITAT 2006, Information Technologies - Applications and Theory, 2006.
ISBN: 80-969184-4-3
Presented at: ITAT 2006, 26.9.-1.10.2006, Chata Kosodrevina, Bystrá dolina, Nízke Tatry,
Slovakia.
Lánský Jan, Chernik K., Vlčková Z.
Syllable-Based Burrows-Wheeler Transform
In: Proceedings of the Dateso 2007, CEUR Workshop Proc., 2007, pp. 1-10.
Presented at: Dateso 2007 Annual International Workshop on DAtabases, TExts, Specifications and Objects, 18.4.-20.4.2007, Desná - Černá Říčka,
Czech Republic.
The Burrows-Wheeler Transform (BWT) is a compression method that reorders an input string into a form better suited to subsequent compression. Usually the Move-To-Front transform and then Huffman coding are applied to the permuted string. The original method [3] from 1994 was designed for compression over an alphabet of letters. In 2001, versions working with word and n-gram alphabets were presented. The newest version copes with the syllable alphabet [7]. The goal of this article is to compare BWT compression working with alphabets of letters, syllables, words, 3-grams and 5-grams.
Lánský Jan, Žemlička M.
Compression of a Set of Strings
In: Proc. of 2007 Data Compression Conference (DCC 2007), IEEE Computer Society Press, 2007, pp. 390-390.
Presented at: DCC 2007 Data Compression Conference, 27.-29.3.2007, Snowbird, Utah,
USA.
Lánský Jan, Chernik K., Vlčková Z.
Comparison of Text Models for BWT
In: Proc. of 2007 Data Compression Conference (DCC 2007), IEEE Computer Society Press, 2007, pp. 389-389.
Presented at: DCC 2007 Data Compression Conference, 27.-29.3.2007, Snowbird, Utah,
USA.
Lokoč Jakub, Skopal Tomáš
On Reinsertions in M-tree
In: Proc. of 1st international workshop on Similarity Search and Applications, IEEE Computer Society Press, 2008, pp. 121-128.
ISBN: 0-7695-3101-6
Presented at: SISAP 2008: 1st international workshop on Similarity Search and Applications, 11.-12.4.2008, Cancun, Mexico.
Lokoč Jakub, Skopal Tomáš
NM-tree: Flexible Approximate Similarity Search in Metric and Non-metric Spaces
In: Proc. of 19th International Conference on Database and Expert Systems Applications, LNCS 5181, Springer-Verlag, Berlin, 2008, pp. 312-325.
ISBN: 978-3-540-85653-5
Presented at: DEXA 2008: 19th International Conference on Database and Expert Systems Applications, 1.-5.9.2008, Turin, Italy.
Lokoč Jakub, Skopal Tomáš
On Reinsertions in M-tree
In: 1st International Workshop on Similarity Search and Applications (SISAP 2008), IEEE, 2008.
(in_print)
Presented at: SISAP 2008 - Workshop at ICDE 2008, 11.-12.04.2008, Cancun,
Mexico.
In this paper we introduce a new M-tree building method, utilizing the classic idea of forced reinsertions. When a leaf is about to split, some distant objects are removed from the leaf (reducing the covering radius) and then inserted again into the M-tree in the usual way. A regular leaf split is performed only after a series of unsuccessful reinsertion attempts. We expect that forced reinsertions will result in more compact M-tree hierarchies (i.e., more efficient query processing), while the index construction costs should be kept as low as possible. Considering both low construction costs and low querying costs, we examine several combinations of construction policies with reinsertions. The experiments show that forced reinsertions could significantly decrease the number of distance computations, thus speeding up indexing as well as querying.
Matousek T., Zavoral Filip
Extracting Zing Models from C Source Code
In: SOFSEM 2007, LNCS 4362, Springer, Berlin, 2007, pp. 900-910.
Presented at: SOFSEM 2007, 20.2.-26.2.2007, Harrachov,
Czech Republic.
In the paper, we propose an approach to an automatic extraction of verification models for the C language source code. We primarily focus on the representation of pointers and arrays, which make the extraction from the C language specific. We provide an implementation of the model extractor as a part of our broader effort to develop a verifier of Windows kernel drivers based on the Zing model checker. To demonstrate the feasibility of our approach, we give examples of the extraction results on a practical synchronization problem.
Mlýnková Irena
An Analysis of Approaches to XML Schema Inference
In: Proc. of 4th International Conference on Signal-Image Technology and Internet-Based Systems, IEEE Computer Society Press, 2008.
ISBN: 0-7695-3101-6
(in_print)
Presented at: SITIS 2008: 4th International Conference on Signal-Image Technology and Internet-Based Systems, 30.11.-3.12.2008, Bali, Indonesia.
Mlýnková Irena
Equivalence of XSD Constructs and its Exploitation in Similarity Evaluation
In: Proc. of 7th International Conference on Ontologies, DataBases, and Applications of Semantics, LNCS 5332, Springer-Verlag, Berlin, 2008, pp. 1252-1269.
ISBN: 978-3-540-85712-9
(in_print)
Presented at: ODBASE 2008: 7th International Conference on Ontologies, DataBases, and Applications of Semantics, 11.-13.11.2008, Monterrey, Mexico.
Mlýnková Irena
Similarity of XML Schema Definitions
In: Proc. of 8th ACM Symposium on Document Engineering, ACM Press, Berlin, 2008, pp. 187-190.
ISBN: 978-1-60558-081-4
Presented at: DocEng 2008: 8th ACM Symposium on Document Engineering, 16.-19.9.2008, Sao Paulo, Brazil.
Mlýnková Irena
Current Trends in Testing XMLMSs
In: Proc. of 17th International Conference on Information Systems Development, Springer Science + Business Media, Inc., Berlin, 2008.
ISBN: 978-3-540-85712-9
(in_print)
Presented at: ISD 2008: 17th International Conference on Information Systems Development, 25.-27.8.2008, Paphos, Cyprus.
Mlýnková Irena
XML Schema Inference: A Study
Technical Report: 2008/6, Dep. of Software Engineering, MFF, Charles University, Prague, 2008, 18 p.
Mlýnková Irena
UserMap - an Enhancing of User-Driven XML-to-Relational Mapping Strategies
Technical Report: 2007/3, Charles University, Prague, 2007, 38 p.
As XML has undoubtedly become a standard for data representation, it is necessary to propose and implement techniques for efficient management of XML data. A natural alternative is to exploit the features and functions of (object-)relational database systems, i.e. to rely on their long theoretical and practical history. The main concern of such techniques is the choice of an appropriate XML-to-relational mapping strategy.
In this paper we focus on enhancing user-driven techniques, which leave the mapping decisions in the hands of users. We propose an algorithm that exploits the user-given annotations more deeply, searching for the user-specified "hints" in the rest of the schema, and applies an adaptive method to the remaining schema fragments. We describe the proposed algorithm, the similarity measure designed for this purpose, a sample implementation of the key features of the proposal, called UserMap, and the results of experimental testing on real XML data.
Mlýnková Irena, Pokorný Jaroslav
From XML Schema to Object-Relational Database – An XML Schema-Driven Mapping Algorithm
In: Proceedings of the IADIS International Conference WWW/Internet, (Ed. Isaias P., Karmakar N.), IADIS, 2004, pp. 115-122.
Presented at: IADIS International Conference WWW/Internet 2004, 06.-09. 10. 2004, Madrid,
Spain.
Since XML has become a crucial format for representing information, it is necessary to establish techniques for managing XML documents. A possible solution is to store XML data in (object-)relational databases. For this purpose, most existing techniques exploit an XML schema of the stored XML data, usually expressed in DTD. But the more complex today’s applications are, the less sufficient DTD becomes and the more necessary it is to use the XML Schema language. The paper proposes an algorithm for mapping XML Schema structures to an object-relational database schema (defined by the SQL:1999 standard) using a (modified) DOM interface, and an algorithm for storing valid XML data into relations of the resulting schema. The main aim is to exploit the object-oriented features of XML Schema and the advantages of object-relational databases, and to preserve the structure as well as the semantic constraints of the source schema in the target schema.
Mlýnková Irena, Pokorný Jaroslav
XML in the World of (Object-) Relational Database Systems
In: Information Systems Development Advances in Theory, Practice and Education, (Ed. Vasilecas O. et al.), Kluwer, 2004.
ISBN: 0-387-25026-3
Presented at: 13th International Conference on Information Systems Development, ISD`2004, 9.9.-11.9. 2004, Vilnius,
Lithuania.
Mlýnková Irena, Toman Kamil, Pokorný Jaroslav
Statistical Analysis of Real XML Data Collections
Technical Report: 2006/5, MFF UK, Prague, 2006, 39 p.
Recently XML has achieved the leading role among languages for data representation, and thus we can witness a massive boom of corresponding techniques for managing XML data. Most of the processing techniques, however, suffer from various bottlenecks that worsen their time and/or space efficiency. We assume that the main reason is that they consider XML collections too globally, involving all their possible features, although real data are often much simpler. Even though some techniques do restrict the input data, the restrictions are often unnatural. In this paper we analyze existing XML data, their structure and real complexity in particular. We have gathered more than 20GB of real XML collections and implemented a robust automatic analyzer. The analysis considers existing papers on similar topics, trying to confirm or refute their observations as well as to bring new findings. It focuses on frequent but often ignored XML items (such as mixed content or recursion) and the relationship between schemes and their instances.
Mlýnková Irena
XML Data in (Object-)Relational Databases
In: Diploma Thesis, Charles University, Prague, 2007, 142 p.
Mlýnková Irena, Toman Kamil, Pokorný Jaroslav
Statistical Analysis of Real XML Data Collections
In: Proceeding of the 13th International Conference on Management of Data - COMAD 2006, (Ed. Lakshmanan, L.L., Roy, P., Tung, A.), Tata McGraw Hill Publ. Comp., Delhi, 2006, pp. 20-31.
Presented at: 13th International Conference on Management of Data - COMAD 2006, 14.12.-16.12.2006, Delhi,
India.
Mlýnková Irena
An XML-to-Relational User-driven Mapping Strategy Based on Similarity and Adaptivity
In: Proc. of SYRCoDIS `07 4th Spring Young Researchers Colloquium on Databases and Information Systems, Volume: 256, CEUR Workshop Proc., 2007, pp. 9-20.
Presented at: SYRCoDIS`07, 31.5.-1.6.2007, Moscow,
Russia.
As XML has become a standard for data representation, it is necessary to propose and implement techniques for efficient management of XML data. A natural alternative is to exploit the features and functions of (object-)relational database systems, i.e. to rely on their long theoretical and practical history. The main concern of such techniques is the choice of an appropriate XML-to-relational mapping strategy.
In this paper we focus on enhancing user-driven techniques, which leave the mapping decisions in the hands of users. We propose an algorithm that exploits the user-given annotations more deeply, searching for the user-specified “hints” in the rest of the schema, and applies an adaptive method to the remaining schema fragments. We describe the algorithm theoretically, discussing the key ideas of the approach, the chosen solutions, their reasons, and their consequences. Finally, we give an overview of the open issues related to implementation of the proposed algorithm and its experimental testing on real XML data.
Mlýnková Irena, Pokorný Jaroslav
Similarity and XML Technologies
In: Proc. of IADIS International Conference WWW/Internet 2007, (Ed. Isaias P., Nunes M.B., Barroso J.), IADIS, 2007, pp. 277-287.
Presented at: WWW/Internet 2007, 5.-8.10.2007, Vila Real,
Portugal.
As XML technologies have undoubtedly become a standard for data representation, it is necessary to provide efficient implementations of W3C recommendations. A possible optimization of particular types of techniques lies in exploiting the similarity of XML data and/or matching of XML patterns. In this paper we provide an overview and classification of such techniques from various points of view. We briefly describe the best known representatives of particular ideas and discuss their key advantages and disadvantages. The text should serve as a good starting point for proposing an appropriate similarity-based optimization.
Mlýnková Irena, Pokorný Jaroslav
Similarity of XML Schema Fragments Based on XML Data Statistics
In: Proc. of Innovations '07: Proceedings of the 4th International Conference on Innovations in Information Technology, IEEE Computer Society Press, 2007, pp. 243-247.
Presented at: 4th International Conference on Innovations in Information Technology, 18.-20.11.2007, Dubai,
United Arab Emirates.
As XML has become a standard for data representation, it can be found in plenty of information technologies. A possible optimization of XML-based approaches is to exploit the similarity of XML data. In this paper we propose a technique for evaluating the similarity of XML schema fragments, focusing on two often omitted aspects: the structural level of similarity and the tuning of parameters of the similarity measure. In the former case we exploit the results of a statistical analysis of real-world XML data. In the latter case we show that the tuning problem is a kind of constraint optimization problem and can be solved using corresponding approaches. We have analyzed the (dis)advantages of two of them, genetic algorithms and simulated annealing, and in further experiments we show that appropriate tuning produces a more precise similarity measure.
Mlýnková Irena, Pokorný Jaroslav
UserMap - an Adaptive Enhancing of User-Driven XML-to-Relational Mapping Strategies
In: ADC '08: Proceedings of the 19th Australasian Database Conference, Volume: 75, Australia Computer Society, Wollongong, New South Wales, 2008, pp. 165-174.
Presented at: ADC '08: 19th Australasian Database Conference, 22.-25.01.2008, Wollongong, New South Wales,
Australia.
As XML has become a standard for data representation, it is necessary to propose and implement techniques for efficient management of XML data. A natural alternative is to exploit the features of (object-)relational database systems, i.e. to rely on their long theoretical and practical history. The main concern of such techniques is the choice of an appropriate XML-to-relational mapping strategy. In this paper we focus on enhancing user-driven techniques, which leave the mapping decisions in the hands of users who specify their requirements using schema annotations. We describe our prototype implementation, called UserMap, which exploits the annotations more deeply, searching for the user-specified “hints” in the rest of the schema, and applies an adaptive method to the remaining schema fragments. Using a sample set of supported fixed mapping methods, we discuss problems related to query evaluation for storage strategies generated by the system, in particular correction of the candidate set of annotations and the related query translation. Finally, we describe the architecture of the whole system.
Mlýnková Irena
UserMap - an Exploitation of User-Specified XML-to-Relational Mapping Requirements and Related Problems
Technical Report: 2007/8, Charles University, Prague, 2007, 26 p.
As XML has become a standard for data representation, it is necessary to propose and implement techniques for efficient management of XML data. A natural alternative is to exploit the features of (object-)relational database systems, i.e. to rely on their long theoretical and practical history. The main concern of such techniques is the choice of an appropriate XML-to-relational mapping strategy.
In this paper we focus on enhancing user-driven techniques, which leave the mapping decisions in the hands of users who specify their requirements using schema annotations. We describe our prototype implementation, called UserMap, which exploits the annotations more deeply, searching for the user-specified “hints” in the rest of the schema, and applies an adaptive method to the remaining schema fragments. Using a sample set of supported fixed mapping methods, we discuss problems related to query evaluation for storage strategies generated by the system, in particular correction of the candidate set of annotations and the related query translation. Finally, we describe the architecture of the whole system.
Nečaský Martin, Pokorný Jaroslav
Conceptual Modeling of IS-A Hierarchies for XML
In: Proc. of 18th European Japanese Conference on Information Modelling and Knowledge Bases, EJC2008 Program Committee and EJC2008 Program Coordination team, 2008, pp. 65-84.
ISBN: 978-3-540-85712-9
(in_print)
Presented at: EJC 2008: 18th European Japanese Conference on Information Modelling and Knowledge Bases, 2.-6.6.2008, Tsukuba, Japan.
Nečaský Martin
Conceptual Model Based Normalization of XML Views
In: Proc. of DATESO 2008, (Ed. J. Pokorný, V. Snášel, K. Richta), CEUR Workshop Proc., 2008, pp. 13-24.
Presented at: Dateso 2008: Annual International Workshop on DAtabases, TExts, Specifications and Objects, 16.4.-18.4.2008, Desná - Černá Říčka,
Czech Republic.
As the popularity of XML as a format for data representation grows, the need for storing XML data in an effective way grows as well. Recent research has provided us with effective solutions based on storing XML data in relational databases and with new technologies based on storing XML data in native form. However, the design of XML databases has not yet been studied sufficiently. In this paper, we assume a set of XML schemes that describe the XML representation of our data in several types of XML documents. We show that we usually cannot store the data directly in this representation because it can contain redundancies. To design an optimal database schema we therefore need to locate these redundancies and eliminate them. We describe two types of redundancies in XML data and show how to utilize a conceptual schema of the XML schemes to locate such redundancies. We also show how to normalize the XML schemes to eliminate these redundancies.
Nečaský Martin
Conceptual modeling for XML
In: Diploma Thesis, Charles University, Prague, 2007, 153 p.
Nečaský Martin
Conceptual Modeling for XML: A Survey
Technical Report: 2006-3, Dep. of Software Engineering, Faculty of Mathematics and Physics, Charles University, Prague, 2006, 54 p.
XML has recently become the standard format for the exchange of data between information systems and is also frequently applied as a logical database model. If we use XML as a logical database model, we need a conceptual model to describe its semantics. However, XML as a logical database model has some special characteristics which make existing conceptual models such as E-R or UML unsuitable. In this paper, the current approaches to the conceptual modeling of XML data are described in a uniform style. A list of requirements for XML conceptual models is presented, and the described approaches are compared on the basis of these requirements.
Nečaský Martin
Conceptual Modeling for XML: A Survey
In: Proceedings of the Dateso 2006, CEUR-WS, 2006, pp. 40-53.
Presented at: Dateso 2006 Annual International Workshop on DAtabases, TExts, Specifications and Objects, 26.4.-28.4.2006, Desná - Černá Říčka,
Czech Republic.
XML has recently become the standard format for the exchange of data between information systems and is also frequently applied as a logical database model. If we use XML as a logical database model, we need a conceptual model to describe its semantics. However, XML as a logical database model has some special characteristics which make existing conceptual models such as E-R or UML unsuitable. In this paper, the current approaches to the conceptual modeling of XML data are described in a uniform style. A list of requirements for XML conceptual models is presented, and the described approaches are compared on the basis of these requirements.
Nečaský Martin
XSEM – A Conceptual model for XML Data
In: Proceedings of Communications and Doctoral Consortium, 7th International Baltic Conference on Databases and Information Systems, Vilnius, 2006, pp. 328-331.
XML has recently become the standard format for the exchange of data between information systems and is also frequently applied as a logical database model. If we use XML as a logical database model, we need a conceptual model to describe its semantics. In this paper, we describe our work on a new conceptual model for XML called XSEM, created as a combination of several approaches applied in the area of conceptual modeling for XML.
Nečaský Martin
XSEM - A Conceptual Model for XML Data
In: Inteligentní modely, algoritmy, metody a nástroje pro vytváření sémantického webu, Ústav informatiky AV ČR, Prague, 2006, pp. 60-69.
ISBN: 80-903298-7-X
Presented at: Inteligentní modely, algoritmy, metody a nástroje pro vytváření sémantického webu - Seminář projektu programu Informační společnost, 5.10.-7.10.2006, Zadov,
Czech Republic.
In this paper we briefly describe a new conceptual model for XML data called XSEM. The model is a combination of several approaches in the area of conceptual modeling of XML data. The model divides the process of conceptual modeling of XML data into two levels. On the first level, a designer designs an overall non-hierarchical conceptual schema of a domain. On the second level, he or she derives different hierarchical representations of parts of the overall conceptual schema using transformation operators. These hierarchical representations describe how the data is organized in an XML form.
Nečaský Martin
XSEM - A Conceptual Model for XML
In: Proceedings of the Fourth Asia-Pacific Conference on Conceptual Modelling (APCCM 2007) , (Ed. Roddick J. F., Annika H.), 2007, pp. 37-48.
Presented at: The Fourth Asia-Pacific Conference on Conceptual Modelling (APCCM 2007), 30.1.-2.2.2007, Ballarat, Victoria,
Australia.
We propose a new conceptual model for XML data called XSEM as a combination of several approaches in the area of conceptual modeling for XML. The model divides the conceptual modeling process of XML data into two levels. On the first level, a designer designs an overall non-hierarchical conceptual schema of a domain. On the second level, he or she derives different hierarchical representations of parts of the overall conceptual schema using transformation operators. These hierarchical representations describe how the data is organized in an XML form.
Nečaský Martin
Using XSEM for Modeling XML Interfaces of Services in SOA
In: Proceedings of the Dateso 2007, CEUR Workshop Proc., 2007, pp. 35-46.
Presented at: Dateso 2007 Annual International Workshop on DAtabases, TExts, Specifications and Objects, 18.4.-20.4.2007, Desná - Černá Říčka,
Czech Republic.
In this paper we briefly describe a new conceptual model for XML data called XSEM and how to use it for modeling XML interfaces of services in a service-oriented architecture (SOA). The model is a combination of several approaches in the area of conceptual modeling of XML data. It divides the process of conceptual modeling of XML data into two levels. The first level consists of designing an overall non-hierarchical conceptual schema of the domain. The second level consists of deriving different hierarchical representations of parts of the overall conceptual schema using transformation operators. Each hierarchical representation models an XML schema describing the structure of the data exchanged between a service interface and external services.
Nečaský Martin, Pokorný Jaroslav
Extending E-R for Modelling XML Keys
In: Proc. of IEEE ICDIM 2007: Proc. of The Second International Conference on Digital Information Management, IEEE Computer Society, 2007, pp. 236-241.
Presented at: ICDIM 2007: The Second International Conference on Digital Information Management, 28.-31.10.2007, Lyon,
France.
With the growing popularity of XML there is a need not only to describe the structure of XML data but also its semantics. For the conceptual modelling of XML we can use existing conceptual models. However, special features of XML require extensions of these models. In this paper, we study conceptual modelling of XML keys. We extend the notion of E-R keys to be suitable for modelling the semantics of XML keys and we show how to express them on the XML logical level.
Nečaský Martin, Pokorný Jaroslav
Design and Management of Semantic Web Services using Conceptual Model
In: Proceedings of The 23rd Annual ACM Symposium on Applied Computing (SAC 2008), Volume: 3, Fortaleza, Ceará, 2008, pp. 2243-2247.
Presented at: 23rd Annual ACM Symposium on Applied Computing, 16.-20.3.2008, Fortaleza,
Brazil.
Obdržálek David
Usage of real-world robotics in Semantic Web
In: Proc. of 5th International Conference on Innovations in Information Technology, IEEE Computer Society Press, 2008.
(in_print)
Presented at: Innovations 2008: 5th International Conference on Innovations in Information Technology, 16.-18.12.2008, Al Ain, United Arab Emirates.
Obdržálek David
Daly by se použít robotické metody i v sémantickém webu?
In: Proc. of Information Technologies - Application and Theory, (Ed. P. Vojtáš), PONT, Seňa, 2008, pp. 87-90.
ISBN: 978-80-969184-9-5
Presented at: Konferencia o informačných (inteligentných) technológiách - aplikácie a teória 2008, 22.-26.9.2008, High Tatras,
Slovakia.
Obdržálek David, Kulhánek Jiří
Generating and handling of differential data in DataPile-oriented systems
In: Proceedings of the IASTED International Conference on Databases and Applications (DBA 2006), (Ed. Hamza M. H.), 2006.
ISBN: 0-88986-560-4
Presented at: IASTED International Conference on Databases and Applications (DBA 2006) as part of the 24th IASTED International Multi-Conference on Applied Informatics, 13.2.-15.2.2006, Innsbruck,
Austria.
The basics of the DataPile structure for data handling systems have been theoretically designed and published. During the implementation of such a system, numerous problems arose that were not addressed during the theoretical design phase. In a real production environment, the applications connected to the DataPile core need special treatment and place important requirements on the data synchronization process. This article is concerned with generating differential data distributed from the central DataPile storage to individual applications. It is shown that the synchronization part of a DataPile-structured system can be implemented and run efficiently despite the restrictions or limitations these individual applications impose.
Obdržálek David, Kulhánek Jiří
Statická typová kontrola XSLT programů
In: Proceedings of ITAT 2005, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), Prírodovedecká fakulta Univerzity Pavla Jozefa Šafárika, Košice, 2005, pp. 393-401.
ISBN: 80-7097-609-8
Presented at: ITAT 2005, 20.9. - 25.9.2005, Račkova dolina,
Slovakia.
Obdržálek David, Benda J.
GFE - Graphical Finite State Machine Editor for Parallel Execution
In: ICEC 2007, (Ed. Ma L., Nakatsu R., Rauterberg M.), LNCS 4740, Springer, IFIP, 2007, pp. 401-406.
Presented at: ICEC 2007 - International Conference on Entertainment Computing, 20.-23.06.2005, Shanghai,
China.
Ondrejička Matúš, Pokorný Jaroslav
Extending Fagin's algorithm for more users based on multidimensional B-tree
In: Proc. of 12th Advances in Databases and Information Systems, LNCS 5207, Springer-Verlag, Berlin, 2008, pp. 199-214.
ISBN: 978-3-540-85712-9
Presented at: ADBIS 2008: 12th Advances in Databases and Information Systems, 5.-9.9.2008, Pori, Finland.
Petricek V., Escher T., Cox I. J., Margetts H.
The Web Structure of E-Government - Developing a Methodology for Quantitative Evaluation
In: Proceedings of the 15th International Conference on World Wide Web WWW 2006, ACM Press, New York, 2006, pp. 669-678.
Presented at: International Conference on World Wide Web WWW 2006, 23.5.-26.5.2006, Edinburgh,
UK.
Podzimek Michal, Dokulil Jiří, Yaghob Jakub, Zavoral Filip
Mám hlad: pomůže mi Sémantický web?
In: Proc. of Information Technologies - Application and Theory, (Ed. P. Vojtáš), PONT, Seňa, 2008, pp. 91-94.
ISBN: 978-80-969184-9-5
Presented at: Konferencia o informačných (inteligentných) technológiách - aplikácie a teória 2008, 22.-26.9.2008, High Tatras,
Slovakia.
Pokorný Jaroslav, Vávra Jan, Snášel Václav
A Renewed Matrix Model for XML Data
In: Proc. of 8th International Conference on Intelligent Systems Design and Applications, IEEE Computer Society, 2008.
(in_print)
Presented at: ISDA 2008: 8th International Conference on Intelligent Systems Design and Applications, 25.-28.11.2008, Kaohsiung, Taiwan.
Pokorný Jaroslav, Richta Karel, Valenta Michal
Cellstore: Educational and Experimental XML-Native DBMS
In: The Inter-Networked World: ISD Theory, Practice, and Education, (Ed. Barry C., Lang M., Wojtkowski W., Wojtkowski G., Wrycza S., Zupancic J.), Springer-Verlag, New York, 2008.
ISBN: 978-0387304038
Pokorný Jaroslav
Digitální knihovny v prostředí Sémantického webu
In: Sborník z 10. ročníku semináře AKP 2005 (automatizace knihovnických procesů - 10.), (Ed. D. Tkačíková, B. Ramajzlová), VIC ČVUT, 2005, pp. 64-73.
Presented at: AKP 2005 (Automatizace knihovnických procesů) 10. ročník semináře, 3.5.-4.5.2005, Liberec,
Czech Republic.
Digital libraries (DL) contribute to the development of the Semantic Web and at the same time can use its technological elements. This makes it possible to achieve better data management in DLs and easier integration of multiple DLs, as well as increased possibilities of interaction with other information sources. The idea behind the Semantic Web is to extend web pages with markup that captures at least part of the meaning of the page content. This semantic markup means adding certain metadata that provide formal semantics for web content. Semantic Web projects build on technologies that are being developed as standards. These include the languages XML, XML Schema, RDF and RDF Schema. These languages serve for writing metadata, part of which is organized in ontologies. A further level of the Semantic Web uses languages of logic. The basis of processing in a web conceived in this way is provided by programs, namely software agents. The aim of the article is to introduce Semantic Web technologies and show their application in building DLs.
Pokorný Jaroslav
Směrem k Sémantickému Webu
In: Sborník příspěvků 20. ročníku konference Moderní databáze, KOMIX, Roudnice nad Labem, 2005, pp. 15-24.
Presented at: 20. ročník konference Moderní databáze, 26.5.-27.5.2005, Hotel Amber, Roudnice nad Labem,
Czech Republic.
Current web search engines based on text information retrieval techniques are unable to exploit the semantic knowledge inside a web page and therefore cannot give satisfactory answers to user queries. A possible solution appears to be the so-called Semantic Web, which Tim Berners-Lee described in his vision at the end of the 1990s. The idea behind the Semantic Web is to extend web pages with markup that captures at least part of the meaning of the page content. This semantic markup means adding certain metadata that provide formal semantics for web content. Semantic Web projects build on several technologies, the basic ones of which are already standardized or at least recommended. These include the languages XML, XML Schema, RDF, and RDF Schema. These languages are used to write metadata, some of which are organized in so-called ontologies. A further level of the Semantic Web uses logic languages. The basis of processing in a web conceived this way is provided by software agents, i.e., programs that work autonomously and proactively. The aim of the article is to introduce the technologies that support the creation of the Semantic Web, to show its architecture, and to mention some projects already under way that aim at creating intelligent web information services, personalized websites, and semantically enhanced search engines.
Pokorný Jaroslav, Smižanský J.
Page Content Rank: an Approach to the Web Content Mining
In: Proceedings of IADIS International Conference Applied Computing, Volume: 1, IADIS Press, 2005, pp. 289-296.
ISBN: 3-540-31198-X
Presented at: IADIS International Conference Applied Computing, 22.2.-25.2. 2005, Algarve,
Portugal.
Methods of web data mining can be divided into several categories according to the kind of information mined and the goals each category sets: Web Structure Mining (WSM), Web Usage Mining (WUM), and Web Content Mining (WCM). The objective of this paper is to propose a new WCM method of page relevance ranking based on exploring page content. The method, called Page Content Rank (PCR) in the paper, combines a number of heuristics that seem important for analysing the content of Web pages. The importance of a page is determined on the basis of the importance of the terms it contains. The importance of a term is specified with respect to a given query q and is based on the term's statistical and linguistic features. As the source set of pages for mining, we use the set of pages returned by a search engine in response to the query q. PCR uses a neural network as its inner classification structure. We describe an implementation of the proposed method and compare its results with another existing classification system, the PageRank algorithm.
Pokorný Jaroslav
Database architectures: current trends and their relationships to environmental data management
In: Proceedings of the 19th Conference EnviroInfo, Masaryk University, Brno, 2005, pp. 24-28.
Presented at: 19th Conference EnviroInfo (Informatics for Environmental Protection, Networking Environmental Information), 7.9.-9.9.2005, Brno,
Czech Republic.
Ever-increasing environmental demands from customers, authorities, and governmental organizations, as well as new business control functions, are being integrated into environmental management systems (EMSs). With the production of huge data sets and their processing in real-time applications, the need for environmental data management has grown significantly. Current trends in database development and associated research meet these challenges. The paper discusses recent advances in database technologies and attempts to highlight them with respect to the requirements of EMSs.
Pokorný Jaroslav, Reschke J.
Exporting relational data into a native XML store
In: Advances in Information Systems Development - Bridging the Gap between Academia and Industry, (Ed. A.G. Nilsson et al), Volume: 2, Springer Verlag, 2006, pp. 807-818.
ISBN: 0-387-30834-2
Pokorný Jaroslav
Databázové architektury: současné trendy a jejich vztah k novým požadavkům praxe
In: Sborník příspěvků 20. ročníku konference Moderní databáze, KOMIX, 2006, pp. 5-14.
ISBN: 80-239-7109-3
Presented at: Moderní databáze, 30.5.-31.5.2006, Zvánovice, Czech Republic.
Pokorný Jaroslav
Database architectures: current trends and their relationships to environmental data management
In: Environmental Modelling & Software, Volume: 21, No: 11, Elsevier Science, 2006, pp. 1579-1586.
Pokorný Jaroslav
Database Architectures: Current Trends and Their Relationships to Requirements of Practice
In: Proceedings of Information Systems Development ’06 Conference, Budapest, 2006.
Presented at: ISD’ 06 Conference, 31.8.-2.9.2006, Budapest,
Hungary.
Pokorný Jaroslav
Zpracování proudů dat
In: Proceedings of the Annual Database Conference DATAKON 2006, Masaryk University, Brno, 2006, pp. 61-76.
Presented at: DATAKON 2006, 20.10.-23.10.2006, Brno,
Czech Republic.
Šesták Radovan, Lánský Jan, Žemlička M.
Suffix Array for Large Alphabet
In: Proc. of 2008 Data Compression Conference (DCC 2008), IEEE Computer Society Press, 2008, pp. 543-543.
Presented at: DCC 2008 Data Compression Conference, 25.-27.3.2008, Snowbird, Utah,
USA.
Šesták Radovan, Lánský Jan
Compression of Concatenated Web Pages Using XBW
In: SOFSEM 2008: Theory and Practice of Computer Science, LNCS 4910, Springer, 2008, pp. 743-754.
Presented at: 34th International Conference on Current Trends in Theory and Practice of Computer Science, 19.-25.1.2008, Nový Smokovec, High Tatras,
Slovakia.
XBW [10] is a modular program for lossless compression that enables testing of various combinations of algorithms. We obtained the best results with an XML parser that creates a dictionary of syllables or words, combined with the Burrows-Wheeler transform, hence the name XBW. The motivation for creating a parser that handles non-valid XML and HTML files was the EGOTHOR system [5] for full-text searching. On files of approximately 20 MB, formed by hundreds of web pages, we achieved twice the compression ratio of bzip2 while running only twice as long. For smaller files, XBW gives very good results compared with other programs, especially for languages with rich morphology such as Slovak or German. For any large text file, our program offers a good balance of compression and run time.
The XBW program allows the parser and the coder to be used with any implemented compression algorithm. We have implemented the Burrows-Wheeler transform, which together with MTF and RLE forms block compression; the dictionary methods LZC and LZSS; and the statistical method PPM. The coder offers a choice of Huffman and arithmetic coding.
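The block-compression chain named in the abstract above (Burrows-Wheeler transform, then MTF, then RLE) can be illustrated with a minimal sketch. This is not the XBW implementation: it works on single characters rather than syllables or words, uses a naive rotation sort instead of an efficient suffix sort, and all function names are hypothetical.

```python
def bwt(text, sentinel="\0"):
    """Naive Burrows-Wheeler transform: last column of the sorted rotations."""
    assert sentinel not in text, "sentinel must not occur in the input"
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column, sentinel="\0"):
    """Naive inverse: rebuild the sorted rotation table one column at a time."""
    n = len(last_column)
    table = [""] * n
    for _ in range(n):
        table = sorted(last_column[i] + table[i] for i in range(n))
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


def move_to_front(data):
    """Move-to-front coding: recently seen symbols get small indices."""
    alphabet = sorted(set(data))
    output = []
    for symbol in data:
        index = alphabet.index(symbol)
        output.append(index)
        alphabet.insert(0, alphabet.pop(index))
    return output


def run_length(values):
    """Run-length coding as a list of (value, run length) pairs."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs


if __name__ == "__main__":
    transformed = bwt("banana")
    print(run_length(move_to_front(transformed)))
    print(inverse_bwt(transformed))
```

The transform groups equal symbols together, MTF turns such groups into repeated small indices, and RLE collapses the repeats; an entropy coder such as Huffman or arithmetic coding would then follow.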
Skopal Tomáš, Pokorný Jaroslav, Snášel Václav
Nearest Neighbours Search using the PM-tree
In: Proceedings of The 10th International Conference on Database Systems for Advanced Applications, LNCS 3453, Springer-Verlag, 2005, pp. 803-815.
Presented at: DASFAA 2005, 17.4.-20.4.2005, Beijing,
China.
Skopal Tomáš
On Fast Non-Metric Similarity Search by Metric Access Methods
In: Proceedings of 10th International Conference on Extending Database Technology EDBT 2006, (Ed. Y. Ioannidis et al.), 2006, pp. 718-736.
ISBN: 3-540-32960-9
Presented at: EDBT 2006, 26.3.-31.3.2006, Munich,
Germany.
The retrieval of objects from a multimedia database employs a measure which defines a similarity score for every pair of objects. The measure should effectively follow the nature of similarity; hence, it should not be limited by the triangular inequality, which is regarded as a restriction in similarity modeling. On the other hand, the retrieval should be as efficient (or fast) as possible. The measure is thus often restricted to a metric, because then the search can be handled by metric access methods (MAMs). In this paper we propose a general method of non-metric search by MAMs. We show that the triangular inequality can be enforced for any semimetric (a reflexive, non-negative, and symmetric measure), resulting in a metric that preserves the original similarity orderings (retrieval effectiveness). We propose the TriGen algorithm for turning any black-box semimetric into an (approximated) metric, using only the distance distribution in a fraction of the database. The algorithm finds a metric for which the retrieval efficiency is maximized, considering any MAM.
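The claim above, that the triangular inequality can be enforced on a semimetric without changing similarity orderings, can be illustrated with a minimal sketch. It is not the TriGen algorithm itself. It assumes a concave, increasing power modifier f(d) = d^(1/(1+w)), squared Euclidean distance as the example semimetric, and a plain linear search over the weight w; every function name and parameter here is hypothetical.

```python
import itertools


def squared_l2(a, b):
    """Squared Euclidean distance: a semimetric that violates the triangle inequality."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triangle_error(objects, dist):
    """Fraction of object triplets whose distances violate the triangle inequality."""
    triplets = list(itertools.combinations(objects, 3))
    violations = 0
    for a, b, c in triplets:
        shortest, middle, longest = sorted((dist(a, b), dist(b, c), dist(a, c)))
        if shortest + middle < longest - 1e-12:  # small tolerance for rounding
            violations += 1
    return violations / len(triplets)


def power_modified(dist, w):
    """Wrap dist with f(d) = d ** (1 / (1 + w)); f is increasing, so orderings are kept."""
    return lambda a, b: dist(a, b) ** (1.0 / (1.0 + w))


def find_weight(sample, dist, tolerance=0.0, step=0.25, max_w=16.0):
    """Smallest tested weight w whose modified measure has error <= tolerance on the sample."""
    w = 0.0
    while w <= max_w:
        if triangle_error(sample, power_modified(dist, w)) <= tolerance:
            return w
        w += step
    raise ValueError("no suitable weight found up to max_w")


if __name__ == "__main__":
    sample = [(0.0,), (1.0,), (2.0,)]  # stands in for a small fraction of the database
    w = find_weight(sample, squared_l2)
    print(w, triangle_error(sample, power_modified(squared_l2, w)))
```

A larger w makes the modifier more concave, which removes more violations but also pushes distances closer together, so a search of this kind would normally stop at the smallest weight that meets the tolerance. A nonzero tolerance corresponds to accepting an approximated metric.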
Skopal Tomáš, Snášel Václav
An Application of LSI and M-tree in Image Retrieval
In: GESTS International Transactions on Computer Science and Engineering, Volume: 34, No: 1, GEST Society, 2006, pp. 212-225.
When dealing with image databases, we often need to solve the problem of how to retrieve a desired set of images effectively and efficiently. Images are commonly represented by high-dimensional vectors of extracted features, since in this way content-based image retrieval is turned into a geometric search problem. In this article we present a study of feature extraction from raw image data by means of the LSI method (that is, singular-value decomposition). We also show how this kind of feature extraction can be used for efficient and effective similarity retrieval using the M-tree index. Because of the application to image retrieval, we also show some interesting effects of LSI that are not directly obvious in the area of text retrieval (where LSI came from).
Skopal Tomáš, Hoksza D.
Improving the Performance of M-tree Family by Nearest-Neighbor Graphs
In: Advances in Databases and Information Systems, LNCS 4690, Springer, Berlin, 2007, pp. 172-188.
Presented at: ADBIS 2007, 29.9.-3.10.2007, Varna,
Bulgaria.
The M-tree and its variants have been shown to provide efficient similarity search in database environments. In order to further improve their performance, in this paper we propose an extension of the M-tree family which makes use of nearest-neighbor (NN) graphs. Each tree node maintains its own NN-graph, a structure that stores for each node entry a reference (and distance) to its nearest neighbor, considering just the entries of the node. The NN-graph can be used to improve filtering of non-relevant subtrees when searching (or inserting new data). The filtering is based on using "sacrifices", selected entries in the node that serve as pivots to all entries that are their reverse nearest neighbors (RNNs). We propose several heuristics for sacrifice selection, as well as modified insertion, range, and kNN query algorithms. The experiments have shown that the M-tree (and its variants) enhanced by NN-graphs can perform significantly faster, while keeping the construction cheap.
Skopal Tomáš
Unified Framework for Exact and Approximate Search in Dissimilarity Spaces
In: Transactions on Database Systems (TODS), Volume: 32, No: 4, ACM, 2007, pp. 1-47.
In multimedia systems we usually need to retrieve database (DB) objects based on their similarity
to a query object, while the similarity assessment is provided by a measure which defines a
(dis)similarity score for every pair of DB objects. In most existing applications, the similarity measure
is required to be a metric, where the triangle inequality is utilized to speed up the search
for relevant objects by use of metric access methods (MAMs), for example, the M-tree. Recent
research has shown, however, that nonmetric measures are more appropriate for similarity modeling
due to their robustness and the ease of modeling a made-to-measure similarity. Unfortunately, due to
the lack of triangle inequality, the nonmetric measures cannot be directly utilized by MAMs. From
another point of view, some sophisticated similarity measures could be available in a black-box
nonanalytic form (e.g., as an algorithm or even a hardware device), where no information about
their topological properties is provided, so we have to consider them as nonmetric measures as well.
From yet another point of view, the concept of similarity measuring itself is inherently imprecise
and we often prefer fast but approximate retrieval over an exact but slower one.
To date, the mentioned aspects of similarity retrieval have been solved separately, that is, exact
versus approximate search or metric versus nonmetric search. In this article we introduce a similarity
retrieval framework which incorporates both of the aspects into a single unified model. Based
on the framework, we show that for any dissimilarity measure (either a metric or nonmetric) we
are able to change the “amount” of triangle inequality, and so obtain an approximate or full metric
which can be used for MAM-based retrieval. Due to the varying “amount” of triangle inequality,
the measure is modified in a way suitable for either an exact but slower or an approximate but
faster retrieval. Additionally, we introduce the TriGen algorithm aimed at constructing the desired
modification of any black-box distance automatically, using just a small fraction of the database.
Snášel Václav, Moravec Pavel, Pokorný Jaroslav
WordNet Ontology Based Model for Web Retrieval
In: Proceedings of International Workshop on Challenges in Web Information Retrieval and Integration (WIRI) 2005, IEEE Computer Society Press, 2005, pp. 231-236.
Presented at: International Workshop on Challenges in Web Information Retrieval and Integration, 8.4.-9.4. 2005, Tokyo,
Japan.
It is well known that ontologies will become a key piece, as they allow making the semantics of Semantic Web content explicit. In spite of the big advantages that the Semantic Web promises, there are still several problems to solve. Those concerning ontologies include their availability, development, and evolution. In the area of information retrieval, the dimension of document vectors plays an important role. Firstly, with higher index dimensions the indexing structures suffer from the "curse of dimensionality" and their efficiency rapidly decreases. Secondly, we may not use exact words when looking for a document, and thus we miss some relevant documents. LSI is a numerical method which discovers latent semantics in documents by creating concepts from existing terms. In this paper we present a basic method of mapping LSI concepts onto a given ontology (WordNet), used both for improving retrieval recall and for dimension reduction. We offer experimental results for this method on a subset of the TREC collection consisting of Los Angeles Times articles.
Snášel Václav, Moravec Pavel, Pokorný Jaroslav
Using SDD for Topic Identification
In: Proc. of 8th International Conference on Intelligent Systems Design and Applications, IEEE Computer Society, 2008.
Presented at: ISDA 2008: 8th International Conference on Intelligent Systems Design and Applications, 25.-28.11.2008, Kaohsiung, Taiwan.
Snášel Václav, Dvorský Jiří, Timofieiev Anton, Pokorný Jaroslav
H-Index Analysis of Enron Corpus
In: Proc. of 8th International Conference on Intelligent Systems Design and Applications, IEEE Computer Society, 2008.
(in_print)
Presented at: ISDA 2008: 8th International Conference on Intelligent Systems Design and Applications, 25.-28.11.2008, Kaohsiung, Taiwan.
Snášel Václav, Moravec Pavel, Pokorný Jaroslav
Using BFA with wordnet ontology based model for web retrieval
In: Proceedings of the First IEEE International Conference on Signal-Image Technology & Internet-Based Systems (SITIS'05), 2005, pp. 254-259.
Presented at: First IEEE International Conference on Signal-Image Technology & Internet-Based Systems (SITIS'05), 27.11.-1.12.2005, Yaoundé,
Cameroon.
In the area of information retrieval, the dimension of document vectors plays an important role. We may need to find a few words or concepts which characterize the document based on its contents, to overcome the problem of the "curse of dimensionality", which makes indexing of high-dimensional data problematic. To do so, we earlier proposed a WordNet and WordNet+LSI (Latent Semantic Indexing) based model for dimension reduction. While LSI works on the whole collection, another procedure of feature extraction (and thus dimension reduction) exists, using binary factorization. The procedure is based on the search for attractors in a Hopfield-like associative memory. Separation of true attractors (factors) from spurious ones is based on the calculation of their Lyapunov function. When applied to textual data, the procedure performed well and, moreover, showed sensitivity to the context in which the words were used. In this paper, we suggest that binary factorization may benefit from WordNet filtration.
Snášel Václav, Moravec Pavel, Pokorný Jaroslav
Using BFA with WordNet Based Model for Web Retrieval
In: Journal of Digital Information Management, Volume: 4, No: 2, 2006, pp. 107-111.
Toman Kamil, Mlýnková Irena
XML Data - The Current State of Affairs
In: Proceedings of XML Prague 2006 conference, ITI Series, MFF UK, 2006, pp. 87-102.
Presented at: XML Prague 2006, 17.6.-18.6.2006, Prague,
Czech Republic.
At present the eXtensible Markup Language (XML) is used in almost all spheres of human activity. Its popularity results especially from the fact that it is a self-descriptive metaformat that allows the structure of XML data to be defined using other powerful tools such as DTD or XML Schema. Consequently, we can witness a massive boom of techniques for managing, querying, updating, exchanging, or compressing XML data.
On the other hand, for the majority of XML processing techniques we can find various spots which worsen their time or space efficiency. Probably the main reason is that most of them consider XML data too globally, involving all their possible features, though the real data are often much simpler. If they do restrict the input data, the restrictions are often unnatural.
In this contribution we discuss the level of complexity of real XML collections and their schemes, which turns out to be surprisingly low. We involve and compare results and findings of existing papers on similar topics as well as our own analysis, and we try to find the reasons for these tendencies and their consequences.
Toman Kamil, Mlýnková Irena
Statistics on The Real XML Data
In: Inteligentní modely, algoritmy, metody a nástroje pro vytváření sémantického webu, Ústav informatiky AV ČR, Prague, 2006, pp. 123-130.
ISBN: 80-903298-7-X
Presented at: Inteligentní modely, algoritmy, metody a nástroje pro vytváření sémantického webu - Seminář projektu programu Informační společnost, 5.10.-7.10.2006, Zadov,
Czech Republic.
At present the eXtensible Markup Language (XML) is used
almost in all spheres of human activities. We can witness a massive
boom of techniques for managing, querying, updating, exchanging, or
compressing XML data.
On the other hand, for the majority of the XML processing techniques we can
find various spots which cause worsening of their time or space efficiency.
Probably the main reason is that most of them consider XML data too
globally, involving all their possible features, though the real data are
often much simpler. If they do restrict the input data, the restrictions
are often unnatural.
We discuss the level of complexity of real XML collections and their
schemes, which turns out to be surprisingly low. We involve and compare
results and findings of existing papers on similar topics as well as our
own analysis and we try to find the reasons for these tendencies and their
consequences.
Vaneková Veronika, Vojtáš Peter
A Description Logic with Concept Ordering and top-k Restriction
In: Proc. of 18th European Japanese Conference on Information Modelling and Knowledge Bases, EJC2008 Program Committee and EJC2008 Program Coordination team, 2008, pp. 139-149.
ISBN: 978-3-540-85712-9
(in_print)
Presented at: EJC 2008: 18th European Japanese Conference on Information Modelling and Knowledge Bases, 2.-6.6.2008, Tsukuba, Japan.
Vlčková Z., Galamboš Leo
Dynamizace gridu
In: Proceedings of ITAT 2007, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), PONT s.r.o., Seňa, 2007, pp. 115-121.
Presented at: Konferencia o informačných (inteligentných) technológiách - aplikácie a teória 2007, 21.-27.9.2007, Polana,
Slovakia.
Vojtáš Peter, Gurský Peter
On top-k search with no random access using small memory
In: Proc. of 12th Advances in Databases and Information Systems, LNCS 5207, Springer-Verlag, Berlin, 2008, pp. 97-111.
ISBN: 978-3-540-85712-9
Presented at: ADBIS 2008: 12th Advances in Databases and Information Systems, 5.-9.9.2008, Pori, Finland.
Vojtáš Peter
Decathlon, Conflicting Objectives and User Preference Querying
In: Proc. of DATESO 2008, (Ed. J. Pokorný, V. Snášel, K. Richta), CEUR Workshop Proc., 2008, pp. 76-78.
ISBN: 978-80-248-1746-0
Presented at: Dateso 2008: Annual International Workshop on DAtabases, TExts, Specifications and Objects, 16.4.-18.4.2008, Desná - Černá Říčka,
Czech Republic.
Vojtáš Peter
Proceedings of ITAT 2005, Information Technologies - Applications and Theory
In: Proceedings of ITAT 2005, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), Prírodovedecká fakulta Univerzity Pavla Jozefa Šafárika, Košice, 2005.
ISBN: 80-7097-609-8
Vojtáš Peter
Fuzzy Logic as an Optimization Task
In: Fuzzy Logic and Technology, (Ed. Sobrevilla P., Montseny E.), Barcelona, 2005, pp. 781-786.
ISBN: 84-7683-872-3
Presented at: EUSFLAT - LFA 2005. Conference of the European Society for Fuzzy Logic and Technology /13./, Rencontres Francophones sur la Logique Floue et ses Applications /11./, 7.9.-9.9.2005, Barcelona, Spain.
Vojtáš Peter
Model Theoretic and Fixpoint Semantics for Preference Queries over Imperfect Data
In: Proceedings of Inconsistency and Incompleteness in Databases, (Ed. Chomicki J., Wijsen J.), Munich, 2006, pp. 87-91.
Presented at: Inconsistency and Incompleteness in Databases, International Workshop Collocated with the 10th International Conference on Extending Database Technology, 26.3.2006, Munich,
Germany.
We present an overview of our results on model theoretic and fixpoint semantics for a relational algebra using a model of many-valued Datalog with similarity. Using our previous results on the equivalence of our model and a certain variant of generalized annotated programs, we base our querying on fuzzy aggregation operators (also called annotation terms, combining functions, or utility functions). Using fuzzy aggregation operators (distinct from database aggregations) enables us to reduce the tuning of various linguistic variables. In practice we can learn fuzzy aggregation operators by an ILP procedure for every user profile. Our approach also enables integration of data from different sources via aggregation and similarity. Extending domains, we discuss the difference between fuzzy elements and fuzzy subsets. We also discuss an alternative in which all extensional data are stored crisp and fuzziness is in the rules interpreting data, in the context, and in the user query.
Vojtáš Peter
A Fuzzy EL Description logic with Crisp Roles and Fuzzy Aggregation for Web Consulting
In: Proceedings of Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU 2006), Edition EDK, 2006, pp. 1834-1841.
ISBN: 2-84254-112-X
Presented at: Information Processing and Management of Uncertainty in Knowledge-Based Systems (IPMU 2006), 2.7.-7.7.2006, Paris,
France.
Vojtáš Peter
Fuzzy Logic Aggregation for Semantic Web Search for the Best Answer
In: Fuzzy Logic and the Semantic Web, (Ed. Sanchez E.), Elsevier, 2006.
ISBN: 0-444-51948-3
Vojtáš Peter
Information Technologies - Applications and Theory
In: Proceedings of ITAT 2006, Information Technologies - Applications and Theory, 2006.
ISBN: 80-969184-4-3
Vojtáš Peter
EL Description Logic Modeling Querying Web and Learning Imperfect User Preferences
In: Uncertainty Reasoning for the Semantic Web - Volume 2, Proceedings of the Second ISWC Workshop on Uncertainty Reasoning for the Semantic Web, (Ed. P.C.G. da Costa, K.B. Laskey, K.J. Laskey, F. Fung, M. Pool), 2006, pp. 2.
Presented at: Workshop on Uncertainty Reasoning for the Semantic Web URSW 2006, 5.11.2006, Athens, Georgia,
USA.
In this position paper we share ideas on modeling the querying of
web resources by an (imperfect) combination of particular user preferences
based on description logic. Our basic assumption is that web resources
are modeled crisp. Imperfection (uncertainty, vagueness, ...) comes from
user context and preferences. We offer a model based on a connection between
three EL description logic systems: classical, annotated (fuzzy), and
a new variant of Bayesian description logic. The Bayesian part enables
learning each single user's combination function and concepts.
Vojtáš Peter, Vomlelová M.
Learning fuzzy logic aggregation for multicriterial querying with user preferences
In: Proceedings of 27th Linz Seminar on Fuzzy Set Theory - Preferences, Games and Decisions, (Ed. J. Fodor, E.P. Klement, M. Roubens), Linz, 2006, pp. 128-129.
Presented at: 27th Linz Seminar on Fuzzy Set Theory - Preferences, Games and Decisions, 7.2.-11.2.2006, Linz,
Austria.
Vojtáš Peter
EL description logic with aggregation of user preference concepts
In: Frontiers in Artificial Intelligence and applications 154, Information modelling and Knowledge Bases XVIII, IOS Press, Amsterdam, 2007, pp. 154-165.
Wiedermann Jiří, Tel Gerard, Pokorný Jaroslav, Bieliková Mária, Štuller Július
Proceedings of SOFSEM 2006
In: Proceedings of SOFSEM 2006, Volume: II, ICS AS CR, Prague, 2006.
ISBN: 80-903298-4-5
Wiedermann Jiří, Tel Gerard, Pokorný Jaroslav, Bieliková Mária, Štuller Július
Proceedings of SOFSEM 2006: Theory and Practice of Computer Science
In: Proceedings of SOFSEM 2006: Theory and Practice of Computer Science, LNCS 3831, Springer-Verlag, Berlin, 2006.
ISBN: 3-540-31198-X
Yaghob Jakub, Zavoral Filip
Budování infrastruktury sémantického webu
In: Proceedings of ITAT 2006, Information Technologies - Applications and Theory, 2006.
ISBN: 80-969184-4-3
Presented at: ITAT 2006, 26.9.-1.10.2006, Chata Kosodrevina, Bystrá dolina, Nízke Tatry,
Slovakia.
Yaghob Jakub, Zavoral Filip
Semantic Web Infrastructure using DataPile
In: Proceeding of the International Workshop on Technologies and Applications of Knowledge Computing on the Web (IEEE/WIC/ACM International Conference on Web Intelligence), (Ed. C.J. Butz, N.T. Nguyen, Y. Takama, W. Cheung), IEEE Computer Society, 2006, pp. 630-633.
Presented at: International Workshop on Technologies and Applications of Knowledge Computing on the Web (IEEE/WIC/ACM International Conference on Web Intelligence), 18.12.-22.12.2006, Hong Kong.