Bartoň Stanislav, Dohnal Vlastislav, Sedmidubský Jan, Zezula Pavel
Gauging the Evolution of Metric Social Network
In: 5th International Workshop on Databases, Information Systems and Peer-to-Peer Computing (DBISP2P 2007) held at 33rd International Conference on Very Large Data Bases (VLDB 2007), 2007, pp. 12.
Presented at: Fifth International Workshop on Databases, Information Systems and Peer-to-Peer Computing (DBISP2P 2007), 24.9.2007, Vienna,
Austria.
In this paper, we tackle the problem of analyzing the structural
evolution of the metric social network. The metric social network
operates in a P2P environment where peers maintain their own data
and the relationships among them are formed on the basis of the
processed similarity queries. The evolution is analyzed with traditional
social-network measures: the characteristic path length and the clustering
coefficient. In addition, because of the special structure of the metric
social network, we present two gauges of our own design, the average overlap
and the robustness of description coefficients, to analyze the structure of
emerging communities encompassing similar data.
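The two traditional measures named in the abstract above can be illustrated with a minimal sketch. It assumes an undirected graph stored as a dictionary of neighbour sets with integer node labels; the function names are hypothetical, and the paper's own gauges (average overlap, robustness of description) are not reproduced here.

```python
from collections import deque


def clustering_coefficient(adj):
    """Average local clustering coefficient of an undirected graph.

    adj maps each node to the set of its neighbours (assumed symmetric).
    """
    total = 0.0
    for node, neighbours in adj.items():
        k = len(neighbours)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        # Count edges that connect two neighbours of this node.
        links = sum(1 for u in neighbours for v in neighbours
                    if u < v and v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected pairs of distinct nodes."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives hop distances from the source.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


# A triangle (0, 1, 2) with one pendant node (3) attached to node 2.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
print(clustering_coefficient(graph), characteristic_path_length(graph))
```

A high clustering coefficient combined with a short characteristic path length is the usual signature of a small-world network, which is why the two measures are typically reported together.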
Batko Michal, Novák David, Zezula Pavel
MESSIF: Metric Similarity Search Implementation Framework
In: Digital Libraries: Research and Development, Springer-Verlag, LNCS 4877, Berlin, Heidelberg, 2007, pp. 1-10.
ISBN: 978-3-540-77087-9
Batko Michal, Novák David, Zezula Pavel
MESSIF: Metric Similarity Search Implementation Framework
In: DELOS Conference 2007 - Working Notes, Information Society Technologies, Pisa, Italy, 2007, pp. 11-23.
Presented at: DELOS Conference 2007, 13-14.2.2007, Pisa,
Italy.
The similarity search has become a fundamental computational task in many applications. One
of the mathematical models of similarity, the metric space, has drawn the attention of many
researchers, resulting in several sophisticated metric-indexing techniques. An important part of
research in this area is typically a prototype implementation and subsequent experimental evaluation
of the proposed data structure. This paper describes an implementation framework called MESSIF
that eases the task of building such prototypes. It provides a number of modules, from basic storage
management to automatic collection of performance statistics. Due to its open and modular design, it
is also easy to implement additional modules if necessary. MESSIF also offers several ready-to-use
generic clients that allow users to control and test the index structures and to measure their performance.
Bednárek David
Optimizing XQuery/XSLT programs using backward analysis
In: Proceedings of ITAT 2007, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), PONT s.r.o., Seňa, 2007, pp. 17-22.
Presented at: Konferencia o informačných (inteligentných) technológiách - aplikácie a teória 2007, 21.-27.9.2007, Polana,
Slovakia.
Daniel Milan
Classical Belief Conditioning and its Generalization to DSm Theory
In: Proceedings of The 6th International Conference on Information and Management Sciences, California Polytechnic State University, Berlin, 2007, pp. 596-603.
Presented at: The Sixth International Conference on Information and Management Sciences (IMS2007), 1.-6.7.2007, Lhasa,
Tibet, China.
Daniel Milan
Several Comments and Questions to Josang's Smooth Coarsening
In: Proceedings of Czech-Japan Seminar on Data Analysis and Decision Making under Uncertainty, (Ed. Kroupa T., Vejnarová J.), UTIA AV ČR, Praha, 2007, pp. 27-40.
Presented at: Czech-Japan Seminar on Data Analysis and Decision Making under Uncertainty, 15.-18.09.2007, Liblice,
Czech Republic.
Daniel Milan
The DSm Approach as a Special Case of the Dempster-Shafer Theory
In: ECSQARU 2007, (Ed. Mellouli K.), LNAI 4724, Springer-Verlag, 2007, pp. 381-392.
Presented at: ECSQARU 2007, 31.10.-2.11.2007, Hammamet,
Tunisia.
This contribution deals with belief processing that enables the management of
multiple and overlapping elements of a frame of discernment.
An outline of the Dempster-Shafer theory for such cases is presented,
including several types of constraints for reducing its high
computational complexity.
DSmT, a new theory that has developed rapidly over the last five years, is briefly
introduced.
Finally, it is shown that DSmT is a special case of the general
Dempster-Shafer approach.
Dokulil Jiří, Tykal J., Yaghob Jakub, Zavoral Filip
Semantic Web Infrastructure
In: Proc. of the First IEEE International Conference on Semantic Computing, IEEE, 2007, pp. 209-215.
Presented at: ICSC 2007, 17.-19.9.2007, Irvine,
California.
The Semantic Web is not as widespread as its founders expected. This is partially caused by the lack of a standard, working infrastructure for the Semantic Web. We have built a working, portable, stable, high-performance infrastructure for the Semantic Web. This paper focuses on the tasks performed by the infrastructure.
Dokulil Jiří, Tykal J., Yaghob Jakub, Zavoral Filip
Semantic Web Repository and Interfaces
In: Proc. of SEMAPRO (Int. Conf. on Advances in Semantic Processing), IEEE, 2007.
Presented at: SEMAPRO (Int. Conf. on Advances in Semantic Processing), 4.-9.11.2007, Papeete,
French Polynesia (Tahiti).
The Semantic Web is not as widespread as its
founders expected. This is partially caused by the
lack of a standard, working infrastructure for the Semantic
Web. We have built a working, portable, stable,
high-performance infrastructure for the Semantic
Web. This enables various experiments with the Semantic
Web in the real world.
Dokulil Jiří, Katreniaková J.
Visualization of large schemaless RDF data
In: Proc. of SEMAPRO (Int. Conf. on Advances in Semantic Processing), IEEE, 2007, pp. 243-248.
Presented at: SEMAPRO (Int. Conf. on Advances in Semantic Processing), 4.-9.11.2007, Papeete,
French Polynesia (Tahiti).
Since many XML documents do not contain any schema definition, we expect that there will also be RDF documents without an RDF schema or ontology. Such data can only be viewed as a general labeled directed graph, and the idea of presenting the data to the user by drawing the graph seems natural. Because the data can be extremely large, it is impossible to display the whole graph at one time. Only a suitable start node is displayed, and the rest of the graph can be explored by incremental navigation. To conserve space and show the user possible directions of further navigation, we have come up with a technique called node merging. By combining suitable graph drawing and navigation techniques, we get a tool that can give the user a good idea of the structure and content of the data.
Dokulil Jiří, Tykal J., Yaghob Jakub, Zavoral Filip
Experimental Platform for the Semantic Web
In: Proceedings of ITAT 2007, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), PONT s.r.o., Seňa, 2007, pp. 67-72.
Presented at: Konferencia o informačných (inteligentných) technológiách - aplikácie a teória 2007, 21.-27.9.2007, Polana,
Slovakia.
Dokulil Jiří, Katreniaková J.
Vizualizácia RDF dát pomocou techniky zlučovania vrcholov
In: Proceedings of ITAT 2007, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), PONT s.r.o., Seňa, 2007, pp. 23-28.
Presented at: Konferencia o informačných (inteligentných) technológiách - aplikácie a teória 2007, 21.-27.9.2007, Polana,
Slovakia.
Duží Marie, Vojtáš Peter
Multi-Criterion Search from the Semantic Point of View
In: EJC'07, (Ed. Jaakkola H. et al.), Juvenes Print-TTY, Tampere, 2007, pp. 21-39.
Presented at: THE 17th EUROPEAN - JAPANESE CONFERENCE ON INFORMATION MODELLING AND KNOWLEDGE BASES, 4.-8.6.2007, Pori,
Finland.
Eckhardt Alan, Horváth T., Maruščák D., Novotný R., Vojtáš Peter
Uncertainty Issues in Automating Process Connecting Web and User
In: Proc. of Uncertainty Reasoning for the Semantic Web Workshop 2007, (Ed. F. Bobillo), CEUR Workshop Proc., 2007, pp. 1-12.
Presented at: Dateso 2008: Annual International Workshop on DAtabases, TExts, Specifications and Objects, 16.4.-18.4.2008, Desná - Černá Říčka,
Czech Republic.
Eckhardt Alan, Horváth T., Vojtáš Peter
Learning different user profile annotated rules for fuzzy preference top-k querying
In: Scalable Uncertainty Management, Springer, LNAI 4772, Berlin, 2007, pp. 116-130.
Presented at: SUM 2007 International Conference, 10.10.-12.10.2007, Washington,
US.
Uncertainty querying of large data can be solved by providing top-k answers according to a user's fuzzy ranking/scoring function. Usually, different users have different fuzzy scoring functions, that is, different user preference models. The main goal of this paper is to assign a preference model to a user automatically. To achieve this, we decompose the user's fuzzy ranking function into orderings of particular attributes and a combination function. To solve the problem of automatic assignment of a user model, we design two algorithms, one for learning the user's preference on a particular attribute and a second for learning the combination function. The methods were integrated into a Fagin-like top-k querying system with some new heuristics and tested.
Eckhardt Alan, Vojtáš Peter
Uživatelské preference při hledání ve webovských zdrojích
In: Znalosti 2007, Fakulta elektrotechniky a informatiky, VŠB - Technická univerzita Ostrava, 2007, pp. 179-190.
Presented at: Znalosti 2007, 21.2.-23.2.2007, Ostrava,
Czech Republic.
Eckhardt Alan
Inductive Models of User Preferences for Semantic Web
In: Proceedings of the Dateso 2007, CEUR Workshop Proc., 2007, pp. 103-114.
Presented at: Dateso 2007 Annual International Workshop on DAtabases, TExts, Specifications and Objects, 18.4.-20.4.2007, Desná - Černá Říčka,
Czech Republic.
User preferences have recently become a hot topic. The massive
use of internet shops and social webs requires user modelling,
which helps users orient themselves on a page. There are many
different approaches to modelling user preferences. In this paper, we
give an overview of the current state of the art in the acquisition of user
preferences and their induction. The main focus is on the models of user
preferences and on the induction of these models, but the process of
extracting preferences from user behaviour is also studied. We
also present our contribution to probabilistic user models.
Eckhardt Alan, Pokorný Jaroslav, Vojtáš Peter
Integrating user and group preferences for top-k search from distributed web resources
In: Proc. of DEXA Workshop Decision Support for Structural Health Monitoring and Flexible Query Processing, (Ed. Tjoa A.M., Wagner R.R.), IEEE, 2007, pp. 317-322.
Presented at: DEXA Workshop, 3.-7.9.2007, Regensburg,
Germany.
We discuss models of user and group preferences in social networks and the Semantic Web. We construct a model for user and group preference querying over RDF data as well as for ordering of answers by aggregation of particular attribute rankings. We have implemented our methods and heuristics in the Tokaf middleware framework prototype. We also describe experiments with Tokaf.
Eckhardt Alan, Pokorný Jaroslav, Vojtáš Peter
A system recommending top-k objects for multiple users preference
In: Proc. of FUZZ-IEEE 2007 International Conference on Fuzzy Systems, IEEE, 2007, pp. 1101-1106.
Presented at: FUZZ-IEEE 2007, 23.-26.7.2007, London,
UK.
We discuss models of user preferences in the Web environment. We construct a model for user preference querying over a number of data sources and ordering of answers by a combination of particular attribute rankings. We generalize Fagin's algorithm in two directions: we develop some new heuristics for top-k search in the model without random access, and we propose a method of ordering lists of objects by a user fuzzy function. To enable different user preferences, our system does not require objects to be sorted; instead, we use a B+-tree on each of the attribute domains. This leads to a more realistic model of Web services. We implement our methods and heuristics for the search of top-k answers in the Tokaf middleware framework prototype. We describe experiments with Tokaf and compare different performance measures with some other methods.
Eckhardt Alan, Horváth T., Vojtáš Peter
PHASES: A User Profile Learning Approach for Web Search
In: Web Intelligence, IEEE Computer Society, Los Alamitos, 2007, pp. 780-783.
Presented at: WI 2007. IEEE/WIC/ACM International Conference on Web Intelligence, 2.11.-5.11.2007, Silicon Valley,
US.
Web search heuristics based on Fagin’s threshold
algorithm assume we have the user profile in the form
of particular attribute ordering and a fuzzy
aggregation function representing the user combining
function. Having these, there are sufficient algorithms
for searching top-k answers. Finding particular
attribute ordering and aggregation for a user still
remains a problem. In this short paper our main
contribution is a proof of concept of a new iterative
process of acquisition of user preferences and attribute
ordering.
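For orientation, the classic threshold algorithm that the abstract above builds on can be sketched as follows. This is a textbook-style illustration, not the PHASES or Tokaf implementation: it assumes each attribute ordering is given as a dictionary of scores in [0, 1], that random access to any score is available, and that the aggregation function is monotone. The function name and data layout are hypothetical.

```python
import heapq


def threshold_top_k(lists, aggregate, k):
    """Return the k best (object, score) pairs under a monotone aggregate.

    lists: one dict {object: score} per attribute (assumed layout).
    aggregate: monotone function taking a list of per-attribute scores.
    """
    # Sorted access: each attribute list in descending order of score.
    sorted_lists = [sorted(l.items(), key=lambda kv: -kv[1]) for l in lists]
    seen = {}
    max_depth = max(len(s) for s in sorted_lists)
    for depth in range(max_depth):
        last = []
        for s in sorted_lists:
            if depth < len(s):
                obj, score = s[depth]
                last.append(score)
                if obj not in seen:
                    # Random access: fetch the object's score in every list;
                    # a missing score is treated as 0.0 (an assumption).
                    seen[obj] = aggregate([l.get(obj, 0.0) for l in lists])
            else:
                last.append(0.0)
        # No unseen object can score higher than this threshold.
        threshold = aggregate(last)
        top = heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])
        if len(top) == k and top[-1][1] >= threshold:
            return top
    return heapq.nlargest(k, seen.items(), key=lambda kv: kv[1])


price = {"a": 0.9, "b": 0.8, "c": 0.1}
distance = {"a": 0.2, "b": 0.7, "c": 0.9}
print(threshold_top_k([price, distance], min, 1))
```

The two inputs that the abstract says are still hard to obtain correspond here to the per-attribute dictionaries (the attribute orderings) and the `aggregate` argument (the user's combining function).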
Eckhardt Alan, Horváth T., Maruščák D., Novotný R., Vojtáš Peter
Uncertainty Issues in Automating Process Connecting Web and User
In: Proc. of Uncertainty Reasoning for the Semantic Web (URSW 2007), Workshop at ISWC+ASWC 2007, (Ed. deCosta P. et al.), 2007, pp. 97-108.
Presented at: ISWC 2007, 12.11.2007, Busan,
Korea.
Falchi Fabrizio, Gennaro Claudio, Rabitti Fausto, Zezula Pavel
A distributed incremental nearest neighbor algorithm
In: International Conference on Scalable Information Systems, Volume: 304, ACM Press, New York, 2007, pp. 1-10.
Presented at: INFOSCALE 2007, 6.-8.6.2007, Suzhou,
China.
Frolov A., Húsek Dušan, Muraviev P. Igor, Polyakov P. Y.
Boolean Factor Analysis by Attractor Neural Network
In: IEEE Transactions on Neural Networks, Volume: 18, No: 3, IEEE, 2007, pp. 698-707.
A common problem encountered in disciplines such as statistics, data analysis, signal processing, textual data representation, and neural network research is finding a suitable representation of the data in a lower-dimensional space. One of the principles used for this purpose is factor analysis. In this paper, we show that Hebbian learning and a Hopfield-like neural network can be used as a natural procedure for Boolean factor analysis. To ensure efficient Boolean factor analysis, we propose our original modification not only of the Hopfield network architecture but also of its dynamics. We describe a neural network implementation of the Boolean factor analysis method and show the advantages of our Hopfield-like network modification step by step on artificially generated data. At the end, we show the efficiency of the method on artificial data containing a known list of factors. Our approach has the advantage of being able to analyze very large data sets while preserving the nature of the data.
Galamboš Leo
Vyhledávání na Webu
In: DATAKON 2007, (Ed. Popelínský L., Výborný O.), Masaryk University, 2007, pp. 17-24.
Presented at: DATAKON 2007, 20.10.-23.10.2007, Brno,
Czech Republic.
Galamboš Leo, Lánský Jan, Žemlička M., Chernik K.
Compression of Semistructured Documents
In: International Journal of Information Technology, Volume: 4, No: 1, Elsevier, 2007, pp. 11-17.
EGOTHOR is a search engine that indexes the Web
and allows us to search Web documents. Its hit list contains the URL
and title of each hit, and also a snippet that tries to briefly
show the match. The snippet can almost always be assembled by an
algorithm that has full knowledge of the original document (mostly
an HTML page). This implies that the search engine is required to store
the full text of the documents as a part of the index.
Such a requirement leads us to pick an appropriate compression
algorithm that would reduce the space demand. One solution
could be to use common compression methods, for instance
gzip or bzip2, but it might be preferable to develop a new method
that would take advantage of the document structure, or rather, the
textual character of the documents.
Special text compression algorithms and methods
for the compression of XML documents already exist. The aim of this paper is
to integrate the two approaches to achieve an optimal
compression ratio.
Gurský Peter, Vojtáš Peter
Multikriteriálne vyhľadávanie najlepších objektov s podporou viacerých užívateľov
In: Znalosti 2007, Fakulta elektrotechniky a informatiky, VŠB - Technická univerzita Ostrava, 2007, pp. 52-62.
Presented at: Znalosti 2007, 21.2.-23.2.2007, Ostrava,
Czech Republic.
Gurský Peter, Horváth T., Jirásek J., Krajči S., Novotný R., Vaneková Veronika, Vojtáš Peter
Web Search with Variable User Model
In: DATAKON 2007, (Ed. Popelínský L., Výborný O.), Masaryk University, 2007, pp. 111-121.
Presented at: DATAKON 2007, 20.10.-23.10.2007, Brno,
Czech Republic.
Gurský Peter, Horváth T., Jirásek J., Krajči S., Novotný R., Vaneková Veronika, Vojtáš Peter
Knowledge Processing for Web Search - An integrated Model
In: Proc. of the 1st International Symposium on Intelligent and Distributed Computing (IDC 2007), STUDIES IN COMPUTATIONAL INTELLIGENCE, (Ed. Badica C., Paprzycki M.), Volume: 78, Springer, 2007, pp. 95-104.
Presented at: IDC 2007: 1st International Symposium on Intelligent and Distributed Computing, 18.-20.10.2007, Craiova,
Romania.
We propose a model of a middleware system enabling personalized
web search for users with different preferences. We integrate both inductive and
deductive tasks to find user preferences and, consequently, the best objects. The
model is based on modeling preferences by fuzzy sets and fuzzy logic. We
present the model-theoretic semantics for the fuzzy description logic f-EL, which
motivates the creation of a model for fuzzy RDF. Our model was
experimentally implemented and the integration was tested.
Hanks Patrick
Why Bother with Corpus Evidence
In: Proceedings of the Second International Conference of the German Cognitive Linguistics Association, 2007.
(in print)
Presented at: Second International Conference of the German Cognitive Linguistics Association, 5.10.-7.10.2006, Munich,
Germany.
Hanks Patrick, Pala Karel, Rychlý Pavel
Mapping Lexical Sets onto Semantic Types through Corpus Analysis
In: Proceedings of the Fourth International Workshop on Generative Approaches to the Lexicon, 2007.
(in print)
Presented at: Fourth International Workshop on Generative Approaches to the Lexicon, 10-11.5.2007, Paris,
France.
Hanks Patrick, Pala Karel
Towards an empirically well-founded semantic ontology for NLP
In: Proceedings of the Fourth International Workshop on Generative Approaches to the Lexicon, 2007.
Presented at: Fourth International Workshop on Generative Approaches to the Lexicon, 10-11.5.2007, Paris,
France.
This paper examines some issues involved in
building a corpus-based ontology for use in
determining the meaning of words in text, in the
context of creating a “pattern dictionary”. How do
words cluster in paradigmatic lexical sets in actual
usage (as reflected in a large corpus), and can these
clusters be mapped onto a semantically structured
ontology? What semantic notions need to be
distinguished for this purpose, and what are the
appropriate theoretical foundations? What other
elements are needed for the application of
determining meaning in text?
Hanks Patrick
Editorial: Cognition and the Lexicon
In: Lexicology, (Ed. Hanks P.), Volume: 5, Routledge, Taylor and Francis Group, 2007.
ISBN: 978-0-415-70098-6
Hanks Patrick
Editorial: Formal Approaches to the Lexicon
In: Lexicology, (Ed. Hanks P.), Volume: 6, Routledge, Taylor and Francis Group, 2007.
ISBN: 978-0-415-70098-6
Hlaváčková D., Pala Karel
Surface and Deep Valency Frames in Czech
In: Proceedings of the 25th International Conference on Lexis and Grammar, 2007.
(in print)
Presented at: The 25th International Conference on Lexis and Grammar, 6.9.-10.9.2006, Palermo,
Italy.
Hlaváčková D., Pala Karel
Computer Processing Derivational Relations in Czech
In: Computer Treatment of Slavic and East European Languages, L. Štúr Institute of Linguistics, Slovak Academy of Sciences, Bratislava, 2007, pp. 198-208.
Presented at: Slovko 2007, 25.-27.10.2007, Bratislava,
Slovakia.
In the paper we deal with the derivational relations in Czech that form typical derivational nests (or subnets). Derivational relations are mostly of a semantic nature, and their regularity in Czech allows us to describe them in a way suitable for computer processing and then add them to electronic databases such as WordNet almost automatically. For this purpose we have used the derivational version of the morphological analyzer Ajka, which is able to handle the basic and most productive derivational relations in Czech. A special derivational interface has been developed in our NLP Lab at FI MU, by means of which we have explored the semantic nature of the selected noun derivational suffixes (22) as well as verb prefixes and established a set of semantically labeled derivational relations, presently 14. With regard to the verbs, we have paid attention to the selected verb semantic classes in connection with the derivational relations between selected prefixes (4) and the corresponding Czech verbs. As an application, we have added the selected derivational relations to the Czech WordNet and in this way enriched it with approx. 30 000 new Czech synsets.
Horák Aleš, Rambousek Adam
Administration Framework for the DEB Dictionary Server
In: Computer Treatment of Slavic and East European Languages, L. Štúr Institute of Linguistics, Slovak Academy of Sciences, Bratislava, 2007, pp. 70-79.
Presented at: Slovko 2007, 25.-27.10.2007, Bratislava,
Slovakia.
This paper presents a new implementation of the administration framework for the DEBII dictionary writing system. We present the details and examples of the user management part as well as graphical scenarios for dictionary service setup, adaptation and automatic generation of a user application based on the dictionary XML schema.
Horák Aleš, Rambousek Adam
DEB Platform Deployment - Current Applications
In: RASLAN 2007: Recent Advances in Slavonic Natural Language Processing, Masaryk University, Brno, 2007, pp. 3-11.
In this paper, we summarize the latest developments regarding the client dictionary writing applications based on the DEB development platform. The DEB framework is nowadays used in several full-grown projects for the preparation of high-quality lexicographic data created within (possibly distant) teams of researchers. We briefly present the current list of DEB applications with the relevant projects and their phases. For each of the applications, we show a view of the interface together with an overview of its most important features.
Horák Aleš, Rambousek Adam
Dictionary Management System for DEB Development Platform
In: NLPCS 2007: Proceedings of the 4th International Workshop on Natural Language Processing and Cognitive Science, INSTICC PRESS, Funchal, Portugal, 2007, pp. 129-138.
Presented at: NLPCS 2007, 12.-16.6.2007, Funchal,
Madeira - Portugal.
In the paper, we introduce a new dictionary management interface for the design, preparation and presentation of generic electronic XML dictionaries using the DEB (Dictionary Editing and Browsing) development platform. The DEB platform provides a strict client-server environment for general dictionary writing systems. So far, several successful NLP tools have been implemented on this platform, one of the best known being the DEBVisDic tool for wordnet semantic network editing and visualization. This paper describes a new part of the DEB platform, the Administration interface, which is shared by all DEB applications running on one server machine.
Horák Aleš, Holan Tomáš, Kadlec V., Kovář Vojtěch
Dependency and Phrasal Parsers of the Czech Language: A Comparison
In: Proceedings of Text, Speech and Dialogue 2007, Springer, LNAI 4629, Berlin, Heidelberg, 2007, pp. 76-84.
Presented at: TSD 2007, 3.-7.9.2007, Plzeň,
Czech Republic.
In the paper, we present the results of an experiment comparing the effectiveness of real-text parsers of the Czech language based on completely different approaches: stochastic parsers that provide dependency trees as their outputs, and a meta-grammar parser that generates a resulting chart structure representing a packed forest of phrasal derivation trees.
We describe and formulate the main questions and problems accompanying such an experiment, try to offer answers to these questions, and finally present the actual results of the tests measured on 10 thousand Czech sentences.
Húsek Dušan, Pokorný Jaroslav, Řezanková Hana, Snášel Václav
Data clustering: From documents to the Web
In: Web Data Management Practices: Emerging Techniques and Technologies, (Ed. Vakali A., Pallis G.), Idea Group Inc., 2007, pp. 1-33.
The chapter provides a survey of some clustering methods relevant to clustering document collections and, in consequence, Web data. We start with classical methods of cluster analysis that seem to be relevant for clustering Web data. Graph clustering is also described, since its methods contribute significantly to clustering Web data. The use of artificial neural networks for clustering has the same motivation. Based on the previously presented material, the core of the chapter provides an overview of approaches to clustering in the Web environment. In particular, we focus on clustering web search results, in which clustering search engines arrange the search results into groups around a common theme. We conclude with some general considerations concerning the justification of so many clustering algorithms and their application in the Web environment.
Húsek Dušan, Moravec Pavel, Snášel Václav, Frolov A., Polyakov P. Y.
Comparison of Neural Network Boolean Factor Analysis Method with Some Other Dimension Reduction Methods on Bars Problem
In: Pattern Recognition and Machine Intelligence, (Ed. Ghosh A., De R.), LNCS 4815, Springer, Berlin, 2007.
ISBN: 978-3-540-77045-9
Presented at: PReMI 2007. International Conference (2.), 18.-22.12.2007, Kolkata,
India.
In this paper, we compare the performance of a novel neural network based algorithm for Boolean factor analysis with several dimension reduction techniques as a tool for feature extraction. Namely, we compare singular value decomposition, semi-discrete decomposition and non-negative matrix factorization algorithms, as well as some cluster analysis methods. Although most of the mentioned methods are linear, it is interesting to compare them with neural network based Boolean factor analysis because they are well elaborated. A second reason is to show the basic differences between the Boolean and the linear case. The so-called bars problem is used as the benchmark. A set of artificial signals generated as a Boolean sum of a given number of bars is analyzed by these methods. The resulting images show that Boolean factor analysis is the most suitable method for this kind of data.
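The benchmark data described in the abstract above, signals formed as a Boolean sum of bars, can be generated with a short sketch like the following. The image size, the number of bars and the bar numbering are illustrative assumptions, not the settings used in the paper, and the function name is hypothetical.

```python
import random


def bars_image(size, n_bars, rng):
    """Return a size x size 0/1 image that is the Boolean OR of n_bars bars.

    Bars 0..size-1 are horizontal (rows); bars size..2*size-1 are vertical
    (columns). The chosen bar indices are returned as the known factors.
    """
    chosen = rng.sample(range(2 * size), n_bars)
    image = [[0] * size for _ in range(size)]
    for bar in chosen:
        for i in range(size):
            if bar < size:
                image[bar][i] = 1          # fill a whole row
            else:
                image[i][bar - size] = 1   # fill a whole column
    return image, sorted(chosen)


rng = random.Random(0)
image, factors = bars_image(8, 3, rng)
for row in image:
    print("".join("#" if pixel else "." for pixel in row))
print("bars used:", factors)
```

Because overlapping bars combine with OR rather than addition, a pixel where a row and a column cross is 1, not 2, which is the point at which linear decompositions and Boolean factor analysis start to differ.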
Jiroušek Radim, Vejnarová Jiřina, Daniel Milan
Compositional Models of Belief Functions
In: ISIPTA'07, Charles University, Faculty of Mathematics and Physics, Prague, 2007, pp. 243-252.
Presented at: ISIPTA'07 - FIFTH INTERNATIONAL SYMPOSIUM ON IMPRECISE PROBABILITY: THEORIES AND APPLICATIONS, 16.-19.7.2007, Prague,
Czech Republic.
Following its successful introduction in probability and
possibility theories, this paper is the first attempt to
introduce the operator of composition for belief
functions as well. We prove that the proposed definition
preserves all the necessary properties of the operator,
enabling us to define compositional models as an
efficient tool for representing multidimensional models.
Kovář Vojtěch, Horák Aleš
Reducing the Number of Resulting Parsing Trees for the Czech Language Using the Beautified Chart Method
In: Proceedings of 3rd Language and Technology Conference, Wydawnictwo Poznańskie, Poznań, 2007, pp. 433-437.
Presented at: LTC'07, 5.-7.2007, Poznań,
Poland.
In the paper, we present the beautified chart method used for reducing the number of output derivation trees for the Czech syntax parser synt. We show the evaluation results of the method, describe the appropriate algorithms and the parser internal data structures as well as problems with their implementation.
Kůrková Věra
Estimates of Data Complexity in Neural-Network Learning
In: SOFSEM 2007, LNCS 4362, Springer, Berlin, 2007.
Presented at: SOFSEM 2007, 20.2.-26.2.2007, Harrachov,
Czech Republic.
Complexity of data with respect to a particular class of neural networks is studied. Data complexity is measured by the magnitude
of a certain norm of either the regression function induced by a probability measure describing the data or a function interpolating a sample
of input/output pairs of training data chosen with respect to this probability. The norm is tailored to a type of computational units in the
network class. It is shown that for data for which this norm is small,
convergence of infima of error functionals over networks with increasing number of hidden units to the global minima is relatively fast. Thus
for such data, networks with a reasonable model complexity can achieve
good performance during learning. For perceptron networks, the relationship between data complexity, data dimensionality and smoothness
is investigated.
Kuthan T., Lánský Jan
Genetic Algorithms in Syllable-Based Text Compression
In: Proceedings of the Dateso 2007, CEUR Workshop Proc., 2007, pp. 21-34.
Presented at: Dateso 2007 Annual International Workshop on DAtabases, TExts, Specifications and Objects, 18.4.-20.4.2007, Desná - Černá Říčka,
Czech Republic.
Syllable-based text compression is a new approach to compression
by symbols. In this concept, syllables are used as the compression
symbols instead of the more common characters or words. This new
technique has proven itself worthy especially on short to middle-length
text files. The effectiveness of the compression is greatly affected by the
quality of the dictionaries of syllables characteristic of the given language.
These dictionaries are usually created with a straightforward analysis
of text corpora. In this paper we introduce another way of
obtaining these dictionaries, using a genetic algorithm. We believe that
dictionaries built this way may help us lower the compression ratio. We
measure this effect on a set of Czech and English texts.
Lánský Jan, Chernik K., Vlčková Z.
Syllable-Based Burrows-Wheeler Transform
In: Proceedings of the Dateso 2007, CEUR Workshop Proc., 2007, pp. 1-10.
Presented at: Dateso 2007 Annual International Workshop on DAtabases, TExts, Specifications and Objects, 18.4.-20.4.2007, Desná - Černá Říčka,
Czech Republic.
The Burrows-Wheeler Transform (BWT) is a compression
method which reorders an input string into a form that is better suited
to subsequent compression. Usually the Move-To-Front transform and then
Huffman coding are applied to the permuted string. The original method [3]
from 1994 was designed for compression over an alphabet of letters. In 2001, versions
working with word and n-gram alphabets were presented. The newest
version copes with the syllable alphabet [7]. The goal of this article is to
compare BWT compression working with alphabets of letters, syllables,
words, 3-grams and 5-grams.
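The pipeline described in the abstract above (BWT followed by Move-To-Front) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `bwt` and `move_to_front` and the empty-string sentinel `EOS` are assumptions made for illustration, and the naive rotation sort is chosen for clarity rather than efficiency. The input is a sequence of symbols, so the same code can be run over letters, syllables or words.

```python
EOS = ""  # hypothetical end-of-sequence sentinel; sorts before any non-empty symbol


def bwt(symbols):
    """Naive Burrows-Wheeler Transform over a sequence of non-empty string symbols."""
    seq = list(symbols) + [EOS]
    # Build all cyclic rotations, sort them, and keep the last symbol of each.
    rotations = sorted(seq[i:] + seq[:i] for i in range(len(seq)))
    return [rot[-1] for rot in rotations]


def move_to_front(symbols):
    """Move-To-Front transform: replace each symbol by its index in a recency list."""
    alphabet = sorted(set(symbols))
    output = []
    for sym in symbols:
        idx = alphabet.index(sym)
        output.append(idx)
        # Move the symbol just seen to the front of the recency list.
        alphabet.insert(0, alphabet.pop(idx))
    return output


letters = bwt(list("banana"))          # alphabet of letters
syllables = bwt(["ba", "na", "na"])    # alphabet of syllables (hand-split here)
ranks = move_to_front(letters)
```

The resulting rank sequence tends to contain many small numbers, which is what makes a subsequent entropy coder such as Huffman coding effective.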
Lánský Jan, Žemlička M.
Compression of a Set of Strings
In: Proc. of 2007 Data Compression Conference (DCC 2007), IEEE Computer Society Press, 2007, pp. 390-390.
Presented at: DCC 2007 Data Compression Conference, 27.-29.3.2007, Snowbird, Utah,
USA.
Lánský Jan, Chernik K., Vlčková Z.
Comparison of Text Models for BWT
In: Proc. of 2007 Data Compression Conference (DCC 2007), IEEE Computer Society Press, 2007, pp. 389-389.
Presented at: DCC 2007 Data Compression Conference, 27.-29.3.2007, Snowbird, Utah,
USA.
Linková Zdeňka, Nedbal Radim
Ontology approach to integration of geographical data
In: WETDAP 2007, Proceedings of the 1st Workshop Evolutionary Techniques in Data-processing, In Conjunction with Znalosti (Knowledge) 2007, Faculty of Electrical Engineering and Computer Science, VŠB - Technical University of Ostrava, Ostrava, 2007, pp. 35-41.
Presented at: Workshop Evolutionary Techniques in Data-processing, Associated with ZNALOSTI 2007 conference, 21.-23.2.2007, Ostrava,
Czech Republic.
A key point in modern automated data processing is the representation of metadata semantics. Employing existing Semantic Web features, namely ontologies, is a promising option. Ontologies open a novel approach to knowledge representation.
The paper presents a GIS (Geographic Information System) domain application illustrating an ontological approach to data integration and data
processing automation in a specific system. This VirGIS system is an integration system that works with spatio-temporal data. We start our
study by developing a data representation based on common Semantic Web techniques and build a VirGIS ontology.
Linková Zdeňka
Ontology-Based Schema Integration
In: Proceedings of SOFSEM 2007, ICS AS CR, Prague, 2007, pp. 71-80.
Presented at: SOFSEM 2007, 20.2.-26.2.2007, Harrachov,
Czech Republic.
Data integration usually provides a unified global view over
several data sources. A crucial part of the task is the establishment of the
connection between the global view and the local sources. For this purpose, two basic mapping approaches have been proposed: GAV (Global
As View) and LAV (Local As View). On the Semantic Web, an
ontological approach can also be considered.
In this paper, data integration is solved using ontologies of the sources. To
express relationships between the global view and local source schemas,
an ontology for the integration system is built. Thus, a schema integration task is transformed to an ontology merging task.
Linková Zdeňka
Schema Matching in the Semantic Web Environment
In: Doktorandský den 07, (Ed. F. Hakl), MATFYZPRESS, 2007, pp. 36-42.
Presented at: Doktorandské dny 2007, 17.-19.9.2007, Malá Úpa,
Czech Republic.
The paper deals with one step of non-materialized data integration: the schema matching task. It works with data
sources on the Semantic Web; the crucial assumption for the considered task is the availability of ontologies describing the data
to be integrated. Source ontologies are used to find correspondences between elements of the source schemas. For this purpose,
techniques known from the fields of ontology alignment and ontology merging are also used.
Linková Zdeňka
Mapování schémat v prostředí Sémantického webu
In: Doktorandské dny na KM FJFI 07, 2007, pp. 117-126.
ISBN: 978-80-01-03913-7
The paper deals with the tasks that must be solved in non-materialized
data integration. It focuses on finding correspondences between schemas and on
schema mapping. The proposed approach to solving these tasks on the Semantic
Web draws on the available ontologies describing the integrated sources.
Ontologies are used both for finding the mappings and for
describing them.
Matousek T., Zavoral Filip
Extracting Zing Models from C Source Code
In: SOFSEM 2007, LNCS 4362, Springer, Berlin, 2007, pp. 900-910.
Presented at: SOFSEM 2007, 20.2.-26.2.2007, Harrachov,
Czech Republic.
In the paper, we propose an approach to an automatic extraction of verification models for the C language source code. We primarily focus on the representation of pointers and arrays, which make the extraction from the C language specific. We provide an implementation of the model extractor as a part of our broader effort to develop a verifier of Windows kernel drivers based on the Zing model checker. To demonstrate the feasibility of our approach, we give examples of the extraction results on a practical synchronization problem.
Mlýnková Irena
UserMap - an Enhancing of User-Driven XML-to-Relational Mapping Strategies
Technical Report: 2007/3, Charles University, Prague, 2007, 38 p.
As XML has undoubtedly become a standard for data representation, it is inevitable to propose and implement techniques for
efficient managing of XML data. A natural alternative is to exploit features and functions of (object-)relational database systems, i.e. to rely
on their long theoretical and practical history. The main concern of such
techniques is the choice of an appropriate XML-to-relational mapping
strategy.
In this paper we focus on enhancing user-driven techniques, which
leave the mapping decisions in the hands of users. We propose an algorithm
which exploits the user-given annotations more deeply by searching for the
user-specified "hints" in the rest of the schema and applies an adaptive
method on the remaining schema fragments. We describe the proposed
algorithm, the similarity measure designed for this purpose, sample implementation of key features of the proposal called UserMap, and results
of experimental testing on real XML data.
Mlýnková Irena
XML Data in (Object-)Relational Databases
In: Diploma Thesis, Charles University, Prague, 2007, pp. 142.
Mlýnková Irena
An XML-to-Relational User-driven Mapping Strategy Based on Similarity and Adaptivity
In: Proc. of SYRCoDIS `07 4th Spring Young Researchers Colloquium on Databases and Information Systems, Volume: 256, CEUR Workshop Proc., 2007, pp. 9-20.
Presented at: SYRCoDIS`07, 31.5.-1.6.2007, Moscow,
Russia.
As XML has become a standard for data representation,
it is inevitable to propose and implement
techniques for efficient managing of XML
data. A natural alternative is to exploit features
and functions of (object-)relational database
systems, i.e. to rely on their long theoretical
and practical history. The main concern of
such techniques is the choice of an appropriate
XML-to-relational mapping strategy.
In this paper we focus on enhancing user-driven
techniques, which leave the mapping decisions
in the hands of users. We propose an algorithm
which exploits the user-given annotations
more deeply by searching for the user-specified
“hints” in the rest of the schema and applies an
adaptive method to the remaining schema fragments.
We describe the algorithm theoretically,
discussing the key ideas of the approach, chosen
solutions, their reasons, and consequences.
Finally, we overview the open issues related to
implementation of the proposed algorithm and
its experimental testing on real XML data.
Mlýnková Irena, Pokorný Jaroslav
Similarity and XML Technologies
In: Proc. of IADIS International Conference WWW/Internet 2007, (Ed. Isaias P., Nunes M.B., Barroso J.), IADIS, 2007, pp. 277-287.
Presented at: WWW/Internet 2007, 5.-8.10.2007, Vila Real,
Portugal.
As XML technologies have undoubtedly become a standard for data representation, it is inevitable to provide efficient implementations of W3C recommendations. A possible optimization of particular types of techniques can be found in exploitation of similarity of XML data and/or matching of XML patterns. In this paper we provide an overview and classification of such techniques from various points of view. We briefly describe the best known representatives of particular ideas and we discuss their key advantages and disadvantages. The text should serve as a good starting point for proposing an appropriate similarity-based optimization.
Mlýnková Irena, Pokorný Jaroslav
Similarity of XML Schema Fragments Based on XML Data Statistics
In: Proc. of Innovations '07: Proceedings of the 4th International Conference on Innovations in Information Technology, IEEE Computer Society Press, 2007, pp. 243-247.
Presented at: 4th International Conference on Innovations in Information Technology, 18.-20.11.2007, Dubai,
United Arab Emirates.
As XML has become a standard for data representation, it can be found in plenty of information technologies. A possible optimization of XML-based approaches is the exploitation of similarity of XML data. In this paper we propose a technique for evaluating similarity of XML schema fragments focusing on two often omitted aspects: the structural level of similarity and the tuning of parameters of the similarity measure. In the former case we exploit the results of statistical analysis of real-world XML data. In the latter case we show that the tuning problem is a kind of constraint optimization problem and can be solved using corresponding approaches. We have analyzed the (dis)advantages of two of them, genetic algorithms and simulated annealing, and in further experiments we show that appropriate tuning produces a more precise similarity measure.
Mlýnková Irena
UserMap - an Exploitation of User-Specified XML-to-Relational Mapping Requirements and Related Problems
Technical Report: 2007/8, Charles University, Prague, 2007, 26 p.
As XML has become a standard for data representation, it is inevitable
to propose and implement techniques for efficient managing of XML
data. A natural alternative is to exploit features of (object-)relational database systems,
i.e. to rely on their long theoretical and practical history. The main concern
of such techniques is the choice of an appropriate XML-to-relational mapping
strategy.
In this paper we focus on enhancing user-driven techniques, which leave the
mapping decisions in the hands of users who specify their requirements using schema
annotations. We describe our prototype implementation called UserMap, which is
able to exploit the annotations more deeply by searching for the user-specified “hints” in
the rest of the schema and applies an adaptive method to the remaining schema
fragments. Using a sample set of supported fixed mapping methods we discuss
problems related to query evaluation for storage strategies generated by the system,
in particular correction of the candidate set of annotations and related query
translation. Finally, we describe the architecture of the whole system.
Nečaský Martin
Conceptual modeling for XML
In: Diploma Thesis, Charles University, Prague, 2007, 153 p.
Nečaský Martin
XSEM - A Conceptual Model for XML
In: Proceedings of the Fourth Asia-Pacific Conference on Conceptual Modelling (APCCM 2007), (Ed. Roddick J. F., Annika H.), 2007, pp. 37-48.
Presented at: The Fourth Asia-Pacific Conference on Conceptual Modelling (APCCM 2007), 30.1.-2.2.2007, Ballarat, Victoria,
Australia.
We propose a new conceptual model for XML data
called XSEM as a combination of several approaches
in the area of the conceptual modeling for XML.
The model divides the conceptual modeling process of
XML data into two levels. On the first level, a designer
designs an overall non-hierarchical conceptual schema
of a domain. On the second level, he or she derives
different hierarchical representations of parts of the
overall conceptual schema using transformation operators.
These hierarchical representations describe
how the data is organized in an XML form.
Nečaský Martin
Using XSEM for Modeling XML Interfaces of Services in SOA
In: Proceedings of the Dateso 2007, CEUR Workshop Proc., 2007, pp. 35-46.
Presented at: Dateso 2007 Annual International Workshop on DAtabases, TExts, Specifications and Objects, 18.4.-20.4.2007, Desná - Černá Říčka,
Czech Republic.
In this paper we briefly describe a new conceptual model for
XML data called XSEM and how to use it for modeling XML interfaces
of services in service oriented architecture (SOA). The model is a
combination of several approaches in the area of conceptual modeling of
XML data. It divides the process of conceptual modeling of XML data into
two levels. The first level consists of designing an overall non-hierarchical
conceptual schema of the domain. The second level consists of deriving
different hierarchical representations of parts of the overall conceptual
schema using transformation operators. Each hierarchical representation
models an XML schema describing the structure of the data exchanged
between a service interface and external services.
Nečaský Martin, Pokorný Jaroslav
Extending E-R for Modelling XML Keys
In: Proc. of IEEE ICDIM 2007: Proc. of The Second International Conference on Digital Information Management, IEEE Computer Society, 2007, pp. 236-241.
Presented at: ICDIM 2007: The Second International Conference on Digital Information Management, 28.-31.10.2007, Lyon,
France.
With the growing popularity of XML there is a need not only to describe the structure of XML data but also its semantics. For the conceptual modelling of XML we can use existing conceptual models. However, special features of XML require extensions of these models. In this paper, we study conceptual modelling of XML keys. We extend the notion of E-R keys to be suitable for modelling the semantics of XML keys and we show how to express them on the XML logical level.
Nedbal Radim
Various Kinds of Preferences in Database Queries
In: Doktorandský den 07, (Ed. F. Hakl), MATFYZPRESS, 2007, pp. 49-59.
Presented at: Doktorandské dny 2007, 17.-19.9.2007, Malá Úpa,
Czech Republic.
The paper summarizes recent advances in the
field of logic of preference and presents their
application in the field of database queries.
Namely, non-monotonic reasoning mechanisms
including various kinds of preferences are reviewed,
and a way of adapting them to practical
database applications is shown: reasoning including
sixteen strict and non-strict kinds of preferences,
inclusive of ceteris paribus preferences,
is feasible. However, to make the mechanisms
useful for practical applications, the assumption
of preference specification consistency
has to be relinquished. This is achieved in two
steps: firstly, all the kinds of preferences are defined
so that some uncertainty is inherent, and
secondly, not a notion of a total pre-order but a
partial pre-order is used in the semantics, which
makes it possible to indicate some kind of conflict among
preferences. Most importantly, the semantics of
a set of preferences is related to that of a disjunctive
logic program.
Nedbal Radim
Algebraic Optimization of Database Queries with Preferences
In: Doktorandské dny na KM FJFI 07, 2007, pp. 157-167.
ISBN: 978-80-01-03913-7
The paper summarizes a logical framework for formulating preferences and proposes
their embedding into relational algebra through a single preference operator parameterized by
a set of user preferences of sixteen various kinds, inclusive of ceteris paribus preferences, and
returning only the most preferred subsets of its argument relation. Most importantly, a conflicting
set of preferences is permitted and preferences between sets of elements can be expressed.
A formal foundation for algebraic optimization, applying heuristics like push preference,
is also provided: abstract properties of the preference operator and a variety of algebraic laws
describing its interaction with other relational algebra operators are presented.
Nedbal Radim
Non-monotonic reasoning with Various Kinds of Preferences in the Relational data Model Framework
In: Proceedings of ITAT 2007, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), PONT s.r.o., Seňa, 2007, pp. 15-20.
Presented at: Konferencia o informačných (inteligentných) technológiách - aplikácie a teória 2007, 21.-27.9.2007, Polana,
Slovakia.
The paper gives an overview of recent advances
in the field of logic of preference and discusses their applicability
in the frame of the relational data model. Namely,
non-monotonic reasoning mechanisms with various kinds
of preferences are reviewed in detail, and a way of suiting
them to practical database applications is presented.
These mechanisms make it possible to reason simultaneously about
sixteen strict and non-strict kinds of preferences, including
ceteris paribus preferences. To make the mechanisms
useful for practical applications, the assumption of preference
specification consistency has to be loosened. This is
achieved in two steps: firstly, all the preference specifications
are generalized to permit uncertainty, and secondly,
not a total pre-order on worlds but a partial pre-order on
worlds is used in the semantics, which makes it possible to indicate
some kind of conflict among worlds by their incomparability.
Most importantly, the semantics of a set of preferences
is related to that of a disjunctive logic program.
Neruda Roman, Beuster Gerd
Towards Dynamic Generation of Computational Agents by Means of Logical Descriptions
In: International Workshop on Multi-Agent System Challenges for Ubiquitous and Pervasive Computing, UTBM/LST, Paris, 2007, pp. 17-28.
Presented at: MASUPC`07 International Workshop on Multi-Agent System Challenges for Ubiquitous and Pervasive Computing, 02.-04.05.2007, Paris,
France.
Neruda Roman
Hybrid Evolutionary Algorithm for Multilayer Perceptron Networks with Competitive Performance
In: Evolutionary Computation, IEEE, Los Alamitos, 2007, pp. 1620-1627.
Presented at: CEC 2007, Congress on Evolutionary Computation, 25.-28.09.2007, Singapore,
SG.
Nováček Vít
Imprecise Empirical Ontology Refinement: Application to Taxonomy Acquisition
In: Proceedings of ICEIS 2007, Kluwer Academic Publishing, Artificial Intelligence and Decision Support Systems, London, 2007, pp. 8.
(in print)
Enterprise Information Systems (ICEIS 2007, revised selected papers), Springer, 2007, pp. 8.
(in print)
Presented at: ICEIS 2007, 12.-16.6.2007, Funchal,
Madeira - Portugal.
Nováček Vít, Laera Loredana, Handschuh Siegfried
Dynamic Integration of Medical Ontologies in Large Scale
In: Proceedings of WWW2007/HCLSDI, ACM Press, New York, 2007, pp. 10.
(in print)
Nováček Vít, Laera Loredana, Handschuh Siegfried
Aiding the Data Integration in Medicinal Settings by Means of Semantic Technologies
In: Making Semantics Work for Business, Semantic Technology Institutes International Workshop at European Semantic Technology Conference, Vienna, Austria, 2007.
(in print)
Nováček Vít
A Non-traditional Inference Paradigm for Learned Ontologies
In: Proceedings of ESWC 2007 PhD Symposium, CEUR Workshop proceedings Workshop at ESWC 2007, Innsbruck, 2007.
Nováček Vít, Laera Loredana, Handschuh Siegfried
Semi-automatic Integration of Learned Ontologies into a Collaborative Framework
In: Proceedings of IWOD/ESWC 2007, Springer Verlag, Innsbruck, 2007, pp. 14.
(in print)
Nováček Vít, Dabrowski Maciej, Kruk Sebastian R.
Extending Community Ontology Using Automatically Generated Suggestions
In: Proceedings of FLAIRS 2007, AAAI Press, Menlo Park, CA, 2007, pp. 6.
(in print)
Nováček Vít, Handschuh Siegfried, Laera Loredana, Maynard Diana, Voelkel Max
Dynamic Ontology Lifecycle Scenario in Translational Medicine
In: Proceedings of the 5th European Conference of Computational Biology (ECCB 2006) - Book of Abstracts, Oxford University Press, Oxford, 2007, pp. 5.
(in print)
Novák David, Zezula Pavel
LOBS: Load Balancing for Similarity Peer-to-Peer Structures
Technical Report: FIMU-RS-2007-04, Faculty of Informatics, Masaryk University, Brno, 2007, 22 p.
Novák David
Image Similarity Search: Theory and Practice
In: Third Doctoral Workshop on Mathematical and Engineering Methods in Computer Science MEMICS 2007, Masaryk University and Technical University of Brno, Brno, 2007, pp. 154-160.
Presented at: MEMICS 2007, 26.10.-28.10.2007, Znojmo,
Czech Republic.
Novák David, Zezula Pavel
LOBS: Load Balancing for Similarity Peer-to-Peer Structures
In: Databases Information Systems and Peer-to-Peer Computing 2007, Springer Verlag, Berlin Heidelberg New York, 2007, pp. 1-8.
Presented at: DBISP2P 2007, 24.9.2007, Vienna,
Austria.
Novák David, Batko Michal, Dohnal Vlastislav, Zezula Pavel
Scaling up the Image Content-based Retrieval
In: Second DELOS Conference 2007 - Working Notes, DELOS Network of Excellence, Pisa, Italy, 2007, pp. 1-10.
Presented at: DELOS Conference 2007, 13-14.2.2007, Pisa,
Italy.
Obdržálek David, Benda J.
GFE - Graphical Finite State Machine Editor for Parallel Execution
In: ICEC 2007, (Ed. Ma L., Nakatsu R., Rauterberg M.), LNCS 4740, Springer, IFIP, 2007, pp. 401-406.
Presented at: ICEC 2007 - International Conference on Entertainment Computing, 20.-23.06.2005, Shanghai,
China.
Pala Karel, Horák Aleš, Rambousek Adam, Vetulani Zygmunt, Konieczka Paweł, Marciniak Jacek, Obrębski Tomasz, Rzepecki Przemysław, Walkowska Justyna
DEB Platform tools for effective development of WordNets in application to PolNet
In: Proceedings of 3rd Language & Technology Conference, Fundacja Uniwersytetu im. A. Mickiewicza, Poznań, 2007, pp. 514-518.
Presented at: LTC`07, 5.-7.2007, Poznań,
Poland.
Petrů Lukáš, Wiedermann Jiří
A Model of an Amorphous Computer and its Communication Protocol
In: SOFSEM 2007, LNCS 4362, Springer, Berlin, 2007.
Presented at: SOFSEM 2007, 20.2.-26.2.2007, Harrachov,
Czech Republic.
We design a formal model of an amorphous computer suitable
for theoretical investigation of its computational properties. The
model consists of a finite set of nodes created by RAMs with restricted
memory, which are dispersed uniformly in a given area. Within a limited
radius the nodes can communicate with their neighbors via a single-channel
radio. The assumptions on low-level communication abilities are
among the weakest possible: the nodes work asynchronously, there is no
broadcasting collision detection mechanism and no network addresses.
For the underlying network we design a randomized communication protocol
and analyze its efficiency. The subsequent experiments and combinatorial
analysis of random networks show that the expectations under
which our protocol was designed are met by the vast majority of the
instances of our amorphous computer model.
Pomikálek Jan, Řehůřek R.
The Influence of Preprocessing Parameters on Text Categorization
In: International Conference on Computer, Information and Systems Science and Engineering, Springer, 2007.
(in print)
Presented at: XIX International Conference on Computer, Information and Systems Science and Engineering, 29.1.-31.1.2007, Bangkok,
Thailand.
Rychlý Pavel, Kovář Vojtěch
Displaying Bidirectional Text Concordances in KWIC format
In: Proceedings of 5th Biennial Conference of the Asian Association for Lexicography, University of Madras, Chennai, India, 2007, pp. 96-100.
Presented at: Asialex 2007, 6.-8.12.2007, Chennai,
India.
Rychlý Pavel
Manatee/Bonito - A Modular Corpus Manager
In: RASLAN 2007: Recent Advances in Slavonic Natural Language Processing, Masaryk University, Brno, 2007, pp. 97-102.
Rychlý Pavel, Kilgarriff A.
An Efficient Algorithm for Building a Distributional Thesaurus (and other Sketch Engine Developments)
In: Association for Computational Linguistics, Proceedings of the ACL 2007 Demo and Poster Sessions, Prague, 2007, pp. 41-44.
Presented at: ACL 2007, 23.-30.6.2007, Prague,
Czech Republic.
Řimnáč Martin
Advanced Features of Attribute Annotated Data Sets
In: WETDAP 2007, Proceedings of the 1st Workshop Evolutionary Techniques in Data-processing, In Conjunction with Znalosti (Knowledge) 2007, Faculty of Electrical Engineering and Computer Science, VŠB - Technical University of Ostrava, Ostrava, 2007, pp. 54-59.
Presented at: Workshop Evolutionary Techniques in Data-processing, Associated with ZNALOSTI 2007 conference, 21.-23.2.2007, Ostrava,
Czech Republic.
The paper compares features of the learning and querying processes
in two situations: when values in the input data set are annotated by
attributes, and when this information is not available. The attribute annotation
makes it possible to consider global relationships, which are useful for expressing the
data semantics in an explicit way. It is shown that data can be accessed
with no semantic interpretation and that, after the evaluation process,
the result can be interpreted.
Řimnáč Martin
Minimalising Binary Predicate Knowledge Base using Transitivity Rule in Incremental Algorithm
Presented as an invited talk: 22nd European Conference on Operational Research EURO 2007, 8.-11.7.2007, Prague,
Czech Republic.
Machine learning methods can be seen as an optimisation task reducing differences
between an expected and a returned result on a given data set. A corresponding
knowledge base can be expressed in many ways, for example, by a binary predicate
formalism.
The talk deals with minimising the number of predicates in such a repository,
which is made possible by transitivity. The transitive reduction algorithm will be
given in detail for an incremental (attribute-annotated, data-driven) building of a
knowledge base; a base model with higher expressiveness will be preferred.
Finally, the effect of the selected model on the estimated explicit semantic definitions
of symbols (internal base interpretation) will be mentioned as well.
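The idea of shrinking a set of binary predicates by transitivity can be illustrated with a small batch sketch. It is not the incremental algorithm of the talk, whose details are not given here: the function name `transitive_reduction` and the representation of predicates as `(subject, object)` pairs are assumptions, and the relation is assumed to be acyclic. A pair is dropped when it can be derived from the remaining pairs through the transitivity rule.

```python
def transitive_reduction(pairs):
    """Return the pairs of an acyclic binary relation that transitivity cannot derive."""
    successors = {}
    for a, b in pairs:
        successors.setdefault(a, set()).add(b)

    def reachable_without(start, skipped):
        """Nodes reachable from `start` without using the direct pair `skipped`."""
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for nxt in successors.get(node, ()):
                if (node, nxt) == skipped or nxt in seen:
                    continue
                seen.add(nxt)
                stack.append(nxt)
        return seen

    # Keep a pair only if its object is not reachable by some other path.
    return {(a, b) for (a, b) in pairs if b not in reachable_without(a, (a, b))}


# Hypothetical knowledge base: the pair ("a", "c") follows from the other two.
base = {("a", "b"), ("b", "c"), ("a", "c")}
reduced = transitive_reduction(base)
```

For an acyclic relation the reduced set is unique, and its transitive closure equals that of the original set, so no derivable knowledge is lost.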
Řimnáč Martin
Redukce datových modelů
In: Doktorandský den 07, (Ed. F. Hakl), MATFYZPRESS, 2007, pp. 80-86.
Presented at: Doktorandské dny 2007, 17.-19.9.2007, Malá Úpa,
Czech Republic.
The paper deals with aspects of optimizing the memory requirements of a binary repository of attribute-annotated data
on the basis of a transitive reduction of a generalized system of functional dependencies. This system can either be given in advance
by a model, in which case the optimization can be applied once; or the model
is estimated incrementally, in which case it proves suitable to merely update the already optimized repository,
again in an incremental way. In the last section, the paper analyzes the ambiguity
of the result, including a detailed analysis of the properties of the basic configurations of model parts that cause this ambiguity.
Last but not least, the complexity of the individual repository operations is analyzed.
Řimnáč Martin, Linková Zdeňka
Automatizovaný návrh pravidel pro integraci dat
Řimnáč Martin, Špánek Roman, Linková Zdeňka
Sémantický web: vize globálního úložiště dat?
In: DATAKON 2007, (Ed. Popelínský L., Výborný O.), Masaryk university, 2007, pp. 176-186.
Presented at: DATAKON 2007, 20.10.-23.10.2007, Brno,
Czech Republic.
The aim of the paper is to present a vision of new approaches to sharing and searching data on the Internet. It builds on proven technologies that operate on textual web documents and connects them with the Semantic Web, a modern means of data exchange, and with current trends in the development of the Internet as a whole.
Řimnáč Martin, Špánek Roman, Linková Zdeňka
Semantic Web: Vision of Distributed and Trusted Data Environment?
In: WWM 2007, 2007, pp. 627-634.
Presented at: WWM 2007, 1st International Web X.0 and Web Mining Workshop, held in collocation with ICDIM 2007, 28.10.-31.10.2007, Lyon,
France.
The vision of the semantic web as a distributed and
trusted environment for data sharing is presented together with
related issues. The paper introduces a basic binary
matrix formalism for the internal representation of sources
and illustrates classical issues such as data inconsistency and
data integration. Aspects of these issues lead to generalising the binary
formalism into one over the <0,1> interval, which
enables uncertainty to be considered at various levels.
Finally, the need for a definition of source trust is presented
and discussed with respect to the semantic web.
Řimnáč Martin
Data Structure Estimation for RDF Oriented Repository Building
In: Complex, Intelligent and Software Intensive Systems, (Ed. Barolli L., Tjoa A.), IEEE Computer Society, Los Alamitos, 2007, pp. 147-154.
Presented at: CISIS`07 International Conference on Complex, Intelligent and Software Intensive Systems, 10.-13.04.2007, Vienna, Austria.
Sedmidubský Jan, Bartoň Stanislav, Dohnal Vlastislav, Zezula Pavel
Adaptive Approximate Similarity Searching through Metric Social Networks
Technical Report: FIMU-RS-2007-06, Faculty of Informatics, Masaryk University, Brno, 2007, 22 p.
Exploiting the concepts of social networking represents a novel approach to the approximate
similarity query processing. We present an unstructured and dynamic P2P environment in
which a metric social network is built. Social communities of peers giving similar results
to specific queries are established and such ties are exploited for answering future queries.
Based on the universal law of generalization, a new query forwarding algorithm is introduced
and evaluated. The same principle is used to manage query histories of individual peers with
the possibility to tune the tradeoff between the extent of the history and the level of the query-answer
approximation. All proposed algorithms are tested on real data and medium-sized
P2P networks consisting of tens of computers.
Sedmidubský Jan, Bartoň Stanislav, Dohnal Vlastislav, Zezula Pavel
Querying Similarity in Metric Social Networks
In: Network-Based Information Systems, First International Conference, NBiS 2007, Springer, Berlin, 2007, pp. 278-287.
Presented at: NBiS 2007, 3.-7.9.2007, Regensburg,
Germany.
In this paper we tackle the issues of exploiting the concepts of social networking in processing similarity queries in the environment of a P2P network. The processed similarity queries lay the basis on which the relationships among peers are created. Consequently, communities encompassing similar data emerge in the network. The architecture of the presented metric social network is formally defined using the acquaintance and friendship relations. Two versions of the navigation algorithm are presented and thoroughly experimentally evaluated. Finally, the learning ability of the metric social network is presented and discussed.
Skopal Tomáš, Hoksza D.
Improving the Performance of M-tree Family by Nearest-Neighbor Graphs
In: Advances in Databases and Information Systems, LNCS 4690, Springer, Berlin, 2007, pp. 172-188.
Presented at: ADBIS 2007, 29.9.-3.10.2007, Varna,
Bulgaria.
The M-tree and its variants have been proven to provide an efficient similarity search in database environments. In order to further improve their performance, in this paper we propose an extension of the M-tree family, which makes use of nearest-neighbor (NN) graphs. Each tree node maintains its own NN-graph, a structure that stores for each node entry a reference (and distance) to its nearest neighbor, considering just entries of the node. The NN-graph can be used to improve filtering of non-relevant subtrees when searching (or inserting new data). The filtering is based on using “sacrifices”, selected entries in the node serving as pivots to all entries being their reverse nearest neighbors (RNNs). We propose several heuristics for sacrifice selection, as well as modified insertion, range and kNN query algorithms. The experiments have shown that the M-tree (and its variants) enhanced by NN-graphs can perform significantly faster, while keeping the construction cheap.
Skopal Tomáš
Unified Framework for Exact and Approximate Search in Dissimilarity Spaces
In: Transactions on Database Systems (TODS), Volume: 32, No: 4, ACM, 2007, pp. 1-47.
In multimedia systems we usually need to retrieve database (DB) objects based on their similarity
to a query object, while the similarity assessment is provided by a measure which defines a
(dis)similarity score for every pair of DB objects. In most existing applications, the similarity measure
is required to be a metric, where the triangle inequality is utilized to speed up the search
for relevant objects by use of metric access methods (MAMs), for example, the M-tree. Recent
research has shown, however, that nonmetric measures are more appropriate for similarity modeling
due to their robustness and ease to model a made-to-measure similarity. Unfortunately, due to
the lack of triangle inequality, the nonmetric measures cannot be directly utilized by MAMs. From
another point of view, some sophisticated similarity measures could be available in a black-box
nonanalytic form (e.g., as an algorithm or even a hardware device), where no information about
their topological properties is provided, so we have to consider them as nonmetric measures as well.
From yet another point of view, the concept of similarity measuring itself is inherently imprecise
and we often prefer fast but approximate retrieval over an exact but slower one.
To date, the mentioned aspects of similarity retrieval have been solved separately, that is, exact
versus approximate search or metric versus nonmetric search. In this article we introduce a similarity
retrieval framework which incorporates both of the aspects into a single unified model. Based
on the framework, we show that for any dissimilarity measure (either a metric or nonmetric) we
are able to change the “amount” of triangle inequality, and so obtain an approximate or full metric
which can be used for MAM-based retrieval. Due to the varying “amount” of triangle inequality,
the measure is modified in a way suitable for either an exact but slower or an approximate but
faster retrieval. Additionally, we introduce the TriGen algorithm aimed at constructing the desired
modification of any black-box distance automatically, using just a small fraction of the database.
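The idea of changing the “amount” of triangle inequality can be sketched as follows. This is only an illustration under stated assumptions, not the TriGen algorithm itself: the fractional-power modifier x ** (1 / (1 + w)), the grid search over w, and all function names are assumptions that do not appear in the abstract. The sketch samples triplets of objects, measures the fraction that violate the triangle inequality, and makes the modifier more concave until that fraction drops to a tolerance. Because the modifier is increasing, the ordering of objects by distance is unchanged.

```python
from itertools import combinations


def squared_l2(a, b):
    """Squared Euclidean distance: a simple measure that violates the triangle inequality."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def triangle_error(sample, d, modifier=lambda x: x):
    """Fraction of triplets in `sample` whose modified distances violate
    the triangle inequality."""
    bad = total = 0
    for a, b, c in combinations(sample, 3):
        x, y, z = sorted(modifier(v) for v in (d(a, b), d(b, c), d(a, c)))
        total += 1
        if x + y < z - 1e-12:  # small slack for floating-point rounding
            bad += 1
    return bad / total


def find_modifier_weight(sample, d, tolerance=0.0, step=0.25, max_w=8.0):
    """Smallest weight w on a grid such that x ** (1 / (1 + w)) brings the
    triangle error down to `tolerance` (assumed modifier family; hypothetical helper)."""
    w = 0.0
    while w <= max_w:
        exponent = 1.0 / (1.0 + w)
        if triangle_error(sample, d, lambda x, e=exponent: x ** e) <= tolerance:
            return w
        w += step
    return None


sample = [(0,), (1,), (2,), (4,)]            # a small sample of the database
print(triangle_error(sample, squared_l2))    # error of the unmodified measure
print(find_modifier_weight(sample, squared_l2))
```

A tolerance of zero asks for a full metric on the sample, which corresponds to exact search; a positive tolerance accepts some violating triplets and hence a less concave modifier, which corresponds to the approximate but faster setting described above.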
Slušný Stanislav, Vidnerová Petra, Neruda Roman
Behavior Emergence in Autonomous Robot control by Means of Feedforward and Recurrent Neural Networks
In: WCECS 2007, (Ed. Ao S., Douglas C., Grundfest W., Schruben L., Wu X.), IA ENG, LNCS, Hong Kong, 2007, pp. 518-523.
Presented at: WCECS 2007. World Congress on Engineering and Computer Science, 24.-26.10.2007, San Francisco,
USA.
Snášel Václav, Řezanková Hana, Húsek Dušan, Kudělka Miloš, Lehečka Ondřej
Semantic Analysis of Web Pages using Cluster Analysis and Nonnegative matrix Factorization
In: Advances in Intelligent Web Mastering, (Ed. Wegrzyn-Wolska K., Szczepaniak P.), Volume: 43, Springer, Berlin, 2007, pp. 328-336.
ISBN: 978-3-540-72574-9
Presented at: AWIC 2007. Atlantic Web Intelligence Conference (5.), 25.6.-27.6.2007, Fontainebleau,
France.
In this paper, web pages concerning product sales are analyzed with the aim of creating clusters of similar web pages and characterizing them by GUI patterns. We applied the GD-CLS (gradient descent - constrained least squares) method, which combines some of the best features of other methods. Both traditional clustering methods and nonnegative matrix factorization are used.
Špánek Roman
Maintaining Trust in Large Scale Environments
In: Doktorandský den 07, (Ed. F. Hakl), MATFYZPRESS, 2007, pp. 94-102.
Presented at: Doktorandské dny 2007, 17.-19.9.2007, Malá Úpa,
Czech Republic.
Špánek Roman
Supporting Secure Communication in Distributed Environments
Špánek Roman
Reputation System for Large Scale Environments
In: WWM 2007, 2007, pp. 621-626.
Presented at: WWM 2007, 1st International Web X.0 and Web Mining Workshop, held in collocation with ICDIM 2007, 28.10.-31.10.2007, Lyon,
France.
The paper describes a new approach to treating trust in
reconfigurable groups of users, with special emphasis on trust
in the next generations of the Internet. The proposed model
uses properties of weighted hypergraphs. Model flexibility
enables description of relations between nodes such that
these relations are preserved under frequent changes. The
ideas can be straightforwardly generalized to other concepts
describable by weighted hypergraphs. The consistency
of the proposal was verified in a couple of experiments
with our pilot implementation SecGRID.
Špánek Roman, Pirkl Pavel, Kovář P.
The Blue Game Project: Ad-hoc Multiplayer Mobile Game with Social Dimension
In: CoNEXT 2007, New York, 2007.
Presented at: 3rd Annual CoNEXT Conference, 10.-13.12.2007, New York,
USA.
The paper presents the BlueGame project, an ad-hoc multiplayer
mobile game based on the Dungeons&Dragons board
game. The main idea lies in the adoption of Bluetooth Piconet
configuration and direct face to face contact of players
in real environments.
Tyl Pavel
Problematika integrace ontologií
In: Doktorandský den 07, (Ed. F. Hakl), MATFYZPRESS, 2007, pp. 110-115.
Presented at: Doktorandské dny 2007, 17.-19.9.2007, Malá Úpa,
Czech Republic.
The Internet is an enormous source of interlinked but mostly unorganized data. The Semantic Web, as an extension
of the current web, tries to address this lack of organization, not only directly for the human user but above all
with regard to machine processing of information. The goal is to supplement data with metadata that should be understandable
to both humans and computers. This metadata is most often expressed by means of ontologies, which are one
of the basic building blocks of the Semantic Web. In this contribution I try to outline some of the options for integrating
(merging) ontologies for the purpose of information sharing.
Vlčková Z., Galamboš Leo
Dynamizace gridu
In: Proceedings of ITAT 2007, Information Technologies - Applications and Theory, (Ed. Vojtáš P.), PONT s.r.o., Seňa, 2007, pp. 115-121.
Presented at: Konferencia o informačných (inteligentných) technológiách - aplikácie a teória 2007, 21.-27.9.2007, Polana,
Slovakia.
Vojtáš Peter
EL description logic with aggregation of user preference concepts
In: Frontiers in Artificial Intelligence and applications 154, Information modelling and Knowledge Bases XVIII, IOS Press, Amsterdam, 2007, pp. 154-165.
Wiedermann Jiří
Lesk a bída nestandardních výpočetních systémů
In: SOFTECON 2007, Softec, Bratislava, 2007, pp. 1-32.
Presented at: SOFTECON 2007. Odborná konferencia o víziách a trendoch v moderných informačných technologiách, 1.3.2007, Bratislava,
Slovakia.
Wiedermann Jiří
Nástin architektury vědomého kognitivního agenta se dvěma vnitřními modely světa
In: Kognice a umělý život, (Ed. Kelemen J., Kvasnička V., Pospíchal J.), Slezská univerzita, Opava, 2007, pp. 377-383.
Presented at: Kognice a umělý život VII, 28.5.-31.5.2007, Smolenice,
Slovakia.
We outline a simple yet cognitively powerful
architecture of a cognitive agent. Our model differs from
other similar models mainly in its use of two
complementary internal world models, which have a different
role than in similar models known from the
literature. The first captures the sensorimotor
“syntax” of the agent's behavior and is used for situating
the agent in its environment. The second model describes
the sensorimotor dynamics of the agent's world and is used
for controlling the agent's behavior. The information in both internal
models depends on the agent's embodiment and its experience.
We show that the cognitive potential of our model
substantially exceeds the capabilities of earlier models by
supporting algorithmic processes that resemble, in their
effects, higher cognitive functions such as
imitation learning and the development of communication, speech, thinking, and
consciousness.
Wiedermann Jiří
Spojení samoorganizace s výpočty: minimální život v moři umělých molekul
In: Myseľ, inteligencia a život, (Ed. Kvasnička V., Trebanický P., Pospíchal J., Kelemen J.), Slovenská technická univerzita, Bratislava, 2007, pp. 497-512.
A bacteroid is a formal abstract hybrid system that combines computational and
non-computational mechanisms in its operation. We show that in an environment of artificial molecules endowed with
certain self-organizing abilities, some bacteroids exhibit signs of minimal life:
they are autonomous, they replicate, and they are capable of Darwinian evolution. The design of bacteroids is
inspired by the ideas of contemporary molecular biology about forms of proto-life that no longer exist (or have
not yet been discovered).
Wiedermann Jiří
Výpočetní meze kognitivních a inteligentních systémů
In: Umělá inteligence, (Ed. Mařík V., Štěpánková O., Lažanský J.), Academia, 2007, pp. 75-90.
ISBN: 978-80-200-1470-2
In this contribution we look for the computational limits of cognitive and intelligent systems,
biological as well as artificial and hybrid ones, the latter being combinations of the two former
kinds. A common platform is provided by computationalism, i.e., the belief that cognitive or
intelligent processes are ultimately computational processes. We show that in principle
there can exist cognitive systems, and that in practice there even exist “seeds” of such systems,
whose computational power exceeds that of Turing machines. These results suggest
that the so-called Church-Turing thesis, which speaks of the central position of Turing machines in the world
of computation and algorithms, must be seen in connection with the physical principles that a cognitive system
uses in its operation and with the way in which the system communicates with its environment.
Wiedermann Jiří, Petrů Lukáš
On the Universal Computing Power of Amorphous Computing Systems
Technical Report: V-1009, ICS AS CR, Prague, 2007, 11 p.
Amorphous computing differs from the classical ideas about computations in almost every aspect. The
architecture of amorphous computers is random, since they consist of a plethora of identical computational
units spread randomly over a given area. Within a limited radius the units can communicate wirelessly
with their neighbors via a single-channel radio. We consider a model whose assumptions on the underlying
computing and communication abilities are among the weakest possible: all computational units are finite
state probabilistic automata working asynchronously, there is no broadcasting collision detection mechanism
and no network addresses. We show that under reasonable probabilistic assumptions such amorphous
computing systems can possess universal computing power with a high probability. The underlying theory
makes use of properties of random graphs and of the probabilistic analysis of algorithms. To the best of
our knowledge this is the first result showing the universality of such computing systems.
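The communication model described in this abstract can be made concrete with a small simulation. It is a sketch under simplifying assumptions and not the protocol from the report: units act in synchronous rounds (the report assumes asynchronous operation), and the function names and the fixed sending probability are hypothetical. A unit receives a message only when exactly one neighbor within radio range transmits in that round; simultaneous transmissions collide and the receiver cannot detect the collision.

```python
import random


def neighbors_within(positions, radius):
    """Adjacency lists: units at most `radius` apart can hear each other."""
    n = len(positions)
    adj = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            if dx * dx + dy * dy <= radius * radius:
                adj[i].append(j)
                adj[j].append(i)
    return adj


def flood(adj, source, p_send, rounds, rng):
    """Spread one message from `source`; return the set of informed units.

    Assumption: synchronous rounds, every informed unit sends with
    probability `p_send` in each round.
    """
    informed = {source}
    for _ in range(rounds):
        sending = {u for u in informed if rng.random() < p_send}
        newly = set()
        for u in range(len(adj)):
            if u in informed:
                continue
            heard = [v for v in adj[u] if v in sending]
            if len(heard) == 1:  # two or more simultaneous senders collide
                newly.add(u)
        informed |= newly
    return informed


line = neighbors_within([(0, 0), (1, 0), (2, 0)], 1.0)
diamond = [[1, 2], [0, 3], [0, 3], [1, 2]]  # unit 3 hears both unit 1 and unit 2

print(flood(line, 0, 1.0, 2, random.Random(0)))
print(flood(diamond, 0, 1.0, 50, random.Random(0)))   # always sending: unit 3 never hears
print(flood(diamond, 0, 0.5, 500, random.Random(7)))  # randomized sending
```

The diamond network shows why randomization matters in this setting: when every informed unit always transmits, units 1 and 2 collide at unit 3 in every round, whereas random sending gives unit 3 a chance in each round to hear exactly one of them.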
Zezula Pavel, Amato Giuseppe, Dohnal Vlastislav
Similarity Search: The Metric Space Approach
In: ACM SAC 2007 Conference. ACM SAC 2007 Conference Tutorial, ACM, Seoul, Korea, 2007.
Presented at: ACM SAC 2007, Seoul,
Korea.
Similarity searching has become a fundamental computational task in a variety of application areas, including multimedia information retrieval, data mining, pattern recognition, machine learning, computer vision, biomedical databases, data compression and statistical data analysis. In such environments, an exact match has little meaning, and proximity/distance (similarity/dissimilarity) concepts are typically much more fruitful for searching. In this tutorial, we review the state of the art in developing similarity search mechanisms that accept the metric space paradigm. We explain the high extensibility of the metric space approach and demonstrate its capability with examples of distance functions. The efforts to further speed up retrieval are demonstrated by a class of approximate techniques and the very recent proposals of scalable and distributed structures based on the P2P communication paradigm.
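As an illustration of the metric space paradigm mentioned in this abstract, the sketch below pairs one well-known metric distance function, the edit (Levenshtein) distance on strings, with the two basic similarity queries. The choice of distance, the function names, and the linear scan are assumptions made for illustration; the tutorial itself covers index structures that avoid scanning the whole dataset.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of single-character insertions,
    deletions, and substitutions turning `a` into `b`. It is a metric."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def range_query(data, query, radius, d):
    """All objects within distance `radius` of `query` (naive linear scan)."""
    return [x for x in data if d(query, x) <= radius]


def knn_query(data, query, k, d):
    """The `k` objects closest to `query` (naive linear scan)."""
    return sorted(data, key=lambda x: d(query, x))[:k]


words = ["metric", "matrix", "metrics", "tree"]
print(range_query(words, "metric", 1, edit_distance))
print(knn_query(words, "metric", 3, edit_distance))
```

Because the queries only call `d`, any other distance function that satisfies the metric postulates can be substituted without changing the query code, which is the extensibility the abstract refers to.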