Bartoň Stanislav, Dohnal Vlastislav, Sedmidubský Jan, Zezula Pavel
Gauging the Evolution of Metric Social Network
In: 5th International Workshop on Databases, Information Systems and Peer-to-Peer Computing (DBISP2P 2007) held at 33rd International Conference on Very Large Data Bases (VLDB 2007), 2007, pp. 12.
Presented at: Fifth International Workshop on Databases, Information Systems and Peer-to-Peer Computing (DBISP2P 2007), 24.9.2007, Vienna,
Austria.
In this paper, we tackle the issues of analyzing the structural
evolution of the metric social network. The metric social network
operates in a P2P environment where peers maintain their own data
and the relationships among them are formed on the basis of the
processed similarity queries. The evolution is analyzed with traditional
social networking tools: the characteristic path length and the clustering
coefficient. Nonetheless, due to the special structure of the metric social
network, our own gauges, the average overlap and the robustness of
description coefficients, are presented to analyze the structure of
emerging communities encompassing similar data.
Batko Michal, Novák David, Zezula Pavel
MESSIF: Metric Similarity Search Implementation Framework
In: Digital Libraries: Research and Development, Springer-Verlag, LNCS 4877, Berlin, Heidelberg, 2007, pp. 1-10.
ISBN: 978-3-540-77087-9
Batko Michal, Novák David, Zezula Pavel
MESSIF: Metric Similarity Search Implementation Framework
In: DELOS Conference 2007 - Working Notes, Information Society Technologies, Pisa, Italy, 2007, pp. 11-23.
Presented at: DELOS Conference 2007, 13-14.2.2007, Pisa,
Italy.
Similarity search has become a fundamental computational task in many applications. One
of the mathematical models of similarity, the metric space, has drawn the attention of many
researchers, resulting in several sophisticated metric-indexing techniques. An important part of
research in this area is typically a prototype implementation and subsequent experimental evaluation
of the proposed data structure. This paper describes an implementation framework called MESSIF
that eases the task of building such prototypes. It provides a number of modules, from basic storage
management to automatic collection of performance statistics. Due to its open and modular design, it
is also easy to implement additional modules if necessary. MESSIF also offers several ready-to-use
generic clients that allow users to control and test the index structures and measure their performance.
Falchi Fabrizio, Gennaro Claudio, Rabitti Fausto, Zezula Pavel
A distributed incremental nearest neighbor algorithm
In: International Conference on Scalable Information Systems, Volume: 304, ACM Press, New York, 2007, pp. 1-10.
Presented at: INFOSCALE 2007, 6.-8.6.2007, Suzhou,
China.
Hanks Patrick
Why Bother with Corpus Evidence
In: Proceedings of the Second International Conference of the German Cognitive Linguistics Association, 2007.
(in print)
Presented at: Second International Conference of the German Cognitive Linguistics Association, 5.10.-7.10.2006, Munich,
Germany.
Hanks Patrick, Pala Karel, Rychlý Pavel
Using Corpus Analysis to Mapping Lexical Sets onto Semantic Types through Corpus Analysis
In: Proceedings of the Fourth International Workshop on Generative Approaches to the Lexicon, 2007.
(in print)
Presented at: Fourth International Workshop on Generative Approaches to the Lexicon, 10-11.5.2007, Paris,
France.
Hanks Patrick, Pala Karel
Towards an empirically well-founded semantic ontology for NLP
In: Proceedings of the Fourth International Workshop on Generative Approaches to the Lexicon, 2007.
Presented at: Fourth International Workshop on Generative Approaches to the Lexicon, 10-11.5.2007, Paris,
France.
This paper examines some issues involved in
building a corpus-based ontology for use in
determining the meaning of words in text, in the
context of creating a “pattern dictionary”. How do
words cluster in paradigmatic lexical sets in actual
usage (as reflected in a large corpus), and can these
clusters be mapped onto a semantically structured
ontology? What semantic notions need to be
distinguished for this purpose, and what are the
appropriate theoretical foundations? What other
elements are needed for the application of
determining meaning in text?
Hanks Patrick
Editorial: Cognition and the Lexicon
In: Lexicology, (Ed. Hanks P.), Volume: 5, Routledge, Taylor and Francis Group, 2007.
ISBN: 978-0-415-70098-6
Hanks Patrick
Editorial: Formal Approaches to the Lexicon
In: Lexicology, (Ed. Hanks P.), Volume: 6, Routledge, Taylor and Francis Group, 2007.
ISBN: 978-0-415-70098-6
Hlaváčková D., Pala Karel
Surface and Deep Valency Frames in Czech
In: Proceedings of the 25th International Conference on Lexis and Grammar, 2007.
(in print)
Presented at: The 25th International Conference on Lexis and Grammar, 6.9.-10.9.2006, Palermo,
Italy.
Hlaváčková D., Pala Karel
Computer Processing Derivational Relations in Czech
In: Computer Treatment of Slavic and East European Languages, L. Štúr Institute of Linguistics, Slovak Academy of Sciences, Bratislava, 2007, pp. 198-208.
Presented at: Slovko 2007, 25.-27.10.2007, Bratislava,
Slovakia.
In the paper we deal with the derivational relations in Czech that form typical derivational nests (or subnets). Derivational relations are mostly of semantic nature and their regularity in Czech allows us to describe them in a way suitable for computer processing and then add them to the electronic databases such as WordNet almost automatically. For this purpose we have used the derivational version of morphological analyzer Ajka that is able to handle the basic and most productive derivational relations in Czech. A special derivational interface has been developed in our NLP Lab at FI MU by means of which we have explored the semantic nature of the selected noun derivational suffixes (22) as well as verb prefixes and established a set of the semantically labeled derivational relations, presently 14. With regard to the verbs we have paid attention to the selected verb semantic classes in connection with the derivational relations between selected prefixes (4) and corresponding Czech verbs. As an application we have added the selected derivational relations to the Czech WordNet and in this way enriched it with approx. 30 000 new Czech synsets.
Horák Aleš, Rambousek Adam
Administration Framework for the DEB Dictionary Server
In: Computer Treatment of Slavic and East European Languages, L. Štúr Institute of Linguistics, Slovak Academy of Sciences, Bratislava, 2007, pp. 70-79.
Presented at: Slovko 2007, 25.-27.10.2007, Bratislava,
Slovakia.
This paper presents a new implementation of the administration framework for the DEBII dictionary writing system. We present the details and examples of the user management part as well as graphical scenarios for dictionary service setup, adaptation and automatic generation of a user application based on the dictionary XML schema.
Horák Aleš, Rambousek Adam
DEB Platform Deployment - Current Applications
In: RASLAN 2007: Recent Advances in Slavonic Natural Language Processing, Masaryk University, Brno, 2007, pp. 3-11.
In this paper, we summarize the latest developments regarding the client dictionary writing applications based on the DEB development platform. The DEB framework is nowadays used in several full-grown projects for the preparation of high-quality lexicographic data created within (possibly distant) teams of researchers. We briefly present the current list of DEB applications with the relevant projects and their phases. For each of the applications, we show a view of the interface with an overview of the most important features.
Horák Aleš, Rambousek Adam
Dictionary Management System for DEB Development Platform
In: NLPCS 2007: Proceedings of the 4th International Workshop on Natural Language Processing and Cognitive Science, INSTICC PRESS, Funchal, Portugal, 2007, pp. 129-138.
Presented at: NLPCS 2007, 12.-16.6.2007, Funchal,
Madeira - Portugal.
In the paper, we introduce a new dictionary management interface for the design, preparation and presentation of generic electronic XML dictionaries using the DEB (Dictionary Editing and Browsing) development platform. The DEB platform provides a strict client-server environment for general dictionary writing systems. So far, several successful NLP tools have been implemented on this platform, one of the best known being the DEBVisDic tool for wordnet semantic network editing and visualization. This paper describes a new part of the DEB platform: the Administration interface that is shared by all DEB applications running on one server machine.
Horák Aleš, Holan Tomáš, Kadlec V., Kovář Vojtěch
Dependency and Phrasal Parsers of the Czech Language: A Comparison
In: Proceedings of Text, Speech and Dialogue 2007, Springer, LNAI 4629, Berlin, Heidelberg, 2007, pp. 76-84.
Presented at: TSD 2007, 3.-7.9.2007, Plzeň,
Czech Republic.
In the paper, we present the results of an experiment comparing the effectiveness of real-text parsers of the Czech language based on completely different approaches: stochastic parsers that provide dependency trees as their outputs, and a meta-grammar parser that generates a resulting chart structure representing a packed forest of phrasal derivation trees.
We describe and formulate the main questions and problems accompanying such an experiment, try to offer answers to these questions, and finally present the actual results of the tests measured on 10 thousand Czech sentences.
Kovář Vojtěch, Horák Aleš
Reducing the Number of Resulting Parsing Trees for the Czech Language Using the Beautified Chart Method
In: Proceedings of 3rd Language and Technology Conference, Wydawnictwo Poznańskie, Poznań, 2007, pp. 433-437.
Presented at: LTC`07, 5.-7.2007, Poznań,
Poland.
In the paper, we present the beautified chart method used for reducing the number of output derivation trees for the Czech syntax parser synt. We show the evaluation results of the method, describe the appropriate algorithms and the parser internal data structures as well as problems with their implementation.
Nováček Vít
Imprecise Empirical Ontology Refinement: Application to Taxonomy Acquisition
In: Proceedings of ICEIS 2007, Kluwer Academic Publishing, Artificial Intelligence and Decision Support Systems, London, 2007, pp. 8.
(in print)
Enterprise Information Systems (ICEIS 2007, revised selected papers), Springer, 2007, pp. 8.
(in print)
Presented at: ICEIS 2007, 12.-16.6.2007, Funchal,
Madeira - Portugal.
Nováček Vít, Laera Loredana, Handschuh Siegfried
Dynamic Integration of Medical Ontologies in Large Scale
In: Proceedings of WWW2007/HCLSDI, ACM Press, New York, 2007, pp. 10.
(in print)
Nováček Vít, Laera Loredana, Handschuh Siegfried
Aiding the Data Integration in Medicinal Settings by Means of Semantic Technologies
In: Making Semantics Work for Business, Semantic Technology Institutes International Workshop at European Semantic Technology Conference, Vienna, Austria, 2007.
(in print)
Nováček Vít
A Non-traditional Inference Paradigm for Learned Ontologies
In: Proceedings of ESWC 2007 PhD Symposium, CEUR Workshop proceedings Workshop at ESWC 2007, Innsbruck, 2007.
Nováček Vít, Laera Loredana, Handschuh Siegfried
Semi-automatic Integration of Learned Ontologies into a Collaborative Framework
In: Proceedings of IWOD/ESWC 2007, Springer Verlag, Innsbruck, 2007, pp. 14.
(in print)
Nováček Vít, Dabrowski Maciej, Kruk Sebastian R.
Extending Community Ontology Using Automatically Generated Suggestions
In: Proceedings of FLAIRS 2007, AAAI Press, Menlo Park, CA, 2007, pp. 6.
(in print)
Nováček Vít, Handschuh Siegfried, Laera Loredana, Maynard Diana, Voelkel Max
Dynamic Ontology Lifecycle Scenario in Translational Medicine
In: Proceedings of the 5th European Conference of Computational Biology (ECCB 2006) - Book of Abstracts, Oxford University Press, Oxford, 2007, pp. 5.
(in print)
Novák David, Zezula Pavel
LOBS: Load Balancing for Similarity Peer-to-Peer Structures
Technical Report: FIMU-RS-2007-04, Faculty of Informatics, Masaryk University, Brno, 2007, 22 p.
Novák David
Image Similarity Search: Theory and Practice
In: Third Doctoral Workshop on Mathematical and Engineering Methods in Computer Science MEMICS 2007, Masaryk University and Technical University of Brno, Brno, 2007, pp. 154-160.
Presented at: MEMICS 2007, 26.10.-28.10.2007, Znojmo,
Czech Republic.
Novák David, Zezula Pavel
LOBS: Load Balancing for Similarity Peer-to-Peer Structures
In: Databases Information Systems and Peer-to-Peer Computing 2007, Springer Verlag, Berlin Heidelberg New York, 2007, pp. 1-8.
Presented at: DBISP2P 2007, 24.9.2007, Vienna,
Austria.
Novák David, Batko Michal, Dohnal Vlastislav, Zezula Pavel
Scaling up the Image Content-based Retrieval
In: Second DELOS Conference 2007 - Working Notes, DELOS Network of Excellence, Pisa, Italy, 2007, pp. 1-10.
Presented at: DELOS Conference 2007, 13-14.2.2007, Pisa,
Italy.
Pala Karel, Horák Aleš, Rambousek Adam, Vetulani Zygmunt, Konieczka Paweł, Marciniak Jacek, Obrębski Tomasz, Rzepecki Przemysław, Walkowska Justyna
DEB Platform tools for effective development of WordNets in application to PolNet
In: Proceedings of 3rd Language & Technology Conference, Fundacja Uniwersytetu im. A. Mickiewicza, Poznań, 2007, pp. 514-518.
Presented at: LTC`07, 5.-7.2007, Poznań,
Poland.
Pomikálek Jan, Řehůřek R.
The Influence of Preprocessing Parameters on Text Categorization
In: International Conference on Computer, Information and Systems Science and Engineering, Springer, 2007.
(in print)
Presented at: XIX International Conference on Computer, Information and Systems Science and Engineering, 29.1.-31.1.2007, Bangkok,
Thailand.
Rychlý Pavel, Kovář Vojtěch
Displaying Bidirectional Text Concordances in KWIC format
In: Proceedings of 5th Biennial Conference of the Asian Association for Lexicography, University of Madras, Chennai, India, 2007, pp. 96-100.
Presented at: Asialex 2007, 6.-8.12.2007, Chennai,
India.
Rychlý Pavel
Manatee/Bonito - A Modular Corpus Manager
In: RASLAN 2007: Recent Advances in Slavonic Natural Language Processing, Masaryk University, Brno, 2007, pp. 97-102.
Rychlý Pavel, Kilgarriff A.
An Efficient Algorithm for Building a Distributional Thesaurus (and other Sketch Engine Developments)
In: Association for Computational Linguistics, Proceedings of the ACL 2007 Demo and Poster Sessions, Prague, 2007, pp. 41-44.
Presented at: ACL 2007, 23.-30.6.2007, Prague,
Czech Republic.
Sedmidubský Jan, Bartoň Stanislav, Dohnal Vlastislav, Zezula Pavel
Adaptive Approximate Similarity Searching through Metric Social Networks
Technical Report: FIMU-RS-2007-06, Faculty of Informatics, Masaryk University, Brno, 2007, 22 p.
Exploiting the concepts of social networking represents a novel approach to the approximate
similarity query processing. We present an unstructured and dynamic P2P environment in
which a metric social network is built. Social communities of peers giving similar results
to specific queries are established and such ties are exploited for answering future queries.
Based on the universal law of generalization, a new query forwarding algorithm is introduced
and evaluated. The same principle is used to manage query histories of individual peers with
the possibility to tune the tradeoff between the extent of the history and the level of the
query-answer approximation. All proposed algorithms are tested on real data and medium-sized
P2P networks consisting of tens of computers.
Sedmidubský Jan, Bartoň Stanislav, Dohnal Vlastislav, Zezula Pavel
Querying Similarity in Metric Social Networks
In: Network-Based Information Systems, First International Conference, NBiS 2007, Springer, Berlin, 2007, pp. 278-287.
Presented at: NBiS 2007, 3.-7.9.2007, Regensburg,
Germany.
In this paper we tackle the issues of exploiting the concepts of social networking in processing similarity queries in the environment of a P2P network. The processed similarity queries lay the basis on which the relationships among peers are created. Consequently, communities encompassing similar data emerge in the network. The architecture of the presented metric social network is formally defined using the acquaintance and friendship relations. Two versions of the navigation algorithm are presented and thoroughly experimentally evaluated. Finally, the learning ability of the metric social network is presented and discussed.
Zezula Pavel, Giuseppe Amato, Dohnal Vlastislav
Similarity Search: The Metric Space Approach
In: ACM SAC 2007 Conference. ACM SAC 2007 Conference Tutorial, ACM, Seoul, Korea, 2007.
Presented at: ACM SAC 2007, , Seoul,
Korea.
Similarity searching has become a fundamental computational task in a variety of application areas, including multimedia information retrieval, data mining, pattern recognition, machine learning, computer vision, biomedical databases, data compression and statistical data analysis. In such environments, an exact match has little meaning, and proximity/distance (similarity/dissimilarity) concepts are typically much more fruitful for searching. In this tutorial, we review the state of the art in developing similarity search mechanisms that accept the metric space paradigm. We explain the high extensibility of the metric space approach and demonstrate its capability with examples of distance functions. The efforts to further speed up retrieval are demonstrated by a class of approximated techniques and the very recent proposals of scalable and distributed structures based on the P2P communication paradigm.