HyTex Phase I

Model architecture and strategies of text-to-hypertext conversion
The specialized text corpus and its editing
TermNet - modeling and processing of terminological knowledge

The results of the first phase are documented in the report on work and results, project B1 (PDF 184 KB). Practical results of the first phase are:
If you are interested in these results, do not hesitate to contact us.

Model architecture and strategies of text-to-hypertext conversion

In the first phase, we developed a model architecture as a theoretical and methodological basis for automatic text-to-hypertext conversion. This architecture exploits information from three layers for segmentation and linking according to coherence criteria (cf. depiction).
In developing strategies of text-to-hypertext conversion, we focused on the following subject areas:
Macro strategy: "Terminology-sensitive linking"

A main problem in establishing coherence during the selective reading of specialized texts is that readers cannot tell which specific conceptualization the author had in mind when using a term. In the subject area of terminology-sensitive linking, we are developing a pragmatically grounded method that links instances of specialized terms to the appropriate definition in the preceding text, which is needed to understand the correct meaning of the term in its current context.
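The linking idea described above can be illustrated with a small Python sketch. It is not the project's actual method: it assumes, purely for illustration, that the "appropriate" definition is the closest definition of the same term that precedes the term instance. The class names, the `position` offsets, and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Definition:
    term: str      # the defined term (hypothetical normalized form)
    position: int  # character offset of the definition in the text


@dataclass(frozen=True)
class TermUse:
    term: str      # the term as used in running text
    position: int  # character offset of this instance


def link_to_preceding_definition(use, definitions):
    """Return the closest definition of the same term that precedes the use.

    Assumption: "appropriate" is approximated by "nearest preceding".
    Returns None when no definition of the term occurs before the use.
    """
    candidates = [
        d for d in definitions
        if d.term == use.term and d.position < use.position
    ]
    return max(candidates, key=lambda d: d.position, default=None)


# Hypothetical sample data: two definitions of "hypertext", one of "coherence".
definitions = [
    Definition("hypertext", 120),
    Definition("hypertext", 900),
    Definition("coherence", 300),
]
uses = [
    TermUse("hypertext", 500),
    TermUse("hypertext", 1500),
    TermUse("coherence", 100),
]
links = {use: link_to_preceding_definition(use, definitions) for use in uses}
```

In this sketch, a term instance that appears before any definition of its term gets no link at all; a real system would need a fallback for that case, such as a glossary entry.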
The specialized text corpus and its editing

The complete specialized text corpus consists of documents of different text types and contains approximately 25,000 standard pages. In order to distinguish the logical document structure of the corpus, we - in cooperation with the sub-project
Demo prototype HyTex.1

We have nearly completed the development of a demo prototype with which the different strategies of text-to-hypertext conversion can be tested. For that purpose, the core corpus was annotated for logical text structure, definitions, and instances of term use; the annotation of co-reference phenomena and connectives is not yet complete. The strategies of text-to-hypertext conversion (segmentation and linking) were applied.

TermNet - modeling and processing of terminological knowledge

By borrowing from the description concepts introduced in Our statistics show the various units which have been modeled (TermSets, lexemes, different types of relations).
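To make the units named above more concrete, here is a minimal Python sketch of how TermSets, lexemes, and typed relations might be represented and counted. The data layout, the method names, and the relation type `narrower_than` are assumptions made for illustration; the actual TermNet model is not specified in this text.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class TermSet:
    id: str
    lexemes: list  # lexemes grouped in this TermSet (plain strings here)


@dataclass
class TermNet:
    termsets: dict = field(default_factory=dict)
    # Each relation is a (source_id, relation_type, target_id) triple.
    relations: list = field(default_factory=list)

    def add_termset(self, termset_id, lexemes):
        self.termsets[termset_id] = TermSet(termset_id, list(lexemes))

    def relate(self, source_id, relation_type, target_id):
        if source_id not in self.termsets or target_id not in self.termsets:
            raise KeyError("both TermSets must exist before relating them")
        self.relations.append((source_id, relation_type, target_id))

    def statistics(self):
        """Count the modeled units: TermSets, lexemes, relations per type."""
        stats = Counter()
        stats["termsets"] = len(self.termsets)
        stats["lexemes"] = sum(len(t.lexemes) for t in self.termsets.values())
        for _, relation_type, _ in self.relations:
            stats["relation:" + relation_type] += 1
        return dict(stats)


# Hypothetical sample content.
net = TermNet()
net.add_termset("ts1", ["link", "hyperlink"])
net.add_termset("ts2", ["reference"])
net.relate("ts1", "narrower_than", "ts2")
stats = net.statistics()
```

Keeping relations as typed triples between TermSets, rather than between individual lexemes, is one possible design choice; it keeps the per-type counts simple to compute.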
Technical implementation

The technical implementation is based on XML technologies. The different annotation layers are unified and converted into a web-based presentation format using XSLT. TermNet is analyzed in the same process. In future work, this transformation will no longer be programmed directly in XSLT but in HTTL (Hypertext Transformation Language), a programming language we developed ourselves for generating hypertext views.
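The step of unifying annotation layers and converting them into a web presentation can be sketched as follows. This is a Python illustration using the standard library, not the project's XSLT or HTTL code. The element names (`doc`, `para`, `definitions`, `definition`), the attributes, and the sample content are all hypothetical, and the sketch assumes that layers refer to each other through shared paragraph ids.

```python
import xml.etree.ElementTree as ET

# Hypothetical layer 1: logical document structure.
STRUCTURE_LAYER = """<doc>
  <para id="p1">A hypertext is a network of linked text nodes.</para>
  <para id="p2">Hypertext nodes can be read selectively.</para>
</doc>"""

# Hypothetical layer 2: definitions, pointing at paragraphs by id.
DEFINITION_LAYER = """<definitions>
  <definition target="p1" term="hypertext"/>
</definitions>"""


def unify_layers(structure_xml, definition_xml):
    """Merge both layers into one HTML fragment, returned as a string."""
    structure = ET.fromstring(structure_xml)
    definitions = ET.fromstring(definition_xml)

    # Map paragraph id -> defined term, taken from the definition layer.
    defined = {
        d.get("target"): d.get("term")
        for d in definitions.iter("definition")
    }

    body = ET.Element("body")
    for para in structure.iter("para"):
        p = ET.SubElement(body, "p", {"id": para.get("id")})
        p.text = para.text
        term = defined.get(para.get("id"))
        if term is not None:
            # Mark paragraphs that carry a definition so they can be linked to.
            p.set("class", "definition")
            p.set("data-term", term)
    return ET.tostring(body, encoding="unicode")


html = unify_layers(STRUCTURE_LAYER, DEFINITION_LAYER)
```

The resulting fragment gives each definition paragraph an `id` that term instances elsewhere in the text could use as a link target.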