FEATURES OF NON-LOCAL SEMANTIC LINKS IN RUSSIAN TEXTS
Abstract
Subject of Research. One approach to automatic text analysis is the construction of subordination trees, in which the words of a sentence are connected by semantic-syntactic links. The study covers Russian-language texts of general political, literary, and highly specialized character. Special attention is paid to cases in which the linked words are separated by a considerable distance. Method. The subordination trees were built with a semantic-syntactic parser. The distribution of links of different types over link length was then calculated, and the frequencies of non-local links were studied. Main Results. It is shown that the fraction of non-local links, depending on the link type, can reach tens of percent. This is especially important for links originating from predicate nodes (subject, adverbial, etc.), as well as for anaphoric links. It is noted that publicly available semantic classifiers and thesauri have limited applicability to the problem of correctly linking distant words in a sentence. Practical Relevance. It is shown that, when extracting ontological or scenario-based information or resolving coreference, the long syntactic links that form the non-local semantic context cannot be neglected. It is concluded that the analysis of n-grams alone is insufficient for adequately extracting ontological or scenario-based information from text. This creates a need to compile micro-dictionaries focused on particular syntactic structures.
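The counting step described under Method (distribution of links by type and length, then the share of non-local links) might be sketched as follows. This is only an illustration under stated assumptions: the (dependent, head, link type) triple format, the function names, the sample link labels, and the threshold of 3 word positions for calling a link "non-local" are all hypothetical and are not taken from the parser or from the authors' definitions.

```python
from collections import Counter, defaultdict

# Assumed input: one (dependent_pos, head_pos, link_type) triple per link,
# with 1-based word positions inside a single sentence (hypothetical format).
def link_length_distribution(links):
    """Count links per type and per length (distance in word positions)."""
    dist = defaultdict(Counter)
    for dep, head, link_type in links:
        dist[link_type][abs(dep - head)] += 1
    return dist

def nonlocal_fraction(dist, threshold=3):
    """Share of links per type whose length is at least `threshold`.

    The threshold value is an assumption made for this sketch.
    """
    fractions = {}
    for link_type, lengths in dist.items():
        total = sum(lengths.values())
        far = sum(n for length, n in lengths.items() if length >= threshold)
        fractions[link_type] = far / total
    return fractions

# Made-up links for one sentence, for illustration only.
sample_links = [
    (1, 5, "subject"),
    (2, 1, "attribute"),
    (4, 5, "adverbial"),
    (7, 5, "object"),
    (6, 7, "attribute"),
    (9, 1, "anaphoric"),
]

dist = link_length_distribution(sample_links)
fractions = nonlocal_fraction(dist)
for link_type, share in sorted(fractions.items()):
    print(f"{link_type}: {share:.0%} non-local")
```

Grouping by link type before thresholding keeps the full length histogram available, so the cut-off for "non-local" can be varied without re-reading the parsed corpus.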