![Scientific and Technical Journal of Information Technologies, Mechanics and Optics](/images/mag-ntv.png)
PROBABILITY DISTRIBUTION OVER THE SET OF CLASSES IN ARABIC DIALECT CLASSIFICATION TASK
Abstract
Subject of Research. We propose an approach to machine learning classification that uses information about the probability distribution over the set of class labels in the training data. The algorithm is illustrated on a complex natural language processing task: classification of Arabic dialects. Method. Each object in the training set is associated with a probability distribution over the set of class labels rather than with a single class label. The proposed approach takes this distribution into account when solving the classification problem in order to improve the quality of the resulting classifier. Main Results. The approach is illustrated on automatic classification of Arabic dialects. The analyzed data were collected from the Twitter social network, contain marker words, and belong to six Arabic dialects (Saudi, Levantine, Algerian, Egyptian, Iraqi, Jordanian) and to Modern Standard Arabic (MSA). The results show that taking probability distributions over the set of classes into account improves the quality of the resulting classifier. In our experiments, even a relatively naive use of the probability distributions improves the precision of the classifier from 44% to 67%. Practical Relevance. The approach and the corresponding algorithm can be used effectively when manual annotation by experts requires significant financial and time resources, but a system of heuristic rules can be created instead. Implementing the proposed algorithm makes it possible to reduce data preparation costs significantly without a substantial loss in classification precision.
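The abstract does not specify the classifier, so the following is only a minimal sketch of one naive way to train on a probability distribution over class labels instead of a single label. It assumes a multinomial naive Bayes model in which every training object contributes fractional counts to each class in proportion to its label probability. The function names `train_soft_nb` and `predict_proba`, the smoothing parameter `alpha`, and the transliterated toy tokens are hypothetical and are not taken from the paper.

```python
import math
from collections import defaultdict


def train_soft_nb(docs, soft_labels, alpha=1.0):
    """Fit a multinomial naive Bayes model from soft labels.

    docs        -- list of token lists
    soft_labels -- list of dicts {class_label: probability}, one per doc
    alpha       -- additive (Laplace) smoothing constant (assumed value)
    """
    class_mass = defaultdict(float)                      # fractional doc counts per class
    word_mass = defaultdict(lambda: defaultdict(float))  # fractional token counts per class
    vocab = set()
    for tokens, dist in zip(docs, soft_labels):
        vocab.update(tokens)
        for label, p in dist.items():
            if p <= 0.0:
                continue  # a class with zero probability receives no counts
            class_mass[label] += p
            for tok in tokens:
                word_mass[label][tok] += p
    return {
        "class_mass": dict(class_mass),
        "word_mass": {c: dict(w) for c, w in word_mass.items()},
        "vocab": vocab,
        "alpha": alpha,
    }


def predict_proba(model, tokens):
    """Return the posterior distribution over classes for one token list."""
    alpha = model["alpha"]
    vocab = model["vocab"]
    total_mass = sum(model["class_mass"].values())
    log_scores = {}
    for label, mass in model["class_mass"].items():
        counts = model["word_mass"].get(label, {})
        denom = sum(counts.values()) + alpha * len(vocab)
        score = math.log(mass / total_mass)  # class prior from fractional counts
        for tok in tokens:
            if tok in vocab:  # tokens never seen in training are ignored
                score += math.log((counts.get(tok, 0.0) + alpha) / denom)
        log_scores[label] = score
    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    exp_scores = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(exp_scores.values())
    return {c: e / z for c, e in exp_scores.items()}


# Toy data (hypothetical): each text has a distribution over two dialects,
# as a heuristic rule system might produce instead of an expert label.
docs = [["izzayak", "awi"], ["kifak", "ktir"], ["awi", "ktir"]]
soft_labels = [
    {"Egyptian": 0.9, "Levantine": 0.1},
    {"Levantine": 0.8, "Egyptian": 0.2},
    {"Egyptian": 0.5, "Levantine": 0.5},
]

model = train_soft_nb(docs, soft_labels)
probs = predict_proba(model, ["izzayak"])
print(max(probs, key=probs.get), probs)
```

With hard labels, the third toy text would have to be forced into one class. With soft labels, its tokens are split evenly between both classes, which is the kind of information the abstract describes as being taken into account.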