AN APPROACH FOR CLONE DETECTION IN DOCUMENTATION REUSE
Abstract
The paper presents a method for finding repetitions in DocBook/DRL or plain-text documents. The algorithm is based on software clone detection and filters its results: a clone group is rejected if its clones are shorter than 5 symbols, overlapping clone groups are eliminated, meaningless clones are removed, and groups whose clones consist only of XML markup are discarded. Repeated search is also supported: the clones found are extracted from the documentation and the search is run again on the remaining text; one such step proved to be enough. The adaptive reuse technique of Paul Bassett and Stan Jarzabek has been implemented. A software tool has been developed on the basis of the algorithm. The tool lets the user set the parameters of repetition detection and visualizes the results. It is integrated into the DocLine document development environment and supports refactoring of documents using the clones found. The Clone Miner clone detection utility performs the clone search itself. The method has been evaluated on the Linux Kernel Documentation (29 documents, 25000 lines). Five semantic kinds of clones have been identified: terms (abbreviations, one-word and two-word terms), hyperlinks, license agreements, functionality descriptions, and code examples. In total, 451 meaningful clone groups were found; the average clone length is 4.43 tokens, and the average number of clones in a group is 3.56.
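The filtering stage described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the CloneGroup type, the helper names, the "no letters or digits" test for meaningless clones, and the rule of keeping the longer clone when two groups overlap are all assumptions made for illustration. Only the 5-symbol threshold and the four filter kinds come from the abstract.

```python
import re
from dataclasses import dataclass

MIN_CLONE_LENGTH = 5  # symbols; the threshold stated in the abstract


@dataclass(frozen=True)
class CloneGroup:
    """Hypothetical container: one repeated fragment and where it occurs."""
    text: str          # the repeated fragment
    positions: tuple   # start offset of each occurrence in the document

    def spans(self):
        # Half-open (start, end) interval for every occurrence.
        return [(p, p + len(self.text)) for p in self.positions]


# One or more XML tags separated only by whitespace.
XML_ONLY = re.compile(r"\s*(?:<[^<>]*>\s*)+")


def is_xml_only(text):
    return XML_ONLY.fullmatch(text) is not None


def is_meaningless(text):
    # Assumption: a fragment with no letters or digits carries no meaning.
    return not any(ch.isalnum() for ch in text)


def overlaps(a, b):
    # True if any occurrence of a intersects any occurrence of b.
    return any(s1 < e2 and s2 < e1
               for s1, e1 in a.spans()
               for s2, e2 in b.spans())


def filter_groups(groups):
    candidates = [
        g for g in groups
        if len(g.text) >= MIN_CLONE_LENGTH
        and not is_xml_only(g.text)
        and not is_meaningless(g.text)
    ]
    kept = []
    # Assumption: overlaps are resolved in favour of the longer clone.
    for g in sorted(candidates, key=lambda g: len(g.text), reverse=True):
        if not any(overlaps(g, k) for k in kept):
            kept.append(g)
    return kept


groups = [
    CloneGroup("<para></para>", (0, 100)),                 # XML only
    CloneGroup("GPL", (10, 50)),                           # too short
    CloneGroup("GNU General Public License", (200, 400)),
    CloneGroup("General Public", (204, 404)),              # overlaps the group above
    CloneGroup(" --- ", (300, 350)),                       # no letters or digits
]
result = filter_groups(groups)
```

In this sketch only the "GNU General Public License" group survives. In the tool described above, the candidate groups would come from Clone Miner rather than being listed by hand.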
Articles in current issue
- PHOTONICS AND OPTICAL INFORMATICS IN EUROPE: TRENDS OF 2003–2013
- TWO-DIMENSIONAL LOCALIZATION OF ATOMIC POPULATIONS IN FOUR-LEVEL QUANTUM SYSTEMS
- THE RECURRENT ALGORITHM FOR INTERFEROMETRIC SIGNALS PROCESSING BASED ON MULTI-CLOUD PREDICTION MODEL
- INVESTIGATION OF BIOLOGICAL OBJECTS IN OPTICAL COHERENCE TOMOGRAPHY WITH DATA PROCESSING BY SEQUENTIAL MONTE CARLO METHOD
- AUTOMATIC CALIBRATION METHOD FOR STEREOSCOPIC SYSTEM
- METHOD OF IMAGE QUALITY ENHANCEMENT FOR SPACE OBJECTS
- ROBUST REGULATION FOR SYSTEMS WITH POLYNOMIAL NONLINEARITY APPLIED TO RAPID THERMAL PROCESSES
- NANOSTRUCTURING AS A WAY FOR THERMOELECTRIC EFFICIENCY IMPROVEMENT
- SPECTRAL AND LUMINESCENT PROPERTIES OF CHROMIUM IONS IN FORSTERITE-LIKE NANO-GLASS CERAMICS
- SPECTRAL AND LUMINESCENT PROPERTIES OF FLUOROPHOSPHATE GLASSES DOPED WITH YTTERBIUM AND ERBIUM
- PARAMETERS OPTIMIZATION OF METAL-DIELECTRIC NANOSTRUCTURES FOR SENSOR APPLICATIONS
- HLD-METHODOLOGY APPLICATION FOR RECONFIGURABLE EMBEDDED SYSTEMS DESIGN
- METHOD OF HIGH-QUALITY SPEECH SYNTHESIS WITH A SMALL DATABASE USAGE
- DETECTION OF CLIPPED FRAGMENTS IN ACOUSTIC SIGNALS
- TWO-LEVEL HIERARCHICAL COORDINATION QUEUING METHOD FOR TELECOMMUNICATION NETWORK NODES
- AN APPROACH FOR CLONE DETECTION IN DOCUMENTATION REUSE
- EFFECTIVENESS ASSESSMENT METHODOLOGY OF INFORMATION SECURITY MANAGEMENT SYSTEM THROUGH THE SYSTEM RESPONSE TIME TO INFORMATION SECURITY INCIDENTS
- MOVING PERSON IDENTIFICATION IN VIDEO SURVEILLANCE SYSTEMS
- MULTISENSOR SYSTEM APPLICATION FOR PREPARATIONS BITTERNESS EVALUATION IN TRADITIONAL CHINESE MEDICINE
- ACCURACY EVALUATION FOR THE NON-CONTACT DEFECT AREA MEASUREMENT AT THE COMPLEX-SHAPE SURFACES UNDER VIDEOENDOSCOPIC CONTROL
- COMPARATIVE ANALYSIS OF ENERGY ACCUMULATION SYSTEMS AND DETERMINATION OF OPTIMAL APPLICATION AREAS FOR MODERN SUPER FLYWHEELS
- MULTI-GRID METHOD OF CONVERGENCE SPEEDING-UP FOR THE SOLUTION OF GAS DYNAMICS PROBLEMS ON UNSTRUCTURED MESHES
- EXTENSION OF TENSOR PRODUCT FOR OPERATORS ON THE DIRAC OPERATOR EXAMPLE
- MOLECULAR DYNAMIC SIMULATION OF PEPTIDE POLYELECTROLYTES
- IDENTIFICATION OF NONLINEAR MODEL PARAMETERS FOR RAPID THERMAL PROCESSES