Cluster analysis, also known as clustering, decomposes a statistical data file into groups based on a measure of mutual dissimilarity between objects. When the input data is interpreted as a matrix, each row vector holds the attributes of a particular object, e.g. the questionnaire answers of a single person, while each column contains the values of a particular statistical criterion for all objects, e.g. all questioned respondents. During the clustering process the objects are grouped into clusters so that the objects in a cluster are more similar to each other than to the objects in other clusters. The dissimilarity measure is a metric that assigns each pair of objects from the given data file a dissimilarity level between 0 and 1. Identical objects have a dissimilarity of 0, while completely different ones have a dissimilarity of 1. There are many ways to construct the dissimilarity measure, and they depend on the types of criteria being considered. Most commercial clustering applications expect a single type for all criteria in order to provide valid clustering results. However, in real-life applications the criteria types within a single data file usually vary.
Clusteryzer offers multiple clustering methods suitable for mixed data analysis:
- k-prototypes method, which covers the k-means and k-modes methods (limit: 1000 objects),
- polythetic agglomerative hierarchical clustering using single linkage, average linkage and complete linkage (limit: 100 objects),
- monothetic divisive hierarchical clustering (limit: 500 objects),
- minimum spanning tree (skeleton) based clustering (limit: 100 objects).
To provide correct results, mixed criteria types are fully supported.
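To illustrate how a dissimilarity between 0 and 1 can be built for mixed criteria, the sketch below averages per-criterion differences in the style of Gower's coefficient. It is an assumption made for illustration only, not necessarily the measure Clusteryzer uses. The function name `mixed_dissimilarity`, the `kinds` labels and the `ranges` argument are hypothetical. Ordinal values are assumed to be rank codes, and asymmetric binary criteria are not handled.

```python
def mixed_dissimilarity(a, b, kinds, ranges):
    """Gower-style dissimilarity in [0, 1] between two objects.

    a, b   -- attribute values of the two objects (one row each)
    kinds  -- criterion type per column: "quantitative", "ordinal",
              "nominal" or "binary" (hypothetical labels)
    ranges -- max - min of each column over the whole data file
              (ignored for nominal and binary columns)
    """
    total = 0.0
    for x, y, kind, rng in zip(a, b, kinds, ranges):
        if kind in ("quantitative", "ordinal"):
            # Range-normalised absolute difference, always within [0, 1].
            total += abs(x - y) / rng if rng else 0.0
        else:
            # Nominal or symmetric binary: simple mismatch (0 or 1).
            total += 0.0 if x == y else 1.0
    # Average over all criteria so the result stays within [0, 1].
    return total / len(kinds)


# Two cars: engine power (kW), colour code, automatic gearbox (0/1).
kinds = ["quantitative", "nominal", "binary"]
ranges = [100.0, None, None]
car_a = [100.0, 1, 1]
car_b = [150.0, 2, 1]
print(mixed_dissimilarity(car_a, car_b, kinds, ranges))  # 0.5
```

Identical objects score 0 and objects that differ maximally on every criterion score 1, which matches the scale described above. Normalising by the column range is one common design choice; other scalings are possible.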
The following criteria types can be distinguished in a data file:
- quantitative: numeric values on an interval or ratio scale; these represent continuous characteristics and are supported by most available clustering methods, e.g. car engine power,
- ordinal: values can be ordered as in the case of a quantitative criterion, but they do not represent a natural quantitative property, so the difference or ratio of two values cannot be evaluated, e.g. employee seniority level,
- nominal: values are discrete and distinct, but there is no natural interpretation of their mutual ordering, e.g. the color of a car,
- binary: criteria with only two possible values. There are two variants, depending on whether the two values carry equal weight (symmetric) or not (asymmetric).
Various combinations of criteria types can be easily investigated.
Clusteryzer accepts Microsoft Excel 2007 sheets (XLSX) and comma-separated values files (CSV) containing non-negative numerical data. Data is imported into the application either through iTunes file sharing or from e-mail attachments, and the resulting data can be exported in the same formats.
The clustering results can be explored in the Clusteryzer application through the following views:
- a hierarchical clustering view,
- a clustering grouped into a desired number of clusters, including the values for each object,
- a cluster distribution,
- a dendrogram for hierarchical methods (limited to a certain depth),
- cluster centroids for the k-prototypes method,
- a 2D and 3D cluster modeling view with model rotation, in which any pair or triad of criteria can be investigated.
The dendrogram and the cluster modeling view can be exported as JPG images for further processing.
Clusteryzer offers an easy-to-use interface and straightforward investigation capabilities for managers, data miners and researchers.
Clusteryzer is a handy tool, and not just for use on the go: it makes cluster analysis easier than ever.
Size: 4.88 MB
Price: 0,00 €
Day of release: 0000-00-0