High Performance Data Mining: Scaling Algorithms, Applications and Systems, by Yike Guo, R.L. Grossman

By Yike Guo, R.L. Grossman

High Performance Data Mining: Scaling Algorithms, Applications and Systems brings together in one place important contributions and up-to-date research results in this fast-moving area.
High Performance Data Mining: Scaling Algorithms, Applications and Systems serves as an excellent reference, providing insight into some of the most challenging research issues in the field.



Best organization and data processing books

Visual and Spatial Analysis - Advances in Data Mining, Reasoning, and Problem Solving Boris Kovalerchuk (Springer 2004 596s)

Advanced visual analysis and problem solving has been conducted successfully for millennia. The Pythagorean Theorem was proven using visual means more than 2000 years ago. In the 19th century, John Snow stopped a cholera epidemic in London by proposing that a specific water pump be shut down. He discovered that pump by visually correlating data on a city map.

Entertainment Computing – ICEC 2004: Third International Conference, Eindhoven, The Netherlands, September 1-3, 2004. Proceedings

The advancement of information and communication technologies (ICT) has enabled broad use of ICT and facilitated the use of ICT in the private and personal domain. ICT-related industries are directing their business targets to home applications. Among these applications, entertainment will differentiate ICT applications in the private and personal market from the of...

Theory of Relational Databases

The Theory of Relational Databases. David Maier. Copyright 1983, Computer Science Press, Rockville. Hardcover in excellent [...] markings. No dust jacket. Shelved in Technology. The Bookman serving Colorado Springs since 1990.

Extra resources for High Performance Data Mining: Scaling Algorithms, Applications and Systems

Sample text

We have the following three design decisions:

1. How to partition the MBRs of the leaf nodes such that nearby rectangles are in the same partition, and the size of each partition is almost the same?
2. How to distribute the partitions of rectangles onto the computers?
3. How to replicate the index among n computers?

For the first question, we propose to use space-filling Hilbert curves to achieve good clustering. In a k-dimensional space, a space-filling curve starts with a path on a k-dimensional grid of side 2.
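The partitioning idea in the first design decision can be illustrated with a minimal sketch. It assumes the two-dimensional case only (the text speaks of k dimensions), MBRs given as `(xmin, ymin, xmax, ymax)` tuples inside the unit square, and a grid resolution chosen arbitrarily. The names `hilbert_index` and `partition_mbrs` are hypothetical and are not taken from the book.

```python
def hilbert_index(order, x, y):
    """Position of grid cell (x, y) along a 2D Hilbert curve.

    The grid has 2**order cells per side. This is the standard
    bit-by-bit conversion from cell coordinates to curve position.
    """
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


def partition_mbrs(mbrs, num_partitions, order=4):
    """Split MBRs into nearly equal-sized groups of spatially close rectangles.

    Assumption (not from the source): each MBR is ordered by the Hilbert
    position of its center, and the ordered list is cut into contiguous runs.
    """
    n = 1 << order

    def curve_position(mbr):
        xmin, ymin, xmax, ymax = mbr
        cx = min(n - 1, int((xmin + xmax) / 2 * n))
        cy = min(n - 1, int((ymin + ymax) / 2 * n))
        return hilbert_index(order, cx, cy)

    ordered = sorted(mbrs, key=curve_position)
    base, extra = divmod(len(ordered), num_partitions)
    parts, start = [], 0
    for i in range(num_partitions):
        size = base + (1 if i < extra else 0)
        parts.append(ordered[start:start + size])
        start += size
    return parts
```

Cutting the curve-ordered list into contiguous runs keeps partition sizes within one rectangle of each other, while the locality of the Hilbert curve tends to place nearby rectangles in the same run.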

If the space utilization of the R*-tree is high (near 100%), the number of objects on every data page will be almost the same. 100% space utilization can be achieved by using index packing techniques (cf. Kamel and Faloutsos, 1993).

2. Minimized communication cost: Nearby objects are assigned to the same computer by partitioning data pages using Hilbert curves.
3. Distributed data access: Local and remote data can be efficiently accessed (cf. Lemma 2).

[...], 1996), to distributed spatial index structures onto several computers.

The replicated index provides an efficient access of data, and the interference between computers is also minimized through the local access of the data. The slave-to-slave and master-to-slaves communication is implemented by message passing. The master manages the task of dynamic load balancing and merges the results produced by the slaves. We implemented our method on a number of workstations connected via Ethernet (10 Mbit). A performance evaluation shows that PDBSCAN scales up very well and has excellent speedup and sizeup behavior.
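The master/slave exchange described above can be pictured with a small sketch. It is only an illustration under strong assumptions: threads and in-process queues stand in for workstations and network messages, the per-partition work is a placeholder count rather than clustering, and slave-to-slave communication and dynamic load balancing are left out. The names `slave` and `run_master` are hypothetical.

```python
import queue
import threading


def slave(slave_id, inbox, outbox):
    """Handle task messages until a None sentinel arrives."""
    while True:
        task = inbox.get()
        if task is None:
            break
        # Placeholder for local work: report the size of the partition.
        outbox.put((slave_id, len(task)))


def run_master(partitions):
    """Send one partition to each slave and merge the replies."""
    outbox = queue.Queue()
    inboxes = [queue.Queue() for _ in partitions]
    workers = [
        threading.Thread(target=slave, args=(i, inboxes[i], outbox))
        for i in range(len(partitions))
    ]
    for worker in workers:
        worker.start()
    for inbox, part in zip(inboxes, partitions):
        inbox.put(part)   # task message
        inbox.put(None)   # shutdown message
    for worker in workers:
        worker.join()
    # All slaves have finished, so every reply is already in the outbox.
    merged = {}
    while not outbox.empty():
        slave_id, count = outbox.get()
        merged[slave_id] = count
    return merged
```

For example, `run_master([[1, 2, 3], [4], []])` returns a dictionary mapping each slave id to the number of items it received.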

