Los Andes University Banner

Visual Analytics Architecture for large table-based datasets

Sampling, summarization, and exploration of big datasets to provide the user with a representative and understandable sample of the data.

Visual Analytics provides the user with tools to process data in a very intuitive way. One of the challenges Visual Analytics faces today is the need to represent large amounts of information in a way that the user can explore. Such large amounts of data cannot be managed by conventional machines and must be partitioned or represented in reduced form. This thesis project presents a representative sampling technique for large table-based datasets. The dataset is sampled every k steps in order to obtain a representative collection of data from the original dataset. Experiments are conducted to determine the best sampling method for datasets of different sizes and for different step sizes between samples.
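
To make the idea of sampling every k steps concrete, here is a minimal sketch of one common reading of it, usually called systematic sampling. It is not taken from the thesis: the function name `systematic_sample`, its parameters, and the choice of a random starting offset are assumptions made for illustration, and the sampling methods actually compared in the experiments may differ.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def systematic_sample(
    rows: Sequence[T],
    k: int,
    start: Optional[int] = None,
    seed: Optional[int] = None,
) -> List[T]:
    """Return every k-th row of `rows`, beginning at offset `start`.

    Hypothetical helper, not the thesis implementation.
    If `start` is omitted, an offset in [0, k) is drawn at random
    (reproducibly when `seed` is given).
    """
    if k < 1:
        raise ValueError("k must be a positive integer")
    if start is None:
        start = random.Random(seed).randrange(k)
    if not 0 <= start < k:
        raise ValueError("start must satisfy 0 <= start < k")
    # Extended slicing picks rows start, start + k, start + 2k, ...
    return list(rows[start::k])


# A toy table of 10 rows, sampled with step k = 4 starting at row 1.
table = [{"id": i, "value": i * i} for i in range(10)]
sample = systematic_sample(table, k=4, start=1)
print([row["id"] for row in sample])  # [1, 5, 9]
```

In this sketch a larger k yields a smaller sample (roughly len(rows) / k rows), which is the trade-off between sample size and step that the experiments above explore.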

GitHub repository


John Alexis Guerra Gómez
Assistant Professor