Data Management in BayesiaLab

Data Import Wizard

BayesiaLab’s Data Import Wizard allows you to connect to a variety of data sources. The wizard guides you step by step through all the necessary pre-processing tasks in preparation for subsequent machine learning.
For moderately sized datasets, importing data in CSV format is a convenient option.
BayesiaLab can also connect to SQL-compatible database servers, such as MySQL, PostgreSQL, SQL Server, and Amazon Redshift.

Discretization

BayesiaLab processes all data on a discretized basis. As part of the Data Import Wizard, a number of methods are available for discretizing continuous variables. In BayesiaLab, all “parameters” describing probabilistic relationships between variables are contained in conditional probability tables (or cubes/hypercubes when more than two dimensions are involved), which means that no functional form is assumed. Given this nonparametric, discrete approach, BayesiaLab can implicitly handle highly nonlinear relationships between variables.
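To illustrate why a table of probabilities needs no functional form, the following minimal Python sketch estimates a conditional probability table from already-discretized data by simple counting. It is not BayesiaLab code; the function name estimate_cpt and the sample data are hypothetical and serve only as an illustration.

```python
from collections import Counter, defaultdict

def estimate_cpt(parent_states, child_states):
    """Estimate P(child | parent) from paired discrete observations by counting."""
    counts = defaultdict(Counter)
    for p, c in zip(parent_states, child_states):
        counts[p][c] += 1
    cpt = {}
    for p, child_counts in counts.items():
        total = sum(child_counts.values())
        # Each row of the table is a probability distribution over child states.
        cpt[p] = {c: k / total for c, k in child_counts.items()}
    return cpt

# Hypothetical data: a parent discretized into three states and a binary child.
parent = ["low", "low", "mid", "mid", "high", "high", "high", "high"]
child = ["yes", "no", "no", "no", "yes", "yes", "yes", "no"]

cpt = estimate_cpt(parent, child)
for state in ("low", "mid", "high"):
    print(state, cpt[state])
```

In this made-up sample, the probability of "yes" is 0.5 for "low", absent for "mid", and 0.75 for "high". The table stores such a non-monotonic pattern directly, without fitting a curve to it.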

Available Discretization Algorithms:

  • Decision Tree
  • Density Approximation
  • K-Means
  • Normalized Equal Distances
  • Equal Distances
  • Equal Frequencies
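
As a rough illustration of how two of the listed approaches differ, the sketch below implements generic equal-distance (equal-width) and equal-frequency binning in plain Python. These are textbook versions written for this example, not BayesiaLab's implementations; all function names and the sample data are assumptions.

```python
from bisect import bisect_right

def equal_distance_edges(values, n_bins):
    """Interior cut points splitting [min, max] into n_bins equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(1, n_bins)]

def equal_frequency_edges(values, n_bins):
    """Interior cut points chosen so each bin holds roughly the same number of points."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[(i * n) // n_bins] for i in range(1, n_bins)]

def discretize(values, edges):
    """Map each value to the index of the bin it falls into."""
    return [bisect_right(edges, v) for v in values]

# Hypothetical, heavily skewed sample.
data = [1.0, 2.0, 2.5, 3.0, 4.0, 10.0, 50.0, 100.0]

width_bins = discretize(data, equal_distance_edges(data, 4))
freq_bins = discretize(data, equal_frequency_edges(data, 4))

print(width_bins)  # [0, 0, 0, 0, 0, 0, 1, 3]
print(freq_bins)   # [0, 0, 1, 1, 2, 2, 3, 3]
```

On skewed data like this, equal-width binning places most observations in the first bin and leaves one bin empty, whereas equal-frequency binning spreads the observations evenly across bins.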

Data Import Workflow

Missing Values Processing

BayesiaLab offers a range of sophisticated missing values processing methods to choose from. During network learning, BayesiaLab applies missing values processing automatically “behind the scenes.” More specifically, the Structural Expectation-Maximization algorithm and the Dynamic Completion algorithm are automatically applied after each modification of the network during learning, i.e., after every single arc addition, suppression, and inversion.
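
For readers unfamiliar with Expectation-Maximization, the sketch below shows the basic idea on a fixed two-node network X → Y with missing entries marked as None. It is a deliberately simplified parametric EM on a fixed structure, not the Structural EM or Dynamic Completion algorithms named above, and not BayesiaLab code. The function name em_two_node and the records are hypothetical, and the sketch assumes every state of X appears in at least one complete record.

```python
from itertools import product

def em_two_node(records, states=(0, 1), n_iter=50):
    """Estimate P(X) and P(Y | X) from (x, y) records where None marks a missing value."""
    # Start from uniform parameters.
    p_x = {x: 1 / len(states) for x in states}
    p_y_given_x = {x: {y: 1 / len(states) for y in states} for x in states}
    for _ in range(n_iter):
        counts = {x: {y: 0.0 for y in states} for x in states}
        for x_obs, y_obs in records:
            # E-step: spread each record over the completions consistent with
            # what was observed, weighted by the current joint probability.
            xs = states if x_obs is None else (x_obs,)
            ys = states if y_obs is None else (y_obs,)
            weights = {(x, y): p_x[x] * p_y_given_x[x][y]
                       for x, y in product(xs, ys)}
            total = sum(weights.values())
            for (x, y), w in weights.items():
                counts[x][y] += w / total
        # M-step: re-estimate the parameters from the expected counts.
        n = len(records)
        for x in states:
            row_total = sum(counts[x].values())
            p_x[x] = row_total / n
            for y in states:
                p_y_given_x[x][y] = counts[x][y] / row_total
    return p_x, p_y_given_x

# Hypothetical records; None marks a missing value.
records = [(0, 0), (0, 0), (0, 1), (1, 1), (1, 1), (1, 0),
           (None, 1), (0, None), (1, None)]
p_x, p_y_given_x = em_two_node(records)
print(p_x)
print(p_y_given_x)
```

Each pass fills in the missing entries probabilistically using the current parameters and then re-estimates the parameters from the completed counts, so no record has to be discarded because of a missing value.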
