Model your problem using a flexible probabilistic language based on graphical models. Then fit the model to your data using a Bayesian approach that accounts for modelling uncertainty.
AMIDST provides tailored parallel and distributed implementations of Bayesian parameter learning (and probabilistic inference) for batch and streaming data. This processing is based on flexible and scalable message passing algorithms.
Specify your model using probabilistic graphical models with latent variables and temporal dependencies.
Perform inference on your probabilistic models with powerful approximate and scalable algorithms.
Update your models when new data is available. This makes our toolbox appropriate for learning from data streams.
Use your defined models to process massive data sets in a distributed computer cluster using Flink or Spark.
Code your own models or algorithms within AMIDST and extend the toolbox's functionality. This makes it a flexible toolbox for academics running machine learning experiments.
Leverage existing functionality and algorithms by interfacing with established software tools such as Hugin, MOA, Weka and R.
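To make the idea of updating a model as new data arrives more concrete, the following is a minimal sketch of Bayesian updating over a stream of batches. It is plain Java and does not use the AMIDST API: the class StreamingBetaPosterior and its methods are hypothetical names chosen for this illustration, and the Beta-Bernoulli model is only a simple stand-in for the richer models the toolbox supports. The point it shows is that the posterior obtained from one batch serves as the prior for the next, so earlier data never has to be revisited.

```java
// Hypothetical illustration, NOT part of the AMIDST API:
// conjugate Beta-Bernoulli updating over a stream of batches.
final class StreamingBetaPosterior {
    private double alpha; // pseudo-count of positive outcomes
    private double beta;  // pseudo-count of negative outcomes

    StreamingBetaPosterior(double priorAlpha, double priorBeta) {
        if (priorAlpha <= 0 || priorBeta <= 0) {
            throw new IllegalArgumentException("Prior parameters must be positive");
        }
        this.alpha = priorAlpha;
        this.beta = priorBeta;
    }

    // Fold one batch into the posterior. The updated posterior
    // acts as the prior when the next batch arrives.
    void update(boolean[] batch) {
        for (boolean outcome : batch) {
            if (outcome) {
                alpha += 1.0;
            } else {
                beta += 1.0;
            }
        }
    }

    // Posterior mean of the Bernoulli parameter.
    double posteriorMean() {
        return alpha / (alpha + beta);
    }

    // Posterior variance, which shrinks as more data is seen.
    double posteriorVariance() {
        double n = alpha + beta;
        return (alpha * beta) / (n * n * (n + 1.0));
    }

    public static void main(String[] args) {
        // Uniform prior Beta(1, 1); the batches are made-up example data.
        StreamingBetaPosterior posterior = new StreamingBetaPosterior(1.0, 1.0);
        boolean[][] batches = {
            {true, false, false, false},
            {false, false, true, false, false, false}
        };
        for (boolean[] batch : batches) {
            posterior.update(batch);
            System.out.printf("mean=%.4f variance=%.5f%n",
                    posterior.posteriorMean(), posterior.posteriorVariance());
        }
    }
}
```

Because only the two pseudo-counts are stored, memory use stays constant however long the stream runs, and the posterior variance gives a direct measure of the remaining uncertainty after each batch.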
// Load the data stream
String path = "datasets/simulated/";
String filename = path + "BCC_month0.arff";
DataStream<DataInstance> data = DataStreamLoader.open(filename);
//Learn the model
Model model = new GaussianMixture(data.getAttributes());
model.updateModel(data);
BayesianNetwork bn = model.getModel();
// Set up the Flink session
final ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();
// Load the data stream (with Flink)
String path = "datasets/simulated/";
String filename = path + "BCCDist_month0.arff";
DataFlink<DataInstance> data = DataFlinkLoader.loadDataFromFolder(env, filename, false);
//Learn the model
Model model = new GaussianMixture(data.getAttributes());
model.updateModel(data);
BayesianNetwork bn = model.getModel();
/* This feature is still under development */
The AMIDST Toolbox has been used for risk prediction in credit operations. Because the data is collected continuously and reported on a monthly basis, this gives rise to a streaming data classification problem. This work was carried out in collaboration with one of our partners, the Spanish bank BCC.
The AMIDST Toolbox has been used to prototype models for the early recognition of traffic maneuver intentions. As in the previous case, data is collected continuously by on-board car sensors, giving rise to a large and quickly evolving data stream. This work was carried out in collaboration with one of our partners, DAIMLER.
If you have any questions about the toolbox or would like to collaborate on the project, please do not hesitate to contact us. You can reach us at the following email address.
This software was developed as part of the AMIDST project. AMIDST has received funding from the European Union’s Seventh Framework Programme for research, technological development and demonstration under grant agreement no. 619209.