CAREN - Class project Association Rule ENgine

Caren is an association rule engine whose algorithm is based on depth-first expansion and a bitmap representation of the database. It was developed for the purpose of deriving classification models. Caren generates association rules for attribute/value and basket datasets. It implements three methods for dealing with numeric attributes: binary, class intervals and Srikant discretization. It also implements different metrics for rule filtering and several formats for rule output. The package includes a classifier (predict) that uses prediction models built from rules generated by the Caren engine. Caren derives and handles classification, numeric prediction and recommendation models. A module for pre-processing numeric attributes (convert) is also included.
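To illustrate the general idea of depth-first expansion over a bitmap database, here is a minimal Python sketch. It is not Caren's actual code: the function names (build_bitmaps, frequent_itemsets), the use of Python integers as bitmaps, and the absolute minimum-support threshold are all assumptions made for this example.

```python
from typing import Dict, List, Tuple


def build_bitmaps(transactions: List[List[str]]) -> Dict[str, int]:
    """Map each item to an integer whose bit i is set when transaction i contains it."""
    bitmaps: Dict[str, int] = {}
    for tid, items in enumerate(transactions):
        for item in set(items):
            bitmaps[item] = bitmaps.get(item, 0) | (1 << tid)
    return bitmaps


def frequent_itemsets(
    transactions: List[List[str]], min_support: int
) -> List[Tuple[Tuple[str, ...], int]]:
    """Enumerate itemsets with support >= min_support by depth-first expansion."""
    bitmaps = build_bitmaps(transactions)
    # Keep only frequent single items, in a fixed order to avoid duplicates.
    items = sorted(i for i, b in bitmaps.items() if bin(b).count("1") >= min_support)
    results: List[Tuple[Tuple[str, ...], int]] = []

    def expand(prefix: Tuple[str, ...], prefix_bits: int, start: int) -> None:
        for idx in range(start, len(items)):
            # Intersecting bitmaps yields the transactions covering the longer itemset.
            bits = prefix_bits & bitmaps[items[idx]]
            support = bin(bits).count("1")
            if support >= min_support:
                itemset = prefix + (items[idx],)
                results.append((itemset, support))
                # Go deeper before moving on to the next sibling (depth-first).
                expand(itemset, bits, idx + 1)

    all_bits = (1 << len(transactions)) - 1  # empty itemset covers every transaction
    expand((), all_bits, 0)
    return results
```

The support of an extended itemset is obtained with a single bitwise AND followed by a bit count, which is the main appeal of the bitmap representation.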
 

Developed under the FCT - POSI/2001 project CLASS - (Classificação e Associação usando Regras de Associação).

 


Download:

An improved depth-first expansion implementation of Caren is available, with more features than earlier versions. It includes a new schema for representing rules and prediction models, and the overall association rules engine was improved. New features include: ensemble methods such as Post-bagging and Iterative Reordering, distribution rules, numeric prediction models using these rules (regression rules), rule selection using the Fisher Exact Test, a new and optimized implementation of the improvement filter, and several other performance speedups. It also implements Webb's Bonferroni-like layered critical values adjustment to cope with multiple hypothesis testing. For each association rule, Caren derives 13 distinct interest measures (confidence, lift, conviction, chi-squared statistics, Laplace, leverage, Jaccard, cosine, phi, mutual information, weighted relative accuracy, entropy and Gini). Caren now also includes a proposal to derive rules describing contrast sets, which detect the differences between contrasting groups.
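As an illustration of a few of the interest measures named above, the following Python sketch computes them from raw counts for a rule A -> C. It uses common textbook definitions, which are an assumption here: the exact formulas, names and edge-case handling in Caren may differ. The function name rule_measures and its parameters are hypothetical.

```python
import math
from typing import Dict


def rule_measures(n: int, n_a: int, n_c: int, n_ac: int) -> Dict[str, float]:
    """Interest measures for a rule A -> C, using textbook definitions.

    n    : total number of transactions
    n_a  : transactions containing the antecedent A
    n_c  : transactions containing the consequent C
    n_ac : transactions containing both A and C
    """
    p_a, p_c, p_ac = n_a / n, n_c / n, n_ac / n
    confidence = n_ac / n_a
    # Conviction is unbounded when the rule never fails.
    conviction = math.inf if confidence == 1.0 else (1.0 - p_c) / (1.0 - confidence)
    return {
        "support": p_ac,
        "confidence": confidence,
        "lift": confidence / p_c,
        "leverage": p_ac - p_a * p_c,
        "conviction": conviction,
        "jaccard": n_ac / (n_a + n_c - n_ac),
        "cosine": n_ac / math.sqrt(n_a * n_c),
    }
```

For example, with 100 transactions of which 40 contain A, 50 contain C and 30 contain both, the rule has confidence 0.75 and lift 1.5, meaning C is 1.5 times more likely when A is present than in the dataset as a whole.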

The package now includes two versions of a C-shell script for performing N-fold cross-validation. Folds are derived using WEKA stratified fold generation methods.
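The idea behind stratified fold generation can be sketched in a few lines of Python. This is only an illustration under simple assumptions (round-robin assignment within each class, no shuffling); it is neither WEKA's procedure nor the C-shell scripts shipped with the package, and the function name stratified_folds is hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def stratified_folds(labels: Sequence[Hashable], n_folds: int) -> List[List[int]]:
    """Split example indices into n_folds folds with similar class proportions."""
    by_class: Dict[Hashable, List[int]] = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    folds: List[List[int]] = [[] for _ in range(n_folds)]
    position = 0
    # Deal the examples of each class out to the folds in turn, so every
    # fold receives roughly the same share of each class.
    for label in sorted(by_class, key=str):
        for idx in by_class[label]:
            folds[position % n_folds].append(idx)
            position += 1
    return folds
```

In each cross-validation round, one fold serves as the test set and the remaining folds together form the training set.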

Read the "README-caren_command" files for details.

The new Caren 2.6 version is now released. It includes a truly rule-based algorithm, new pruning filters, a contrast sets algorithm, an implementation of the CMAR algorithm, and an implementation of Jittering ensembles, among several other features.


Contrast Sets datasets stucco and rcs


                    The new CAREN2.6 system (beta version updated 14/4/2010).


                    The CAREN2.5.2 system (updated 19/11/2009).



Old versions of caren

                    CARENCLASS2.5 system (21/02/2008).

                    CARENCLASS2.4.2 system (4/09/2007).

                    CARENCLASS2.4 system.

                    CARENCLASS2.3 system.


Reports:

Technical report describing our Apriori implementation (old version).

Technical report describing the Predict module and the new version of the Caren implementation.