With a lot of help from all of you I succeeded in getting my code to run
for the training and testing classes. Thanks a lot to you all. But now I
am facing another problem: I want to see the prediction list, as in WEKA
when the "output predictions" option is turned on. In which class will I
find the code for this? Secondly, I want to see the rows in full when I
apply LVQ to them. Is there a method with which I can see both the
training and testing rows in the output, together with their
predictions? Thanks a lot.
--
Regards,
Sufiyan Warraich
NUST Institute of Information Technology
Rawalpindi
Pakistan
0321-5186069

Hello, I read the source code of the smoothing procedure in the model
tree, but I still don't understand how it works.
The smoothing constant in the source code is 15. Why this value?
Starting from the generated linear equation, how do I obtain the
smoothed model?
I applied the code below to an available example, but the results were
not correct. Could you give an example?
SMOOTHING_CONSTANT = 15.0;

protected static double smoothingOriginal(double n, double pred,
                                          double supportPred)
    throws Exception {
  double smoothed = ((n * pred) + (SMOOTHING_CONSTANT * supportPred))
      / (n + SMOOTHING_CONSTANT);
  return smoothed;
}

and also

coefficients[i] += ((SMOOTHING_CONSTANT * coeffsUsedByLinearModel[i]) /
    (n + SMOOTHING_CONSTANT));
.
.
.
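As a numeric sanity check of the formula implemented by smoothingOriginal, here is a small self-contained sketch; the values for n, pred and supportPred below are made up for illustration, not taken from the thread:

```java
/**
 * Numeric check of the M5 smoothing formula p' = (n*p + k*q) / (n + k),
 * with k = 15 as in the WEKA source. The example values are invented.
 */
public class SmoothingCheck {
    static final double SMOOTHING_CONSTANT = 15.0;

    static double smooth(double n, double pred, double supportPred) {
        return ((n * pred) + (SMOOTHING_CONSTANT * supportPred))
                / (n + SMOOTHING_CONSTANT);
    }

    public static void main(String[] args) {
        // n = 5 instances at the leaf, leaf prediction 2.0,
        // parent (support) prediction 1.0:
        // (5*2.0 + 15*1.0) / (5 + 15) = 25.0 / 20.0 = 1.25
        System.out.println(smooth(5.0, 2.0, 1.0)); // prints 1.25
    }
}
```

Note how, with only 5 instances at the leaf, the smoothed prediction is pulled strongly toward the parent model's prediction because k = 15 outweighs n.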

I'm not quite sure I understand your question. Are you asking why there
is no linear model for the internal node in your example? The reason is
that the model for the internal node has been combined with the
original model for a particular leaf node to form a new model for that
leaf node (based on the smoothing formula). You can do this because the
smoothing formula is a linear combination of two linear models, and a
linear combination of two linear functions yields a new linear
function.
To figure out how exactly a linear model for a particular leaf was
derived, you probably have to change the source code for M5, so that it
prints the original linear model for each node. Then you could verify
that the smoothed model at a leaf is a combination of the linear models
that occur along the path from the root to that leaf.
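The point that a linear combination of two linear models is again a linear model can be checked with a small self-contained sketch; the coefficients and instance count below are made up, not the ones from the thread:

```java
/**
 * Sketch of why smoothing two linear models yields another linear
 * model: applying p' = (n*p + k*q) / (n + k) to the coefficients
 * gives the same result as applying it to the predictions pointwise.
 * All numbers here are illustrative.
 */
public class SmoothedLinearModel {
    static final double K = 15.0; // smoothing constant from the WEKA source

    // Combine leaf model (aLeaf*x + bLeaf) with parent model
    // (aPar*x + bPar) for a leaf reached by n instances.
    static double[] combine(double n, double aLeaf, double bLeaf,
                            double aPar, double bPar) {
        double a = (n * aLeaf + K * aPar) / (n + K);
        double b = (n * bLeaf + K * bPar) / (n + K);
        return new double[]{a, b};
    }

    public static void main(String[] args) {
        double[] m = combine(5.0, 7.5, -3.0, 9.0, -4.0);
        System.out.println(m[0] + " * x + " + m[1]);
        // Smoothing the two models' predictions at x = 0.5 gives the
        // same value as evaluating the combined model there.
        double x = 0.5;
        double pLeaf = 7.5 * x - 3.0, pPar = 9.0 * x - 4.0;
        double smoothedPred = (5.0 * pLeaf + K * pPar) / (5.0 + K);
        System.out.println(Math.abs(smoothedPred - (m[0] * x + m[1])) < 1e-12); // prints true
    }
}
```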
Cheers,
Eibe
On Jul 21, 2005, at 1:47 PM, Andreia Vieira wrote:
> Eibe, a great amount of material exists on data mining, but people
> rarely give practical examples, which would help our understanding a
> lot. I am studying the model tree, but I am stuck on the smoothing
> step: I understand the purpose of that task, but not how the result
> is arrived at. I have already read several good references, including
> one of which you are an author ("Using Model Trees for
> Classification"), as well as the book by Ian Witten et al., but I
> still don't understand how the smoothing of the tree is done. I
> already posted to the list but got no reply. Please help me
> understand this procedure in practice.
> I find two split points (0.56 and 0.46). After pruning, the split was
> at 0.45, with 3 instances below and 5 above. However, I cannot find
> the smoothed equation for these observations. WEKA gives me the
> following result. I don't want you to solve the problem for me; I
> want to understand which instances were used to build the final model
> according to the equation described in several sources:
> p' = (np + kq) / (n + k)
>
>
> M5 pruned model tree:
> (using smoothed linear models)
>
> y2 <= 0.46 : LM1 (3/8.995%)
> y2 > 0.46 : LM2 (5/3.989%)
>
> LM num: 1 P = 7.5434 * y2 - 2.9596
> LM num: 2 P = 9.2628 * y2 - 3.7201
>
>
>  Y        X1       X2
> 0.1833   0.0566   0.3499
> 0.2907   0.0670   0.4576
> 0.2949   0.0789   0.4613
> 0.6854   0.0878   0.8093
> 1.0000   0.1241   1.0848
> 1.2410   0.1295   1.2984
> 1.8264   0.1566   1.8145
> 1.9803   0.1714   1.9530
>
>
> Thank you!

Hi Willis,
I agree with Paul's earlier post: the best resource for publications on
time series data mining is Eamonn's web site.
http://www.cs.ucr.edu/~eamonn/
http://www.cs.ucr.edu/~eamonn/selected_publications.htm
He has several variations of the DTW (Dynamic Time Warping) algorithm
in the publications on his web site. DTW has mainly been used by the
signal processing community (e.g. speech processing), but it is now
spreading to other areas such as data mining and bioinformatics. You
can do a Google search on DTW for more information; I have also come
across a fast version of DTW (called FastDTW by the authors of the
paper). There is a DTW applet with source code available on the
internet that matches the similarity of audio signals (music); just
Google for it. There are many methods for time-series data mining
(wavelets, ICA, recursive digital filters such as ARMA, ARIMA, AR, LPC,
FIR, IIR, and so on), and DTW is just one of them. I have a Java
implementation of DTW based on chapter 8 of the book "Pattern
Recognition" by Theodoridis & Koutroumbas. The code is pasted below and
you can modify it to suit what you are trying to do with WEKA.
Cheers,
Sione.
---------------------------------- Java DTW
----------------------------------------
import weka.classifiers.functions.pace.Matrix;

import java.util.*;

/**
 * Reference :
 * ----------
 * Chapter 8 of "Pattern Recognition" (2nd Ed.) by
 * S. Theodoridis and K. Koutroumbas, published by Academic Press.
 */
public class DynamicTimeWarping {

  private Matrix targetTimeSeries;
  private ArrayList otherInstancesTimeSeries;
  private Hashtable timeSeriesRanking;

  /**
   * @param target Matrix holding the target time series (one row)
   */
  public DynamicTimeWarping(Matrix target) {
    this.targetTimeSeries = target;
    otherInstancesTimeSeries = new ArrayList();
    timeSeriesRanking = new Hashtable();
  }

  /**
   * @param otherInstances Collection of Matrix time series to compare against
   */
  public void addOtherInstancesTimeSeries(Collection otherInstances) {
    otherInstancesTimeSeries.addAll(otherInstances);
  }

  /**
   * @param instanceTimeSeries Matrix holding one time series to compare against
   */
  public void addOneInstancesTimeSeries(Matrix instanceTimeSeries) {
    otherInstancesTimeSeries.add(instanceTimeSeries);
  }

  /**
   * @param newTargetTimeSeries Matrix holding the new target time series
   */
  public void setTargetTimeSeries(Matrix newTargetTimeSeries) {
    this.targetTimeSeries = newTargetTimeSeries;
  }

  /**
   * Removes all stored comparison time series.
   */
  public void removeAllOtherInstancesTimeSeries() {
    otherInstancesTimeSeries.clear();
  }

  /**
   * Computes the DTW distance between the target and every stored series.
   *
   * @throws Exception if no comparison series have been added
   */
  public void computeDTW() throws Exception {
    if (otherInstancesTimeSeries.isEmpty()) {
      throw new Exception("computeDTW : Empty collections of time-series.");
    }
    timeSeriesRanking.clear();
    int count = 0;
    Iterator iter = otherInstancesTimeSeries.iterator();
    while (iter.hasNext()) {
      count++;
      Matrix instanceTimeSeries = (Matrix) iter.next();
      dtwMain(instanceTimeSeries, count);
    }
  } //end method

  /**
   * @param otherInstanceTimeSeries Matrix to compare against the target
   * @param count int index used as the key in the ranking table
   */
  private void dtwMain(Matrix otherInstanceTimeSeries, int count) {
    int MM = targetTimeSeries.getColumnDimension();
    int NN = otherInstanceTimeSeries.getColumnDimension();
    Matrix d = new Matrix(MM, NN); // local (pointwise) squared distances
    Matrix D = new Matrix(MM, NN); // accumulated distances
    double val = 0.0;
    for (int m = 0; m < MM; m++) {
      for (int n = 0; n < NN; n++) {
        val = targetTimeSeries.get(0, m) - otherInstanceTimeSeries.get(0, n);
        d.set(m, n, val * val);
      }
    }
    // Initialise the first column and row of the accumulated-distance matrix.
    D.set(0, 0, d.get(0, 0));
    for (int m = 1; m < MM; m++) {
      val = d.get(m, 0) + D.get(m - 1, 0);
      D.set(m, 0, val);
    }
    for (int n = 1; n < NN; n++) {
      val = d.get(0, n) + D.get(0, n - 1);
      D.set(0, n, val);
    }
    // Dynamic-programming recursion over the three allowed predecessors.
    for (int m = 1; m < MM; m++) {
      for (int n = 1; n < NN; n++) {
        val = Math.min(Math.min(D.get(m - 1, n), D.get(m - 1, n - 1)),
                       D.get(m, n - 1));
        val += d.get(m, n);
        D.set(m, n, val);
      }
    }
    double Dist = D.get(MM - 1, NN - 1);
    // Trace back the warping path to find its length k for normalisation.
    int m = MM - 1;
    int n = NN - 1;
    double k = 1.0;
    int ind = 1;
    while ((m + n) != 0) {
      if (m == 0) {
        n = n - 1;
      } else if (n == 0) {
        m = m - 1;
      } else {
        // An alternative here is to change into 5-point warping
        // instead of 3 points.
        double[] arr = {D.get(m - 1, n), D.get(m, n - 1), D.get(m - 1, n - 1)};
        ind = indexOfArrayMinimum(arr);
        if (ind == 0) {
          m = m - 1;
        } else if (ind == 1) {
          n = n - 1;
        } else if (ind == 2) {
          m = m - 1;
          n = n - 1;
        }
      }
      k = k + 1.0;
    } //end while
    timeSeriesRanking.put("Time Series - " + count, new Double(Dist / k));
  } //end method

  /**
   * @return Hashtable mapping series names to normalised DTW distances
   */
  public Hashtable getTimeSeriesRanking() {
    return timeSeriesRanking;
  }

  /**
   * @param arr double[]
   * @return int index of the minimum value
   */
  private int indexOfArrayMinimum(double[] arr) {
    int len = arr.length;
    int ind = 0;
    double val = arr[0];
    for (int i = 1; i < len; i++) {
      if (arr[i] < val) {
        val = arr[i];
        ind = i;
      }
    }
    return ind;
  }

  public static void main(String[] args) {
    // NOTE : time-series can have different lengths
    Matrix t = new Matrix(new double[][]{{19, 5, 12, 10, 18, 15, 9}});
    Matrix r1 = new Matrix(new double[][]{{0, 16, 9, 12, 16, 18, 15}});
    Matrix r2 = new Matrix(new double[][]{{4, 8, 19, 18, 8, 18, 1}});
    Matrix r3 = new Matrix(new double[][]{{7, 16, 0, 3, 4, 4, 12}});
    DynamicTimeWarping dtw = new DynamicTimeWarping(t);
    ArrayList arrayList = new ArrayList();
    arrayList.add(r1);
    arrayList.add(r2);
    arrayList.add(r3);
    dtw.addOtherInstancesTimeSeries(arrayList);
    try {
      dtw.computeDTW();
    } catch (Exception ex) {
      ex.printStackTrace();
    }
    Hashtable map = dtw.getTimeSeriesRanking();
    Double dbl1 = (Double) map.get("Time Series - 1");
    System.out.println("s1 = " + dbl1);
    Double dbl2 = (Double) map.get("Time Series - 2");
    System.out.println("s2 = " + dbl2);
    Double dbl3 = (Double) map.get("Time Series - 3");
    System.out.println("s3 = " + dbl3);
  }
} //----------------------------- End Definition
--------------------------------

Hi,
I am very new to the data mining field.
I have a clustering problem and was wondering how I can do it in Weka,
or whether anyone can suggest which method I should use.
My data format is:
Object  AttributeA                    AttributeB
        Year0  Year1  Year2  Year3    Year0  Year1  Year2 ... YearN
1       1000   950    880    450      20     25
2       950    N/A                    470
3
.
.
.
n
I understand that it is time series data, but I don't know how to
proceed with clustering it.
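One common way to present such data to WEKA's clusterers is to flatten each object into one row of an ARFF file, with one numeric attribute per (attribute, year) pair and "?" for missing values. A hypothetical sketch with made-up names, assuming AttributeA and AttributeB each have a few years:

```
@relation objects-over-time

@attribute attrA_year0 numeric
@attribute attrA_year1 numeric
@attribute attrA_year2 numeric
@attribute attrA_year3 numeric
@attribute attrB_year0 numeric
@attribute attrB_year1 numeric

@data
1000,950,880,450,20,25
950,?,?,?,470,?
```

Note that a standard clusterer then treats each year as an independent attribute and ignores the temporal ordering, so a sequence-aware distance such as DTW (discussed earlier in this thread) may be more appropriate for truly sequential data.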
If somebody could suggest how I can do this using Weka, or point me to
other literature, it would help me a lot.
thanks for your time
Wills

I have a large number of text files organized into folders
corresponding to their target classes, i.e. if I have 1000 files and 10
target classes, each text file is stored in the folder corresponding to
its target class.
I was wondering how I could use WEKA for text classification, i.e.
given a new text file, I want to classify it into one of the 10 target
classes.
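The first step of that setup can be sketched in plain Java (this is illustrative code, not WEKA's API): walk the root folder whose subfolders are class names, and collect each file together with its class label. The resulting pairs could then be written out as an ARFF file with a string attribute for the text and a nominal class attribute.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch: collect (class label, file name) pairs from a
 * directory tree where each subfolder name is a target class.
 */
public class FolderLabels {
    // Returns "className<TAB>fileName" for every file under root.
    static List<String> collect(File root) {
        List<String> pairs = new ArrayList<String>();
        File[] classDirs = root.listFiles();
        if (classDirs == null) return pairs; // missing or unreadable root
        for (File dir : classDirs) {
            if (!dir.isDirectory()) continue;
            File[] files = dir.listFiles();
            if (files == null) continue;
            for (File f : files) {
                pairs.add(dir.getName() + "\t" + f.getName());
            }
        }
        return pairs;
    }

    public static void main(String[] args) throws IOException {
        // Build a tiny demo structure: root/spam/a.txt, root/ham/b.txt
        File root = Files.createTempDirectory("textcls").toFile();
        new File(root, "spam").mkdir();
        new File(root, "ham").mkdir();
        new File(new File(root, "spam"), "a.txt").createNewFile();
        new File(new File(root, "ham"), "b.txt").createNewFile();
        System.out.println(collect(root).size()); // prints 2
    }
}
```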
I couldn't find this in the weka documentation.
Could someone please help me.
Shobhit

Hi,
I need to classify a 10-class dataset using the voted perceptron, which
is a 2-class classification algorithm. So, in order to do this, I have
to repeat the process 10 times, once for each class. I am talking about
the optdigits dataset, which has 10 classes.
For example, first I need to classify digit 0 vs. not-0, then digit 1
vs. not-1, and finally consolidate all the results.
Is there any way I can do this in WEKA?
Thanks a lot,
Sophie
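The scheme described above is one-vs-rest decomposition; if I remember correctly, WEKA's MultiClassClassifier meta-classifier performs it automatically around a binary base learner such as VotedPerceptron. The core idea can be sketched in plain Java (illustrative code, not WEKA's API; the per-class scores stand in for a trained binary model's confidence):

```java
import java.util.Arrays;

/**
 * One-vs-rest sketch: relabel the data once per class for binary
 * training, then predict the class whose binary model scores highest.
 */
public class OneVsRest {
    // Relabel a 10-class label array into binary labels for class c:
    // 1 = "is c", 0 = "is not c".
    static int[] binarize(int[] labels, int c) {
        int[] out = new int[labels.length];
        for (int i = 0; i < labels.length; i++) {
            out[i] = (labels[i] == c) ? 1 : 0;
        }
        return out;
    }

    // Consolidate: given one score per class for a test instance,
    // predict the class with the maximum score (argmax).
    static int predict(double[] scoresPerClass) {
        int best = 0;
        for (int c = 1; c < scoresPerClass.length; c++) {
            if (scoresPerClass[c] > scoresPerClass[best]) best = c;
        }
        return best;
    }

    public static void main(String[] args) {
        int[] labels = {0, 3, 3, 7, 0};
        System.out.println(Arrays.toString(binarize(labels, 3))); // [0, 1, 1, 0, 0]
        System.out.println(predict(new double[]{0.1, 0.9, 0.3})); // 1
    }
}
```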

Hi all,
I am looking for papers (or links) about classification based on the
voted perceptron. Is anybody willing to share relevant literature?
Any help will be highly appreciated and a great help towards my thesis.
-Sophie