Boxplot

Posted: Wed Jan 28, 2004 1:54 pm
by 8439323
I have attempted to use UseCustomValues := True, but the boxplots are never right (the whiskers are drawn incorrectly). Here is the code I used in simple tryouts (Delphi 6 and TeeChart version 6.01 Pro):


I create a new application and stick a chart on Form1 with one boxplot series:

-------------------------------------------------------------------------------
unit Main;

interface

uses
  Windows, Messages, SysUtils, Variants, Classes, Graphics, Controls, Forms,
  Dialogs, TeEngine, Series, TeeBoxPlot, ExtCtrls, TeeProcs, Chart;

type
  TForm1 = class(TForm)
    Chart1: TChart;
    Series1: TBoxSeries;
    procedure FormCreate(Sender: TObject);
  private
    { Private declarations }
  public
    { Public declarations }
  end;

var
  Form1: TForm1;

implementation

{$R *.dfm}

procedure TForm1.FormCreate(Sender: TObject);
begin
  with Series1 do
  begin
    Clear;
    AddArray([10, 12, 13, 15, 17, 23, 26, 35, 50]);
    UseCustomValues := True;

    Median := 17;
    Quartile1 := 13;
    Quartile3 := 26;
    InnerFence1 := -6.5;
    InnerFence3 := 45.5;
    OuterFence1 := -26;
    OuterFence3 := 65;
    AdjacentPoint1 := 10;
    AdjacentPoint3 := 35;
  end;
end;

end.

---------------------------------------------------------------

I would be very grateful if someone would indicate what is wrong here and how to correct it ... and end the frustration this is causing!

Posted: Wed Jan 28, 2004 8:29 pm
by Marjan
Hi.

I think I fixed all reported box plot series problems in the latest v7.0 (beta) sources. All box plot drawing is done in the DrawValues and DrawValue methods of TCustomBoxSeries.

Just to be sure, which results are "incorrect" and which would be "correct" in your case? There are several definitions of a "correct" plot, so let's agree on one.

BTW, I've tried your data with NSSP/Pass and got the same results I get with the default TeeChart drawing algorithm.

Box plot ("correct" figure)

Posted: Thu Jan 29, 2004 3:41 am
by 8439323
For me the "correct" figure would be one corresponding to the "user" values I provided.

Thus the lower whisker (the one in the negative direction) should extend to 10 (AdjacentPoint1), while the whisker in the positive direction should extend to 35 (AdjacentPoint3).

But when I run the program I listed, I find that the latter (the upper whisker) in fact extends DOWNWARDS (cutting across the box) to a value at least equal to the first quartile. (I cannot be sure where that whisker ends up because of possible overlap of its end with the lower end of the box or else with the lower whisker).

Incidentally, the values I provide follow from the values I give for the first and third quartiles. The latter are determined by Tukey et al's method of "fourths". The Inner and Outer fences are determined in the usual way.

(My choice of the method of fourths arises because that is the one my students are taught, and it is logically consistent with the standard method of determining the median: the method of fourths determines Q1 and Q3 as the medians of the two halves of the ordered set of values, in the same way as the median of the whole set is determined.)
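For anyone who wants to check the arithmetic, here is a rough console sketch of the method of fourths and the fences for the sample data. It assumes that the middle value is counted in both halves when the number of values is odd, and that the "usual way" means inner fences at 1.5 times and outer fences at 3 times the spread between the fourths. Both assumptions reproduce the values listed in the first post. The program and function names are made up for illustration and are not part of TeeChart.

```pascal
program FourthsAndFences;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  N = 9;
  // Sample from the first post, already sorted in ascending order.
  Data: array[0..N - 1] of Double = (10, 12, 13, 15, 17, 23, 26, 35, 50);

// Median of the sorted values Data[Lo..Hi] (inclusive, zero-based).
function MedianOf(Lo, Hi: Integer): Double;
var
  Count, Mid: Integer;
begin
  Count := Hi - Lo + 1;
  Mid := Lo + Count div 2;
  if Odd(Count) then
    Result := Data[Mid]
  else
    Result := (Data[Mid - 1] + Data[Mid]) / 2;
end;

var
  HalfEnd: Integer;
  Med, Q1, Q3, Spread: Double;
begin
  Med := MedianOf(0, N - 1);

  // Assumption: for odd N the middle value belongs to both halves.
  HalfEnd := (N - 1) div 2;
  Q1 := MedianOf(0, HalfEnd);
  Q3 := MedianOf(N - 1 - HalfEnd, N - 1);

  // Assumption: fences at 1.5 and 3 times the spread between the fourths.
  Spread := Q3 - Q1;

  WriteLn('Median      = ', Med:0:1);                 // 17.0
  WriteLn('Q1          = ', Q1:0:1);                  // 13.0
  WriteLn('Q3          = ', Q3:0:1);                  // 26.0
  WriteLn('InnerFence1 = ', (Q1 - 1.5 * Spread):0:1); // -6.5
  WriteLn('InnerFence3 = ', (Q3 + 1.5 * Spread):0:1); // 45.5
  WriteLn('OuterFence1 = ', (Q1 - 3.0 * Spread):0:1); // -26.0
  WriteLn('OuterFence3 = ', (Q3 + 3.0 * Spread):0:1); // 65.0
end.
```

Leaving the middle value out of both halves would give Q1 = 12.5 for this sample, so the choice of convention does change the box.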

So my concern here is not with drawing the "correct" box plot for the data (there are at least 3 versions of that in the literature), but with drawing boxplots that agree with the "user" values that are provided for the median, Quartile1 and Quartile3, etc.

Many thanks for your attention to this matter.

Box plot problem solved(?)

Posted: Thu Jan 29, 2004 2:11 pm
by 8439323
I think I have solved my problem. During some trial-and-error exploration I noticed that AdjacentPoint1 and AdjacentPoint3 are of type integer, while the others, Q1, Q3, etc. are of type double.

It suggests that AdjacentPoints are not data values (as I have been assuming) but ordinal positions in the (sorted) series of data values, with the first position's index being 0 and not 1.

Revised values for AdjacentPoint using this interpretation appear to give the correct (and well-formed) box plots for the values for median, etc., that I provide.

I have yet to check this out thoroughly but thought I would let you know right away so that you won't spend more of your time on this (assuming that I am indeed correct in my interpretation of AdjacentPoints).
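To illustrate that interpretation, here is a small console sketch that turns the two whisker values into zero-based positions in the sorted sample. PositionOf is a hypothetical helper written for this post, not a TeeChart routine, and the exact floating-point comparison is only safe here because the sample values are whole numbers.

```pascal
program AdjacentIndex;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  N = 9;
  // Sample from the first post, sorted in ascending order.
  Sorted: array[0..N - 1] of Double = (10, 12, 13, 15, 17, 23, 26, 35, 50);

// Hypothetical helper: zero-based position of Value in Sorted, or -1 if absent.
function PositionOf(Value: Double): Integer;
var
  i: Integer;
begin
  Result := -1;
  for i := 0 to N - 1 do
    if Sorted[i] = Value then
    begin
      Result := i;
      Break;
    end;
end;

begin
  WriteLn('Position of 10: ', PositionOf(10)); // 0
  WriteLn('Position of 35: ', PositionOf(35)); // 7
end.
```

If the interpretation holds, the last two assignments in the original FormCreate would become AdjacentPoint1 := 0 and AdjacentPoint3 := 7 rather than 10 and 35.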

Posted: Thu Jan 29, 2004 4:09 pm
by Marjan
Hi.

"It suggests that AdjacentPoints are not data values (as I have been assuming) but ordinal positions in the (sorted) series of data values"

Precisely. To give you an idea what's going on behind the scenes, here is the internal code I'm using to find adjacent points:

     { find adjacent points }
     for i := 0 to N-1 do
       if FInnerFence1 <= SampleValues.Value[i] then Break;
     FAdjacentPoint1 := i;

     for i := FMed to N-1 do
       if FInnerFence3 <= SampleValues.Value[i] then Break;
     FAdjacentPoint3 := i-1;
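
To see what this search gives for the sample data in this thread, here is a standalone console sketch of the same idea. It is not the TeeChart source: MedIndex stands in for FMed and is assumed to be the zero-based index of the median, the plain array replaces SampleValues, and while loops are used so that the index is well defined even if no value reaches a fence.

```pascal
program AdjacentSearch;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  N = 9;
  // Sample from the first post, sorted in ascending order.
  Values: array[0..N - 1] of Double = (10, 12, 13, 15, 17, 23, 26, 35, 50);
  InnerFence1 = -6.5;
  InnerFence3 = 45.5;
  MedIndex = N div 2; // assumption: stands in for FMed (index of the median)

var
  i, Adjacent1, Adjacent3: Integer;
begin
  // Lower adjacent point: first sample at or above the lower inner fence.
  i := 0;
  while (i < N) and (Values[i] < InnerFence1) do
    Inc(i);
  Adjacent1 := i;

  // Upper adjacent point: searching upward from the median, the sample
  // just before the first one at or above the upper inner fence.
  i := MedIndex;
  while (i < N) and (Values[i] < InnerFence3) do
    Inc(i);
  Adjacent3 := i - 1;

  WriteLn('AdjacentPoint1 index = ', Adjacent1, ' (value ', Values[Adjacent1]:0:0, ')'); // 0 (value 10)
  WriteLn('AdjacentPoint3 index = ', Adjacent3, ' (value ', Values[Adjacent3]:0:0, ')'); // 7 (value 35)
end.
```

For this sample the indices 0 and 7 point at the values 10 and 35, which are the whisker ends requested earlier in the thread.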