problem and use the statistical software R to solve it.
Let's first look at the sample $\alpha$-quantiles, where $\alpha \in \{0.1, 0.2, \dots, 0.9\}$. Using
the "kuantile" function of R, which calculates the sample quantiles according to the
above rule, we obtained:
>kuantile(x, c(0.1, 0.9))
>percentiles: 10% 90%
>quantiles: 34 52
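
The quantile step can be sketched as follows. This is only an illustration: the vector `x` is simulated (it is not the data set used in the text), and base R's `quantile()` is used as a stand-in for `kuantile`.

```r
# Hypothetical stand-in data: simulated, rounded durations (NOT the thesis sample).
set.seed(1)
x <- round(33 + rgamma(500, shape = 1.2, scale = 7))

# Probabilities 0.1, 0.2, ..., 0.9
alpha <- seq(0.1, 0.9, by = 0.1)

# Sample quantiles; base quantile() is assumed here in place of kuantile().
mu <- quantile(x, probs = alpha, names = FALSE)
print(mu)
```

The resulting vector `mu` holds one sample quantile per probability in `alpha`, in increasing order.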
Figure 2.5 shows the graph of the sample quantiles.

Figure 2.5: Quantiles of GHP Durations of Year 2008 OP2

Now we would like to find the parameters of the Gamma distribution whose
quantiles between 10% and 90% are closest to our sample. To find these, we
minimize the sum of squares of the horizontal distances between the sample
quantiles and the theoretical Gamma quantiles. In this case, we cannot use the
usual maximum likelihood algorithm, because we censored our sample to the
interval between the 0.1-quantile and the 0.9-quantile. Since this part of the
sample does not contain the whole set of observations, the distribution
parameters found by the usual MLE algorithm would give large errors between the
estimated quantiles and the actual sample quantiles. We define the sum of
squares of the distances between the sample quantiles and the theoretical
quantiles as
$$d = \sum_{\alpha} \left( \mu_\alpha - s - F^{-1}(\alpha, k, \lambda) \right)^2$$
where $\mu_\alpha$ is the sample quantile, $\alpha$ is in $(0,1)$ (in our case, we are interested in
$\alpha \in \{0.1, 0.2, 0.3, \dots, 0.9\}$) and $F$ is the theoretical cumulative Gamma distribution
function.
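
As a minimal sketch, the distance d can be written as an R function of a parameter vector holding the shift, the shape and a scale parameter. The names `d_fun`, `mu_exact` and the numeric values are hypothetical and chosen only to make the behaviour easy to check.

```r
# Sum of squared distances between shifted sample quantiles and Gamma quantiles.
# theta = c(shift, shape, scale); mu = sample quantiles at probabilities alpha.
d_fun <- function(theta, mu, alpha) {
  sum((mu - theta[1] - qgamma(alpha, shape = theta[2], scale = theta[3]))^2)
}

alpha <- seq(0.1, 0.9, by = 0.1)

# Hypothetical check: quantiles generated from a known shifted Gamma.
mu_exact <- 30 + qgamma(alpha, shape = 2, scale = 5)

d_true  <- d_fun(c(30, 2, 5), mu_exact, alpha)  # essentially zero at the true parameters
d_shift <- d_fun(c(31, 2, 5), mu_exact, alpha)  # every residual is -1, so about 9
```

At the generating parameters the distance vanishes up to rounding, and moving the shift by one unit adds one squared unit per quantile.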
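We would like to minimize the above quantity over $s$, $k$ and $\lambda$, which are the shift,
shape and scale parameters, respectively. The statement of the problem is:
Given $\mu_\alpha$,
$$\min_{s, k, \lambda} \; d = \sum_{\alpha} \left( \mu_\alpha - s - F^{-1}(\alpha, k, \lambda) \right)^2$$
subject to:
$$s \ge 0, \quad k > 0, \quad \lambda > 0,$$
where $F$ is the cumulative Gamma distribution function.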
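
One way to enforce sign constraints of this kind in R is box-constrained optimization. The sketch below uses `optim` with `method = "L-BFGS-B"`; this is an illustrative choice and not necessarily the routine used in the text. The target quantiles, the starting point and the small positive lower bounds are assumptions made for the example.

```r
alpha <- seq(0.1, 0.9, by = 0.1)

# Objective d; theta = c(shift, shape, scale).
d_fun <- function(theta, mu) {
  sum((mu - theta[1] - qgamma(alpha, shape = theta[2], scale = theta[3]))^2)
}

# Hypothetical target quantiles from a known shifted Gamma (not the thesis data).
mu <- 30 + qgamma(alpha, shape = 2, scale = 5)

start <- c(25, 1.5, 6)        # assumed starting point
lower <- c(0, 1e-6, 1e-6)     # shift >= 0; shape and scale kept strictly positive

fit <- optim(par = start, fn = d_fun, mu = mu,
             method = "L-BFGS-B", lower = lower)

fit$par          # estimated shift, shape, scale
fit$value        # attained value of d
fit$convergence  # 0 indicates successful convergence
```

Because shift, shape and scale can partly compensate for one another over nine quantiles, the attained value of d is a more reliable diagnostic than the individual parameter estimates.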
We used the statistical software R for this simple optimization problem. You can
find the corresponding code below:
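> # kuantile() is assumed here to come from the quantreg package
> library(quantreg)
> # sample quantiles at the probabilities 0.1, 0.2, ..., 0.9
> p <- function(x) kuantile(x, seq(0.1, 0.9, 0.1))
> i <- seq(0.1, 0.9, 0.1)
> # objective d: theta = c(shift, shape, scale)
> fh <- function(theta, x) sum((p(x) - theta[1] - qgamma(i, shape = theta[2],
+   scale = theta[3], log.p = FALSE))^2)
> # starting point of the minimization, computed from the sample
> theta.start <- c(var(x) / mean(x), (mean(x))^2 / var(x), 1)
> out <- function(x) nlm(fh, theta.start, x = x)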
In the above code, fh is the previously defined function d, and qgamma is an R
function that returns the Gamma quantiles for given probabilities, shape and
scale. We used "theta" for the vector of unknown parameters $s$, $k$ and
$\lambda$, and minimized fh over theta using the R minimization function nlm.
theta.start defines the starting point for the minimization algorithm; we
calculated it from the sample as before.
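
The expressions var(x)/mean(x) and mean(x)^2/var(x) in theta.start have the form of moment estimates for a Gamma distribution; reading them this way is an assumption, since the earlier derivation is not repeated here. For a Gamma distribution with shape k and scale lambda, the mean is k * lambda and the variance is k * lambda^2, which the following sketch checks with hypothetical values.

```r
# Known Gamma parameters, used only for this illustration.
k <- 2
lambda <- 5

m <- k * lambda       # theoretical mean
v <- k * lambda^2     # theoretical variance

scale_mom <- v / m    # variance / mean recovers the scale
shape_mom <- m^2 / v  # mean^2 / variance recovers the shape
```

Note that theta.start places these two expressions in the first two slots, which fh reads as shift and shape.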
We used here the data set of departure flights in 2008 Operation Plan Period 2, aircraft
type 734, departure station Amsterdam Schiphol Airport, and arrival station
in Europe.
Here is the R output of the above defined code:
minimum value of the function "out"
$minimum
[1] 1.380391
the estimated Gamma parameter minimizing the defined distance
$estimate
[1] 33.350446 1.192140 6.942234
number of iterations
$iterations
[1] 100
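
The reported $iterations value of 100 equals the default iteration limit (`iterlim = 100`) of nlm, so it can be worth inspecting the termination code of the fit. The sketch below shows how this might be done; the quantiles, the starting point and the guard against non-positive shape or scale are assumptions for the example, not part of the original code.

```r
alpha <- seq(0.1, 0.9, by = 0.1)

# Hypothetical target quantiles from a known shifted Gamma (not the thesis data).
mu <- 30 + qgamma(alpha, shape = 2, scale = 5)

# Objective d; theta = c(shift, shape, scale).
fh <- function(theta) {
  # nlm is unconstrained, so penalize non-positive shape or scale (assumed guard).
  if (theta[2] <= 0 || theta[3] <= 0) return(1e10)
  sum((mu - theta[1] - qgamma(alpha, shape = theta[2], scale = theta[3]))^2)
}

# Allow more iterations than the default of 100.
res <- nlm(fh, c(25, 1.5, 6), iterlim = 1000)

res$code        # termination code; 4 means the iteration limit was exceeded
res$iterations  # number of iterations performed
res$minimum     # attained value of the objective
```

A termination code of 1 or 2 suggests that the returned estimate is probably a local minimum, whereas code 4 indicates that the run stopped at the iteration limit.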
Figure 2.6 shows the sample quantiles and the Gamma distribution curve which
is found by minimizing the sum of squares of the distance between quantiles.
Figure 2.6: Gamma Distribution Fit by Optimizing the Distance between Theoretical
and Sample Quantiles
We also tested whether the distribution is Gamma for the set of observations
censored at the 0.10- and 0.90-quantiles. First, let's look at the
quantile-quantile plot of the sample quantiles and the estimated theoretical
Gamma quantiles shown in Figure 2.7.

Figure 2.7: Quantile-Quantile Plot of the Shifted Sample Quantiles and Estimated
Gamma Quantiles

As one observes from the graph, the sample data is rounded and grouped;
therefore, the sample contains few unique points. However, it is still difficult
to conclude whether the sample distribution is Gamma when the outliers are
censored. Therefore, we applied the $\chi^2$ goodness-of-fit test with
H0: The censored part of the sample comes from a Gamma distribution with the
estimated parameters.
H1: The distribution of the censored part of the sample is not Gamma with the
estimated parameters.
Applying the given $\chi^2$ formula
$$\chi^2 = \sum_{i=1}^{n} \frac{(O_i - E_i)^2}{E_i} = 226.4756$$
The corresponding p-value with 7 degrees of freedom is very close to 0. Therefore,
we have to reject the null hypothesis that the censored part of the data between
the 0.1- and 0.9-quantiles is distributed according to a Gamma distribution.
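
The p-value for the reported statistic can be reproduced with the chi-square distribution functions in base R. The sketch uses only the two numbers stated in the text; the 5% significance level for the critical value is an assumption added for comparison.

```r
chi_sq <- 226.4756  # test statistic reported in the text
df <- 7             # degrees of freedom reported in the text

# Upper-tail probability of the chi-square distribution.
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)
p_value

# Critical value at an assumed 5% significance level, about 14.07.
crit_5 <- qchisq(0.95, df = df)
crit_5
```

The statistic exceeds the critical value by a wide margin, which is consistent with rejecting the null hypothesis.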
2.4 DISTRIBUTION ANALYSIS: WEIBULL DISTRIBUTION