Data Analysis: Scatter Plots
2014-11-26


1: What is a scatter plot

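The first step in any data analysis is to plot the data. A plot that places paired observations of two variables against each other is called a scatter plot. It shows how strong the relationship between the two variables is, which direction it trends in, and whether there are outliers.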

 

2: What is a scatter plot for

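- Observe the relationship between variables, and spot problems in the data, unusual values, or points of interest

- See how the data points are dispersed

- Check visually for outliers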

 

3: Example

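Consider a study of lung capacity and breath-holding time: how long can a person hold their breath? A researcher selects a group of subjects and measures each person's lung capacity as the first variable and breath-holding time as the second. The researcher then describes the data with a scatter plot, with lung capacity on the horizontal axis and breath-holding time on the vertical axis.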
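
To make the mapping from measurements to points concrete, here is a minimal, dependency-free sketch that prints a rough text scatter plot. The data values are the twelve measurements that appear in the article's source code; the class name AsciiScatterSketch, the render helper, and the 10 x 40 grid size are illustrative assumptions and not part of any charting library.

```java
import java.util.Arrays;

class AsciiScatterSketch {

    /** Maps each (x[i], y[i]) pair to a cell of a rows x cols grid and marks it with '*'. */
    static char[][] render(double[] x, double[] y, int rows, int cols) {
        double xMin = Arrays.stream(x).min().getAsDouble();
        double xMax = Arrays.stream(x).max().getAsDouble();
        double yMin = Arrays.stream(y).min().getAsDouble();
        double yMax = Arrays.stream(y).max().getAsDouble();

        char[][] grid = new char[rows][cols];
        for (char[] row : grid) {
            Arrays.fill(row, ' ');
        }
        for (int i = 0; i < x.length; i++) {
            // Scale each value into [0, 1], then onto the grid.
            // Assumption: the values on each axis are not all identical.
            int col = (int) Math.round((x[i] - xMin) / (xMax - xMin) * (cols - 1));
            int row = (int) Math.round((y[i] - yMin) / (yMax - yMin) * (rows - 1));
            grid[rows - 1 - row][col] = '*'; // grid row 0 is the top of the printed plot
        }
        return grid;
    }

    public static void main(String[] args) {
        // x axis: lung capacity (ml); y axis: time holding breath (s)
        double[] lungCapacityMl = {400, 397, 360, 402, 413, 427, 389, 388, 405, 422, 411, 433};
        double[] breathHoldSeconds = {21.7, 20.7, 17.7, 21.9, 23.7, 25.7, 20.4, 20.1, 22.9, 24.8, 22.5, 25.9};

        for (char[] row : render(lungCapacityMl, breathHoldSeconds, 10, 40)) {
            System.out.println("|" + new String(row));
        }
        char[] axis = new char[40];
        Arrays.fill(axis, '-');
        System.out.println("+" + new String(axis) + "> lung capacity (ml)");
    }
}
```

Because larger lung capacities go with longer breath-holding times in this data, the marks run from the lower left to the upper right of the grid. Points that fall into the same cell share one mark, so the text plot can show fewer than twelve marks.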

 

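4: Code implementation

JFreeChart, an open-source Java charting library, already implements scatter plots, so we only need to supply the data.

A demo based on the example above looks like this: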

 

 

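[Figure: scatter plot of lung capacity (ml) on the horizontal axis against breath-holding time (s) on the vertical axis]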

  

 

5: Correlation coefficient - R/r

r calculation    Strength - Direction    Relationship Between X and Y Axis
r = +1.0         Strong - Positive       As X goes up, Y always goes up
r = +0.5         Weak - Positive         As X goes up, Y usually goes up
r = 0            No Correlation          X and Y are not correlated
r = -0.5         Weak - Negative         As X goes up, Y usually goes down
r = -1.0         Strong - Negative       As X goes up, Y always goes down
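
The table does not state how r is calculated. Assuming it is Pearson's product-moment correlation coefficient (a reading consistent with the xySum, xPowSum and yPowSum variables in the article's source code), one common form for n paired observations (x_i, y_i) with means \bar{x} and \bar{y} is:

```latex
r = \frac{\sum_{i=1}^{n} x_i y_i - n\,\bar{x}\,\bar{y}}
         {\sqrt{\left(\sum_{i=1}^{n} x_i^2 - n\,\bar{x}^2\right)
                \left(\sum_{i=1}^{n} y_i^2 - n\,\bar{y}^2\right)}}
```

The numerator is positive when X and Y tend to move together and negative when they move in opposite directions. The denominator rescales the result so that r always lies between -1 and +1.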

 

In this example r is 0.9814324978439516, so lung capacity has a strong positive correlation with breath-holding time.

The source code follows:
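
The quoted value can be cross-checked without any external library. The sketch below assumes r is Pearson's correlation coefficient and computes it in mean-centered form from the twelve measurements that appear in the article's source code. The names PearsonSketch and pearson are illustrative and do not belong to JFreeChart or Commons Math.

```java
class PearsonSketch {

    /** Pearson's r in mean-centered form: sum(dx*dy) / sqrt(sum(dx^2) * sum(dy^2)). */
    static double pearson(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("need two equal-length arrays with at least 2 points");
        }
        double xMean = 0.0;
        double yMean = 0.0;
        for (int i = 0; i < x.length; i++) {
            xMean += x[i];
            yMean += y[i];
        }
        xMean /= x.length;
        yMean /= y.length;

        double sxy = 0.0; // sum of products of deviations
        double sxx = 0.0; // sum of squared x deviations
        double syy = 0.0; // sum of squared y deviations
        for (int i = 0; i < x.length; i++) {
            double dx = x[i] - xMean;
            double dy = y[i] - yMean;
            sxy += dx * dy;
            sxx += dx * dx;
            syy += dy * dy;
        }
        // Undefined (NaN) when either variable is constant.
        return sxy / Math.sqrt(sxx * syy);
    }

    public static void main(String[] args) {
        double[] lungCapacityMl = {400, 397, 360, 402, 413, 427, 389, 388, 405, 422, 411, 433};
        double[] breathHoldSeconds = {21.7, 20.7, 17.7, 21.9, 23.7, 25.7, 20.4, 20.1, 22.9, 24.8, 22.5, 25.9};
        System.out.println("study data: r = " + pearson(lungCapacityMl, breathHoldSeconds));

        // Perfectly linear data, matching the first and last rows of the table.
        double[] x = {1, 2, 3, 4};
        System.out.println("rising line:  r = " + pearson(x, new double[] {2, 4, 6, 8}));
        System.out.println("falling line: r = " + pearson(x, new double[] {8, 6, 4, 2}));
    }
}
```

For the study data this should print a value of roughly 0.98, in line with the figure quoted above. The two straight-line cases give r at (or within rounding error of) +1 and -1.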

package com.dataanalysis.plots;

import java.awt.Color;

import javax.swing.JPanel;

import org.apache.commons.math.stat.descriptive.DescriptiveStatistics;
import org.jfree.chart.ChartFactory;
import org.jfree.chart.ChartPanel;
import org.jfree.chart.JFreeChart;
import org.jfree.chart.annotations.XYTextAnnotation;
import org.jfree.chart.axis.NumberAxis;
import org.jfree.chart.plot.PlotOrientation;
import org.jfree.chart.plot.XYPlot;
import org.jfree.chart.renderer.xy.XYLineAndShapeRenderer;
import org.jfree.data.xy.DefaultXYDataset;
import org.jfree.data.xy.XYDataset;
import org.jfree.ui.ApplicationFrame;
import org.jfree.ui.RefineryUtilities;

// - http://en.wikipedia.org/wiki/Scatter_plot

public class ScatterPlotDemo extends ApplicationFrame {

    private static final long serialVersionUID = 1L;

    // data[0] holds the x values, data[1] the y values; filled in by createXYDataset()
    private static double[][] data;

    /**
     * A demonstration application showing a scatter plot.
     *
     * @param title  the frame title.
     */
    public ScatterPlotDemo(String title) {
        super(title);
        JPanel chartPanel = createDemoPanel();
        chartPanel.setPreferredSize(new java.awt.Dimension(600, 400));
        setContentPane(chartPanel);
    }

    private static JFreeChart createChart(XYDataset dataset) {
        JFreeChart chart = ChartFactory.createScatterPlot("Scatter Plot Demo",
                "lung capacity(ml)", "time holding breath(s)", dataset,
                PlotOrientation.VERTICAL, true, false, false);

        XYPlot plot = (XYPlot) chart.getPlot();
        plot.setNoDataMessage("NO DATA");
        plot.setDomainZeroBaselineVisible(true);
        plot.setRangeZeroBaselineVisible(true);

        XYLineAndShapeRenderer renderer = (XYLineAndShapeRenderer) plot.getRenderer();
        renderer.setSeriesOutlinePaint(0, Color.black);
        renderer.setUseOutlinePaint(true);

        // x axis
        NumberAxis domainAxis = (NumberAxis) plot.getDomainAxis();
        domainAxis.setAutoRange(true);

        // y axis
        NumberAxis rangeAxis = (NumberAxis) plot.getRangeAxis();
        rangeAxis.setAutoRange(true);

        // show the r value inside the plot area, at data coordinates (370, 25)
        XYTextAnnotation textAnnotation =
                new XYTextAnnotation("R = " + calculateCoefficient(data), 370, 25);
        textAnnotation.setPaint(Color.BLUE);
        textAnnotation.setToolTipText("Correlation Coefficient");
        plot.addAnnotation(textAnnotation);

        return chart;
    }

    /**
     * Creates a panel for the demo (used by SuperDemo.java).
     *
     * @return A panel.
     */
    public static JPanel createDemoPanel() {
        JFreeChart chart = createChart(createXYDataset());
        ChartPanel chartPanel = new ChartPanel(chart);
        chartPanel.setPopupMenu(null);
        chartPanel.setDomainZoomable(true);
        chartPanel.setRangeZoomable(true);
        return chartPanel;
    }

    public static XYDataset createXYDataset() {
        DefaultXYDataset xyDataset = new DefaultXYDataset();
        data = new double[2][12];
        // x axis data - lung capacity(ml)
        data[0] = new double[]{400, 397, 360, 402, 413, 427, 389, 388, 405, 422, 411, 433};
        // y axis data - time holding breath(s)
        data[1] = new double[]{21.7, 20.7, 17.7, 21.9, 23.7, 25.7, 20.4, 20.1, 22.9, 24.8, 22.5, 25.9};
        xyDataset.addSeries("Research Data", data);
        System.out.println("Correlation Coefficient = " + calculateCoefficient(data));
        return xyDataset;
    }

    public static double calculateCoefficient(double[][] data) {
        DescriptiveStatistics xDataSet = new DescriptiveStatistics();
        for (int i = 0; i < data[0].length; i++) {
            xDataSet.addValue(data[0][i]);
        }
        DescriptiveStatistics yDataSet = new DescriptiveStatistics();
        for (int i = 0; i < data[1].length; i++) {
            yDataSet.addValue(data[1][i]);
        }

        double n = yDataSet.getValues().length;
        double xySum = 0.0d;
        double xPowSum = 0.0d;
        double yPowSum = 0.0d;
        for (int i = 0; i < n; i++) {
            xySum += xDataSet.getElement(i) * yDataSet.getElement(i);
            xPowSum += Math.pow(xDataSet.getElement(i), 2);
            yPowSum += Math.pow(yDataSet.getElement(i), 2);
        }

        // Pearson's r: (sum(xy) - n*mean(x)*mean(y)) / sqrt((sum(x^2) - n*mean(x)^2) * (sum(y^2) - n*mean(y)^2))
        double s1 = xySum - n * xDataSet.getMean() * yDataSet.getMean();
        double xS = xPowSum - n * Math.pow(xDataSet.getMean(), 2);
        double yS = yPowSum - n * Math.pow(yDataSet.getMean(), 2);
        double s2 = Math.sqrt(xS * yS);
        return s1 / s2;
    }

    /**
     * Starting point for the demonstration application.
     *
     * @param args  ignored.
     */
    public static void main(String[] args) {
        ScatterPlotDemo demo = new ScatterPlotDemo("Scatter Plot Demo");
        demo.pack();
        RefineryUtilities.centerFrameOnScreen(demo);
        demo.setVisible(true);
    }
}
