Histograms

 

Data often follows a recognisable pattern, such as a normal distribution (as with IQ scores or people's heights), or it may be random. One of the best ways to examine this is to plot a histogram as an estimate of the probability distribution. This page shows a normal distribution and a uniform random distribution. For a uniform random generator, we should get a flat trend across all the values, whereas the shape of the normal distribution depends on the mean μ and the standard deviation σ. The more samples we have, the closer the histogram gets to the expected distribution.
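
As a rough illustration of that last point, the sketch below bins uniform samples by hand and measures how far any bin's share of the samples strays from the flat value of 1/bins. The function name `max_deviation`, the fixed seed, and the choice of 10 bins are assumptions made for this example and are not part of the page's own code.

```python
import random

def max_deviation(samples, bins=10, seed=1):
    """Largest gap between a bin's share of the samples and the flat value 1/bins."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = [0] * bins
    for _ in range(samples):
        # random() lies in [0, 1); the clamp guards against rounding up to `bins`
        index = min(int(rng.random() * bins), bins - 1)
        counts[index] += 1
    expected = 1.0 / bins
    return max(abs(c / samples - expected) for c in counts)

# The deviation should shrink as the sample count grows
for n in (100, 1000, 10000, 100000):
    print(n, round(max_deviation(n), 4))
```

The printed deviation should fall as the sample count rises, which is the flattening seen in the uniform histogram.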

Options

The page takes four settings: the normal distribution's μ and σ, the number of samples, and the number of bins.

Try an example

  • μ=5, σ=3, Samples=100
  • μ=5, σ=3, Samples=1,000
  • μ=5, σ=3, Samples=10,000
  • μ=5, σ=3, Samples=100,000
  • μ=5, σ=3, Samples=1,000, Bins=10
  • μ=5, σ=3, Samples=1,000, Bins=50
  • μ=5, σ=3, Samples=1,000, Bins=100
  • μ=5, σ=3, Samples=1,000, Bins=200
  • μ=5, σ=10
  • μ=5, σ=1
  • μ=5, σ=0.5
  • μ=5, σ=5
  • μ=55, σ=10. This is the mark distribution we often aim for in academia.
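
The bin settings above trade resolution against noise: with the sample count fixed, more bins means fewer samples land in each bin. A minimal sketch of that effect follows. The helper name `bin_counts` is hypothetical, and it and the fixed seed are assumptions made for this illustration.

```python
import random

def bin_counts(data, bins):
    """Count how many values fall in each of `bins` equal-width bins spanning the data."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        # The largest value would land at index `bins`, so clamp it into the last bin
        index = min(int((x - lo) / width), bins - 1)
        counts[index] += 1
    return counts

rng = random.Random(1)  # fixed seed so the run is repeatable
data = [rng.normalvariate(5, 3) for _ in range(1000)]

for bins in (10, 50, 100, 200):
    counts = bin_counts(data, bins)
    print(bins, "bins: fullest bin holds", max(counts), "of", len(data), "samples")
```

With only a handful of samples per bin, random fluctuation is large relative to the bin height, which is why the 200-bin histogram looks ragged.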

Normal distribution: \( f(x)=\frac{1}{\sqrt{2\pi}\sigma}e^{-(x-\mu)^2/(2\sigma^2)}\)

For example, with μ=9 and σ=2: \( f(9)=\frac{1}{\sqrt{2\pi}\cdot 2}e^{-(9-9)^2/(2(2)^2)} \approx 0.1995\). This is the height that the normalised histogram should approach at x=9.
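
The worked value can be checked numerically. The sketch below writes the formula out directly and, as a cross-check, compares it with `statistics.NormalDist` from the Python standard library (available from Python 3.8). The function name `normal_pdf` is an assumption made for this example.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Normal density f(x) for mean mu and standard deviation sigma."""
    exponent = -(x - mu) ** 2 / (2 * sigma ** 2)
    return math.exp(exponent) / (math.sqrt(2 * math.pi) * sigma)

value = normal_pdf(9, 9, 2)
print(round(value, 4))                    # 0.1995
print(round(NormalDist(9, 2).pdf(9), 4))  # 0.1995
```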

Source code

The following outlines the Python code used:

import random
import sys

import matplotlib.pyplot as plt

# Defaults. Each can be overridden on the command line, in this order:
# file, mu, sigma, samples
file = '1111'  # read from the command line but not used below
mu = 9.0
sig = 2.0
samples = 10000

if len(sys.argv) > 1:
    file = str(sys.argv[1])

if len(sys.argv) > 2:
    mu = float(sys.argv[2])

if len(sys.argv) > 3:
    sig = float(sys.argv[3])

if len(sys.argv) > 4:
    samples = int(sys.argv[4])

# Four stacked plots, one per distribution
fig, myplot = plt.subplots(4, 1, figsize=(8, 8))

# Uniform samples in [0, 1)
uniSamples = [random.random() for i in range(samples)]

# density=True scales the bars so that their total area is 1
# (older Matplotlib releases called this option normed)
myplot[0].hist(uniSamples, bins=100, density=True)
myplot[0].set_title("Uniform random number generator histogram")
myplot[0].set_xlabel("x")
myplot[0].set_ylabel("Frequency of occurrence")
print("Uni: ", uniSamples[0:10])  # Take a look at the first 10

# Normal samples with mean mu and standard deviation sig
normSamples = [random.normalvariate(mu, sig) for i in range(samples)]

myplot[1].hist(normSamples, bins=100, density=True)
myplot[1].set_title(rf"Normal Histogram RNG $\mu = {mu}$ and $\sigma = {sig}$")
myplot[1].set_xlabel("x")
myplot[1].set_ylabel("Frequency of occurrence")
print("Norm: ", normSamples[0:10])  # Take a look at the first 10

# Triangular samples: low=0, high=1, mode=0.5
triSamples = [random.triangular(0, 1, 0.5) for i in range(samples)]

myplot[2].hist(triSamples, bins=100, density=True)
myplot[2].set_title("Triangular Histogram RNG")
myplot[2].set_xlabel("x")
myplot[2].set_ylabel("Frequency of occurrence")
print("Tri: ", triSamples[0:10])  # Take a look at the first 10

# Weibull samples: mu is reused as the scale and sig as the shape
weibullSamples = [random.weibullvariate(mu, sig) for i in range(samples)]

myplot[3].hist(weibullSamples, bins=100, density=True)
myplot[3].set_title("Weibull Histogram RNG")
myplot[3].set_xlabel("x")
myplot[3].set_ylabel("Frequency of occurrence")
print("Weibull: ", weibullSamples[0:10])  # Take a look at the first 10

# Adjust the spacing once the titles and labels are in place
plt.tight_layout(w_pad=1.5, h_pad=2.0)
plt.show()
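
The histograms in the listing are drawn normalised, so the bar areas sum to 1 and the bar heights can be compared directly with f(x). As a rough sketch of what that normalisation does, assuming equal-width bins and using only the standard library (the helper name `density_histogram` and the fixed seed are assumptions made for this example):

```python
import math
import random

def density_histogram(data, bins):
    """Return (left_edges, heights, width), with heights scaled so the total bar area is 1."""
    lo, hi = min(data), max(data)
    width = (hi - lo) / bins
    counts = [0] * bins
    for x in data:
        # Clamp so the largest value falls in the last bin
        counts[min(int((x - lo) / width), bins - 1)] += 1
    # height = count / (number of samples * bin width)
    heights = [c / (len(data) * width) for c in counts]
    edges = [lo + i * width for i in range(bins)]
    return edges, heights, width

mu, sig = 9.0, 2.0
rng = random.Random(1)  # fixed seed so the run is repeatable
data = [rng.normalvariate(mu, sig) for _ in range(100000)]
edges, heights, width = density_histogram(data, 100)

area = sum(h * width for h in heights)
# Height of the bar that contains x = mu, to compare with f(mu)
i = min(int((mu - edges[0]) / width), len(heights) - 1)
peak = 1 / (math.sqrt(2 * math.pi) * sig)
print("area:", round(area, 6))
print("bar at mu:", round(heights[i], 3), "f(mu):", round(peak, 3))
```

With μ=9 and σ=2, the bar containing x=9 should sit close to the 0.1995 value from the worked example, and it gets closer as the sample count grows.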