Thursday, October 02, 2008

And a little about probability

I wrote in an earlier blog post that:
Statistics and probability are not something we are naturally good at. Of course we can learn to master them too, but then we have to do it the hard way and study statistical methods. If we merely try to give our best estimate based on what intuitively seems "reasonable to assume", there is a high probability that the answer will be not just wrong, but astronomically wrong.


One of my favorite examples of this comes from the book "The Canon" by Natalie Angier (and if you are only going to read one book about science in your lifetime, Angier's book is not a bad choice at all!). Imagine that you are offered a new HIV test that is presented as "95% accurate", and you decide to get tested, just to be safe, even though you don't belong to any of the high-risk groups. A week later comes the shock: the test was positive! Doesn't this have to mean there is a 95% probability that you are infected with HIV? Not necessarily:

Unbate your breath. Even if it was your vital fluid that yielded the positive result, the real odds are much smaller than 95 percent that you are genuinely HIV positive. In the lively Port Said of the free market, the definition of a test's accuracy can vary depending on the needs and temperament of the pharmaceutical company, but in general this figure would mean the following: on the one hand, the test will accurately detect the human immunodeficiency virus in 95 percent of those who have it but will fail to catch 5 percent of those infected; on the other hand, it will correctly rate as negative 95 percent of all noncarriers, but - and here's where your comfort food comes in - it will mistakenly generate a positive result for 5 percent of uninfected patients. Why should you find solace in a puny false-positive figure like 5 percent? Because the potential pool, the sample space, embodied in that figure is formidable. In the United States, HIV infection remains relatively rare, afflicting about 1 in 350 people. Taking a more population-worthy slant on the problem, that means in a random group of 100,000 Americans, some 285 will be HIV positive, and 99,715 not. Yet if we screened all 100,000 with our AIDS test, what would we expect? The assay would accurately pick up 271 of the 285 viral carriers; but it would slap a fallacious writ of panic on some 4,986 noncarriers. To calculate the odds that a positive result means you are actually infected, you divide the total number of true positives you'd expect in your sample space (271) by the total number of positives overall - false (4,986) and true (271) together. Slice 271 by 5,257, and you end up with a probability of 5 percent. The gist of that calamitous phone call, then, amounts to the flip figure of your initial fears: there is a 95 percent chance you're virus-free.
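
For anyone who wants to check the arithmetic in the passage above, here is a minimal sketch in Python. The function and parameter names are my own and not from the book; the inputs are the figures Angier gives (100,000 people screened, 285 carriers, and a test that is right 95 percent of the time for carriers and noncarriers alike).

```python
def positive_test_breakdown(population, carriers, sensitivity, specificity):
    """Return (true positives, false positives, P(infected | positive test)).

    sensitivity: share of carriers the test correctly flags as positive.
    specificity: share of noncarriers the test correctly rates as negative.
    All names here are illustrative assumptions, not taken from the book.
    """
    noncarriers = population - carriers
    true_positives = sensitivity * carriers
    false_positives = (1.0 - specificity) * noncarriers
    probability_infected = true_positives / (true_positives + false_positives)
    return true_positives, false_positives, probability_infected


# Figures from the quoted passage: 285 carriers among 100,000 people.
tp, fp, p = positive_test_breakdown(100_000, 285, 0.95, 0.95)
print(f"true positives:  {tp:.0f}")
print(f"false positives: {fp:.0f}")
print(f"chance a positive result is real: {p:.1%}")
```

The thing to notice is that the answer is driven almost entirely by how rare the infection is. Feed the same function a group where the infection is far more common and the same "95% accurate" test suddenly becomes much more trustworthy.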
