Strange Data: May 2015

Forecasting is hard. If I could do it well I wouldn't be picking an NHL fantasy team. I would be picking stocks. And I would be swanning around on my yacht in the Mediterranean. With that excuse in mind, how are things going in the play-off pool?

After the first round of play-offs I've come out ahead of the pack but nowhere near first place. That honour goes to team Rye followed by team Disaronno. I follow in a close third, ten points back from the front.

--------------------------------------
Rank | Team            Points
-----+--------------------------------
1    | Rye             51
2    | Disaronno       43
3    | Strobes         41
4    | Rum             38
5    | Everglo         37
6    | Moonshine       35
7    | Absinthe        34
8    | Jack Daniels    33
9    | Jagermeister    31
10   | Beer            29
11   | Brandy          28
12   | Gin             27
--------------------------------------

But as Henry Jones Sr. would say "thersh no shilver medal for finishing shecond". I either place first or I lose. So based upon how things have been going so far how am I projected to finish? This would be the time for any Jets HR people to send me that contract and not finish reading this blog.

Turns out not well. I project forward how I expect points will evolve based upon a couple of key pieces of information. First, half of all teams are eliminated from the playoffs each round. This reduces the number of players in the pool by roughly half each round that we go through. It also means that on average each person in the pool should see their points per round halve each successive round. This is a basic geometric decay process that you might have seen in grade 11 math, and it should be a fairly reasonable way to project forward how many points each fantasy team should get based on information from the first round.

In addition to this we also know which NHL players continue to be in the pool and which ones have exited. Even if a fantasy team continues to have ten of their players, if they're not particularly productive then it really doesn't matter that a majority of the fantasy team has not yet been eliminated. Conversely, if someone has only a few very productive players they might be able to milk a whole lot of points out of them even though most of their fantasy team has been eliminated. Team Rye finds themselves in this latter situation where they only have four players after the first round but they're unfortunately very productive players.

So this next assumption that I make is that, for the second round, each team will get the same amount of points that they got in the first round, minus the points of the players that were eliminated in the first round. This amount then halves for each additional round. In math terms it looks like this:
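Assuming a four-round play-off and taking the halving rule at face value, the projection can be written roughly as follows. T is my own label for the projected total, and the exponent is my reading of "halves for each additional round" rather than something stated exactly.

```latex
T = X_p + \sum_{r=2}^{4} \frac{X_p - X_{p-n}}{2^{\,r-2}}
```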

Total projected points are equal to the already earned points in the first round, Xp, plus the projected points from the remaining rounds. Xp-n is a term for all of the points earned by players in the first round who were eliminated and therefore cannot earn any more points. Taking into account that half of all NHL players will be eliminated, the difference is then divided by a geometric factor as the rounds of play-offs progress.
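For anyone who prefers code to algebra, here is a minimal sketch of that projection. The function name, the four-round default, and the example numbers are all assumptions made for illustration; none of them come from the actual pool data.

```python
def project_total(first_round_points, eliminated_points, rounds=4):
    """Project a fantasy team's final points from its first-round results.

    first_round_points: points earned in round one (Xp in the text).
    eliminated_points: the share of Xp earned by players who are now
        eliminated (Xp-n in the text).
    rounds: total number of play-off rounds (four is an assumption).
    """
    surviving = first_round_points - eliminated_points
    total = first_round_points
    # Round two is projected at the full surviving amount,
    # and each later round at half of the round before it.
    for r in range(2, rounds + 1):
        total += surviving / 2 ** (r - 2)
    return total


# Hypothetical team: 40 first-round points, 10 of them from eliminated players.
print(project_total(40, 10))  # 40 + 30 + 15 + 7.5 = 92.5
```

A team that lost every player after round one just keeps its first-round points, which is the behaviour you would want from the formula.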

This is how everyone's points are projected to evolve.

And this is the final rank estimation based upon this projection.

------------------------
Rank | Team
-----+------------------
1    | Disaronno
2    | Rye
3    | Jack Daniels
4    | Gin
5    | Everglo
6    | Strobes
7    | Jagermeister
8    | Rum
9    | Beer
10   | Absinthe
11   | Moonshine
12   | Brandy
------------------------

This bodes poorly for my fantasy NHL team. So what went wrong here? In looking back at the regression results from my original model a couple of things leap out. First, the model put a big emphasis on goaltending. The model estimated a large coefficient for the save percentage term but it didn't do it very precisely. This is the root cause of the model predicting Montreal to do well. It overemphasized the great season that Carey Price had. On the back of this I picked two Montreal players. The remaining variables are also not all that precisely estimated which should induce a lot of uncertainty in the predictions.

The second thing that went wrong was idiot human error in the form of my own stupid picks. Despite saying that I would only pick players from teams that had a good chance of advancing in a series I picked two players from Nashville. I probably should have avoided the Nashville - Chicago series where the betting markets basically predicted a toss-up.

But all is not lost. As of writing this I am in decent position with a ten point gap between me and the next fantasy team. I've kept pace with teams Rye and Disaronno. There are a couple of reassuring reasons why these projections might be wrong. First, I bet the heaviest on Anaheim by picking up three of their players. They then proceeded to destroy Winnipeg in very short order (don't deny it, it happened) but that meant that the series only went four games. Fantasy teams including team Disaronno bet heavily on Chicago, where the series went to seven games and his players had three additional games to get points. The Minnesota - Chicago series does not look like it is going seven games. Besides this, team Rye has three of his remaining players on Minnesota and team Disaronno has five players on Chicago. One of them is very screwed after this round.

Irrelevant statistic to this post: I watched the Pacquiao-Mayweather fight last night and my favourite statistic was that the total take from the fight was greater than the gross national incomes of the 29 poorest nations.

As medical school winds down several things occur. The snow starts to melt. The rotations on various medical services become devoid of fourth years. Medical students become lazy and relaxed before the long slog of residency starts. And companies looking to vampire as much money as possible out of a group of future one percenters scuttle out of the woodwork looking for a piece of the financial action.

Over the last couple of weeks my classmates and I have been subjected to a parade of slick, bespoke-suit-wearing banker types who all gave presentations on how much money we were all eventually going to make. The bankers gave pitches that were all subtle variations on "you're all going to be stinking rich. Now let me take some of that money and make you even more stinking rich and myself stinking rich too!" And people wonder why medical students get such big egos. On top of that, every one of them brought us a "free lunch" to go with the presentation. I technically have two degrees on the topic of free lunches so I put that phrase in quotes for a reason.

A lot of the pitches were on the importance of navigating tax loopholes. Essentially you pay these guys so that you don't have to pay the government even more. But in addition to this, almost all of them pitched disability insurance to us. The line was that the greatest earning potential was "in our brains" and that we should "protect it" by insuring it. Almost invariably, at some point during the pitch there was a story about some doctor who had got their hands chopped off in a random golfing/boating/gambling accident and who had filed for disability insurance the week before but the paperwork hadn't gone through yet and if only he had purchased disability insurance sooner! Now his kids hated him and his wife divorced him and he was an alcoholic all because he didn't have disability insurance.

Disability insurance has been pitched to me more than a couple of times during medical school, which automatically makes me sceptical of it. Every time some finance guy tells me about disability insurance, I get the feeling that he's looking at me as though I were transforming into a giant sack of money with a dollar sign on the front of it. Given that so many people want to sell medical students disability insurance, it must be a fairly lucrative field. Most medical students are young and aren't at much risk of becoming disabled in the next couple of years. I would also have thought that doctors in general have a pretty low risk of becoming disabled given the income they make and the lifestyles they lead. None of these bankers have ever given me any actuarial odds on how likely it was for me to become disabled at any point. They just told me the anecdote about the pathetic doctor without disability insurance. If doctors were getting their hands chopped off left and right, the bankers would have pitched disability insurance that way. Let's see if this actually stands up to statistical evidence.

Now a couple of things before I go into methodology here. First, I'm flattered that some people have mistaken my economics degrees as evidence that I have any real knowledge about how insurance works. The type of insurance that I learned about was more like this (scroll down to the appendix). Try applying any of this to real life. But there are two broad principles that leap out of insurance theory in economics. First, purchasing insurance is related to risk. What is the true risk of becoming disabled? This is really the focus of this post.

Second, purchasing insurance is related to how unhappy you are at accepting a certain level of risk. How risk averse are you and what do you have to lose if you don't purchase insurance? Most people are in some way risk averse but some are more so than others. I don't have any kids (that the legal system acknowledges), but if I did I would be a lot more worried about becoming disabled even if my true risk of becoming disabled was the same. Similarly, I'm a pretty relaxed guy but if I was an uptight worrier I might consider getting disability insurance to give myself "peace of mind".

I can give you something of an answer on the first reason to purchase insurance, but the second reason is all up to you. So you are to take none of what follows as sound financial advice. This is not "actionable" as my lawyers would say. It's up to you to decide whether you should get disability insurance.

So at what risk are you, fellow medical student, of becoming disabled during your career? This question is actually quite difficult to answer because the public-use labour force surveys don't identify doctors in the workforce. There are so few doctors that if you actually noted them in the national data you could start to pick out individuals with a couple of other key variables. To avoid this the surveys don't identify doctors, but they do identify people working in certain fields and in certain professions, and they identify people with certain degrees. So for what follows, I haven't actually been able to identify any doctors but I have identified people who according to the data look very much like doctors. These are people with graduate degrees, who work in the health sector, and who are professionals in the health sector. This group of people does not include nurses, but it probably includes people who are pharmacists, dentists, chiropractors, physiotherapists, and other professional health care workers. I will be denoting this group with the abbreviation "PWLLD" (people who look like doctors).
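To make the selection rule concrete, here is a rough sketch of the filter. The field names and category labels are hypothetical stand-ins that I made up for illustration; the real survey file uses its own variable names and codes.

```python
def looks_like_doctor(person):
    """Return True for a respondent who fits the PWLLD definition.

    The keys and labels below are hypothetical, not actual survey variables.
    """
    return (
        person["education"] == "graduate degree"
        and person["industry"] == "health"
        and person["occupation"] == "health professional"
    )


# Toy respondents, invented for the example.
sample = [
    {"education": "graduate degree", "industry": "health",
     "occupation": "health professional"},
    {"education": "graduate degree", "industry": "finance",
     "occupation": "manager"},
    {"education": "college diploma", "industry": "health",
     "occupation": "nurse"},
]

pwlld = [p for p in sample if looks_like_doctor(p)]
print(len(pwlld))  # 1
```

All three conditions have to hold at once, which is why the nurse and the finance manager in the toy sample both drop out.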

This data is taken from the 2013 cross-sectional Labour Force Survey (LFS). The LFS is a monthly survey that evaluates trends in Canadian labour force participation and employment. It also identifies whether someone is unemployed or underemployed because of a personal disability or illness. This data is re-weighted using a frequency weight to provide nationally representative results.
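Re-weighting just means that each respondent counts for as many people as their survey weight says they represent. Here is a small sketch of a weighted share under that assumption. The key names and the toy rows are hypothetical and are not taken from the survey.

```python
def weighted_share(rows, condition, weight_key="weight"):
    """Share of the weighted population for which condition(row) is true."""
    total = sum(row[weight_key] for row in rows)
    if total == 0:
        raise ValueError("no weighted observations")
    hits = sum(row[weight_key] for row in rows if condition(row))
    return hits / total


# Toy rows: each respondent stands in for `weight` people.
rows = [
    {"part_time_reason": "own illness or disability", "weight": 50},
    {"part_time_reason": "personal preference", "weight": 100},
    {"part_time_reason": "not applicable", "weight": 50},
]

share = weighted_share(
    rows, lambda row: row["part_time_reason"] == "own illness or disability"
)
print(share)  # 0.25
```

Counting respondents without the weights would give one in three here, which is the kind of gap the weights are there to correct.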

The one major problem with my strategy here is that because I cannot identify doctors directly, this will necessarily include chiropractors or dentists or other such people who may be different in their respective probabilities of becoming disabled. I think the true probability of disability will be close to this, but there will most certainly be some error in this estimate.

I look at two major outcomes. First, what is the probability of being disabled or ill to the point where one has to work part time? Second, what is the probability of being so badly disabled or ill that one cannot work?

So what is the probability in this group of people who look like doctors of becoming disabled so they can only work part time? The following pie charts are estimated percentages of people who are in the labour force and their reasons for working part time. The chance that someone in the PWLLD group has to take part-time work because of a disability or illness is about half a percent. PWLLD do actually become disabled at a higher rate than the general population but it's not really that much higher. A full 83 percent of PWLLD are either not underemployed or consider themselves outside of the labour force (the "not applicable" category). Interestingly, PWLLD tend to work part time because of personal preference more so than the general labour force. An additional 5% of the PWLLD take time off as a personal preference and this fully accounts for the difference between the two groups in the "not applicable" category. They also tend to work part time more than the general population in order to care for their children. High salaries among the PWLLD may be the reason that these people can take part time work for personal preference or to take care of children.

Reason for working part time - People who look like doctors

Reason for working part time - General labour force

PWLLD do much better than the general population when it comes to job-ending illnesses or disabilities. Less than 0.02% of PWLLD get a job-ending disability or illness. The general population gets a job-ending disability or illness at about 20 times this rate.

Reason for being unemployed - People who look like doctors

Reason for being unemployed - General labour force

My suspicions about the age of onset of a disability or illness for PWLLD are also confirmed. Of the PWLLD who get a work-limiting disability or illness and who have to work part time, about 90% occur after the age of 50. About 98% occur after the age of 40 (the red line represents PWLLD, the blue line represents all people with graduate degrees, and the green line represents the general population). I looked at job-ending illnesses and disabilities as well, but because there are so few PWLLD who get job-ending illnesses or disabilities, it's not really informative.

What age do people get disabilities/illnesses causing part-time work? Red line is PWLLD, Blue line is people with graduate degrees, green line is the general population.
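A curve like this can be built as a weighted cumulative share by age. The sketch below assumes each record is an (age, weight) pair for someone with a work-limiting disability or illness. The ages and weights are invented for illustration and are not the survey figures.

```python
def cumulative_share_by_age(records):
    """Return (age, cumulative weighted share) pairs sorted by age.

    records: iterable of (age, weight) pairs; both values are assumed inputs.
    """
    totals = {}
    for age, weight in records:
        totals[age] = totals.get(age, 0) + weight
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    running = 0
    curve = []
    for age in sorted(totals):
        running += totals[age]
        curve.append((age, running / grand_total))
    return curve


# Toy data: most of the weight sits above age 50.
curve = cumulative_share_by_age([(35, 2), (45, 8), (55, 40), (60, 50)])
for age, share in curve:
    print(age, share)
```

The share of cases that occur after a given age is one minus the cumulative share at that age, which is how statements like "about 90% occur after the age of 50" can be read off the curve.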

So the reason why I think all of these financial companies are trying to sell medical students on disability insurance is two-fold. First, it's free money for the first ten to fifteen years of a doctor's career. The likelihood that an insurance company is going to pay out to any doctor is already low, and the chance it'll have to pay out to a 30-year-old doctor is exceedingly low. Second, it's a way for a company to worm its way into future doctors' financial affairs and reel them in for more lucrative financial services down the road. So I probably won't buy disability insurance just yet.

But if I ever get my hands chopped off in a random golfing/boating/gambling accident, I'm really going to be eating my words.

Wednesday 6 May 2015

Can I out-predict a bunch of yahoos with statistics? NHL Playoff Edition (Update 1)

Sunday 3 May 2015

How many doctors become disabled?