Strange Data: 2016

Tuesday 8 November 2016

Which presidential candidate will make you drink more?

I was a debater in high school. At great personal expense to my social life I went out and spoke instead of participating in sports or art or music or whatever as an extracurricular activity. For some of you who know me but didn’t know that, I bet some puzzle pieces are falling into place.

My debate teacher was well respected and had a great depth of experience with judging debating. He had a regular story about the worst debate he had ever judged. The resolution, or what was being debated, was “better dead than red”. For those of you who did sports in high school rather than debating, any resolution is essentially meaningless and can be interpreted in any fashion you would like. The above resolution could have been a debate about the merits of communism versus capitalism; it could have been a debate about whether we should force parents to vaccinate their children against their will; it could have even been about how terrible tomatoes are and how we would be better off if we didn’t eat them.

Instead though, the debaters in the “worst debate of all time” defined the resolution as whether it was better to be aboriginal or dead. The debate team that opposed the resolution settled on the argument that death was horrible but that it was only slightly worse than being aboriginal. The team in favor went with the similarly racist argument that, no, living as an aboriginal person was horrible for reasons including that they were drunk all the time and that they didn’t work. Death was obviously preferable. The second speaker on this team did eventually decide to go with the reasonable argument that a lot of why aboriginal Canadians have had so many social and economic problems has been a result of systemic oppression. But then he went back to defend his partner’s points on the drunkenness and crime and all of that. It sounds like something out of a piece of Ku Klux Klan literature rather than a high school debate.

It’s hard to compare anything to it. As they were teenagers at the time I guess we have to give them the benefit of the doubt. Nevertheless, it seems like an obtuse and tone-deaf debate to have when you are (most likely) a white male, (most likely) come from a household that has an income in the six figures, and (most likely) are wearing a suit jacket with some prep school crest on it. True to stereotype, most debaters were white, male, prep-school-educated knobs when I was debating. I was, of course, the only exception.

The closest that I have come to witnessing a similar debate though (and this is in the crudest sense as it’s difficult to get anywhere close to this “worst debate of all time”) was several weeks ago when I watched the brouhaha in St. Lou-ha-ha, the melee in Missouri, the federal debate fisticuffs between Hillary and Donald. A candidate running for president had to publicly deny sexually assaulting women, disavowed on record something their running mate said, and then threatened to lock up their opponent. And Hillary was there too. It was a debate that wouldn’t have been out of place in a banana republic.

I am under no illusion that I will have any effect on the outcome of this presidential contest. Most of the people who read this blog are Canadian and are therefore ineligible to vote in an American election. I am not going to convince any Americans that I know who are voting for Hillary to switch their allegiance. I am not going to convince any Americans I know who are voting for Donald because most of these people are stuck scratching their lobotomy scars when I use polysyllabic words like banana.

So I have nothing to contribute to this debate. But what I can do is address the obvious question: who will cause you to drink more if elected president? We have a perfect testbed for this question, which is to see how much the candidates cause us to drink during the final debate which occurred two weeks ago in Las Vegas.

To test this question, I recruited one healthy male volunteer of legal drinking age into this experiment. In the interest of anonymity, we will call him by his initials, SS. SS has a prodigious brain and an even more prodigious liver and so could spare both in the interest of the scientific method. I also purchased a police-grade breathalyzer which makes this blog post, by far, the most expensive in Strange Data history. I had attempted to run this experiment on the previous debate in St. Louis, but the first breathalyzer I purchased seemed to function as a random number generator rather than an accurate test of blood alcohol content. I had to reschedule until Amazon shipped the new one.

Good old-fashioned American debate drinking games have been a staple of the socially awkward college student since debates were first televised. They’re as American as apple pie or failed gun control legislation. Most of the normal drinking game rules are based upon the script that regular American politicians follow. This pabulum forms the basis of all political drinking games because it seems like most American politicians (but really all politicians) can’t get through a speech without uttering all of these terms. I chose the following terms or concepts as the foundation of the drinking game rules:

"middle class"
"small businesses"
A promise of tax cuts
The “I’m a regular joe, not a politician” story
Accusing an opponent or the establishment of being “Washington insiders”
“The children are our future” anecdote

These would be the terms that I would expect from a regular election cycle where the candidates weren’t universally hated and scandal plagued. This is not the case this year. So added to these terms were several specific to the discussion (I use that term loosely) that has been occurring around the candidates this year. These “wildcard” terms include:

Any affair/sexual impropriety talks
Emails/Wikileaks talks
Stamina/health/fitness talk
China – they’re taking our jobs
Mexico or how to build a free wall
Tax returns

Mentioning any of these twelve concepts earned one drink (which I measured in a one-ounce shot glass). The drink would be earned on a per-thought basis. For example, a candidate who said “middle class” repeatedly during the same line of argument would earn themselves one drink. An opposing candidate who replied to that line of argument though would also earn themselves a drink. A candidate who mentioned one of these terms, then switched topics, and then switched back to the term in question would earn themselves two drinks.

I chose these twelve terms for two major reasons: they’re fairly objective and they’re discrete, so it’s easy to measure drinks. In previous elections I have used drinking rules like, “if the candidates speak over the moderator, drink until they stop”, which makes it difficult to measure alcohol intake. I have also used drinking rules like “if the candidate tells a lie then drink”, which is a subjective rule at least in real time (and wouldn’t you be drinking constantly?). But to the above objective drinking rules I added one that is more subjective: if a candidate says something that makes my jaw drop, finish a drink. To accurately gauge alcohol intake for SS I used a measuring cup to record these finished drinks.

As this is a blog that’s nominally about health and medicine, I guess I should talk about something health-related. Your liver is an organ that loves you dearly. It loves you so much that it will help clear any alcohol in your blood out of your body so that this alcohol does not kill you. I once heard from an intensive care doctor that all you really need to live is a small bit of brain and your liver. Everything else can kind of be replaced and although you will have a poor and uncomfortable life, you will live.

The way that alcohol enters your body (usually) to get to your liver is by drinking it. It goes in your face. It then is dumped by the esophagus into the stomach where about 20% of it is absorbed via the stomach lining. The stomach then dumps it into the initial part of the small intestine – the duodenum – where the remaining 80% is absorbed. Any food in the stomach delays gastric emptying and uptake of alcohol, which explains why people do not feel as intoxicated as quickly when they drink and eat.

Alcohol gets absorbed into the blood and then is transported via the portal blood system directly to the liver where that organ gets first crack at ridding you of it. This is called the first pass system and it is why orally ingested drugs (that are cleared by the liver) are usually less potent than those given intravenously. This first pass system occurs both in the liver and at the interface between the gastric mucosa and circulation system. The liver itself continually purifies blood flowing through it so that any alcohol that does not get cleared through first pass will be purified on further passes through the portal system.

From the liver’s circulatory system alcohol disseminates into the bloodstream. It then exerts its effects on the brain and gives the feelings of intoxication. Alcohol continues to circulate throughout the blood and is processed by the liver every time it passes through the portal vein system. The biochemical mechanism for this process in the liver is a conversion by the enzyme alcohol dehydrogenase into acetaldehyde, which is then converted into acetic acid and finally into carbon dioxide and water. It is during the processing by alcohol dehydrogenase that a coenzyme is required; this rate limits the conversion so that at sufficiently high quantities of alcohol in the blood, the liver will metabolize and excrete alcohol at a constant rate. Once you hit saturation you can theoretically predict blood alcohol content (BAC) and the excretion of alcohol by the following equation:

Blood alcohol content (BAC) is then a function of the volume of alcohol consumed (v), the strength of the alcohol (z), the proportion of alcohol absorbed (a – which is presumed to be one under “normal” experimental situations), alcohol density (d – equal to 0.789 g/ml), mass of the subject (m), and a conversion factor that is a product of the water content of a person’s tissues (r). The excretion rate is a function of the time since consuming alcohol (t) and a coefficient that has been estimated from previous experiments (B – about equal to 13.3 mg/dL/hr).
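The description above matches the classic Widmark formula, so here is a minimal Python sketch of it, assuming the usual form: grams of alcohol divided by body mass times the water factor, minus what has been eliminated over time. The function names, the default r of 0.68 (a commonly quoted value for men), the factor of 100 that converts the result to g/dL, and the elimination rate of 0.0133 g/dL per hour are all assumptions for illustration, not values confirmed by this post. The 80 kg subject in the usage lines is made up too.

```python
ETHANOL_DENSITY_G_PER_ML = 0.789   # d, the alcohol density quoted in the post
ELIMINATION_G_PER_DL_PER_HR = 0.0133  # B, assumed value in g/dL per hour


def grams_of_alcohol(volume_ml, strength, absorbed=1.0):
    """Mass of ethanol in a drink: v * z * a * d."""
    return volume_ml * strength * absorbed * ETHANOL_DENSITY_G_PER_ML


def widmark_bac(volume_ml, strength, mass_kg, hours,
                r=0.68, absorbed=1.0, beta=ELIMINATION_G_PER_DL_PER_HR):
    """Predicted BAC in g/dL under the Widmark assumptions.

    r=0.68 is an assumed water-content factor; the post does not give one.
    """
    grams = grams_of_alcohol(volume_ml, strength, absorbed)
    # Grams of alcohol per gram of "watery" body mass, scaled to g/dL.
    before_elimination = 100.0 * grams / (mass_kg * 1000.0 * r)
    # Elimination is treated as a constant rate; BAC cannot go negative.
    return max(0.0, before_elimination - beta * hours)


if __name__ == "__main__":
    # Hypothetical 80 kg subject drinking 1700 ml of 4.8% beer.
    for hours in (0.0, 1.0, 1.9):
        print(hours, round(widmark_bac(1700, 0.048, 80, hours), 3))
```

This treats all of the beer as if it were swallowed at time zero, which is cruder than attributing drinks to six-minute blocks, but it is enough to see how the pieces of the formula fit together.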

As we have an ability to predict BAC, we can then estimate the tolerance that SS has to alcohol. There are several ways to classify tolerance in alcohol ingestion. One classification in particular separates how the alcohol acts on the body and how the body acts on alcohol. Functional tolerance relates to how sensitive the brain is to the intoxicating effects of alcohol – it is how the alcohol acts on the body. Increasing functional tolerance is why alcoholics can drink quarts of vodka and not slur their words. This is a difficult thing to measure accurately as BACs will still be high despite behaviour that suggests otherwise.

What we can measure a little bit more accurately though is dispositional tolerance. Dispositional tolerance is how the body deals with alcohol. A liver that is chronically exposed to alcohol can clear the same amount of alcohol at a quicker pace. Since we have a way to estimate the theoretical BAC of SS and we will also have empirical estimates of BAC through the breathalyzer we can then check his relative dispositional tolerance to alcohol. If the log of the ratio of empirical BAC to predicted BAC is below zero then SS is clearing alcohol faster than predicted and has a higher dispositional tolerance than the average person. If the log of the ratio of empirical BAC to predicted BAC is above zero then SS has a lower dispositional tolerance and is a lightweight.
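The comparison itself is a one-liner. Here is a small Python sketch, assuming the natural log is used (the post does not say which base, and the sign is the same either way). The function name and the predicted values in the usage lines are hypothetical; only the two measured readings come from this post.

```python
import math


def log_bac_ratio(measured, predicted):
    """Log of measured BAC over predicted BAC.

    Below zero: the breathalyzer read lower than the formula expected.
    Above zero: the breathalyzer read higher than the formula expected.
    """
    if measured <= 0 or predicted <= 0:
        raise ValueError("both BAC values must be positive")
    return math.log(measured / predicted)


if __name__ == "__main__":
    # Measured readings are the two peaks reported in the post;
    # the predicted values are made up for illustration.
    print(log_bac_ratio(0.099, 0.080))  # measured above predicted
    print(log_bac_ratio(0.102, 0.120))  # measured below predicted
```

A reading below the prediction means less alcohol is in the blood than the formula expected, so the body has disposed of it faster than the textbook average.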

So some logistics. BAC breathalyzer measurements were taken in six-minute blocks and all drinks taken during a block were attributed to the beginning of that block. The debate itself lasted about 90 minutes but the total time we assessed the BAC for SS was 114 minutes. A scribe/referee counted drinks and I went back after the debate with a transcript and checked the count to ensure accuracy. SS fasted prior to the session and did not otherwise consume alcohol during the experiment. In honor of America we chose an American beer that had an alcohol content of 4.8%. It was the weakest beer in my fridge.

So first of all, what were the subjects of discussion (again – using that term loosely) that earned the most drinks during the debate? The following graph shows that, on a per-drink basis, Clinton and Trump talked about China and Mexico a lot. Tax cuts were also a big subject of discussion during the economic portion of the debate. True to form, Clinton made SS drink on the stereotypical politician talking points like “middle class” or “small businesses”. True to form, Trump mainly made SS drink during the wildcard portion of the debate while discussing the Wikileaks scandal and making vaguely racial remarks. He was one mad hombre.

On the topic of jaw-drops, there were four in total during this debate. Clinton called Trump “Putin’s puppet”, which made Trump’s face look like it was about to burst from all the blood flowing to it. One for Clinton. Trump accused Hillary Clinton and Barack Obama of personally organizing a riot in Chicago during one of his rallies. Then he refused to say that he would accept the results of the election. Two jaw-drops for Trump. The moderator, Chris Wallace, also got a jaw-drop for casually mentioning that the candidates (because they hate each other so much) could not agree to closing arguments at the end of the debate. Kudos to Chris for getting on the board.

On a who-will-make-you-drink-more basis, we have conflicting results. In terms of number of drinks, Clinton won. In terms of volume of drinks, Trump won, mostly on the back of his jaw-drop drinks. Like the tortoise and the hare, Clinton was slow and steady, and Trump was all over the place. This tie-of-sorts seems consistent with their leadership styles.

Over the totality of the debate, SS consumed about 1700 ml of beer. His peak BAC occurred at the 96th minute at the end of the debate, thanks to Chris Wallace and his loud mouth. This peak level was 0.102 g/dl. The other BAC peak was a 0.099 g/dl reading in the 48th minute after a flurry of affair and sexual impropriety talk and then Trump accusing a sitting US president of organizing a riot.

It turns out that, at least relative to the Widmark prediction, SS clears alcohol well at high levels but is a lightweight at lower levels of consumption. As the debate continued into the 50th minute and the volume of beer increased dramatically, the log of the ratio of measured to predicted BAC dipped below zero. His liver was lazy at the beginning of the debate but started to kick into higher gear as the night wore on and more outlandish things were said.

So I guess we’ll give this contest to Trump as he did make SS drink the highest volume of beer. It’s a pyrrhic victory for the Donald but at least it’s a victory. Winston Churchill once said that you can always trust the Americans to do the right thing after exhausting all of the alternatives. I never thought one of those alternatives would be a presidential candidate who was a woman-groping, race-baiting, anti-intellectual, but here we are. I wish I had more of a profound ending to this post but it’s hard to be profound when you have to affirm that these attributes are disqualifying for a leader. But the Americans will do the right thing. That, or my liver will be working a lot harder over the next four years.

Friday 26 August 2016

Where is the most desirable place to do a medical residency (2016 edition)?

There is a cognitive dissonance between selecting students for admission and then producing doctors. The traits that make you a slam dunk admission are no longer all that important once you get into the program. You play violin in the symphony? You’re in! But you should really stop playing and study more. You’re an Olympic gymnast? Here’s your white coat! But there’s no time to practice, you should study more. You volunteer at an orphanage for victims of arson? Welcome to medical school! But you should really stop helping them and study more.

To some extent, medical schools don’t really care about what extra-curricular activities you salt your resume with as long as you breach a certain threshold. The signal they get from a padded resume is that you’ve been busy and you can remain busy and this is good because medical school is busy. Rather than having much of an interest in students with hobbies, they’re selecting for students who can tolerate a heavy work load. This theory seems far more consistent with what then happens in medical school, which, as one staff physician put it to me, is to “make all of you unique little snowflakes into snow”.

This homogenization process would manifest itself regularly in pre-rotation orientation sessions when the attending physicians would inevitably ask the group of senior medical students to tell the group about “one thing you like to do in your free time”. A common response was “I like to travel”, which I think was a reflection of how few extra-curriculars most of us had at that point and also how most pleasurable thoughts revolved around getting the hell away from medical school. The attending physician would then say some generic pleasantry about traveling and free-time and, without a trace of irony, pass out the call-schedule. Welcome to internal medicine/pediatrics/surgery/obstetrics, you’re working for the next fourteen days straight!

This lack of intellectual diversity became more of a problem when it came to the residency match. Everyone had gotten into medical school based on their grades (which were uniformly good in undergrad) and their extra-curriculars (which no one had any more because of medical school). So almost everyone was a generic medical student to the higher-ups. Medical school in Canada is also unique in that most residency programs don't care what your grades were in medical school as long as you pass, so there was little to distinguish medical students on this basis. I was an enormous beneficiary of this policy so I'm not sure I have a whole lot of grounds to malign it, but it did induce some perverse incentives.

Specifically, it induced medical students to "work together as part of the medical team". This may sound like a good thing when one of the deans says it, but it is more colloquially known as ass-kissing. Marks and special skills mattered far less and so it became more about who liked you when you worked for them. Since rotations were only several weeks at a time, intense, adoring flattery was the best way to snag a good reference or to gain favor with whatever program director. As it was another thing I wasn't very good at in medical school, I found the process extremely exhausting. It also gave a lot of power to people higher up the food chain in medicine. Medicine is a profession known for its abuses of learners, and sometimes you can see why when there are imbalances like the ones inherent in the match process.

Anyway, this culminated in the fourth year of medical school during the residency match, where universities across the country listed their preferred medical students. If you made the cut, you got into your program and you got the privilege of taking crap from senior physicians for a paycheque. About a year ago I had matched and they (probably) couldn’t fire me because I had a contract, so I decided to turn the tables on the medical schools and rank them.

As I know too many people in medicine, this was my most popular blog to date. Not even the sex one beat it, despite it being WAY more interesting, better-written and full of dirty word-play. To capitalize on the upcoming match process and boost traffic to my blog, I am releasing a new and improved medical school ranking.

*********************************************************************************

My ultimate goal last year was to base the relative rank of a medical school on the desirability of the school to medical students. I didn’t want to base the rankings on journal citations, academic staff, or anything tangible at all. I wanted to use the wisdom of the crowds to determine ranking. Places that do a better job at making their residents happy for whatever reason would attract more medical students. That may be because they are located in nicer cities, they are more academically impressive, or they treat their residents well. I don’t care how or why they are more desirable, just whether they are more desirable.

It's a little more difficult to assess desirability in a system where there is a cap on demand for spots. There are only so many residency spots to go around and when they fill up that's it. In a well-functioning market, desirability is something that can be revealed (arguably) by volume in a short-term sense and in price in a longer-term sense. This doesn't work with residency spots and so my logic last year was to see what universities seemed to attract the medical students from across the country. The argument for revealing desirability was that a desirable school is universally desirable and should attract a high volume of applicants as well as the best applicants from other medical schools. Undesirable schools would not receive as diverse an array of applicants and so this would show up in a limited number of outside medical students going to those schools.

Given the data available, I still think that this is the best way to rank medical schools, but the methodology I used was incomplete. This previous statistic for a medical school was essentially the average percentage of other medical school classes that went to the medical school in question. Although I hinted at the problems with this, I couldn’t come up with a solution and so I really didn’t do anything about it. I was also going to the University of Toronto and it was number one, so it was a result that coincided with my prior expectations.

The major problem had to do with the size of the accepting medical school as compared to the size of the donating medical school. When you have a large pool to accept into, you can take a large number of people from other medical classes. Since the University of Toronto has some 300-odd residency spots, it can easily accept 10% of the medical class from Memorial University. But Memorial is a small medical school and its residency class is similarly small. Memorial University couldn’t really take 10% of the University of Toronto’s medical class even if it was the most desirable medical school in the country. This puts smaller medical schools at a disadvantage relative to larger medical schools.

This year I tweaked the methodology to try and account for this. My small insight into this was to imagine what the allocation of medical students would look like if every medical school in the country was equally desirable. Medical students would be indifferent in going to either Memorial or Toronto or any other medical school for a residency. In this utopian scenario they would all apply to all of the universities and get accepted at roughly equal rates to each university (assuming a roughly similar distribution of talent across the medical classes). The result would be that each medical class would allocate a percentage of their medical students in proportion to the size of the residency class at the accepting medical school. Under these conditions, if the University of Toronto has a residency class that comprises 10% of the total spots across the country, then it should take 10% of the medical class from Memorial (and 10% from UBC, and 10% from Western etc.) If Memorial has a residency class that comprises 3% of the total residency spots across the country, then it should take 3% of the medical students from Toronto (and 3% from UBC and so on).

The way to measure desirability is then to estimate this baseline scenario and then to see how far real life deviates for each medical school. Deviations above the baseline mean that the medical school is more popular than it would be in a scenario where all medical schools were equally popular. Deviations below the baseline mean that the medical school is less popular than it would be in a scenario where all medical schools were equally popular.

The other little tweak to the model is to get a measure of the effective spots that a residency class consists of and that a medical school contributes to the total pool of medical students. This tries to acknowledge that a medical class consists of leavers and stayers. A medical student who stays at their medical school for residency means one fewer medical student from outside who can get into that class. A medical student who stays also means one fewer medical student in the pool of prospective medical students to go to other schools. I also added the unmatched spots into the pool of total residency spots available, which was something I did not do last year.

The statistic for ranking the medical schools is then based on the following set of equations. The utopian benchmark case where every medical school is equally popular is defined as:

For a medical school a, the utopian benchmark value B is the ratio between the effective spots at a medical school and the total residency spots available at all medical schools. The effective spots at a medical school is the difference between the total residency spots at that medical school (R) and the medical students from medical school a that go to that school for residency (r). Add to this the unmatched residency spots, U, at the university for a total number of effective residency spots at the medical school. The denominator is the total number of available residency spots across the country. This is all of the unmatched spots, plus the open spots at each university - the difference between each residency class and the number of medical students that stay at their home medical school for residency.

The actual case that we observe is described by the following relationship:

r is the number of medical students from school b who go to school a. Divide this by the total number of medical students available from medical school b which is the difference between the total medical students at school b and the number of them that stay at school b for residency (r).

The ranking statistic for school a is then the difference between the benchmark scenario, B, and the actual real life scenario A.

Add up all of these deviations for all of the medical schools in the country (excluding the medical school in question) and take the average and you get a measure of a medical school's desirability for residency.
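For anyone who would rather read code than prose, the steps above can be sketched in Python roughly as follows. This is an illustration of the description, not the exact calculation behind the rankings: the function name, the three schools X, Y and Z, and every number in the toy data are made up, and each class size is assumed to be simply the sum of where that school's students ended up.

```python
def desirability_scores(flows, filled, unmatched):
    """Average of (benchmark - actual) for each school.

    flows[b][a]  : students from medical school b who matched to school a
    filled[a]    : filled residency spots at school a (R)
    unmatched[a] : unmatched residency spots at school a (U)
    More negative scores mean more desirable, as in the post.
    """
    schools = list(flows)
    stayers = {s: flows[s][s] for s in schools}
    # Assumption: every student matched to one of the listed schools.
    class_size = {s: sum(flows[s].values()) for s in schools}
    effective = {s: filled[s] - stayers[s] + unmatched[s] for s in schools}
    total_effective = sum(effective.values())

    scores = {}
    for a in schools:
        benchmark = effective[a] / total_effective
        deviations = []
        for b in schools:
            if b == a:
                continue  # the school in question is excluded
            leavers = class_size[b] - stayers[b]
            actual = flows[b][a] / leavers
            deviations.append(benchmark - actual)
        scores[a] = sum(deviations) / len(deviations)
    return scores


# Entirely hypothetical toy data for three schools.
FLOWS = {
    "X": {"X": 50, "Y": 30, "Z": 20},
    "Y": {"X": 10, "Y": 30, "Z": 10},
    "Z": {"X": 15, "Y": 5, "Z": 30},
}
FILLED = {"X": 75, "Y": 65, "Z": 60}
UNMATCHED = {"X": 0, "Y": 5, "Z": 10}
SCORES = desirability_scores(FLOWS, FILLED, UNMATCHED)

if __name__ == "__main__":
    for school in sorted(SCORES, key=SCORES.get):
        print(school, round(SCORES[school], 3))
```

Sorting in ascending order puts the most desirable school first, since a school that pulls in more outside students than the benchmark predicts ends up with a negative score.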

***********************************************************************************

So onto the results, where some interpretation is warranted. First, the more desirable a medical school is for residency, the more negative its desirability score will be. This is a result of the difference between the baseline utopian scenario and what we observe in real life. A more negative number means that medical students are going to these universities above what we would expect. These universities are punching above their weight. A university that has a positive desirability score has a baseline utopian scenario score above what we observe in real life. It is a less desirable place to do a residency. A university that has a desirability score of zero means that the utopian scenario is roughly what we observe in real life.

Using this new methodology for the 2016 match (and 2016 match data from CaRMS), UBC places first and Laval places last. My own university, Toronto, places second while my old medical school, Manitoba, places second-last. Now onto the speculation. Why do these rankings look the way they do?

There are a number of obvious reasons as to why people go to places for residency that probably influence these rankings. First is the location. Like the 2014 rankings, the top schools are located in cities and towns with a reputation for being more exciting than the locations of other universities. They also are known for being universities with decent research and clinical reputations. Conversely the prairie universities and the more remote universities (NOSM, Memorial) are difficult to get to, cold, and are not known for their night life. They also, probably because of scaling issues, have less robust research reputations. This makes it difficult to attract and keep residents in these places.

As per last year, Quebec schools have a hard time attracting residents from across the country likely because of the language barrier. Montreal is the one exception here because it seems to be able to attract residents mostly from other Quebecois medical schools. McGill, as the one major English medical school in Quebec, ranks pretty low because you still need to learn French to go there and pay in Quebec is (still) shockingly bad for residents. The difference between a first year resident's salary in Quebec and in Ontario is over $10,000 and English speaking residents have better alternative options outside of Quebec. As a result, McGill suffers disproportionately in the rankings. Residents who speak exclusively French on the other hand, don't have an option to leave the province, which is why Montreal does so well.

Now, how have these rankings changed over the last year? I went back and repeated the same exercise on 2015 data (which was my match year) to see how these rankings have fluctuated.

Since last year there have been a couple of major changes. Memorial and Alberta have both dropped in these rankings. I suspect this may be due to the ongoing fallout from the low price of oil. Previously, the provincial governments could throw money at their medical schools and residents could count on a decent salary over the long run. Both provincial governments have indicated a need to rein in expenditures. Health is one of the bigger portfolios and thus a target for cuts. I suspect residents are pricing this fact in for this year's rankings. The exception to this is Calgary (also being in the province of Alberta) which had a largely stable ranking. The University of Calgary has one significant thing going for it, which is that it is located in neither Edmonton nor St. John's. Nevertheless, its ranking may start to collapse over the next couple of years unless oil prices rebound.

Montreal, Queen's, and to a smaller extent, McGill, climbed in this year's rankings. Neither Montreal nor McGill did anything to warrant the bump in rankings as their desirability scores stayed almost constant. They are beneficiaries of other universities falling in desirability. Queen's had a true measurable increase in its desirability score for reasons that escape me.

So I leave you with this little disclaimer. There is a certain wisdom in crowds revealing desirability, but it's wrong to say that this ranking suggests that a medical school will be a better place for your own residency. A lot of decisions go into picking a medical school for residency and the wisdom of the crowds is only one small input into that decision process. A plurality of medical students find that their best option is their own medical school. For every medical school other than Queen's and Ottawa, over 35% of the medical class stayed for residency and Ottawa matched about 30% of its class to its residency program. Queen's, for the third year running, matched the lowest percentage of its own medical students to its residency program at a dismal 14.5%. This in itself should give you some pause about the accuracy of these rankings. If medical students who have been at Queen's for four years have a near-universal interest in leaving, you have to wonder about the desirability of that residency program no matter what these rankings say.

The last point that I'll make is that it really doesn't matter what medical school you match to for a number of reasons. Thanks to accreditation, teaching and learning are largely standardized across programs. Each university has its own research strengths and agenda, but if you want to make research a part of your career, they will bend over backwards to accommodate you. Medical schools love research even if most of what is done is basically useless.

Finally, if you're worried about a medical school based on its location or quality of life, just remember this: for those of you going into a specialty program, it doesn't really matter because you'll all be looking at the inside of a hospital for the next five years anyway. And if you're going into family medicine and you get stuck at an undesirable medical school, you can just move away after two wasted years. Don't sweat it!

But life is funny. I didn't get into any one of the top three programs I applied to and looking back this was a blessing in disguise. If I had gotten my top program I would be in a different city and in a specialty program that would be entirely unsuited to my personality. Instead I got into a program that fits me perfectly. I didn't get the program that I thought I desired the most and it worked out for the best. It'll probably work out for you too.

Monday 25 July 2016

When do people have sex?

I’ve been offline for a couple of months due to a string of writing-unfriendly rotations. Months of emergency medicine, internal medicine, and obstetrics have an uncanny ability to destroy all vestiges of a life outside of medicine. I ended on obstetrics, and there’s nothing that I want to do less than write a blog after handling a screaming, poop-covered baby that has passed through a screaming, poop-covered woman. And then I got lazy. If I’m being honest, this last reason accounts for the majority of eight months of radio silence.

I’m venturing forth onto a pediatrics rotation and will be dealing with cantankerous, non-verbal, tiny animals for a month. Today I got urinated on, and it was only day one. So this post is going to examine when people make the mistake of making these little monsters. What are the daily and weekly patterns for getting down and dirty - when are people having sex?

People can be a little leery of researchers who ask them about their sex life. Data is scarce and probably pretty unreliable, especially survey data, because people get creeped out by the question. But luckily everyone is on the internet these days and the internet doesn’t lie. If you remember that bit of wisdom from legendary American Senator and angry old codger Ted Stevens, the internet “is not a big truck, it’s a series of tubes”. All of these tubes lead to Google.

Analysis of Google data is a big business, mostly for Google, which uses your search patterns to show you ads for whatever it thinks you might buy. Nobody embodies the ethos that “if you’re not paying for the product then you are the product” better than Google. So remember that the next time you’re searching for whatever weird stuff you may search for in the comfort of your own home. Google is watching, and waiting to sell you shovels or lawn darts or a guided tour of a cheese factory. They know your deepest darkest secrets.

But because Google likes to think of itself as a benevolent tech version of Big Brother, it also likes to sometimes make its search data available to the researchers who live in its Oceania. Economists have used this data to see the effect of racism on the election of Barack Obama: in the surprise academic result of the century, it lost him votes. They have also used it to show that economic recessions increase child abuse – a result that was actually a little more surprising because survey data had previously shown the opposite. Another paper that used Google data showed the effect of MTV’s show 16 and Pregnant on teenage pregnancy rates, which was to decrease teenage pregnancy. All of these papers have been published in top-notch economics journals including the American Economic Review, which is a journal that most economists would literally give some of their fingers to get published in.

More pertinent to this blog though is the field of “now-casting” using Google data. It's an effective real-time tool that can be used to answer questions about the world today. Google search data has been used to track the daily ebb and flow of stock prices. It has been used to predict when influenza outbreaks might happen (i.e., when people search for “do I have the flu?” more often). And I’m going to use it to show when people are knocking boots in Ontario.

So for this blog I use open-source Google Trends data. It's free and easy to download, but the nature of the data obscures interpretation. This is because of the way that Google formats its data. For a particular time frame and place, Google constructs an index for a group of search terms. The top-searched term in the time frame receives a score of 100. A term that receives half of the number of search queries of that top ranked term receives a score of 50, and so on. Because this scale is relative, we can only make relative inferences, not cardinal ones.
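To make the scaling concrete, here is a minimal Python sketch of the index as I've described it. The function name and the raw counts are made up for illustration; Google doesn't hand out raw counts, and its real normalization likely involves more steps than this.

```python
def relative_index(raw_counts):
    """Rescale raw counts so that the largest one becomes 100.

    raw_counts is a hypothetical mapping of label -> number of searches.
    """
    peak = max(raw_counts.values())
    if peak == 0:
        # Nothing was searched at all, so every score is zero.
        return {label: 0 for label in raw_counts}
    return {label: round(100 * count / peak) for label, count in raw_counts.items()}


# Invented numbers, purely for illustration.
example_counts = {"term_a": 800, "term_b": 400, "term_c": 200}
example_index = relative_index(example_counts)
```

Notice that doubling every raw count would give exactly the same index, which is why only relative comparisons are possible.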

For this post, I use two data periods for a set of Google Trends search queries made in the province of Ontario. The first data period encompasses a 90-day run from April 25, 2016 to July 22, 2016. The second data period encompasses hourly search data over the last week. How can we use Google search data to learn about the timing of bonking in Ontario? Again, it's unfortunately not as simple as asking Google Trends to return the search volumes for when people search for “sex”. When you have to search for the term “sex” on Google, you probably aren’t getting any. But what people might search for are the complementary goods necessary to have sex. When you buy hot dogs, you need the buns. When you go dirty dancing, you need the birth control.

So if people need birth control for sex, how can we use this to determine the timing of sex? To do this, we need the types of birth control that people are searching to buy around the time of doing the deed. Birth control that you might need before doing the deed is a condom. Birth control that you might need after doing the deed is plan b.

Stepping back a little here, what is the logic behind using Google Search data to “now-cast” aggressive cuddling? People search things when they are about to get intimate. For example, many people use condoms when they bump uglies. Sometimes this is a last-minute thing, so when people have more sex, they use Google to search for the closest source of condoms. So while there should be some baseline search volume for condoms, when these search volumes go up, we can make an argument that the rate of sweeping the chimney goes up as well. The same goes for plan b. Because of this, the search volumes for these two complementary goods should nicely bookend when people are organ grinding.

So without any further titillation, this is the long-run time series for the search terms “condom” and “plan b” in Ontario.

Long-run search volumes for "condom" and "plan b" in Ontario cycle together, and they cycle on a weekly basis. Notably, the peak search for "condom" was on May 8, 2016, or Mother's Day. Mothers everywhere in Ontario received more than just breakfast in bed that day. The peak in search volumes for "plan b" during this period occurred on July 16, the date of a large Guns N' Roses concert at the Rogers Centre (you may think that this is a completely spurious correlation but two out of the top seven Google Trends searches in Canada that day had to do with Guns N' Roses). I had thought that taking a date to see Axl Rose screech at you was the greatest form of birth control, but I stand corrected.

Since this data cycles so regularly, looking into the weekly data also reveals the day of the week people may be getting busy. The cycleplot groups the search volumes by each day of the week over the 90-day period observed. So for example, when looking at the x-axis in the following graph, all of the search volumes on Sundays are grouped together so that they can be compared to other days of the week. For simplicity, I have only included searches for "condoms" but the cycleplot looks similar for "plan b".
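For anyone who wants to try the grouping themselves, this is roughly what it looks like in Python. It is only a sketch: the function names are mine, and the two weeks of daily values are invented rather than taken from the actual Google Trends download.

```python
from collections import defaultdict
from datetime import date, timedelta
from statistics import mean

WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]


def group_by_weekday(daily_index):
    """Collect daily index values into one list per day of the week."""
    groups = defaultdict(list)
    for day, value in daily_index.items():
        # date.weekday() returns 0 for Monday through 6 for Sunday.
        groups[WEEKDAY_NAMES[day.weekday()]].append(value)
    return dict(groups)


def weekday_means(daily_index):
    """Average the index within each day-of-week group."""
    return {name: mean(values)
            for name, values in group_by_weekday(daily_index).items()}


# Two made-up weeks starting Monday, April 25, 2016 (illustrative only).
start = date(2016, 4, 25)
made_up_values = [40, 35, 30, 38, 50, 70, 90,
                  42, 37, 32, 36, 52, 74, 94]
daily_index = {start + timedelta(days=i): v
               for i, v in enumerate(made_up_values)}

means = weekday_means(daily_index)
busiest_day = max(means, key=means.get)
quietest_day = min(means, key=means.get)
```

A cycleplot is then just these per-weekday groups drawn side by side, so each day's values can be compared against the others.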

Over the 90-day stretch, search volumes for the term "condoms" peak on Sunday, then decrease throughout the middle of the week. The nadir is on Wednesday and search volumes begin to increase as the weekend begins. Not surprisingly, people most often engage in a little slap and tickle over the weekend. But the fact that this peak in nocturnal activity occurs on Sunday rather than Friday or Saturday is interesting. Date nights are supposed to be Friday and Saturday, so why are we seeing spikes in search volumes on Sundays? Looking at the hourly data explains this anomaly.

The short-term Ontario time series for search volumes of "condom" and "plan b" are shown below. Again, there is a significant amount of cycling in this data, with peaks occurring in the very early mornings. This explains the high search volumes for birth control on Sundays - people have their dates on Saturday night and the party continues into the early morning of Sunday.

Hourly cycleplots confirm that Google searches for birth control occur after midnight. Searches for "condom" peak at about 2AM. They decrease steadily until about 4AM and then are stable. They then rise again as the day approaches 11PM.

Searches for "plan b" follow similar dynamics throughout the day, but the peak in search volumes is displaced a little bit later. Peak search time for "plan b" occurs at about 3-4AM. Searches collapse past that point and don't begin to rise again until about midnight.

Again, it's notable that the rise in searches for "plan b" seems to follow the rise in searches for "condoms" just as one would expect if some afternoon delight happened in between. This data would then suggest that the peak time for assault with a friendly weapon occurs between 2AM and 3AM in Ontario.
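The bookending logic boils down to finding the busiest hour in each hourly profile and looking at the gap between the two. Here is a small Python sketch of that idea. The two 24-hour profiles are invented to have roughly the shape described above; they are not the real Google Trends numbers, and the function name is my own.

```python
def peak_hour(hourly_profile):
    """Return the hour (0-23) with the highest value in a 24-entry profile."""
    if len(hourly_profile) != 24:
        raise ValueError("expected one value per hour of the day")
    return max(range(24), key=lambda hour: hourly_profile[hour])


# Invented profiles indexed by hour of day (0 = midnight), illustrative only.
condom_profile = [70, 85, 100, 60] + [30] * 15 + [35, 40, 45, 50, 60]
plan_b_profile = [40, 60, 80, 100, 95, 30] + [20] * 18

before = peak_hour(condom_profile)   # hour when "before" searches peak
after = peak_hour(plan_b_profile)    # hour when "after" searches peak
window = (before, after)             # the activity is bookended by these hours
```

With real data you would feed in the hourly averages from the cycleplots instead of these made-up lists.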

A couple of things: this data suffers from some obvious limitations. First, you need an internet connection, which might skew the demographics towards younger people. Moreover, you might expect that this specific identification strategy only catches those people who are unprepared for laying pipe. This is a big limitation, as people who are unprepared may be late to the party and so the pants-off dance-off that we observe in the data might be happening later than when the average person might normally engage in rolling in the hay. Condoms and plan b are also goods that might be exclusive to certain types of sex and certain types of sexual partners. Nevertheless I think that it's pretty neat that we could even pick out this biased relationship from Google Trends.

I'll leave you with what is probably the most important thing to take from this post. I was able to fit over 20 euphemisms for sex into a 1700-word post. If that's not a successful blog, I don't know what is.