3/25/17

Landing a Data Science Gig in New York City

I wrote this on Thanksgiving Day (11/24/16), purging my thoughts while they were fresh, and I’m only now getting around to publishing it (lazy? I know). Oddly enough, 6 months later, I’m rewriting this through the lens of a hiring manager – my opinions have changed a bit.

To give thanks to all those who helped me through my journey, I wanted to write this guide to finding a data science gig in NYC specifically for the underdog - no advanced degrees. Job search articles are quite popular, but they’re mostly written by folks with the prototypical background: PhDs. When you have no advanced degree whatsoever, the adversities are quite different.

There’s definitely a stigma attached to folks without advanced degrees, and you have to prove yourself a bit more. Having had quite a few Q&A sessions with lead data scientists with such backgrounds, the short answer to how you stand out is a body of work: end-to-end data products and/or written analyses. Demonstrate that you know how to take a business problem and deliver value.

About interviews, people say “you’ve got to play their game,” which is true on some level. But you could also filter certain types of interviewers out upfront and focus on the ones that play to your strengths. There’s a sea of start-ups out there, so it makes sense to be somewhat selective depending on where you’re at experience/skill-wise.

I focused on interviews which were very representative of day-to-day work. Being an ex-quant, I already had strong practical skills.

To make sure I’m providing proper & relevant info, I’ve run this past a few hiring managers at major start-ups and job searchers with entry-level skills. I’ll keep repeating this, though: you could know everything through-and-through, have no imposter syndrome, and still struggle, which is a reflection of a harsh environment, not of your skills. So, be discriminating about what you internalize.

Some statements here may seem sweeping or biased, and that’s because they are. I’m only speaking from my experience and do not intend to cast a blanket statement over an entire population of interviewers.

This writing is comprehensive. Some parts you may find redundant, in which case it should serve as reinforcement to pre-existing knowledge, boosting self-confidence. Other parts you’ll find original and insightful (hopefully).

Bay Area vs. NYC

NYC is oriented towards salespeople and relationship managers. That culture permeates the community and is even reflected in data science interviews.

The traditional mindset of interviewing–

In a nutshell, the evaluation is more perception-based than on the west coast. There’s certainly controversy around how best to evaluate a problem-solver, and perception-based interviews are a terrible way to do it - bodies of research show this. That said, they’re unsurprisingly effective for sales recruiting. So in this regard, Bay Area wins.

Verbose=True

In regular conversation, it’s good practice to limit your answers to less than 60 secs to keep your audience engaged. That rule goes out the window for these types of questions. Err on the side of verbosity. Be able to speak “at length” about a technical subject. It’s supposed to demonstrate your rigor & interest level. Though you very well could be repeating words memorized from a textbook/blog.

Bay Area is oriented towards engineers. Having lived in LA as well, I’d also extend this to the rest of California. Intellectual honesty is much more appreciated - they care about what you can do. It’s preferred to admit your flaws and shortcomings skill-wise upfront. Whereas in NYC, you’re always in a mode to “sell yourself” and admitting flaws might be seen as weakness. Which makes me wonder, do people want to hire good interviewers or practitioners? …A topic for another article.

Other distinctions are noted in each section below.

Resume

Courtesy of all the publicity around purportedly high salaries of Data Scientists, everyone wants to be one!

Unless you fit the mainstream mold, it’s difficult for hiring folks to distinguish you from the rest of the applicants. So work hard on making your resume distinct! It matters twice: i) at application submission and ii) in networking.

Project structure

Each project from your past jobs should follow a consistent structure:

__Item Conversions__  
- (Problem) Modeled conversions and user engagement for items on our platform. Previously thought of as an unsolvable problem due to sparsity and low observation count.  
- (Solution) Augmented dataset from disparate sources which created strong classification & regression models. Generalized by feature engineering, grid-searching hyper-parameters, and PCA.  
- (Impact) Increased conversion rate by 18%. Write-up analysis increased understanding for decision-makers.  
- (Used) _Sklearn, NumPy, Hadoop, Hive._

Start off verbose, then extract the salient points to make it concise – use these points in your resume.

This structure not only helps recruiters & hiring managers parse the relevant points as they read, but it also helps you frame your thoughts during the interview. As you write, always keep this in mind: Less is more.

About the “Thou shalt not break the 1 page resume” rule: Fuck that rule. But be extra mindful of the sequence of topics and projects. First thing a recruiter/hiring manager sees should be most relevant to the skills & experience they seek. For me it was– i) technical skills (2 lines) ii) my prior experience as a Data Scientist iii) my experience as a quant

This arrangement will pique their interest after just a quick glance, enticing them to dig into the subsequent pages.

Projects sheet

Showcase your projects with links accompanied by a brief description (same structure mentioned above). Links should ideally be to a blog post and a GitHub repo. Note: people like to see the end-product, not sift through code on GitHub.

Do not let the code do the talking for you. Use visualizations and qualitative summaries to convey your findings – communicate them succinctly.

Cover sheet trick

If you’re really interested in a particular job, do the following: i. Make a T-chart. ii. List the requirements from the job spec on the left side and, on the right side, corresponding examples of past work where you’ve demonstrated each requirement. iii. Include this as the first page of the resume so it doesn’t get lost.

Sourcing interviews

Initially casting a wide net and actively narrowing it with your preferences is a reasonable approach, and it’s what most people do, including myself. As you go through each and every interview, make sure to ask deep questions about fit and be highly reflective about your performance post-interview. A post-mortem should be a standard part of your process.

Outline what you liked/disliked about the role/company and how your performance could’ve been better, etc.

NYC’s quite different from SF in that its job market is highly driven by networking (which is unfortunately biased towards extroverts). On the bright side, it’s a great place to learn a skill that pays a lifetime of dividends.

Why is networking superior to application/resume submission? All jobs on the market are available through networking; the jobs available through application/resume submission are a subset. Only companies that are properly staffed and have a good recruiting infrastructure are equipped to handle the volume from an application portal. For example, at a lean start-up the recruiters only have so much time to allocate to certain requisitions. In our case, hiring engineers took priority over hiring a data scientist, which made my role exclusively available through networking.

Cold email outreach

Use LinkedIn and Rapportive to figure out someone’s email address (work) and write them a message about your inquiry. GitHub is another great resource for capturing email addresses (personal/work).

Opening email format–

Who you are in 2 sentences. Sell yourself a bit here. It should be clear that you’re someone worth talking to.

State what you want from them. Try and be as specific as possible. For example, “I saw your talk on conv nets and wanted to ask about the optimization algorithm you were using e.g. Adam, SGD? Would you have a sec for a few questions?” (2 more sentences)

Overall, it shouldn’t take anyone longer than 60 secs to read. Be succinct, clear, and to the point.

People often want to help out but usually don’t know how. By being specific, you’ll notice a positive reaction more often than not in your outreach.

Traditional job boards

Meetups

Which meetups?

What to do at meetups? Show up early to meet with the host and others. The host is a hub – a popular person – it’s worth investing in that relationship. He/She may also have helpful advice along with knowing about companies that are hiring.

The preliminary stages of the meetup are often reserved for folks looking to hire/network; if that’s not the case, just go talk to folks after the talk.

Approaching people: Start with someone who’s not talking to anyone, just to warm things up. Have a goal of meeting 4 or so people per meetup, whether or not they’re beneficial to you. Prepare a canned intro of yourself (background and job search interest) and always meet people with the intention of seeking advice or getting to know them vs asking for a job outright.

An easy way to start a convo: Simply stick out your hand and introduce yourself and ask “so what’d you think of the talk?”

The types of meetups you go to are very important: beginner topics will have a beginner audience.

How to get noticed versus the other job searchers: have your 60 sec intro story down.

Note: if you’ve been studying, it’ll show in your interactions with people because more often than not it’ll be the subject that’s top of mind. Talking in depth about a nuanced but valuable subject can be very polarizing.

Have one good recruiter

I used: “clutch talent”

Interview Questions

Questions for Recruiter

This is a chance to save yourself a bunch of time and frustration, plus the opportunity cost of not interviewing with a better company. So, it’s pretty important you take this call seriously.

Have they ever hired a data scientist before?

Data Science is still pretty nascent, consequently lots of folks have gotten into roles with very little experience in the real-world, only being validated by their credentials. In short this means there are a lot of junior hiring managers.

As such, they might have very biased vetting processes and ultimately end-up building poor teams which you don’t want to be a part of.

The issue is that interviews can be very disjointed from the day-to-day tasks of a Data Scientist, so it makes sense to deliberately study for interviews as much as possible.

Behavioral

“What do you want to do?” i. State what you’re actually interested in doing day-to-day, e.g. “Come up with data science or engineering solutions to a business problem.” ii. Conclude with how it contributes to their needs. Assuming it’s a fit, I’d couch (i) in the context of their needs and end with how you’d like to make a contribution to them. This shows that you’re ultimately looking to be a team player. You don’t want to come off like someone who’s only “out for themselves.”

“Tell me about yourself?”

“What have you been up to?”

“What’s your favorite algorithm?”

On communicating “the approach” of a problem

Theory-based interviews (BUZZWORD BINGO)

ML must knows–

Feature Selection–

Dimensionality Techniques–

*know these down to the math

Elements of Statistical Learning (ESL)–

Watch these videos first on 1.5x

  1. Linear Regression
  2. Linear Classification
  3. Regularization
  4. Boosting
  5. SVM
  6. RF
  7. Ensembling

Obviously read some areas deeper than others, but be especially mindful of the vocabulary being used. This bolsters your communication skills.

CS algo–

I didn’t come across too many of these, though the question that got me my job was the [word-break problem](http://www.geeksforgeeks.org/dynamic-programming-set-32-word-break-problem/)
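For reference, here’s a minimal sketch of one common dynamic-programming approach to that problem: given a string and a dictionary of words, decide whether the string can be split entirely into dictionary words. The function name `word_break` and the sample inputs are my own illustration, not necessarily what an interviewer would hand you.

```python
from typing import Iterable


def word_break(s: str, dictionary: Iterable[str]) -> bool:
    """Return True if `s` can be segmented into words from `dictionary`."""
    words = set(dictionary)  # set gives O(1) average membership checks
    # can_break[i] is True when the prefix s[:i] can be fully segmented.
    can_break = [False] * (len(s) + 1)
    can_break[0] = True  # the empty prefix is trivially segmentable
    for end in range(1, len(s) + 1):
        for start in range(end):
            # s[:end] works if some shorter prefix works and the
            # remaining piece s[start:end] is itself a dictionary word.
            if can_break[start] and s[start:end] in words:
                can_break[end] = True
                break
    return can_break[len(s)]


# Illustrative inputs (hypothetical, chosen for this sketch)
print(word_break("ilikesamsung", {"i", "like", "sam", "sung", "samsung"}))  # True
print(word_break("catsandog", {"cats", "dog", "sand", "and", "cat"}))  # False
```

The nested loop makes this O(n²) substring checks for a string of length n, which is usually what you’re expected to reach before discussing further optimizations.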

Probability Theory and Intro Stats–

This should cover dice-rolling and coin-flip questions.
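A handy way to check your answers while practicing these is to enumerate the outcomes by brute force. The sketch below is only an illustration: the helper `probability` and the two sample questions (two dice summing to 7, at least two heads in three flips) are my own assumptions, and it only works when all outcomes are equally likely.

```python
from fractions import Fraction
from itertools import product


def probability(outcomes, event):
    """Exact probability of `event` over equally likely `outcomes`."""
    outcomes = list(outcomes)
    favorable = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(favorable, len(outcomes))


# All 36 ordered results of rolling two fair six-sided dice.
two_dice = product(range(1, 7), repeat=2)
p_sum_seven = probability(two_dice, lambda roll: sum(roll) == 7)

# All 8 ordered results of flipping a fair coin three times.
three_flips = product("HT", repeat=3)
p_two_plus_heads = probability(three_flips, lambda flips: flips.count("H") >= 2)

print(p_sum_seven)       # 1/6
print(p_two_plus_heads)  # 1/2
```

In the interview itself you’d derive these by hand; the enumeration is just a way to verify your reasoning while you study.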

Practical based interviews

SQL–

Bash–

Experimental design and A/B testing

Take-homes–

Being a practitioner, this is where I shine - I had nearly 100% conversion to an onsite. Take-home assignments take more time, but they’re a wonderful way to build your skills and are usually quite enjoyable (assuming you love practicing data science and the take-home material is mature).

As you do these, move all helper functions into a central toolbox. This helps you get through subsequent take-homes quickly, and it lets you hit the ground running when you actually land a job.

Summary sheet at the very top

Get rid of the noise! Abstract most code away into a helpers file and suppress all warnings. In the helper file, use descriptive variable names that make sense. Include docstrings in your helper functions and practice clean-code principles. Bonus for unit tests for all of your helper functions.
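To make that concrete, here’s a rough sketch of what one entry in a helpers file could look like. Everything in it is hypothetical (the module name `helpers.py`, the `conversion_rate` and `quiet_warnings` functions, the test class); the point is only the shape: descriptive names, a docstring, one place to silence warnings, and a small unit test per helper.

```python
"""helpers.py - hypothetical toolbox module for take-home notebooks."""
import unittest
import warnings


def conversion_rate(conversions: int, visits: int) -> float:
    """Return conversions / visits, or 0.0 when there were no visits.

    Raises:
        ValueError: if either count is negative or conversions > visits.
    """
    if conversions < 0 or visits < 0:
        raise ValueError("counts must be non-negative")
    if conversions > visits:
        raise ValueError("conversions cannot exceed visits")
    return conversions / visits if visits else 0.0


def quiet_warnings() -> None:
    """Silence Python warnings so the notebook output stays readable."""
    warnings.filterwarnings("ignore")


class ConversionRateTest(unittest.TestCase):
    """Unit tests for conversion_rate."""

    def test_typical_case(self):
        self.assertAlmostEqual(conversion_rate(18, 100), 0.18)

    def test_no_visits(self):
        self.assertEqual(conversion_rate(0, 0), 0.0)

    def test_rejects_impossible_counts(self):
        with self.assertRaises(ValueError):
            conversion_rate(5, 3)
```

Assuming the file is saved as `helpers.py`, the notebook then only needs `from helpers import conversion_rate, quiet_warnings`, and the tests can be run with `python -m unittest helpers`.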

Talk a big game about your later approach. Don’t be afraid to wow them here.

The purpose of this section is to demonstrate that you’ve scoped problems thoroughly before diving in and you can communicate succinctly.

Seeking guidance

Whenever you’re stuck on any part of this job search phase, ask for help. People are very much willing to help you out provided they can. What you can do is be very specific about where you need help.

I need help with X. I’ve tried a, b, c but it’s gotten me nowhere. Any tips?

This shows– i) you’ve put in effort. ii) specifically where you’re stuck.

Also know when advice is bad or not relevant. This is a tough thing to do, but don’t naively take everything at face value and follow instructions; always question its validity.

Staying motivated

Optimal weekly schedule

Other helpful resources

I hope you found this post helpful; it took me a long-ass time to write. Best of luck!