Dungeons & Data Science: Slaying Your Machine Learning Interview
Your guide to ML interviews in these troubling times
This week I have a special edition for you. Given the current climate of the tech industry, I wanted to share some practical advice on getting ready for machine learning interviews. As always, I’m grateful for your time and attention.
This Week on Gradient Ascent:
ML Interview special 🤝🧑🏫
The AI Interview Survival Guide:
Preparing for tech interviews these past few years has become like pursuing a Ph.D. program in many ways. There's so much to read and absorb, no clear path with many open-ended questions, and never enough time.
What's worse is that the modern interview process is a one-size-fits-all approach that disregards candidates' unique backgrounds. Instead, it tries to shoehorn everyone through the same weird funnel. So whether you're an alligator, elephant, or gazelle, you'll need to show that you can fly before you can show your true strengths.
Sounds gloomy, doesn't it?
Fear not, I've got you covered.
This isn't one of those guides asking you to study everything under the sun. Instead, I'll focus on dispelling common anti-patterns in each area that comes up in interviews, from startups to big tech. In terms of scope, this guide doesn't focus on data engineering and infrastructure-type roles. If you're applying to be a machine learning engineer, data scientist, research engineer, or research scientist, this is for you.
Disclaimer: This is based on my past experience. YMMV.
Anatomy of ML Interviews:
The "Loop"
Recruiter screen
Timed coding test*
1-2 technical phone screens
Take-home assignment*
Onsite loop
1-2 Coding rounds
1-2 Design rounds
1 Specialization round
1 Behavioral round
Offer conversations
* - Common with startups, less so with big tech. Sometimes these rounds replace a technical phone screen.
Data Structures & Algorithms
When is it asked:
Pretty much in every round except perhaps design and behavioral rounds. The objective of this round is to see that you can "actually code". However, it's morphed into a skill show where you usually need to code vomit two "medium" or "hard" level problems within 40 minutes while explaining your thought process. Easy, huh?
Anti-pattern(s):
Ever heard the phrase "Grinding Leetcode"? No? There are actually guides for it. Simply put, sacrificing your spare time and mindlessly engaging in solving coding problems will not get you anywhere. All over LinkedIn and sometimes Twitter, I see folks bragging that they've solved hundreds if not thousands of problems from Leetcode (LC). But to what end? This approach almost always ends in burnout.
Summary:
Avoid focusing on quantity
Cover different areas proportional to how frequently they're asked in interviews
Avoid memorizing solutions
Do this instead
Despite the common expectation "You need to solve 2 medium-hard LC problems in 40 minutes", grinding Leetcode won't get you far. Here's why: the problems you're given will rarely be the same as the ones on Leetcode. Yes, they'll be very similar, but a good interviewer will simply change the constraints. If you merely memorize problems, you'll code up a perfect solution to the wrong problem.
Instead, work on identifying patterns and tie them to common data structures or algorithms. Learn how to use clarifying questions to narrow down the possible solutions. For example, if your interviewer says "I need O(1) access but I'm ok with using extra space", that should immediately bring hashmaps to mind as one of the possibilities.
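To make the hashmap hint concrete, here's a minimal sketch in Python. The problem itself (find two numbers that add up to a target) is just an illustration I picked, not something from a specific interview. The point is the trade: extra space for a dictionary buys average O(1) lookups, which turns a nested-loop scan into a single pass.

```python
def two_sum(nums, target):
    """Return indices (i, j), i < j, with nums[i] + nums[j] == target, or None."""
    seen = {}  # value -> index where that value was first seen
    for j, value in enumerate(nums):
        complement = target - value
        if complement in seen:  # dictionary lookup is O(1) on average
            return seen[complement], j
        if value not in seen:
            seen[value] = j
    return None


print(two_sum([2, 7, 11, 15], 9))
```

If the interviewer then changes the constraint to "no extra space, but the input is sorted", the same problem calls for two pointers instead. That's exactly why recognizing the pattern beats memorizing the answer.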
To build these skills, spend time with a pencil and paper, working out problems from different topics. Focus on problem-solving, not on coding. Once you have a grasp of that, then focus on implementing it in code.
Power tip 1: Use Google Docs to practice coding. It's much harder to spot and correct mistakes without syntax highlighting. So if you practice in a harder environment, you'll get better at avoiding these mistakes in the first place. You might say "But Sairam, what about auto-correct and other annoying features that prevent me from writing code on Google Docs?" That's why I have…
Power tip 2: Here is how you disable all that to have a nice peaceful coding environment on Google Docs.
Power tip 3: Use ChatGPT! If you haven't tried it yet, use this language model as a tutor to explain problems to you. If you're stuck, type in the problem statement and ask it to solve it. ChatGPT usually explains how it arrived at a solution, so it's perfect for teaching you how to approach a similar problem. That being said, don't trust it blindly. It can hallucinate 😉.
Takeaway: Solving 100 problems (not just easy ones) and covering a broad variety in the process is significantly better than grinding 500 popular ones without context.
Resources:
Besides the obvious existence of Leetcode, here are a couple worth checking out:
Elements of Programming Interviews (EPI) - Avoid the "Cracking the Coding Interview" book. It's a fantastic book if you're just starting, but EPI is much closer to the kind of difficulty you'll see in tech interviews. Use it instead.
Grind/Blind 75 - Use a curated list like this to prepare in a methodical fashion.
System Design
When is it asked
Usually, this is reserved for the onsite loop. I've had a couple of friends who told me they were asked about System Design concepts during a phone screen, but those are outliers.
What is it about
There are two types of design interviews: Scalable System Design and ML System Design. While these might seem like the same thing, they focus on different competencies. Scalable system design interviews focus on building a system that scales to millions of users (e.g., design YouTube, design an autocomplete system). There usually isn't a perfect solution, so the focus here is to show how you'd handle tradeoffs. These types of interviews are the "elephant flying" interviews I alluded to earlier. These skills are learnt by building large systems at work, yet a lot of folks aren't exposed to these types of problems at university or even at work. Still, this interview is always a part of the loop. So swallow the bitter pill and prepare for it.
ML Design interviews also focus on large-scale systems. But the crucial difference is the emphasis on how you design the ML side, from data prep and feature engineering through modeling, validation, and so on. Common examples are "Design a movie recommender", "Design a frequently listened to playlist for Spotify", and so on. The ML component you design is a single cog in a bigger system, so it's important to design it so that it helps the overall system. The tradeoffs here are about how your ML system helps solve the problem while still meeting the latency or other criteria specified by your interviewer.
Unfortunately, these rounds play a big role in determining your seniority. Positions at senior level and higher require solid performances in design rounds. That's the way the cookie crumbles. Best to enjoy it with milk.
Anti-pattern(s):
Since a lot of folks don't have hands-on exposure to designing these systems, they tend to overprepare: they read every book under the sun, watch all the videos on system design, pore over research papers, and more. Stop grinding DDIA (not to be confused with DDLJ). This leads to memorizing, which in turn leads to interview failure.
Do this instead
First, accept that no matter how much you slog, there's always a new problem that can be thrown at you. So, focus on the ideas and why they're used. For example, if you need to use consistent hashing, understand why it's needed in the first place and what problem it solves. Second, these systems have been built by hundreds of engineers over several years. As good as you might be, thinking you need to build that exact system in 40 minutes is a sure-shot path to failure.
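To illustrate the "understand why" point with consistent hashing: if you place keys with `hash(key) % N`, changing the number of servers N remaps most keys at once. Consistent hashing puts servers and keys on the same ring so that adding or removing a server only moves the keys that lived on that server. Below is a minimal sketch of that idea. The class name `ConsistentHashRing`, the `replicas` parameter (virtual nodes per server), and the choice of MD5 as the hash are all my assumptions for illustration, not a reference implementation of any particular system.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Any stable, well-spread hash works here; MD5 is only an illustrative choice.
    return int(hashlib.md5(key.encode()).hexdigest(), 16)


class ConsistentHashRing:
    """Hypothetical minimal hash ring with virtual nodes."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas  # virtual points per node, smooths the load
        self._ring = []           # sorted list of (hash, node) points
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        for i in range(self.replicas):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove_node(self, node):
        self._ring = [(h, n) for h, n in self._ring if n != node]

    def get_node(self, key):
        if not self._ring:
            raise ValueError("ring is empty")
        # First point clockwise from the key's hash, wrapping around at the end.
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        if idx == len(self._ring):
            idx = 0
        return self._ring[idx][1]


ring = ConsistentHashRing(["cache-a", "cache-b", "cache-c"])
keys = [f"user{i}" for i in range(1000)]
before = {k: ring.get_node(k) for k in keys}
ring.remove_node("cache-c")
moved = sum(1 for k in keys if ring.get_node(k) != before[k])
print(f"{moved} of {len(keys)} keys moved after removing one node")
```

In this sketch, the only keys that move are the ones that were on the removed node. Being able to explain that property, and contrast it with the modulo approach, is worth more in an interview than reciting the data structure.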
Here's what worked for me: reducing scope through clarifying questions. A design problem can be breadth-based (many components to cover but with little depth), like designing a ride-share system such as Uber. Or it can be a deep dive into a single component, like designing an autocomplete system or URL shortener. If you tease this information out by asking the right questions, you avoid a thousand wrong rabbit holes to dive into.
For ML design interviews, bias more towards the ML part of the system. There can be a temptation to talk about scaling the system and so on, but focus more on the ML aspects. After all, this is one of the areas where you really can shine and show your strengths.
Finally, look at apps and services you use on a daily basis and try to deconstruct how they work. What happens when you order something on Amazon? How do you get restaurant recommendations on Yelp? Tie these problems back to the concepts and you'll remember more!
Resources:
Besides the DDIA book I referenced above,
- has an excellent series of books, videos, and a newsletter you can check out for design interviews.
System Design Primer: This repository is pretty good when it comes to some commonly asked themes.
ML System Design: This is a great guide to review when preparing for ML design interviews.
Stanford's ML System Design course: This is one of the best resources to learn the fundamentals of ML systems in production.
ML Fundamentals
When is it asked
Everywhere - right from the phone screen to icebreakers to full interview rounds. This is the heart of what you need to prepare for in my opinion. There is a wide range of things to cover but almost all of these fall under the "fundamentals" bucket.
What is it about
These types of questions (and rounds) focus on your core ML skills. There are too many things to enumerate here but to give you a taste, here's a non-comprehensive list:
Basic probability and statistics
Basic linear algebra and minimal calculus chops
Fundamental ML models like
Linear regressors and classifiers
Decision trees & Random forests
Boosted trees
Clustering & Dimensionality reduction
ML concepts like:
Bias & Variance
Handling overfitting, underfitting
Regularization
Loss functions
Metrics
Feature engineering
The questions you can face here range from simple "Do you know?" to "Implement K-means clustering in Python". Unfortunately, these questions aren't standardized the way coding rounds are, so it's really hard to pinpoint what might be asked.
Anti-pattern(s):
Avoid trying to implement everything from scratch. It takes too much time and the ROI is minimal. Don't overprepare on the math side of things either. If you know how Bayes' rule works, and how vector and matrix math and basic differentiation (chain rule) work, you're probably good for 80% of the interviews you'll face.
Additionally, avoid trying to use multiple textbooks and courses to prepare for this. You'll just procrastinate trying to figure out which one to use first. Most importantly, don't disregard preparing for these types of interviews in favor of coding or system design. These skills are what you're being hired for. Make sure you are well-oiled in this domain.
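As a gauge of the math level mentioned above, here's a small worked example of Bayes' rule in Python. The scenario and all the numbers (a 1% base rate, a test that catches 95% of true cases and wrongly flags 5% of healthy ones) are made up for illustration. It computes P(condition | positive test) = P(positive | condition) · P(condition) / P(positive).

```python
def posterior(prior, true_positive_rate, false_positive_rate):
    """P(condition | positive) via Bayes' rule."""
    # Total probability of a positive result: true positives + false positives.
    evidence = true_positive_rate * prior + false_positive_rate * (1 - prior)
    return true_positive_rate * prior / evidence


# Illustrative numbers only: 1% prevalence, 95% sensitivity, 5% false positives.
p = posterior(prior=0.01, true_positive_rate=0.95, false_positive_rate=0.05)
print(f"P(condition | positive) = {p:.3f}")
```

With these numbers the answer comes out to roughly 16%, far lower than most people guess, because the base rate is so low. If you can set this up and explain the result, you're at about the level of fluency described above.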
Do this instead
Pick a book or a course that works for you and work through the assignments along with the lectures. If you're a student looking to transition to the industry, you might already be doing this in school. If you're a working professional, you'll need to use this approach to refresh your knowledge. There are some commonly asked ML coding questions (K-means, KNN, linear/logistic regression, etc.). Practice implementing those. Remember, do the 20% that will get you 80% of the way there.
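For a sense of what "implement K-means" can look like, here's a minimal sketch of the standard assign-then-update loop (Lloyd's algorithm). I've kept it in pure Python so it runs without dependencies; in an actual interview you'd likely be allowed NumPy. The function names, the `seed` parameter, and the random initialization are my choices for illustration.

```python
import random


def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def kmeans(points, k, max_iters=100, seed=0):
    """Cluster a list of equal-length tuples; returns (centroids, assignments)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from k distinct data points
    assignments = None
    for _ in range(max_iters):
        # Assignment step: each point goes to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:  # nothing changed, so we've converged
            break
        assignments = new_assignments
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_point(members)
    return centroids, assignments


data = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(data, k=2)
print(labels)
```

Expect follow-up questions on the parts this sketch glosses over: how to initialize (e.g. k-means++), what to do with empty clusters, how to pick k, and why the result depends on the starting centroids.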
Resources:
Machine Learning: A Probabilistic Perspective: This is a great book to use as a refresher (don't read it front to back!)
Prof. Andrew Ng's ML course: Great way to learn from video!
Specialization
When is it asked
Usually onsite rounds and maybe during a phone screen. This is based on your project work and area of specialty. If you're a student, then this could be related to your course projects or internship work. For a working professional, this usually relates to projects listed on your resume.
What is it about
This will be a deep dive into your work. The interviewer will probe the depth of your technical knowledge in your area of expertise (computer vision, NLP, recommendation systems, RL, etc.). They can ask you to dive into the details of a project you did that's relevant to the company, or ask you core concept questions related to your expertise, or have you code up a solution after giving you a dataset to work with. If you are a Ph.D. student or hold a Ph.D., this could also be a presentation of your research work. The goal of this round is to test your ML breadth and depth.
Anti-pattern(s):
Resist the urge to learn everything in your specialty. Don't try to learn a new area of specialty from scratch just because the company you're interviewing with works a lot in that area. Have a broad understanding of where the field is, and know how things work, but dive deeper into your projects.
Do this instead
Use your resume as a guide. Pick the projects listed there and work backward. You should spend a disproportionate amount of time and effort learning the tech stack, the algorithmic details, and the tradeoffs of your projects. Oftentimes the company trying to hire you has a specific need and they've seen something in your profile (resume!) that can help them with it. So find the areas of common intersection and double down on those. Plus, if you speak in great depth and with authority on your work, that sends a strong signal that you know what you're talking about.
Anyone can learn something from an online source and wax lyrical about it in an interview. Only you can talk about your work in depth. Leverage that.
Resources:
There are many here, but I'll list a couple below:
Computer Vision - This is a go-to for anyone learning modern computer vision.
NLP - Ditto, but for NLP.
RL - The gold standard for learning RL.
Deep Learning - The only resource you need for deep learning.
Culture Fit
When is it asked
Everywhere, sometimes not explicitly. In addition to your technical skills, your people skills matter. That's also why there's an entire interview round in some companies dedicated to evaluating this.
Anti-pattern(s):
Avoid underestimating this. A lot of folks have top-notch technical chops but are terrible people to work with daily. Your interviewers want to know that you're a good colleague, someone they can grab a coffee with. As good an orator as you might be, you can't improvise clear narratives under the pressure of interviews. These rounds can sink you even if you aced your technical rounds.
Do this instead
Think of your past achievements, and prepare clear stories to walk through them. Everyone recommends the S.T.A.R. (Situation, Task, Action, Result) approach, so see if that works for you. Practice your responses to situational questions with a friend or over some mock interviews. Use the time to reverse interview your interviewer and learn about the company and team through their eyes.
The preparation pyramid
Finally, not everyone starts off at the same level of competence for interviewing. So the question "how long will it take before I'm ready" is very hard to answer. In fact, you'll never feel ready. At some point, you'll just have to jump in.
Here is a typical working professional's competency readiness at the start of their preparation. Next to it is what the companies test you most on. The goal is to create a similar chart for yourself and then plan out what you need to work on.
This might be a contrarian take, but I believe maximizing and doubling down on your strengths will help you more than trying to eliminate your weaknesses. I don't mean completely avoiding an area of weakness. Raise your floor. But blow out your ceiling. That will help you stand out.
Overall Resource:
This book from Chip Huyen is fantastic for ML interview prep. Definitely check it out.
I hope this guide was useful to you. These are crazy times and I wish you the best in your interview preparation journey. Is there something I missed out on above that you've found useful in your preparation? Drop me a comment below.
For system design resources:
1. Engineering blogs of big companies like Facebook/Meta, Amazon, Uber, Netflix, Google, etc. are great.
2. YouTube channels like Interviewready by Gaurav Sen
3. System Design Interview Vol 1 & 2 by Alex Xu (Book)
And thank you so much for this newsletter. :) I really needed it.
That's a lot of valuable advice for job seekers, in general. Nice! And thanks for sharing.