ML Code Exercises | Modern Orange

221
64
mchab
1 year ago
deep-ml.com

TrackerFF
·
1 year ago
·
[ - ]

No one (well, few professionals at least) will reinvent the wheel when it comes to standard scientific computations and methods. Like numerical math, linear algebra, etc.

Looking through the problem sets in the link, the majority seems to be asking for just that.

If you're wondering whether or not someone knows how to transpose a matrix, or find the eigenvalues, let them do that on the whiteboard. No need to leetcode-ify such problems, because with 99.99% probability they'll provide you with solutions that are subpar compared to industry standard packages. There's more than time and space complexity when it comes to these problems.

EDIT: Also, you'll potentially lose a lot of high-quality candidates if you suddenly start to test people on methods they haven't worked with or seen in quite a while.

If you ask something like "please show us the equations for a support vector machine, and how you can compute an SVM" you could fail even world-class ML scientists, if they haven't touched those for 10 years. Which is a very real possibility in the current ML scene.

I'd say that almost every ML interview I've had, or been part of, has been more of a big-picture whiteboard interview. Specific programming questions have ranked quite low on the list of priorities.

mchab
·
1 year ago
·
[ - ]

I really enjoyed Andrej Karpathy’s Zero to Hero videos, and I like the idea that you don’t know something until you build it, so I made this site. I probably should have come up with a better title, because it is meant as a learning tool, not as interview prep like Leetcode.

seanc
·
1 year ago
·
[ - ]

First off, thanks! This does look like a fun way to learn things.

Secondly, FWIW, when I read the term 'exercises' in the HN title I interpreted that to mean exactly a learning tool and not interview prep. The term "Challenges" in the website title is maybe a little less specific.

jvanderbot
·
1 year ago
·
[ - ]

I can appreciate that. It wasn't until I implemented a few matrix factorization routines that I appreciated the decisions that go into Eigen, etc. It wasn't until I tried it with SIMD that I appreciated the speedups and knew where to look to coax them out.

petermcneeley
·
1 year ago
·
[ - ]

> No one (well, few professionals at least) will reinvent the wheel when it comes to standard scientific computations and methods. Like numerical math, linear algebra, etc.

> because with 99.99% probability they'll provide you with solutions that are subpar compared to industry standard packages.

Somebody did last week with only a modest amount of effort: https://news.ycombinator.com/item?id=40870345

akudha
·
1 year ago
·
[ - ]

Most orgs need drivers but they interview like mechanics. If I am a driver, I am expected to drive different vehicles. Sure I know to do basic stuff like change tires/oil etc, but I am not going to know how to fix the engine or something else under the hood, right?

lamename
·
1 year ago
·
[ - ]

So it is a leetcode equivalent ;)

sourabhv
·
1 year ago
·
[ - ]

lol

coliveira
·
1 year ago
·
[ - ]

These kinds of interview questions come from the minds of software developers, because that's the only thing they know how to do. When faced with some new area of knowledge, their instinct is to try to implement it in Python or some other language and imagine they have "learned" it. It doesn't occur to them that implementing things is not that helpful when it comes to most math topics.

bArray
·
1 year ago
·
[ - ]

This is quite a nice way to learn about ML, props for this!

Edit: I see a lot of people complaining about interviews, but instead I consider this a good resource for checking you understand fundamental principles.

mchab
·
1 year ago
·
[ - ]

Exactly, I think putting leetcode in the title triggered a lot of people

esafak
·
1 year ago
·
[ - ]

The hard part about ML isn't the implementation but the theory. If you're not sure what SVD is, how is this going to help? https://www.deep-ml.com/problem/12

oehpr
·
1 year ago
·
[ - ]

It gives you an impetus to learn and a question to test your understanding. I'd say there's a pretty good track record for this style of teaching.

mchab
·
1 year ago
·
[ - ]

The learn section should help, but I think I need to spend more time improving the learn section

oehpr
·
1 year ago
·
[ - ]

I would say that "learn" button is a little unclear. It might be better to just have the whole learning section beneath the question, always visible. That will also help drive home the intent of the page, since so many people think this is some weird interview questions prep site.

csmpltn
·
1 year ago
·
[ - ]

Who the fuck asks about this useless garbage in an ML job interview. This is such a waste of time and gives you absolutely zero insight into the candidate, how they think, how they’re able to dissect and handle complex issues, their seniority, etc. Whoever expects people to regurgitate this garbage during a job interview is a loser themselves, and will only end up recruiting similar losers to hang out with and get NOTHING done ever. ML job interviews specifically are bottom of the barrel standard.

HiPHInch
·
1 year ago
·
[ - ]

Thank you for your work! Is there something wrong with: https://www.deep-ml.com/problem/7 ?

mchab
·
1 year ago
·
[ - ]

Yes, it seems like there is an issue with that question. I will try to fix it as soon as possible. Thank you for the catch.

admissionsguy
·
1 year ago
·
[ - ]

Looks like a decent problem set to accompany an introductory ML class. No need to get so defensive. However, I thought leetcode meant algorithmic problem solving while the problems here simply ask to implement the various elementary operations.

mchab
·
1 year ago
·
[ - ]

Yeah, I think I mistitled my post. It is more of a learning tool and less of a Leetcode/interview prep site.

dr_kiszonka
·
1 year ago
·
[ - ]

I think as a learning tool this is pretty great! I want to implement the most common ML and stats algos over the next few months to review how they work on a deeper level and your website will help a lot. I like that you explain all terms in your equations.

Personally, I would probably enjoy even more explanations and/or links to good resources, e.g., visualizations, etc. as well as more information in the solutions (e.g., via comments or doc strings). Good job anyway!

dang
·
1 year ago
·
[ - ]

Ok, we've changed the title above. I hope that helps!

(Submitted title was "Leetcode but for ML".)

mchab
·
1 year ago
·
[ - ]

I would, but do not see the option to change the title

dang
·
1 year ago
·
[ - ]

We already did! I was just letting you know.

mchab
·
1 year ago
·
[ - ]

Nice, thank you

mrits
·
1 year ago
·
[ - ]

The issue with leetcode type questions is that formally trained and experienced people often could not answer these questions without specifically practicing for them. Most of the topics on this list could be covered in an introduction course.

renegade-otter
·
1 year ago
·
[ - ]

If you have to "study" something for interviews every single time because it's absolutely not relevant to your day job - it's probably bullshit.

Everyone copies the FAANG interview process because it looks cool - except that FAANG is just a welfare program for recent graduates, who indulge in peer interview hazing because they are not doing anything else. They don't study for Leetcode because they want to DO something - they study because of the money. But in a real company you have to DO things.

What has Google done in the last decade that is REALLY useful? Google Gmail and Docs can be maintained by probably 50 people, their search has gotten useless and all they do is kill their own products because maintenance toil is a total drag.

Like the dumb brain teasers that Google "pioneered" in the 2000s. How many golf balls can fit in a 747? I don't know, but I can estimate how many can fit up your a...

This Leetcode nonsense will go the way of THAT, in time.

Just no.

bena
·
1 year ago
·
[ - ]

It was Microsoft who started with the “golf balls in a plane” style questions.

Google iterated to the standard DSA questions that are common now.

And I don’t think they’re entirely without merit. However, people think you should be testing to find the ceiling. That’s impossible. Not only do you have the issue of whether or not the candidate just got lucky by getting a question they just happen to know, if you are hiring for a more junior position, it’s likely you don’t need them to know it in the first place.

Our goal should be to test the floor, not the ceiling. Find questions that can be answered by anyone with the skill set you desire. Sometimes that floor is: can you write runnable code.

We’ve just completed a hiring cycle where several candidates couldn’t transform a simple circuit diagram into a Boolean statement. One candidate who professed SQL knowledge couldn’t write a simple query. And I mean “how many buckets do you have?” level of simple.

On paper, these candidates seemed good. Several even had GitHub repositories. But, end of the day, I’m going to ask you to do a task. I’m going to need it by a date. I’m going to need that completed without having to comb over it and possibly rewrite chunks of it.

I don’t need the next Linus Torvalds, but so many candidates come with greatly exaggerated resumes and we have to winnow somehow.

coliveira
·
1 year ago
·
[ - ]

They're very busy reinventing the same product over and over, so they can kill it again next month!

ldjkfkdsjnv
·
1 year ago
·
[ - ]

Google invented AI

renegade-otter
·
1 year ago
·
[ - ]

Machine learning? They did not. They iterated on it, and then dropped the ball, losing the race to OpenAI.

My point exactly.

ldjkfkdsjnv
·
1 year ago
·
[ - ]

Generative AI came from efforts to improve search via text embeddings.

drfunk
·
1 year ago
·
[ - ]

Nice project! I have a few qualms with the instructions (sometimes misleading or unclear) and the implementation. For instance some problems fail, because 0. is considered different from 0.0

Using np.testing.assert_allclose in your asserts would solve this I think (https://numpy.org/doc/stable/reference/generated/numpy.testi...).
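To illustrate the suggestion, here is a minimal sketch of a tolerance-based check. The helper name `outputs_match` and the tolerance values are assumptions made for this example, and how the site currently compares answers is not known; the sketch only contrasts a string comparison (where a NumPy array and a Python list print differently) with `np.testing.assert_allclose`.

```python
import numpy as np


def outputs_match(actual, expected, rtol=1e-7, atol=1e-8):
    """Hypothetical grader helper: True if the values agree numerically."""
    try:
        # Raises AssertionError on a shape mismatch or if any element
        # differs by more than atol + rtol * abs(expected).
        np.testing.assert_allclose(actual, expected, rtol=rtol, atol=atol)
    except AssertionError:
        return False
    return True


as_array = np.array([0.0, 5.0])  # a NumPy array of floats
as_list = [0.0, 5.0]             # the same values as a Python list

# The printed forms differ, so comparing strings reports a mismatch...
strings_equal = str(as_array) == str(as_list)
# ...while the numeric comparison accepts the answer.
numbers_equal = outputs_match(as_array, as_list)

print(strings_equal, numbers_equal)
```

A tolerance-based comparison also forgives small floating-point rounding differences, which exact equality would reject.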

Happy to contribute / elaborate if you think it'd be useful! :)

mchab
·
1 year ago
·
[ - ]

Thank you for the help! I will definitely try this instead of my current method. If you’d like, you could join the Discord https://discord.gg/s4uVTQwk and let me know if you have any other ideas.

diimdeep
·
1 year ago
·
[ - ]

I like the Code Kata approach; it lets you learn and practice.

But I dislike siloed websites like Leetcode that ask you to bear with their awful web experience. I want to keep my code and notes offline and close at hand in case I need them in a year or 10 years.

An approach with simple test files and exercises is more appealing to me: https://github.com/dabeaz-course/python-mastery

So what is the goal here: to be like Leetcode, or to spread knowledge? If the latter, put the material up as plain Markdown and .py files in a GitHub repo, and we will say thank you.

mchab
·
1 year ago
·
[ - ]

Originally I started this as an open source project, and I am currently thinking of a system similar to what you shared, where I make the problems open source and keep the site closed source. Here was my original project: https://github.com/moe18/DeepMLeet

sourabhv
·
1 year ago
·
[ - ]

While this might be helpful for gaining a deeper understanding, adding a time constraint and making it something that can be asked in an interview sounds painful. Please make this a GitHub repo instead, like python_koans.

sk11001
·
1 year ago
·
[ - ]

Typically what happens for ML engineering roles is that you have a regular Leetcode round, as for any other SWE position, and an additional round with ML questions without coding; there are no ML-specific LC questions. Which is nice as a candidate, because otherwise it would be yet another thing to prepare for, even if the questions are relevant and being able to solve them is kind of neat.

iknownthing
·
1 year ago
·
[ - ]

I've definitely had ML questions involving coding e.g. implement k-means
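For readers who have not met this question, below is a minimal sketch of what an "implement k-means" answer might look like, using Lloyd's algorithm in NumPy. The function names, the random initialization from data points, and the stopping rule are assumptions chosen for this illustration, not anything taken from an actual interview or from the site.

```python
import numpy as np


def assign_labels(points, centroids):
    """Index of the nearest centroid (Euclidean distance) for each point."""
    diffs = points[:, None, :] - centroids[None, :, :]
    return np.linalg.norm(diffs, axis=2).argmin(axis=1)


def k_means(points, k, max_iters=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update steps."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct data points chosen at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(max_iters):
        labels = assign_labels(points, centroids)
        # Move each centroid to the mean of its points; keep the old
        # centroid if a cluster ends up empty.
        updated = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(updated, centroids):
            break  # centroids stopped moving
        centroids = updated
    return centroids, assign_labels(points, centroids)


data = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
centroids, labels = k_means(data, k=2)
print(labels)
```

On two well-separated groups like the ones above, the first three points and the last three points end up in different clusters; which cluster gets index 0 depends on the initialization.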

mchab
·
1 year ago
·
[ - ]

Created a Discord for anyone who has recommendations or wants to stay up to date on new questions we are working on: https://discord.gg/s4uVTQwk

ZoomerCretin
·
1 year ago
·
[ - ]

The first example is a bit confusing.

Example: input: a = [[1,2],[2,4]], b = [1,2] output: [5, 10] reasoning: 1*1 + 2*2 = 5; 1*2 + 2*4 = 10

Which 1 and 2 correspond to the 1 and 2 from a and b?
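One way to make the correspondence explicit is to write the product out with indices. The sketch below assumes the usual convention that each output entry is the dot product of one row of `a` with `b`; the function name `matrix_dot_vector` is made up for this illustration. A second, non-symmetric matrix is included because the example matrix is symmetric, which is what hides which 1 and 2 come from where.

```python
def matrix_dot_vector(a, b):
    """Multiply matrix a (a list of rows) by vector b and return a list."""
    if any(len(row) != len(b) for row in a):
        raise ValueError("each row of a must have the same length as b")
    # Entry i of the result is sum over j of a[i][j] * b[j].
    return [sum(a_ij * b_j for a_ij, b_j in zip(row, b)) for row in a]


a = [[1, 2], [2, 4]]
b = [1, 2]
# row 0: a[0][0]*b[0] + a[0][1]*b[1] = 1*1 + 2*2 = 5
# row 1: a[1][0]*b[0] + a[1][1]*b[1] = 2*1 + 4*2 = 10
result = matrix_dot_vector(a, b)
print(result)  # [5, 10]

# A non-symmetric matrix makes the row-by-row pattern easier to see:
# row 0: 1*5 + 2*6 = 17, row 1: 3*5 + 4*6 = 39
other = matrix_dot_vector([[1, 2], [3, 4]], [5, 6])
print(other)  # [17, 39]
```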

mchab
·
1 year ago
·
[ - ]

That is a good point, thank you for the input. I will change the example problem to clear things up.

MOARDONGZPLZ
·
1 year ago
·
[ - ]

I haven’t seen anyone ask these types of questions for interviewing for ML positions. They feel like ChatGPT or straight from a textbook. Can you share how you arrived at these questions?

mchab
·
1 year ago
·
[ - ]

I created these questions from a mix of resources: some from library docs like the numpy linalg and sklearn docs, some from textbooks like https://www.deeplearningbook.org/, and others I asked ChatGPT about.

sweezyjeezy
·
1 year ago
·
[ - ]

Edit: previous title was "Leetcode for ML" or somesuch...

I like the idea and might try some! But as a warning: leetcode is specifically aimed at prepping for interviews, and I've never seen questions like these in an interview (I'm somewhere between an MLE and ML researcher FWIW). The most common kinds of ML-specific things in my experience are:

- ML system design (basically everyone does this)

- ML knowledge questions ("explain ADAM etc.")

- probability + statistics knowledge

- ML problem solving in a notebook (quite rare, but some do it)

mchab
·
1 year ago
·
[ - ]

Probably should have titled it something else. I made it more as a learning platform for people to get better at ML by implementing algorithms from scratch. I’m currently a data scientist but wanted to become a machine learning researcher or engineer, and I thought these types of questions would help.

iknownthing
·
1 year ago
·
[ - ]

I saw the k-means one a couple times

DasCorCor
·
1 year ago
·
[ - ]

This website is super buggy. Sign up with Google doesn't work. The code editor keeps running into tabs vs spaces issues. It defaults to 2-space tabs like it is JavaScript.

mchab
·
1 year ago
·
[ - ]

Thank you for the feedback, I will look into that.

Edit: the sign up works for me, but the spacing is an issue

iknownthing
·
1 year ago
·
[ - ]

I'm curious how you run the python code in the browser

anualvis
·
1 year ago
·
[ - ]

Is it down for anyone else too?

mchab
·
1 year ago
·
[ - ]

Can you not get to the site, or does your code fail to run when you submit it?

Xeamek
·
1 year ago
·
[ - ]

Great resource!

kebsup
·
1 year ago
·
[ - ]

It's sad how a lot of people see this as "a bad way to test job candidates" rather than a "fun way to practice ML skills".

lamename
·
1 year ago
·
[ - ]

Those comments are based on the original title introducing it as an ML Leetcode. The title is more accurate now.

mchab
·
1 year ago
·
[ - ]

thanks! I think having leetcode in the title angered a lot of people

rvz
·
1 year ago
·
[ - ]

It doesn't matter. I would have preferred that the title mentions that it is Leetcode-like anyway.

But thanks for giving Leetcode yet another idea for testing AI engineers who do not know how to write a multi-layer perceptron or a softmax activation function from scratch, with yet another repository of already-solved puzzles making it easier for interviewers. I'd say it's pretty useful myself.

And so it begins with the complaints of "The AI interview is broken", "We are the only industry that does this" frequently being preached here.

htrp
·
1 year ago
·
[ - ]

Please don't.

Leetcode already ruined so many coding interviews by asking people to do bullshit like

"Output data from a stream in order, make the solution performant"

Why would you ruin ML for us too?

Looking at your site, problem #1 is Multiply a matrix times a vector..... in no universe is that a legitimate ML interview question.

Also ML is such a huge field (everything from statistical learning through to transformer neural networks), I fail to see how you could say your solution tests core skills. If I'm hiring for an ASR Role, it's going to be very different than for a CV role.

r-zip
·
1 year ago
·
[ - ]

> in no universe is that a legitimate ML interview question

Why not? This seems like the ML equivalent of FizzBuzz. If you don't know how matrix multiplication works well enough to implement it, I would argue that you don't know what you're doing at all.

rty32
·
1 year ago
·
[ - ]

My nightmare has finally come true.

dang
·
1 year ago
·
[ - ]

Ok, but please don't post unsubstantive comments to HN, and especially not shallow dismissals of someone's work.

https://news.ycombinator.com/showhn.html

https://news.ycombinator.com/newsguidelines.html

rty32
·
1 year ago
·
[ - ]

Sorry that the comment came across as lacking "substance", but in my defense I see this kind of comment all the time under almost every post (including this one), and it is not always obvious unless pointed out.

And this is in no way dismissive of the work. I can definitely see the value in this -- I am just saying many people don't wish to see this, which many people apparently agree with, based on the number of votes.

dang
·
1 year ago
·
[ - ]

Yes, too many people post that sort of unsubstantive comment—the cheap one-liner is maybe the biggest forum cliché there is—but that doesn't make it ok.

I believe you that you intended something more thoughtful, but the rest of us don't have access to your intention (or the real meaning of the comment in your head). We can only go by what you actually post, so if you want to make a more thoughtful point, you need to do so explicitly.

https://hn.algolia.com/?dateRange=all&page=0&prefix=true&que...

tjungblut
·
1 year ago
·
[ - ]

Inverting a binary tree became implementing SVD with arrays only.

srghogdhio
·
1 year ago
·
[ - ]

[flagged]