Colin Jaffe/3 min read

Utilizing Lambda Functions in Data Science

Lambdas in DS Workflows

.apply with lambda

df['col'].apply(lambda x: x*2) for transformations.

groupby().agg()

groupby().agg(lambda x: x.max() - x.min()) for custom aggregations.

Sort by Custom Key

sorted(items, key=lambda x: (x.year, x.month)).

Pipe Stages

df.pipe(lambda d: d.assign(...)) for chained transformations.
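The four patterns listed above can be sketched together as follows. This is a minimal illustration, not the lesson's own code: the DataFrame, its column names (store, price), the 8% tax rate, and the list of dates are all made up for the example.

```python
from datetime import date

import pandas as pd

# Hypothetical sample data; the column names are invented for illustration.
df = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "price": [10.0, 14.0, 3.0, 9.0],
})

# .apply with a lambda: transform every value in one column.
doubled = df["price"].apply(lambda x: x * 2)

# groupby().agg() with a lambda: a custom aggregation (max minus min per group).
price_range = df.groupby("store")["price"].agg(lambda x: x.max() - x.min())

# sorted with a lambda key: order items by (year, month).
items = [date(2024, 3, 1), date(2023, 11, 15), date(2024, 1, 20)]
ordered = sorted(items, key=lambda x: (x.year, x.month))

# .pipe with a lambda: one stage in a chain of transformations.
# The 1.08 multiplier is an assumed tax rate, used only as an example.
with_tax = df.pipe(lambda d: d.assign(price_with_tax=d["price"] * 1.08))

print(doubled.tolist())
print(price_range.to_dict())
print(ordered)
print(with_tax.columns.tolist())
```

In each case the lambda is a one-expression function handed directly to another call, which is why it never needs a name.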

This lesson is a preview from our Data Science & AI Certificate Online (includes software) and Python Certification Online (includes software & exam). Enroll in a course for detailed lessons, live instructor support, and project-based training.

So a lot of the functions you write when you're doing data science are very simple functions that define a single operation: one evaluation, one calculation. There are no extra variables involved. There are no multi-step processes.

It's just, hey, how do we calculate the return value here? So to do that, we generally turn to lambdas. Now, I'm going to get us there step by step by first outlining how we can write a named lambda. If I wanted to write add_five, instead of the above, I'm going to say add_five equals a lambda.

The way we define that is we say num, then a colon, and then after the colon we say what it returns: num + 5. Now, I believe it's mad at me.

Oh, it's mad at me because I forgot to write the actual word lambda. There we go. See, even an experienced data scientist still forgets the syntax every once in a while.

All right, so this is a function, and it's the exact same process as the def function here. Instead of taking two lines, though, we're able to do it in one. We're able to say add_five equals, like it's a variable that holds a value.

I mean, it is a variable, but the value it holds is a set of instructions. The colon is saying that everything to the left of it is the argument the function takes in, num, and everything to the right of it is what the function returns.

Anything to the right of the colon. So this is, again, the equivalent of the def version: the def takes in a num and returns num + 5, and the lambda takes in a num and, because of the colon, returns num + 5. And we can use it in the same way.
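Here is a minimal sketch of the two versions side by side. The name add_five_def is used only so both can live in the same file, and the inputs 10 and 25 are assumptions chosen to reproduce the 15 and 30 mentioned in the lesson.

```python
# Two-line version using def (the name add_five_def is hypothetical).
def add_five_def(num):
    return num + 5

# One-line lambda version: the argument sits left of the colon,
# and the expression to the right of the colon is the return value.
add_five = lambda num: num + 5

# Both are called the same way (inputs 10 and 25 are assumed).
print(add_five_def(10), add_five_def(25))
print(add_five(10), add_five(25))
```

One design note: PEP 8 recommends a def statement over assigning a lambda to a name, so the named lambda here is best read as a stepping stone toward passing unnamed lambdas directly to other functions.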

If I uncomment that, maybe comment that back out, we get the original 15 and 30. We can do the same thing for the other one. Let me comment add_cents back in, and we'll try add_cents.

We'll say add_cents equals a lambda that takes in, say, dollar_amount and returns round(dollar_amount + random.random(), 2), rounding the result to two places. Same thing, just a little more complicated, because this is a more complicated line. But we have the same structure here.
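Written out, that lambda might look like the sketch below. It assumes the standard-library random module is what "random dot random" refers to; the sample input of 10 is made up.

```python
import random

# Takes in dollar_amount, adds a random fraction in [0, 1),
# and returns the sum rounded to two decimal places.
add_cents = lambda dollar_amount: round(dollar_amount + random.random(), 2)

# Sample call (the input 10 is an assumed value); the result changes on every run.
print(add_cents(10))
```

Because random.random() returns a float in the range [0, 1), the result always lands between the original amount and one dollar more.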


add_cents is a lambda function, meaning this terse, one-line syntax for a function. It takes in dollar_amount as an input and returns, because of the colon, everything to the right. Okay, we write a lot of these lambdas, and they can be a very simple way to define a function. And in a minute, we'll write a lambda without having to come up with a name, which is another nice advantage.

Naming variables is very challenging, and it's nice not to have to come up with a name for something as simple as this function. It doesn't really need one. Okay, in our next video, we'll make a DataFrame and walk through how we can run apply on it, to apply this function to every value in the DataFrame.
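As a rough preview of the unnamed form, here is a sketch that uses Python's built-in map as a stand-in for the DataFrame apply covered in the next video. The list of amounts is an assumed example.

```python
# Hypothetical input values for illustration.
amounts = [10, 25]

# The lambda is passed directly to map and never gets a name.
plus_five = list(map(lambda num: num + 5, amounts))

print(plus_five)  # [15, 30]
```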