# Pandas Interview Questions for Data Science

Today, I will cover some Pandas interview questions, sourced from different websites, that are useful for data science interviews.

*Q1*: How to create new columns derived from existing columns in Pandas?

- We create a new column by assigning the output to the DataFrame with a new column name in between the `[]`.
- Let's say we want to create a new column `'C'` whose values are the multiplication of column `'B'` with column `'A'`. The operation is easy to implement and is element-wise, so there's no need to loop over rows.
- Other mathematical operators (`+`, `-`, `*`, `/`) and comparison operators (`<`, `>`, `==`, `…`) also work element-wise. If we need more advanced logic, we can use arbitrary Python code via `apply()`.
- Depending on the case, we can use `rename` with a dictionary or function to rename row labels or column names.
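The points above can be sketched as follows. This is a minimal illustration, not a canonical recipe: the DataFrame contents, the helper column names `A_gt_1` and `label`, and the threshold of 30 are made up for the example.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({"A": [1, 2, 3], "B": [10, 20, 30]})

# Element-wise multiplication; no explicit loop over rows is needed.
df["C"] = df["A"] * df["B"]

# Element-wise comparison produces a boolean column.
df["A_gt_1"] = df["A"] > 1

# Arbitrary Python logic applied row by row via apply()
# (usually slower than vectorized operations).
df["label"] = df.apply(lambda row: "big" if row["C"] > 30 else "small", axis=1)

# rename() with a dictionary returns a new DataFrame with renamed columns.
renamed = df.rename(columns={"C": "product"})

print(df)
print(renamed.columns.tolist())
```

Vectorized operators are generally the better first choice; `apply()` is worth reaching for only when the logic cannot be expressed with them.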

*Q2*: How are `iloc` and `loc` different?

`DataFrame.iloc` is an indexer (used with square brackets rather than called like a method) that retrieves data from a DataFrame by integer position (from 0 to length-1 of the axis); it may also be used with a boolean array. It accepts an integer, a list or array of integers, a slice object, a boolean array, or a callable.

`DataFrame.loc` gets rows (and/or columns) with particular labels. It accepts a single label, a list or array of labels, a slice object with labels, a boolean array, or a callable. Unlike positional slices, label slices include both endpoints.
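To make the difference concrete, here is a small sketch. The DataFrame and its string index labels `x`, `y`, `z` are invented for the example.

```python
import pandas as pd

# Hypothetical data with string row labels so positions and labels differ.
df = pd.DataFrame(
    {"A": [1, 2, 3], "B": [10, 20, 30]},
    index=["x", "y", "z"],
)

# iloc: integer positions; the slice end is excluded.
first_row = df.iloc[0]        # row at position 0
first_two = df.iloc[0:2]      # positions 0 and 1

# loc: labels; a label slice includes both endpoints.
row_y = df.loc["y"]           # row labelled "y"
x_to_y = df.loc["x":"y"]      # rows "x" and "y"

# loc with a boolean mask for rows and a label for the column.
b_where_a_gt_1 = df.loc[df["A"] > 1, "B"]

print(first_two)
print(x_to_y)
print(b_where_a_gt_1)
```

Here `df.iloc[0:2]` and `df.loc["x":"y"]` select the same two rows, even though the positional slice stops before its end and the label slice includes it.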

*Q3*: What are the operations that the Pandas Groupby method is based on?

- *Splitting* the data into groups based on some criteria.
- *Applying a function* to each group independently.
- *Combining the results* into a data structure.
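As a rough illustration, the three steps can be written out by hand and compared with the one-line equivalent. The `team` and `score` columns are hypothetical, and the manual version is only meant to expose the steps, not to suggest how pandas implements them internally.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({"team": ["a", "b", "a", "b"], "score": [1, 2, 3, 4]})

# Split: iterating over a groupby yields (key, sub-DataFrame) pairs.
# Apply: sum the scores within each group.
partial = {key: group["score"].sum() for key, group in df.groupby("team")}

# Combine: collect the per-group results into one Series.
manual = pd.Series(partial, name="score")

# The same computation expressed directly.
combined = df.groupby("team")["score"].sum()

print(manual)
print(combined)
```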

*Q4*: How to check whether a Pandas DataFrame is empty?

You can use the attribute `df.empty` to check whether a DataFrame is empty. It is `True` when any axis of the DataFrame has length 0.
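A short sketch of the check, using made-up DataFrames. Note that a DataFrame containing only `NaN` values still has rows, so it does not count as empty.

```python
import pandas as pd

# Hypothetical examples covering the common cases.
no_data = pd.DataFrame()                       # no rows, no columns
no_rows = pd.DataFrame(columns=["A", "B"])     # columns but no rows
with_data = pd.DataFrame({"A": [1]})
only_nan = pd.DataFrame({"A": [float("nan")]})

print(no_data.empty)    # True
print(no_rows.empty)    # True
print(with_data.empty)  # False
print(only_nan.empty)   # False: a NaN row is still a row
```

If the goal is to treat an all-`NaN` frame as empty, dropping missing values first (for example with `dropna()`) and then checking `.empty` is one option.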

*Q5*: How does the `groupby()` method work in Pandas?

- In the first stage of the process, data contained in a *pandas object*, whether a `Series`, `DataFrame`, or otherwise, is split into groups based on one or more *keys* that we provide.
- The splitting is performed on a particular *axis* of an object. For example, a `DataFrame` can be grouped on its rows (`axis=0`) or its columns (`axis=1`); note that the `axis` argument of `groupby()` is deprecated in recent pandas versions.
- Once this is done, a *function* is applied to each group, producing a *new value*.
- Finally, the results of all those function applications are combined into a result object. The form of the resulting object will usually depend on what's being done to the data.
- In the figure below, this process is illustrated for a simple group aggregation.
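The process described above might look like this for a simple aggregation. The `key` and `data` columns and their values are invented for the example; the `centered` column is a hypothetical name.

```python
import pandas as pd

# Hypothetical sample data with three groups.
df = pd.DataFrame({
    "key": ["a", "b", "c", "a", "b", "c"],
    "data": [0, 5, 10, 5, 10, 15],
})

# Split on "key", apply sum to each group, combine into a Series.
sums = df.groupby("key")["data"].sum()

# Several aggregations at once; the result is a DataFrame indexed by key.
stats = df.groupby("key")["data"].agg(["mean", "max"])

# transform() returns a result aligned with the original rows,
# which shows how the result's shape depends on the operation.
df["centered"] = df["data"] - df.groupby("key")["data"].transform("mean")

print(sums)
print(stats)
print(df)
```

An aggregation such as `sum()` yields one row per group, while `transform()` keeps one row per original record.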

## Q6: What Is a Time Series in pandas?

A time series is an ordered sequence of data points that represents how some quantity changes over time. pandas provides extensive capabilities for working with time series data across domains.

pandas supports:

- Parsing time series information from various sources and formats
- Generating sequences of fixed-frequency dates and time spans
- Manipulating and converting datetimes with time zone information
- Resampling or converting a time series to a particular frequency
- Performing date and time arithmetic with absolute or relative time increments
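Each capability in the list can be touched on in a few lines. The dates, the series values, and the choice of the `America/New_York` time zone below are arbitrary and only serve the illustration.

```python
import pandas as pd

# Parsing: convert date strings into datetime values.
parsed = pd.to_datetime(["2024-01-01", "2024-01-02"])

# Generating a fixed-frequency sequence: six consecutive days.
idx = pd.date_range("2024-01-01", periods=6, freq="D")
s = pd.Series(range(6), index=idx)

# Time zones: attach UTC to naive timestamps, then convert.
s_utc = s.tz_localize("UTC")
s_ny = s_utc.tz_convert("America/New_York")

# Resampling: aggregate daily values into two-day bins.
two_day = s.resample("2D").sum()

# Arithmetic: absolute (Timedelta) and relative (DateOffset) increments.
week_later = idx[0] + pd.Timedelta(days=7)
month_later = idx[0] + pd.DateOffset(months=1)

print(two_day)
print(s_ny.index[0])
print(week_later, month_later)
```

`Timedelta` adds a fixed duration, whereas `DateOffset(months=1)` follows the calendar, so the two can differ when months have different lengths.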