Scott Boston's Networkx solution is the preferred solution...

There are two solutions to this problem. The first one is a vectorized pandas type solution and should be fast over larger datasets, the second is pythonic and does not work well on the size of dataset the OP was looking for, the original df size is (223635,4).

- PANDAS SOLUTION

This problem seeks to find out how many people each person in an organization manages, including subordinate's subordinates. This solution will create a dataframe by adding successive columns that are the managers of the previous columns, and then counting the occurance of each employee in that dataframe to determine the total number under them.

First we set up the input.

```
import pandas as pd
import numpy as np
data = [
["John", "144", "Smith", "200"],
["Mia", "220", "John", "144"],
["Caleb", "155", "Smith", "200"],
["Smith", "200", "Jason", "500"],
]
df = pd.DataFrame(data, columns=["Name", "SID", "Manager_name", "Manager_SID"])
df = df[["SID", "Manager_SID"]]
# shortening the columns for convenience
df.columns = ["1", "2"]
print(df)
1 2
0 144 200
1 220 144
2 155 200
3 200 500
```

First the employees without subordinates must be counted and put into a seperate dictionary.

```
df_not_mngr = df.loc[~df['1'].isin(df['2']), '1']
non_mngr_dict = {str(key):0 for key in df_not_mngr.values}
non_mngr_dict
{'220': 0, '155': 0}
```

Next we will modify the dataframe by adding columns of managers of the previous column. The loop is stopped when there are no employees in the right most column

```
for i in range(2, 10):
df = df.merge(
df[["1", "2"]], how="left", left_on=str(i), right_on="1", suffixes=("_l", "_r")
).drop("1_r", axis=1)
df.columns = [str(x) for x in range(1, i + 2)]
if df.iloc[:, -1].isnull().all():
break
else:
continue
print(df)
1 2 3 4 5
0 144 200 500 NaN NaN
1 220 144 200 500 NaN
2 155 200 500 NaN NaN
3 200 500 NaN NaN NaN
```

All columns except the first columns are collapsed and each employee counted and added to a dictionary.

```
from collections import Counter
result = dict(Counter(df.iloc[:, 1:].values.flatten()))
```

The non manager dictionary is added to the result.

```
result.update(non_mngr_dict)
result
{'200': 3, '500': 4, nan: 8, '144': 1, '220': 0, '155': 0}
```

- RECURSIVE PYTHONIC SOLUTION

I think this is probably way more pythonic than you were looking for. First I created a list 'all_sids' to make sure we capture all employees as not all are in each list.

```
import pandas as pd
import numpy as np
data = [
["John", "144", "Smith", "200"],
["Mia", "220", "John", "144"],
["Caleb", "155", "Smith", "200"],
["Smith", "200", "Jason", "500"],
]
df = pd.DataFrame(data, columns=["Name", "SID", "Manager_name", "Manager_SID"])
all_sids = pd.unique(df[['SID', 'Manager_SID']].values.ravel('K'))
```

Then create a pivot table.

```
dfp = df.pivot_table(values='Name', index='SID', columns='Manager_SID', aggfunc='count')
```

dfp

```
Manager_SID 144 200 500
SID
144 NaN 1.0 NaN
155 NaN 1.0 NaN
200 NaN NaN 1.0
220 1.0 NaN NaN
```

Then a function that will go through the pivot table to total up all the reports.

```
def count_mngrs(SID, count=0):
if str(SID) not in dfp.columns:
return count
else:
count += dfp[str(SID)].sum()
sid_list = dfp[dfp[str(SID)].notnull()].index
for sid in sid_list:
count = count_mngrs(sid, count)
return count
```

Call the function for each employee and print the results.

```
print('SID', ' Number of People Reporting')
for sid in all_sids:
print(sid, " " , int(count_mngrs(sid)))
```

Results are below, sorry I was a bit lazy in putting the names with the sids.

```
SID Number of People Reporting
144 1
220 0
155 0
200 3
500 4
```

Look forward to seeing a more pandas type solution!