Pandas: Indexes#
import pandas as pd
persons = pd.DataFrame({
'firstname': ['Joerg', 'Johanna', 'Caro', 'Philipp' ],
'lastname': ['Faschingbauer', 'Faschingbauer', 'Faschingbauer', 'Lichtenberger' ],
'email': ['office@faschingbauer.co.at', 'johanna@email.com', 'caro@email.com', 'philipp@email.com'],
'age': [56, 27, 25, 37 ],
})
Default Index: Row Number#
persons
| | firstname | lastname | email | age |
|---|---|---|---|---|
| 0 | Joerg | Faschingbauer | office@faschingbauer.co.at | 56 |
| 1 | Johanna | Faschingbauer | johanna@email.com | 27 |
| 2 | Caro | Faschingbauer | caro@email.com | 25 |
| 3 | Philipp | Lichtenberger | philipp@email.com | 37 |
See how rows are numbered 0 to 3
That column has no name ⟶ it is the default index
persons.index
RangeIndex(start=0, stop=4, step=1)
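The default index can also be inspected programmatically. The following is a minimal sketch on a smaller stand-in frame; the variable names `people` and `default_index` are chosen for illustration only.

```python
import pandas as pd

# Small stand-in frame (illustrative; values borrowed from the example above).
people = pd.DataFrame({
    'firstname': ['Joerg', 'Johanna'],
    'age': [56, 27],
})

# No index was specified, so pandas numbers the rows 0..n-1.
default_index = people.index

print(type(default_index).__name__)   # RangeIndex
print(list(default_index))            # [0, 1]
print(default_index.name)             # None: the index has no name
```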
Setting Custom Index#
Notice how email appears to be unique ⟶ could be used as an index
persons.set_index('email')
| email | firstname | lastname | age |
|---|---|---|---|
| office@faschingbauer.co.at | Joerg | Faschingbauer | 56 |
| johanna@email.com | Johanna | Faschingbauer | 27 |
| caro@email.com | Caro | Faschingbauer | 25 |
| philipp@email.com | Philipp | Lichtenberger | 37 |
This does not change persons itself
Returns modified copy (could be assigned to another variable that you continue to work with, for example)
persons is still the same as before
persons
| | firstname | lastname | email | age |
|---|---|---|---|---|
| 0 | Joerg | Faschingbauer | office@faschingbauer.co.at | 56 |
| 1 | Johanna | Faschingbauer | johanna@email.com | 27 |
| 2 | Caro | Faschingbauer | caro@email.com | 25 |
| 3 | Philipp | Lichtenberger | philipp@email.com | 37 |
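Whether a column really is unique can be checked instead of eyeballed, and the copy returned by `set_index()` can be kept under its own name. A minimal sketch, assuming a shortened version of the data; `persons_demo` and `by_email` are illustrative names, not part of the original notebook.

```python
import pandas as pd

# Shortened stand-in for the persons frame (illustrative).
persons_demo = pd.DataFrame({
    'firstname': ['Joerg', 'Johanna', 'Caro'],
    'email': ['office@faschingbauer.co.at', 'johanna@email.com', 'caro@email.com'],
})

# Check the "appears to be unique" assumption before relying on it.
email_is_unique = persons_demo['email'].is_unique

# set_index() returns a new DataFrame; the original keeps its default index.
by_email = persons_demo.set_index('email')

print(email_is_unique)               # True
print(list(persons_demo.columns))    # ['firstname', 'email']
print(by_email.index.name)           # email
```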
Setting Custom Index, inplace=True#
Many (but not all) DataFrame methods support an inplace parameter
Default False ⟶ the object itself is not changed
Returns a modified copy of the DataFrame object
Nice for experimenting on a large dataset that we don’t want to damage
Add inplace=True if everything works ⟶ No return value
persons.set_index('email', inplace=True)
Modified object in-place
persons
| email | firstname | lastname | age |
|---|---|---|---|
| office@faschingbauer.co.at | Joerg | Faschingbauer | 56 |
| johanna@email.com | Johanna | Faschingbauer | 27 |
| caro@email.com | Caro | Faschingbauer | 25 |
| philipp@email.com | Philipp | Lichtenberger | 37 |
Index has changed
persons.index
Index(['office@faschingbauer.co.at', 'johanna@email.com', 'caro@email.com', 'philipp@email.com'], dtype='str', name='email')
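One way to combine "don't damage the dataset" with in-place modification is to experiment on an explicit copy. This is a sketch under that assumption; `original` and `scratch` are hypothetical names used only here.

```python
import pandas as pd

original = pd.DataFrame({
    'firstname': ['Joerg', 'Johanna'],
    'email': ['office@faschingbauer.co.at', 'johanna@email.com'],
})

# Work on a copy, so the in-place change cannot affect `original`.
scratch = original.copy()
scratch.set_index('email', inplace=True)

print(scratch.index.name)        # email
print(list(scratch.columns))     # ['firstname']
print(list(original.columns))    # ['firstname', 'email']
```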
Custom Index, And loc[]#
loc[] selects by row label (⟶ index)
Row labels are not row numbers anymore ⟶ row numbers cannot be used as row labels
persons.loc[0]
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
File ~/My-Environments/jfasch-home/lib64/python3.14/site-packages/pandas/core/indexes/base.py:3641, in Index.get_loc(self, key)
   3640 try:
-> 3641     return self._engine.get_loc(casted_key)
   3642 except KeyError as err:
File pandas/_libs/index.pyx:168, in pandas._libs.index.IndexEngine.get_loc()
--> 168 'Could not get source, probably due dynamically evaluated source code.'
File pandas/_libs/index.pyx:176, in pandas._libs.index.IndexEngine.get_loc()
--> 176 'Could not get source, probably due dynamically evaluated source code.'
File pandas/_libs/index.pyx:583, in pandas._libs.index.StringObjectEngine._check_type()
--> 583 'Could not get source, probably due dynamically evaluated source code.'
KeyError: 0
The above exception was the direct cause of the following exception:
KeyError                                  Traceback (most recent call last)
Cell In[9], line 1
----> 1 persons.loc[0]
File ~/My-Environments/jfasch-home/lib64/python3.14/site-packages/pandas/core/indexing.py:1207, in _LocationIndexer.__getitem__(self, key)
   1205 maybe_callable = com.apply_if_callable(key, self.obj)
   1206 maybe_callable = self._raise_callable_usage(key, maybe_callable)
-> 1207 return self._getitem_axis(maybe_callable, axis=axis)
File ~/My-Environments/jfasch-home/lib64/python3.14/site-packages/pandas/core/indexing.py:1449, in _LocIndexer._getitem_axis(self, key, axis)
   1447 # fall thru to straight lookup
   1448 self._validate_key(key, axis)
-> 1449 return self._get_label(key, axis=axis)
File ~/My-Environments/jfasch-home/lib64/python3.14/site-packages/pandas/core/indexing.py:1399, in _LocIndexer._get_label(self, label, axis)
   1397 def _get_label(self, label, axis: AxisInt):
   1398     # GH#5567 this will fail if the label is not present in the axis.
-> 1399     return self.obj.xs(label, axis=axis)
File ~/My-Environments/jfasch-home/lib64/python3.14/site-packages/pandas/core/generic.py:4253, in NDFrame.xs(self, key, axis, level, drop_level)
   4249         new_index = index[loc : loc + 1]
   4250     else:
   4251         new_index = index[loc]
   4252 else:
-> 4253     loc = index.get_loc(key)
   4254
   4255     if isinstance(loc, np.ndarray):
   4256         if loc.dtype == np.bool_:
File ~/My-Environments/jfasch-home/lib64/python3.14/site-packages/pandas/core/indexes/base.py:3648, in Index.get_loc(self, key)
   3643     if isinstance(casted_key, slice) or (
   3644         isinstance(casted_key, abc.Iterable)
   3645         and any(isinstance(x, slice) for x in casted_key)
   3646     ):
   3647         raise InvalidIndexError(key) from err
-> 3648     raise KeyError(key) from err
   3649 except TypeError:
   3650     # If we have a listlike key, _check_indexing_error will raise
   3651     # InvalidIndexError. Otherwise we fall through and re-raise
   3652     # the TypeError.
   3653     self._check_indexing_error(key)
KeyError: 0
New row label: email
persons.loc['office@faschingbauer.co.at']
firstname Joerg
lastname Faschingbauer
age 56
Name: office@faschingbauer.co.at, dtype: object
persons.loc[['office@faschingbauer.co.at', 'johanna@email.com']]
| email | firstname | lastname | age |
|---|---|---|---|
| office@faschingbauer.co.at | Joerg | Faschingbauer | 56 |
| johanna@email.com | Johanna | Faschingbauer | 27 |
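The KeyError seen with `loc[0]` can be anticipated or handled. The sketch below shows two possible approaches on a shortened stand-in frame: a membership test on the index, and catching the exception. The name `by_email` is illustrative.

```python
import pandas as pd

by_email = pd.DataFrame({
    'firstname': ['Joerg', 'Johanna'],
    'email': ['office@faschingbauer.co.at', 'johanna@email.com'],
}).set_index('email')

# Option 1: ask the index whether a label exists before using loc[].
has_zero = 0 in by_email.index
has_joerg = 'office@faschingbauer.co.at' in by_email.index

# Option 2: let loc[] raise, and handle the KeyError.
try:
    row = by_email.loc[0]
except KeyError:
    row = None   # 0 is a row number, not a row label

# A valid label returns the row as a Series.
joerg = by_email.loc['office@faschingbauer.co.at']
print(has_zero, has_joerg)    # False True
print(joerg['firstname'])     # Joerg
```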
Custom Index, And iloc[]#
iloc[] selects by row number ⟶ still valid as before
persons.iloc[0]
firstname Joerg
lastname Faschingbauer
age 56
Name: office@faschingbauer.co.at, dtype: object
persons.iloc[[0, 1]]
| email | firstname | lastname | age |
|---|---|---|---|
| office@faschingbauer.co.at | Joerg | Faschingbauer | 56 |
| johanna@email.com | Johanna | Faschingbauer | 27 |
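Labels and row numbers address the same rows, and `Index.get_loc()` translates a label into its position. A small sketch, assuming a unique index (for which `get_loc()` returns a plain integer); the frame and the names `label` and `position` are illustrative.

```python
import pandas as pd

by_email = pd.DataFrame({
    'firstname': ['Joerg', 'Johanna'],
    'age': [56, 27],
    'email': ['office@faschingbauer.co.at', 'johanna@email.com'],
}).set_index('email')

label = 'johanna@email.com'

# Translate the row label into a row number.
position = by_email.index.get_loc(label)

# Both indexers now select the same row.
same_row = by_email.iloc[position].equals(by_email.loc[label])

print(position)    # 1
print(same_row)    # True
```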
Sorting DataFrame Object By Index Column#
DataFrame.sort_index(): not inplace by default ⟶ returns modified copy
persons.sort_index(ascending=True)
| email | firstname | lastname | age |
|---|---|---|---|
| caro@email.com | Caro | Faschingbauer | 25 |
| johanna@email.com | Johanna | Faschingbauer | 27 |
| office@faschingbauer.co.at | Joerg | Faschingbauer | 56 |
| philipp@email.com | Philipp | Lichtenberger | 37 |
Sorting in place
persons.sort_index(ascending=True, inplace=True)
persons
| email | firstname | lastname | age |
|---|---|---|---|
| caro@email.com | Caro | Faschingbauer | 25 |
| johanna@email.com | Johanna | Faschingbauer | 27 |
| office@faschingbauer.co.at | Joerg | Faschingbauer | 56 |
| philipp@email.com | Philipp | Lichtenberger | 37 |
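Whether an index is already sorted can be checked with `Index.is_monotonic_increasing`, and `ascending=False` reverses the order. A minimal sketch on a shortened stand-in frame; all variable names are illustrative.

```python
import pandas as pd

by_email = pd.DataFrame({
    'firstname': ['Joerg', 'Johanna', 'Caro'],
    'email': ['office@faschingbauer.co.at', 'johanna@email.com', 'caro@email.com'],
}).set_index('email')

# The index is in insertion order, which is not ascending here.
sorted_before = by_email.index.is_monotonic_increasing

# Not in place: returns a copy sorted in descending label order.
descending = by_email.sort_index(ascending=False)

# In place: by_email itself is reordered.
by_email.sort_index(inplace=True)
sorted_after = by_email.index.is_monotonic_increasing

print(sorted_before, sorted_after)    # False True
print(list(by_email['firstname']))    # ['Caro', 'Johanna', 'Joerg']
```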
Links#
Corey Schafer: Python Pandas Tutorial (Part 3): Indexes - How to Set, Reset, and Use Indexes
Data School: How do I use the MultiIndex in pandas?