Last updated: Apr 11, 2024
Reading time · 4 min
Use the `dtype.names` attribute to get the column names of a structured NumPy `ndarray` in Python.
The `dtype.names` attribute returns a tuple of the field names of the `ndarray`.
```python
import numpy as np

data = np.genfromtxt(
    'employees.txt',
    names=True,
    encoding='utf-8',
    delimiter=',',
    dtype=None,
)

# [('Alice', 'Smith', '2023-01-05') ('Bobby', 'Hadz', '2023-03-25')
#  ('Carl', 'Lemon', '2021-01-24')]
print(data)

# ('first_name', 'last_name', 'date')
print(data.dtype.names)
```
The code sample assumes that you have the following `employees.txt` file in the same directory as your `main.py` script.
```
first_name,last_name,date
Alice,Smith,2023-01-05
Bobby,Hadz,2023-03-25
Carl,Lemon,2021-01-24
```
We used the `numpy.genfromtxt()` function to load the data from the file into a NumPy `ndarray`.
```python
data = np.genfromtxt(
    'employees.txt',
    names=True,
    encoding='utf-8',
    delimiter=',',
    dtype=None,
)
```
Because `names=True` is set, the first line is read as the field names. Each line after it is split at the specified delimiter character (a comma in the example).
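To try the loading step without creating a file, the sketch below feeds the same rows to `numpy.genfromtxt()` through an in-memory `io.StringIO` object. The `csv_text` variable is a hypothetical stand-in for `employees.txt` and is not part of the original example.

```python
import io

import numpy as np

# Hypothetical in-memory stand-in for the employees.txt file.
csv_text = """first_name,last_name,date
Alice,Smith,2023-01-05
Bobby,Hadz,2023-03-25
Carl,Lemon,2021-01-24
"""

data = np.genfromtxt(
    io.StringIO(csv_text),  # a file-like object instead of a path
    names=True,             # read the field names from the first line
    encoding='utf-8',
    delimiter=',',          # split each line at commas
    dtype=None,             # infer the type of each column
)

# ('first_name', 'last_name', 'date')
print(data.dtype.names)

# ['Alice' 'Bobby' 'Carl']
print(data['first_name'])
```

Each field name taken from the first line can then be used as a key to select the matching column.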
We then used the `dtype.names` attribute to get the column names of the `ndarray`.
```python
# ('first_name', 'last_name', 'date')
print(data.dtype.names)
```
The `dtype.names` attribute returns a tuple of the field names of the array, or `None` if there are no field names.
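The `None` case is easy to trip over when the same code handles both plain and structured arrays. Here is a minimal sketch of the difference. The `column_names()` helper is a hypothetical name introduced only for illustration.

```python
import numpy as np

# A plain (unstructured) array has no field names.
plain = np.array([[1, 2, 3], [4, 5, 6]])
print(plain.dtype.names)  # None

# A structured array defines its field names in the dtype.
structured = np.array(
    [(1, 2.5), (2, 3.5)],
    dtype=[('id', int), ('score', float)],
)
print(structured.dtype.names)  # ('id', 'score')


def column_names(arr):
    """Hypothetical helper: field names as a list, empty for plain arrays."""
    return list(arr.dtype.names or ())


print(column_names(plain))       # []
print(column_names(structured))  # ['id', 'score']
```

Falling back to an empty tuple avoids a `TypeError` when you iterate over or index the result for a plain array.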
You can use indexing to access specific column names in the tuple.
```python
import numpy as np

data = np.genfromtxt(
    'employees.txt',
    names=True,
    encoding='utf-8',
    delimiter=',',
    dtype=None,
)

# [('Alice', 'Smith', '2023-01-05') ('Bobby', 'Hadz', '2023-03-25')
#  ('Carl', 'Lemon', '2021-01-24')]
print(data)

# ('first_name', 'last_name', 'date')
print(data.dtype.names)

print(data.dtype.names[0])  # first_name
print(data.dtype.names[1])  # last_name
print(data.dtype.names[2])  # date
```
If you need to add column names to a plain NumPy `ndarray`, use the `unstructured_to_structured()` function from the `numpy.lib.recfunctions` module.
```python
import numpy as np
import numpy.lib.recfunctions as rfn

arr = np.array([[1, 2, 3], [4, 5, 6]])

new_arr = rfn.unstructured_to_structured(
    arr,
    np.dtype(
        [
            ('Column_1', int),
            ('Column_2', int),
            ('Column_3', int)
        ]
    )
)

# [(1, 2, 3) (4, 5, 6)]
print(new_arr)

# ('Column_1', 'Column_2', 'Column_3')
print(new_arr.dtype.names)

print(new_arr['Column_1'])  # [1 4]
```
We defined a plain NumPy array and used the `unstructured_to_structured()` function to convert the unstructured array to a structured array.
We set the array's column names and used `dtype.names` to get a tuple containing the names.
As shown in the code sample, you can use square brackets to access the elements of the array by column names.
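Combining `dtype.names` with square-bracket access lets you loop over every column. The sketch below builds a dictionary that maps each column name to its values. It passes the `names` keyword argument instead of a full dtype, which is an alternative to the approach above and assumes every column should keep the element type of the original array.

```python
import numpy as np
import numpy.lib.recfunctions as rfn

arr = np.array([[1, 2, 3], [4, 5, 6]])

# Passing only names reuses arr's element type for every field.
new_arr = rfn.unstructured_to_structured(
    arr,
    names=['Column_1', 'Column_2', 'Column_3'],
)

# Map each column name to a plain Python list of its values.
columns = {
    name: new_arr[name].tolist()
    for name in new_arr.dtype.names
}

# {'Column_1': [1, 4], 'Column_2': [2, 5], 'Column_3': [3, 6]}
print(columns)
```

Use the explicit dtype form shown earlier if the columns need different types.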
If you need to read values from a CSV file and get the column names, you can also use the `pandas` module.
First, make sure you have the `pandas` module installed by running the following command from your terminal.
```shell
pip install pandas

# or with pip3
pip3 install pandas
```
Here is the `employees.csv` file for the example.
```
first_name,last_name,date
Alice,Smith,2023-01-05
Bobby,Hadz,2023-03-25
Carl,Lemon,2021-01-24
```
And here is the related `main.py` file.
```python
import pandas as pd

df = pd.read_csv(
    'employees.csv',
    sep=',',
    encoding='utf-8'
)

#    first_name  last_name        date
# 0       Alice      Smith  2023-01-05
# 1       Bobby       Hadz  2023-03-25
# 2        Carl      Lemon  2021-01-24
print(df)

print('-' * 50)

# Index(['first_name', 'last_name', 'date'], dtype='object')
print(df.columns)

print('-' * 50)

# ['first_name', 'last_name', 'date']
print(df.columns.tolist())
```
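We used the `pandas.read_csv()` function to read the comma-separated CSV file into a `DataFrame` object.
You can then use the `DataFrame.columns` attribute to get the column labels of the `DataFrame`. Note that `columns` is an attribute, not a method, so it is accessed without parentheses.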
If you need to convert the object to a list, use the `tolist()` method.
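If you start with pandas but need a structured NumPy array afterwards, one option is `DataFrame.to_records()`. This is a sketch rather than part of the original example. The `csv_text` variable is a hypothetical in-memory stand-in for `employees.csv`.

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for the employees.csv file.
csv_text = """first_name,last_name,date
Alice,Smith,2023-01-05
Bobby,Hadz,2023-03-25
Carl,Lemon,2021-01-24
"""

df = pd.read_csv(io.StringIO(csv_text))

# The column labels as a plain Python list.
names = df.columns.tolist()
print(names)

# Convert to a NumPy record array, leaving out the index column.
records = df.to_records(index=False)

# The DataFrame column labels become the field names.
print(records.dtype.names)
```

Passing `index=False` keeps the DataFrame index from being added as an extra field.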