Step 2: Coding the data - Construction of initial categories


An initial set of labels is identified by reading and re-reading the data until the analyst knows it thoroughly. This step is laborious, especially with large amounts of data. Pieces of text are coded, i.e. given a label or a name. In the qualitative analysis literature, “data coding” generally refers to this data management task. However, data coding can take place at different levels of analysis.

Here are some commonly used terms (Paillé and Mucchielli, 2011):


Labeling a text or part of a text means identifying the topic of the extract, not what is said about it: “What is the extract about?” Labels make a first classification of the documents or extracts possible. They are useful during a first quick reading of the corpus.

Example: “Familial difficulties”


The code is the numerical or truncated form of the label. It is of limited use in qualitative data analysis.

Example: “Fam.Diff.”


The theme goes further than the label. It requires a more attentive reading.

“What is the topic more precisely?”

Example: “Difficulties in caring for children”


Statements are short syntheses of the content of the extract. “What is the key message of what is said?”, “What is told?”
The statement is more precise than the theme because it summarizes, reformulates or synthesizes the extract. Statements are mainly used in phenomenology.

Example: The respondent reports that she has financial difficulties because she has to spend time and money taking care of her children.

Conceptualizing category:

Conceptualizing categories are the substantive designations of phenomena occurring in the extract of the analyzed corpus. This level of coding therefore comes close to theory construction.

Example: “Parental overload”


Each of these coding terms tends to be specific to particular qualitative data analysis methods (Paillé and Mucchielli, 2011).
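
The levels of coding described above can be pictured as fields attached to a single extract. The following is a minimal Python sketch under stated assumptions: the `CodedExtract` record and its field names are hypothetical, and the extract text is invented for illustration. The coding values follow the examples given above.

```python
from dataclasses import dataclass


@dataclass
class CodedExtract:
    """One piece of text with its levels of coding (hypothetical structure)."""
    text: str       # the raw extract from the corpus
    label: str      # what the extract is about
    code: str       # truncated form of the label
    theme: str      # the topic, more precisely
    statement: str  # short synthesis of what is said
    category: str   # conceptualizing category (the phenomenon)


# Invented extract text; the coding values follow the examples in this section.
extract = CodedExtract(
    text="With the kids I cannot work full time, and childcare eats up what I earn.",
    label="Familial difficulties",
    code="Fam.Diff.",
    theme="Difficulties in caring for children",
    statement=(
        "The respondent reports financial difficulties because she has to "
        "spend time and money taking care of her children."
    ),
    category="Parental overload",
)

# Print the coding levels from the most descriptive to the most conceptual.
for level in ("label", "code", "theme", "statement", "category"):
    print(f"{level:>9}: {getattr(extract, level)}")
```

In practice a given analysis method would probably use only some of these fields, since each term tends to belong to a particular method.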

Coding qualitative data isolates meanings with a view to answering the research question. One piece of text may belong to more than one category or label, so categories are likely to overlap. Particular attention should be paid to “rival explanations” or interpretations of the data.
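
Because one extract may carry several labels, it can help to keep the codings as a many-to-many mapping and check where labels overlap. The sketch below is illustrative only: the extract identifiers and the labels “Financial difficulties” and “Work” are invented, and real projects would typically rely on dedicated qualitative analysis software for this.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical codings: extract id -> set of labels assigned to that extract.
codings = {
    "interview1_p3": {"Familial difficulties", "Financial difficulties"},
    "interview1_p7": {"Familial difficulties"},
    "interview2_p2": {"Financial difficulties", "Work"},
}

# Invert the mapping: label -> set of extract ids carrying that label.
extracts_by_label = defaultdict(set)
for extract_id, labels in codings.items():
    for label in labels:
        extracts_by_label[label].add(extract_id)

# For each pair of labels, keep the extracts they share (the overlap).
overlaps = {}
for a, b in combinations(sorted(extracts_by_label), 2):
    shared = extracts_by_label[a] & extracts_by_label[b]
    if shared:
        overlaps[(a, b)] = shared

for pair, shared in overlaps.items():
    print(pair, sorted(shared))
```

Extracts that appear under two labels are natural places to look for rival interpretations, since the same text is being read in more than one way.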


For further detailed information on coding qualitative data:
Saldaña J. The coding manual for qualitative researchers. 2nd ed. London: Sage Publications; 2013.