EERPD: Leveraging Emotion and Emotion Regulation for Improving Personality Detection (2024)

Zheng Li¹, Dawei Zhu¹, Qilong Ma², Weimin Xiong¹, Sujian Li¹
¹State Key Laboratory for Multimedia Information Processing,
School of Computer Science, Peking University
²School of Software, BNRist, Tsinghua University
{lycheelee, dwzhu,lisujian}@pku.edu.cn
mql22@mails.tsinghua.edu.cn

Abstract

Personality is a fundamental construct in psychology, reflecting an individual’s behavior, thinking, and emotional patterns. Previous research has made some progress in personality detection, primarily by using the whole text to predict personality. However, these studies tend to overlook psychological knowledge: they rarely apply the well-established correlations between emotion regulation and personality. To address this, we propose a new personality detection method called EERPD. This method introduces emotion regulation, a psychological concept highly correlated with personality, into personality prediction. By combining this feature with emotion features, it retrieves few-shot examples and provides chain-of-thought (CoT) processes for inferring labels from text. This approach enhances the LLM’s understanding of personality within text and improves performance in personality detection. Experimental results demonstrate that EERPD significantly enhances the accuracy and robustness of personality detection, outperforming the previous SOTA by 15.05/4.29 in average F1 on the two benchmark datasets.


1 Introduction

As a fundamental construct in psychology, personality reveals the true nature of the individual and creates a certain impression on others (Jung, 1959; Corr and Matthews, 2009). With the advancement of Natural Language Processing (NLP) technologies, there has been a growing interest in automatic detection of personality (Petrides and Mavroveli, 2018; Yang et al., 2023), which plays a pivotal role in numerous human-oriented NLP applications, such as psychological health assessment Wilkinson and Walford (2001), personalized recommendation systems Hu and Pu (2010), and human-computer interaction Pocius (1991).

[Figure 1]

Traditional personality detection methods either treat text as a whole, relying primarily on direct content analysis Yang et al. (2023); Hu et al. (2024), or lay particular emphasis on emotion expression Mohammad and Kiritchenko (2013); Li et al. (2021). However, these approaches often overlook the role of Emotion Regulation (Gross, 2008), a key psychological concept related to personality. Unlike emotion, which is usually expressed in a short-term manner, emotion regulation is a stable, long-term capacity for managing and controlling one’s emotional responses, as exemplified in Figure 1. Psychological research has demonstrated a clear correlation between one’s personality and emotion regulation Barańczuk (2019); Petrides and Mavroveli (2018); Borges and Naugle (2017).

Inspired by the psychological studies above, we propose a RAG-based framework named EERPD for automatic detection of personality, leveraging both emotion and emotion regulation as guidance. To guide the LLMs in personality detection, we first construct a reference library composed of a large number of text-personality pairs. Reference samples most similar to the input text are retrieved to facilitate few-shot learning. Specifically, we categorize the sentences of the input text into emotion sentences and emotion regulation sentences, encode the two groups separately, and then combine the two vectors to retrieve the most similar samples from the reference library. For each retrieved sample, we further include a corresponding chain-of-thought (CoT) emphasizing emotion and emotion regulation to direct the LLM’s attention to these aspects.

For comprehensive evaluation, we test our EERPD on both the Kaggle dataset (https://www.kaggle.com/datasnaek/mbti-type) for MBTI Myers-Briggs (1991) detection and the Essays dataset Pennebaker and King (1999) for Big Five personality Goldberg (1990) detection. The experimental results show that our method significantly improves the few-shot performance of GPT-3.5 in personality detection tasks, outperforming the previous SOTA by 15.05/4.29 in average F1 on the two datasets. Further ablation and analysis confirm the effectiveness of emotion regulation in personality detection, aligning well with psychological findings. To sum up, our contributions are as follows:

• To the best of our knowledge, we are the first to incorporate psychological knowledge of emotion regulation for automatic personality detection.
• We propose EERPD, a RAG-based framework that combines Emotion and Emotion Regulation to improve personality detection. Comprehensive experiments on two benchmarks show that our EERPD outperforms all the strong baselines by a large margin.
• We have conducted in-depth analyses to confirm the effectiveness of EERPD from various aspects, as well as the efficacy of emotion regulation for personality detection.

2 Related Work

Personality Detection. In the early development of personality detection, Francis and Booth (1993) introduced the Linguistic Inquiry and Word Count (LIWC), pioneering the use of psycholinguistic features for personality analysis through texts. This tool became foundational for feature engineering in subsequent studies, such as those by Pennebaker and King (1999) and Argamon et al. (2005), which focused on linguistic styles and lexical predictors of personality traits, achieving moderate accuracies in detecting traits like extraversion and neuroticism.

With the emergence of neural networks, research expanded significantly. Techniques like CNNs and LSTMs enhanced personality prediction from social media Tandera et al. (2017); Xue et al. (2018a). The introduction of BERT advanced the field further, with Gjurković et al. (2020) showing its effectiveness in analyzing personality and demographics on Reddit without extensive feature engineering.

Recent studies have explored multi-task and multimodal approaches to personality detection. Sang et al. (2022) used movie scripts to predict MBTI types of fictional characters, showing the potential of diverse data integration. Li et al. (2021) employed multitask learning to detect emotions and personality traits simultaneously, demonstrating the efficiency of shared representations.

Current research explores large language models (LLMs) for personality detection, as shown by Yang et al. (2023) and Hu et al. (2024), indicating a shift towards inferring personality traits directly from text with minimal reliance on traditional feature engineering and adopting more holistic, context-aware methodologies.

Emotion and Emotion Regulation. The relationship between emotion and personality is well-studied in psychology and NLP. Psychological theories Keltner (1996); Davidson (2001); Reisenzein and Weber (2009) link personality traits with emotional experiences, laying the foundation for inferring personality from emotions. Mohammad and Kiritchenko (2013) showed that fine-grained emotions enhance personality detection, Rangra et al. (2023) demonstrated the effectiveness of emotional features in speech, and Li et al. (2021) found that multitask learning improves prediction accuracy.

[Figure 2]

The connection between emotion regulation and personality has also been explored in psychology. Emotion regulation is significantly associated with and can influence personality Barańczuk (2019). Individuals with strong emotion regulation skills show personality traits associated with higher job satisfaction and better stress management Petrides and Mavroveli (2018). Borges and Naugle (2017) found that emotion regulation variables predict specific personality dimensions. However, to the best of our knowledge, no NLP-based personality detection methods use emotion regulation for prediction.

3 Task Formulation

In this paper, we focus on the personality detection task, which aims to predict an individual’s personality traits from text. Each text to be detected, $X$, is made up of $\{x_{1},x_{2},...,x_{n}\}$, where each $x_{i}$ is a sentence. The goal of personality detection is to map $X$ to a multidimensional label $y$.

4 Method

In this section, we introduce our EERPD framework, as illustrated in Figure 2. First, we construct a reference library, providing text-label pairs to serve as examples for the personality detection model (§4.1). Then, we use psychological knowledge to categorize each sentence in the text into Emotion Sentences (ES) and Emotion Regulation Sentences (ERS) (§4.2). After that, we retrieve examples through the combination of ES and ERS (§4.3). In the inference phase, we use the retrieved examples from the reference library for personality detection (§4.4). The whole method is summarized in Algorithm 1.

4.1 Reference Library Construction

As the model used for personality detection lacks specialized knowledge in psychology, we employ the Retrieval-Augmented Generation (RAG) method to retrieve and inject relevant examples from the reference library. We first construct a reference library, represented as $C=\{(CX_{i},y_{i})\}_{i=1}^{N}$, where $CX_{i}$ and $y_{i}$ represent the reference input text and personality label of the $i$-th instance, and $N$ is the size of the reference library. In our method, we use the training set of the corresponding task as the reference library.

4.2 Sentence Categorization

When performing personality detection, emotion and emotion regulation have different characteristics: emotion is expressed in a short-term manner, while emotion regulation is a long-term skill for managing and controlling one’s emotional responses. Therefore, we need to handle them separately. For the input text $X$, we use a prompt-based approach to have the large language model (LLM) categorize its sentences into two parts. We define the classification criteria and input them along with text $X$ into the model, which then labels each sentence $x$ to perform sentence classification. In this way, the sentences in input text $X$ are categorized into two parts: Emotion Sentences ($X_{e}=\{x_{e1},x_{e2},...,x_{en}\}$) and Emotion Regulation Sentences ($X_{r}=\{x_{r1},x_{r2},...,x_{rn}\}$). Referring to concepts from psychology Gross (2008), the classification criteria are defined as follows:

Emotion Sentences: The feelings in the sentence are dominated by emotion; the sentence should be an obvious reaction to a recent event and not indicative of a deeper, long-standing trait or belief.

Emotion Regulation Sentences: The feelings in the sentence are dominated by emotion regulation; the sentence should reflect the author’s enduring traits rather than immediate circumstances.

For details of the prompt used for sentence categorization, please refer to Appendix B.
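The prompt-based split described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `llm` is a hypothetical callable standing in for any chat-completion client, the abbreviated `CRITERIA` text only paraphrases the criteria above (the actual prompt is the one in Appendix B), and the one-label-per-line reply format with `ES`/`ERS` tags is an assumed convention.

```python
from typing import Callable, List, Tuple

# Abbreviated paraphrase of the criteria above (assumption: not the real prompt).
CRITERIA = (
    "Label each numbered sentence as ES (emotion: a short-term reaction to a "
    "recent event) or ERS (emotion regulation: an enduring trait). "
    "Answer with one label per line, in order."
)


def categorize_sentences(
    sentences: List[str], llm: Callable[[str], str]
) -> Tuple[List[str], List[str]]:
    """Split sentences into (emotion, emotion-regulation) lists via an LLM."""
    numbered = "\n".join(f"{i + 1}. {s}" for i, s in enumerate(sentences))
    reply = llm(f"{CRITERIA}\n\n{numbered}")
    labels = [line.strip().upper() for line in reply.splitlines() if line.strip()]
    if len(labels) != len(sentences):
        raise ValueError("LLM reply does not contain one label per sentence")
    emotion, regulation = [], []
    for sentence, label in zip(sentences, labels):
        # Assumed convention: anything not tagged ERS counts as an emotion sentence.
        (regulation if label == "ERS" else emotion).append(sentence)
    return emotion, regulation


def fake_llm(prompt: str) -> str:
    """Hypothetical stub so the sketch runs without an API key."""
    return "ES\nERS"


x_e, x_r = categorize_sentences(
    ["I'm thrilled about the concert tonight!", "I usually stay calm under pressure."],
    fake_llm,
)
```

In practice the stub would be replaced by a real model call, and the reply parsing would need to tolerate whatever output format the chosen prompt elicits.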

Algorithm 1: EERPD

Input: hyperparameter $\alpha$; LLM $\mathrm{LLM}(\cdot)$; author’s text $X$; reference library $D$; EER split prompts $I_{e}$, $I_{r}$; prediction prompt $PMT$

Output: the inferred personality trait $y$

$X_{e},X_{r}\leftarrow\mathrm{LLM}(X,I_{e},I_{r})$;

$V_{xe},V_{xr}\leftarrow vectorize(X_{e},X_{r})$;

$V_{x}\leftarrow\alpha V_{xe}+(1-\alpha)V_{xr}$;

$Sim\leftarrow[\,]$;

foreach text $d$ in $D$ do

    $d_{e},d_{r}\leftarrow\mathrm{LLM}(d,I_{e},I_{r})$;

    $V_{de},V_{dr}\leftarrow vectorize(d_{e},d_{r})$;

    $V_{d}\leftarrow\alpha V_{de}+(1-\alpha)V_{dr}$;

    $sim\leftarrow 1-\cos(V_{x},V_{d})$;

    append $sim$ to $Sim$;

end

$\{t_{1},t_{2}\}\leftarrow$ indices of the two smallest values in $Sim$;

$Egs\leftarrow\{text_{t_{1}},text_{t_{2}},CoT_{t_{1}},CoT_{t_{2}}\}$;

$y\leftarrow\mathrm{LLM}(PMT,Egs,X)$;

return $y$;

4.3 Example Retrieval

People with similar personalities tend to exhibit similar patterns in both emotion and emotion regulation. Therefore, when assessing personality, we retrieve relevant examples from the reference library to assist in detection. To fully utilize both emotion and emotion regulation, we combine them to search for similar examples.

Given the Emotion Sentences $X_{e}$ and Emotion Regulation Sentences $X_{r}$ of text $X$, we compute their respective vector representations $V_{xe}$ and $V_{xr}$ with the roberta-large model, and then calculate a weighted embedding using a hyperparameter $\alpha$:

$V_{x}=\alpha V_{xe}+(1-\alpha)V_{xr}$

The hyperparameter $\alpha$ adjusts the relative emphasis on emotion and emotion regulation in the representation, and lets us explore the importance of each for personality detection. Similarly, each text in the reference library is processed to obtain a corresponding composite embedding:

$V_{d}=\alpha V_{de}+(1-\alpha)V_{dr}$

Then we use $Sim(V_{x},V_{d})=1-\cos(V_{x},V_{d})$, where smaller values indicate closer texts, to identify the two most analogous texts from the Reference Library. The selected texts, along with their associated CoTs, serve as examples in 2-shot learning for the LLM.
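The retrieval step can be illustrated with a small pure-Python sketch. It assumes the emotion and emotion-regulation embeddings have already been computed (for example by a sentence encoder) and are passed in as plain lists of floats; the function names `combine`, `cosine_distance`, and `retrieve` are hypothetical, and the toy 2-dimensional vectors are made up for illustration only.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def combine(v_e: Vector, v_r: Vector, alpha: float) -> List[float]:
    """Weighted embedding V = alpha * V_e + (1 - alpha) * V_r."""
    return [alpha * e + (1.0 - alpha) * r for e, r in zip(v_e, v_r)]


def cosine_distance(a: Vector, b: Vector) -> float:
    """1 - cos(a, b); smaller values mean more similar vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return 1.0  # assumed convention for an undefined cosine (zero vector)
    return 1.0 - dot / norm


def retrieve(
    query: Tuple[Vector, Vector],
    library: Sequence[Tuple[Vector, Vector]],
    alpha: float = 0.7,
    k: int = 2,
) -> List[int]:
    """Return indices of the k library entries closest to the query."""
    v_x = combine(query[0], query[1], alpha)
    distances = [cosine_distance(v_x, combine(d_e, d_r, alpha)) for d_e, d_r in library]
    return sorted(range(len(library)), key=distances.__getitem__)[:k]


# Toy (emotion, emotion-regulation) embedding pairs, for illustration only.
toy_library = [([1.0, 0.0], [0.0, 1.0]), ([0.0, 1.0], [1.0, 0.0]), ([1.0, 0.0], [0.5, 0.5])]
nearest = retrieve(([1.0, 0.0], [0.0, 1.0]), toy_library)
```

Because the score is a distance, the sketch sorts in ascending order and keeps the first `k` indices; with a large library, a vectorized or approximate nearest-neighbor search would be the more practical choice.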

4.4 Personality Prediction

When conducting personality detection, we use psychological knowledge from the MBTI and OCEAN personality dimension models, and generate model prompts using few-shot and CoT learning strategies. In this way, we emphasize emotion and emotion regulation to direct the model’s attention to these aspects. The whole prompt is shown in Figure 3, and more details are given in Appendix A.

Psychological Knowledge. In the prompt, each personality dimension from MBTI or OCEAN is introduced with a precise psychological definition. For instance, "Extroversion (E) or Introversion (I): indicates whether a person is more inclined to draw energy from the external world or the internal world."

Few-Shot and CoT Learning Strategies. We retrieve two examples from the Reference Library as described in §4.3, demonstrating how specific personality traits manifest in textual form through emotion and emotion regulation. We also leverage the LLM to generate CoTs for each example. The two texts along with their CoTs are used as 2-shot examples in the prompt. More details about auxiliary CoT generation are given in Appendix C.
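How the pieces of the prediction prompt fit together can be sketched as below. This is only an illustrative assembly routine: the `Example` class, the `build_prompt` function, and the exact wording of the connecting instructions are assumptions, and the placeholder texts, labels, and CoTs are invented. The trait definition string is the one quoted above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Example:
    text: str   # reference text retrieved from the library
    label: str  # its personality label, e.g. an MBTI type
    cot: str    # auxiliary chain-of-thought generated for this example


def build_prompt(knowledge: str, examples: List[Example], target_text: str) -> str:
    """Concatenate trait definitions, worked examples, and the text to classify."""
    parts = [knowledge, f"Here are {len(examples)} examples:"]
    for i, ex in enumerate(examples, start=1):
        parts.append(
            f"Example {i}:\nText: {ex.text}\nResult: {ex.label}\nProcess: {ex.cot}"
        )
    parts.append(
        "Now analyze the following text with your reasoning process.\n"
        f"Text: {target_text}\n"
        "Your response should follow this format:\n"
        "Result: {predicted label}\nProcess: {your reasoning process}"
    )
    return "\n\n".join(parts)


prompt = build_prompt(
    "Extroversion (E) or Introversion (I): indicates whether a person is more "
    "inclined to draw energy from the external world or the internal world.",
    [
        Example("first reference text", "INTP", "reasoning for example one"),
        Example("second reference text", "ENFJ", "reasoning for example two"),
    ],
    "text to classify",
)
```

Keeping the examples before the target text mirrors the order of the recorded prompt in the appendix, where the two retrieved examples precede the user's posts.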

5 Experiments

5.1 Datasets

We evaluate EERPD on two publicly available datasets: the Kaggle dataset (https://www.kaggle.com/datasnaek/mbti-type) and the Essays dataset Pennebaker and King (1999).

The Kaggle dataset is sourced from PersonalityCafe (http://personalitycafe.com/forum), and is an extensive collection of textual data aimed at exploring and predicting personality types based on the Myers-Briggs Type Indicator (MBTI). The personality classification follows the MBTI framework Myers-Briggs (1991), which segments personality into four dimensions: Introversion/Extraversion (I/E), Sensing/Intuition (S/N), Thinking/Feeling (T/F), and Judging/Perceiving (J/P). The dataset consists of 8,674 entries, each representing an individual’s text data (45-50 posts) along with their corresponding MBTI type.

The Essays dataset is a comprehensive collection of text data designed for personality recognition tasks, particularly focusing on the Big Five personality traits Goldberg (1990). Given specific instructions, volunteers wrote freely to express their thoughts within a limited time. The dataset is made up of 2,468 texts along with each author’s Big Five personality traits (Agreeableness, Conscientiousness, Extraversion, Neuroticism, and Openness).

Due to limited API resources, we randomly selected 10% of the samples from each test set to evaluate our EERPD.
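A reproducible 10% subsample of a test set can be drawn along the following lines. This is a sketch, not the procedure actually used: the function name `sample_fraction`, the fixed seed, and the rounding rule are assumptions, since the text does not state how the random selection was seeded.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sample_fraction(items: Sequence[T], fraction: float = 0.1, seed: int = 0) -> List[T]:
    """Draw a fixed-seed random subset without replacement."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    k = max(1, round(len(items) * fraction))  # assumed rounding rule
    return rng.sample(list(items), k)


# Hypothetical test set of 200 sample ids.
test_ids = list(range(200))
subset = sample_fraction(test_ids)
```

Fixing the seed makes the evaluated subset repeatable across runs, which matters when API cost rules out re-evaluating the full test set.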

Table 1: Overall results on the Kaggle dataset.
Methods	I/E Acc.	I/E F1	S/N Acc.	S/N F1	T/F Acc.	T/F F1	J/P Acc.	J/P F1	Average Acc.	Average F1
TF-IDF+SVM	71.00	44.94	79.50	46.38	75.00	74.25	61.50	58.59	71.75	56.04
Regression	61.34	64.00	47.10	54.50	76.34	76.50	65.58	66.00	62.59	65.25
AttRCNN	-	59.74	-	64.08	-	78.77	-	66.44	-	67.25
TrigNet	77.80	66.64	85.00	56.45	78.70	78.32	73.30	71.74	78.70	68.29
DDGCN	78.10	70.26	84.40	60.66	79.30	78.91	73.30	71.73	78.78	70.39
BERT	77.30	62.50	84.90	54.04	78.30	77.93	69.50	68.80	77.50	65.82
RoBERTa	77.10	61.89	86.50	57.59	79.60	78.69	70.60	70.07	78.45	67.06
Zero-shot-CoT	76.50	64.27	83.50	55.16	72.50	71.99	57.50	53.14	72.50	61.14
Two-shot-CoT	85.93	85.41	78.89	77.55	87.44	86.77	69.35	70.36	80.40	80.02
TAE	-	70.90	-	66.21	-	81.17	-	70.20	-	72.07
PsyCoT	79.00	66.56	85.00	61.70	75.00	74.80	57.00	57.83	74.00	65.22
EERPD (ours)	87.10	86.63	91.01	90.59	89.17	89.15	81.34	82.12	87.15	87.12

Table 2: Overall results on the Essays dataset.
Methods	AGR Acc.	AGR F1	CON Acc.	CON F1	EXT Acc.	EXT F1	NEU Acc.	NEU F1	OPN Acc.	OPN F1	Average Acc.	Average F1
LIWC+SVM	51.78	47.50	51.99	52.00	51.22	49.20	51.09	50.90	54.05	52.40	52.03	50.40
Regression	50.96	51.01	54.65	54.66	55.06	55.06	57.08	57.09	59.51	59.51	55.45	55.47
W2V+CNN	-	46.16	-	52.11	-	39.40	-	58.14	-	59.80	-	51.12
BERT	56.84	54.72	57.57	56.41	58.54	58.42	56.60	56.36	60.00	59.76	57.91	57.13
RoBERTa	59.03	57.62	57.81	56.72	57.98	57.20	56.93	56.80	60.16	59.88	58.38	57.64
Zero-shot-CoT	58.94	58.09	55.14	42.49	57.55	55.63	57.49	54.63	58.78	54.40	57.58	53.05
Two-shot-CoT	55.06	57.27	59.51	59.63	52.63	52.84	53.85	53.64	57.09	57.74	55.63	56.22
PsyCoT	61.13	61.13	59.92	57.41	59.76	59.74	56.68	56.58	60.73	57.30	59.64	58.43
EERPD (ours)	64.98	65.01	68.00	68.64	62.01	63.02	56.00	56.00	61.02	60.93	62.40	62.72

5.2 Baselines

In our experiments, we adopt the following previous methods as baselines.

Statistical Learning: These methods aim to enhance sentiment classification accuracy through statistical learning methods. Tighe et al. (2016) use SVM with LIWC Pennebaker et al. (2001) and linguistic cognitive analysis. Cui and Qi (2017) use SVM with TF-IDF for feature extraction. Park et al. (2015) use ridge regression to model the relation between language features and users’ Big Five personality traits.

Neural Network Models: These methods leverage neural network architectures to enhance personality detection. W2V+CNN Rahman et al. (2019) is a non-pretrained CNN model Chen (2015) combined with the word2vec algorithm for context representation. AttRCNN Xue et al. (2018b) uses a hierarchical structure with a variant of Inception Szegedy et al. (2017) to encode each post. DDGCN Yang et al. (2022) employs a domain-adapted BERT to encode each post and a dynamic deep graph network to aggregate posts non-sequentially. Small language models like BERT Devlin et al. (2019) and RoBERTa Liu et al. (2019) are fine-tuned on "bert-base-cased" and "roberta-base" backbones, encoding the context for Essays and combining post representations using mean pooling for Kaggle.

Large Language Models: These methods either use LLMs directly or incorporate them as a significant component of the model architecture. Kojima et al. (2022) insert a reasoning step with "Let’s think step by step", which is adopted as the Zero-shot-CoT baseline in this work. TAE Hu et al. (2024) improves small model performance in personality detection using text augmentations from LLMs and contrastive learning. PsyCoT Yang et al. (2023) uses psychological questionnaires as a CoT process, leveraging an LLM to perform multi-turn dialogue ratings. We also build a Two-shot-CoT prompt as a reference baseline for our EERPD, consisting of two randomly selected examples.

5.3 Implementation Details

Following the baseline research and for economic reasons, we query the GPT-3.5 API (gpt-3.5-turbo-16k-0613) to obtain results; this model is currently the most popular and forms the foundation of ChatGPT. For Zero-shot-CoT, Two-shot-CoT and our EERPD methods, we set the temperature to 0 to obtain reliable rather than creative output. For the PsyCoT and TAE methods, we adopt the results from Yang et al. (2023) and Hu et al. (2024). For AttRCNN and W2V+CNN, we adopt the results from Hu et al. (2024), where the learning rate was set to 1e-5 for the pre-trained post encoder and 1e-3 for the other parameters. For the other fine-tuning based methods, we adopt the baseline results directly from Yang et al. (2023), where the learning rate was set to 2e-5 and the test performance was evaluated by averaging the results of five runs. The evaluation metrics employed in our study are Accuracy and Macro-F1 score.
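For reference, the two metrics can be computed per personality dimension as in the following pure-Python sketch. The helper names `accuracy` and `macro_f1` and the toy labels are illustrative assumptions; Macro-F1 here is the unweighted mean of per-class F1 over all labels that occur in either the gold or the predicted labels.

```python
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of predictions that match the gold labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of per-class F1 scores."""
    scores = []
    for label in set(y_true) | set(y_pred):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy gold and predicted labels for one dimension (I/E), for illustration only.
gold = ["E", "E", "I", "I"]
pred = ["E", "I", "I", "I"]
acc = accuracy(gold, pred)
f1 = macro_f1(gold, pred)
```

On the toy labels, class E has F1 = 2/3 and class I has F1 = 4/5, so the Macro-F1 is their mean, 11/15, while accuracy is 3/4. Macro averaging weights both classes equally, which matters for imbalanced dimensions.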

5.4 Overall Results

The overall results of EERPD and several baselines on Kaggle are listed in Table 1, and those on Essays are listed in Table 2. The small model baselines can be divided into three types: statistical learning models (LIWC+SVM, TF-IDF+SVM, Regression), neural network models (W2V+CNN, AttRCNN, TrigNet, DDGCN), and small language models (BERT and RoBERTa). The baselines involving large language models are Zero-shot-CoT, Two-shot-CoT, TAE and PsyCoT.

Several key points emerge from these results:

First, EERPD outperforms the baselines on almost all the personality traits, surpassing the fine-tuned models and other prompt-based methods. Specifically, EERPD enhances standard Two-shot-CoT prompting with an average increase of 6.75/7.10 points in Accuracy and Macro-F1 on Kaggle, 6.77/6.50 in Accuracy and Macro-F1 on Essays.

Second, EERPD performs worse than other methods on the Neuroticism trait. Further analysis reveals that this discrepancy may be due to the low correlation between language-based assessments and self-report questionnaires for Neuroticism. As shown in Park etal. (2015), Neuroticism has the lowest correlation coefficients with self-report questionnaires, indicating that language models struggle to accurately capture and predict this trait, leading to lower prediction accuracy.

Third, although it includes two examples with CoT, the Two-shot-CoT baseline does not consistently outperform PsyCoT or even Zero-shot-CoT on Essays. Our investigation shows that the examples in Two-shot-CoT are sometimes unhelpful or even harmful for detection on Essays, as they can conflict strongly with the sample under test. For instance, if both given examples have high Agreeableness, the reasoning still rates the test sample as highly agreeable even if its author criticizes their roommate.
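The average gains over Two-shot-CoT quoted in the first point can be re-derived from the Average columns of Tables 1 and 2. The short Python check below copies those four pairs of numbers from the tables; the variable names are illustrative.

```python
# Average (Accuracy, Macro-F1) values copied from Tables 1 and 2.
two_shot = {"Kaggle": (80.40, 80.02), "Essays": (55.63, 56.22)}
eerpd = {"Kaggle": (87.15, 87.12), "Essays": (62.40, 62.72)}

# Gain of EERPD over Two-shot-CoT, rounded to two decimals.
gains = {
    name: tuple(round(e - b, 2) for e, b in zip(eerpd[name], two_shot[name]))
    for name in eerpd
}
```

The resulting differences are 6.75/7.10 on Kaggle and 6.77/6.50 on Essays, matching the figures stated above.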

Table 3: Ablation results (Accuracy) on the Kaggle dataset.
Methods	I/E	S/N	T/F	P/J	Average
EERPD_{r/w 0-shot}	76.50	83.50	72.50	57.50	72.50
EERPD_{r/w 2-shot}	85.93	78.89	87.44	69.35	80.40
EERPD_{w/o E}	71.24	66.34	80.61	67.43	71.40
EERPD_{w/o ER}	70.14	65.29	80.03	69.55	71.25
EERPD	87.10	91.01	89.17	81.34	87.15

5.5 Ablation Study

To verify the importance of each component in our EERPD, we conduct an ablation study on 100 samples randomly selected from each of the Kaggle test dataset and Essays test dataset.

Emotion and Emotion Regulation.

We first analyze the contributions of Emotion and Emotion Regulation to example retrieval. $EERPD_{w/o\ E}$ refers to the condition where $\alpha=1.0$, meaning examples are retrieved based only on Emotion Regulation vectors. $EERPD_{w/o\ ER}$ refers to the condition where $\alpha=0.0$, meaning examples are retrieved based only on Emotion vectors. As shown in Table 3, our full method improves substantially over the 0-shot and 2-shot baselines. It also outperforms the variants that use only Emotion Regulation ($EERPD_{w/o\ E}$) or only Emotion ($EERPD_{w/o\ ER}$), showing that both components contribute to the improvement.

Parameter $\alpha$ .

We investigate the trade-off parameter $\alpha$ in our EERPD, demonstrating the model’s sensitivity to $\alpha$ variations and identifying its optimal range. The results, shown in Figure 4, illustrate performance variations with $\alpha$ on the Kaggle and Essays test datasets.

On both datasets, accuracy and Macro-F1 peak at $\alpha=0.7$ and then decline. These findings suggest that $\alpha=0.7$ gives the best balance between Emotion Regulation and Emotion, with emotion regulation proving more predictive of personality. Performance drops when $\alpha$ is 0 or 1, highlighting that combining both features is more effective than using either one alone. Performance at $\alpha=0.7$ is also higher than at $\alpha=0.5$, which corresponds to retrieving with the vector of the original text directly, without weighting the two parts. This indicates that our weighted combination is more effective for retrieval than the whole text.

Auxiliary CoT.

We evaluate the effect of auxiliary CoT generation statistically, using the 100 random samples from the Kaggle dataset. The F1 scores are shown in Table 4. Each data point is the mean of 5 trials. A t-test shows that the improvement from the auxiliary CoTs is statistically significant, with p less than 0.05.

Table 4: F1 scores with and without the auxiliary CoTs on 100 Kaggle samples.
Methods	E/I	N/S	T/F	P/J
EERPD	90.56	91.97	92.43	81.51
w/o CoTs	78.61	83.17	86.88	79.03

[Figure 4]

6 Analysis

6.1 Different Base Model

[Figure 5]

[Figure 6]

To evaluate the robustness of our EERPD method across different model architectures, we conducted experiments using various popular language models, including BART Lewis et al. (2019), BERT Devlin et al. (2018), LLAMA Touvron et al. (2023), RoBERTa Liu et al. (2019), and W2V Le and Mikolov (2014). The evaluation dataset consists of 100 samples selected from the Kaggle test dataset.

The results presented in Figure 5 indicate that EERPD maintains consistent performance across different model architectures. This flexibility allows for broader applicability in various settings.

6.2 Correlation Analysis on Example Selection

To evaluate our example selection, we conducted a correlation test between the selected examples and the test set examples. As shown in Figure 6, the result reveals no significant correlation, confirming that our selection method does not leak test set answers to the model. Instead, it identifies examples with similar reasoning patterns. This demonstrates that our method effectively teaches the model relevant reasoning techniques, ensuring that it learns to generalize rather than memorize specific answers. Thus, our approach enhances the model’s ability to perform accurate label predictions based on learned reasoning strategies.

Table 5: F1 scores with sequential (Order) and randomly shuffled (Random) posts.
Post order	E/I	N/S	T/F	P/J
Order	84.16	86.03	91.14	80.74
Random	85.54	89.40	89.02	79.61

6.3 Impact of Post Order

The Kaggle dataset includes a collection of posts for each user. These posts are combined in sequence to form a lengthy document, and each post is an input $X$ waiting for detection. However, as Yang et al. (2023) mentioned, studies by Yang et al. (2021, 2022) have shown that sequential encoding of posts is sensitive to order in fine-tuned models. To determine whether EERPD is affected by post order, we randomly shuffled the posts and re-evaluated our method using 100 samples. The F1 results are reported in Table 5. Each score is the average over five rounds of experiments. A t-test indicates that there is no statistically significant difference between the sequential test and the randomly shuffled test.
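The shuffling used in this test can be sketched as follows, assuming each user's posts are stored as a single string delimited by `|||` (the separator mentioned in the prediction prompt). The function name `shuffle_posts`, the seed argument, and the sample document are hypothetical.

```python
import random

SEPARATOR = "|||"


def shuffle_posts(document: str, seed: int = 0) -> str:
    """Return the same posts in a random order, keeping the separator format."""
    posts = document.split(SEPARATOR)
    random.Random(seed).shuffle(posts)  # local generator for reproducibility
    return SEPARATOR.join(posts)


# Hypothetical four-post document, for illustration only.
original = "post a|||post b|||post c|||post d"
shuffled = shuffle_posts(original, seed=1)
```

Only the order of the posts changes; the content of each post and the number of posts are preserved, so any difference in the scores can be attributed to ordering alone.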

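Table 6: F1 scores of the standard Two-shot-CoT baseline (Standard) and EERPD on 100 Kaggle samples.
Methods	E/I	N/S	T/F	P/J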
Standard	84.23	85.26	87.33	79.27
EERPD	90.56	91.97	92.43	81.51

6.4 Statistical Tests

To statistically evaluate the significance of our approach, we compared the standard Two-shot-CoT baseline with EERPD using the 100 random samples from the Kaggle dataset. The F1 scores are shown in Table 6. Each data point is the mean of 5 trials. A t-test shows that our enhancements are statistically significant, with p-values below 0.05 for each dimension.

7 Conclusion

In this paper, we introduced EERPD, a novel few-shot personality detection method inspired by psychological concepts. By leveraging emotion regulation and emotion to retrieve few-shot samples, EERPD forms a robust Chain of Thought (CoT) that guides large language models in evaluating personality traits. Experiments on two benchmark datasets show that EERPD significantly outperforms traditional methods and other novel prompts. This approach uniquely integrates psychological insights to enhance the reasoning abilities of large language models, offering a new perspective for personality detection.

Limitations

Due to the resource constraints, we only conduct experiments and analysis about LLMs on GPT-3.5. The extent to which GPT-4 or GPT-4o models can benefit from our EERPD remains unknown.

This study primarily focuses on improving the LLM’s performance by leveraging the psychological knowledge of Emotion Regulation. How to exploit trainable and tunable models like BERT and LLAMA to further optimize EERPD is left for future investigation.

This method carries certain potential risks. Even with well-intentioned use, personality detection may lead to misjudgments, negatively impacting individuals’ careers or social relationships.

Ethics Statement

This work adheres to the ACL Ethics Policy. We assert that, to the best of our knowledge, our work does not present any ethical issues. We have conducted a thorough review of potential ethical implications in our research and found none.

Appendix A Appendix: Prompt for Prediction

For a better understanding of our method, we provide all prompts in the appendix. The record of the full prediction prompt is shown in Figure 7.

Appendix B Appendix: Prompt for Sentence Categorization

# Record of prediction prompt

MBTI is a tool used to assess a person’s psychological preferences and personality types, and there are 16 different types of MBTI, each consisting of four letters representing four dimensions of preference. And the four dimensions are:

Extroversion (E) or introversion (I) : indicates whether a person is more inclined to draw energy from the outside world or the inside world.

Sense (S) or intuition (N) : indicates whether a person is more inclined to focus on concrete facts and details, or abstract concepts and possibilities.

Thinking (T) or emotion (F) : indicates whether a person is more inclined to make decisions using logic and principles, or values and emotions.

Judgment (J) or perception (P) : indicates whether a person is more inclined to a planned and organized lifestyle, or a flexible and random lifestyle.

You are an AI assistant who specializes at MBTI personality traits. I will give you texts from the same author, and then I will ask you the author’s MBTI type, and then you need to give me your choice.

The definition of Emotion Regulation and Emotion are as follows:

1.Emotion Sentences: These sentences should be clearly linked to immediate, temporary feelings that arise from specific, recent incidents or current situations. The key is that the emotion should be an obvious reaction to a recent event and not indicative of a deeper, long-standing trait or belief.

2. Emotion Regulation Sentence: These sentences must consistently reflect the speaker’s enduring traits. They should not be influenced by immediate circumstances but rather indicate a persistent and characteristic ability of controling emotion.

Please refer to the following examples to learn how to use Emotion Regulation and Emotion in the text for personality classification.

I will give you 45~50 posts from the same user, divided by |||. Please use MBTI personality analysis to help me analyze what the user’s MBTI is most likely to be.

Here are two examples:

—

Example 1:

The posts of this user are: +cot.iloc[e1][’posts’]+

Result: """+cot.iloc[e1][’type’]+"""

Process:"""+cot.iloc[e1][’cot’]+"""

—

Example 2

The posts of this user are: """+cot.iloc[e2][’posts’]+"""

Result: """+cot.iloc[e2][’type’]+"""

Process:"""+cot.iloc[e2][’cot’]+"""

—

Now, analysis the user’s MBTI type with your reasoning process.

It should be noted that when the user’s first dimension result is E, the user’s fourth dimension result is more likely to be P.

The user’s post reads as follows:

"""+post+"""

Your response should follow the following format:

Result: {The four letters represent the type of mbti you guessed}

Process: {your reasoning process}.
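The prompt above fills each few-shot example from three fields (`posts`, `type`, `cot`) and asks for a reply in a fixed `Result:` / `Process:` layout. The following is a minimal sketch of how such examples could be rendered and how a reply could be parsed, not the authors' implementation. The function names `format_example` and `parse_prediction` and the regular expression are assumptions made for illustration, and plain dictionaries stand in for the `cot.iloc[...]` rows used in the record.

```python
import re
from typing import Dict, Optional, Tuple

# Hypothetical pattern: a four-letter MBTI code after "Result:",
# optionally wrapped in the braces shown in the requested format.
RESULT_RE = re.compile(r"Result:\s*\{?\s*([EI][SN][TF][JP])\b", re.IGNORECASE)


def format_example(index: int, row: Dict[str, str]) -> str:
    """Render one retrieved example (keys 'posts', 'type', 'cot') as prompt text."""
    return (
        f"Example {index}:\n"
        f"The posts of this user are: {row['posts']}\n"
        f"Result: {row['type']}\n"
        f"Process: {row['cot']}\n"
    )


def parse_prediction(reply: str) -> Tuple[Optional[str], Optional[str]]:
    """Split a model reply into (MBTI type, reasoning); missing parts are None."""
    match = RESULT_RE.search(reply)
    mbti = match.group(1).upper() if match else None
    start = reply.find("Process:")
    process = reply[start + len("Process:"):].strip() if start != -1 else None
    return mbti, process
```

Returning `None` for a missing field, rather than raising an error, lets a caller decide whether to retry the query or discard a reply that does not follow the requested format.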

To help readers better understand our method, we provide all prompts in the appendix; the record of the EER Text Split prompt is shown in Figure 8.

# Record of EER Text Split

I am a sentence sentiment adjudicator specialized in distinguishing the sources of emotions in sentences - whether they stem from the speaker’s current mood or their inherent personality. Your task is to assist me by examining the text and discerning the dominant influence in each sentence, based on these highly refined definitions:

1. Emotion Sentences: These sentences should be clearly linked to immediate, temporary feelings that arise from specific, recent incidents or current situations. The key is that the emotion should be an obvious reaction to a recent event and not indicative of a deeper, long-standing trait or belief.

• Highly Refined Definition: Look for signs that the emotion is an immediate response to a particular event, is temporary, and doesn’t reflect an ongoing pattern of thoughts or behaviors.

• Examples and Analysis:

- "Sex can be boring if it’s in the same position often. For example me and my girlfriend are currently in an environment where we have to creatively use cowgirl and missionary. There isn’t enough…" - Emotion, as it describes a current, specific situation causing temporary boredom.

- "I’m thrilled about the concert tonight!" - Emotion, as the excitement is tied to a specific, imminent event (the concert).

- "I am anxious because of the upcoming exam." - Emotion, since the anxiety is a temporary response to a specific future event (the exam).

- "I am angry with my friend for something they did last week." - Emotion, because the anger is a reaction to a specific, recent event (the friend’s action last week).

• Highly Refined Definition: Determine if the expression is reflective of a longstanding personality trait for emotion control, consistent over time and not a reaction to a specific, recent circumstance.

• Examples and Analysis:

- "I’m finding the lack of me in these posts very alarming." - Emotion Regulation, as it reflects a long-term concern about self-representation rather than an immediate emotional reaction.

- "Giving new meaning to ’Game’ theory." - Emotion Regulation, as it expresses a general viewpoint on a concept, not about temporary feelings.

- "Hello *ENTP Grin* That’s all it takes. Than we converse and they do most of the flirting while I acknowledge their presence and return their words with smooth wordplay and more cheeky grins." - Emotion Regulation, as it describes a consistent behavior pattern rather than a reaction to a specific event.

- "Real IQ test I score 127. Internet IQ tests are funny. I score 140s or higher. Now, like the former responses of this thread I will mention that I don’t believe in the IQ test. Before you banish…" - Emotion Regulation, as it reflects a long-standing skepticism towards IQ tests, not an immediate emotional reaction.

Special Case: Any sentences containing only a URL should be classified as ’Emotion Regulation’.

- "http://www.youtube.com/watch?v=4V2uYORhQOk" - Emotion Regulation, because it is a pure URL.

- "http://playeressence.com/wp-content/uploads/2013/08/RED-red-the-pokemon-master-32560474-450-338.jpg" - Emotion Regulation, as it is a URL.

Ambiguous Examples and Detailed Analysis:

1. "The last thing my INFJ friend posted on his facebook before committing suicide the next day. Rest in peace~" - Emotion. Although it mentions an INFJ personality type, the focus is on the immediate emotional reaction to the friend’s recent suicide.

2. "I often find myself reflecting deeply on my experiences." - Emotion Regulation. This indicates a consistent trait of introspection, not linked to a specific, recent event.

For each sentence provided, carefully determine whether it primarily reflects ’Emotion’ or ’Emotion Regulation’, based on these highly refined criteria. List each sentence and categorize it as either ’Emotion’ or ’Emotion Regulation’.

The texts from this author are: """ + post + """.

Respond in the following format without any reasoning or explanation:

0. [Emotion/Emotion Regulation]

1. [Emotion/Emotion Regulation]

2. [Emotion/Emotion Regulation]

Focus meticulously on these criteria to maximize the accuracy and consistency of classification.
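The numbered reply format requested above can be mapped back onto the input text with a small amount of parsing. The sketch below is an illustration under stated assumptions, not the authors' code: it assumes each `|||`-separated post is one indexed item starting from 0, and the names `parse_labels` and `split_by_label` are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Matches lines such as "0. [Emotion]" or "3. Emotion Regulation".
# The longer label is listed first so the alternation does not stop early.
LABEL_RE = re.compile(
    r"^\s*(\d+)\.\s*\[?\s*(Emotion Regulation|Emotion)\s*\]?\s*$",
    re.IGNORECASE,
)


def parse_labels(reply: str) -> Dict[int, str]:
    """Map each item index in the reply to 'Emotion' or 'Emotion Regulation'."""
    labels: Dict[int, str] = {}
    for line in reply.splitlines():
        match = LABEL_RE.match(line)
        if match:
            is_regulation = "regulation" in match.group(2).lower()
            labels[int(match.group(1))] = (
                "Emotion Regulation" if is_regulation else "Emotion"
            )
    return labels


def split_by_label(posts: str, labels: Dict[int, str]) -> Tuple[List[str], List[str]]:
    """Partition '|||'-separated posts into (emotion, emotion regulation) lists."""
    items = [item.strip() for item in posts.split("|||")]
    emotion = [s for i, s in enumerate(items) if labels.get(i) == "Emotion"]
    regulation = [
        s for i, s in enumerate(items) if labels.get(i) == "Emotion Regulation"
    ]
    return emotion, regulation
```

Items whose index is missing from the reply fall into neither list, so a caller can detect incomplete replies by comparing the combined length of the two lists with the number of input items.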

Appendix C Appendix: Prompt for Generation of auxiliary CoT

To help readers better understand our method, we provide all prompts in the appendix; the record of the prompt used to generate the CoT for the Reference Library is shown in Figure 9.

# Record of CoT Generation

Suppose you are a psychologist with a keen interest in personality types and online behavior. You know that MBTI is a tool used to assess a person’s psychological preferences and personality types, and there are 16 different types of MBTI, each consisting of four letters representing four dimensions of preference. And the four dimensions are:

Extroversion (E) or introversion (I): indicates whether a person is more inclined to draw energy from the outside world or the inside world.

Sense (S) or intuition (N): indicates whether a person is more inclined to focus on concrete facts and details, or abstract concepts and possibilities.

Thinking (T) or emotion (F): indicates whether a person is more inclined to make decisions using logic and principles, or values and emotions.

Judgment (J) or perception (P): indicates whether a person is more inclined to a planned and organized lifestyle, or a flexible and random lifestyle.

I will give you 45~50 posts from the same user, divided by |||. Please use MBTI personality analysis to help me analyze what the user’s MBTI is most likely to be. I will give you 45~50 posts from the same user, divided by |||, and the MBTI type of the user. Please use MBTI personality analysis to help me analyze why the user is this MBTI type.

Here is an example:

—

Example:

The posts of this user are: ’Wow, thank you for this thread! Physical vs. metaphysical is a great topic! I find that I am very much the same way your are. How can I put it….I have my days. :) The more I develop my xSxJ, the…|||my room. I like to be in my bad, next to my books, with my fan on and laptop nearby.|||I wouldn’t say that I can read souls - but I can see potential. I can sense sadness, happiness, uneasiness, etc. I can tell when someone is not happy where they are and with what they are doing with…|||thank you for being so polite! :)|||I find eye contact is key. I acknowledge their existence and importance by maintaining eye contact with them throughout the conversation. Not by staring in their eyes in a creeper way, but by making…|||As an INFJ male I can somewhat relate to your post. A very close lady friend of mine and I were like this for years! I had always liked her and could read her fairly well. I knew when she needed…’

Result: INFJ

Process: Based on the posts you provided, I would guess that the user is an INFJ personality type. INFJs are known as the advocates, who are quiet and mystical, yet very inspiring and tireless idealists. They are often deeply spiritual, compassionate, and intuitive. They value harmony, authenticity, and personal growth. They can also be very sensitive, private, and perfectionistic.

Some clues that suggest the user is an INFJ are:

First of all, I think the user is an introvert (I). The user prefers to spend time alone in his room with books and laptop, rather than socializing with many people. He also seems to be more focused on his inner world of thoughts and feelings, rather than the outer world of events and actions.

Secondly, I think the user is an intuitive (N). He is interested in abstract concepts and possibilities, such as physical vs. metaphysical. He can see the potential in people and situations, and he is not limited by the facts and details. He also has a wide range of knowledge and interests, and he is constantly learning and innovating.

Thirdly, I think the user is a feeler (F). He makes decisions based on his values and emotions, rather than logic and principles. He can sense the emotions of others and empathize with them. He is polite and respectful, and he values harmony and cooperation.

Lastly, I think the user is a judger (J). He prefers a planned and organized lifestyle, rather than a flexible and random one. He has a clear sense of direction and purpose, and he likes to achieve his goals. He also has a strong xSxJ side, which means he can use his sensing function to deal with reality and details when necessary.

Therefore, based on my analysis, I think the user’s MBTI type is INFJ. INFJs are known as the advocates or the counselors. They are idealistic, creative, compassionate, and insightful. They have a vision of how to make the world a better place, and they use their intuition and feeling to inspire and motivate others. They are also loyal, dedicated, and supportive of their friends and loved ones.

—

Now, you should generate the {Process}, according to the MBTI type and the posts given to you.

The user’s MBTI type is: """+type+""", and the user’s posts are:"""+post+""".

Your response should follow the following format:

Process: {your reasoning process}.

"""
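The prompt above yields one `Process` text per labeled sample, which is what the prediction prompt later reads as the `cot` field next to `posts` and `type`. The following sketch shows one way such a library could be assembled. It is an assumption-laden illustration rather than the authors' pipeline: `build_reference_library` is a hypothetical name, `llm` is any caller-supplied function that maps a prompt string to a reply string, and only the tail of the prompt is reproduced in the template.

```python
from typing import Callable, Dict, List

# Tail of the CoT generation prompt; doubled braces are literal braces.
COT_PROMPT_TAIL = (
    "Now, you should generate the {{Process}}, according to the MBTI type "
    "and the posts given to you.\n"
    "The user's MBTI type is: {mbti}, and the user's posts are: {posts}.\n"
    "Your response should follow the following format:\n"
    "Process: {{your reasoning process}}."
)


def build_reference_library(
    samples: List[Dict[str, str]],
    llm: Callable[[str], str],
) -> List[Dict[str, str]]:
    """Attach a generated 'cot' field to each sample with 'posts' and 'type'."""
    library = []
    for sample in samples:
        prompt = COT_PROMPT_TAIL.format(mbti=sample["type"], posts=sample["posts"])
        reply = llm(prompt).strip()
        # Drop the leading "Process:" marker so only the reasoning is stored.
        if reply.startswith("Process:"):
            reply = reply[len("Process:"):].strip()
        library.append(
            {"posts": sample["posts"], "type": sample["type"], "cot": reply}
        )
    return library
```

Passing the model call in as a parameter keeps the sketch independent of any particular API client and makes it easy to exercise with a stub function.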

