USER: A Unified Information Search and Recommendation Model based on Integrated Behavior Sequence
Authors: Jing Yao, Zhicheng Dou, Ruobing Xie, Yanxiong Lu, Zhiping Wang, Ji-Rong Wen
Date: 2021-09-30 · DOI: 10.1145/3459637.3482489

Abstract: Search and recommendation are the two most common approaches people use to obtain information. They share the same goal: satisfying the user's information need at the right time. Many Internet platforms and apps already provide both search and recommendation services, showing the demand and opportunity to handle both tasks simultaneously. However, most platforms consider the two tasks independently: they train a separate search model and recommendation model, without exploiting the relatedness and dependency between them. In this paper, we argue that jointly modeling these two tasks will benefit both of them and ultimately improve overall user satisfaction. We investigate the interactions between the two tasks in the specific domain of information content services. We propose first integrating the user's behaviors in search and recommendation into a heterogeneous behavior sequence, then utilizing a joint model to handle both tasks based on this unified sequence. More specifically, we design the Unified Information Search and Recommendation model (USER), which mines user interests from the integrated sequence and accomplishes the two tasks in a unified way.

On Internet platforms, search and recommendation are two major approaches to help users obtain the required knowledge. In this paper, we mainly focus on the domain of information content services, which aim to deliver news feeds, tweets, or web articles to users.
To improve users' satisfaction with search and recommendation results, many personalized search models and recommendation models have been proposed [5, 10, 13, 14, 20, 36, 37]. These models aim to mine user preferences from historical behaviors to infer the user's current intent and generate a personalized document ranking list that satisfies it. Typically, deep learning based personalized search models learn a representation of user interests from the user's search history to re-rank the candidate documents [13, 20, 21, 43, 44, 48]. Recommendation models likewise produce document ranking lists according to the user's browsing history [14, 36, 37, 39, 41]. However, most existing studies concentrate on a single task, either search or recommendation. They devise a model tailored to one task, but rarely consider their combination. Currently, more and more mobile apps and websites offer both information search and recommendation services. On the Toutiao platform shown in Figure 1, for example, users can not only actively issue queries to seek information, but also browse recommended articles. Indeed, some early attempts at combining the two services have already been deployed: some articles are recommended alongside clicked search results, and queries may be suggested at the end of a recommended news article. How to effectively aggregate the two tasks is therefore an essential and valuable problem. Some early studies [2] have discussed the similarity between search and recommendation: the two tasks share the same target, helping people get the information they require at the right time. Zamani and Croft [46] propose a vanilla joint learning framework to handle both tasks at the same time.
They train two separate models for the two tasks through a joint loss, but neglect the essential relatedness between them in human information-seeking behavior. In practice, users usually switch between the two services when obtaining information from the Web. Take the example in Figure 1: while browsing the article list generated by the recommendation system, a user is attracted by the article titled "New energy vehicle: Weilai ...?". After reading it, she switches to the search engine and issues a query to seek more knowledge about "New energy vehicle". She then browses the search results and the articles recommended alongside the clicked document to learn more. Such an information-seeking pattern, mixing proactive searches with passive recommendations, is common when we surf the Web. From the example, we observe that the user may switch between the search service and the recommender system for a single target; both the search behaviors and the browsing behaviors reflect her personalized information need. Therefore, jointly modeling the entire user behavior sequence is expected to discover real user intents more precisely. Moreover, close associations may exist between the two kinds of behaviors: browsing could stimulate search, and search might influence future browsing. Richer interaction and training data also become available. Motivated by this scenario, we focus on jointly modeling personalized search and recommendation in the information content domain, exploring the potential relatedness between their corresponding user behaviors so that they promote each other. To begin with, we integrate the user's historical search and browsing behaviors in chronological order, obtaining a simplified heterogeneous behavior sequence as shown in Figure 2, where browsed articles, queries issued by the user, and documents clicked under the corresponding query are marked with distinct symbols.
Then, we propose a Unified Information SEarch and Recommendation model (USER) to encode the heterogeneous sequence and solve the two tasks in a unified way. We argue that recommendation and personalized search share the same paradigm: recommendation can be treated as personalized search with an EMPTY query. Hence we design the USER model in a personalized ranking style, ranking candidate documents based on the input query (empty for recommendation) and the user preferences contained in the integrated behavior sequence. This model has several advantages. First, we aggregate the user's search and recommendation logs, alleviating the data sparsity faced by each single task. Second, based on the merged behavior sequence, more comprehensive and accurate user profiles can be constructed, improving personalization performance. Third, the potential relatedness between search and recommendation can be captured so that the two tasks essentially promote each other. Specifically, our USER model is composed of four modules. First, a text encoder learns representations for documents and queries. Second, a session encoder models the integrated behavior sequence in the current session, captures the relatedness between search and browsing behaviors, and clarifies the user's current intention. For a search behavior, which includes a query and its clicked documents with strong mutual relevance, we employ a co-attention structure [25] to fuse their representations. A transformer layer is then applied to capture the associations between the search and browsing behaviors in the session and fuse this context into the current intention. Third, a history encoder learns information from the long-term heterogeneous history sequence as an enhancement. Finally, we build a unified task framework to complete the two tasks in a unified way. We first pre-train the unified model with the training data from both tasks, alleviating data sparsity.
Then, we make a copy for each task and finetune it with the corresponding task data to fit the individual data distribution. We experiment on a dataset of search and browsing behaviors constructed from a real-world information content service platform with both search and recommendation engines. The results verify that our model outperforms separate baselines and alleviates data sparsity. Our main contributions are summarized as follows: (1) We address both personalized search and recommendation, and, for the first time, integrate the separate behaviors of the two tasks into a heterogeneous behavior sequence. (2) We model the relatedness between a user's search and browsing behaviors to promote both personalized search and recommendation. (3) We propose a unified search and recommendation model (USER) that accomplishes the two tasks in a unified way, with an encoder for the integrated behavior sequence and a unified task framework. Personalized Search. Personalized search customizes search results for each user by inferring her personal intents. Early studies relied on features and heuristic methods to analyze user interests. Focusing on click features, Dou et al. [12] proposed P-Click to re-rank documents with their historical click counts. Topic-based features were applied to build user profiles [4, 9, 17, 26, 30, 31, 35]. The Open Directory Project (ODP) [26] and learned or latent topic models [6] were used to obtain the topic-based information of a web page. In addition, the user's reading level and location were applied for personalization [3, 11]. Multiple features were combined with a learning-to-rank method [7, 8] to compute a personalized score [5, 29]. Recently, deep learning has been applied to capture potential user preferences. Song et al. [27] leveraged personal data to adapt a general ranking model. Ge et al. [13] devised a hierarchical RNN with query-aware attention to dynamically mine preference information. Lu et al.
[20] employed a GAN [15] to enhance the training data. Yao et al. [44] adopted reinforcement learning to learn user interests. Zhou et al. [49] explored re-finding behaviors with a memory network. The latest studies are committed to disambiguating the query by introducing entities [21], training personal word embeddings [43], or involving the search history as context [48]. All these models are specially designed for the personalized search task. Information Recommendation Models. Personalized content recommendation is critical for helping users alleviate information overload and find something interesting. Traditional recommendation systems mainly depended on collaborative filtering (CF) [24] and factorization machines (FM) [23]. With the emergence of deep learning, many models combined both low- and high-order feature interactions, such as Wide & Deep [10] and DeepFM [16].

[Figure 2: Illustration of the integrated behavior sequence. The target behavior is personalized search with a query or recommendation with an empty query.]

In particular, representation-based models have been studied for recommending news articles, which carry abundant textual information. These models include two modules: a text encoder to obtain article representations and a user encoder to learn a user representation from her browsing history. Articles are then ranked based on their relevance to the user. Okura et al. [22] devised an auto-encoder to learn news representations and used an RNN to generate user representations. Wu et al. [36] learned article vectors from titles, bodies, and topic categories, with the user representation being a weighted sum of the browsed news vectors. Wu et al. [37] set user embeddings to generate personalized attention for calculating the article and user representations. They also exploited multi-head self-attention [28] to capture contextual information [39]. LSTUR [1] kept both short-term and long-term user profiles.
To enhance text representations, entities in the article and their neighbors in the knowledge graph have been considered [19, 32, 33]. GNN structures [40] were also adopted to model high-order relatedness between users and articles [14, 18]. In all these models, only the recommendation task is discussed. Joint Search and Recommendation. Some studies have considered both the search and recommendation tasks. In e-commerce, an early work [34] built a unified recommendation and search system by merging their features. Zamani and Croft [46] proposed a joint learning framework that simultaneously trains a search model and a recommendation model by optimizing a joint loss. For the situation with only recommendation data but no search logs, a multi-task framework was trained on browsing interactions [47]. These joint methods simply combine the two tasks and train two separate models through multi-task learning or a joint loss, without exploring the more essential dependency between them. Search history has also been used to help generate recommendations for users with little browsing history [38, 45]; such methods target a single task, using data from the other task only as complementary information. In this paper, we propose a unified model that solves the two tasks at the same time, mining the relatedness between their corresponding user behaviors so that they promote each other. Search and recommendation are two main approaches to help people obtain information, and many separate personalized search models and recommendation models have been proposed. As analyzed in Section 1, people usually achieve their information targets through a mixture of proactive searches and passive recommendations, which is common on information content service platforms with both search and recommendation engines. Both kinds of behaviors reflect the user's information need and preferences.
Thus, compared to existing separate approaches, jointly modeling the two tasks and exploiting the relatedness between them has the potential to make them promote each other. In this paper, we integrate the user's search and browsing behaviors into one sequence to discover more accurate user interests, then design a unified model to solve the two tasks in a unified way. Next, we define the new problem to be handled. Recall that we focus on the information content domain; let us formulate a user's behaviors with notations. On an information content service platform with both a search engine and a recommendation engine, the user can browse articles in the recommendation system, and issue queries and click satisfying documents in the search engine. All these behaviors are sequential, so we integrate them into a heterogeneous behavior sequence in chronological order. Following existing session segmentation methods [13, 20], we divide the user's whole behavior sequence into sessions, using 30 minutes of inactivity as the boundary. Past behaviors in the current session are viewed as the short-term history; the earlier sessions constitute the long-term history. Specifically, we denote the user's history sequence as H = {H^l, S_M} = {{S_1, . . . , S_{M-1}}, S_M}, where M is the number of sessions, S_M is the current session, and H^l collects the previous sessions. Each session corresponds to a sub-sequence containing both kinds of behaviors, such as {d_1, d_2, (q_3, d_{3,1}, d_{3,2}), . . .}, where d denotes a browsed or clicked document and q an issued query. We illustrate the whole behavior sequence in Figure 2. The horizontal edges indicate the sequential relationship between two consecutive actions, while the slanted edges point to the documents clicked under the corresponding query. The blue vertical lines separate sessions. For example, in the current session the user first browses two articles in the recommendation system, then enters a query in the search engine and clicks a document under this query.
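The session segmentation described above can be sketched as follows. The 30-minute inactivity threshold comes from the paper; the function and variable names are our own illustration, not the authors' code:

```python
from datetime import datetime, timedelta

def split_sessions(behaviors, gap_minutes=30):
    """Split a chronological behavior sequence into sessions.

    `behaviors` is a list of (timestamp, action) tuples sorted by time.
    A new session starts whenever the gap since the previous action
    exceeds `gap_minutes` of inactivity.
    """
    sessions, current = [], []
    prev_time = None
    gap = timedelta(minutes=gap_minutes)
    for ts, action in behaviors:
        if prev_time is not None and ts - prev_time > gap:
            sessions.append(current)  # close the previous session
            current = []
        current.append(action)
        prev_time = ts
    if current:
        sessions.append(current)
    return sessions
```

The last session produced is the short-term history S_M; all earlier ones form the long-term history H^l.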
At the current moment, the user performs a target behavior: either a search with an issued query or browsing. For both tasks, we are supposed to infer the user's intent and return a personalized document list. Because of this shared paradigm, we regard the recommendation task as personalized search with an empty query and complete the two tasks in a unified personalized ranking style. Given the issued query (or an empty one), the model is required to return a personalized document list based on the query and the user interests learned from the integrated behavior sequence. The architecture of our USER model is shown in Figure 3. First, the text encoder learns representations for documents and queries. Second, the session encoder models the user's integrated behavior sequence within the current session to clarify her information need. Then, the history encoder enhances the user's intent representation by mining information from the long-term history. Finally, we design a unified task framework to complete personalized search and recommendation in a unified way. We present the details of each module in the remaining parts of this section. For each query q, clicked document, and browsed article, we apply the text encoder to learn semantic representations. Taking a browsed article d = [w_1, w_2, . . . , w_N] as an example, where N is the number of words in the article, the text encoder can be divided into three sub-layers.

[Figure 3: The architecture of our USER model. There are four major components: the text encoder to learn representations for queries and documents; the session encoder to model integrated behaviors in the current session; the history encoder to mine information from the long-term behavior sequence; and the unified task framework to complete both tasks in a unified way.]

The first sub-layer
is the word embedding layer, which converts the word sequence into a matrix of word vectors, i.e., Emb_d = [e_1, e_2, . . . , e_N] ∈ R^{N×k}, where e_i corresponds to the low-dimensional word vector of w_i. In addition, the context within the article helps determine the true meaning of a word. For example, the different meanings of "Apple" in "Apple fruit" and "Apple company" can be distinguished based on the contextual words "fruit" and "company". Therefore, we set a word-level transformer [28] as the second sub-layer to obtain context-aware word representations C_d ∈ R^{N×k} by capturing interactions between words: C_d = Transformer(Emb_d). Details about the transformer can be found in [28]. The last sub-layer is a word-level attention layer. In a piece of text, different words contribute differently to expressing its semantics. For instance, in the sequence "symptoms of novel coronavirus pneumonia", the word "symptoms" is very informative for learning the text representation, while "of" carries little information. To highlight important words in a text sequence, we exploit a word-level attention mechanism that gives them larger weights. We set a trainable vector v as the query of the attention mechanism. The weights α_i for all words are computed as α = softmax(v^T tanh(W C_d^T + b)), where W and b are parameters. The final contextual representation of the browsed document is the weighted sum of all the word vectors, i.e., d = Σ_{i=1}^{N} α_i c_i. Contextual representations of the query and clicked document are computed in the same way. At the current time, the user takes a target action, either search or browsing. We represent her intention with a vector x. If the user issues a query for search, the intention is initialized with the text representation of this query computed by the text encoder. Otherwise, we use the corresponding trainable user embedding Emb_u as the initialization.
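The word-level attention sub-layer can be sketched numerically as follows. The shapes and names (`word_attention`, `W`, `b`, `v`) are our assumptions for illustration, not the authors' exact implementation:

```python
import numpy as np

def word_attention(H, v, W, b):
    """Additive word-level attention over contextual word vectors.

    H: (n_words, d) context-aware word representations from the transformer,
    v: (a,) trainable attention query vector, W: (a, d) and b: (a,) parameters.
    Returns the (d,) attended text representation (weighted sum of word vectors).
    """
    scores = v @ np.tanh(W @ H.T + b[:, None])   # (n_words,) unnormalized scores
    weights = np.exp(scores - scores.max())      # softmax over words
    weights = weights / weights.sum()
    return weights @ H                           # weighted sum -> (d,)
```

When all word vectors are identical, the softmax weights are uniform and the output equals that shared vector, a quick sanity check on the weighting.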
This step is realized by a select gate:

x = q if the target behavior is search; x = Emb_u if the target behavior is browsing. (3)

Then, we mine information from the user's history of search and browsing behaviors to clarify this personal intent x. According to existing studies [13, 48], behaviors within a session show consistency in the user's information need. Thus, the user's past behaviors during the current session provide rich contextual information for deducing her current intention. In the unified search & recommendation scenario we study, a session contains both search and browsing actions, as shown in Figure 2. Several possible relationships exist between the behaviors in the heterogeneous sequence: (1) A document clicked under a query satisfies the information need expressed by that query, indicating strong relevance between the query and the document. (2) After the user browses a series of recommended articles, she might be triggered to seek more related information through proactive searches. (3) Queries are actively issued by the user and explicitly show her preferences; with these queries and the clicked documents, we can figure out the points of interest the user focuses on when browsing articles. We design a session encoder to capture these associations in the current session and employ the session context to enhance the intent representation. First, for a historical query and its clicked documents, we need to learn the strong relevance between them: the clicked documents indicate the user's intention behind the query keywords, and the query highlights the important words in the documents. Thus, we adopt the co-attention structure [25] to calculate their representation vectors by fusing their interactive information, instead of the vanilla word-attention mechanism. Taking a query q and the clicked documents d_1, d_2, . . .
as an example, the detailed computation is as follows. First, we obtain the contextual vector matrices C_q and C_d for the query and each document through the word embedding layer and the word-level transformer of our text encoder. The vectors of all clicked documents are concatenated together as C_D = [C_{d_1}; C_{d_2}; . . .]. Then, we compute an affinity matrix between C_q and C_D as A = tanh(C_q W_a C_D^T), where W_a is a weight matrix to be learned. The attention weights for the query and documents are calculated from the interactive features in this affinity matrix through additional learned projections; the resulting α_q and α_d are the attention weights over query keywords and document terms, respectively. We calculate the attended representations of the query and documents as the weighted sums of the contextual vectors C_q and C_D. The two vectors are concatenated and passed through an MLP layer to produce the representation of a historical search behavior, i.e., b = MLP([q̃; d̃]). A browsing behavior made in recommendation corresponds to only a browsed article, so its representation is simply the article representation calculated by the text encoder. With the representations of all past behaviors in the current session, B = {b_1, b_2, . . .}, we can capture the relationships between the search and browsing behaviors and fuse the session context into the user's current intention. We combine B with the target intention x and pass them through a session-level transformer for interaction. Because the behaviors are sequential and heterogeneous, we add the position and type information of each behavior for clarification; the action type is either search (S) or browsing (B). Finally, the output of the last position represents the user's current intention fused with the session context: x^s = Transformer_last([b_1 + p_1 + t_1, . . . , b_m + p_m + t_m, x + p_{m+1} + t_{m+1}]), where p_i and t_i are the position embedding and type embedding, and Transformer_last(·) means taking the output of the last position.
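A simplified sketch of the co-attention idea follows. The paper's formulation uses additional projection parameters; the max-pooling over the affinity matrix below is a common simplification we assume for illustration, not necessarily the authors' exact computation:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def co_attention(Q, D, W):
    """Simplified co-attention between query terms and document terms.

    Q: (nq, d) query term vectors, D: (nd, d) concatenated clicked-document
    term vectors, W: (d, d) learned weight matrix. Builds an affinity matrix,
    pools it into per-side attention weights, and returns the two attended
    vectors (query side, document side).
    """
    A = np.tanh(Q @ W @ D.T)          # (nq, nd) affinity between term pairs
    a_q = softmax(A.max(axis=1))      # attention over query terms
    a_d = softmax(A.max(axis=0))      # attention over document terms
    return a_q @ Q, a_d @ D           # attended (d,) representations
```

The two outputs correspond to the attended query and document vectors that are concatenated and fed to the MLP in the text above.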
With the session encoder described above, we clarify the user's current information need with the help of the short-term history, obtaining x^s. When there is little session history, however, the intention may remain ambiguous due to the lack of session context. The user's long-term behavior history often reflects relatively stable interests and provides additional assistance, so we further model the long-term history to enhance the intent representation based on x^s. First, we process each historical session with the session encoder to capture the connections between search and browsing behaviors, obtaining contextual representations for all historical behaviors. We concatenate all session sub-sequences into one long behavior sequence H and combine it with the target intention as [H, x^s]. Then, a history-level transformer is applied to this long-term heterogeneous sequence to fuse the history information into the current intention. To preserve the sequential information between actions, we add the position of each behavior, {1, 2, . . .}. Finally, we take the output of the last position as the user's intent representation enhanced by the long-term behavior history, denoted as x^h = Transformer_last([h_1 + p_1, . . . , h_n + p_n, x^s + p_{n+1}]), where p_i is the position embedding. As observed in some news recommendation models [37, 39], the user's attention to a document is also affected by her interests. Besides, the user might intend to re-find a specific document that appeared in the history, as analyzed in [49]. Thus, for a candidate document, we use the long-term history to enhance the representation calculated by the text encoder in the same way as for the target intent, obtaining d^h. We will use d^h together with x^h to calculate the personalized ranking score for the candidate document in the unified task framework introduced in the next part. For the personalized search and recommendation tasks in the information content domain, the main difference between them is whether there is an issued query.
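Both the session-level and history-level encoding steps share the same pattern: append the target to the behavior sequence, add position (and, for the session level, type) embeddings, run a transformer, and take the output at the last position. This can be sketched with a placeholder transformer; all names here are illustrative:

```python
import numpy as np

def encode_sequence(behav_vecs, target_vec, pos_emb, type_emb, transformer):
    """Fuse a behavior sequence into the target intention.

    behav_vecs: (m, d) behavior representations, target_vec: (d,) current
    intention, pos_emb/type_emb: (>=m+1, d) embedding tables added per
    position, transformer: any callable mapping (n, d) -> (n, d).
    Returns the transformer output at the last position, i.e. the
    context-enhanced intention.
    """
    X = np.vstack([behav_vecs, target_vec[None, :]])  # append target last
    X = X + pos_emb[: len(X)] + type_emb[: len(X)]    # positional/type info
    return transformer(X)[-1]                         # last-position output
```

For the history level, the type embeddings would simply be zero and `transformer` the history-level transformer.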
In the problem definition, we unify the two tasks by regarding recommendation as personalized search with an empty query. We represent the user's current intention as x, initialized with the issued query q for search or the user embedding Emb_u for recommendation. The unified problem is to rank each candidate document d by a personalized relevance score calculated from the current intention, the query (empty for recommendation), and the user history, denoted score_unified(d | q, x, H). Through the text encoder, session encoder, and history encoder, we obtain the representations of the user's current intention and the candidate document, i.e., x^s, x^h, d, and d^h. We calculate the relevance between each pair of them by the cosine similarity sim(·, ·). Moreover, for the personalized search task, the correlation between the candidate document and the query keywords is also critical. Thus, we additionally model the interactive features between the context-aware representations of the query and document, C_q and C_d, using the interaction-based component KNRM [42] to calculate an interaction score inter(q, d); the detailed calculation can be found in [42]. Besides, following [13, 20], we also extract several relevance-based features F for personalized search. When scoring articles in recommendation, the interaction score and features are all empty. Finally, the score for the candidate document is calculated by aggregating all these scores and features with an MLP layer: score_unified(d) = Φ([sim(x^s, d); sim(x^s, d^h); sim(x^h, d); sim(x^h, d^h); inter(q, d); F]), where Φ(·) represents an MLP layer without an activation function. For both the search and recommendation tasks, we generate the personalized document list by calculating relevance scores in this way. We adopt a pairwise manner to train our USER model.
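A minimal sketch of the score aggregation follows, assuming a linear output layer and an illustrative feature layout; the exact set of similarity pairs and features in the paper may differ:

```python
import numpy as np

def cosine(u, v):
    """Cosine similarity with a small epsilon for numerical safety."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))

def unified_score(intent, intent_hist, doc, doc_hist, inter_score, rel_feats, w, b):
    """Aggregate similarities and features into one unified ranking score.

    intent/doc: session-level representations; intent_hist/doc_hist: the
    history-enhanced ones. inter_score is the KNRM interaction score and
    rel_feats the hand-crafted relevance features (both zero/empty for
    recommendation). The final layer is a linear map without activation.
    """
    feats = np.array([
        cosine(intent, doc),
        cosine(intent, doc_hist),
        cosine(intent_hist, doc),
        cosine(intent_hist, doc_hist),
        inter_score,
        *rel_feats,
    ])
    return float(w @ feats + b)
```

For recommendation, `inter_score` would be 0 and `rel_feats` zeros, so the score reduces to the learned combination of cosine similarities.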
For both personalized search and recommendation, we construct each training sample as a document group comprised of a positive document and K negative documents presented in the same impression, represented as {d^+, (d^-_1, . . . , d^-_K)}. For each document group, we aim to maximize the score of the positive document and minimize those of the negative documents. The loss L is computed as the negative log-likelihood of the positive sample:

L = -log( exp(score_unified(d^+)) / ( exp(score_unified(d^+)) + Σ_{k=1}^{K} exp(score_unified(d^-_k)) ) ), (13)

where score_unified(·) abbreviates score_unified(· | q, x, H). We minimize the loss with the Adam optimizer. In the unified scenario, we have access to both search and recommendation data. Thus, we can train one USER model with data from the two tasks and apply it to both. However, there may be gaps between the data distributions of the search and recommendation tasks, and a single unified model trained on both is unlikely to achieve the best performance on each. Therefore, we propose an alternative training method: we first pre-train a unified model with data from both tasks, then make a copy for each task and finetune it with the corresponding task data to fit the individual data distribution. In this way, the model not only benefits from more training data but also adapts to the specific task. Dataset. There is no public dataset with both search and recommendation logs of a shared set of users in the information content domain. To evaluate our unified model, we construct a dataset comprised of users' search and browsing behaviors from a popular information service platform that has both search and recommendation engines. We randomly sample 100,000 users, then obtain their search logs and the articles they browsed in the recommendation system over three months. The whole log is preprocessed via data masking to protect user privacy.
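The group-wise loss in Eq. (13) is softmax cross-entropy over one positive and K negatives, which can be computed stably as:

```python
import numpy as np

def pairwise_loss(pos_score, neg_scores):
    """Negative log-likelihood of the positive document in a group.

    Implements L = -log( exp(s+) / (exp(s+) + sum_k exp(s-_k)) ),
    i.e. softmax cross-entropy with the positive as the target class.
    Subtracting the max score before exponentiating avoids overflow.
    """
    scores = np.concatenate([[pos_score], neg_scores])
    scores = scores - scores.max()           # numerical stability shift
    log_z = np.log(np.exp(scores).sum())     # log partition function
    return float(log_z - scores[0])          # -log p(positive)
```

With all scores equal, the loss is log(K+1); as the positive score dominates, the loss approaches zero.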
Each piece of search data includes an anonymous user ID, the action time, a query, the top 20 returned documents, click tags, and click dwell times. For each recommendation record, only the browsed article is kept, without the other presented but unclicked documents. We therefore generate pseudo unclicked documents for each browsed article for model training: we rank all documents in the recommendation log by a weighted score of popularity, measured by click count, and topic similarity to the browsed article, measured by cosine similarity, and sample the nine top-ranked documents as negatives for each browsed article. The original recommendation list is randomly shuffled. All search and browsing behaviors of each user are merged into a sequence in chronological order, and we separate each user's behavior sequence into sessions with 30 minutes of inactivity as the boundary [13, 20]. Users' browsing behaviors are usually more frequent than their search behaviors, which leads to an imbalance in the dataset. Since we intend to explore the relatedness between search and browsing behaviors, we sample sessions containing both kinds of actions, together with the three sessions before and after them. To guarantee each user has enough history for building a user profile, we treat the log data of the first eight weeks as the historical set and the remaining five weeks as the experimental data, which is split into training, validation, and test sets in a 4:1:1 ratio. The statistics are shown in Table 1. Metrics. Following existing works [37, 39], the recommendation task is also cast as re-ranking candidate documents. For both tasks, we take sat-clicked documents, those with more than 30 seconds of dwell time, as relevant and the others as irrelevant. We choose common ranking metrics to evaluate our model and the baselines, including MAP, MRR, P@1, Avg.C (average position of the clicked documents), NDCG@5, and NDCG@10.
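The pseudo-negative sampling step can be sketched as follows. The mixing weight `alpha` is our assumption, since the text does not report how popularity and topic similarity are weighted:

```python
import numpy as np

def sample_pseudo_negatives(cand_ids, popularity, topic_vecs, browsed_vec,
                            alpha=0.5, k=9):
    """Rank candidates by a weighted score of click-count popularity and
    cosine topic similarity to the browsed article; return the top-k ids
    as pseudo unclicked negatives.
    """
    pop = np.asarray(popularity, dtype=float)
    pop = pop / (pop.max() + 1e-8)                       # normalize click counts
    sims = topic_vecs @ browsed_vec / (
        np.linalg.norm(topic_vecs, axis=1) * np.linalg.norm(browsed_vec) + 1e-8)
    score = alpha * pop + (1 - alpha) * sims             # weighted combination
    order = np.argsort(-score)[:k]                       # top-k by score
    return [cand_ids[i] for i in order]
```

Sampling popular, topically similar documents as negatives makes the pseudo negatives harder than random ones, which is the apparent intent of this weighting.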
For recommendation, we also adopt AUC to measure click prediction quality. As basic references, the original search results are those returned by the search engine, and the original recommendation lists are randomly shuffled. Besides, we compare our model with state-of-the-art personalized search models, news recommendation models, and the joint framework [46]. HRNN [13]: a hierarchical RNN model with query-aware attention to dynamically mine relevant history information. RPMN [49]: this model captures complex re-finding patterns over previous queries or documents with a memory network. PEPS [43]: Yao et al. observe that different users understand the same word differently owing to their different knowledge, and learn personal word embeddings to clarify the query keywords. HTPS [48]: it encodes the user's history as context to disambiguate the current query; we adapt it to the unified scenario by adding the user's browsed articles to her history. NPA [37]: the model uses user embeddings to compute personalized word- and news-level attention, highlighting important words and articles to generate informative news and user representations. NRMS [39]: this model utilizes multi-head self-attention to learn news and user representations by capturing the relatedness between words and between browsed articles; we adapt it to the unified scenario by adding the documents clicked in the search history. LSTUR [1]: it models short-term user interests from recently clicked articles with a GRU and keeps a long-term profile as a trainable user embedding. GERL [14]: it applies a transformer to the user's interaction graph to capture high-order associations between users and news. JSR [46]: a general joint framework that trains a separate search model and a recommendation model by optimizing a joint loss; we select HRNN for search and NRMS for recommendation. USER: the unified model proposed in this paper.
USER-S and USER-R are variants used in the independent search and recommendation scenarios respectively. They share the same structure as USER but have access only to the data of that single task.
We conduct multiple sets of experiments to decide the model parameters as follows 3. The size of the word embeddings, pre-trained by word2vec on all logs, and of the user embeddings is 100. Since users' click decisions are usually made based on titles, we use titles in our experiments instead of complete articles. For each query or document title, the maximum sequence length is 30. In the history sequence, we keep up to 20 sessions, with at most 5 user behaviors per session. The transformer has 8 attention heads and a hidden dimension of 50. The number of negative samples in each document group is 4. The learning rate is 1e-3.
3 The code is available at https://github.com/jingjyyao/Personalized-Search/tree/main/USER
6 EXPERIMENTAL RESULTS
We compare all models in various scenarios: pure personalized search with only search data, pure recommendation with only recommendation data, and the unified scenario with both. The results are shown in Table 2. We have several findings: (1) Comparison of the same model trained on the independent task dataset versus the unified dataset. For HTPS, NRMS and USER, performance on the unified dataset is better than on the independent task data. For example, the personalized search model HTPS trained on the unified history improves MAP by 0.6% over the version trained with pure search data, and the recommendation model NRMS achieves a 1.6% improvement in MAP with the unified data. Consistently, our USER model in the unified scenario also improves over USER-S and USER-R on all metrics. Unlike the independent task data, the unified dataset comprises both search and browsing behaviors, from which we can mine the user's preferences more comprehensively.
The results demonstrate that a more precise user interest profile can be built from the integrated behavior sequence, improving ranking quality. (2) Comparison of our USER model with the separate personalized search and recommendation baselines. The USER-S and USER-R variants achieve better results than the corresponding baselines in the independent scenarios, and greater improvements are observed for the complete USER model in the unified case with both kinds of data (significant under a paired t-test at the p<0.05 level). Specifically, in pure personalized search, USER-S outperforms HTPS; in recommendation, USER-R outperforms NRMS on all evaluation metrics. This shows that our history encoders can effectively learn user interests to improve personalized rankings. Furthermore, the complete USER model outperforms HTPS by a larger margin in the unified scenario. We attribute this to USER being pre-trained on both the search and recommendation tasks over the unified dataset, thereby benefiting from more training data. (3) Comparison of our unified model USER with the general joint framework JSR. Compared to JSR, USER gains much more over its separate variants (USER-S and USER-R) from training on the unified data: the HRNN and NRMS models combined in JSR perform similarly to the original HRNN and NRMS, whereas USER achieves a 1.14% improvement in MAP over USER-S and a 1.20% improvement over USER-R. JSR simply combines a personalized search model and a recommendation model through a joint loss, without exploiting any interactions between them. In the USER model, we instead integrate the two kinds of behaviors into a heterogeneous sequence and complete both tasks based on this sequence. The results suggest that USER is a better approach to aggregating the two tasks, capturing the associations between them so that they promote each other.
To conclude, with the unified data comprising the user's search and browsing logs, a more comprehensive user profile and more training samples can be obtained for personalization. Moreover, the USER model is able to capture the relatedness between the two tasks so that they promote each other.
To analyze how the major modules of our model affect performance, we conduct several ablation studies with the following variants. USER w/o Session Encoder: We discard the short-term history and the session encoder. USER w/o History Encoder: We remove the long-term history and the history-level transformer. USER w/o Unified Pre-train: We skip pre-training one unified model on the training data of both tasks and instead train two separate models from scratch, still with integrated history sequences. USER w/o Unified Data: With only the separate task data rather than the unified dataset, USER degrades to USER-S and USER-R respectively. From the results shown in Table 3, we observe that: (1) Removing either the session encoder or the history encoder, together with the corresponding behavior history, causes a decline on all evaluation metrics for both personalized search and recommendation. This confirms that both encoders mine information from the user's history to help personalization: the session encoder captures the user's consistent intention within the current session, while the history encoder learns stable user interests from the long-term history; together they clarify the user's current information need. (2) Ranking quality decreases when we skip the unified pre-training, especially for the personalized search task, which confirms the benefit of the additional training samples constructed from both tasks in our unified model. Skipping it has little impact on the recommendation task; a possible reason is that browsing behaviors are usually far more frequent than search behaviors, so the recommendation task already has enough training samples on its own.
Discarding the unified data leads to an even greater decline on both tasks. On a separate task dataset, only one kind of user behavior is available, both in the history and in training. This decline demonstrates that the integrated behavior sequence is more informative.
We further test our model and the baselines on two subsets: the first search/recommendation behavior of each user, and sessions containing both search and recommendation. The results are shown in Figure 4, using the improvement of MAP over the original ranking as the metric.
First Search/Recommendation Behavior. We claim that USER can alleviate data sparsity by merging the user's search and recommendation logs. To verify this, we sample each user's first search record and first recommendation record in the test data to construct a subset. In this subset, there is little search history for each piece of search data and little browsing history for each recommendation record; this is a cold-start case for the separate personalized search and recommendation tasks.
Figure 5: Illustration of the heterogeneous behavior sequence in a session and the attention weights of the current action over historical behaviors in different models. A darker area indicates a larger weight.
From Figure 4 (a) and (c), we find that USER trained on the unified dataset outperforms the corresponding baselines trained with only separate search or recommendation data. In the unified setting, for the user's first search behavior with little search history, the browsing history supplements the mining of the user's preferences; likewise, for the first recommendation sample, the search history serves as auxiliary information. Thus, combining the two tasks and their corresponding behaviors indeed alleviates the data sparsity and cold-start problems. Sessions with Search & Recommendation.
In this paper, we aim to explore the relatedness between the user's search and browsing behaviors to promote both tasks. Therefore, we sample a subset comprising sessions with both kinds of behaviors, and select several independent baselines and JSR for comparison. From Figure 4 (b) and (d), we find that USER achieves the best performance on both tasks, while the joint model JSR, consisting of HRNN and NRMS, performs similarly to the separate models. In the USER model, we infer the user's intent from the integrated behavior sequence, so the potential relatedness between the two kinds of behaviors can be captured to promote personalization, especially in these sessions with both behaviors. JSR, in contrast, trains two separate models through a joint loss and may thus have difficulty learning the interactions between the two tasks. These results also suggest that USER copes with the unified scenario better than JSR.
In this paper, we focus on the situation where both search and recommendation services are provided in the information content domain, and design a unified model (USER) to jointly handle the two tasks. To illustrate the advantages of our model more intuitively, we conduct a case study of a user's mixed behaviors within a session. Moreover, we discuss the impact of the user's historical behaviors on the current action in USER, HTPS and NRMS, as indicated by the attention weights shown in Figure 5. Observing the user's behaviors in the session, we find that the preferences reflected by the search behaviors and the browsing behaviors are consistent, mostly concerning sushi, small muscle fish and Japanese jack mackerel. Besides, there is clear relatedness between the two kinds of behaviors: for example, the user browses the recommended article titled "The story of Japanese jack mackerel" and then issues the query "Japanese jack mackerel" to seek more relevant information.
Thus, integrating the two tasks has the potential to make them promote each other. With the aggregated behaviors, we can mine more precise information about the user's interests to help the current ranking. Take the last search query "Japanese jack mackerel" as an example. This query is strongly relevant to both the historical query "Japanese jack mackerel" and the browsed article "The story of Japanese jack mackerel", and USER pays high attention to both of these strongly relevant behaviors. HTPS, however, which is designed for the independent search case, can only attend to the historical queries, without any information about the browsing actions. With regard to the last recommendation, the historical query "small muscle fish" also reflects relevant user interests, which USER highlights. This case study further demonstrates the value of aggregating the two separate tasks and supports our proposal of the unified model.
In this paper, we focus on the connections between personalized search and recommendation in the information content domain, and explore an effective approach to jointly modeling them. We integrate the user's search and browsing behaviors into a heterogeneous behavior sequence, and propose the unified model USER, which includes encoders that mine information from the heterogeneous behavior sequence for personalization and a unified task framework that solves both tasks in a unified ranking style. We experiment on a dataset constructed from a real-world commercial platform and comprising both kinds of behaviors. The results confirm that our model outperforms the state-of-the-art separate baselines on both tasks. In the future, we will explore better ways to combine the two tasks.
REFERENCES
[1] Neural News Recommendation with Long- and Short-term User Representations
[2] Information Filtering and Information Retrieval: Two Sides of the Same Coin?
[3] Inferring and Using Location Metadata to Personalize Web Search
[4] Classification-enhanced Ranking
[5] Modeling the Impact of Short- and Long-term Behavior on Search Personalization
[6] Latent Dirichlet Allocation
[7] Learning to Rank Using Gradient Descent
[8] Ranking, Boosting, and Model Adaptation
[9] Towards Query Log Based Personalization Using Topic Models
[10] Wide & Deep Learning for Recommender Systems
[11] Personalizing Web Search Results by Reading Level
[12] A Large-scale Evaluation and Analysis of Personalized Search Strategies
[13] Personalizing Search Results Using Hierarchical RNN with Query-aware Attention
[14] Graph Enhanced Representation Learning for News Recommendation
[15] Generative Adversarial Networks
[16] DeepFM: A Factorization-Machine based Neural Network for CTR Prediction
[17] Building User Profiles from Topic Models for Personalised Search
[18] Graph Neural News Recommendation with Unsupervised Preference Disentanglement
[19] KRED: Knowledge-Aware Document Representation for News Recommendations
[20] PSGAN: A Minimax Game for Personalized Search with Limited and Noisy Click Data
[21] Knowledge Enhanced Personalized Search
[22] Embedding-based News Recommendation for Millions of Users
[23] Factorization Machines
[24] Item-based Collaborative Filtering Recommendation Algorithms
[25] dEFEND: Explainable Fake News Detection
[26] Web Search Personalization with Ontological User Profiles
[27] Adapting Deep RankNet for Personalized Search
[28] Attention Is All You Need
[29] Context Models for Web Search Personalization
[30] Search Personalization with Embeddings
[31] Temporal Latent Topic User Profiles for Search Personalisation
[32] DKN: Deep Knowledge-Aware Network for News Recommendation
[33] Knowledge Graph Convolutional Networks for Recommender Systems
[34] Unified Recommendation and Search in E-Commerce
[35] Enhancing Personalized Search by Mining and Modeling Task Behavior
[36] Neural News Recommendation with Attentive Multi-View Learning
[37] NPA: Neural News Recommendation with Personalized Attention
[38] Neural News Recommendation with Heterogeneous User Behavior
[39] Neural News Recommendation with Multi-Head Self-Attention
[40] A Comprehensive Survey on Graph Neural Networks
[41] Deep Feedback Network for Recommendation
[42] End-to-End Neural Ad-hoc Ranking with Kernel Pooling
[43] Employing Personal Word Embeddings for Personalized Search
[44] RLPer: A Reinforcement Learning Model for Personalized Search
[45] Product Recommendation Based on Search Keywords
[46] Joint Modeling and Optimization of Search and Recommendation
[47] Learning a Joint Search and Recommendation Model from User-Item Interactions
[48] Encoding History with Context-aware Representation Learning for Personalized Search
[49] Enhancing Re-finding Behavior with External Memories for Personalized Search