Visually-Aware Fashion Computing
Introduction & Motivation
Fashion is a multi-billion-dollar industry with direct social, cultural, and economic implications for society. Recently, the great demand for fashion products has motivated many applications in this domain, including fashion matching, recommendation, retrieval, and dialogue systems. Despite many years of research, current frameworks for fashion computing have two inherent shortcomings. The first is the inability to exploit rich alternative fashion information for interpretable fashion matching. In particular, existing fashion matching methods primarily use visual content alone to learn visual compatibility and perform the matching in a latent space. These methods work like a black box and cannot reveal why a match is made. Moreover, the rich attributes associated with fashion items, e.g., off-shoulder dress and black skinny jeans, which describe the semantics of items in a human-interpretable way, have largely been ignored.
The second shortcoming is the inability to extract user-centric fashion knowledge. Fashion knowledge for dressing properly concerns not only physiological needs but also the demands of social activities and conventions; it usually involves three mutually related aspects: occasion, person, and clothing. However, few works have focused on extracting such knowledge, which limits the effectiveness of downstream applications such as fashion recommendation.
To address these shortcomings, we explore three lines of research. The first aims to extract user-centric fashion knowledge from social media. In this work, we propose to automatically extract user-centric and time-evolving fashion knowledge by unifying the three tasks of occasion, person, and clothing discovery from the massive user-generated multi-modal resources on social media. This aims to bridge the gap between fashion concept prediction and the downstream applications. Specifically, we design a contextualized fashion concept learning module to effectively capture the dependencies and correlations among different fashion concepts (occasions, clothing categories, and clothing attributes) (see Figure 1). To alleviate the problem of sparse labelled training data, we enrich the learning procedure with a weak label modeling module that utilizes both machine-labelled and clean data.
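The two ideas in this paragraph can be illustrated with a minimal numpy sketch. It is an assumption-laden stand-in, not the paper's actual architecture: contextualization is reduced to one propagation step over a hypothetical concept co-occurrence matrix, and weak label modeling is reduced to down-weighting machine-labelled samples in a binary cross-entropy loss. All function names and the `alpha`, `clean_w`, `weak_w` parameters are illustrative.

```python
import numpy as np

def contextualized_scores(raw_logits, cooccur, alpha=0.5):
    """Refine independent concept logits with a concept co-occurrence
    matrix, so correlated concepts (e.g. 'wedding' and 'gown') reinforce
    each other.  One propagation step: s = (1-a)*z + a * C @ sigmoid(z)."""
    probs = 1.0 / (1.0 + np.exp(-raw_logits))
    return (1 - alpha) * raw_logits + alpha * cooccur @ probs

def weak_label_loss(probs, labels, is_clean, clean_w=1.0, weak_w=0.3):
    """Binary cross-entropy over concepts, where machine-labelled (weak)
    samples are down-weighted relative to human-verified (clean) ones."""
    eps = 1e-9
    bce = -(labels * np.log(probs + eps) + (1 - labels) * np.log(1 - probs + eps))
    w = np.where(is_clean, clean_w, weak_w)      # per-sample weight
    return float(np.mean(w[:, None] * bce))
```

In practice the logits would come from a shared visual backbone with one head per concept group, and the co-occurrence matrix could itself be learned rather than fixed.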
The second line of research aims towards interpretable fashion matching with rich attributes. This work tackles the interpretable fashion matching task, aiming to inject interpretability into the compatibility modeling of items. We develop an attribute-based interpretable compatibility modeling framework (see Figure 2), which consists of three modules: (a) a tree-based module that extracts decision rules for matching prediction; (b) an embedding module that learns vector representations of the rules by accounting for attribute semantics; and (c) a joint modeling module that unifies the visual embedding and the rule embedding to predict the matching score. With this framework, we capture the semantics of decision rules by modeling attribute interactions, and unify the strengths of visual embedding and attribute-based rule embedding by explicitly inferring attribute-based matching patterns.
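To make the three-module design concrete, here is a hedged sketch of modules (b) and (c): a rule is treated as the set of attribute IDs it tests, its embedding is a simple pooling of attribute embeddings (a stand-in for the attribute-interaction modeling in the paper), and the final score fuses a visual term with a rule term. The attribute vocabulary, embedding sizes, and the `rule_w` weight vector are all hypothetical; extracting the rules themselves from trees (module (a)) is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical vocabulary: 100 attributes, each with a 16-d embedding.
attr_emb = rng.normal(size=(100, 16))

def rule_embedding(rule_attr_ids):
    """Embed a decision rule by pooling the embeddings of the attributes
    it tests (e.g. ['off-shoulder', 'skinny'] as integer IDs)."""
    return attr_emb[rule_attr_ids].mean(axis=0)

def matching_score(visual_top, visual_bottom, active_rules, rule_w):
    """Joint module: combine a visual compatibility term with an
    attribute-rule term, then squash to a (0, 1) matching score.
    `active_rules` is the list of rules fired by this item pair."""
    visual_term = float(visual_top @ visual_bottom)
    rule_term = sum(float(rule_embedding(r) @ rule_w) for r in active_rules)
    return 1.0 / (1.0 + np.exp(-(visual_term + rule_term)))
```

The interpretability comes from the `active_rules` list: unlike a purely latent visual model, each fired rule names the attribute combination that contributed to the score.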
The third line of research explores knowledge-enhanced neural fashion trend forecasting (see Figure 3). This work aims to forecast fashion trends, focusing on fine-grained fashion element trends for specific user groups. We first contribute a large-scale fashion trend dataset (FIT) collected from Instagram, with extracted time-series records of fashion elements and user information. Next, to effectively model the time-series data of fashion elements, which exhibit rather complex patterns, we propose a Knowledge Enhanced Recurrent Network model (KERN), which leverages deep recurrent neural networks for time-series modeling. Moreover, it exploits both internal and external knowledge in the fashion domain that influence the time-series patterns of fashion element trends. Incorporating such domain knowledge further helps the deep learning model capture the inherent patterns of specific fashion elements and predict future trends.
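A drastically simplified sketch of the knowledge-enhancement idea, under stated assumptions: the recurrent network is replaced by a recent-history average, "external knowledge" by a seasonal look-back one yearly cycle ago, and "internal knowledge" by a regularizer pulling the forecasts of similar fashion elements together. The weights, the `season_lag` of 12, and the `affinity` notion are illustrative only.

```python
import numpy as np

def knowledge_enhanced_forecast(series, season_lag=12,
                                w_recent=0.6, w_season=0.4):
    """One-step trend forecast: blend a summary of the recent history
    (stand-in for the RNN hidden state) with external seasonal knowledge,
    i.e. the same element's value one seasonal cycle ago."""
    recent = series[-3:].mean()
    seasonal = series[-season_lag]
    return w_recent * recent + w_season * seasonal

def similarity_regularizer(pred_a, pred_b, affinity):
    """Internal knowledge: similar fashion elements should have close
    trends; the penalty grows with the affinity between the elements."""
    return affinity * (pred_a - pred_b) ** 2
```

In the full model these terms would be learned jointly, with the regularizer added to the training loss rather than applied post hoc.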
Plans for Future Research
First, we plan to leverage multiple sources of fashion knowledge, including explicit knowledge such as the fashion taxonomy, and implicit knowledge such as social media postings and write-ups by fashion influencers, designers, brands, e-commerce sites, and users. Integrating these knowledge sources is challenging but could offer a more complete and unbiased basis for fashion modelling and trend forecasting. Second, we plan to conduct a big-data analysis to study trends and causal relations in fashion, starting from top-end fashion shows, through influencers, designers, and brands, down to street-level fashion. Third, we will study the extraction of fine-grained fashion attributes and the development of deep-learning-based frameworks for personalized fashion search, matching, recommendation, and influence.
 Yunshan Ma, Xun Yang, Lizi Liao, Yixin Cao, Tat-Seng Chua. Who, Where, and What to Wear? Extracting Fashion Knowledge from Social Media. ACM MM 2019.
 Xun Yang, Xiangnan He, Xiang Wang, Yunshan Ma, Fuli Feng, Meng Wang, Tat-Seng Chua. Interpretable Fashion Matching with Rich Attributes. SIGIR 2019.
 Yunshan Ma, Yujuan Ding, Lizi Liao, Xun Yang, Wai Keung Wong, Tat-Seng Chua. Knowledge Enhanced Neural Fashion Trend Forecasting. ACM ICMR 2020.