The MOAI project tackles multimodal opinion analysis in human-agent interactions in order to extract information about user preferences. Such information is intended to enrich user profiles for companion robots and virtual assistants. This challenging issue has so far been only rarely and partially addressed by the state of the art. The proposed approach relies on Conditional Random Fields (CRFs), chosen for their flexibility, in order to combine the generalization capability of machine learning methods with the fine-grained modeling of semantic rules. As recordings of face-to-face human-agent interactions are not yet massively available, such flexible methods constitute an alternative to deep learning methods. In this promising context, the MOAI project targets two major breakthroughs: i) feature learning driven by a priori knowledge and psycho-linguistic models in order to learn users’ preferences, and ii) the integration of several levels of analysis (lexical, syntactic, prosodic, dialogic) through latent variables inside hidden CRFs, allowing opinion detection to be grounded in the context of human-agent interaction.
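To make the CRF idea concrete, the sketch below shows exact MAP decoding (Viterbi) for a tiny linear-chain model that labels tokens with opinion tags (`O`, `POS`, `NEG`) by mixing lexical cues (a sentiment lexicon) with a prosodic emphasis flag. This is not the project's actual model: the lexicons, feature functions, and weights are illustrative assumptions, and in a real CRF the weights would be learned from annotated interactions rather than set by hand.

```python
LABELS = ["O", "POS", "NEG"]

# Tiny illustrative lexicons (assumptions, not resources from the project).
POS_WORDS = {"love", "great", "nice"}
NEG_WORDS = {"hate", "awful", "boring"}

def features(tokens, prosody, i, label, prev_label):
    """Score token i under `label`, mixing lexical and prosodic feature functions.

    `prosody[i]` is a binary emphasis flag standing in for real prosodic features.
    Weights are hand-set for illustration; a trained CRF would estimate them.
    """
    w = tokens[i].lower()
    score = 0.0
    if w in POS_WORDS and label == "POS":
        score += 2.0                      # lexical cue for positive opinion
    if w in NEG_WORDS and label == "NEG":
        score += 2.0                      # lexical cue for negative opinion
    if prosody[i] and label != "O":
        score += 0.5                      # prosodic emphasis supports opinion labels
    if label == "O":
        score += 0.3                      # mild prior toward non-opinion tokens
    if prev_label == label and label != "O":
        score += 0.2                      # transition feature: opinion spans continue
    return score

def viterbi(tokens, prosody):
    """Exact MAP label sequence for the linear-chain model above."""
    n = len(tokens)
    score = [{y: features(tokens, prosody, 0, y, None) for y in LABELS}]
    back = [{y: None for y in LABELS}]
    for i in range(1, n):
        score.append({})
        back.append({})
        for y in LABELS:
            prev, best = max(
                ((yp, score[i - 1][yp] + features(tokens, prosody, i, y, yp))
                 for yp in LABELS),
                key=lambda t: t[1])
            score[i][y] = best
            back[i][y] = prev
    y = max(score[-1], key=score[-1].get)
    path = [y]
    for i in range(n - 1, 0, -1):
        y = back[i][y]
        path.append(y)
    return list(reversed(path))
```

For example, `viterbi("I love this movie".split(), [0, 1, 0, 0])` tags `love` as `POS` because its lexical and prosodic features outweigh the `O` bias, while the surrounding tokens stay `O`. Latent-variable (hidden) CRFs extend this scheme by summing over unobserved intermediate states instead of decoding a single label chain.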